Automatic Classification of News Subjects in Broadcast News: Application to a Gender Bias Representation Analysis

Pelloin, Valentin; Dodson, Lena; Chapuis, Émile; Hervé, Nicolas; Doukhan, David

arXiv:2407.14180 (cs)

[Submitted on 19 Jul 2024]

Abstract: This paper introduces a computational framework designed to delineate gender distribution biases in topics covered by French TV and radio news. We transcribe a dataset of 11.7k hours, broadcast in 2023 on 21 French channels. A Large Language Model (LLM) is used in few-shot conversation mode to obtain a topic classification on those transcriptions. Using the generated LLM annotations, we explore the fine-tuning of a specialized smaller classification model to reduce the computational cost. To evaluate the performance of these models, we construct and annotate a dataset of 804 dialogues. This dataset is made available free of charge for research purposes. We show that women are notably underrepresented in subjects such as sports, politics and conflicts. Conversely, on topics such as weather, commercials and health, women have more speaking time than their overall average across all subjects. We also observe representation differences between private and public service channels.
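
The comparison the abstract describes, women's speaking time on a topic set against their overall average, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `women_share_by_topic`, the `(topic, gender, seconds)` record layout, and the toy durations are all assumptions made for illustration, and the sketch presumes that topic labels and speaker gender have already been assigned upstream.

```python
from collections import defaultdict


def women_share_by_topic(segments):
    """Return (overall_share, per_topic_share) of women's speaking time.

    `segments` is an iterable of (topic, gender, seconds) tuples, with
    gender given as "F" or "M". This layout is hypothetical.
    """
    total = defaultdict(float)  # all speaking time per topic
    women = defaultdict(float)  # women's speaking time per topic
    for topic, gender, seconds in segments:
        total[topic] += seconds
        if gender == "F":
            women[topic] += seconds
    # Overall share across all topics, used as the comparison baseline.
    overall = sum(women.values()) / sum(total.values())
    per_topic = {topic: women[topic] / total[topic] for topic in total}
    return overall, per_topic


# Toy durations, invented for illustration; not taken from the paper.
segments = [
    ("sports", "M", 80.0),
    ("sports", "F", 20.0),
    ("weather", "M", 40.0),
    ("weather", "F", 60.0),
]
overall, per_topic = women_share_by_topic(segments)
for topic, share in sorted(per_topic.items()):
    print(f"{topic}: {share:.2f} (difference from overall: {share - overall:+.2f})")
```

A topic whose share falls below the overall value would count as one where women are underrepresented in the sense used above. The sketch assumes a non-empty input with positive durations.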

Comments:	Accepted to Interspeech 2024
Subjects:	Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2407.14180 [cs.CL]
	(or arXiv:2407.14180v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2407.14180

Submission history

From: Valentin Pelloin
[v1] Fri, 19 Jul 2024 10:15:45 UTC (58 KB)
