Task 1: Subjectivity
Definition
Systems must distinguish whether a sentence from a news article expresses the subjective view of its author or presents an objective view of the covered topic.
This is a binary classification task in which systems have to identify whether a text sequence (a sentence or a paragraph) is subjective (SUBJ) or objective (OBJ).
The task comprises three settings:
- Monolingual: train and test on data in a given language L
- Multilingual: train and test on data comprising several languages
- Zero-shot: train on several languages and test on unseen languages
Datasets
We provide training data in five languages: Arabic, Bulgarian, English, German, and Italian.
Note: Test datasets will be made available once the evaluation cycle starts.
Note: For multilingual and zero-shot settings, all language-specific training data can be used.
Annotation Guidelines
Information regarding the annotation guidelines can be found in the following paper: A Corpus for Sentence-Level Subjectivity Detection on English News Articles.
Evaluation
The official evaluation is macro-averaged F1 between the two classes.
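The macro-averaged F1 is simply the unweighted mean of the per-class F1 scores for SUBJ and OBJ. A minimal sketch in plain Python (the function name and the toy labels below are illustrative, not part of the official scorer):

```python
def macro_f1(gold, pred, labels=("SUBJ", "OBJ")):
    """Macro-averaged F1: mean of the per-class F1 scores."""
    scores = []
    for c in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# Toy example: one OBJ sentence misclassified as SUBJ.
gold = ["SUBJ", "OBJ", "OBJ", "SUBJ"]
pred = ["SUBJ", "OBJ", "SUBJ", "SUBJ"]
score = macro_f1(gold, pred)  # (0.8 + 0.6667) / 2 ≈ 0.7333
```

Because the mean is unweighted, both classes count equally regardless of how imbalanced the test data is. The official scoring scripts in the task repository remain the authoritative implementation.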
Submission
All scripts can be found on GitLab in the CheckThat! Lab Task 1 repository.
Submission guidelines
- Create one account per team and submit through that account only.
- The last file submitted to the leaderboard will be considered as the final submission.
- The subtask covers five languages (Arabic, Bulgarian, English, German, and Italian). In addition, we define a multilingual evaluation scenario that uses a balanced sub-sample of all five languages for the multilingual training and evaluation splits.
- The name of each output file must be subtask_[LANG].tsv, where LANG can be arabic, bulgarian, english, german, italian, or multilingual.
- Make sure to use .tsv as the file extension; otherwise, you will get an error on the leaderboard.
- Example submission file names: subtask_arabic.tsv, subtask_bulgarian.tsv, subtask_english.tsv, subtask_german.tsv, subtask_italian.tsv, subtask_multilingual.tsv.
- You must zip the .tsv file into an archive with the same base name, e.g., subtask_arabic.zip, and submit it through the Codalab page.
- If you participate in the task for more than one language, you must make a separate submission for each language.
- You must provide your team name with each submission and fill out the questionnaire (the link will be provided once the evaluation cycle starts) to give some details on your approach; we need that information for the overview paper. Your team name must EXACTLY match the one used during CLEF registration.
- You may make at most 200 submissions per day for each subtask.
- We will keep the leaderboard private until the end of the submission period, so results will not be available upon submission. All results will be made available after the evaluation period.
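The packaging steps above can be sketched with the Python standard library. The column layout below (sentence_id, label) is an assumption for illustration only; check the scorer in the task repository for the exact required format:

```python
import csv
import zipfile

# Hypothetical predictions: (sentence_id, label) pairs.
predictions = [("0001", "SUBJ"), ("0002", "OBJ")]

# Write the predictions as a tab-separated file with the required name.
tsv_name = "subtask_english.tsv"
with open(tsv_name, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["sentence_id", "label"])  # assumed header
    writer.writerows(predictions)

# Zip the .tsv into an archive with the same base name for upload.
with zipfile.ZipFile("subtask_english.zip", "w") as zf:
    zf.write(tsv_name)
```

The resulting subtask_english.zip is what gets uploaded to the Codalab page; repeat the same steps per language if you participate in several.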
Submission Site
The submission is done through the Codalab platform at https://codalab.lisn.upsaclay.fr/competitions/22609
Organizers
- Federico Ruggeri, DISI, University of Bologna, Italy
- Arianna Muti, MilaNLP, Bocconi University, Milan, Italy
- Katerina Korre, DIT, University of Bologna, Italy
Note: Language-specific curators will be announced once the evaluation cycle has ended.
For queries, please join the Slack channel.
Alternatively, please send an email to: clef-factcheck@googlegroups.com