Task 1: Subjectivity
Definition
Systems must distinguish whether a sentence from a news article expresses the subjective view of its author or presents an objective view of the covered topic.
This is a binary classification task in which systems have to identify whether a text sequence (a sentence or a paragraph) is subjective (SUBJ) or objective (OBJ).
The task comprises three settings:
- Monolingual: train and test on data in a given language L
- Multilingual: train and test on data comprising several languages
- Zero-shot: train on several languages and test on unseen languages
Datasets
We provide training data in five languages: Arabic, Bulgarian, English, German, and Italian.
Note: Test datasets will be made available once the evaluation cycle starts.
Note: For multilingual and zero-shot settings, all language-specific training data can be used.
Annotation Guidelines
Information regarding the annotation guidelines can be found in the following paper: A Corpus for Sentence-Level Subjectivity Detection on English News Articles.
Evaluation
The official evaluation is macro-averaged F1 between the two classes.
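The macro-averaged F1 is simply the unweighted mean of the per-class F1 scores for SUBJ and OBJ. A minimal sketch in plain Python (the function name and the toy labels below are illustrative, not part of the official scorer):

```python
def macro_f1(gold, pred, labels=("SUBJ", "OBJ")):
    """Macro-averaged F1: mean of the per-class F1 scores."""
    scores = []
    for c in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# Toy example: one OBJ sentence misclassified as SUBJ.
gold = ["SUBJ", "OBJ", "OBJ", "SUBJ"]
pred = ["SUBJ", "OBJ", "SUBJ", "SUBJ"]
score = macro_f1(gold, pred)  # (0.8 + 0.6667) / 2 ≈ 0.7333
```

Because the mean is unweighted, both classes count equally regardless of how imbalanced the test data is. The official scoring scripts in the task repository remain the authoritative implementation.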
Submission
All scripts can be found on GitLab in the CheckThat! Lab Task 1 repository.
Submission guidelines
- Create one account per team and submit through that account only.
- The last file submitted to the leaderboard will be considered as the final submission.
- The subtask covers five languages (Arabic, Bulgarian, English, German, and Italian). In addition, we define a multilingual evaluation scenario that uses a balanced sub-sample of all five languages for the multilingual training and evaluation splits.
- The name of each output file must be subtask_[LANG].tsv, where LANG can be arabic, bulgarian, english, german, italian, or multilingual.
- Make sure to use .tsv as the file extension; otherwise, you will get an error on the leaderboard.
- Example submission file names: subtask_arabic.tsv, subtask_bulgarian.tsv, subtask_english.tsv, subtask_german.tsv, subtask_italian.tsv, subtask_multilingual.tsv.
- You must zip the .tsv file into an archive with the same base name, e.g., subtask_arabic.zip, and submit it through the Codalab page.
- If you participate in the task for more than one language, you must make a separate submission for each language.
- You must provide your team name with each submission and fill out the questionnaire (the link will be provided once the evaluation cycle starts) to give some details on your approach; we need that information for the overview paper. Your team name must EXACTLY match the one used during CLEF registration.
- You may make at most 200 submissions per day for each subtask.
- We will keep the leaderboard private until the end of the submission period, so results will not be available upon submission. All results will be made available after the evaluation period.
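The packaging steps above can be sketched with the Python standard library. The column layout below (sentence_id, label) is an assumption for illustration only; check the scorer in the task repository for the exact required format:

```python
import csv
import zipfile

# Hypothetical predictions: (sentence_id, label) pairs.
predictions = [("0001", "SUBJ"), ("0002", "OBJ")]

# Write the predictions as a tab-separated file with the required name.
tsv_name = "subtask_english.tsv"
with open(tsv_name, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["sentence_id", "label"])  # assumed header
    writer.writerows(predictions)

# Zip the .tsv into an archive with the same base name for upload.
with zipfile.ZipFile("subtask_english.zip", "w") as zf:
    zf.write(tsv_name)
```

The resulting subtask_english.zip is what gets uploaded to the Codalab page; repeat the same steps per language if you participate in several.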
Submission Site
The submission is done through the Codalab platform at https://codalab.lisn.upsaclay.fr/competitions/22609
Organizers
- Federico Ruggeri, DISI, University of Bologna, Italy
- Arianna Muti, MilaNLP, Bocconi University, Milan, Italy
- Katerina Korre, DIT, University of Bologna, Italy
Note: Language-specific curators will be announced once the evaluation cycle has ended.
For queries, please join the Slack channel.
Alternatively, please send an email to: clef-factcheck@googlegroups.com