CheckThat! Lab at CLEF 2024

Task 1: Check-Worthiness Estimation

Definition

The aim of this task is to determine whether a claim in a tweet or transcription is worth fact-checking. Typical approaches to making that decision either rely on the judgments of professional fact-checkers or ask human annotators several auxiliary questions, such as "does it contain a verifiable factual claim?" and "is it harmful?", before deciding on the final check-worthiness label (see https://aclanthology.org/2021.findings-emnlp.56.pdf).

This year, we are offering multi-genre data: the tweets and/or transcriptions should be judged based solely on the text. The task is available in Arabic, Dutch, and English.

Datasets

Data for all languages are available here

Each instance consists of text only, which may come from a tweet, the transcription of a debate, or the transcription of a speech.

Evaluation

This is a binary classification task. The official evaluation metric is the F1 score over the positive (check-worthy) class, i.e., the harmonic mean of precision and recall with the check-worthy label treated as positive.
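As a concrete illustration, here is a minimal sketch of how positive-class F1 can be computed with scikit-learn. This is not the official scorer (which lives in the task repository), and the "Yes"/"No" label strings are an assumption for illustration:

    # Minimal sketch of the official metric: F1 over the positive class.
    # The "Yes"/"No" label strings are illustrative assumptions; the
    # official scorer in the Task 1 repository fixes the exact labels.
    from sklearn.metrics import f1_score

    gold = ["Yes", "No", "Yes", "Yes", "No"]
    pred = ["Yes", "No", "No", "Yes", "Yes"]

    # pos_label scores only the check-worthy ("Yes") class:
    # F1 = 2 * P * R / (P + R), with precision P and recall R on "Yes".
    print(f1_score(gold, pred, pos_label="Yes"))  # 0.667 on this toy example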

Submission

Scorers, Format Checkers, and Baseline Scripts

All scripts can be found on GitLab in the CheckThat! Lab Task 1 repository.

Submission guidelines

  • Make sure that you create one account per team, and submit runs through that account only.
  • The last file submitted to the leaderboard will be considered the final submission.
  • The output file must be named task1_lang.tsv, with the .tsv extension (e.g., task1_arabic.tsv); otherwise, you will get an error on the leaderboard. Three languages are possible (Arabic, Dutch, and English).
  • You have to zip the TSV file (e.g., zip task1_arabic.zip task1_arabic.tsv) and submit it through the Codalab page; a packaging sketch follows this list.
  • You are required to submit a team name and a method description with each submission. Your team name here must EXACTLY match the one used during CLEF registration.
  • You are allowed a maximum of 200 submissions per day.
  • We will keep the leaderboard private until the end of the submission period; hence, results will not be visible upon submission. All results will be released after the evaluation period.
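The exact submission format is defined by the format checker in the Task 1 repository. As a rough illustration only, the sketch below writes a run file and zips it; the column layout (an instance ID, a "Yes"/"No" label, and a run ID) and all file and run names are assumptions, not the official specification:

    # Minimal packaging sketch, assuming a tab-separated file with an
    # instance ID, a "Yes"/"No" label, and a run ID per row. Consult the
    # format checker in the Task 1 repository for the exact columns.
    import csv
    import zipfile

    def write_run(predictions, tsv_path="task1_english.tsv", run_id="my_run"):
        """predictions: iterable of (instance_id, label) pairs."""
        with open(tsv_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f, delimiter="\t")
            for instance_id, label in predictions:
                writer.writerow([instance_id, label, run_id])

    def zip_run(tsv_path="task1_english.tsv", zip_path="task1_english.zip"):
        """Equivalent of `zip task1_english.zip task1_english.tsv`."""
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
            zf.write(tsv_path)

    if __name__ == "__main__":
        write_run([("1", "Yes"), ("2", "No")])
        zip_run()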

Submission Site

Task 1: Codalab

Contact

Please join the Slack channel for any queries: https://join.slack.com/t/ct-participants/shared_invite/zt-1me6ywqxu-0Q~r7vCm6gfmrmN9xuOO7A

Or send an email to: clef-factcheck@googlegroups.com

Leaderboard

All baselines are random systems.

Task 1 Arabic

Rank  Team               F1
1     visty              0.569
2     teamopenfact       0.557
3     DSHacker           0.538
4     TurQUaz            0.533
5     SemanticCUETSync   0.532
6     mjmanas54          0.531
7     Fired_from_NLP     0.530
8     Madussree          0.530
9     pandas             0.520
10    hybrinfox          0.519
11    Mirela             0.478
12    DataBees           0.460
13    Baseline           0.418
14    JUNLP              0.212

Task 1 Dutch

Rank  Team               F1
1     TurQUaz            0.732
2     DSHacker           0.730
3     visty              0.718
4     Mirela             0.650
5     Zamoranesis        0.601
6     FC_RUG             0.594
7     teamopenfact       0.590
8     hybrinfox          0.589
9     mjmanas54          0.577
10    DataBees           0.563
11    JUNLP              0.550
12    Fired_from_NLP     0.543
13    Madussree          0.482
14    Baseline           0.438
15    pandas             0.308
16    SemanticCUETSync   0.218

Task 1 English

Rank  Team               F1
1     FactFinders        0.802
2     teamopenfact       0.796
3     innavogel          0.780
4     mjmanas54          0.778
5     ZHAW_Students      0.771
6     SemanticCUETSync   0.763
7     SINAI              0.761
8     DSHacker           0.760
9     visty              0.753
10    Fired_from_NLP     0.745
11    TurQUaz            0.718
12    hybrinfox          0.711
13    SSN-NLP            0.706
14    sz06571            0.696
15    NapierNLP          0.675
16    Mirela             0.658
17    Kushal_Chandani    0.658
18    DataBees           0.619
19    Trio_Titans        0.600
20    Madussree          0.583
21    pandas             0.579
22    JUNLP              0.541
23    mariuxi            0.517
24    grig95             0.497
25    CLaC-2             0.494
26    Aqua_Wave          0.339
27    Baseline           0.307

Organizers

  • Firoj Alam, Qatar Computing Research Institute, HBKU, Qatar
  • Maram Hasanain, Qatar Computing Research Institute, HBKU, Qatar
  • Alberto Barrón-Cedeño, Università di Bologna, Italy
  • Reem Suwaileh, HBKU, Qatar
  • Chengkai Li, The University of Texas at Arlington, USA
  • Rubén Miguez, Newtral, Spain
  • Wajdi Zaghouani, HBKU, Qatar
  • Preslav Nakov, Mohamed bin Zayed University of Artificial Intelligence, UAE
  • Sanne Weering, University of Groningen, Netherlands
  • Tommaso Caselli, University of Groningen, Netherlands