Task 3: Fact-Checking Numerical Claims
Definition
This task focuses on verifying claims that contain numerical quantities and temporal expressions. Numerical claims are defined as those requiring validation of explicit or implicit quantitative or temporal details. Participants must classify each claim as True, False, or Conflicting based on a short list of evidence. The fact-verification task is available in Spanish, Arabic, and English.
Each claim is provided with its top-k BM25 evidence, and participants can either carefully select from this evidence set or employ re-ranking approaches to improve fact-verification performance. The evidence corpus is collected by pooling with multiple advanced claim decomposition approaches, and the top-100 BM25 evidence snippets are retrieved from this pool to provide the diverse perspectives required for claim verification.
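As an illustration of what selecting from the provided evidence could look like, here is a minimal sketch of a lexical re-ranker. It is not the organizers' method: the function names, the weighting, and the idea of favouring shared numbers are all assumptions made for this example, and a real system would more likely use a trained re-ranking model.

```python
import re


def tokens(text):
    """Lowercase word and number tokens; decimals such as 3.5 stay whole."""
    return re.findall(r"\d+(?:\.\d+)?|[a-z]+", text.lower())


def rerank(claim, evidence, top_n=3, number_weight=2.0):
    """Order evidence snippets by token overlap with the claim.

    Hypothetical heuristic: a shared numeric token counts `number_weight`,
    any other shared token counts 1.0.
    """
    claim_tokens = set(tokens(claim))

    def score(snippet):
        overlap = claim_tokens & set(tokens(snippet))
        return sum(number_weight if t[0].isdigit() else 1.0 for t in overlap)

    # sorted() is stable, so ties keep the original BM25 order.
    return sorted(evidence, key=score, reverse=True)[:top_n]
```

The regular expression only matches digits and the letters a-z, so this sketch is only meaningful for English text; Spanish or Arabic claims would need a different tokenizer.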
Datasets
The dataset is collected from various fact-checking domains through the Google Fact-check Explorer API, complete with detailed metadata and an evidence corpus sourced from the web. Our pipeline filters this collection to retain the numerical claims for the task. An overview of dataset statistics is shown in the table below.
Language | Count
English | 15514
Spanish | 2082
Arabic | 2200
Evaluation
We use macro-averaged F1 scores and classwise F1 scores for evaluating the fact-verification performance.
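For reference, the two metrics can be sketched in a few lines of Python. This is an illustrative implementation, not the official scorer; the label strings follow the three classes named in the task definition, and giving a class with no gold or predicted instances an F1 of 0 is an assumption.

```python
from collections import Counter

LABELS = ["True", "False", "Conflicting"]


def classwise_f1(gold, pred, labels=LABELS):
    """Return a dict mapping each label to its F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted p, but it was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); assumed 0.0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of the class-wise F1 scores."""
    scores = classwise_f1(gold, pred, labels)
    return sum(scores.values()) / len(labels)
```

Because the macro average weights all three classes equally, a system that ignores a rare class is penalized as heavily as one that ignores a frequent class.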
Submission
All scripts can be found on GitLab in the CheckThat! Lab Task 3 repository.
Submission Guidelines
- Each team must create only one account in CodaLab and submit their predictions exclusively through that account.
- We will keep the leaderboard private until the end of the submission period, so results will not be available upon submission. All results will be released after the evaluation period.
- You may make at most 50 submissions per day.
- The last file submitted to the leaderboard will be considered as the final submission.
- The output file should be in the format [url]\t[label] (tab-separated) for English. For Arabic, use [claim index]\t[label] (tab-separated), where [claim index] is the index/position of the claim in the JSON list of the input file, such as 0,1…
- The output file must be named Task3_Numerical_claims_LANG.tsv, with the .tsv extension (e.g., Task3_Numerical_claims_English.tsv); otherwise, you will get an error on the leaderboard.
- Zip the .tsv file (for example: zip Task3_Numerical_claims_English.zip Task3_Numerical_claims_English.tsv) and submit it through the CodaLab page.
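The file naming, tab-separated layout, and zipping steps from the guidelines above can be sketched as follows. This is a minimal example under stated assumptions: write_submission is a hypothetical helper, not part of the official scripts, and the sketch assumes the file has no header row, which the guidelines do not specify.

```python
import zipfile
from pathlib import Path


def write_submission(predictions, language, out_dir="."):
    """Write (identifier, label) pairs to a TSV file and zip it.

    The identifier is the claim URL for English and the claim index for
    Arabic, as described in the guidelines. Returns the zip file's path.
    """
    tsv_path = Path(out_dir) / f"Task3_Numerical_claims_{language}.tsv"
    with open(tsv_path, "w", encoding="utf-8", newline="\n") as f:
        for identifier, label in predictions:
            # One prediction per line, tab-separated; no header (assumption).
            f.write(f"{identifier}\t{label}\n")

    zip_path = tsv_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname stores the bare file name, without directory components.
        zf.write(tsv_path, arcname=tsv_path.name)
    return zip_path
```

For example, write_submission([(0, "True"), (1, "Conflicting")], "Arabic") would create Task3_Numerical_claims_Arabic.tsv and Task3_Numerical_claims_Arabic.zip in the current directory.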
Submission Site
The submission is done through the Codalab platform at
Organizers
- Vinay Setty, University of Stavanger, Norway
- Venktesh V, University of Stavanger, Norway
- Boushra Bendou, Carnegie Mellon University in Qatar, Qatar
- Maram Hasanain, Qatar Computing Research Institute, HBKU, Qatar
- Houda Bouamor, Carnegie Mellon University in Qatar, Qatar
- Firoj Alam, Qatar Computing Research Institute, HBKU, Qatar
For queries, please join the Slack channel
Alternatively, please send an email to: clef-factcheck@googlegroups.com