Definition
This task adds a new, final stage to the CLEF CheckThat! lab pipeline: automating the writing of fact-checking articles. Given a claim, its veracity, and a set of evidence documents consulted for fact-checking the claim, the goal is to generate a full fact-checking article.
Datasets
We reuse the WatClaimCheck dataset for the training and validation sets. Each data point in this dataset has the following format:
{
  "metadata": {
    "claimant": "Faisal Al Qasimi, Carolina Monteiro",
    "claim": "OpIndia claimed Greta Thunberg's real name is Ghazala bhat",
    "claim_date": "2016-06-20",
    "review_date": "2021-02-06",
    "id": 42,
    "premise_articles": {
      "https://web.archive.org/web/20210206135409/https://twitter.com/omar_quraishi/status/1357926247414845441": "42_1.json",
      "https://web.archive.org/web/20210206083718/https://twitter.com/runcaralisarun/status/1357714907249086465": "42_2.json",
      "https://www.facebook.com/search/photos/?q=opindia%20greta%20ghazala": "42_3.json",
      "https://twitter.com/UnSubtleDesi/status/1357723484491718659": "42_4.json"
    }
  },
  "label": {
    "reviewer_name": "Alt News",
    "reviewer_site": "altnews.in",
    "review_url": "https://www.altnews.in/morphed-opindia-screengrab-claims-greta-thunbergs-real-name-is-ghazala-bhat/",
    "rating": 0,
    "original_rating": "false",
    "id": 1,
    "review_article": "42.json"
  }
}
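As an illustration of how a record in this format maps onto the task inputs (claim, veracity, evidence), here is a minimal Python sketch. It embeds an abbreviated copy of the example record above. The helper name `to_task_input` and the output keys are hypothetical, and reading the veracity from `original_rating` rather than the numeric `rating` is an assumption, not something the dataset documentation prescribes.

```python
import json

# Abbreviated copy of the example record shown above
# (only two of the four premise URLs are kept for brevity).
RECORD = json.loads("""
{
  "metadata": {
    "claimant": "Faisal Al Qasimi, Carolina Monteiro",
    "claim": "OpIndia claimed Greta Thunberg's real name is Ghazala bhat",
    "claim_date": "2016-06-20",
    "review_date": "2021-02-06",
    "id": 42,
    "premise_articles": {
      "https://www.facebook.com/search/photos/?q=opindia%20greta%20ghazala": "42_3.json",
      "https://twitter.com/UnSubtleDesi/status/1357723484491718659": "42_4.json"
    }
  },
  "label": {
    "rating": 0,
    "original_rating": "false",
    "review_article": "42.json"
  }
}
""")


def to_task_input(record):
    """Collect the task inputs from one record (hypothetical helper)."""
    meta, label = record["metadata"], record["label"]
    return {
        "claim": meta["claim"],
        # Assumption: use the textual rating as the veracity label.
        "veracity": label["original_rating"],
        # File names of the scraped premise articles (the evidence).
        "evidence_files": sorted(meta["premise_articles"].values()),
        # File name of the reference fact-checking article.
        "reference_file": label["review_article"],
    }


task_input = to_task_input(RECORD)
```

The evidence and reference entries are file names only; loading their contents depends on how the downloaded dataset is laid out on disk, which is not shown here.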
To download and use a copy of the WatClaimCheck dataset, follow the instructions available here. The dataset also provides scraped content from the premise articles (evidence webpages) where available. We encourage participants to develop their own retrieval pipeline to obtain content from a more thorough set of premise articles. Note that one component of the evaluation assesses whether your generated articles use information from all input evidence sources.
We will release our test sets, along with content from the evidence webpages where available, in the same format soon!
Evaluation
We will use the mean of the following metrics to assess the generated text:
(i) entailment score, a reference-based metric that measures whether the generated text is entailed by the reference;
(ii) citation correctness, which verifies whether the text attributed to a citation is entailed by the corresponding evidence;
and (iii) citation completeness, which computes the proportion of input evidence that is correctly cited in the generated text.
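To make the aggregation concrete, the following Python sketch shows one way the three metrics could be combined. It is not the official scorer. It assumes each metric is a value in [0, 1], that the mean is unweighted, and that the set of correctly cited evidence items has already been determined by some entailment check (not shown). The function names are hypothetical.

```python
from statistics import mean


def citation_completeness(input_evidence, correctly_cited):
    """Proportion of input evidence items that are correctly cited."""
    evidence = set(input_evidence)
    if not evidence:
        return 0.0  # Assumption: no evidence means nothing can be cited.
    # Only count cited items that actually belong to the input evidence.
    return len(evidence & set(correctly_cited)) / len(evidence)


def overall_score(entailment, citation_correctness, completeness):
    """Unweighted mean of the three metrics (an assumption)."""
    return mean([entailment, citation_correctness, completeness])


# Example: two of the four evidence files are correctly cited.
completeness = citation_completeness(
    ["42_1.json", "42_2.json", "42_3.json", "42_4.json"],
    ["42_1.json", "42_3.json"],
)
score = overall_score(0.8, 0.6, completeness)
```

The entailment score and citation correctness values used in the example (0.8 and 0.6) are made-up placeholders, chosen only to show the arithmetic.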
Organizers
- Dhruv Sahnan, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Tanmoy Chakraborty, Indian Institute of Technology - Delhi, India
- Preslav Nakov, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Contact dhruv.sahnan@mbzuai.ac.ae for any questions.