CheckThat! Lab at CLEF 2025

Task 4: Scientific Web Discourse

Definition

  • Subtask 4a (Scientific Web Discourse Detection): Given a social media post (tweet), detect whether it contains (1) a scientific claim, (2) a reference to a scientific study or publication, or (3) mentions of scientific entities, e.g. a university or a scientist.

  • Subtask 4b (Scientific Claim Source Retrieval): Given an implicit reference to a scientific paper, i.e., a social media post (tweet) that mentions a research publication without a URL, retrieve the mentioned paper from a pool of candidate papers.

Annotation Guidelines

Datasets

Training and test datasets will be made available on GitLab at:

Note: the test datasets will be made available once the evaluation cycle starts.

Subtask 4a Dataset Examples

Tweets:

tweet ID | tweet text | category 1 | category 2 | category 3
100000000001 | McDonald's breakfast stop then the gym 🏀💪 | 0 | 0 | 0
100000000002 | 65% of cats born with blue eyes are deaf. | 1 | 0 | 0
100000000003 | @ user 1. The study indicate that the brain atrophy is caused by stress not by religion | 1 | 1 | 1

Subtask 4b Dataset Examples

Tweets with implicit references:

tweet text | cord ID
Peer-reviewed in the New England Journal of Medicine regarding Delta (B.1.617.2): •Pfizer is ~90% effective •AstraZeneca is ~70% effective. This falls in line with vaccine efficacy of other variants. Yes, the vaccines ARE indeed effective against Delta. | 5g02ykhi
Published in the journal Antiviral Research, the study from Monash University showed that a single dose of Ivermectin could stop the coronavirus growing in cell culture -- effectively eradicating all genetic material of the virus within two days. | ivy95jpw

Publications from the CORD-19 dataset:

cord ID | study title | study date | study venue | authors | study abstract
5g02ykhi | Effectiveness of Covid-19 Vaccines against the B.1.617.2 (Delta) Variant | 21-07-2021 | New England Journal of Medicine | Jamie Lopez Bernal, Nick Andrews, Charlotte Gower, Eileen Gallagher, Ruth Simmons, Simon Thelwall, Julia Stowe, Elise Tessier, [...] | BACKGROUND: The B.1.617.2 (delta) variant of the severe acute respiratory syndrome coronavirus 2 [...]
ivy95jpw | The FDA-approved drug ivermectin inhibits the replication of SARS-CoV-2 in vitro | 03-04-2020 | Antiviral Research | Caly, Leon; Druce, Julian D.; Catton, Mike G.; Jans, David A.; Wagstaff, Kylie M. | Although several clinical trials are now underway to test possible therapies, the worldwide response to the COVID-19 outbreak has been largely limited to monitoring/containment [...]
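
To make the Subtask 4b setup concrete, here is a minimal sketch of how a tweet could be matched against the candidate pool using plain token overlap (Jaccard similarity) on title and venue. This is not the official baseline, which has not been released here. The function names `tokenize` and `rank_papers`, the choice of Jaccard similarity, and the in-memory dictionary layout are all assumptions made for illustration. The paper strings and tweet snippets are shortened from the examples above.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_papers(tweet: str, papers: Dict[str, str], k: int = 5) -> List[str]:
    """Return up to k cord IDs, ordered by Jaccard overlap with the tweet.

    `papers` maps a cord ID to searchable text (hypothetical layout:
    here title plus venue, concatenated).
    """
    query = tokenize(tweet)

    def jaccard(doc: str) -> float:
        doc_tokens = tokenize(doc)
        union = query | doc_tokens
        return len(query & doc_tokens) / len(union) if union else 0.0

    ranked = sorted(papers, key=lambda pid: jaccard(papers[pid]), reverse=True)
    return ranked[:k]


# Candidate pool built from the two example publications above.
papers = {
    "5g02ykhi": "Effectiveness of Covid-19 Vaccines against the B.1.617.2 "
                "(Delta) Variant New England Journal of Medicine",
    "ivy95jpw": "The FDA-approved drug ivermectin inhibits the replication "
                "of SARS-CoV-2 in vitro Antiviral Research",
}

tweet_delta = ("Peer-reviewed in the New England Journal of Medicine "
               "regarding Delta (B.1.617.2)")
tweet_ivermectin = ("Published in the journal Antiviral Research, the study "
                    "from Monash University showed that a single dose of "
                    "Ivermectin could stop the coronavirus growing in cell culture")

print(rank_papers(tweet_delta, papers))
print(rank_papers(tweet_ivermectin, papers))
```

Token overlap is only a starting point: it works on these two examples because the tweets repeat words from the title and venue, and it would likely struggle with paraphrased references.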

Evaluation

  • Subtask 4a: A multilabel classification task that will be evaluated by the macro-averaged F1-score.

  • Subtask 4b: A retrieval task that will be evaluated by Mean Reciprocal Rank at 5 (MRR@5).
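
The two metrics can be sketched in a few lines of Python. This is an illustrative sketch, not the official scorer: the function names `macro_f1` and `mrr_at_k`, the input layouts, and the convention of scoring a category's F1 as 0 when it has no true or predicted positives are assumptions. The gold labels come from the Subtask 4a examples above; the predictions and rankings are hypothetical.

```python
from typing import Dict, List, Sequence


def macro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Compute F1 separately per category column, then take the unweighted mean."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Assumption: a category with no positives at all scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


def mrr_at_k(gold: Dict[str, str], ranked: Dict[str, List[str]], k: int = 5) -> float:
    """Mean of 1/rank of the gold paper within the top k; 0 if it is not there."""
    total = 0.0
    for query_id, gold_id in gold.items():
        top_k = ranked.get(query_id, [])[:k]
        if gold_id in top_k:
            total += 1.0 / (top_k.index(gold_id) + 1)
    return total / len(gold)


# Subtask 4a: gold labels from the example table, hypothetical predictions
# that miss category 2 on the third tweet.
y_true = [[0, 0, 0], [1, 0, 0], [1, 1, 1]]
y_pred = [[0, 0, 0], [1, 0, 0], [1, 0, 1]]
print(macro_f1(y_true, y_pred))

# Subtask 4b: hypothetical query IDs and rankings over the two example papers.
gold = {"tweet_1": "5g02ykhi", "tweet_2": "ivy95jpw"}
ranked = {
    "tweet_1": ["5g02ykhi", "ivy95jpw"],  # gold at rank 1
    "tweet_2": ["5g02ykhi", "ivy95jpw"],  # gold at rank 2
}
print(mrr_at_k(gold, ranked))  # 0.75
```

In the 4a example, categories 1 and 3 each get an F1 of 1 and category 2 gets 0, so the macro average is 2/3. In the 4b example, the reciprocal ranks are 1 and 1/2, giving an MRR@5 of 0.75.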

Submission

Scorer, Format Checker, and Baseline Scripts

Scripts will be made available on GitLab at:

Submission Site

TBA

Submission Guidelines

TBA

Leaderboard

To be announced after the evaluation cycle.

Organizers

  • Dimitar Dimitrov, GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
  • Katarina Boland, Heinrich-Heine-University, Düsseldorf, Germany
  • Konstantin Todorov, LIRMM, CNRS, University of Montpellier, Montpellier, France
  • Salim Hafid, LIRMM, CNRS, University of Montpellier, Montpellier, France
  • Sandra Bringay, LIRMM, CNRS, University of Montpellier, Montpellier, France
  • Sebastian Schellhammer, GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
  • Stefan Dietze, GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
  • Yavuz Selim Kartal, GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany

Contact

For queries, please join the Slack channel

Alternatively, please send an email to: clef-factcheck@googlegroups.com