CheckThat! Lab at CLEF 2026

Task 2: Fact-Checking Numerical Claims

Definition

This task involves verifying naturally occurring claims containing numerical quantities and temporal expressions by improving the reasoning process of Large Language Models (LLMs) through test-time scaling. In contrast to previous editions, which focused solely on fact-checking accuracy, this year’s task integrates rationale generation into the evaluation, assessing both the correctness of the veracity prediction and the quality of the model’s reasoning. Although the claims to be fact-checked are reused from the previous edition, the task setup differs: we propose a test-time scaling setup to improve the performance of LLM reasoning for claim verification. Each claim is provided together with its top-10 retrieved evidence passages and a set of candidate reasoning paths generated from them. Given this data, participants are expected to train a verifier model that ranks the reasoning paths for each test claim and outputs the verdict of the top-ranked path. We will also release new evaluation (test) sets to avoid leakage and to enable rigorous evaluation.
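
As an illustrative sketch of the inference step, the following Python outline shows how a verifier could rank candidate reasoning paths and return the verdict of the top-ranked one. Since the official data format and baseline scripts are still TBA, all names here (ReasoningPath, verifier_score, predict_verdict) are hypothetical, and the word-overlap scorer is only a toy placeholder for a trained verifier model.

    from dataclasses import dataclass

    @dataclass
    class ReasoningPath:
        text: str     # the generated reasoning (rationale) for this candidate path
        verdict: str  # the verdict this path arrives at, e.g. "True" or "False"

    def verifier_score(claim: str, evidence: list[str], path: ReasoningPath) -> float:
        """Toy stand-in for a trained verifier: scores a reasoning path by its
        word overlap with the evidence passages. A real submission would replace
        this with a verifier model trained on the released reasoning-path data."""
        evidence_words = set(" ".join(evidence).lower().split())
        path_words = set(path.text.lower().split())
        return len(path_words & evidence_words) / max(len(path_words), 1)

    def predict_verdict(claim: str, evidence: list[str], paths: list[ReasoningPath]) -> str:
        """Rank all candidate reasoning paths for a claim and return the verdict
        of the top-ranked path, mirroring the task setup described above."""
        best = max(paths, key=lambda p: verifier_score(claim, evidence, p))
        return best.verdict

The key design point is that the LLM’s candidate reasoning paths are fixed inputs: all of the participant’s modeling effort goes into the verifier that selects among them.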

Datasets

TBA

Evaluation

TBA

Submission

Scorer, Format Checker, and Baseline Scripts

TBA

Submission Site

TBA

Submission Guidelines

TBA

Leaderboard

TBA

Organizers

TBA

Contact

TBA