We compared the abstract claims of every matchable bioRxiv preprint with those of its peer-reviewed publication, using a large language model to label each pair at the level of the individual scientific claim.
Interactive versions of every figure in the paper: content change, hedging, fields, claim-type transitions, and the drivers of revision.
Search and filter all 72,644 pairs by author, institution, field, and more. Open any one to see the preprint and published claims side by side with the model's reasoning.
Preprints now disseminate a large share of biomedical research before peer review, and are often regarded as unverified. We compiled every bioRxiv preprint posted between 2018 and 2025 that we could match by DOI to a peer-reviewed version, yielding 72,644 pairs, and used Claude Sonnet 4.6 to parse each abstract pair into one primary and two secondary claims, classifying content change and hedging shift.
Most central claims changed little: 39.9% unchanged, 50.0% minor, 10.2% major. Hedging shifts were uncommon and asymmetric: twice as many claims became more cautious as became more confident. Major revisions were more frequent after long peer review and declined over the study period. Papers never posted as preprints were retracted at roughly twice the rate of those that were. The move from preprint to publication leaves the central claims of most biomedical abstracts intact.
Every bioRxiv preprint posted 2018–2025 matched by DOI to its peer-reviewed publication. English abstracts ≥100 characters; first preprint version only. 72,644 pairs across 3,442 journals and 25 fields.
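The inclusion criteria above can be sketched as a simple filter. This is a minimal illustration, not the study's pipeline: the `PreprintRecord` fields, the `"en"` language code, and the choice to apply the 100-character minimum to both abstracts are assumptions made here for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

MIN_ABSTRACT_CHARS = 100  # threshold stated in the text


@dataclass(frozen=True)
class PreprintRecord:
    """Hypothetical record layout; field names are illustrative only."""
    preprint_doi: str
    version: int                  # 1 = first posted preprint version
    published_doi: Optional[str]  # None if no peer-reviewed match was found
    language: str                 # assumed ISO 639-1 code, e.g. "en"
    preprint_abstract: str
    published_abstract: str


def is_eligible(rec: PreprintRecord) -> bool:
    """Apply the stated criteria: DOI match, English, >=100 chars, first version."""
    return (
        rec.published_doi is not None
        and rec.version == 1
        and rec.language == "en"
        # Assumption: the length minimum applies to both abstracts.
        and len(rec.preprint_abstract) >= MIN_ABSTRACT_CHARS
        and len(rec.published_abstract) >= MIN_ABSTRACT_CHARS
    )


def select_pairs(records: Iterable[PreprintRecord]) -> List[PreprintRecord]:
    """Keep only the records that satisfy every inclusion criterion."""
    return [rec for rec in records if is_eligible(rec)]
```

Keeping each criterion as a separate clause makes it straightforward to count how many records each one excludes.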
Claude Sonnet 4.6 (temperature 0, locked v7.1 codebook) parsed each abstract pair into one primary and two secondary claims, then classified content change and hedging shift.
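One way to picture the output of this step is as a small typed record per abstract pair. The sketch below is an assumption about shape only: the class and field names are hypothetical, the content-change labels follow the three categories reported in the abstract, and the `NO_SHIFT` hedging label is a placeholder for claims whose hedging did not move. The actual v7.1 codebook schema is not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class ContentChange(Enum):
    UNCHANGED = "unchanged"
    MINOR = "minor"
    MAJOR = "major"


class HedgingShift(Enum):
    MORE_CAUTIOUS = "more_cautious"
    NO_SHIFT = "no_shift"  # placeholder label, not taken from the codebook
    MORE_CONFIDENT = "more_confident"


@dataclass(frozen=True)
class ClaimComparison:
    """One claim as stated in the preprint and in the published abstract."""
    role: str  # "primary" or "secondary"
    preprint_claim: str
    published_claim: str
    content_change: ContentChange
    hedging_shift: HedgingShift


@dataclass(frozen=True)
class PairAnnotation:
    """One primary and exactly two secondary claims per abstract pair."""
    primary: ClaimComparison
    secondary: Tuple[ClaimComparison, ClaimComparison]

    def __post_init__(self) -> None:
        # Enforce the one-primary, two-secondary structure described in the text.
        if self.primary.role != "primary":
            raise ValueError("primary claim must have role 'primary'")
        if len(self.secondary) != 2 or any(c.role != "secondary" for c in self.secondary):
            raise ValueError("exactly two claims with role 'secondary' are required")
```

Using enums for the two classifications means an unexpected label from the model fails at parse time instead of being silently counted.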
On 120 stratified pairs, model–expert agreement (κ = 0.63–0.66) matched expert–expert agreement (κ = 0.60); replicate model runs agreed at κ = 0.75.
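For readers unfamiliar with κ, the sketch below computes unweighted Cohen's kappa for two raters over the same items. The text does not state which kappa variant was used, so treating it as unweighted Cohen's kappa is an assumption, and the function name and example labels are illustrative.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example with the content-change labels (not data from the study).
model = ["unchanged", "minor", "major", "minor"]
expert = ["unchanged", "minor", "minor", "minor"]
kappa = cohens_kappa(model, expert)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is the more informative figure when one label (here, "minor") dominates.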