Computing Veracity – the Fourth Challenge of Big Data

PHEME rumour dataset: support, certainty and evidentiality

Along with our recent publication in PLoS ONE, we have released a dataset of social media rumours. This dataset can be found on figshare.

The dataset includes rumour tweets collected and annotated within the journalism use case of the project. These rumours are associated with nine different breaking news stories. The dataset consists of Twitter conversations, each initiated by a rumourous tweet and including the tweets that respond to it. These tweets have been annotated for support, certainty, and evidentiality.

The dataset contains 330 conversational threads (297 in English and 33 in German), with one folder per thread. For more details on the dataset, please refer to the paper.
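
As a rough illustration of the one-folder-per-thread layout, the sketch below lists the thread folders found directly under a directory. The function name `list_thread_folders` and the path in the usage comment are hypothetical, and the sketch assumes the thread folders sit directly under the directory you pass in; the unpacked archive may nest them under further levels, so check the actual layout first.

```python
from pathlib import Path


def list_thread_folders(root):
    """Return the sorted names of the immediate subfolders of `root`.

    Assumption: each immediate subfolder corresponds to one
    conversational thread. Plain files in `root` are ignored.
    """
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


# Hypothetical usage, once the archive has been unpacked:
#   threads = list_thread_folders("path/to/unpacked/threads")
#   print(len(threads), "thread folders found")
```

If the folders are grouped by language, calling the function once per group and summing the counts should give a total that can be compared against the 330 threads reported above.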

