Twitter Moderation Research Consortium

 


 

01.

Overview

Through the Twitter Moderation Research Consortium (“TMRC” or the “Consortium”), Twitter shares large-scale datasets concerning platform moderation issues with a global group of members: public interest researchers from across academia, civil society, NGOs and journalism who study platform governance issues.

 

Through the Consortium, Twitter will continue to support our existing disclosures of datasets of persistent platform manipulation campaigns, which consist of material that was posted in violation of our platform manipulation and spam policy. Over time, we intend to share similarly comprehensive data about other policy areas with the Consortium. 

 

Transparency is core to our mission and has been a critical part of Twitter from the start. In October 2018, we launched the first archive in the industry of potential foreign information operations we had seen on Twitter. The Consortium continues and expands on that access. We’ve designed the Consortium as an industry-leading effort to increase transparency around Twitter’s content moderation policies and enforcement decisions, so credible, public interest researchers can independently investigate, learn, and produce insights that inform the public, policymakers, and other researchers.


Our goal is to provide increased transparency about more issues that impact the health of the platform, while grappling with the considerable safety, security, and integrity challenges in this space. We hope that expanded transparency through disclosures to the Consortium can help us all learn and build the necessary societal defenses and capacities to protect public conversation.

 

 

 

02.

FAQs

Who is eligible to join the Consortium? 

Consortium membership is by application. Our intent is to be inclusive while aiming to ensure the privacy and security of the Consortium’s data, and its ethical and public interest use. The Consortium welcomes applications from researchers of diverse backgrounds and experiences, using varied methodologies, who undertake data-driven analysis related to content moderation.

 

To be an eligible candidate for membership, applicants must demonstrate the following:

 

  • That they hold a primary institutional affiliation with an academic, journalistic, nonprofit, or civil society research organization. If they are students, they must be master’s or PhD level students; undergraduate students are ineligible at this time.
  • Prior experience and relevant skills for data-driven analysis. Consortium datasets are primarily shared as JSON files and require technical skills to analyze. 
  • A specific public interest research use case for the data provided by the Consortium. (“Public interest research use case” means non-commercial research for journalistic, academic, or non-profit/civil society purposes.)
  • Industry-standard plans and systems for safeguarding the privacy and security of the data provided by the Consortium. Consortium members are required to sign a data use agreement.

 

More information on eligibility and a link to the application are at the bottom of this page.

 

What data is shared with the Consortium? 

To start, we are continuing our ongoing disclosures of persistent platform manipulation campaigns and information operations, which are prohibited by Twitter’s platform manipulation and spam policy. (Manipulation that we can reliably attribute to a government or state-linked actor is considered an information operation.) Over time, we intend to share similarly comprehensive data about persistent platform manipulation campaigns that are not attributable to state-backed actors, as well as other content moderation policy areas and enforcement decisions, and we will update this page with more information when we do. The exact data types we share may vary depending on the types of activity in question.

 

Members of the Consortium have access to an archive of information operations datasets starting from 2018. We have attributed these information operations either publicly or internally. Once our teams have identified, removed and investigated these campaigns and any associated violative content, we share datasets with Consortium members. These datasets include profile information, Tweets and media (e.g., images and videos) from accounts we believe are connected to state-linked information operations. Tweets and media that were deleted are not included in the datasets. The data the Consortium has access to is not hashed, unlike the public historic archive. Note that not all of the accounts we identified as connected to these campaigns actively Tweeted, so the number of accounts represented in the datasets may be fewer than the total number of accounts attributed to the information operation and enforced against.

 

Because of their size, all Consortium datasets require members to be able to analyze large datasets.
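As a rough illustration of what working with large JSON files can involve, the sketch below reads records one at a time instead of loading a whole file into memory. It assumes, purely for illustration, a newline-delimited layout (one JSON object per line) and a hypothetical `userid` field; the actual layout and field names of Consortium datasets are not described on this page and may differ.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line (assumed layout)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def tweets_per_account(lines: Iterable[str], user_field: str = "userid") -> Counter:
    """Count records per account while holding only one record in memory."""
    counts: Counter = Counter()
    for record in iter_records(lines):
        # "userid" is a hypothetical field name used for illustration only.
        counts[record.get(user_field, "<missing>")] += 1
    return counts


# A small in-memory sample standing in for a real dataset file.
sample = [
    '{"userid": "a1", "tweet_text": "first"}',
    "",
    '{"userid": "b2", "tweet_text": "second"}',
    '{"userid": "a1", "tweet_text": "third"}',
]
counts = tweets_per_account(sample)
print(counts.most_common(1))
```

With a real file, the same function can be passed an open file handle (for example, `with open(path, encoding="utf-8") as f: tweets_per_account(f)`), since iterating over a file yields it line by line.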

 

How is the publicly accessible information operations archive different from what the Consortium has access to?

Beginning in October 2018, we published the first comprehensive, public archive of data related to state-backed information operations. From that date through early 2022, when we launched the Twitter Moderation Research Consortium, we publicly shared 37 datasets of attributed platform manipulation campaigns originating from 17 countries, spanning more than 200 million Tweets and nine terabytes of media. 

 

With the advent of the Twitter Moderation Research Consortium, we have discontinued public dataset releases, instead focusing on releasing data to the Consortium. The existing archive of information operations datasets continues to be available for download below. While no content has been redacted, some account-specific information has been hashed to protect account privacy.

 

Why is the publicly accessible information operations archive hashed?

For accounts with fewer than 5,000 followers, we hashed certain identifying fields (such as user ID and screen name) in the publicly accessible archive. While we’ve taken precautions to minimize false positives in these datasets, we’ve nevertheless hashed select fields to reduce the potential for negative impact on authentic or compromised accounts, while still enabling longitudinal research, network analysis, and assessment of the underlying content created by these accounts.
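To make the idea concrete, here is a minimal sketch of threshold-based hashing of identifying fields. It is not Twitter's actual method: this page does not say which hash function or key handling is used, so the keyed SHA-256 (HMAC), the field names (`userid`, `user_screen_name`, `follower_count`), and the `pseudonymize` helper are all assumptions made for illustration. Only the 5,000-follower threshold comes from the text above.

```python
import hashlib
import hmac

FOLLOWER_THRESHOLD = 5000  # from the text: accounts with fewer than 5,000 followers
HASHED_FIELDS = ("userid", "user_screen_name")  # hypothetical field names


def pseudonymize(account: dict, key: bytes) -> dict:
    """Return a copy of the record with identifying fields hashed for small accounts."""
    masked = dict(account)
    if masked.get("follower_count", 0) >= FOLLOWER_THRESHOLD:
        return masked  # larger accounts are left unhashed
    for field in HASHED_FIELDS:
        if field in masked:
            value = str(masked[field]).encode("utf-8")
            # Keyed hash: the same input always maps to the same token.
            masked[field] = hmac.new(key, value, hashlib.sha256).hexdigest()
    return masked


key = b"demo-key"  # illustrative only; a real key would be kept secret
small = {"userid": "123", "user_screen_name": "example", "follower_count": 42}
large = {"userid": "456", "user_screen_name": "bigaccount", "follower_count": 10000}

masked_small = pseudonymize(small, key)
masked_again = pseudonymize(small, key)
masked_large = pseudonymize(large, key)
```

Because the mapping is deterministic, every record from the same account carries the same hashed identifier, which is what keeps longitudinal and network analysis possible even when the original user ID and screen name are hidden.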

 

Members of the Consortium are provided access to unhashed versions of these datasets for research. Consortium members agree to the terms of a data license agreement limiting usage of the unhashed datasets to research purposes, with provisions to ensure the researcher may only use the datasets pursuant to specific limitations and in conjunction with appropriate security measures.

 

Where else can I access Twitter data for research purposes?

If you are an academic, check out free academic access to our API for research here. Learn more about general API access here.

 

What can I do if I believe I've been included here in error?

If you believe your account has been included in one of these datasets in error, please log into your Twitter account and file a suspension appeal here for our full review.

 

03.

Download Hashed Information Operations Archive (2018-2022)




You can download the datasets by entering your email address and clicking “Submit”. Your use of the datasets is governed by the Twitter Developer Agreement and Policy. By clicking “Submit”, you agree to the Twitter Developer Agreement and Policy.

 

If you believe your account has been included in one of the datasets in error, please log into your Twitter account and file a suspension appeal here. We carefully review these cases, and may be able to help restore potentially compromised accounts, or accounts that may have been included in error.


     

    04.

    Applying To Join The Consortium


    Thank you for your interest in joining the Twitter Moderation Research Consortium! Please read this full overview before filling out the application linked here and below.

     

    Consortium membership is by application. Our intent is to be inclusive while aiming to ensure the privacy and security of the Consortium’s data, and its ethical and public interest use. The Consortium welcomes applications from researchers of diverse backgrounds and experiences, using varied methodologies, who undertake data-driven analysis related to content moderation.

     

    To be an eligible candidate for membership, applicants must demonstrate the following:

    • That they hold a primary institutional affiliation with an academic, journalistic, nonprofit, or civil society research organization. If they are students, they must be master’s or PhD level students; undergraduate students are ineligible at this time.
    • Prior experience and relevant skills for data-driven analysis. Consortium datasets are primarily shared as JSON files and require technical skills to analyze. 
    • A specific public interest research use case for the data provided by the Consortium. (“Public interest research use case” means non-commercial research for journalistic, academic, or non-profit/civil society purposes.)
    • Industry-standard plans and systems for safeguarding the privacy and security of the data provided by the Consortium. Consortium members are required to sign a data use agreement.

     

    Consortium Ineligibility

     

    Additionally, applicants are ineligible to join the Consortium if they:

    • Are undergraduate students; only master’s or PhD level students are eligible.
    • Hold an industry or government position as their primary institutional affiliation.
    • Do not hold a primary institutional affiliation with an academic, journalistic, nonprofit, or civil society research organization.
    • Plan to share the Consortium’s data with governments or other outside parties.

     

    Application Processing and Review 

     

    Applications will be reviewed by Twitter, and applicants will be notified of acceptances or rejections. Successful applicants will be researchers who have a demonstrable history of independent research or who have met other criteria that demonstrate an ability to be entrusted with the Consortium data and to pursue research for a qualified purpose. Qualified research for purposes of the Consortium is academic, journalistic, nonprofit, or civil society research that aims to better understand content moderation and issues of platform integrity.

     

    Once accepted into the Consortium, Qualified Researchers are provided access to datasets to work independently. Twitter makes no representations about the quality, nature or frequency of the Consortium’s datasets, releases or updates, or about the work or type of qualified research Consortium members pursue. Twitter does not review or participate in the decisions or work product of the Consortium’s Qualified Researchers.

     

    Your decision to complete this application is completely voluntary. By submitting your application you give us permission to use your answers to evaluate your eligibility to become a member of the Consortium. Your individual responses are confidential and your personal information will only be used to evaluate your eligibility to participate. If you wish to withdraw your application after submitting it, please respond to the email you will receive confirming our receipt of your application. You can also contact Twitter by clicking here.

     

    Tips on Filling Out and Submitting the Application

     

    We recommend reviewing the application in full, drafting responses in advance in a separate document, and entering all final responses in the form when you are ready to submit. The more information you share with us, the easier it is for us to review and consider the eligibility of your application.

     

    Please fill out the application form in English.
