Who is eligible to join the Consortium?
Consortium membership is by application. Our intent is to be inclusive while ensuring the privacy and security of the Consortium’s data and its ethical, public-interest use. The Consortium welcomes applications from researchers of diverse backgrounds and experiences, using varied methodologies, who undertake data-driven analysis related to content moderation.
To be an eligible candidate for membership, applicants must demonstrate the following:
- That they hold a primary institutional affiliation with an academic, journalistic, nonprofit, or civil society research organization. If they are students, they must be master’s- or PhD-level students; undergraduate students are ineligible at this time.
- Prior experience and relevant skills for data-driven analysis. Consortium datasets are primarily shared as JSON files and require technical skills to analyze.
- A specific public interest research use case for the data provided by the Consortium. (“Public interest research use case” means non-commercial research for journalistic, academic, or non-profit/civil society purposes.)
- Industry-standard plans and systems for safeguarding the privacy and security of the data provided by the Consortium. Consortium members are required to sign a data use agreement.
More information on eligibility and a link to the application is at the bottom of this page.
What data is shared with the Consortium?
To start, we are continuing our ongoing disclosures of persistent platform manipulation campaigns and information operations, which are prohibited by Twitter’s platform manipulation and spam policy. (Manipulation that we can reliably attribute to a government or state-linked actor is considered an information operation.) Over time, we intend to share similarly comprehensive data about persistent platform manipulation campaigns that are not attributable to state-backed actors, as well as other content moderation policy areas and enforcement decisions – and we will update this page with more information when we do. The exact data types we share may vary depending on the types of activity in question.
Members of the Consortium have access to an archive of information operations datasets dating back to 2018. We have attributed these information operations either publicly or internally. Once our teams have identified, removed, and investigated these campaigns and any associated violative content, we share datasets with Consortium members. These datasets include profile information, Tweets, and media (e.g., images and videos) from accounts we believe are connected to state-linked information operations. Tweets and media that were deleted are not included in the datasets. Unlike the public historic archive, the data the Consortium has access to is not hashed. Note that not all of the accounts we identified as connected to these campaigns actively Tweeted, so the number of accounts represented in the datasets may be smaller than the total number of accounts attributed to the information operation and enforced against.
Because of their size, all Consortium datasets require members to have the skills and tooling to analyze large volumes of data.
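As a minimal sketch of what working with these datasets can involve, the snippet below streams a newline-delimited JSON file record by record, which keeps memory use flat even for very large files. The field names (`user_screen_name`, `tweet_text`) are hypothetical placeholders; consult the documentation shipped with each Consortium dataset for the actual schema.

```python
import io
import json
from collections import Counter

def tweets_per_account(lines):
    """Stream JSON records (one per line) and count Tweets per account.

    Field names here are illustrative assumptions, not the real schema.
    """
    per_account = Counter()
    for line in lines:
        record = json.loads(line)
        per_account[record["user_screen_name"]] += 1
    return per_account

# A tiny in-memory sample standing in for a (much larger) dataset file.
sample = io.StringIO(
    '{"user_screen_name": "acct_a", "tweet_text": "hello"}\n'
    '{"user_screen_name": "acct_a", "tweet_text": "again"}\n'
    '{"user_screen_name": "acct_b", "tweet_text": "hi"}\n'
)
counts = tweets_per_account(sample)
print(counts["acct_a"])  # → 2
```

Streaming line by line (rather than loading the whole file with a single `json.load`) is the standard approach when a dataset is too large to fit in memory.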
How is the publicly accessible information operations archive different from what the Consortium has access to?
Beginning in October 2018, we published the first comprehensive, public archive of data related to state-backed information operations. From that date through early 2022, when we launched the Twitter Moderation Research Consortium, we publicly shared 37 datasets of attributed platform manipulation campaigns originating from 17 countries, spanning more than 200 million Tweets and nine terabytes of media.
With the advent of the Twitter Moderation Research Consortium, we have discontinued public dataset releases, instead focusing on releasing data to the Consortium. The existing archive of information operations datasets continues to be available for download below — while no content has been redacted, some account-specific information has been hashed to protect account privacy.
Why is the publicly accessible information operations archive hashed?
For accounts with fewer than 5,000 followers, we hashed certain identifying fields (such as user ID and screen name) in the publicly accessible archive. While we’ve taken precautions to minimize false positives in these datasets, we’ve nevertheless hashed select fields to reduce the potential for negative impact on authentic or compromised accounts — while still enabling longitudinal research, network analysis, and assessment of the underlying content created by these accounts.
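Hashing preserves analytical utility because the same input always maps to the same digest: a hashed account keeps a stable pseudonym across Tweets and datasets, so longitudinal and network analysis still work even though the real identifier is hidden. The sketch below illustrates the idea with SHA-256; Twitter has not published the exact scheme used for the public archive, so treat this as an assumption, not the actual implementation.

```python
import hashlib

def pseudonymize(identifier: str) -> str:
    """Map an identifier to a stable pseudonym via SHA-256.

    Illustrative only: the hash function actually used for the
    public archive is not documented here.
    """
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()

# The same account always yields the same pseudonym, so researchers can
# link its activity over time; different accounts yield different ones.
a1 = pseudonymize("12345")
a2 = pseudonymize("12345")
b = pseudonymize("67890")
print(a1 == a2, a1 == b)  # → True False
```

Because the digest is deterministic but not reversible, researchers can group, count, and graph hashed accounts without learning who they are.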
Members of the Consortium are provided access to unhashed versions of these datasets for research. Consortium members agree to the terms of a data license agreement limiting use of the unhashed datasets to research purposes, subject to specific limitations and appropriate security measures.
Where else can I access Twitter data for research purposes?
If you are an academic, check out free academic access to our API for research here. Learn more about general API access here.
What can I do if I believe I've been included here in error?
If you believe your account has been included in one of these datasets in error, please log into your Twitter account and file a suspension appeal here for our full review.