

Content moderation is a thorny issue. Does it even work, and which approach is best?

By Dr. Anjana Susarla
Professor of Information Systems
Michigan State University
Introduction
Meta's decision to change its content moderation policies by replacing centralized fact-checking teams with user-generated community labeling has stirred up a storm of reactions. But taken at face value, the changes raise the question of the effectiveness of Meta's old policy, fact-checking, and its new one, community comments.
With billions of people worldwide accessing their services, platforms such as Meta's Facebook and Instagram have a responsibility to ensure that users are not harmed by consumer fraud, hate speech, misinformation or other online ills. Given the scale of this problem, combating online harms is a serious societal challenge. Content moderation plays a role in addressing these online harms.
Moderating content involves three steps. The first is scanning online content, typically social media posts, to detect potentially harmful words or images. The second is assessing whether the flagged content violates the law or the platform's terms of service. The third is intervening in some way. Interventions include removing posts, adding warning labels to posts, and diminishing how much a post can be seen or shared.
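To make these three steps concrete, here is a minimal, hypothetical sketch of such a pipeline in Python. The flagged-term list, the policy check and the interventions are illustrative assumptions, not any platform's actual rules or implementation.

```python
# Minimal, hypothetical sketch of a three-step moderation pipeline.
# The flagged terms, policy rules and interventions are illustrative only.

from dataclasses import dataclass

FLAGGED_TERMS = {"miracle cure", "wire me money"}  # assumed example terms

@dataclass
class Post:
    post_id: str
    text: str

def scan(post: Post) -> bool:
    """Step 1: detect potentially harmful content (here, a simple keyword match)."""
    text = post.text.lower()
    return any(term in text for term in FLAGGED_TERMS)

def assess(post: Post) -> str:
    """Step 2: decide whether flagged content violates law or terms of service.
    A real system would involve human reviewers and far richer policy rules."""
    if "wire me money" in post.text.lower():
        return "violation"      # e.g., likely consumer fraud
    return "borderline"         # harmful-looking but not a clear violation

def intervene(post: Post, verdict: str) -> str:
    """Step 3: act on the assessment: remove, label, or reduce distribution."""
    if verdict == "violation":
        return "remove"
    if verdict == "borderline":
        return "add_warning_label_and_downrank"
    return "no_action"

# Example usage
post = Post("p1", "Miracle cure! Wire me money today.")
if scan(post):
    print(intervene(post, assess(post)))  # -> remove
```

In practice, each step is far more complex: detection combines machine learning with user reports, assessment mixes automated rules with human review, and interventions are tuned to the severity of the violation.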
Content moderation can range from user-driven moderation models on community-based platforms such as Wikipedia to centralized content moderation models such as those used by Instagram. Research shows that both approaches are a mixed bag.
Does Fact-Checking Work?
Meta's previous content moderation policy relied on third-party fact-checking organizations, which brought problematic content to the attention of Meta staff. Meta's U.S. fact-checking organizations were AFP USA, Check Your Fact, Factcheck.org, Lead Stories, PolitiFact, Science Feedback, Reuters Fact Check, TelevisaUnivision, The Dispatch and USA TODAY.
Fact-checking relies on impartial expert review. Research shows that it can reduce the effects of misinformation but is not a cure-all. Fact-checking's effectiveness also depends on whether users perceive fact-checkers, and the organizations they work for, as trustworthy.
Crowdsourced Content Moderation
In his announcement, Meta CEO Mark Zuckerberg highlighted that content moderation at Meta would shift to a community notes model similar to that of X, formerly Twitter. X's Community Notes is a crowdsourced fact-checking approach that allows users to write notes to inform others about potentially misleading posts.
Studies are mixed on the effectiveness of X-style content moderation efforts. A large-scale study found little evidence that the introduction of Community Notes significantly reduced engagement with misleading tweets on X. Rather, it appears that such crowd-based efforts might be too slow to effectively reduce engagement with misinformation in the early and most viral stage of its spread.
There have been some successes from quality certifications and badges on platforms. However, community-provided labels might not be effective in reducing engagement with misinformation, especially when they're not accompanied by appropriate training about labeling for a platform's users. Research also shows that X's Community Notes is subject to partisan bias.
Crowdsourced initiatives such as the community-edited online reference Wikipedia depend on peer feedback and rely on having a robust system of contributors. As I have written before, a Wikipedia-style model needs strong mechanisms of community governance to ensure that individual volunteers follow consistent guidelines when they authenticate and fact-check posts. People could game the system in a coordinated manner and upvote interesting and compelling but unverified content.
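To illustrate why the composition of the rating crowd matters, here is a minimal, hypothetical sketch of note aggregation that surfaces a community note only when raters from more than one assumed viewpoint group find it helpful. This is not X's actual Community Notes scoring algorithm; the groups, ratings and threshold below are made up for illustration.

```python
# Hypothetical sketch of crowd-note aggregation requiring agreement across
# rater groups before a note is shown. Not X's actual scoring algorithm.

from collections import defaultdict

# Each rating: (note_id, rater_group, helpful), where rater_group is an
# assumed coarse label for the rater's typical viewpoint.
ratings = [
    ("note1", "group_a", True),
    ("note1", "group_a", True),
    ("note1", "group_b", True),
    ("note2", "group_a", True),
    ("note2", "group_a", True),
    ("note2", "group_b", False),
]

def notes_to_show(ratings, min_helpful_share=0.6):
    """Show a note only if more than one group rated it and every group
    independently found it helpful (an illustrative threshold)."""
    by_note = defaultdict(lambda: defaultdict(list))
    for note_id, group, helpful in ratings:
        by_note[note_id][group].append(helpful)

    shown = []
    for note_id, groups in by_note.items():
        if len(groups) > 1 and all(
            sum(votes) / len(votes) >= min_helpful_share
            for votes in groups.values()
        ):
            shown.append(note_id)
    return shown

print(notes_to_show(ratings))  # -> ['note1']; note2 lacks cross-group agreement
```

Requiring cross-group agreement is one way to dampen coordinated or partisan upvoting, but it also means notes can take longer to appear on viral posts, which echoes the speed problem noted above.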
Content Moderation and Consumer Harms
A safe and trustworthy online space is akin to a public good, but without motivated people willing to invest effort for the greater common good, the overall user experience could suffer.
Algorithms on social media platforms aim to maximize engagement. However, given that policies that encourage engagement can also result in harm, content moderation also plays a role in consumer safety and product liability.
This aspect of content moderation has implications for businesses that use Meta either for advertising or to connect with their consumers. Content moderation is also a brand safety issue because platforms have to balance keeping the social media environment safe against driving greater engagement.
AI Content Everywhere
Content moderation is likely to be further strained by growing amounts of content generated by artificial intelligence tools. AI detection tools are flawed, and developments in generative AI are challenging people's ability to differentiate between human-generated and AI-generated content.
In January 2023, for example, OpenAI launched a classifier that was supposed to differentiate between texts generated by humans and those generated by AI. However, the company discontinued the tool in July 2023 due to its low accuracy.
There is potential for a flood of inauthentic accounts, or AI bots, that exploit algorithmic and human vulnerabilities to monetize false and harmful content. For example, they could commit fraud and manipulate opinions for economic or political gain.
Generative AI tools such as ChatGPT make it easier to create large volumes of realistic-looking social media profiles and content. AI-generated content primed for engagement can also exhibit significant biases, such as racial and gender bias. In fact, Meta faced a backlash for its own AI-generated profiles, with commentators labeling them "AI-generated slop."
More Than Moderation
Regardless of the type of content moderation, the practice alone is not effective at reducing belief in misinformation or at limiting its spread.
Ultimately, research shows that a combination of fact-checking approaches, in tandem with audits of platforms and partnerships with researchers and citizen activists, is important in ensuring safe and trustworthy community spaces on social media.
Originally published by The Conversation, 01.15.2025, under the terms of a Creative Commons Attribution/No derivatives license.


