Toxic Comments Experience
Overview
From 2016 to 2019, The Coral Project (a collaboration among Mozilla, The New York Times, and The Washington Post) developed open-source tools to improve online community interactions. One of its flagship efforts integrated early machine-learning toxicity detection through Google’s Perspective API, so moderators could review potentially harmful comments more efficiently.
Problem
Online news comment sections were increasingly overwhelmed by harassment, spam, and hate speech. Automated systems like Disqus’s early “toxic filter” often hid or deleted comments outright, removing human judgment from the loop and eroding user trust.
Coral took a different stance: moderation is a conversation, not a purge.
Coral’s Approach
Instead of treating AI as a gatekeeper, Coral used it as an assistant:
- The Perspective API scored comments for potential toxicity.
- Moderators saw a confidence score and context, not a yes/no decision.
- The system surfaced comments for review, allowing moderators to apply human nuance.
- Training loops used moderators’ decisions as feedback to improve accuracy over time, rather than relying on blunt automation.
This approach prioritized transparency, human agency, and contextual review, values still rare in early AI ethics work.
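The score-then-review flow described above can be sketched roughly as follows. This is a minimal illustration, not Coral's actual code: the request and response shapes follow the Perspective API's public `comments:analyze` method, while `REVIEW_THRESHOLD`, the `Triage` record, and every function name are assumptions made up for this example. The key point is that the score only flags a comment for a moderator's attention and never deletes anything on its own.

```python
import json
import urllib.request
from dataclasses import dataclass

# Perspective API endpoint (v1alpha1 comments:analyze method).
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

# Hypothetical threshold chosen for illustration; not Coral's actual setting.
REVIEW_THRESHOLD = 0.8


@dataclass
class Triage:
    """Hypothetical record shown to a moderator alongside the comment."""
    comment: str
    toxicity: float     # model score between 0 and 1
    needs_review: bool  # surfaces the comment for review; never auto-removes it


def build_request(text: str) -> dict:
    """Build the JSON body that asks for a TOXICITY score."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }


def parse_score(response: dict) -> float:
    """Extract the summary TOXICITY score from an API response."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


def fetch_score(text: str, api_key: str) -> float:
    """Call the API over HTTPS. Needs a valid key and network access."""
    request = urllib.request.Request(
        f"{ANALYZE_URL}?key={api_key}",
        data=json.dumps(build_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return parse_score(json.load(resp))


def triage(comment: str, toxicity: float,
           threshold: float = REVIEW_THRESHOLD) -> Triage:
    """Attach the score and a review flag; the final decision stays human."""
    return Triage(comment, toxicity, needs_review=toxicity >= threshold)
```

In use, a moderation queue would call `fetch_score` for each new comment and pass the result to `triage`, then show the moderator both the raw score and the comment in context. Keeping the threshold as a parameter reflects the idea that each newsroom calibrates its own level of trust in the model.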
Designing the Experience
These principles were translated into an interface that:
- Visually distinguished suggestion from enforcement (color-coded signals, not red flags).
- Offered quick moderation actions without removing full context.
- Included inline learning cues, helping moderators understand why a comment was scored a certain way.
- Encouraged trust calibration between human moderators and the algorithm over time.
Impact
- The project became one of the first real-world applications of AI-assisted moderation in journalism.
- It influenced later systems (e.g., Reddit, YouTube, Facebook) that now integrate assistive toxicity models.
- It demonstrated a human-centered philosophy: AI should augment—not replace—editorial judgment.