Moderating With Humans, For Humans

Nov 7, 2025

When I joined Bluesky, one of my first promises was that we would get things wrong.

We're a small and well-intentioned group doing our best to make social media work in less-than-ideal times. We make thousands of moderation decisions every day. Most work out fine—but not all.

That's why I'm writing this series: to pull back the curtain on the inherently imperfect systems we're building. Because we're moderating human language and behavior—and humans are complicated.

Let me start with the foundation. Later posts will dig into the layers that make this more complex: automated systems, appeals, scale.

At its core, moderation is humans making decisions for other humans. It's messy, subjective, deeply human work. I wish I could tell you there's always a "right" answer. The reality? Outside of the worst content—the illegal stuff, the clear-cut violations—we're constantly drawing boundaries on what speech belongs on the app. Why? Because in a sense, the speech we allow in this space also serves to set our identity as an app.

When you browse Bluesky and see something you dislike, find offensive, or think violates our Community Guidelines, you can report it. Here's the thing: most people don't actually read those guidelines. So many reports come in simply because someone disagrees with a post.

That's where most misunderstandings begin—in the gap between what users feel moderation should be and what it can realistically achieve at scale.

Mike Masnick, a member of our Board, has frequently written that content moderation at scale is impossible to do well. When people think about content moderation, they tend to hold three core expectations. Each makes sense on its own. Together, they're impossible to deliver.

First: Perfect accuracy.

Every decision should use full context—the user's intent, history, identity, tone, relationships, the unseen nuance behind each post.

Second: Instant response.

Harmful content should disappear immediately, with the depth of human judgment delivered in seconds.

Third: Values alignment.

Moderation should always reflect each user's personal sense of fairness, justice, or community values, and never contradict it.

These expectations sit at the heart of most public disagreements. When any one fails, people experience it as betrayal rather than limitation.

The assumption behind "perfect accuracy" is that moderators somehow know the full story: who said what, to whom, in what tone, with what history, and in what idiom or deep context.

At scale, that's not possible.

People arrive at Bluesky with a full life, history, set of relationships, and personal identity, and Bluesky doesn’t know about any of it. When you sign up for Bluesky, we get an email address. That's it. We don't collect gender, religion, or political beliefs. We don't try to infer them from your posts. We think respecting privacy in this manner is good. It also can be limiting for content moderators.

So when someone writes "kill yourself" in a reply, we're not making a judgment about the person, their bio, who they're talking to, or whether the recipient knows it’s a joke. We're assessing whether the content violates our terms of service.

This leads to accusations that we're targeting someone's identity—when it's actually the opposite. We don't have that data. And honestly? You shouldn't want us to have it.

The second expectation, instant response, runs into how we started: with only human moderators. No automation. Just people.

Yet users often assume that if a "bad account" exists for a few hours, it's a value judgment. What it actually reflects is operational reality.

During the November 2025 US election, we received 250,000 tickets in a single day. Last week, moderators reviewed 38,000 tickets in a day that *weren't* violations of our Community Guidelines.

Every time someone reports a major newspaper as a "fascist org" (real example), or President Obama as misleading, a moderator has to assess that report against our guidelines and close the ticket. The more low-quality reports we receive, the longer everything takes.

The third area—values alignment—is where we've seen the most friction.

Most people want a values-aligned space. I want Bluesky to be a better place than the other major platforms. The problem arises when users want their personal values to determine who gets to stay versus who leaves, without clear violations of our Community Guidelines. Passionate users are happy to advocate for terminating accounts they disagree with. But without a clear violation, we risk becoming a mirror image of the reason people came to Bluesky in the first place.

As a lighthearted example, I reviewed a report that requested we suspend an account for "bearing false witness against his neighbor, in clear violation of God's Ten Commandments." That may matter deeply to that user, but it's not a violation of our Community Guidelines. Our Community Guidelines are incorporated by reference in our Terms of Service. The Ten Commandments are not.

Every system we build—from Ozone to stackable moderation—is an attempt to improve one of those three axes: accuracy, speed, or alignment.

But we'll never reach perfection on all three simultaneously. That's the trade-off every moderation team in the world faces.

-Aaron

aaron.bsky.team

Trust Issues @aaron.bsky.team