AI NewsNews

Anthropic Fable 5 safeguards also block legitimate cyber work

Published: June 10, 2026Updated: July 10, 2026

Anthropic deliberately set a wide safety margin in Fable 5, reducing misuse risk while increasing false positives in legitimate defensive work.

Anthropic released Fable 5 on June 9, 2026 as a public version of a Mythos-class model with additional safeguards for cybersecurity tasks. Soon after launch, some researchers reported that its classifiers were also blocking legitimate defensive work.

Why do the blocks happen?

Cybersecurity is dual use: the same technique can support an audit or an attack. Anthropic says Fable 5 uses a wider safety margin, deliberately accepting more false positives. That does not mean the model is intended to block all cybersecurity work. The company lists secure coding, debugging, log analysis, threat hunting, and malware reverse engineering among benign activities.

What changed later?

After a temporary suspension in June, the model became globally available again on July 1. Anthropic published a more detailed split between blocked and permitted categories and launched a program for reporting jailbreaks. The launch criticism remains relevant, but the original story needed this later context.

The business takeaway is practical: an operational AI system must be tested not only for misuse, but also for false blocks that stop legitimate work.

Sources: - Anthropic: Fable 5 cyber safeguards - TechCrunch: researcher reaction at launch

Anthropic Fable 5 safeguards also block legitimate cyber work

Instagram account recovery bug may have affected 20,225 people

Gemini Spark beyond the demo: where Google's agent needs control

The first good automation candidate is not always AI