Cybersecurity researchers aren't happy about the guardrails on Anthropic's Fable

Anthropic released its latest model, Fable, on Tuesday, billing it as a public, limited version of its powerful and much-hyped cybersecurity model Mythos.

But not everyone is happy with the restrictions, and a number of cybersecurity researchers and professionals have posted complaints online.

“[Fable] rejects any request that may be tangentially cyber-related. Even innocent tasks like reading a blog post,” said Valentina “Chompie” Palmiotti, a well-known security researcher who works at IBM X-Force.

When a prompt triggers its guardrails, Fable pauses the chat, saying its “security measures flagged this message for cybersecurity or biology topics.”

The guardrails were put in place to limit the risk that Fable could be used to develop malware or compromise software – a long-standing concern within Anthropic. The limitations on biology stem from a similar concern about the development of biological weapons.

When the AI giant released Mythos in April, it restricted the model to a limited number of companies and organizations in what it called Project Glasswing, an effort to use the model to secure critical software and infrastructure. Last week, Anthropic expanded access to Mythos to hundreds of organizations in 15 countries.

But despite the good intentions, many cybersecurity experts are still put off by the haphazard nature of the restrictions. Matt Suiche, a cybersecurity veteran, told TechCrunch that “if you ask it to write secure code, it assumes it’s cybersecurity-related work instead of best practices in software development, and you get downgraded.” Fable is programmed to fall back to Claude Opus 4.8 if it hits a guardrail. “It appears to be keyword-based, so anything in the lexical area of ‘cyber security’ triggers the guardrail.”

Contact us

Do you have more information about how hackers use AI? Or how cybersecurity companies use artificial intelligence? We would love to hear from you. From a non-work device and network, you can contact Lorenzo Franceschi-Bicchierai securely on Signal at +1 917 257 1382, or via Telegram and Keybase @lorenzofb, or by email.

“But it’s understandable since we’re still in the early days and they’re still adapting their guardrails. I’m sure they’ll evolve over time as Anthropic and other frontier model companies will collaborate more with the current new generation of cybersecurity companies,” said Suiche, who is a member of the technical staff at Tolmo, an AI cybersecurity startup. “It’s better to catch more people than not enough when you make a release like that and to relax the guardrail over time.”

Another researcher griped on X that “even asking for a code review” triggers Fable’s guardrails.

Anthropic did not immediately respond to a request for comment.

Aside from the guardrails built into its models, Anthropic requires cybersecurity professionals to apply for the Cyber Verification Program. If approved, applicants will have fewer restrictions on using Claude for cybersecurity work. OpenAI has a similar program called Trusted Access for Cyber.

