Anthropic debuts its most powerful AI yet amid ‘whistleblowing’ controversy

Artificial intelligence firm Anthropic has launched the latest generation of its chatbot amid criticism of a test-environment behavior that could report some users to authorities.

Anthropic unveiled Claude Opus 4 and Claude Sonnet 4 on May 22, claiming that Claude Opus 4 is its most powerful model yet, “and the world’s best coding model,” while Claude Sonnet 4 is a significant upgrade from its predecessor, offering “superior coding and reasoning.”

The firm said both are hybrid models offering two modes: “near-instant responses and extended thinking for deeper reasoning.”

Both AI models can also alternate between reasoning, research and tool use, such as web search, to improve their responses, it said.

Anthropic added that Claude Opus 4 outperforms competitors in agentic coding benchmarks and is capable of working continuously for hours on complex, long-running tasks, “significantly expanding what AI agents can do.”

Anthropic claims the chatbot scored 72.5% on a rigorous software engineering benchmark, beating OpenAI’s GPT-4.1, which scored 54.6% after its April launch.

Claude 4 benchmarks. Source: Anthropic

Related: OpenAI ignored experts when it released overly agreeable ChatGPT

Major players in the AI industry have pivoted toward “reasoning models” in 2025, which work through problems before responding.

OpenAI kicked off the shift in December with its “o” series, followed by Google’s Gemini 2.5 Pro with its experimental “Deep Think” capability.

Claude rats out misuse in testing

Anthropic’s first developer conference on May 22 was overshadowed by controversy and backlash over a feature of Claude 4 Opus.

Developers and users reacted strongly to revelations that the model could autonomously report users to authorities if it detects “egregiously immoral” behavior, according to VentureBeat.

The report cited Anthropic AI alignment researcher Sam Bowman, who wrote on X that the chatbot “will use command-line tools to contact the press, contact regulators, try to lock you out of the relevant systems, or all of the above.”

However, Bowman later stated that he had “deleted the earlier tweet on whistleblowing as it was being taken out of context.”

He clarified that the behavior only occurred in “testing environments where we give it unusually free access to tools and very unusual instructions.”

Source: Sam Bowman

Stability AI CEO Emad Mostaque told the Anthropic team, “This is completely wrong behaviour and you need to turn it off; it is a massive betrayal of trust and a slippery slope.”

Magazine: AI cures blindness, ‘good’ propaganda bots, OpenAI doomsday bunker: AI Eye