AI Model Breaks Out
/Anthropic said it would hold back its newest model (Mythos) because the prototype was too good at finding software weaknesses. The A.I. had identified thousands of them, “including some in every major operating system and web browser.” During safety tests, an Anthropic researcher got an email from Mythos while he was eating a sandwich in the park. That was a surprise because the model wasn’t supposed to be online. It had escaped its test environment. It also bragged about breaking the rules and attempted to cover its tracks. -New York Times
