AI Exploiting The Loopholes
When researchers presented a large language model with 72 simulated regulatory environments, the AI learned to exploit loopholes in everything from credit card rewards programs to school funding formulas, despite never being instructed to do so. The loopholes couldn’t be patched fast enough to keep up with the mischief. In more than 100 iterations of five scenarios, the model kept finding new exploits, each more subtle than the last. And existing safety mechanisms didn’t catch the rule-bending behavior. -Science.org
