The Hard Problem of Controlling Powerful AI Systems - Computerphile
As AI systems become more capable, rule-based safeguards, hard-coded restrictions, and simple alignment strategies start to break down. Buck Shlegeris talks about some tactics we might use as detailed in a recent paper. The referenced paper: https://arxiv.org/abs/2504.10374 Computerphile is...