Control a powerful AI


https://www.youtube.com/watch?v=6unxqr50kqg
Author: Anthropic – Duration: 00:51:28
Anthropic researchers Ethan Perez, Joe Benton, and Akbir Khan discuss AI control, an approach to managing the risks of advanced AI systems. They cover real-world evaluations showing how hard it is for humans to detect deceptive AI, the three main threat models that researchers are working to mitigate, and the overall idea of controlling highly capable AI systems whose goals may differ from ours.

0:00 Introduction
0:33 What is AI control?
2:56 Control evaluations in practice
5:39 Results of evaluations
7:27 Monitoring protocols
13:18 How control differs from alignment
16:09 The alignment faking challenge
23:10 Ensuring evaluations work for future models
26:09 Open questions in control research
34:15 Lessons learned from control
37:14 Why work on control now?
43:26 Key threat models
48:35 Optimistic signs






