It is useful to consider AI safety within the broader context of safety engineering. This field provides fundamental principles for identifying and managing risks, and it draws general lessons from the management of other high-risk systems such as airplanes and nuclear power plants. We discuss principles of safe design, which are crucial for improving a system's safety and controllability, with particular attention to how they apply to AI systems.
Any competent form of risk management needs to consider tail events: events with low probability but high impact. We explore the concepts of tail events and black swans, which are essentially unpredictable unknowns, and consider how these concepts can inform strategies to mitigate unforeseeable risks from AI.
N. G. Leveson, Engineering a Safer World: Systems Thinking Applied to Safety. MIT Press, 2011.
C. Perrow, Normal Accidents: Living with High Risk Technologies. Princeton University Press, 1999.
N. N. Taleb, The Black Swan: The Impact of the Highly Improbable. Random House, 2007.
N. N. Taleb, Antifragile: Things That Gain from Disorder. Random House, 2012.
Citation:
Dan Hendrycks. Introduction to AI Safety, Ethics and Society. Taylor & Francis, 2024. ISBN: 9781032798028. URL: www.aisafetybook.com