Explain the difference between stated, revealed, and idealized preferences.
Stated preferences are those a person explicitly expresses; revealed preferences are inferred from the person's actual choices and behavior; and idealized preferences are those the person would hold with complete information and flawless judgment.
How can we use stated preferences to train AIs? What is one practical limitation of using human feedback to train AI systems?
Humans can directly evaluate and rank AI outputs, and these judgments serve as a training signal for the AI. One practical limitation is that this becomes infeasible as tasks grow too complex for humans to evaluate reliably.
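The idea of turning human rankings into a training signal can be sketched with a Bradley-Terry style preference loss, the common approach in reward modeling from pairwise comparisons. Everything below is a hypothetical toy: the single-feature "reward model," the comparison data, and the learning rate are illustrative assumptions, not a real training pipeline.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(r_pref, r_rej):
    """Bradley-Terry loss: -log P(preferred output outscores rejected one)."""
    return -math.log(sigmoid(r_pref - r_rej))

# Toy reward model: score an output by one weighted feature (hypothetical).
# The weight is fit to pairwise human comparisons by gradient descent.
w = 0.0
comparisons = [
    # (feature of preferred output, feature of rejected output) - toy data
    (1.0, 0.2),
    (0.9, 0.1),
    (0.8, 0.3),
]
lr = 0.5
for _ in range(200):
    for f_pref, f_rej in comparisons:
        p = sigmoid(w * f_pref - w * f_rej)
        # Gradient of the loss w.r.t. w is -(1 - p) * (f_pref - f_rej),
        # so gradient descent pushes preferred outputs toward higher scores.
        w += lr * (1.0 - p) * (f_pref - f_rej)
```

After fitting, outputs resembling those humans preferred receive higher reward scores; this learned reward can then stand in for direct human judgment during further training, which is exactly where the scalability limitation above bites.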
Why might using human preferences alone be insufficient for comprehensive machine ethics?
Preferences may fail to capture important values such as autonomy, must somehow be aggregated across many people, and could still produce unethical outcomes even when fully satisfied.