Explain the difference between stated, revealed, and idealized preferences.
Stated preferences are those a person explicitly expresses; revealed preferences are inferred from the person's actual choices and behavior; and idealized preferences are those the person would hold with complete information and flawless judgment.
How can we use stated preferences to train AIs? What is one practical limitation of using human feedback to train AI systems?
Humans can directly evaluate and rank AI outputs, and these judgments serve as a training signal for the AI. One practical limitation is that this becomes infeasible as tasks grow too complex for humans to evaluate reliably.
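The idea of turning human rankings into a training signal can be sketched with a Bradley-Terry style preference loss, the common approach in reward modeling from pairwise comparisons. Everything below is a hypothetical toy: the single-feature "reward model," the comparison data, and the learning rate are illustrative assumptions, not a real training pipeline.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(r_pref, r_rej):
    """Bradley-Terry loss: -log P(preferred output outscores rejected one)."""
    return -math.log(sigmoid(r_pref - r_rej))

# Toy reward model: score an output by one weighted feature (hypothetical).
# The weight is fit to pairwise human comparisons by gradient descent.
w = 0.0
comparisons = [
    # (feature of preferred output, feature of rejected output) - toy data
    (1.0, 0.2),
    (0.9, 0.1),
    (0.8, 0.3),
]
lr = 0.5
for _ in range(200):
    for f_pref, f_rej in comparisons:
        p = sigmoid(w * f_pref - w * f_rej)
        # Gradient of the loss w.r.t. w is -(1 - p) * (f_pref - f_rej),
        # so gradient descent pushes preferred outputs toward higher scores.
        w += lr * (1.0 - p) * (f_pref - f_rej)
```

After fitting, outputs resembling those humans preferred receive higher reward scores; this learned reward can then stand in for direct human judgment during further training, which is exactly where the scalability limitation above bites.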
Why might using human preferences alone be insufficient for comprehensive machine ethics?
Preferences may fail to capture important values such as autonomy, must somehow be aggregated across many people, and could still produce unethical outcomes even when fully satisfied.