LiveFR

Alignment

Definition: Alignment is the effort to make an AI system act in line with human intentions and values, in a helpful, honest and safe way.

It spans training techniques like RLHF and Constitutional AI, and research on controlling highly capable models. It is a central focus of Anthropic's work.

See also

← Full AI glossary · AI news