What is the difference between guardrails and alignment?

Question

Accepted Answer

Guardrails filter the inputs and outputs of an already-trained model, acting as an external barrier; alignment aims to shape the model's own behavior during training. Guardrails operate at the surface, so they can be bypassed, but they can also be adjusted quickly. Alignment goes deeper, but it is decided upstream, at training time. A robust approach usually combines both as layers of defense.
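To make the "external barrier" idea concrete, here is a minimal sketch of a guardrail wrapped around a model. Everything in it is an assumption for illustration: `echo_model` stands in for a trained model, and `BLOCKED_PATTERNS`, `violates_policy`, and `guarded_generate` are hypothetical names, not part of any real library. A production guardrail would more likely use a classifier or policy service than a regex list.

```python
import re
from typing import Callable

# Hypothetical policy: a tiny regex blocklist, used here only for illustration.
BLOCKED_PATTERNS = [re.compile(r"\bcredit card number\b", re.IGNORECASE)]

REFUSAL = "Sorry, I can't help with that."


def violates_policy(text: str) -> bool:
    """Return True if the text matches any blocked pattern."""
    return any(pattern.search(text) for pattern in BLOCKED_PATTERNS)


def guarded_generate(model: Callable[[str], str], prompt: str) -> str:
    """Call the model with an input check before and an output check after."""
    if violates_policy(prompt):   # input guardrail
        return REFUSAL
    reply = model(prompt)         # the model itself is left unchanged
    if violates_policy(reply):    # output guardrail
        return REFUSAL
    return reply


def echo_model(prompt: str) -> str:
    """Stand-in for a trained model (assumption for this sketch)."""
    return f"You said: {prompt}"


if __name__ == "__main__":
    print(guarded_generate(echo_model, "hello"))
    print(guarded_generate(echo_model, "Give me a credit card number"))
```

Editing `BLOCKED_PATTERNS` changes the guardrail immediately without retraining, which is the "adjust quickly" property. A rephrased prompt that avoids the pattern slips through, which is the "can be bypassed" property. Alignment, by contrast, would change what `model` itself returns.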

