Mixture of Experts
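Definition: A Mixture of Experts (MoE) model splits some of its layers into several sub-networks ('experts') and uses a learned router to activate only a few of them for each input (typically each token). Total parameter count can therefore grow without a proportional increase in compute per input.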
It's a popular architecture for scaling models efficiently and is used by several recent large language models, such as Mixtral 8x7B and DeepSeek-V3.
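
Example: the sketch below illustrates the routing idea from the definition in plain Python. It is a toy under stated assumptions, not how any particular model implements it: the names moe_forward and make_expert are hypothetical, each expert is a simple scaling function standing in for a feed-forward block, and the gates are a softmax over the selected experts' scores, which is one common choice among several.

```python
import math
import random


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    # Hypothetical expert: multiplies every input value by a fixed scale.
    # In a real model this would be a feed-forward sub-network.
    return lambda x: [scale * v for v in x]


def moe_forward(x, router_weights, experts, top_k=2):
    # Router: one score per expert (dot product of the input with that expert's row).
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    # Keep only the top_k highest-scoring experts; the rest are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Assumption: gates are a softmax over the chosen experts' scores only.
    gates = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, i in zip(gates, chosen):
        y = experts[i](x)
        out = [o + gate * v for o, v in zip(out, y)]
    return out, chosen


# Usage: 8 experts exist, but only 2 run for this input.
random.seed(0)
dim, num_experts = 4, 8
experts = [make_expert(float(i + 1)) for i in range(num_experts)]
router_weights = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]
token = [0.5, -1.0, 2.0, 0.1]
output, chosen = moe_forward(token, router_weights, experts, top_k=2)
print("experts used:", chosen)
print("output:", output)
```

With top_k=2 and 8 experts, only a quarter of the expert parameters are touched for this input, which is the source of the compute savings described above.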