Category: Concepts

Token Efficiency

A measure of how many tokens a model consumes, in its input and in its generated output, to produce a result of a given quality. Fewer tokens for the same result means lower compute cost and latency.
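
One way to make this concrete is to count tokens spent per correct answer. The sketch below is only an illustration: the metric, the function name `tokens_per_correct`, and the sample numbers are assumptions, not a standard definition.

```python
def tokens_per_correct(results):
    """Return total tokens spent divided by the number of correct answers.

    `results` is a list of (is_correct, tokens_used) pairs. This metric is a
    hypothetical illustration; lower values mean better token efficiency.
    """
    total_tokens = sum(tokens for _, tokens in results)
    correct = sum(1 for is_correct, _ in results if is_correct)
    if correct == 0:
        # No correct answers: every token was spent without a useful result.
        return float("inf")
    return total_tokens / correct


# Made-up sample data: three responses, two of them correct.
sample = [(True, 120), (False, 300), (True, 180)]
efficiency = tokens_per_correct(sample)
print(efficiency)
```

Tokens spent on wrong answers still count toward the total, so a model that reasons at length and fails is penalized under this measure.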

GRPO

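Group Relative Policy Optimization. A reinforcement learning technique that samples a group of outputs for the same prompt, scores each with a reward, and updates the policy using each output's reward relative to the rest of the group, which removes the need for a separate value (critic) model.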
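
Assuming GRPO here refers to Group Relative Policy Optimization, its central step is turning the rewards of a group of sampled outputs into relative advantages. The sketch below shows only that normalization step, not the full policy update. The function name, the `eps` term, and the use of the population standard deviation are illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar reward per output sampled for the same prompt.
    `eps` (an assumption) guards against division by zero when all rewards
    in the group are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation, assumed here
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four sampled outputs: two judged correct, two not.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Outputs that score above the group average receive a positive advantage and are reinforced, while those below average receive a negative one.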

Dual-Mode Thinking

A strategy that lets a model switch between a fast, heuristic-based response mode and a slower, analytical reasoning mode.