
Judge-Time Compute: When LLM Evaluation Moves from a Single Score to a Composable Pipeline
Judge-time compute means stacking structured, composable weak-model calls at evaluation time instead of assuming that one expensive judge pass is enough. This post covers Verdict, agreement metrics, and production guardrails, with the limits of the evidence called out.