
When Queries Become Whole Blocks of Code: The Split Between RAG Evaluation and Search-Style Benchmarks
Production RAG no longer matches short-query IR leaderboards. BEIR co-author Nandan Thakur explains why search benchmarks and long-context, nugget-level RAG evaluation are diverging into separate axes.