Agent on Neat Guy Coding

Recent content in Agent on Neat Guy Coding
https://neatguycoding.com/tags/agent/
© 2026 NeatGuyCoding
Mon, 18 May 2026 00:00:00 +0000

Agent Oversight Stack: From Static Evaluation to Trajectory-Level Observability
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-patronus-ai-with-anand-kannappan-weaviate-podcast-122/
Mon, 18 May 2026 00:00:00 +0000
Agent oversight stack: from static evaluation to trajectory-level observability. Covers evaluation, observability, and supervision for multi-agent systems, with Percival, Lynx, and Glider, and evidence boundaries called out.

Agent-Agnostic Java Quality Guardrails: Put Standards in the Repo with AGENTS.md and Static Analysis
https://neatguycoding.com/posts/2026-05-18-javaone-2026-agent-agnostic-guardrails-universal-java-code-quality-with-agents-md-and/
Mon, 18 May 2026 00:00:00 +0000
Agent-agnostic Java quality guardrails: use AGENTS.md and static analysis to encode standards in the repository.

Agents on Semi-Structured Retrieval: STaRK Benchmark and AvaTaR Optimization
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-optimizing-retrieval-agents-with-shirley-wu-weaviate-podcast-115/
Mon, 18 May 2026 00:00:00 +0000
Stanford’s STaRK benchmark and AvaTaR contrastive optimization for retrieval agents on semi-structured knowledge bases: metrics, multi-vector limits, when agents lose to dense retrievers, and what to ship in production.

AI-Powered Search: When RAG, Agents, and Classic IR Get Rewired
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-doug-turnbull-and-trey-grainger-on-ai-powered-search-weaviate-podcast-13/
Mon, 18 May 2026 00:00:00 +0000
AI-Powered Search: when RAG, agents, and classic IR get rewired. Retrieval quality vs. agent loops, long context vs. searchable history, leaderboard embeddings vs. domain corpora, with Doug Turnbull and Trey Grainger on what ships.

Data Agents: When Code-Writing Models Meet the Real Data Stack
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-data-agents-with-shreya-shankar-weaviate-podcast-135/
Mon, 18 May 2026 00:00:00 +0000
Data agents across Snowflake, MySQL, Mongo, and Salesforce: DAB benchmarks, DocETL, tribal knowledge, and agent-first databases, with verifiable claims separated from speaker opinion.

Enterprise AI on Exabyte-Scale Unstructured Content: Permissions, Layered Retrieval, and Agent Boundaries
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-box-ai-with-ben-kus-and-bob-van-luijt-weaviate-podcast-120/
Mon, 18 May 2026 00:00:00 +0000
Enterprise AI on exabyte-scale unstructured content: permissions, layered retrieval, and agent boundaries. Engineering lessons from Box × Weaviate on ACL-aware RAG, embedding economics, and production agents.

Enterprise RAG and Agents: From Frankenstein Pipelines to an Optimizable Whole System
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-contextual-ai-with-amanpreet-singh-weaviate-podcast-114/
Mon, 18 May 2026 00:00:00 +0000
Enterprise RAG and agents: from stitched-together pipelines to an end-to-end optimizable system. Covers RAG 2.0, active retrieval, preference learning (KTO/APO), and LMUnit-style evaluation, with evidence boundaries called out.

Enterprise RAG and Agents: When Vector Databases Meet Four Decades of Analytics Software
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-saurabh-mishra-and-bob-van-luijt-on-weaviate-and-sas-weaviate-podcast-12/
Mon, 18 May 2026 00:00:00 +0000
Enterprise RAG and agents when vector databases meet four decades of analytics software: engineering tensions in regulated industries, SAS RAM, Weaviate integration, and production boundaries.

Enterprise RAG on Financial Research Corpora: Engineering Trade-offs in Vector Stores, Agents, and Eval
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-morningstar-intelligence-engine-with-aravind-kesiraju-weaviate-podcast-1/
Mon, 18 May 2026 00:00:00 +0000
Enterprise RAG on financial research corpora: engineering trade-offs across vector stores, agents, and eval (ingestion throughput, retrieval granularity, entitlements, and agent latency).

From 'It Runs' to 'It's Controlled': Reliable Java AI Agents with Domain Modeling and Koog
https://neatguycoding.com/posts/2026-05-18-javaone-2026-reliable-ai-agents-using-domain-modeling-with-koog-in-java/
Mon, 18 May 2026 00:00:00 +0000
Use domain modeling to move Java AI agents from ‘it runs’ to ‘it’s controlled’: orchestration, contracts, and type-safe pipelines with Koog.

From RAG to Search Agents: Three Tensions in Retrieval, Synthetic Data, and Evaluation
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-search-agents-with-nandan-thakur-weaviate-podcast-137/
Mon, 18 May 2026 00:00:00 +0000
From RAG to search agents: BEIR co-author Nandan Thakur on BrowseComp-Plus, synthetic data pipelines, GRPO economics, and why retrieval benchmarks, training cost, and harness design pull in different directions.

Query Agent on a Vector Database: Auditable Retrieval and Two Ways to Ask Your Data
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-weaviate-s-query-agent-with-charles-pierse-weaviate-podcast-128/
Mon, 18 May 2026 00:00:00 +0000
Query Agent on a vector database: auditable retrieval, Ask vs Search modes, schema introspection, multi-collection routing, and what is verified in docs versus speaker claims.

Stateful Agents and Context Compilation: The Engineering Divide from MemGPT to Letta
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-letta-ai-with-sarah-wooders-weaviate-podcast-117/
Mon, 18 May 2026 00:00:00 +0000
Stateful agents and context compilation: how Letta (from MemGPT) treats the context window as a compiled runtime view (memory tiers, agentic RAG, tool-call unification, multi-agent blocks, and observability), with evidence boundaries called out.

Synthetic Data: Boundaries of Data Fabrication in RAG, Agents, and Evaluation
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-synthetic-data-with-david-berenstein-and-ben-burtenshaw-weaviate-podcast/
Mon, 18 May 2026 00:00:00 +0000
Synthetic data for RAG, agents, and offline evaluation: when to augment, how to trust the distribution, and pipelines from distilabel and Persona Hub to Hub SQL and quality filters.

When Format Constraints Hurt LLMs: A Split Between Agent Pipelines and Benchmark Evaluation
https://neatguycoding.com/posts/2026-05-18-weaviate-podcast-let-me-speak-freely-with-zhi-rui-tam-weaviate-podcast-108/
Mon, 18 May 2026 00:00:00 +0000
When format constraints hurt LLMs: the same structured-output techniques often lower scores on reasoning tasks and raise them on discrete classification, from agent pipelines to benchmark evaluation.