<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Sam Latino — writing</title><description>Notes on agent systems, evals, and self-hosted LLM infrastructure. Rust + Python.</description><link>https://samlatino.dev/</link><language>en-us</language><item><title>BM25 beat my vector database (sometimes)</title><link>https://samlatino.dev/writing/bm25-beat-my-vector-database/</link><guid isPermaLink="true">https://samlatino.dev/writing/bm25-beat-my-vector-database/</guid><description>A crossover framework for lexical versus vector retrieval on code — and the adversarial bench harness I built so my own argument can lose.</description><pubDate>Wed, 10 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Every model fails tool calling differently</title><link>https://samlatino.dev/writing/every-model-fails-tool-calling-differently/</link><guid isPermaLink="true">https://samlatino.dev/writing/every-model-fails-tool-calling-differently/</guid><description>Tool calling is the load-bearing primitive of every agent stack, and open models break it in at least eleven distinguishable ways. Naming the failure modes changes how you build the layer above.</description><pubDate>Sat, 06 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Red-teaming my own agents with the OWASP Agentic Top 10</title><link>https://samlatino.dev/writing/red-teaming-my-own-agents/</link><guid isPermaLink="true">https://samlatino.dev/writing/red-teaming-my-own-agents/</guid><description>Turning &quot;resists prompt injection&quot; into a regression number: a deterministic harness, 146 probes across five OWASP agentic categories, and a hardening sweep that went 73% → 3% → 0%.</description><pubDate>Wed, 03 Jun 2026 00:00:00 GMT</pubDate></item></channel></rss>