uses — Sam Latino

Languages

Rust: daemons, CLI tools, anything that needs a static binary and zero runtime surprise
Python: eval harnesses, model-API-adjacent code, anything with a fast iteration loop
TypeScript: Astro, GitHub Actions, one-off scripts

Inference

vLLM: serving open-weight models, tool-call formatting, structured output
LiteLLM: routing layer over vLLM; single key surface for all tooling
Self-hosted GPU server: one machine on the local network; no cloud inference dependency for production workloads

Storage

SQLite (FTS5/BM25): lexical retrieval by default; single-file deploys; zero ops overhead
LanceDB: optional vector tier when semantic recall is genuinely required
PostgreSQL: relational workloads that need constraints and joins

Rust crates (frequent)

axum + tokio: HTTP servers; async runtime
rusqlite: SQLite, including FTS5 virtual tables
criterion: micro-benchmarks; the patchbay route path runs under it
insta: snapshot tests; catches regressions in rendered output
serde / serde_json: everywhere

Python packages (frequent)

pytest + hypothesis: unit tests and property-based testing
httpx: async HTTP client for model APIs and mockservers
jsonschema: conformance checking in callcheck and eval-gate
rich: terminal output in CLI harnesses

Editor and terminal

Neovim: primary editor
Windows Terminal: daily driver shell
Git + GitHub CLI (gh): version control; CI via GitHub Actions

Hardware

One desktop as the daily driver. One GPU server on the local network for model inference — Qwen models, served via vLLM. The GPU server handles all production inference; nothing goes to a cloud provider during a working session.

The machines sit on a private mesh network; SSH is the only management interface, and no web dashboards are exposed.

What I don't use

Cloud LLM APIs during development. The cost model and the latency model both change if you can assume the model is local, and I'd rather build to that assumption than paper over the seams. The tradeoffs show up in patchbay's privacy-routing design and in millstone's case against an embedding dependency in the indexing path.