sam@latino:~$

sam@latino:~$ cat uses.md

uses

The stack I reach for. No affiliate links; no sponsored placement.


Languages

Rust
daemons, CLI tools, anything that needs a static binary and zero runtime surprise
Python
eval harnesses, model-API-adjacent code, anything with a fast iteration loop
TypeScript
Astro, GitHub Actions, one-off scripts

Inference

vLLM
serving open-weight models, tool-call formatting, structured output
LiteLLM
routing layer over vLLM; single key surface for all tooling
Self-hosted GPU server
one machine on the local network; no cloud inference dependency for production workloads

Storage

SQLite (FTS5/BM25)
lexical retrieval by default; single-file deploys; zero ops overhead
LanceDB
optional vector tier when semantic recall is genuinely required
PostgreSQL
relational workloads that need constraints and joins

Rust crates (frequent)

axum + tokio
HTTP servers; async runtime
rusqlite
SQLite, including FTS5 virtual tables
criterion
micro-benchmarks; the patchbay route path runs under it
insta
snapshot tests; catches regressions in rendered output
serde / serde_json
everywhere

Python packages (frequent)

pytest + hypothesis
unit tests and property-based testing
httpx
async HTTP client for model APIs and mockservers
jsonschema
conformance checking in callcheck and eval-gate
rich
terminal output in CLI harnesses

Editor and terminal

Neovim
primary editor
Windows Terminal
daily driver shell
Git + GitHub CLI (gh)
version control; CI via GitHub Actions

Hardware

One desktop as the daily driver. One GPU server on the local network for model inference — Qwen models, served via vLLM. The GPU server handles all production inference; nothing goes to a cloud provider during a working session.

The machines sit on a private mesh network; SSH is the only management interface, and no web dashboards are exposed.

What I don't use

Cloud LLM APIs during development. The cost model and the latency model both change if you can assume the model is local, and I'd rather build to that assumption than paper over the seams. The tradeoffs show up in patchbay's privacy-routing design and in millstone's case against an embedding dependency in the indexing path.