Tagged: LLM
12 posts

June 25, 2026 · 3 min read · blog
AI agents make bad decisions for the same four reasons people do. The WRAP framework, encoded as a system-prompt gate, fixes what a bigger model can't.

June 16, 2026 · 3 min read · blog
Your AI agent reaches for googled best practices before your own proven fixes. Wire a trust order into your CLAUDE.md and agent loop instead.

June 3, 2026 · 4 min read · blog
Agents forget, and the good ones are expensive. The fix is not a better model. Put the goal in deterministic scripts and run a cheap model against them.

May 29, 2026 · 3 min read · blog
Opus 4.8 shipped May 28. For unattended cron agents, the upgrades that matter are not the benchmark scores. A use-case breakdown from real builds.

May 12, 2026 · 4 min read · blog
Context engineering sounds new. It is the file-naming hygiene developers always had, load-bearing now because LLMs read what you point them at.

April 21, 2026 · 5 min read · case-studies
Self-hosted German PII redactor for SAP prod→dev copies. Plugs in after TDMS/Delphix/Informatica to cover free-text NOTES columns, unclassified Z-tables, and OCR'd attachments. GDPR-compliant (DSGVO), Apache 2.0, runs on a single consumer GPU.

April 17, 2026 · 13 min read · guides
How to choose an LLM for production workloads. 7 selection criteria, a decision tree, an evaluation process, and a requirements checklist from real deployments. Download the free AI Automation Checklist.

April 16, 2026 · 15 min read · guides
Self-hosted LLM vs API cost analysis with break-even math. When to self-host, when to stay on Claude, and the hybrid pattern most production teams actually use.

April 15, 2026 · 18 min read · guides
Feature matrix, pricing, reliability and EU hosting across major LLM APIs. Where Anthropic, OpenAI and Google win, and what to pick for production.

April 11, 2026 · 12 min read · guides
LLM API cost comparison for 2026. Model your real workload costs with prompt caching, output tokens, reasoning, and batch API factored in.

April 5, 2026 · 16 min read · blog
Complete self-hosted LLM Kubernetes guide. Deploy vLLM on GPU nodes with manifests, HPA, monitoring, and cost modeling. Practitioner notes included.

April 1, 2026 · 16 min read · blog
End-to-end RAG pipeline tutorial. Qdrant + Claude Sonnet 4.6 + local embeddings. Real code for chunking, retrieval, augmentation, and citation-grounded answers.