Open Source Just Passed Frontier: What GLM-5.1 Means for Builders
GLM-5.1 topped SWE-Bench Pro under MIT license. Gemma 4 shipped edge variants under Apache 2.0. Open-source model parity is here — with caveats builders need to understand.
Amazon, Meta, Google, and Microsoft are spending $650B on AI infrastructure. Goldman Sachs says the GDP contribution is basically zero. Here's what the data actually shows.
The technical companion to Foliome's release. How the agent builds its own bank integrations, orchestrates parallel sync with MFA routing, recovers from errors without human intervention, and serves a dashboard from your local machine.
I haven't logged into a bank app in three months. An open-source financial operating system where your AI agent syncs your banks, classifies spending, and gives you complete financial intelligence. Everything local. Everything yours.
Financial markets are reflexive. AI models degrade the patterns they exploit. The CFA Institute says augmentation, not automation.
OpenAI is betting on agents, Anthropic on coding, Google on multimodal. Here's the evidence, the strategy layer, and what to do if you're building in between.
DPO fine-tuning a 1.2B model with LoRA improved style but degraded reasoning after 400 samples — accuracy dropped from 39/40 to 34/40, and the base model won.
How I route between two LFM2.5-1.2B models on a Raspberry Pi 5 — when to use Instruct, when to use Thinking, and why user-controlled toggles beat automatic detection.
Airbnb doesn't run OpenAI in production. Cursor built its own Tab model. The shift to fine-tuned small models is real — here's the decision framework for builders.
Building agents with a single shell execution tool reduces formatting errors and improves autonomy by leveraging LLM training data instead of brittle JSON schemas.
M5 Max hardware enables 120B parameter models to run locally at 65.8 tokens per second, shifting developer focus from public open-source projects to private, hyper-personalized dark tools for maximum leverage.
Every solo builder is sharing their AI stack. Nobody explains why they chose it. The difference between a stack and a dependency trap.
WEF predicted 85M AI job losses by 2025. Actual US figure: 200K-300K. Oil, tariffs, and DOGE explain more than AI does.
AI displaced 200K-300K jobs while employment grew 2.5%. The opportunity isn't learning to code. It's solving a $1,000 problem in your domain with AI tools.
Build a secure text-to-data portal using Snowflake Cortex and Llama 3.1 to translate plain English into Pandas code while maintaining a strict Python sandbox.
Reduce Gemini agent shortcuts and memory loss by replacing flat tool lists with a file-system Skill Graph for structured, stateful workflows.
Anthropic's own labor data shows AI isn't eliminating white-collar jobs yet — it's quietly shrinking entry-level hiring. Here's what that means if you're making the move into tech right now.
Recursive Language Models replace massive context windows with a programmatic loop, letting small local models process infinitely long prompts.
I took the voice transcription pipeline out of my note app and shipped it as an Obsidian plugin. No platform. No local server. Just the slice that worked.
Why single prompt files fail for AI agents—and how failure-driven codification built a 108,000-line C# system with 19 specialized agents.
How to manage a 24% knowledge-to-code ratio using a 3-tier context architecture—Hot, Warm, and Cold storage for AI agents.
How a 24% knowledge-to-code ratio prevents AI hallucinations and architectural rot in 100k+ line repositories.
Stop using 2,000-word system prompts. Learn how Tiered Context Architecture, Style Injection, and the Observer Pattern optimize AI agent performance and cost.
An evaluation of the OpenClaw gateway, the 'soul.md' ecosystem, and why separating Skills from Souls is the key to reliable agents.
RLMs let small models handle inputs 100x beyond their context window by storing prompts in a REPL and recursively processing slices. Here's how it works.
Parameter counts are a vanity metric. How to read LLM architectures, what active parameters mean for your hardware, and the benchmarks that actually matter.
A strategic blueprint for the modern AI stack: moving from simple LLM wrappers to model routing, multi-agent orchestration, and local-first memory.
ChatGPT Ads target conversation context, not keywords. Here's what OpenAI confirmed, what's speculated, and how GEO content prepares you for both.
Obsidian doesn't have custom blocks, so I faked it with a code fence, a hidden div, and two features that actually make it useful.
Large context windows introduce massive KV cache costs and latency. GraphRAG provides better retrieval accuracy for complex datasets at a fraction of the compute expense.
An RTX 5090 cluster breaks even against DeepSeek API costs at 450M tokens of inference — a 14-month payback period for high-volume developers.
Distill DeepSeek-R1 reasoning into small models for $0.50 using logit-based probability mapping. A high-signal alternative to standard fine-tuning.
Comparing Qwen 3.5-2B and Liquid LFM2.5-1.2B on Raspberry Pi 5: benchmarks for GPQA intelligence, RAM usage, and time-to-first-token latency.
Voice transcription in 2026: who ships what (Monologue, Wispr Flow, Hey Lemon), what builders use (Whisper, AssemblyAI), and why fragmentation and lock-in are the real problem.
How to build a private, fully local voice assistant on Raspberry Pi 5 using Python and Whisper for on-device speech processing and LLM inference.
Actual traffic and conversion data for Product Hunt, Hacker News, and AppSumo — including the 10% featured rate and specific ROI math for indie makers.
Optimizing llama.cpp for LFM2.5-1.2B on a Raspberry Pi 5. Recommended settings for quantization, threads, and KV cache to maximize local LLM performance.
Build a private AI voice assistant on Raspberry Pi 5 with Whisper and Liquid LFM2.5. Includes memory budgets, hardware setup, and Python code.
Embedding deep research in a note app with Exa.ai. I built a 5-phase enrichment pipeline, then simplified to Exa with a light review step.
AI search engines don't rank pages — they cite sources. Here's what actually works for getting cited by ChatGPT, Perplexity, Claude, and Google AI Overview.
GPT-OSS-120b uses OpenAI's Harmony token protocol — analysis, commentary, and final channels. Here's how we handle CoT leakage and provider routing.
The best work happens when conversation stays continuous. Document-localized chat gives the workspace assistant richer context — here's how we architected it.
Benchmarking Liquid AI's LFM2.5-1.2B against LFM2-2.6B on a Pi 5 — the smaller model scores higher on IFEval (+9), runs 2.3x faster, and fits in under 1GB.
We built a Reasoner-Planner-Solver pipeline for a Pi 5 voice assistant, then replaced it with Instruct/Thinking model routing. Here's why simpler won.
Replacing Exa AI with self-hosted SearXNG and trafilatura on a Raspberry Pi — fully local web search in 2-3 seconds, no API keys, no data leaving the device.
Three tricks that took a 1.2B model's tool routing from 78% to 97% — renaming tools, adding a calibration line, and a regex post-processor.
One 1.2B model plays Reasoner, Planner, and Solver with different system prompts on a Raspberry Pi 5. Three LLM calls, 15-30 seconds for a reasoning task.
I kept adding regex patterns to fix edge cases until I realized I was overfitting a rule-based system — here's the framework for knowing when to stop.
How I got multi-step reasoning working on a Raspberry Pi 5 with a 1.2B model — no thinking model needed. ReWOO cuts it to 2 LLM calls and 80% fewer tokens.
Stop writing massive Python files to tell your AI agents what they can do. We need a portable, plain-text standard for agent capabilities.
We want AI to work autonomously, but letting it write its own playbook creates a new kind of technical debt.