Blog

April 15, 2026 · 13 min read

Open Source Just Passed Frontier: What GLM-5.1 Means for Builders

GLM-5.1 topped SWE-Bench Pro under an MIT license. Gemma 4 shipped edge variants under Apache 2.0. Open-source model parity is here, with caveats builders need to understand.

#AI#Local#open-source#glm-5#gemma-4#swe-bench#Benchmarks#Edge-AI
April 14, 2026 · 12 min read

The $650B Zero-ROI Disconnect: AI's Biggest Bet vs the Data

Amazon, Meta, Google, and Microsoft are spending $650B on AI infrastructure. Goldman Sachs says the GDP contribution is basically zero. Here's what the data actually shows.

#AI#Architecture#ROI#Fine-Tuning#Benchmarks
April 14, 2026 · 11 min read

Inside Foliome: How the Agent Actually Works

The technical companion to Foliome's release. How the agent builds its own bank integrations, orchestrates parallel sync with MFA routing, recovers from errors without human intervention, and serves a dashboard from your local machine.

#AI#Open Source#Claude Code#Playwright#Architecture#Automation
April 13, 2026 · 8 min read

Foliome: Your Money, Your Machine, Your Agent

I haven't logged into a bank app in three months. An open-source financial operating system where your AI agent syncs your banks, classifies spending, and gives you complete financial intelligence. Everything local. Everything yours.

#AI#Open Source#Personal Finance#Claude Code#Automation
April 9, 2026 · 8 min read

AI Platform Divergence: What It Means for Builders

OpenAI is betting on agents, Anthropic on coding, Google on multimodal. Here's the evidence, the strategy layer, and what to do if you're building in between.

#AI#Architecture
April 3, 2026 · 6 min read

Multi-Model Routing: Instruct vs Thinking on Edge Devices

How I route between two LFM2.5-1.2B models on a Raspberry Pi 5 — when to use Instruct, when to use Thinking, and why user-controlled toggles beat automatic detection.

#AI#Architecture#Local#Raspberry-Pi#Model-Routing#Edge-AI
April 1, 2026 · 6 min read

The Frontier Model Tax

Airbnb doesn't run OpenAI in production. Cursor built its own Tab model. The shift to fine-tuned small models is real — here's the decision framework for builders.

#AI#Fine-Tuning#Local#Architecture#Tutorial
March 17, 2026 · 6 min read

The Entry Point Is Closing

Anthropic's own labor data shows AI isn't eliminating white-collar jobs yet — it's quietly shrinking entry-level hiring. Here's what that means if you're making the move into tech right now.

#AI#Career#Productivity#Strategy
March 12, 2026 · 6 min read

Extracting an Obsidian Plugin From a Note App

I took the voice transcription pipeline out of my note app and shipped it as an Obsidian plugin. No platform. No local server. Just the slice that worked.

#AI#Architecture#Obsidian#Plugin#Speech-to-Text
March 2, 2026 · 11 min read

The LLM Parameter Lie: What Actually Matters in 2026

Parameter counts are a vanity metric. How to read LLM architectures, what active parameters mean for your hardware, and the benchmarks that actually matter.

#AI#Architecture#Benchmarks#LLM#MoE#Inference
February 25, 2026 · 8 min read

Custom Blocks in Obsidian: The Scribe Block

Obsidian doesn't have custom blocks, so I faked it with a code fence, a hidden div, and two features that actually make it useful.

#AI#Tutorial#Obsidian#Plugin#Speech-to-Text
February 24, 2026 · 5 min read

Lazy Engineering: Why GraphRAG Beats 1M Token Context Windows

Large context windows introduce massive KV cache costs and latency. GraphRAG provides better retrieval accuracy for complex datasets at a fraction of the compute expense.

#AI#Architecture#GraphRAG#Long-Context#KV-Cache#RAG-vs-Long-Context
February 23, 2026 · 8 min read

Voice Transcription Market: Current State and the Missing Piece

Voice transcription in 2026: who ships what (Monologue, Wispr Flow, Hey Lemon), what builders use (Whisper, AssemblyAI), and why fragmentation and lock-in are the real problem.

#AI#Voice#Whisper#Speech-to-Text#Dictation#Wispr-Flow#Monologue
February 18, 2026 · 13 min read

Launch Platform ROI: Data on Product Hunt, HN, and AppSumo

Actual traffic and conversion data for Product Hunt, Hacker News, and AppSumo — including the 10% featured rate and specific ROI math for indie makers.

#Marketing#Launch#Product-Hunt#Hacker-News#AppSumo#Indie
February 18, 2026 · 7 min read

LFM2.5-1.2B on Raspberry Pi 5: llama.cpp Optimization Guide

Optimizing llama.cpp for LFM2.5-1.2B on a Raspberry Pi 5. Recommended settings for quantization, threads, and KV cache to maximize local LLM performance.

#AI#Tutorial#Local#Raspberry-Pi#llama.cpp#LFM2.5#Edge-AI
February 16, 2026 · 6 min read

Deep Research: Exa.ai vs Other Providers

Embedding deep research in a note app with Exa.ai. I built a 5-phase enrichment pipeline, then simplified to Exa with a light review step.

#AI#Architecture#Deep-Research#Exa#RAG
February 16, 2026 · 11 min read

GEO: The New Rules of Search Visibility

AI search engines don't rank pages — they cite sources. Here's what actually works for getting cited by ChatGPT, Perplexity, Claude, and Google AI Overview.

#AI#SEO#GEO#Tutorial#ChatGPT#Perplexity
February 15, 2026 · 7 min read

Workspace vs Document Assistant: Why You Need Both

The best work happens when conversation stays continuous. Document-localized chat gives the workspace assistant richer context — here's how we architected it.

#AI#Architecture#RAG
February 13, 2026 · 7 min read

LFM2.5-1.2B vs LFM2-2.6B: Why We Chose the Smaller Model

Benchmarking Liquid AI's LFM2.5-1.2B against LFM2-2.6B on a Pi 5 — the smaller model scores higher on IFEval (+9), runs 2.3x faster, and fits in under 1GB.

#AI#Local#Raspberry-Pi#Benchmarks#Liquid-AI
February 11, 2026 · 4 min read

Why We Stopped Forcing One Model to Do Everything

We built a Reasoner-Planner-Solver pipeline for a Pi 5 voice assistant, then replaced it with Instruct/Thinking model routing. Here's why simpler won.

#AI#Architecture#Local#Raspberry-Pi#Model-Routing
February 4, 2026 · 5 min read

Ditching Exa AI for Self-Hosted Search

Replacing Exa AI with self-hosted SearXNG and trafilatura on a Raspberry Pi — fully local web search in 2-3 seconds, no API keys, no data leaving the device.

#AI#Local#Tutorial#Raspberry-Pi#Self-Hosted#SearXNG
February 2, 2026 · 6 min read

From 78% to 97%: Making a 1.2B Model Actually Work

Three tricks that took a 1.2B model's tool routing from 78% to 97% — renaming tools, adding a calibration line, and a regex post-processor.

#AI#Local#Raspberry-Pi#Prompt-Engineering#Tool-Calling
January 31, 2026 · 5 min read

Thinking Without Thinking Models

How I got multi-step reasoning working on a Raspberry Pi 5 with a 1.2B model, no thinking model needed. ReWOO cuts it to 2 LLM calls and uses 80% fewer tokens.

#AI#Local#Architecture#Raspberry-Pi#ReWOO