Blog

Thoughts, research, and insights on technology.


samkhya v1.0: Plug Claude, GPT-4o-mini, or Local Ollama Into Your SQL Query Optimizer

samkhya v1.0 ships an LLM-pluggable corrector backend for embedded analytical engines — DataFusion, DuckDB, Polars, Postgres, Iceberg, gpudb. Plug Claude, GPT-4o-mini, or local Ollama into the cardinality-estimation slot via a simple HTTP wire contract (Python FastAPI and Node TypeScript reference servers ship in the box). Every LLM output is clamped from above by a provable pessimistic ceiling (LpJoinBound — 40.95× tighter than the 2008 AGM bound) so the LLM can never make your plan worse than the engine's native estimate. Transport-floor latency measured at P95 0.07–0.11 ms; live-LLM end-to-end cells honestly marked PROJECTED pending API budget.

May 17, 2026 · 16 min read

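The clamping idea in the teaser above can be sketched in a few lines. This is a minimal illustration, not samkhya's actual API: the function name `clamp_estimate`, its parameters, and the fallback to the native estimate on unusable LLM output are all assumptions made for the example. The ceiling value stands in for whatever LpJoinBound would compute.

```python
import math


def clamp_estimate(llm_estimate, ceiling, native_estimate):
    """Return a row-count estimate that never exceeds `ceiling`.

    Hypothetical helper: if the LLM output is missing, non-finite, or
    negative, fall back to the engine's native estimate (an assumption
    of this sketch), then apply the same upper clamp.
    """
    candidate = llm_estimate
    if candidate is None or not math.isfinite(candidate) or candidate < 0:
        candidate = native_estimate
    # The pessimistic ceiling is an upper bound, so clamp from above only.
    return min(candidate, ceiling)


if __name__ == "__main__":
    # An overshooting LLM guess is cut down to the ceiling.
    print(clamp_estimate(1e9, 5000.0, 1200.0))
    # A guess below the ceiling passes through unchanged.
    print(clamp_estimate(800.0, 5000.0, 1200.0))
```

Clamping only from above reflects the teaser's wording: the ceiling limits how far an overestimate can go, and says nothing about underestimates.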

AI & Machine Learning · 7 min read

On-Device AI Just Got Real

For three years, on-device AI was a demo that almost worked. In June 2026 it stopped being one. Sparse models like Apple's AFM 3 and Google's Gemma 4 made intelligence large in flash, small in motion, free to run, and offline by default.

Jun 27, 2026

Opinion · 10 min read

The Coding-Agent Arms Race: Who Survives the H1-2026 Shakeout

In six months, AI coding agents went from features to a brutal platform war: $26B startups, a new frontier model every six weeks, pricing whiplash, and a reverse-acquihire that gutted a unicorn. The agent you build on is now a strategic bet.

Jun 20, 2026

Data Engineering · 10 min read

Streaming OLAP: The Post-Kafka Stack for Real-Time Analytics

The Kafka + Flink + ClickHouse/Pinot/Druid stack we built between 2018 and 2024 is fragmenting into three forks: single-engine streaming SQL, table-format-as-stream, and OLAP databases that eat the streaming layer entirely. Kafka isn't dying — it's becoming plumbing.

Jun 13, 2026

Data Engineering · 6 min read

What I Learned Writing GPU Kernels for SQL Aggregates

Three months, two abandoned designs, one breakthrough. The one-paragraph version: Apple Silicon GPUs don't have 64-bit atomic_fetch_add until very recent OS versions, and that single missing instruction shapes every other architectural decision in a Metal SQL aggregate engine.

Jun 6, 2026

Data Engineering · 5 min read

Multi-Aggregate Fusion: One Read, Four Answers

Every analytical engine treats SELECT SUM(x), MIN(x), MAX(x), COUNT(x) FROM t as four passes over the column. Fuse them into a single kernel and the speedup over four-pass code is 9x to 25x. Here's why the technique works, and the data shape where it doesn't.

May 30, 2026
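A minimal sketch of the fusion idea from the teaser above, in plain Python rather than a GPU kernel. The function names `four_pass` and `fused` are illustrative, and the sketch assumes a non-empty column with no NULLs. It shows only the structure (one read of each value feeding four accumulators) and is not expected to reproduce the quoted speedups.

```python
def four_pass(xs):
    """Baseline: each aggregate walks the column separately."""
    return sum(xs), min(xs), max(xs), len(xs)


def fused(xs):
    """Single pass: every value is read once and updates all four aggregates."""
    it = iter(xs)
    try:
        first = next(it)
    except StopIteration:
        raise ValueError("empty column") from None
    total = lo = hi = first
    count = 1
    for x in it:
        total += x
        if x < lo:
            lo = x
        elif x > hi:
            hi = x
        count += 1
    return total, lo, hi, count


if __name__ == "__main__":
    column = [3, 1, 4, 1, 5]
    # Both versions must agree; only the number of reads differs.
    assert fused(column) == four_pass(column)
    print(fused(column))
```

The fused version touches memory once, which is where a bandwidth-bound kernel would recover its time.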

Data Engineering · 6 min read

Apple Silicon's Unified Memory Is the Quiet Revolution in Analytical Compute

M3 Ultra ships 512 GB of memory at 819 GB/s, addressable by the GPU with zero PCIe transfer cost. Every GPU database project from the past decade was architected around the assumption that memory bandwidth came at PCIe-tax prices. That assumption is now wrong on a fifth of the developer laptops in the world.

May 23, 2026

Data Engineering · 27 min read

Why I built a GPU SQL engine in 2026 — when every other one died

Every standalone GPU database built between 2013 and 2024 was acqui-hired or pivoted. So why ship gpudb in 2026? Because nobody had wired Apple Silicon's unified memory into a SQL engine — and DuckDB hands you a hundred-thousand-user distribution channel without writing a database from scratch.

May 9, 2026

Opinion · 4 min read

The Cost of Being Right: AI-Generated Code at Production Scale

Generating code with AI is cheap. Reviewing, testing, and deploying it isn't. The new bottleneck for engineering teams isn't writing the code — it's trusting it.

May 1, 2026

Data Engineering · 5 min read

Databricks vs Snowflake vs The New Wave: The Data Engineering Paradigm Shift

Snowflake just posted $4.68B in FY26 revenue at 29% growth. Databricks crossed $5.4B ARR in February at 65% growth. And neither chart explains why the most interesting data infrastructure being shipped in 2026 is single-process, embeddable, and runs on a laptop.

Apr 27, 2026

AI & Machine Learning · 4 min read

Why Small Models Are Eating Their Teachers

In 2024, you needed a 70B model to get good answers. In 2026, a 7B model trained on the right data beats it on most real-world tasks. The mechanism isn't a secret — it's distillation done well, and it's reshaping the entire model economy.

Apr 25, 2026

Opinion · 3 min read

The Vibe Coding Economy: When 'Make This' Is the Spec

The phrase 'vibe coding' was a joke in 2024. By 2026, it describes how a non-trivial fraction of new software is actually built. The economics of this shift are stranger and more durable than the meme suggested.

Apr 18, 2026

AI & Machine Learning · 13 min read

OpenClaw vs. Anthropic: The Week the Subscription Era Ended for AI Agents

In April 2026, Anthropic blocked the year's most viral open-source agent framework from spending Claude subscriptions. The fight wasn't really about one developer in Vienna — it was about an unspoken truth the labs had been dodging for a year: chat is the past, agents are the interface, and subscriptions cannot price what comes next.

Apr 15, 2026

AI & Machine Learning · 3 min read

A Million Tokens, A Thousand Disappointments

Every frontier model now claims a 1M-token context window. In production, almost no one uses more than 64K. Here's the gap between the benchmark and the reality, and what to do about it.

Apr 10, 2026

AI & Machine Learning · 3 min read

MCP and the Quiet Standardization of AI Tool Use

Model Context Protocol started as Anthropic's spec for hooking Claude into tools. A year later, every major AI provider, IDE, and SaaS vendor speaks it. This is what protocol-winning looks like in real time.

Apr 1, 2026

Opinion · 3 min read

The Death of Prompt Engineering

For two years, the most-clicked role on LinkedIn was Prompt Engineer. In 2026, that role is quietly disappearing — because the model is now the one writing the prompts.

Mar 22, 2026

Data Engineering · 10 min read

Iceberg's Puffin Sidecars: Portable Stats for the Open Lakehouse

Apache Iceberg's Puffin file format is the most strategically important subsystem nobody is talking about. It is the mechanism by which an open lakehouse can carry warehouse-grade statistics across vendors — write the sketch once in Trino, read it tomorrow in Snowflake, plan a join correctly on the first cold query.

Mar 18, 2026

AI & Machine Learning · 9 min read

Why AI Agents Are Replacing SaaS Dashboards in 2026

Enterprise teams are ditching traditional SaaS dashboards for autonomous AI agents that monitor, decide, and act. Here's what's driving the shift and what it means for software builders.

Mar 14, 2026

AI & Machine Learning · 14 min read

Understanding Retrieval-Augmented Generation: Architecture, Pitfalls, and Production Lessons

RAG is the most deployed LLM pattern in production today. After building RAG systems for 18 months, here are the architectural decisions that matter and the mistakes that don't show up until scale.

Mar 10, 2026

AI & Machine Learning · 10 min read

The Real Cost of Running LLMs in Production: A Breakdown

Token costs are just the tip of the iceberg. After running LLM workloads in production for a year, here's where the money actually goes — and how to cut costs without cutting quality.

Mar 3, 2026

AI & Machine Learning · 13 min read

Building Reliable AI Pipelines: Lessons from 50 Production Failures

AI systems fail differently than traditional software. After investigating 50 production incidents across ML systems, here are the patterns — and the engineering practices that prevent them.

Feb 22, 2026

Data Engineering · 6 min read

DuckDB Ate the Modern Data Stack

An embedded analytical engine with no servers, no cluster, and no migration cost quietly displaced Spark for small data and Snowflake XS for medium data. MotherDuck closed its Series B at a $400M post-money valuation. Here's the part everyone undercounts.

Feb 17, 2026

AI & Machine Learning · 11 min read

Fine-Tuning vs. Prompting vs. RAG: Choosing the Right LLM Strategy

Three approaches to customizing LLM behavior, each with different tradeoffs. A decision framework based on your data, budget, and accuracy requirements.

Feb 12, 2026

Data Engineering · 5 min read

Iceberg, Delta, Hudi: Pick One in 2026 and Move On

The table-format wars are functionally over. Iceberg won on interop. Delta won on installed base. Hudi won on streaming upserts. The decision tree for a new project in 2026 is shorter than the comparison-blog industry wants you to believe.

Feb 12, 2026

Data Engineering · 9 min read

Polars vs DuckDB in 2026: When To Pick Which

Polars ate Pandas. DuckDB ate everything below the warehouse. The 2023 expectation was a cage match between two in-process analytical engines — the 2026 reality is they ate different cake, and the decision is mostly about whether your team thinks in DataFrames or SQL.

Jan 8, 2026

AI & Machine Learning · 9 min read

Autonomous Code Review: How AI Agents Are Raising the Bar for Software Quality

AI agents don't just write code — they review it. Autonomous code review catches bugs, security flaws, and design issues that human reviewers miss. Here's how it works.

Nov 18, 2025

Data Engineering · 10 min read

Vector Indexes in OLAP Engines: 2025 Is Where Search Ate Analytics

DuckDB, ClickHouse, Snowflake, BigQuery, Postgres — by late 2025 every serious analytical engine ships a native vector index. That wasn't an AI-hype reflex. It was the realization that embedding search is just a column scan with a different distance function, and the warehouse-plus-vector-DB split was operational waste for the 90% case.

Oct 20, 2025

AI & Machine Learning · 8 min read

The Tool-Use Revolution: How Function Calling Transformed LLMs Into Agents

The single most important capability that turned language models into agents wasn't better reasoning — it was tool use. Here's the technical story of how function calling changed everything.

Oct 5, 2025

Data Engineering · 10 min read

Apache Arrow IPC vs JSON: The Numbers Behind the Switch

Most data-API traffic in 2025 still moves as JSON because humans need to read it. But for any system actually shipping columnar batches between services (analytical pipelines, feature stores, embedding services, MCP-style tool calls), Arrow IPC is 3-30× faster end-to-end. An honest accounting of when the switch pays off and when JSON is still correct.

Aug 15, 2025

AI & Machine Learning · 9 min read

RAG Is Dead, Long Live Agentic RAG: The Evolution of AI Knowledge Systems

Traditional RAG retrieves documents and stuffs them into context. Agentic RAG plans queries, evaluates results, and iterates until it finds the right answer.

Jul 22, 2025

AI & Machine Learning · 9 min read

Building Production AI Agents: Lessons from Shipping Autonomous Systems

Building a demo agent is easy. Shipping one that handles edge cases, recovers from failures, and earns user trust is hard. Here are the lessons learned.

Jun 15, 2025

AI & Machine Learning · 10 min read

Claude, GPT, Gemini: Comparing AI Agent Capabilities in Real-World Tasks

Not all AI agents are created equal. A practical comparison of Claude, GPT-4, and Gemini on real software engineering tasks — coding, debugging, and system design.

May 20, 2025

AI & Machine Learning · 8 min read

Multi-Agent Systems: When AI Agents Learn to Collaborate

Single agents are powerful. Teams of specialized agents working together are transformative. Here's how multi-agent architectures are reshaping complex problem-solving.

Apr 10, 2025

AI & Machine Learning · 9 min read

The Agentic Paradigm Shift: Why 2025 Changed Everything in AI Development

The shift from AI-as-tool to AI-as-agent represents the biggest paradigm change since the internet. Here's how we got here and where it's heading.

Mar 1, 2025