Research, Build, Story

Research, stories, and insights on AI and Data Engineering.

AI & Machine Learning

Deep dives into neural networks, NLP, and AI systems

Data Engineering

Pipelines, warehouses, GPU-accelerated query engines, and big-data systems

Stories

Illustrated novels that teach the modern AI stack as story, with audio narration in English and Hindi

With gratitude to every unsung hero across AI and Data — the engineers, researchers, and builders whose quiet work makes all of this real.

— P.S.

From The First Mind

Featured

samkhya v1.0: Plug Claude, GPT-4o-mini, or Local Ollama Into Your SQL Query Optimizer

samkhya v1.0 ships an LLM-pluggable corrector backend for embedded analytical engines — DataFusion, DuckDB, Polars, Postgres, Iceberg, gpudb. Plug Claude, GPT-4o-mini, or local Ollama into the cardinality-estimation slot via a simple HTTP wire contract (Python FastAPI and Node TypeScript reference servers ship in the box). Every LLM output is clamped from above by a provable pessimistic ceiling (LpJoinBound — 40.95× tighter than the 2008 AGM bound) so the LLM can never make your plan worse than the engine's native estimate. Transport-floor latency measured at P95 0.07–0.11 ms; live-LLM end-to-end cells honestly marked PROJECTED pending API budget.
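The clamping step described above can be sketched in a few lines. This is a minimal illustration, not samkhya's actual API: the function name `clamp_cardinality`, its parameters, and the fallback for unusable model output are all assumptions, and the computation of the ceiling itself (LpJoinBound in the post) is not shown.

```python
import math


def clamp_cardinality(llm_estimate: float, ceiling: float) -> float:
    """Cap a model-supplied row-count estimate at a proven upper bound.

    Hypothetical helper: `ceiling` stands in for a pessimistic bound
    computed elsewhere by the engine.
    """
    if not math.isfinite(ceiling) or ceiling < 0:
        raise ValueError("ceiling must be a finite, non-negative row count")
    # Assumption: a NaN, infinite, or negative model output carries no
    # usable information, so this sketch falls back to the ceiling.
    if not math.isfinite(llm_estimate) or llm_estimate < 0:
        return ceiling
    # The model may lower the estimate but can never raise it past the bound.
    return min(llm_estimate, ceiling)
```

Whatever the model returns, the value handed to the planner in this sketch stays at or below the bound, which is the property the post relies on.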

Latest Posts

On-Device AI Just Got Real
AI & Machine Learning · 7 min read

For three years, on-device AI was a demo that almost worked. In June 2026 it stopped being one. Sparse models like Apple's AFM 3 and Google's Gemma 4 made intelligence large in flash, small in motion, free to run, and offline by default.

The Coding-Agent Arms Race: Who Survives the H1-2026 Shakeout
Opinion · 10 min read

In six months, AI coding agents went from features to a brutal platform war: $26B startups, a new frontier model every six weeks, pricing whiplash, and a reverse-acquihire that gutted a unicorn. The agent you build on is now a strategic bet.

Streaming OLAP: The Post-Kafka Stack for Real-Time Analytics
Data Engineering · 10 min read

The Kafka + Flink + ClickHouse/Pinot/Druid stack we built between 2018 and 2024 is fragmenting into three forks: single-engine streaming SQL, table-format-as-stream, and OLAP databases that eat the streaming layer entirely. Kafka isn't dying — it's becoming plumbing.

What I Learned Writing GPU Kernels for SQL Aggregates
Data Engineering · 6 min read

Three months, two abandoned designs, one breakthrough. The one-paragraph version: Apple Silicon GPUs lacked a 64-bit atomic_fetch_add until very recent OS versions, and that single missing instruction shapes every other architectural decision in a Metal SQL aggregate engine.

Multi-Aggregate Fusion: One Read, Four Answers
Data Engineering · 5 min read

Every analytical engine treats SELECT SUM(x), MIN(x), MAX(x), COUNT(x) FROM t as four passes over the column. Fuse them into a single kernel and the speedup over four-pass code reaches 9x to 25x. Here's why the technique works, and the data shape where it doesn't.
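To make the idea concrete, here is a minimal CPU-side sketch of fusion in Python: one loop reads each value once and updates all four aggregates. It illustrates the single-read structure only. It is not the GPU kernel from the post and says nothing about the measured speedups; the names `Aggregates` and `fused_aggregates` are hypothetical.

```python
from typing import Iterable, NamedTuple, Optional


class Aggregates(NamedTuple):
    total: int
    minimum: Optional[int]
    maximum: Optional[int]
    count: int


def fused_aggregates(column: Iterable[Optional[int]]) -> Aggregates:
    """Compute SUM, MIN, MAX and COUNT of a column in a single pass."""
    total = 0
    minimum: Optional[int] = None
    maximum: Optional[int] = None
    count = 0
    for x in column:
        if x is None:
            # SQL aggregates ignore NULLs; None stands in for NULL here.
            continue
        total += x
        if minimum is None or x < minimum:
            minimum = x
        if maximum is None or x > maximum:
            maximum = x
        count += 1
    # Simplification: SQL returns NULL for SUM over zero rows; this
    # sketch returns 0 instead.
    return Aggregates(total, minimum, maximum, count)
```

The unfused equivalent would call `sum`, `min`, `max`, and `len` separately, reading the column four times. The fused loop trades that for four updates per value on a single read.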

Apple Silicon's Unified Memory Is the Quiet Revolution in Analytical Compute
Data Engineering · 6 min read

M3 Ultra ships with up to 512 GB of memory at 819 GB/s, addressable by the GPU with zero PCIe transfer cost. Every GPU database project from the past decade was architected around the assumption that memory bandwidth came at PCIe-tax prices. That assumption is now wrong on a fifth of the developer laptops in the world.

Stay Curious

Exploring the frontiers of AI, data, and technology. New research and insights published regularly.

About the Author