Research, Build, Story

Research, stories, and insights on AI and Data Engineering.

AI & Machine Learning

Deep dives into neural networks, NLP, and AI systems

Data Engineering

Pipelines, warehouses, GPU-accelerated query engines, and big-data systems

Stories

Illustrated novels that teach the modern AI stack as story, with audio narration in English and Hindi

With gratitude to every unsung hero across AI and Data — the engineers, researchers, and builders whose quiet work makes all of this real.

— P.S.

From The First Mind

Featured

samkhya v1.0: Plug Claude, GPT-4o-mini, or Local Ollama Into Your SQL Query Optimizer

samkhya v1.0 ships an LLM-pluggable corrector backend for embedded analytical engines — DataFusion, DuckDB, Polars, Postgres, Iceberg, gpudb. Plug Claude, GPT-4o-mini, or local Ollama into the cardinality-estimation slot via a simple HTTP wire contract (Python FastAPI and Node TypeScript reference servers ship in the box). Every LLM output is clamped from above by a provable pessimistic ceiling (LpJoinBound — 40.95× tighter than the 2008 AGM bound) so the LLM can never make your plan worse than the engine's native estimate. Transport-floor latency measured at P95 0.07–0.11 ms; live-LLM end-to-end cells honestly marked PROJECTED pending API budget.
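The clamping idea described above can be illustrated with a minimal sketch. This is not samkhya's actual API: the function name, its parameters, and the fallback to the native estimate when the LLM output is unusable are all assumptions made for illustration. The ceiling is simply passed in as a number; how LpJoinBound computes it is not shown here.

```python
import math


def clamp_cardinality(llm_estimate, native_estimate, ceiling):
    """Return a row-count estimate that never exceeds `ceiling`.

    Hypothetical helper: `llm_estimate` is the corrector's guess,
    `native_estimate` is the engine's own estimate, and `ceiling`
    is a precomputed pessimistic upper bound on the true cardinality.
    """
    # Assumed policy: if the LLM output is missing, NaN, infinite,
    # or negative, use the engine's native estimate instead.
    if llm_estimate is None or not math.isfinite(llm_estimate) or llm_estimate < 0:
        candidate = native_estimate
    else:
        candidate = llm_estimate
    # Clamp from above so no estimate can exceed the ceiling.
    return min(candidate, ceiling)


if __name__ == "__main__":
    # An overestimate is cut down to the ceiling.
    print(clamp_cardinality(5_000_000.0, 120_000.0, 1_000_000.0))
    # An estimate under the ceiling passes through unchanged.
    print(clamp_cardinality(80_000.0, 120_000.0, 1_000_000.0))
```

In this sketch the clamp only guards against overestimates; what happens with underestimates is outside what the summary above describes.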

Latest Posts

samkhya v1.0: Plug Claude, GPT-4o-mini, or Local Ollama Into Your SQL Query Optimizer
Data Engineering · 16 min read

Why I built a GPU SQL engine in 2026 — when every other one died
Data Engineering · 27 min read

Every standalone GPU database built between 2013 and 2024 was acqui-hired or pivoted. So why ship gpudb in 2026? Because nobody had wired Apple Silicon's unified memory into a SQL engine — and DuckDB hands you a hundred-thousand-user distribution channel without writing a database from scratch.

The Cost of Being Right: AI-Generated Code at Production Scale
Opinion · 4 min read

Generating code with AI is cheap. Reviewing, testing, and deploying it isn't. The new bottleneck for engineering teams isn't writing the code — it's trusting it.

Databricks vs Snowflake vs The New Wave: The Data Engineering Paradigm Shift
Data Engineering · 5 min read

Snowflake just posted $4.68B in FY26 revenue at 29% growth. Databricks crossed $5.4B ARR in February at 65% growth. And neither chart explains why the most interesting data infrastructure being shipped in 2026 is single-process, embeddable, and runs on a laptop.

Why Small Models Are Eating Their Teachers
AI & Machine Learning · 4 min read

In 2024, you needed a 70B model to get good answers. In 2026, a 7B model trained on the right data beats it on most real-world tasks. The mechanism isn't a secret — it's distillation done well, and it's reshaping the entire model economy.

The Vibe Coding Economy: When 'Make This' Is the Spec
Opinion · 3 min read

The phrase 'vibe coding' was a joke in 2024. By 2026, it describes how a non-trivial fraction of new software is actually built. The economics of this shift are stranger and more durable than the meme suggested.

Stay Curious

Exploring the frontiers of AI, data, and technology. New research and insights published regularly.

About the Author