

2026-04-04 · article · source: https://www.dataengineeringweekly.com/p/etl-is-dead · by Ananth Packkildurai

ETL Is Dead

Ananth Packkildurai argues that ETL remains operationally active but is dead as the defining identity of data engineering. The shift from human-operated to AI-agent-operated data systems restructures what competencies matter.

Mental model

The core analogy: Amazon’s physical warehouse robotization. When Kiva robots replaced human pickers, Amazon didn’t optimize the existing layout — they rebuilt entirely. Wide aisles, logical grouping, signage — all designed for human cognition — became unnecessary. The data warehouse is at the same inflection point.

Current data architectures are optimized the same way for human cognition. When agents become the primary operators, those human-facing affordances become overhead, and context erodes through each handoff “like a game of telephone.”

Packkildurai proposes ECL as the successor mental model.

This is the convergent evolution pattern again — the same pressure toward semantic precision keeps surfacing. Business glossaries, semantic layers, data catalogs, and knowledge graphs all promised to capture institutional meaning but failed because the economics weren’t aligned: humans bore maintenance costs while benefits were diffuse.
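The economic argument can be made concrete with a sketch. Assuming ECL reads as Extract-Contextualize-Load (the note does not expand the acronym, so this reading is an assumption), the difference from ETL is that semantics travel with the data rather than being baked silently into transform code. All names below are hypothetical illustrations, not the article's API:

```python
from dataclasses import dataclass, field

# Hypothetical sketch: in ETL, the "why" behind a transform lives only in
# code and tribal knowledge; a contextualize step would instead emit
# machine-readable semantics alongside the rows, for agents to consume.

@dataclass
class ContextualizedDataset:
    """Data plus the institutional meaning an agent would otherwise infer."""
    rows: list
    business_process: str                            # real-world activity recorded
    grain: str                                       # what one row means
    definitions: dict = field(default_factory=dict)  # term -> agreed meaning

def contextualize(rows: list) -> ContextualizedDataset:
    # Extract pulled raw rows; here we attach semantics explicitly
    # instead of losing them at each human handoff.
    return ContextualizedDataset(
        rows=rows,
        business_process="retail order fulfillment",
        grain="one row per order line item",
        definitions={"net_revenue": "gross amount minus discounts, excluding tax"},
    )

orders = [{"order_id": 1, "net_revenue": 42.0}]
ds = contextualize(orders)
print(ds.grain)  # prints "one row per order line item"
```

Under the article's economic-inversion argument, this artifact is what an agent consumes directly, which is why its maintenance cost finally has an owner with aligned incentives.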

Why this time is different

The economic inversion: when AI agents are the consumer, bad context produces systematic hallucination at enterprise scale — not mere human frustration. The cost of not maintaining context now exceeds the cost of maintaining it. That feedback loop is structurally new and explains why similar efforts failed for two decades.

Dimensional modeling survives (sort of)

Kimball’s first two steps — identify the business process, select the grain — represent permanent “context architecture.” The subsequent steps (star schema, dimension tables, fact tables) were rendering choices optimized for human analysts querying relational databases. The thinking survives; the format may not. See The Data Warehouse Toolkit for the original framework.
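One way to see the “thinking survives; the format may not” claim: treat Kimball's first two steps as a declarative artifact, and the star schema as just one rendering of it. A minimal sketch with hypothetical names (not from the article or the Toolkit):

```python
# Hypothetical sketch: Kimball steps 1-2 (business process, grain) as
# durable context; the star-schema DDL is one rendering among many.

model = {
    "business_process": "retail sales",          # step 1: identify the process
    "grain": "one row per product per sale",     # step 2: declare the grain
    "dimensions": ["date", "product", "store"],  # steps 3-4: rendering choices
    "facts": ["quantity", "amount"],
}

def render_star_schema(m: dict) -> str:
    """Render the model as classic star-schema DDL, one possible format."""
    dims = "".join(
        f"CREATE TABLE dim_{d} ({d}_key INT PRIMARY KEY);\n"
        for d in m["dimensions"]
    )
    fact_cols = ", ".join(
        [f"{d}_key INT" for d in m["dimensions"]]
        + [f"{f} NUMERIC" for f in m["facts"]]
    )
    return dims + f"CREATE TABLE fact_sales ({fact_cols});"

print(render_star_schema(model))
```

An agent-facing renderer could emit a knowledge-graph or flat denormalized layout from the same `model` dict; the process and grain declarations are the part that persists across formats.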

Historical pendulum

| Era | Tradeoff |
| --- | --- |
| Relational | Maximum semantic precision, operational rigidity |
| Hadoop | Operational flexibility, semantic collapse (data swamps) |
| Lakehouse | Compromise with incomplete semantic layers |
| ECL | Decouples semantic precision from physical rigidity |

This maps directly to the DEDP ETL tool evolution — every generation tries to resolve the same tension.

Connections

Open questions