Data Engineering After AI — Ananth Packkildurai
The mental model: the irreducible work was never about moving data — it was always about meaning. As AI becomes capable of generating pipeline code, the traditional ETL framework gives way to ECL: Extract, Contextualize, Link. The mechanical work of transformation becomes automatable; the architectural and organizational work of establishing, validating, and governing data meaning becomes the essential human contribution.
From ETL to ECL
ETL made sense when source systems were siloed and formats inconsistent. But the transformation step — encoding business rules into code — was always the most fragile component. AI handles this mechanical work competently now. What remains:
- Extract: still necessary, and still requires architectural judgment about reliability and latency
- Contextualize: the critical new step where “AI inference and human judgment meet.” It captures the semantic meaning of fields, organizational definitions, and the business context that no schema records
- Link: establishing entity relationships across systems, which becomes more valuable as AI consumes data at scale
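To make the Link step concrete, here is a minimal sketch of linking the same entity across two systems. Everything in it is an assumption for illustration: the two systems (a CRM and a billing system), the field names, and the choice of a normalized email as the join key are hypothetical and do not come from the article.

```python
def normalize_email(email: str) -> str:
    """Normalize an email so trivially different spellings compare equal."""
    return email.strip().lower()


def link_entities(crm_rows: list[dict], billing_rows: list[dict]) -> list[tuple[str, str]]:
    """Return (customer_id, account_id) pairs that share a normalized email.

    Hypothetical inputs: crm_rows carry "customer_id" and "email",
    billing_rows carry "account_id" and "email".
    """
    # Index the billing side once so each CRM row is a dictionary lookup.
    billing_by_email: dict[str, list[str]] = {}
    for row in billing_rows:
        key = normalize_email(row["email"])
        billing_by_email.setdefault(key, []).append(row["account_id"])

    links = []
    for row in crm_rows:
        key = normalize_email(row["email"])
        for account_id in billing_by_email.get(key, []):
            links.append((row["customer_id"], account_id))
    return links


crm = [{"customer_id": "c1", "email": " Ada@Example.com "}]
billing = [
    {"account_id": "a9", "email": "ada@example.com"},
    {"account_id": "a7", "email": "bob@example.com"},
]
links = link_entities(crm, billing)
```

Exact matching on one normalized key is the simplest possible linking rule. Real entity resolution usually needs several keys and some tolerance for conflicting records, which is where human judgment comes back in.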
Early Binding + Late Binding
Early binding (data contracts): Treat contracts as “executable constraints with real failure semantics,” not documentation. When AI agents generate transformation code, any error in a contract is reproduced in everything generated from it, so bad contracts are amplified at scale. This aligns with Ananth’s data contracts piece, which treats contracts as specs, not artifacts.
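One way to read “executable constraints with real failure semantics” is a contract that raises an error when a record violates it, instead of a document that describes the expected shape. The sketch below is only an illustration under that reading: `FieldRule`, `ContractViolation`, `ORDERS_CONTRACT`, `enforce`, and the order fields are all hypothetical names, not part of the article or of any contract tool.

```python
from dataclasses import dataclass
from typing import Any, Callable


class ContractViolation(Exception):
    """Raised when a record breaks the contract, so the caller must stop or quarantine."""


@dataclass(frozen=True)
class FieldRule:
    name: str                      # field that must be present
    type_: type                    # required Python type of the value
    check: Callable[[Any], bool]   # extra constraint on the value
    description: str               # human-readable meaning of the constraint


# Hypothetical contract for an "orders" source.
ORDERS_CONTRACT = [
    FieldRule("order_id", str, lambda v: len(v) > 0, "non-empty identifier"),
    FieldRule("amount_cents", int, lambda v: v >= 0, "non-negative amount in cents"),
    FieldRule("currency", str, lambda v: v in {"USD", "EUR"}, "supported currency code"),
]


def enforce(record: dict, contract: list[FieldRule]) -> dict:
    """Return the record unchanged if it satisfies every rule, else raise."""
    for rule in contract:
        if rule.name not in record:
            raise ContractViolation(f"missing field: {rule.name}")
        value = record[rule.name]
        if not isinstance(value, rule.type_):
            raise ContractViolation(
                f"{rule.name}: expected {rule.type_.__name__}, got {type(value).__name__}"
            )
        if not rule.check(value):
            raise ContractViolation(f"{rule.name}: violates '{rule.description}'")
    return record
```

The failure semantics are the point: a violating record cannot pass through `enforce` silently, so generated transformation code downstream never sees data the contract forbids.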
Late binding (contextualize pipeline): A dedicated agentic pipeline runs alongside the existing infrastructure:
- Event-driven triggers when new datasets arrive
- AI inference analyzing schema and data profiles
- Validation workflows separating high-confidence from human-review items
- A Context Store housing validated semantic definitions
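The four parts above can be sketched as a single event handler. This is a toy illustration, not the article's design: `Proposal`, `ContextStore`, `infer_definitions`, and `on_dataset_arrived` are hypothetical names, the inference step is a rule-based stand-in for a real model call, and the 0.9 confidence threshold is an arbitrary assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    dataset: str
    column: str
    definition: str
    confidence: float  # 0.0 to 1.0, as reported by the inference step


@dataclass
class ContextStore:
    # Validated semantic definitions, keyed by (dataset, column).
    validated: dict = field(default_factory=dict)
    # Proposals waiting for a human decision.
    review_queue: list = field(default_factory=list)


def infer_definitions(dataset: str, profile: dict) -> list[Proposal]:
    """Stand-in for AI inference over schema and data profiles.

    A real pipeline would call a model here; this rule only recognizes
    near-unique columns whose name ends in "_id".
    """
    proposals = []
    for column, stats in profile.items():
        if column.endswith("_id") and stats["distinct_ratio"] > 0.99:
            proposals.append(Proposal(dataset, column, "unique identifier", 0.95))
        else:
            proposals.append(Proposal(dataset, column, "unknown meaning", 0.30))
    return proposals


def on_dataset_arrived(dataset: str, profile: dict, store: ContextStore,
                       threshold: float = 0.9) -> None:
    """Event handler: infer definitions, then route each one by confidence."""
    for proposal in infer_definitions(dataset, profile):
        if proposal.confidence >= threshold:
            store.validated[(proposal.dataset, proposal.column)] = proposal.definition
        else:
            store.review_queue.append(proposal)


store = ContextStore()
on_dataset_arrived(
    "orders",
    {"order_id": {"distinct_ratio": 1.0}, "flag_x": {"distinct_ratio": 0.01}},
    store,
)
```

After the call, the `order_id` definition sits in `store.validated` and the `flag_x` proposal sits in `store.review_queue`. In a fuller version, a human reviewer's decision would be the only path from the queue into the validated store.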
The Context Architect
The new role for data engineers: not pipeline construction but context stewardship.
- Designing contractual foundations at data sources
- Building lineage infrastructure preserving context through transformations
- Governing the contextualize pipeline and context store
- Balancing prescribed context (early binding) with discovered context (late binding)
This is the augmentation thesis in action — AI handles the mechanical pipeline work, humans handle meaning, coordination, and architectural judgment.
Open Questions
ECL remains an emerging framework. Tooling is maturing, but there are no established templates yet for the organizational side: governing the context store, adjudicating conflicts between teams, and formalizing discovered context. This is the frontier.
Connections
- 06-reference/2026-04-04-claude-code-not-replacing-data-engineers — the augmentation thesis. Ananth’s ECL framework gives concrete shape to what “augmentation not replacement” looks like in data engineering specifically.
- 06-reference/2026-04-04-dedp-etl-tool-comparisons — the ETL tool landscape is the world ECL is replacing. Tool comparisons will shift from “which moves data best” to “which contextualizes best.”
- 06-reference/2026-04-04-dedp-data-contracts-schema-evolution — early binding in the ECL framework. Contracts as executable constraints, not documentation.
- 06-reference/2026-04-05-dew-missing-layer-ai-stack — the context graph is the technical implementation of the “Link” step in ECL. The context store is the reasoning layer.
- 06-reference/concepts/analytics-as-craft — the contextualize step is where craft lives. AI can move data; understanding what data means in an organizational context is irreducibly human craft.
- 06-reference/2026-04-04-dedp-convergent-evolution — ECL is the next convergent pattern emerging across the industry.
- 06-reference/2026-04-04-steam-steel-infinite-minds — context fragmentation is the data engineering version of the broader fragmentation problem. ECL’s contextualize step is a direct response.
- 01-projects/phdata/index — consulting angle: helping clients move from ETL to ECL is a multi-quarter engagement. Context architecture is advisory-grade work.
- 06-reference/2026-04-04-dedp-semantic-layer-bi-olap-virtualization — the semantic layer is a precursor to the context store. Same impulse, less ambitious scope.
Part of a series: see also The Missing Interface, The Missing Layer in Your AI Stack, and Data Contracts: A Missed Opportunity.