DEDP 5.3 — Data-Asset Reusability Pattern

The DRY principle applied to data engineering. The chapter defines four sub-patterns for minimizing duplicated business logic across the data stack. This is the most directly actionable chapter in Batch 1 — it gives you a vocabulary for decisions every data team makes implicitly.

Four Sub-Patterns

1. Template Parameterization

What: Code reuse through variables and templating.
Scope: Function to pipeline level.
Examples: SQL CTEs, dbt macros (Jinja), Airflow templated fields.
When: Repetitive SQL patterns, SCD Type 2 logic, standard transformations applied to many tables.
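A minimal sketch of the idea in plain Python, assuming a "latest row per key" deduplication that repeats across tables. The template, the function name `render_dedup`, and the table and column names are hypothetical; in a dbt project a Jinja macro would play the same role.

```python
from string import Template

# Hypothetical template: one deduplication pattern reused across many tables.
DEDUP_TEMPLATE = Template(
    "SELECT * FROM (\n"
    "  SELECT *, ROW_NUMBER() OVER (\n"
    "    PARTITION BY $key ORDER BY $updated_at DESC\n"
    "  ) AS rn\n"
    "  FROM $table\n"
    ") AS ranked WHERE rn = 1"
)


def render_dedup(table: str, key: str, updated_at: str = "updated_at") -> str:
    """Render the shared 'latest row per key' query for one table."""
    return DEDUP_TEMPLATE.substitute(table=table, key=key, updated_at=updated_at)


# The same logic applied to two hypothetical tables.
customers_sql = render_dedup("raw.customers", key="customer_id")
orders_sql = render_dedup("raw.orders", key="order_id", updated_at="modified_at")
print(customers_sql)
```

The logic lives in one place; each table only supplies parameters.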

2. Asset Materialization

What: Persist data for reuse without recomputation.
Scope: Table/view level.
Examples: Materialized views, dbt materializations, One Big Table (OBT).
When: Expensive queries run repeatedly, downstream consumers need consistent snapshots.
Connection: This is the cache pattern from [[06-reference/2026-04-04-dedp-mv-obt-dbt-olap-dwa]] formalized as a reusability strategy.
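A minimal sketch using Python's built-in `sqlite3`. SQLite has no native materialized views, so this emulates one with `CREATE TABLE ... AS SELECT`; the table names and the `refresh_customer_revenue` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 10, 20.0), (2, 10, 5.0), (3, 11, 7.5);
    """
)

# Stand-in for an expensive query that many consumers would otherwise re-run.
REVENUE_SQL = """
    SELECT customer_id, SUM(amount) AS revenue
    FROM orders
    GROUP BY customer_id
"""


def refresh_customer_revenue(conn: sqlite3.Connection) -> None:
    """Rebuild the persisted snapshot; consumers read the table, not the query."""
    conn.execute("DROP TABLE IF EXISTS customer_revenue")
    conn.execute(f"CREATE TABLE customer_revenue AS {REVENUE_SQL}")
    conn.commit()


refresh_customer_revenue(conn)

# New data arrives, but the snapshot stays stable until the next refresh.
conn.execute("INSERT INTO orders VALUES (4, 11, 100.0)")
snapshot = conn.execute(
    "SELECT customer_id, revenue FROM customer_revenue ORDER BY customer_id"
).fetchall()
print(snapshot)
```

The late-arriving order is invisible to readers until the refresh runs again, which is the freshness cost of materializing.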

3. Logic Encapsulation

What: Bundle complex logic into simplified interfaces.
Scope: System/platform level.
Examples: Semantic layers, master data management (MDM) systems, BI universe tools.
When: Multiple teams consume the same metrics, business logic spans systems.
Connection: This is the semantic layer pattern from [[06-reference/2026-04-04-dedp-semantic-layer-bi-olap-virtualization]] and [[06-reference/2026-04-03-headless-bi]] framed as a reusability decision.
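A toy sketch of the interface idea, not how any particular semantic layer works. The `Metric` class, the `METRICS` registry, `query_metric`, and every table and column name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One business metric, defined once (hypothetical structure)."""
    name: str
    expression: str  # SQL aggregate expression
    source: str      # table or view the metric is computed from


# Hypothetical registry: the single place where metric logic lives.
METRICS = {
    "net_revenue": Metric("net_revenue", "SUM(amount - discount)", "fct_orders"),
    "order_count": Metric("order_count", "COUNT(DISTINCT order_id)", "fct_orders"),
}


def query_metric(metric_name: str, group_by: list[str]) -> str:
    """Compile a metric request into SQL so consumers never restate the formula."""
    metric = METRICS[metric_name]
    dims = ", ".join(group_by)
    return (
        f"SELECT {dims}, {metric.expression} AS {metric.name} "
        f"FROM {metric.source} GROUP BY {dims}"
    )


# Two teams ask for the same metric at different grains and get the same formula.
by_region = query_metric("net_revenue", ["region"])
by_month = query_metric("net_revenue", ["order_month", "channel"])
print(by_region)
```

Consumers pick a metric name and dimensions; the formula itself is never copied into a dashboard or notebook.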

4. Parametric-Driven Generation

What: Automated code generation from configuration/metadata.
Scope: Cross-system level.
Examples: Data warehouse automation (DWA) tools, declarative data stacks, metrics layers.
When: Large-scale standardization, many similar pipelines, metadata-driven architectures.
Connection: Maps to [[06-reference/2026-04-03-data-products-taxonomy]] vision of data products as configuration.
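A small sketch of metadata-driven generation, assuming staging views that only rename columns. The `SOURCES` metadata, the `stg_` naming convention, and `generate_staging_view` are hypothetical and do not mirror any specific DWA tool.

```python
# Hypothetical metadata: each entry describes one source table to stage.
SOURCES = [
    {"name": "customers", "schema": "crm", "columns": {"id": "customer_id", "nm": "customer_name"}},
    {"name": "orders", "schema": "shop", "columns": {"id": "order_id", "amt": "amount"}},
]


def generate_staging_view(source: dict) -> str:
    """Generate a CREATE VIEW statement that renames columns per the metadata."""
    select_list = ",\n  ".join(
        f"{raw} AS {clean}" for raw, clean in source["columns"].items()
    )
    return (
        f"CREATE VIEW stg_{source['name']} AS\n"
        f"SELECT\n  {select_list}\n"
        f"FROM {source['schema']}.{source['name']};"
    )


# Adding a pipeline means adding a metadata entry, not writing new SQL.
generated = {s["name"]: generate_staging_view(s) for s in SOURCES}
print(generated["orders"])
```

The difference from template parameterization is who drives: here nobody calls a template by hand, the metadata alone determines what code exists.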

When to Use vs. When to Avoid

Use when:

Complex environment with fast development cycles
Repetitive transformations across teams
Need for consistency in business logic
Multiple technologies requiring standardization
Data modeling patterns (SCD, Data Vault) applied across many tables

Avoid when:

Small, simple platforms where the abstraction cost exceeds the reuse benefit
Fine-grained control requirements that abstractions obscure
Rapidly changing business requirements that outpace template maintenance
One-time use cases

The "when to avoid" list is as valuable as the "when to use" list. Over-abstraction is the main failure mode: teams build reusable frameworks for problems that occur only twice.

Trade-Offs

Benefits: Reduced maintenance, faster development, enhanced governance, consistency across consumers.

Costs: Higher initial investment, risk of over-abstraction, performance overhead from added layers, a learning curve, and dependence on engineering culture and documentation discipline.

The storage vs. compute trade-off for materialization is explicit: materializing saves compute but costs storage and freshness. This is the same caching trade-off from the previous chapters, now formalized.
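One way to make the trade-off concrete is a break-even check. This is a simplified sketch under an assumed cost model: the function name, its parameters, and the numbers are all hypothetical, costs share one arbitrary unit, and freshness is not priced in.

```python
def materialization_pays_off(
    query_cost: float,
    reads_per_refresh: int,
    refresh_cost: float,
    storage_cost_per_refresh: float,
) -> bool:
    """Compare recomputing on every read against refreshing once and storing."""
    recompute_total = query_cost * reads_per_refresh
    materialize_total = refresh_cost + storage_cost_per_refresh
    return materialize_total < recompute_total


# Hypothetical numbers: a query read 50 times between refreshes vs. read once.
hot_asset = materialization_pays_off(2.0, 50, 2.5, 0.5)
cold_asset = materialization_pays_off(2.0, 1, 2.5, 0.5)
print(hot_asset, cold_asset)
```

Under this model a rarely read asset is cheaper to recompute on demand, which matches the "one-time use cases" entry in the avoid list.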

What Matters for Consulting

This chapter gives you a decision framework for every [[01-projects/phdata/index]] engagement. When a client asks "should we use dbt macros?" the answer depends on which sub-pattern they need:

Template Parameterization → yes, dbt macros
Asset Materialization → dbt materializations, not macros
Logic Encapsulation → semantic layer, not dbt
Parametric-Driven Generation → DWA or custom codegen, not dbt

The "when to avoid" section prevents over-engineering. Most small data teams should not invest in reusability frameworks. They should write straightforward SQL until they feel the duplication pain concretely. [[06-reference/concepts/analytics-as-craft]] — craft means knowing when simplicity beats elegance.

Parametric-driven generation is the future of [[01-projects/data-marketplace/index]]. Data products as configuration, not code. Metadata drives generation. This sub-pattern is the technical foundation for productized data delivery — define the schema, the quality rules, the access policies in metadata, and generate the pipelines.
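A rough sketch of what "data products as configuration" could look like, assuming a spec that holds schema, quality rules, and readers. The `PRODUCT` spec format, `generate_artifacts`, and the role names are hypothetical, and the generated SQL is generic rather than tied to a specific warehouse.

```python
# Hypothetical data product spec: schema, quality rules, and access in metadata.
PRODUCT = {
    "name": "customer_360",
    "columns": {"customer_id": "INTEGER", "email": "TEXT"},
    "quality": {"not_null": ["customer_id"], "unique": ["customer_id"]},
    "readers": ["analytics", "marketing"],
}


def generate_artifacts(spec: dict) -> dict[str, list[str]]:
    """Derive DDL, quality-check queries, and grants from one spec."""
    name = spec["name"]
    cols = ", ".join(f"{col} {typ}" for col, typ in spec["columns"].items())
    checks = [
        f"SELECT COUNT(*) FROM {name} WHERE {col} IS NULL"
        for col in spec["quality"].get("not_null", [])
    ] + [
        f"SELECT {col} FROM {name} GROUP BY {col} HAVING COUNT(*) > 1"
        for col in spec["quality"].get("unique", [])
    ]
    grants = [f"GRANT SELECT ON {name} TO {role}" for role in spec["readers"]]
    return {"ddl": [f"CREATE TABLE {name} ({cols})"], "checks": checks, "grants": grants}


artifacts = generate_artifacts(PRODUCT)
for statement in artifacts["ddl"] + artifacts["checks"] + artifacts["grants"]:
    print(statement)
```

Changing a quality rule or a reader then means editing the spec, and every derived statement follows.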

Logic encapsulation = semantic layer by another name. This reframes the semantic layer conversation from "do we need a semantic layer tool?" to "where does our business logic encapsulation live?" Some clients already have logic encapsulation in their BI tool or their dbt project — they may not need a standalone semantic layer.

What's Academic

The software engineering pattern analogies (Factory Pattern, Abstract Factory, Facade, Cache-Aside) are accurate but unnecessary for practitioners. The Dagster example is useful but tool-specific.

Key Takeaway

Reusability in data engineering has four distinct sub-patterns. Most teams conflate them, leading to wrong-tool-for-the-job decisions. Template parameterization, asset materialization, logic encapsulation, and parametric-driven generation each solve different problems at different scopes. Name the pattern, then pick the tool. [[06-reference/concepts/systems-over-goals]]: build reusability systems that match your actual duplication problems, not theoretical ones.