06-reference

cw transformation layer

Thu Apr 02 2026 20:00:00 GMT-0400 (Eastern Daylight Time) ·case-study ·source: notion ·by Mr. Ben / ConnectWise era
analytics-engineeringdbtdata-infrastructuretool-selectionversion-control

The Transformation Layer

An internal proposal advocating for the adoption of dbt, GitHub, and a structured transformation layer at ConnectWise. This is the foundational architecture document that predates and enables the other ConnectWise analytics work. It contains a thorough technology evaluation, deployment tier design, and comparison of alternatives — a model for how to make infrastructure decisions and justify them to leadership.

Design Objectives

The transformation layer needed to be:

These objectives are a reusable checklist for evaluating any analytics infrastructure. See 06-reference/concepts/systems-over-goals — each objective describes a system property, not a one-time deliverable.

The Proposed Stack

ToolRoleCost
SnowflakeData storeNegotiated procurement price
dbt CoreTransformation frameworkFree
DataGripSQL authoring$200/analyst/year
VS Codedbt developmentFree
GitHubVersion controlFree (basic team plan)
dbt CloudScheduling, docs, CI$600/developer/year

Total incremental cost: remarkably low. This is the “shoestring budget” reality documented in 06-reference/2026-04-03-cw-hardening-reporting-pipelines.

Why GitHub

A thorough competitive analysis that is a template for any technology selection document:

  1. Industry default — Founded 2008, 4x the deployments of competitors on Stackshare. GitLab (2014) and Azure DevOps (2012, no Git until 2013) started later.
  2. Best integrations — dbt Cloud’s CI features (temporary schema builds on PR, automated test suites) only work with GitHub. Market leadership attracts integration investment.
  3. Microsoft ownership — Acquired for $7.5B in 2018. Azure DevOps features migrating to GitHub (Pipelines became Actions, Boards became Projects). Basic Teams plan is now free.
  4. Limited migration burden — Only two existing projects under version control (ADF and a dbt PoC).

dbt Preview: Key Capabilities

Reusable Business Logic via Macros

The killer example: defining Thoma Bravo’s MRR/ARR calculation as a macro ({{ cw_mrr_logic() }}) so analysts do not need to remember “some blend of special lenses and credit memo types” during a live board meeting. Business logic centralized once, used everywhere.

Deployment Tiers

A three-tier model directly applicable to any analytics team:

  1. Development — Individual feature branches, isolated Snowflake schemas. Autonomy for ad hoc and quick fixes.
  2. Release/Staging — Mirror of master with latest approved changes. Enables side-by-side comparison: “How will this definition update impact historical reporting?”
  3. Production/Master — The source of truth. Only updated through approved PRs.

The cultural forcing function: “No one should approve their own pull request.”

Fault Tolerance

dbt Cloud provides: if a scheduled job fails, data remains as it was before the run started, and an email alert fires. Compared to pipeline designs where failures corrupt production data, this is a significant improvement.

Documentation and Data Lineage

dbt docs generate builds a full static site of documentation. dbt Cloud hosts it and allows read-only access for non-technical teams (Partner Success, Sales, FP&A).

Tool Comparison: Why Not _____?

A masterclass in “why not” documentation:

AlternativeVerdict
Power BI DataFlowsLacks version control; becomes overwhelming beyond simple aggregations
Azure Data FactoryMay serve the “L” in ELT, but dbt is better for the “T” (testing, SQL-native)
SSISSteep learning curve, difficult deployment/version control. SQL-based dbt is more accessible.
Apache AirflowFree but server costs exceed dbt Cloud licenses. Maintenance burden diverts team from insight delivery.
Alteryx”$60-$80K for a vendor that does not significantly improve our capabilities” — would rather hire an FTE

Reusable Patterns

Consulting Credibility

This document is a template for 01-projects/phdata/index consulting engagements: evaluating a client’s analytics infrastructure, selecting tools, justifying decisions, and designing deployment processes. The “why GitHub” analysis alone demonstrates the structured thinking clients pay for. The connection to 06-reference/2026-04-03-snowflake-rapid-growth-doordash is clear — Snowflake as the foundation, with dbt as the transformation layer, is the modern data stack pattern that scales.