cw dbt onboarding checklist

Thu Apr 02 2026 20:00:00 GMT-0400 (Eastern Daylight Time) · case-study · source: notion · by Mr. Ben / ConnectWise era
dbt · analytics-engineering · onboarding · best-practices · data-testing

dbt Onboarding Checklist

A technical onboarding guide and curated resource collection for the ConnectWise Data Services team. Beyond the setup checklist, it captures practical dbt best practices, testing strategies, project structure recommendations, and tool evaluation notes drawn from the dbt community circa 2020. Many of these patterns are still standard practice.

Getting Started Checklist

Software Stack

dbt Profile Template

default:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: continuum.us-east-1
      user: bwilson
      role: BA_ADM_RL
      database: BA_SNDB
      warehouse: QUERY_WH
      schema: dbt_baw
      threads: 1
      client_session_keep_alive: False

Testing Strategy

A structured approach to when to test and what to document:

When to Test

  1. During development — Source uniqueness/not_null, staging join duplication, dim/fact relationship and recency tests
  2. During pull requests — Ensure dbt test runs as part of CI
  3. In production — Periodic test suite runs to catch source system data entry errors
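
As a rough sketch of the "staging join duplication" check above: a dbt singular test is a SQL file in the tests directory that selects the rows violating an assertion, and the test passes when the query returns no rows. The file name, the stg_billings model, and the billing_id key below are hypothetical and stand in for whatever staging model is being checked.

```sql
-- tests/assert_stg_billings_grain.sql  (hypothetical file, model, and column names)
-- A singular dbt test: any rows returned by this query count as failures.
-- A billing_id that appears more than once suggests a join in the staging
-- model fanned out and duplicated rows.
select
    billing_id,
    count(*) as row_count
from {{ ref('stg_billings') }}
group by billing_id
having count(*) > 1
```

Because it runs with the rest of `dbt test`, the same file covers all three moments in the list: local development, CI on pull requests, and scheduled production runs.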

What to Document

  1. Sources — How loaded, known caveats, poorly named field definitions
  2. Final models — Grain, column definitions (especially business logic columns)
  3. Intermediate models — As needed

This testing philosophy connects to 06-reference/2026-04-03-cw-hardening-reporting-pipelines, where data tests are positioned as the final guard against erroneous reports.

Four Data Modeling Activities

A clean taxonomy (attributed to dbt community discourse on dbt vs. Looker):

  1. Cleansing — Fix problems or inconsistencies (dbt sweet spot)
  2. Re-shaping/pre-aggregating — Reliability, convenience, performance (dbt sweet spot)
  3. Creating metadata — Structures and relationships (shared between dbt and BI; leave final metric calculations for the report to preserve interactivity)
  4. Applying algorithms — Classification or prediction (ML layer, not dbt)
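
To make the first three activities concrete, here is a minimal sketch of a single dbt model that cleanses and then pre-aggregates, while leaving final metric calculations to the BI tool. The stg_tickets model and its columns are assumptions made up for illustration; they do not come from the ConnectWise project.

```sql
-- models/daily_ticket_counts.sql  (hypothetical model; table and column names are assumed)
with cleansed as (
    -- 1. Cleansing: normalize inconsistent values before anything else uses them
    select
        ticket_id,
        lower(trim(status)) as status,
        cast(created_at as date) as created_date
    from {{ ref('stg_tickets') }}
    where ticket_id is not null
),

daily as (
    -- 2. Re-shaping/pre-aggregating: one row per day and status
    select
        created_date,
        status,
        count(*) as ticket_count
    from cleansed
    group by created_date, status
)

-- 3. Ratios and other final metrics are left to the BI tool, which can
--    recompute them as users filter and drill down.
select * from daily
```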

dbt Project Design: Three Interfaces

  1. The Graph — Data lineage (WHERE does data come from?)
  2. The Code — Good structure enables collaboration and onboarding
  3. The Warehouse — Naming and organization of the materialized data

Seven Prescriptive Actions for dbt Projects

  1. Choose nomenclature standards and stick to them
  2. Public vs. Private models — Prefix internal/intermediate assets with double underscore (__stg_billings)
  3. Modular model files — Descriptive CTEs, cap at 6 per model, break into ephemeral models beyond that
  4. Macros in moderation
  5. Reusable assets — One table should answer multiple questions. Regularly run ad hoc queries should be refactored into models.
  6. Documentation — Document models and lint SQL
  7. Be your own best advocate — Defining models is foundational work that benefits many others

These prescriptive actions are the 06-reference/concepts/analytics-as-craft philosophy distilled into daily practice.
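
A minimal sketch of actions 2 and 3, assuming the double-underscore convention above: a public model that imports private staging models through descriptively named CTEs and ends in a single final CTE. Only __stg_billings appears in the notes; __stg_customers and every column name are hypothetical.

```sql
-- models/customer_billing_summary.sql  (hypothetical public model)
-- Import CTEs: one per upstream model, named for what they contain.
with billings as (
    select * from {{ ref('__stg_billings') }}   -- private model, per the prefix convention
),

customers as (
    select * from {{ ref('__stg_customers') }}  -- hypothetical private model
),

-- Logical CTEs: each one does a single describable thing.
billing_totals as (
    select
        customer_id,
        sum(amount) as total_billed,
        max(billed_at) as last_billed_at
    from billings
    group by customer_id
),

final as (
    select
        customers.customer_id,
        customers.customer_name,
        coalesce(billing_totals.total_billed, 0) as total_billed,
        billing_totals.last_billed_at
    from customers
    left join billing_totals
        on customers.customer_id = billing_totals.customer_id
)

select * from final
```

If a model grows past the six-CTE cap, a block such as billing_totals is a natural candidate to move into its own ephemeral model and pull back in with ref().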

Key dbt Techniques

Snapshots (Change Data Capture)

Blue/Green Deployments

Seed Files for Reference Data

Tags for Scheduling

SQL Deduplication Patterns

Window function approach (Snowflake):

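-- Flag the newest row per user_id, then keep only the flagged rows.
select
    *,
    row_number() over (
        partition by user_id
        order by created_at desc
    ) = 1 as is_most_recent_record
from users
-- QUALIFY filters on the window-function result (Snowflake syntax)
qualify is_most_recent_record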
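
For warehouses that do not support QUALIFY, the same rows can be kept with a subquery. This is a sketch of a portable variant, not something taken from the original notes; it reuses the users table from the example above and adds a row_num column in place of the boolean flag.

```sql
-- Keeps the newest row per user_id using only standard SQL (no QUALIFY).
select *
from (
    select
        users.*,
        row_number() over (
            partition by user_id
            order by created_at desc
        ) as row_num
    from users
) as ranked
where row_num = 1
```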

Analytics Maturity Stages (from dbt’s Startup Founder’s Guide)

Applied to ConnectWise, which was implementing several stages simultaneously:

Stage | Key Actions
Early | Data warehouse + ETL + BI tool selection
Mid | Version-controlled SQL modeling, event tracking, churn modeling
Growth | Data testing, PR/code reviews, documentation for knowledge scaling

The Airbnb “Scaling Knowledge” reference is notable: it treats documentation as a scaling strategy, not an afterthought.

dbt Viewpoint (Philosophy)

Three principles that guided the ConnectWise implementation:

  1. Analytics is collaborative — Version control, QA, documentation, modularity
  2. Analytic code is an asset — Environments, SLAs (“make analytics calm again”), design for maintainability
  3. Analytics workflows require automated tools

Reusable Patterns

Consulting Credibility

This document demonstrates deep operational knowledge of skills relevant to 01-projects/phdata/index: dbt implementation, team onboarding, testing strategy, and project structure design. The curated resource collection shows active engagement with the dbt community and application of those patterns in production. It is relevant to 01-projects/phdata/career-transition as evidence of hands-on analytics engineering leadership.