DBT Onboarding Checklist
A technical onboarding guide and curated resource collection for the ConnectWise Data Services team. Beyond the setup checklist, this document captures deeply practical dbt best practices, testing strategies, project structure recommendations, and tool evaluation notes drawn from the dbt community circa 2020. Many of these patterns remain canonical.
Getting Started Checklist
Software Stack
- VS Code with extensions (dbt formatter, Better Jinja, Rainbow CSV, YAML)
- DataGrip (SQL client, JetBrains license)
- GitHub account on CW org
- Windows Subsystem for Linux (WSL) with Oh My ZSH
- Python virtual environment for dbt CLI
- dbt profiles.yml configured for Snowflake
dbt Profile Template
```yaml
default:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: continuum.us-east-1
      user: bwilson
      role: BA_ADM_RL
      database: BA_SNDB
      warehouse: QUERY_WH
      schema: dbt_baw
      threads: 1
      client_session_keep_alive: False
```
Testing Strategy
A structured approach to when and what to test:
When to Test
- During development — Source uniqueness/not_null, staging join duplication, dim/fact relationship and recency tests
- During pull requests — Ensure `dbt test` runs as part of CI
- In production — Periodic test suite runs to catch source system data entry errors
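In dbt, the development-time tests above are declared in a model's `schema.yml`. A minimal sketch (source, model, and column names are illustrative, not from the CW project):

```yaml
# models/staging/schema.yml -- names below are hypothetical examples
version: 2

sources:
  - name: billing
    tables:
      - name: invoices
        columns:
          - name: invoice_id
            tests:
              - unique
              - not_null

models:
  - name: fct_invoices
    columns:
      - name: invoice_id
        tests:
          - unique
          - not_null
      - name: customer_id
        tests:
          # relationship test: every fact row must point at a real dimension row
          - relationships:
              to: ref('dim_customers')
              field: customer_id
```

Running `dbt test` executes all of these, which is what the CI step above enforces on every pull request.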
What to Document
- Sources — How loaded, known caveats, poorly named field definitions
- Final models — Grain, column definitions (especially business logic columns)
- Intermediate models — As needed
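This documentation lives alongside the tests in the same YAML files. A sketch, with illustrative model and column names:

```yaml
# models/marts/schema.yml -- names below are hypothetical examples
models:
  - name: fct_billings
    description: >
      One row per billing line item (grain: invoice line).
      Loaded nightly from the corporate billing system; see source
      docs for known caveats.
    columns:
      - name: net_revenue
        description: Gross amount less credits and adjustments (business logic lives in staging).
```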
This testing philosophy connects to 06-reference/2026-04-03-cw-hardening-reporting-pipelines where data tests are positioned as the final guard against erroneous reports.
Four Data Modeling Activities
A clean taxonomy (attributed to dbt community discourse on dbt vs. Looker):
- Cleansing — Fix problems or inconsistencies (dbt sweet spot)
- Re-shaping/pre-aggregating — Reliability, convenience, performance (dbt sweet spot)
- Creating metadata — Structures and relationships (shared between dbt and BI; leave final metric calculations for the report to preserve interactivity)
- Applying algorithms — Classification or prediction (ML layer, not dbt)
dbt Project Design: Three Interfaces
- The Graph — Data lineage (WHERE does data come from?)
- The Code — Good structure enables collaboration and onboarding
- The Warehouse — Naming and organization of the materialized data
Seven Prescriptive Actions for dbt Projects
- Choose nomenclature standards and stick to them
- Public vs. Private models — Prefix internal/intermediate assets with a double underscore (`__stg_billings`)
- Modular model files — Descriptive CTEs, cap at 6 per model; break into ephemeral models beyond that
- Macros in moderation
- Reusable assets — One table should answer multiple questions; ad hoc queries that are run regularly should be refactored into models
- Documentation — Document models and lint SQL
- Be your own best advocate — Defining models is foundational work that benefits many others
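The "modular model files" action above follows dbt's import-CTE convention. A sketch, assuming hypothetical refs (`__stg_billings`, `stg_customers`) and columns:

```sql
-- models/fct_billings.sql -- table and column names are illustrative
with billings as (
    select * from {{ ref('__stg_billings') }}
),

customers as (
    select * from {{ ref('stg_customers') }}
),

-- one logical step per descriptively named CTE; beyond ~6 CTEs,
-- split steps out into ephemeral models instead
joined as (
    select
        billings.billing_id,
        billings.amount,
        customers.customer_name
    from billings
    left join customers
        on billings.customer_id = customers.customer_id
)

select * from joined
```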
These prescriptive actions are the 06-reference/concepts/analytics-as-craft philosophy distilled into daily practice.
Key dbt Techniques
Snapshots (Change Data Capture)
- Requires a unique ID + `updated_at` timestamp
- Alternative: field-level change detection
- Challenge at CW: Corporate Billings and Bookings tables lacked unique IDs
- Community pattern: version numbering on top of snapshots
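A timestamp-strategy snapshot matching the requirements above might look like this sketch (snapshot, source, and key names are hypothetical); the `check` strategy covers the field-level alternative:

```sql
{% snapshot billings_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='billing_id',
        strategy='timestamp',
        updated_at='updated_at'
    )
}}

-- for tables without a reliable updated_at, use strategy='check'
-- with check_cols=['amount', 'status'] for field-level change detection
select * from {{ source('corporate', 'billings') }}

{% endsnapshot %}
```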
Blue/Green Deployments
- Build new schema fully before swapping to production
- Reduces risk if something goes wrong — Snowflake schema swap automation
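Snowflake supports the swap step directly with an atomic schema swap; the schema names here are illustrative:

```sql
-- build and test everything in the staging schema first
-- (e.g. a dbt run + dbt test against analytics_staging),
-- then promote it atomically:
alter schema analytics swap with analytics_staging;
```

Because the swap is a single atomic statement, consumers never see a half-built schema, and rolling back is just swapping again.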
Seed Files for Reference Data
- Replace long CASE statements with CSV seed files (e.g., country mappings)
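For example, a `data/country_mapping.csv` seed with `country_code,country_name` columns (names illustrative) turns the CASE statement into a join:

```sql
-- instead of: case when country_code = 'US' then 'United States' ... end
select
    orders.*,
    country_mapping.country_name
from {{ ref('stg_orders') }} as orders
left join {{ ref('country_mapping') }} as country_mapping
    on orders.country_code = country_mapping.country_code
```

`dbt seed` loads the CSV into the warehouse, and the mapping is version-controlled alongside the models.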
Tags for Scheduling
- Tag models as `hourly` vs. `nightly` for differentiated refresh cadence (`dbt run -m tag:hourly`)
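Tags can be applied per-folder in `dbt_project.yml`; a sketch with a hypothetical project name and layout:

```yaml
# dbt_project.yml -- project and folder names are illustrative
models:
  my_project:
    staging:
      +tags: ['hourly']
    marts:
      +tags: ['nightly']
```

Scheduled jobs then run `dbt run -m tag:hourly` and `dbt run -m tag:nightly` on separate cadences.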
SQL Deduplication Patterns
Window function approach (Snowflake):
```sql
select
    *,
    row_number() over (
        partition by user_id
        order by created_at desc
    ) = 1 as is_most_recent_record
from users
qualify is_most_recent_record
```
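On warehouses without `qualify` support, the same deduplication pattern needs a CTE and an explicit filter:

```sql
with ranked as (
    select
        *,
        row_number() over (
            partition by user_id
            order by created_at desc
        ) as row_num
    from users
)

-- keep only the most recent record per user
select * from ranked
where row_num = 1
```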
Analytics Maturity Stages (from dbt’s Startup Founder’s Guide)
Applied to ConnectWise, which was implementing several stages simultaneously:
| Stage | Key Actions |
|---|---|
| Early | Data warehouse + ETL + BI tool selection |
| Mid | Version-controlled SQL modeling, event tracking, churn modeling |
| Growth | Data testing, PR/code reviews, documentation for knowledge scaling |
The Airbnb “Scaling Knowledge” reference is notable — documentation as a scaling strategy, not an afterthought.
dbt Viewpoint (Philosophy)
Three principles that guided the ConnectWise implementation:
- Analytics is collaborative — Version control, QA, documentation, modularity
- Analytic code is an asset — Environments, SLAs (“make analytics calm again”), design for maintainability
- Analytics workflows require automated tools
Reusable Patterns
- Testing pyramid for analytics — Dev/PR/Production mirrors software engineering testing strategy
- Four modeling activities taxonomy — Clear boundaries for what dbt should and should not do
- Three interfaces framework — Graph/Code/Warehouse as design dimensions
- Seven prescriptive actions — A ready-made team standards document
- Maturity stage mapping — Assess where a company is and what to implement next
Consulting Credibility
This document demonstrates deep operational knowledge of 01-projects/phdata/index-relevant skills: dbt implementation, team onboarding, testing strategy, and project structure design. The curated resource collection shows someone who was actively engaged with the dbt community and applying those patterns in production. Relevant to 01-projects/phdata/career-transition as evidence of hands-on analytics engineering leadership.