Snowflake Model Registry - SnowPro Gen AI C02 study notes

Model Registry stores ML models as schema-level Snowflake objects and manages their versions and metadata. Registered models can be called for inference from SQL or Python.

Core API (Python)

from snowflake.ml.registry import Registry

reg = Registry(session=session, database_name="MYDB", schema_name="ML")

# Log a model (creates a new version)
mv = reg.log_model(
    model=my_sklearn_pipeline,
    model_name="customer_churn",
    version_name="v1",
    sample_input_data=X_train.head(),
    comment="baseline LR",
    metrics={"auc": 0.84},
    conda_dependencies=["scikit-learn==1.5.0"],
)

# List versions
reg.get_model("customer_churn").versions()

# Inference
mv.run(input_df, function_name="predict")

Three classes

Class	Represents
`Registry`	All models in a schema
`Model`	A named model (collection of versions)
`ModelVersion`	A specific version artifact
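
The hierarchy in the table can be walked top-down. A minimal sketch, assuming `reg` is a `Registry` like the one created above; `list_versions` is a hypothetical helper name, and it relies only on `Registry.models()`, `Model.versions()`, `Model.name`, and `ModelVersion.version_name`.

```python
def list_versions(reg):
    """Return {model_name: [version_name, ...]} for every model in the schema.

    `reg` is expected to be a snowflake.ml.registry.Registry. The helper name
    is hypothetical; it is not part of the Snowflake API.
    """
    inventory = {}
    for model in reg.models():              # Registry -> Model
        inventory[model.name] = [
            mv.version_name                 # Model -> ModelVersion
            for mv in model.versions()
        ]
    return inventory
```

Calling `list_versions(reg)` against a live registry should give one entry per named model, each listing its version names.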

Supported model types (built-in)

scikit-learn
XGBoost
LightGBM
CatBoost
PyTorch
TensorFlow / Keras
Hugging Face pipelines
MLflow pyfunc
Custom Python models ("bring your own model type") via the custom_model.CustomModel base class; this is also the route for libraries without a built-in handler, such as Prophet

Deployment targets

Target	Use case	Compute
Warehouse	CPU inference, low/medium throughput, called from SQL or Python	Standard virtual warehouse
Snowpark Container Services	GPU inference, REST endpoint, custom runtime	SPCS service backed by a compute pool

The same model object can be served from both targets (different deployments).
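
For the SPCS target, a logged version is turned into a service with `ModelVersion.create_service`. A hedged sketch: `deploy_to_spcs` is a hypothetical wrapper, the compute pool and image repository are assumed to exist already, and the parameter set accepted by `create_service` may vary between `snowflake-ml-python` releases.

```python
def deploy_to_spcs(mv, service_name, compute_pool, image_repo, gpus=None):
    """Create an SPCS inference service for one ModelVersion.

    `mv` is expected to be a snowflake.ml ModelVersion. This wrapper is a
    hypothetical helper, not part of the Snowflake API.
    """
    mv.create_service(
        service_name=service_name,
        service_compute_pool=compute_pool,  # existing compute pool (assumed)
        image_repo=image_repo,              # existing image repository (assumed)
        ingress_enabled=True,               # expose an HTTP endpoint
        gpu_requests=gpus,                  # e.g. "1"; None keeps it CPU-only
    )
    return service_name
```

Once the service is up, the same version can be called through it with `mv.run(input_df, function_name="predict", service_name=...)`, while calls without `service_name` keep running in the warehouse.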

SQL inference

After logging, a model is callable from SQL:

SELECT MYDB.ML.CUSTOMER_CHURN!PREDICT(*) FROM new_customers;

The ! syntax invokes a method on a model version. Called on the model name, as above, it runs the model's default version. Method names match what was registered (predict, predict_proba, custom methods).
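
To target a specific version or alias rather than the default, SQL uses a `WITH ... AS MODEL ... VERSION ...` clause. The sketch below only builds the statement text; `predict_sql` is a hypothetical helper, the column names are invented, and identifiers are interpolated without quoting, so it is suitable for trusted input only.

```python
def predict_sql(model_fqn, method, columns, table, version=None):
    """Build a SQL inference statement for a registered model.

    With version=None the model's default version is used; otherwise the
    named version (or alias) is bound through a WITH ... AS MODEL clause.
    """
    args = ", ".join(columns)
    if version is None:
        return f"SELECT {model_fqn}!{method}({args}) FROM {table}"
    return (
        f"WITH mv AS MODEL {model_fqn} VERSION {version} "
        f"SELECT mv!{method}({args}) FROM {table}"
    )


stmt = predict_sql(
    "MYDB.ML.CUSTOMER_CHURN", "PREDICT",
    ["TENURE", "MONTHLY_CHARGES"],      # hypothetical input columns
    "new_customers", version="V1",
)
```

The resulting string could then be executed with `session.sql(stmt)`.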

Partitioned models

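For very large datasets, partitioned inference splits the input rows by a partition column and runs the model once per partition, in parallel across the warehouse's compute. At inference time, pass a partition_column argument to run(); the model itself must expose a partitioned method (typically a custom model whose method is registered as a table function). Each partition is a separate model invocation, which is useful for per-customer or per-region models.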
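
Conceptually, partitioned inference behaves like a group-by followed by one independent call per group. The sketch below emulates that in plain Python to show the semantics only; it is not Snowflake's implementation, and `run_partitioned` and the sample rows are hypothetical.

```python
from collections import defaultdict


def run_partitioned(rows, partition_column, predict_fn):
    """Group rows by partition_column and call predict_fn once per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_column]].append(row)
    # Each group is an independent invocation. Here they run sequentially;
    # in Snowflake the per-partition calls can run in parallel.
    return {key: predict_fn(group) for key, group in groups.items()}


rows = [
    {"region": "EU", "amount": 10},
    {"region": "US", "amount": 30},
    {"region": "EU", "amount": 20},
]
# Stand-in "model": the mean amount within each partition.
result = run_partitioned(
    rows, "region", lambda g: sum(r["amount"] for r in g) / len(g)
)
# result == {"EU": 15.0, "US": 30.0}
```

In the registry the equivalent call would be `mv.run(input_df, function_name=..., partition_column="REGION")`, with the per-partition logic living inside the model's method.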

Versioning behavior

Each log_model call with a new version_name creates a new version
Versions are immutable
Use aliases (e.g., default, production) to point at a current version without changing references
metrics, comment, tags are mutable metadata
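
Moving an alias from one version to another can be sketched with `ModelVersion.set_alias` / `unset_alias` and the `Model.default` setter. `move_alias` is a hypothetical helper; it performs separate calls, so it does not by itself guarantee an atomic swap.

```python
def move_alias(model, alias, new_version, old_version=None, make_default=False):
    """Point `alias` at `new_version` of a snowflake.ml Model.

    If `old_version` currently holds the alias, it is released first.
    Hypothetical helper; the steps are issued one after another.
    """
    if old_version is not None:
        model.version(old_version).unset_alias(alias)
    model.version(new_version).set_alias(alias)
    if make_default:
        # The default version is what MODEL!METHOD(...) resolves to in SQL.
        model.default = new_version
```

For example, `move_alias(reg.get_model("customer_churn"), "production", "v2", old_version="v1")` leaves callers that reference the `production` alias untouched while changing the version behind it.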

RBAC

USAGE on database/schema
USAGE on model = call inference (without access to the model's internals)
OWNERSHIP on model = modify it and add versions; CREATE MODEL on the schema = log a new model
CREATE MODEL MONITOR on the schema (plus USAGE on the model and SELECT on the monitored data) for advanced ops (model observability, drift)

Observability (ML Observability)

A model monitor reads logged prediction inputs/outputs (and ground-truth labels, when available) from a source table
Tracks drift and performance over time
Each monitor is attached to one model version

Common exam framings

"How do you deploy an open-source LLM for GPU inference inside Snowflake?" → Model Registry with SPCS deployment target
"How do you keep the production model swap atomic without changing application code?" → use a model alias
"Where is the artifact actually stored?" → Snowflake-managed stage, abstracted away from the user; you don't manage stage paths directly
"Why use partitioned models?" → train one model per partition (per-tenant) and run them in parallel for batch inference

Pitfalls

Conda dependency mismatch between training env and warehouse env → inference fails at runtime. Always pass conda_dependencies explicitly.
Sample input data shape mismatch → signature inference fails. Pass a representative sample_input_data.
Forgetting to grant USAGE on the model to the role that runs inference
Trying to deploy a GPU-bound model to a warehouse target — must use SPCS for GPU
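
One way to reduce the dependency-mismatch pitfall is to pin `conda_dependencies` to the versions installed in the training environment. A minimal sketch using only the standard library; `pinned` is a hypothetical helper, and it assumes the package name is the same on PyPI and in Snowflake's conda channel and that the pinned version is available there.

```python
from importlib import metadata


def pinned(packages, lookup=metadata.version):
    """Return ["name==installed_version", ...] for the given packages.

    `lookup` defaults to importlib.metadata.version, which raises
    PackageNotFoundError for packages missing from the current environment.
    """
    return [f"{name}=={lookup(name)}" for name in packages]
```

The result can be passed directly, for example `reg.log_model(..., conda_dependencies=pinned(["scikit-learn"]))`.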