Cortex AI functions - SnowPro Gen AI C02 study notes
This is the largest domain (~40% per C01 proxy). Master it cold. Per community feedback, the exam tests function signatures, model names, and which function to pick for which use case.
Two function namespaces - know the difference
Snowflake ships LLM features under two namespaces. The AI_* functions are the modern, recommended API; the SNOWFLAKE.CORTEX.* functions are legacy but still tested.
Modern: AI_* (recommended)
| Function | Returns | Notes |
|---|---|---|
AI_COMPLETE(model, prompt) |
text | Supports claude-sonnet-4-6, claude-opus-4-7, llama3.1-70b, mistral-large2, openai-gpt-5, deepseek-r1, others. Also accepts images. |
AI_CLASSIFY(input, categories) |
classification result | Multi-label + image classification supported |
AI_FILTER(input, condition) |
BOOLEAN | Use in WHERE / SELECT / JOIN for LLM-driven filtering |
AI_AGG(column, prompt) |
aggregate text | Not subject to context window limits - aggregates across rows |
AI_EXTRACT(input, schema) |
structured object | Extract from text, images, documents |
AI_SENTIMENT(text) |
sentiment score + class | |
AI_SUMMARIZE_AGG(column, prompt) |
summary | Not subject to context window limits |
AI_EMBED(model, input) |
vector | Models: snowflake-arctic-embed-m, multilingual-e5-large, voyage-multilingual-2 |
AI_SIMILARITY(emb1, emb2) |
similarity score | |
AI_TRANSCRIBE(stage_file) |
text + timestamps + speakers | Audio/video |
AI_PARSE_DOCUMENT(stage, path, opts) |
text + layout | OCR or LAYOUT mode |
AI_REDACT(text) |
redacted text | Removes PII |
AI_TRANSLATE(text, from, to) |
translated text | |
AI_COUNT_TOKENS(model, input) |
int | Token count for billing estimation |
Legacy: SNOWFLAKE.CORTEX.* (still on exam)
| Function | Returns | Models |
|---|---|---|
COMPLETE(model, prompt) |
text | Multiple model options |
CLASSIFY_TEXT(text, categories) |
classification | |
EMBED_TEXT_768(model, text) |
768-d vector | e5-base-v2, snowflake-arctic-embed-m |
EMBED_TEXT_1024(model, text) |
1024-d vector | snowflake-arctic-embed-l-v2.0, multilingual-e5-large |
SENTIMENT(text) |
score -1..1 | |
SUMMARIZE(text) |
summary | |
TRANSLATE(text, from, to) |
translated | |
EXTRACT_ANSWER(text, question) |
answer | QA over source text |
ENTITY_SENTIMENT(text) |
entity-sentiment pairs | |
PARSE_DOCUMENT(stage, path, opts) |
extracted text | Will be deprecated end of 2026 |
COUNT_TOKENS(model, text) |
int |
Helpers
PROMPT(...)- builds prompt objects (for image inputs etc.)TO_FILE(stage_ref)- file reference helper for stage-stored docsTRY_COMPLETE(...)- returns NULL on failure instead of raising
Cortex Search
Managed retrieval service. Hybrid retrieval = vector + BM25 keyword + semantic reranking.
DDL pattern
CREATE OR REPLACE CORTEX SEARCH SERVICE service_name
ON search_column
ATTRIBUTES filter_col_1, filter_col_2
WAREHOUSE = my_wh
TARGET_LAG = '1 hour'
EMBEDDING_MODEL = 'snowflake-arctic-embed-l-v2.0'
AS (SELECT id, search_column, filter_col_1, filter_col_2 FROM source_table);
Embedding model options
| Model | Dims | Languages | Window |
|---|---|---|---|
snowflake-arctic-embed-m-v1.5 |
768 | EN only | (default) |
snowflake-arctic-embed-l-v2.0 |
1024 | multilingual | 512 tokens |
snowflake-arctic-embed-l-v2.0-8k |
1024 | multilingual | 8192 tokens |
voyage-multilingual-2 |
1024 | multilingual | 32K tokens |
Refresh patterns
- TARGET_LAG controls auto-refresh interval
- With PRIMARY KEY defined: incremental refresh (only changed rows)
- Without PRIMARY KEY: full reindex on each refresh
- Service periodically triggers full rebuilds to compact fragmented segments
Query syntax
SQL preview:
SELECT SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
'service_name',
'{"query":"...", "columns":[...], "filter":{...}, "limit":5}'
);
Python:
svc.search(query="...", columns=[...], filter={...}, limit=5)
Cortex Analyst
Managed text-to-SQL over structured data. Backed by a semantic model YAML.
Semantic model components
- Logical tables - business entities
- Dimensions - categorical attributes
- Facts - quantitative row-level data
- Metrics - aggregations / KPIs
- Relationships - join definitions
The semantic model is the bridge between business terminology and physical schema.
REST API
- Accepts natural language question + semantic model reference (stage path or inline YAML)
- Returns interpreted question + generated SQL
- Supports multi-turn conversation (follow-ups with context)
When to use which
| Use case | Pick |
|---|---|
| BI / structured data questions | Cortex Analyst |
| Document / unstructured search | Cortex Search |
| RAG over knowledge base | Cortex Search + AI_COMPLETE |
| End-to-end agent combining both | Cortex Agents |
Fine-tuning via Cortex
SELECT SNOWFLAKE.CORTEX.FINETUNE('CREATE', 'my_domain_model', 'mistral-7b',
'SELECT prompt, completion FROM training_table');
SELECT SNOWFLAKE.CORTEX.FINETUNE('DESCRIBE', '<job_id>');
SELECT SNOWFLAKE.CORTEX.FINETUNE('SHOW');
SELECT SNOWFLAKE.CORTEX.FINETUNE('CANCEL', '<job_id>');
Fine-tuning modes: CREATE, DESCRIBE, SHOW, CANCEL.
RAG pipeline pattern (memorize this end-to-end)
-- 1. Upload PDFs to a stage
PUT file:///local/*.pdf @docs_stage AUTO_COMPRESS=FALSE;
-- 2. Parse documents
CREATE OR REPLACE TABLE parsed AS
SELECT relative_path,
SNOWFLAKE.CORTEX.PARSE_DOCUMENT(@docs_stage, relative_path,
{'mode':'LAYOUT'}) AS doc_json
FROM DIRECTORY(@docs_stage);
-- 3. Chunk text (SPLIT_TEXT_RECURSIVE_CHARACTER)
CREATE OR REPLACE TABLE chunks AS
SELECT relative_path, c.value::STRING AS chunk
FROM parsed,
LATERAL FLATTEN(input =>
SNOWFLAKE.CORTEX.SPLIT_TEXT_RECURSIVE_CHARACTER(
doc_json:content::STRING, 'markdown', 1500, 200)) c;
-- 4. Embed (optional - Cortex Search handles this if you create a service)
-- OR create a Cortex Search Service directly on the chunks table
CREATE OR REPLACE CORTEX SEARCH SERVICE my_rag
ON chunk
ATTRIBUTES relative_path
WAREHOUSE = wh_xs
TARGET_LAG = '1 hour'
EMBEDDING_MODEL = 'snowflake-arctic-embed-l-v2.0'
AS (SELECT chunk, relative_path FROM chunks);
-- 5. Query: retrieve top-k chunks then complete
WITH hits AS (
SELECT SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
'my_rag',
OBJECT_CONSTRUCT('query', 'what is X?', 'limit', 5)::STRING) AS res
)
SELECT AI_COMPLETE('claude-sonnet-4-6',
'Answer using only this context: ' || res || ' Question: what is X?')
FROM hits;
Cost + governance gotchas (commonly tested per community feedback)
- Token-based billing for COMPLETE / AI_COMPLETE — cost scales with input + output tokens
- CORTEX_MODELS_ALLOWLIST account parameter restricts which models can be called
- Role-based grants on
SNOWFLAKE.CORTEX_USERdatabase role required to call any Cortex function - Cortex Search billing has multiple components: embedding generation, warehouse credits for refresh, serving credits per query
SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_QUERY_HISTORYand related views = where you see usage- Cross-region inference: enable when local region doesn't host a needed model; data crosses regions, governance implication
- AI Observability via TruLens integration for eval
Common pitfalls to flag
- Confusing AI_EXTRACT (structured extraction from text/images) with PARSE_DOCUMENT (file→text from stage)
- Embedding model dimension mismatch (768 vs 1024) — must match across pipeline
- Forgetting that
EXTRACT_ANSWERonly returns answers found in the provided text (no generative fallback) - AI_AGG vs SUMMARIZE_AGG — both bypass context window; AI_AGG is more general (custom prompt), SUMMARIZE_AGG is purpose-built