Database DevOps for Snowflake and Databricks: why schema management is still stuck in 2015
Your application code has versioning, CI gates, blue-green deploys, automatic rollback, and observability on the deploy itself. Your data warehouse schema has a Slack channel and a senior engineer who reviews everything.
This is not a hypothetical. It is the actual workflow at the overwhelming majority of teams running production Snowflake or Databricks: an engineer writes a CREATE TABLE or ALTER TABLE on their laptop, pastes it into #data-engineering for review, someone eyeballs it for typos, and then someone runs it in prod. It works about 95% of the time. The other 5% lights up the on-call rotation — a dropped column a Looker dashboard depended on, a type narrowing that silently truncated three months of revenue data, a streaming table recreated from earliest because nobody knew it carried a checkpoint cursor.
We solved this for application code 15 years ago. We never solved it for schemas. And a broken schema deploy is harder to recover from than a broken application deploy, because by the time you notice, the data has already moved.
Let’s talk about why this gap persists, what “state-based schema management” actually means, and how a safety classifier turns “I hope this is fine” into “the tool refused, here’s why.”
Why this is still unsolved in 2026
It’s not for lack of tools. It’s that the tools that exist were shaped for a different problem.
Migration-based tools (Flyway, Liquibase, schemachange) ask you to write a versioned migration file — V1__init.sql, V2__add_column.sql — and apply them in order. That’s the correct model for an OLTP database under one team’s control, where the history of changes is the source of truth. It’s the wrong model for a data warehouse, where the source of truth is “what the table should look like right now,” not “the sequence of edits I made to get here.” Migration chains rot: they drift from reality, they’re painful to reconcile when two engineers branch, and nobody can answer “what should production look like?” by reading them.
Transformation tools (dbt, SQLMesh) are excellent and explicitly not DDL tools. dbt’s own docs say schema DDL happens “outside dbt.” Every dbt shop still answers “how do we ship a CREATE TABLE?” with something ad hoc. SQLMesh is the strongest cross-platform state-based framework out there, but it manages transformation models — not procedures, tasks, streams, masking policies, warehouses, or the full DDL surface.
Data modelers (erwin, ER/Studio) model the schema beautifully and then hand you a SQL script to run manually. They have no live-warehouse awareness — the model doesn’t know if production has drifted from it — and no safety classification.
The native platform tools finally arrived — Snowflake DCM Projects, Databricks Declarative Automation Bundles, dbt Fusion — and they validate the category. But DCM is Snowflake-only and Preview-grade (its changeset is explicitly not fidelity-guaranteed), Bundles are Databricks-only and part-Python (no clean static diff), Fusion is a transformation engine that’s GA only on Snowflake. None of them classify whether a change is dangerous. None of them run offline. None span both warehouses.
Meanwhile, SQL Server has had the right shape since 2013: Microsoft’s SQL Database Projects and SqlPackage. You declare the schema as files, compare against the live database, generate a migration script with safety opt-ins, and deploy. It just doesn’t exist for the cloud warehouses.
That’s the gap. The shape is known and proven; nobody ported it to Snowflake and Databricks with the warehouse-specific intelligence those platforms demand.
How state-based schema management works
The core idea is a one-sentence inversion of migration-based thinking:
You describe the desired end state, and the engine computes the diff against reality.
Concretely, you author one .sql file per object:
databases/ANALYTICS/schemas/PUBLIC/tables/ORDERS.sql
databases/ANALYTICS/schemas/PUBLIC/views/DAILY_REVENUE.sql
databases/ANALYTICS/schemas/PUBLIC/streams/ORDERS_STREAM.sql
Each file holds a CREATE OR ALTER statement (Snowflake) or CREATE OR REPLACE (Databricks Delta/Iceberg) that reads as a desired-state spec. Your git repo is the source of truth. There is no migration chain to maintain.
When you run a compare, the engine does three things:
- Reads your project into a typed object model — every table, view, procedure, stream, task, dynamic table, etc. becomes a structured object.
- Reads the target — a live account, a built artifact, or another project — into the same model. For a live account this is reverse-engineering: pull the catalog via
GET_DDL/SHOWand reconstruct the same typed objects. - Diffs the two models object by object, field by field, and emits a list of changes: added, removed, modified, with the per-field detail of what changed.
The compare is directional and symmetric — you can diff project ↔ live, project ↔ artifact, artifact ↔ live, in any direction. That alone answers the question no migration tool can answer cleanly: what is actually different between my repo and production right now? This is the same question Snowflake’s DCM changeset tries to answer but doesn’t guarantee fidelity on in Preview — here, the compare engine is the whole product, so fidelity is the point.
Implementation note for the curious: the object model is a discriminated union keyed on objectType, and every switch over the type union is exhaustivity-checked at compile time. The high-value types (Table, View, Stream, Task, Procedure, Function) get hand-tuned extractors and differs; the long tail of ~30 more object types routes through a generic SHOW <type> + GET_DDL path where the platform’s own emitted DDL is authoritative. The design work is mostly deciding which types can be treated as opaque DDL and which need a structural field-by-field differ.
What the safety classifier does
A diff that just lists changes is necessary but not sufficient. The interesting question isn’t what changed — it’s which of these changes will hurt me, and how badly.
So every change in a compare gets classified into one of four tiers:
| Tier | Meaning | Example |
|---|---|---|
SAFE |
No data loss, no surprising cost | Add a nullable column; add a view |
EXPENSIVE |
Safe but may trigger a costly rebuild/backfill | Re-cluster a large table; rebuild that rewrites files |
DESTRUCTIVE |
Loses data, but recoverable | Drop a column (Snowflake Time Travel can restore it) |
UNRECOVERABLE |
Loses something that cannot be recovered | Drop a Snowflake Stream (cursor gone); drop a Databricks streaming table (checkpoint gone); drop a managed table (files deleted) |
The DESTRUCTIVE vs UNRECOVERABLE distinction is the part that matters most, and it’s where platform-specific knowledge earns its keep. A generic SQL tool calls every DROP “destructive” and leaves the judgment to the human reading the diff. But dropping a column you can restore from Time Travel is genuinely different from dropping a Stream, whose cursor position is gone the moment you commit. The classifier knows:
- Dropping a Snowflake Stream loses an irrecoverable cursor →
UNRECOVERABLE. - A type-narrowing
ALTER COLUMNcan silently truncate existing data → gated behindallowNarrowingTypes. - Dropping a Databricks managed table deletes the underlying files (external tables leave them) →
UNRECOVERABLEvs merelyDESTRUCTIVE. - Dropping a Databricks streaming table loses its checkpoint; the recreated table re-reads from earliest, which is almost always wrong →
UNRECOVERABLE.
Crucially, this is structural, not advisory. Anything classified destructive or unrecoverable refuses to run unless you pass an explicit gate (allowDropTable, allowDropColumn, allowNarrowingTypes, allowTableRebuild, allowUnrecoverableDrop). It’s not a -- WARNING comment you scroll past at 5pm on a Friday. The default is “no,” and you have to mean it to override.
Each finding ships with a code, a remediation, and the safer alternatives — so the tool teaches as it gates rather than just blocking.
Honest comparison vs the alternatives
No tool wins on every axis, and pretending otherwise destroys credibility. Here’s where each alternative legitimately stops, drawn from a feature-by-feature scorecard:
| Capability | Flyway / Liquibase | dbt | Native (DCM / Bundles) | This (SDT/DDT) |
|---|---|---|---|---|
| State-based (desired-state diff) | No (migration-based)¹ | Partial | Yes | Yes |
| 4-tier safety classifier with reasoning | No² | No | No | Yes |
| Runs fully offline (CI / pre-commit) | Yes | Yes | No (needs warehouse/runtime) | Yes |
| Spans Snowflake and Databricks | Generic SQL only | Yes (transformations) | No (single-platform each) | Yes |
| Full DDL surface (procs/tasks/streams/policies) | Yes | No (models only) | Partial | Yes |
| Data-flow lineage (READS/WRITES) | No | Model-only | No | Yes |
| AI-narrated diff (BYO key) | No | No | No | Yes |
¹ Liquibase Secure 5.1 (Feb 2026) added modeled change control for Snowflake objects — a real step toward state-based — but it’s Snowflake-only (UC stays generic SQL), and its risk-pattern scanning is lint-style pattern matching, not a 4-tier classifier with per-finding reasoning.
² Atlas (which I’d put in this row too) ships 50+ lint rules and went stable on Snowflake/Databricks in v1.0 — genuinely good — but the lint rules report what’s wrong, not a structured UNRECOVERABLE / DESTRUCTIVE / EXPENSIVE / WARNING classification with reversibility analysis, and its column-level lineage is Atlas-Cloud-only.
Where the alternatives are legitimately ahead, because honesty matters: Bytebase has a polished enterprise approval-workflow web UI this doesn’t match. SQLMesh’s virtual environments and safe incremental models are a genuinely novel transformation UX. Liquibase Secure has compliance-grade audit trails across 65+ databases. DataGrip’s SQL completion is best-in-class. If your need is one of those, use that tool. The specific combination this fills — state-based + platform-native + 4-tier safety reasoning + both warehouses + offline + full DDL surface — is the thing that exists nowhere else.
What it looks like in practice
# Pull a live account into .sql files (adopt against an existing warehouse)
sdt extract --target snowflake://prod --out ./MyProject.sdtproj
# Edit a table file — say, delete a column line — then compare
sdt compare --source ./MyProject.sdtproj --target snowflake://prod
# → DROP COLUMN ORDERS.LEGACY_STATUS [DESTRUCTIVE] gate: allowDropColumn
# Try to publish without the gate — it refuses
sdt publish --source ./MyProject.sdtproj --target snowflake://prod --apply
# → refused: 1 DESTRUCTIVE change requires --allowDropColumn
# AI-narrate the diff in plain English (bring-your-own-key) to paste into the PR
sdt compare --source ./MyProject.sdtproj --target snowflake://prod --explain
The Databricks side (ddt) mirrors this command-for-command — same lifecycle, same safety semantics, same project format. A team that learns one picks up the other in a day.
Try the beta
SDT (Snowflake) and DDT (Databricks Unity Catalog) are available now. The deterministic core — compare, safe deploy, extract, lint, format, lineage, Object Explorer, schema diagram — stays free forever; AI write-side features move to a Pro tier but keep working with notices, not a hard paywall.
A few things worth knowing up front:
- AI is bring-your-own-key. Anthropic / OpenAI / Azure / OpenAI-compatible / self-hosted. Your prompts go from your machine to your provider; nothing routes through my servers and there’s no inference markup. The whole tool is fully useful with AI disabled — it’s a layer on a deterministic engine, not the engine.
- Telemetry is opt-in and never includes your SQL, identifiers, or data — only sanitized error fingerprints.
- It’s closed-source but free, Apache-2.0 on the distributed artifact, and the project format is plain SQL + JSON in your repo. If you ever stop using it, your files are still yours.
- It’s a solo project. So if you try it, the single most useful thing you can do is tell me where it breaks.
Links:
- VS Code: search “SDT — Snowflake Data Tools” / “DDT — Databricks Data Tools”, or:
- CLI:
npm i -g @sdt-tools/cli/npm i -g @ddt-tools/cli - GitHub (issues, release notes, privacy): https://github.com/GVOrganization/sdt-tools and https://github.com/GVOrganization/ddt-tools
- 60-second demos: compare · safe deploy · extract (Snowflake) — compare · deploy · extract (Databricks)
If you ship schema changes against Snowflake or Databricks more than once a week, I’d genuinely value your honest take — especially the parts where it’s wrong.
SDT and DDT are independent tools, not affiliated with or endorsed by Snowflake Inc. or Databricks Inc. “Snowflake”, “Databricks”, “Unity Catalog”, and “Delta Lake” are trademarks of their respective owners, used here descriptively.
Ship schema changes against Snowflake or Databricks?
SDT and DDT put a safety classifier on every change — data-loss operations refuse by default. Free tier, runs local-only.
Install SDT Install DDT See pricing