← All writing

Strata: The Database Layer Needs Its Own Pipeline

Phoenix extracts intent from application code and rebuilds from zero. The database layer — stored procedures, SQL Agent jobs, linked servers, a decade of accumulated logic — doesn't follow the same rules. Introducing Strata: the sibling methodology for database archaeology.

Michael Shatny··8 min read

The Database Is Not the Application

When you run Phoenix on a legacy system, the application layer is legible. You extract the intent — what the code was trying to accomplish — and rebuild it from zero in a modern stack. Seven agents. One mission. The methodology holds.

Then you open the database.

It is not one thing. It is fifteen years of decisions made by different people under different pressures. Tables that were normalized, then denormalized, then partially re-normalized when someone ran out of time. Stored procedures that contain business logic nobody remembers writing. SQL Agent jobs that run at 2am and touch four systems. Linked servers that abstract a dependency nobody wants to name explicitly. Synonyms that exist because a migration was started in 2019 and never finished.

Phoenix assumes a transformation target. The database layer does not have one single transformation target. It has four layers, and each layer has a different relationship to the concept of rebuilding.

Forcing all four through a rebuild pipeline produces the wrong output for three of them.

Four Layers, Four Questions

The friction becomes clear when you ask what “modernize this” means for each layer:

LayerContentsThe real question
SchemaTables, views, constraints, indexesMigrate in place or redesign for the new stack?
LogicStored procedures, functions, triggersExtract to application layer or keep in database?
JobsSQL Agent jobs, schedules, step chainsPreserve as operational infrastructure or replace?
DependenciesLinked servers, synonyms, external aliasesDocument and retire, or carry forward?

These are different questions. Schema migration is an execution problem. Logic extraction is an architecture problem. Job preservation is an operational problem. Dependency retirement is a risk problem. A pipeline designed to answer one of them cannot answer all four cleanly.

What the database layer needs is not a rebuild pipeline. It needs an archaeology pipeline. Excavate what's there. Understand the layers. Make an explicit decision about each one.

That's what Strata is.

The Non-Transformative Scope

The most important concept in Strata has no equivalent in Phoenix: the non-transformative classification.

A SQL Agent job that runs nightly billing rollups is not a rebuild candidate. It is operational infrastructure. It schedules. It chains. It touches systems the new application may not own for months. The right decision is not “rebuild this” — it is “preserve this, explicitly, with a reason, until Phase 2 is stable.”

In a standard pipeline, that job would fall through the cracks. Nobody would rebuild it because it doesn't fit the rebuild frame. Nobody would document the decision not to rebuild it because there's no artifact type for that. It would just exist, quietly, in production, undocumented and unaccounted for.

Strata introduces a decision construct. Every object gets one of three classifications:

transform

Extract to application layer or migrate to new schema. Becomes a Phoenix input.

preserve

Retain as operational infrastructure. Accounted for, scope is clear, decision is explicit.

retire

Superseded, no longer called, safe to remove. Documented before deletion.

Nothing falls through. Nothing is assumed. The non-transformative scope is named — which means it can be reasoned about, tracked, and eventually resolved.

Five Agents, One Brief

Strata runs five agents in sequence. Each one targets a specific layer. Each one produces .sil artifacts in the standard EMBER format.

AgentNameProduces
S-00Schema Signalschema — data model map, one per domain
S-01Logic Extractorroutine — stored procedure and function traces
S-02Job Mapperjob — SQL Agent job definitions and schedules
S-03Dependency Tracerdependency — linked servers, synonyms, external refs
S-04Classifierdecision — transform / preserve / retire per object

No build pass. No certifier. The terminal artifact from S-04 is a classification map — every object in the database accounted for, every decision named.

billing.classification.sil
CONSTRUCT  decision
ID         billing.classification
VERSION    1
─────────────────────────────────────────
scope:     billing domain — 4 tables, 12 routines, 3 jobs, 1 linked server

transform:
  sp_generate_invoice     extract to application service layer
  sp_apply_credit         extract to application service layer
  sp_void_invoice         extract to application service layer

preserve:
  nightly-rollup job      operational, no app equivalent
  REPORTING linked server  archive target, retain during migration

retire:
  sp_legacy_batch_import  superseded by API ingestion (2019)
  sp_fix_tax_rounding     one-time correction, never called post-2021

open:
  sp_calculate_arrears    business logic unclear, requires domain review

Three stored procedures transform. One job and one linked server preserve. Two dead procedures retire. One item is open — it needs a domain expert before a decision can be made. That honesty is load-bearing: an open item named explicitly is better than a closed item decided incorrectly.

What a Routine Trace Looks Like

S-01 reads stored procedures the same way Phoenix A-01 reads server-side code — as business logic traces. But the output is different. A routine artifact names what the procedure reads, what it writes, what it depends on, and where the gaps are:

billing.sp_generate_invoice.sil
CONSTRUCT  routine
ID         billing.sp_generate_invoice
VERSION    1
─────────────────────────────────────────
type:      stored-procedure
entry:     EXEC billing.sp_generate_invoice @order_id, @issued_by
reads:     orders, order_lines, customers, tax_rates
writes:    invoices, invoice_lines, audit_log

logic:
  - Calculates line totals before tax
  - Applies customer-specific tax rate if present, else default
  - Inserts invoice header then lines in a single transaction
  - Writes audit entry on completion

boundaries:
   linked server REPORTING for audit mirror (write-through)

gaps:
  - No rollback on audit write failure — silent data loss possible
  - Tax rate lookup has no fallback if customer record missing

confidence: high

The two gaps are real findings, not edge cases. Silent data loss on an audit write failure is a production integrity issue. A missing fallback on tax rate lookup is a financial calculation bug. A-01 in Phoenix found these classes of bug in the wealth2track engagement. S-01 in Strata finds them in the database layer.

How Strata Feeds Phoenix

Strata and Phoenix are siblings. On a real engagement — an application backed by an enterprise database — both run. Strata runs first.

Engagement sequence
Strata S-00 → S-04   database excavation
↓ decision artifact
Phoenix A-04       solution architect reads Strata output
Phoenix A-05       builder knows what database layer it can touch
Phoenix A-06       certifier knows what was in scope

Without Strata, the Phoenix A-04 architect is making assumptions about the database layer. Which stored procedures contain business logic that belongs in the new application? Which jobs must keep running during migration? Which linked servers are real dependencies versus historical artifacts?

With Strata, those are named decisions. The architect reads the classification map and builds on known ground. The builder knows exactly which routines to re-implement in the application layer. The certifier knows what was explicitly left in place and why.

Strata doesn't replace Phoenix. It removes the largest class of assumption Phoenix would otherwise have to make.

The Name

Strata — geological layers. The same word geologists use for the distinct bands of rock that record different eras of deposition. You don't transform strata. You read them. You classify what you find. You understand what was laid down when, and why, and what it means for what sits above it.

An enterprise database is the same thing. Fifteen years of decisions, deposited in layers. Each layer was coherent when it was laid down. The incoherence is in how they interact now. Strata reads the layers in order, names what it finds, and produces a map of what to do next.

That map is what Phoenix needs to rebuild without guessing.

Status and Next

The Strata methodology is fully specified. Five agents, five EMBER construct types, a terminal classification brief. The full specification is at phoenix.semanticintent.dev/strata/.

Methodology

Specified — five agents, five construct types, decision artifact

strata-runtime

On the roadmap — same pattern as phoenix-runtime

EMBER constructs

schema · routine · job · dependency · decision

Phoenix Runtime

The sibling pipeline — application layer rebuild

The five new EMBER construct types — schema, routine, job, dependency, decision — will land in @semanticintent/ember before the runtime ships. The format needs to be stable before the tooling is built around it.

Michael Shatny is a software developer and methodology engineer. Phoenix and Strata are part of the Methodology-as-Infrastructure family — analytical frameworks compiled into deterministic, artifact-producing pipelines.

ORCID: 0009-0006-2011-3258