Strata: The Database Layer Needs Its Own Pipeline
Phoenix extracts intent from application code and rebuilds from zero. The database layer — stored procedures, SQL Agent jobs, linked servers, a decade of accumulated logic — doesn't follow the same rules. Introducing Strata: the sibling methodology for database archaeology.
The Database Is Not the Application
When you run Phoenix on a legacy system, the application layer is legible. You extract the intent — what the code was trying to accomplish — and rebuild it from zero in a modern stack. Seven agents. One mission. The methodology holds.
Then you open the database.
It is not one thing. It is fifteen years of decisions made by different people under different pressures. Tables that were normalized, then denormalized, then partially re-normalized when someone ran out of time. Stored procedures that contain business logic nobody remembers writing. SQL Agent jobs that run at 2am and touch four systems. Linked servers that abstract a dependency nobody wants to name explicitly. Synonyms that exist because a migration was started in 2019 and never finished.
Phoenix assumes a transformation target. The database layer does not have one single transformation target. It has four layers, and each layer has a different relationship to the concept of rebuilding.
Forcing all four through a rebuild pipeline produces the wrong output for three of them.
Four Layers, Four Questions
The friction becomes clear when you ask what “modernize this” means for each layer:
| Layer | Contents | The real question |
|---|---|---|
| Schema | Tables, views, constraints, indexes | Migrate in place or redesign for the new stack? |
| Logic | Stored procedures, functions, triggers | Extract to application layer or keep in database? |
| Jobs | SQL Agent jobs, schedules, step chains | Preserve as operational infrastructure or replace? |
| Dependencies | Linked servers, synonyms, external aliases | Document and retire, or carry forward? |
These are different questions. Schema migration is an execution problem. Logic extraction is an architecture problem. Job preservation is an operational problem. Dependency retirement is a risk problem. A pipeline designed to answer one of them cannot answer all four cleanly.
What the database layer needs is not a rebuild pipeline. It needs an archaeology pipeline. Excavate what's there. Understand the layers. Make an explicit decision about each one.
That's what Strata is.
The Non-Transformative Scope
The most important concept in Strata has no equivalent in Phoenix: the non-transformative classification.
A SQL Agent job that runs nightly billing rollups is not a rebuild candidate. It is operational infrastructure. It schedules. It chains. It touches systems the new application may not own for months. The right decision is not “rebuild this” — it is “preserve this, explicitly, with a reason, until Phase 2 is stable.”
In a standard pipeline, that job would fall through the cracks. Nobody would rebuild it because it doesn't fit the rebuild frame. Nobody would document the decision not to rebuild it because there's no artifact type for that. It would just exist, quietly, in production, undocumented and unaccounted for.
Strata introduces a decision construct. Every object gets one of three classifications:
transformExtract to application layer or migrate to new schema. Becomes a Phoenix input.
preserveRetain as operational infrastructure. Accounted for, scope is clear, decision is explicit.
retireSuperseded, no longer called, safe to remove. Documented before deletion.
Nothing falls through. Nothing is assumed. The non-transformative scope is named — which means it can be reasoned about, tracked, and eventually resolved.
Five Agents, One Brief
Strata runs five agents in sequence. Each one targets a specific layer. Each one produces .sil artifacts in the standard EMBER format.
| Agent | Name | Produces |
|---|---|---|
| S-00 | Schema Signal | schema — data model map, one per domain |
| S-01 | Logic Extractor | routine — stored procedure and function traces |
| S-02 | Job Mapper | job — SQL Agent job definitions and schedules |
| S-03 | Dependency Tracer | dependency — linked servers, synonyms, external refs |
| S-04 | Classifier | decision — transform / preserve / retire per object |
No build pass. No certifier. The terminal artifact from S-04 is a classification map — every object in the database accounted for, every decision named.
CONSTRUCT decision
ID billing.classification
VERSION 1
─────────────────────────────────────────
scope: billing domain — 4 tables, 12 routines, 3 jobs, 1 linked server
transform:
sp_generate_invoice → extract to application service layer
sp_apply_credit → extract to application service layer
sp_void_invoice → extract to application service layer
preserve:
nightly-rollup job → operational, no app equivalent
REPORTING linked server → archive target, retain during migration
retire:
sp_legacy_batch_import → superseded by API ingestion (2019)
sp_fix_tax_rounding → one-time correction, never called post-2021
open:
sp_calculate_arrears → business logic unclear, requires domain reviewThree stored procedures transform. One job and one linked server preserve. Two dead procedures retire. One item is open — it needs a domain expert before a decision can be made. That honesty is load-bearing: an open item named explicitly is better than a closed item decided incorrectly.
What a Routine Trace Looks Like
S-01 reads stored procedures the same way Phoenix A-01 reads server-side code — as business logic traces. But the output is different. A routine artifact names what the procedure reads, what it writes, what it depends on, and where the gaps are:
CONSTRUCT routine
ID billing.sp_generate_invoice
VERSION 1
─────────────────────────────────────────
type: stored-procedure
entry: EXEC billing.sp_generate_invoice @order_id, @issued_by
reads: orders, order_lines, customers, tax_rates
writes: invoices, invoice_lines, audit_log
logic:
- Calculates line totals before tax
- Applies customer-specific tax rate if present, else default
- Inserts invoice header then lines in a single transaction
- Writes audit entry on completion
boundaries:
→ linked server REPORTING for audit mirror (write-through)
gaps:
- No rollback on audit write failure — silent data loss possible
- Tax rate lookup has no fallback if customer record missing
confidence: highThe two gaps are real findings, not edge cases. Silent data loss on an audit write failure is a production integrity issue. A missing fallback on tax rate lookup is a financial calculation bug. A-01 in Phoenix found these classes of bug in the wealth2track engagement. S-01 in Strata finds them in the database layer.
How Strata Feeds Phoenix
Strata and Phoenix are siblings. On a real engagement — an application backed by an enterprise database — both run. Strata runs first.
Without Strata, the Phoenix A-04 architect is making assumptions about the database layer. Which stored procedures contain business logic that belongs in the new application? Which jobs must keep running during migration? Which linked servers are real dependencies versus historical artifacts?
With Strata, those are named decisions. The architect reads the classification map and builds on known ground. The builder knows exactly which routines to re-implement in the application layer. The certifier knows what was explicitly left in place and why.
Strata doesn't replace Phoenix. It removes the largest class of assumption Phoenix would otherwise have to make.
The Name
Strata — geological layers. The same word geologists use for the distinct bands of rock that record different eras of deposition. You don't transform strata. You read them. You classify what you find. You understand what was laid down when, and why, and what it means for what sits above it.
An enterprise database is the same thing. Fifteen years of decisions, deposited in layers. Each layer was coherent when it was laid down. The incoherence is in how they interact now. Strata reads the layers in order, names what it finds, and produces a map of what to do next.
That map is what Phoenix needs to rebuild without guessing.
Status and Next
The Strata methodology is fully specified. Five agents, five EMBER construct types, a terminal classification brief. The full specification is at phoenix.semanticintent.dev/strata/.
Methodology
Specified — five agents, five construct types, decision artifact
strata-runtime
On the roadmap — same pattern as phoenix-runtime
EMBER constructs
schema · routine · job · dependency · decision
Phoenix Runtime
The sibling pipeline — application layer rebuild
The five new EMBER construct types — schema, routine, job, dependency, decision — will land in @semanticintent/ember before the runtime ships. The format needs to be stable before the tooling is built around it.