The Config-as-Code Multiplier: A Full Marketing Data Stack. Solo. In Weeks.

Config-as-code tools turn AI agents into force multipliers. Here's how I built a complete marketing data platform solo in weeks.

My previous post introduced cognitecture — orchestrating AI with context toward outcomes you verify and own. That was theory. This post is the proof.

One client. Entertainment venue, fifteen data sources — ticketing, ads, POS, web analytics, tourism. The kind of project that actually needs a data team but can’t afford one. I had myself and AI agents.

  • 22 repos — pipelines, transformations, infra, monitoring
  • 50 dbt models across 6 domains — 28 staging, 5 intermediate cross-domain aggregations, 17 mart-layer fact tables
  • 8+ dashboards and counting
  • Under $100/month infrastructure (all serverless — Cloud Run, BigQuery, Cloud SQL)

What a 3-person team delivers in 4–5 months, I delivered in 6 weeks — while working on other clients. My agents ran full-time: nights, weekends, in parallel. I reviewed, verified, and steered.

The insight isn’t that agents are powerful. It’s that the tools you choose determine how powerful they become.

Why config-as-code is the multiplier

Imperative code requires understanding. The agent needs to grasp your architecture, your patterns, the implicit contracts between components. When it gets something wrong, the failure mode is subtle — code that runs but produces wrong results.

Declarative config requires filling in structure. A dbt model is a SQL SELECT and a YAML schema. A dlt pipeline is a TOML file and a Python source definition. Terraform is the desired end state in HCL.

The agent doesn’t need to understand your architecture — the config format enforces it.

  • Fewer ways to get it wrong — structure limits what the agent can produce
  • The structure itself is documentation — YAML, HCL, TOML are human-readable by definition
  • Verification is concrete — does the config parse? Do the numbers match? Do the tests pass?

Most teams never think about this. But choosing declarative tools is the highest-leverage decision you can make for agent-assisted work.

The multiplier effect

  • Imperative: requires understanding · subtle failure modes · liability multiplier
  • Declarative: structure enforces correctness · verifiable by design · force multiplier

The stack

Every layer follows the same principle: declarative config that agents can read, modify, and extend without understanding the broader system.

  • 15 data sources
  • Ingestion: dlt (TOML + Python)
  • Transformation: dbt (YAML + SQL)
  • Visualization + self-service: Lightdash (YAML)
  • Infrastructure: Terraform (HCL)
  • Platform: Google Cloud Platform, BigQuery

Ingestion — dlt

dlt handles API-to-warehouse pipelines. Python source definitions paired with TOML config — endpoints, auth, incremental cursors, merge dispositions.
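To make that concrete, here is a hypothetical sketch of what that config layer looks like. The source, endpoint, and key names are illustrative of the shape — endpoint, incremental cursor, merge disposition — not dlt's literal schema; check the dlt docs for the exact keys your source type expects.

```toml
# .dlt/config.toml (illustrative fragment, not dlt's exact schema)
[sources.ticketing]
base_url = "https://api.example-ticketing.com/v2"  # hypothetical endpoint

[sources.ticketing.orders]
# Only fetch records updated since the last run
cursor_path = "updated_at"
# Upsert on the primary key instead of appending duplicates
write_disposition = "merge"
primary_key = "order_id"
```

The point is the shape: an agent filling in this structure has far fewer ways to go wrong than one writing a bespoke ingestion script.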

dltHub has gone all-in on LLM-native workflows. Their getting-started guide assumes an AI-assisted IDE. dlt init ships AGENT.md files and YAML scaffolds as agent context. Their workspace indexes 10,000+ REST APIs as structured context packages. The result: 50,000 custom connectors in a single month — a 20x increase driven by LLM-assisted development.

Point an agent at API docs, give it the destination schema, get a working pipeline. I verified with row counts and timestamp checks — arithmetic, not judgment. Caught config errors in 3 of 15 sources. Date format mismatches, pagination quirks. Fixed in minutes. The actual bottleneck? Getting the right credentials at every data source. TikTok had me in a 2FA loop that took three days to resolve. No agent can sweet-talk a support chatbot.
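Because that verification is arithmetic, it can be scripted. A minimal sketch — the function name, thresholds, and failure messages are my own, not from the project:

```python
from datetime import datetime, timedelta, timezone


def verify_load(source_count: int, warehouse_count: int,
                max_loaded_at: datetime, freshness_hours: int = 24) -> list[str]:
    """Return a list of verification failures; an empty list means the load looks sane."""
    failures = []
    # Row counts must match exactly -- arithmetic, not judgment
    if source_count != warehouse_count:
        failures.append(
            f"row count mismatch: source={source_count} warehouse={warehouse_count}")
    # The incremental cursor must actually be advancing
    age = datetime.now(timezone.utc) - max_loaded_at
    if age > timedelta(hours=freshness_hours):
        failures.append(f"stale data: newest record is {age} old")
    return failures
```

Run it against every source after each load and you catch exactly the class of errors described above: date format mismatches and pagination quirks show up as count gaps or stale cursors.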

dlt’s own docs say it plainly: “The LLM doesn’t know how many records your source should have, whether the schema matches your business needs, or if incremental loading is actually working.” That’s your job.

Transformation — dbt

dbt — every model a SQL SELECT, every schema a YAML file. The project structure enforces naming and dependency order. An agent that built one staging model knows exactly how to build the next.
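A minimal sketch of the pattern, with illustrative model and column names: the model is one SELECT, and the schema YAML beside it carries documentation and tests.

```sql
-- models/staging/ticketing/stg_ticketing__orders.sql
select
    order_id,
    customer_id,
    cast(order_total as numeric) as order_total,
    ordered_at
from {{ source('ticketing', 'orders') }}
```

```yaml
# models/staging/ticketing/stg_ticketing__orders.yml
version: 2
models:
  - name: stg_ticketing__orders
    description: One row per ticketing order.
    columns:
      - name: order_id
        tests: [unique, not_null]
```

Hand an agent one pair of files like this and the next 27 staging models follow the same template.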

dbt Labs released open-source Agent Skills — knowledge packs that turn generalist agents into specialized data agents. Their framing: “Agents can see your dbt project as a graph, not a pile of files.” The structured context dbt already enforces — lineage, semantic definitions, contracts — is exactly what agents need. SQL stops being text and becomes a governed system.

50 models across six domains. Staging → intermediate → mart. Agents generate SQL, schema YAML with Lightdash meta tags, and tests in one pass. Documentation comes free because it lives in YAML alongside the model — not a separate task nobody ever does.

Visualization — Lightdash

Lightdash reads dbt’s YAML schemas directly. Meta tags define dimensions, metrics, formatting. An agent that writes a dbt model with proper meta tags has done 80% of the dashboard work.
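A sketch of what those meta tags look like in the dbt schema YAML (model and metric names are illustrative; see the Lightdash docs for the full meta syntax):

```yaml
models:
  - name: fct_ticket_sales
    columns:
      - name: ordered_at
        meta:
          dimension:
            type: timestamp
      - name: order_total
        meta:
          metrics:
            total_revenue:
              type: sum
              format: 'usd'
```

A few lines of meta turn a warehouse column into a governed dimension or metric the client can explore in the GUI.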

They call it Agentic BI: “Most AI agents in BI tools are just chatbots on top of SQL. No consistency, no governance. We started with a semantic layer.” Dashboards are YAML in git. The semantic layer can improve itself — when an agent calculates a metric that doesn’t exist, it proposes a new definition for human review.

Eight dashboards and counting. Agent builds charts, I verify numbers against source systems. But dashboards aren’t the real deliverable — a self-service analytics platform is. A well-described dbt model with proper dimensions and metrics means the client explores data independently inside Lightdash’s GUI. They don’t need me to build every chart. They ask new questions, slice by new dimensions, create new saved charts — all within governed definitions that live in code. The semantic layer means they can’t accidentally redefine “revenue” or double-count conversions. That’s the difference between delivering dashboards and delivering capability.

Infrastructure — Terraform

43 resources via Terraform — Cloud Run, Cloud SQL, networking, IAM, secrets. All HCL.
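One of those resources, sketched in HCL — names, region, and image path are illustrative, not the project's actual values:

```hcl
# A scheduled ingestion job as a Cloud Run v2 job (illustrative)
resource "google_cloud_run_v2_job" "ingest_ticketing" {
  name     = "ingest-ticketing"
  location = "europe-west1"

  template {
    template {
      containers {
        image = "europe-west1-docker.pkg.dev/my-project/pipelines/ticketing:latest"
      }
    }
  }
}
```

Desired end state, declared once; `terraform plan` shows the diff before anything changes.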

HashiCorp ships Agent Skills that inject HCL conventions into coding assistants. A Terraform MCP Server gives agents real-time access to provider docs, so generated HCL uses current resource definitions. Project Infragraph aims to give agents a live knowledge graph of your actual infrastructure.

No mode switch between “writing data models” and “managing infrastructure.” Same abstraction level. Edit config files. Done.

Where it breaks

From fully automatable to human judgment required:

  • Arithmetic (automatable): row counts, timestamps, cursor advancing, schema parsing
  • Structural: dbt test coverage, CI/CD validation, data quality checks, type consistency
  • Judgment (human required): business logic, attribution models, source-of-truth calls, dedup across eras

Verification bottleneck. Agents produce faster than you can verify. Five agents in parallel means five outputs to review. Automated tests catch structural errors. Business logic errors still need a human who knows the domain. I haven’t solved this.

Historical data migration. Three eras of ticketing systems, different schemas, different ID formats, different definitions of the same concepts. Agents wrote the SQL and dedup logic. But which records are duplicates across systems? What happens when historical data contradicts itself? Those decisions required client conversations and domain judgment. No CLAUDE.md can replace that.

Attribution. Ad platform numbers and GA4 numbers never match. Different models, different counting methods, different cookie windows. “Which number is right?” is a business decision, not a technical one. Agents can show you both numbers. They can’t choose for you.

The 2 AM test. Can you debug this without the agent? Pipeline fails, Claude is down. Can you read the config, find the error, fix it? This is why the stack choice matters — YAML and SQL are debuggable at 2 AM without an agent. Imperative generated code isn’t. If you can’t trace a failure without firing up an AI session, your architecture isn’t good enough. Every time I caught myself approving something I didn’t fully understand, I stopped and learned it first.

You can’t explain why the output is good — only that the agent produced it? You’ve crossed from practicing to pretending.

What I threw away

The declarative-first principle wasn’t theoretical — it was validated by trying the alternatives and watching them fail.

Metabase as the visualization layer. Agents choked on it. No config-as-code, no semantic layer — agents couldn’t produce reliable dashboard definitions. Every chart was a one-off imperative creation that couldn’t be version-controlled or templated.

Evidence.dev was the opposite: beautifully agent-friendly (markdown files with SQL blocks), but not enough interactive features for the client to use as a self-service tool. Great for publishing reports, wrong for exploratory analytics.

GCP Dataform. I started there, then switched to dbt because agents work markedly better with it — larger training corpus, bigger community, and native Lightdash integration sealed the decision.

Scope decisions are taste in action. Every “no” here reinforced the pattern: if the tool doesn’t let agents work declaratively, the tool gets replaced.

How context compounds

Every repo has a CLAUDE.md. Project structure, conventions, key files, deployment process. Any agent session starts with full context. No ramp-up.
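A hypothetical CLAUDE.md skeleton, to show the shape (section names and paths are illustrative, not the project's actual files):

```markdown
# CLAUDE.md — ticketing-pipeline

## Structure
- `pipelines/ticketing/` — dlt source definitions
- `.dlt/config.toml` — endpoints, cursors, dispositions

## Conventions
- Incremental loads use `updated_at` cursors; merge on primary key
- Never hardcode credentials; secrets come from Secret Manager

## Deploy
- `terraform apply` in `infra/` builds the Cloud Run job
```

A few dozen lines, and every fresh agent session starts already knowing where things live and what the rules are.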

Each completed task adds context for the next. The agent that built ticketing staging models now has context for mart models. The CLAUDE.md that described three domains now describes six. By repo fifteen, the agent scaffolded complete pipelines from template to production in a single session.

22 repos doesn’t mean 22x the work. It means a compounding context advantage that accelerates every task after the first few.

With agents handling the heavy lifting, I had bandwidth for the stuff I’d always wanted to try but could never justify. Reverse-engineering an internal tool’s API to extract years of historical data — no official endpoint, just network-tab archaeology and persistence. Building a scraper for another dataset that also had no public API. Both done with Claude, both duct-tape-and-determination projects I’d considered before but couldn’t defend the time investment. When the cost of trying drops to an evening, “probably not worth it” becomes “let’s find out.”

The takeaway

The leverage isn’t in the agents. It’s in the stack choice.

With config as code, agents become force multipliers. With imperative code, they become liability multipliers: more output generated, less of it trustworthy, more time spent reviewing than you saved.

One person. Fifteen sources. 50 models. 8+ dashboards. 22 repos. Under $100/month. A client who couldn’t justify a data team now has a self-service analytics platform with better infrastructure than most companies that can afford one.

Because I chose tools that turned my judgment into leverage.

What’s next

The dbt + Lightdash agent workflow works so well that I’m considering setting up OpenClaw — letting the client ask an agent via WhatsApp to build new models and dashboards. “Hey, can I get a breakdown of ticket sales by tourism segment?” and the agent scaffolds the dbt model, adds the Lightdash metrics, opens a PR. Skip the middleman entirely.

Which does raise the question: if I give the client an agent that can do what I do… what exactly am I billing for? Taste, apparently. And debugging TikTok 2FA.

Read the cognitecture manifesto →