How to Design the Best Possible Architecture for a New App

“Architecture” is one of the most abused words in software. To one engineer it means a folder structure; to another, a diagram of microservices and Kafka topics that the app does not need and probably never will. The truth is narrower and more useful: architecture is the set of decisions that are expensive to reverse. Everything else is just code, and code is cheap to change. So the real skill of designing an architecture for a brand-new app is not drawing the most elaborate diagram. It is identifying the handful of decisions that will be costly to unwind later, making those well, and deliberately deferring everything else.

That framing matters because a new app sits in a uniquely dangerous spot. You have the most freedom you will ever have — no legacy, no migrations, no users to keep online — and the least information you will ever have, because you do not yet know your real load, your real domain edges, or which features will matter. Over-commit now and you build a distributed system for a product that turns out to need a single database. Under-commit and your business logic fuses to your web framework, your domain model contradicts itself across three modules, and the “quick” prototype calcifies into something nobody can safely change. Both failures are avoidable, and they are avoided by the same move: get the boundaries and the model right, keep the volatile details swappable, and refuse to gold-plate.

This guide gives you a sequenced workflow built from eight AI agent skills, each packaging a canonical book so your coding agent — Claude, Claude Code, Claude Cowork, Codex, Cursor, OpenClaw, Hermes Agent, or any other agentskills.io-compatible agent — applies the actual framework rather than a vague memory of it. You will establish dependency boundaries that keep frameworks and databases at arm’s length (Clean Architecture), model the business so the code reads like the domain and the boundaries fall where the business actually splits (Domain-Driven Design), size the system honestly with back-of-the-envelope math instead of cargo-culting scale (System Design), make deliberate data decisions about storage engines, replication, and consistency (Data-Intensive Apps), treat complexity itself as the enemy at every step (Software Design), design for the failures production will inevitably throw at you (Release It!), apply the meta-disciplines that keep the codebase reversible and orthogonal (Pragmatic Programmer), and ruthlessly cut scope so you build the essential version and ship it (The 37signals Way).

Architecture is the set of decisions that are expensive to reverse. Make those well; defer everything else.

A note on order and dose. The phases below run from the most foundational and hardest-to-reverse (boundaries, domain) to the more tunable (data, resilience) to the cross-cutting disciplines (complexity, reversibility, scope-cutting) that you apply throughout. You do not need every skill at full strength on day one — a weekend project might use three of these lightly. A funded team building toward a real launch will want the whole stack. If you have not yet validated that anyone wants this app, start with the sibling guide on how to create a new app with AI skills and come back here once you have earned the right to build. And if you are hardening something you already vibe-coded, the guide on taking a vibe-coded prototype to production is the better entry point.

Phase 1: Draw the boundaries that make every other decision reversible

Start here because this is the decision that buys back all the others. The single most leveraged thing you can do for a new app’s architecture is to keep your business rules independent of the things most likely to change underneath them — the web framework, the database, the third-party APIs, the delivery mechanism. The Clean Architecture skill, from Robert C. Martin’s book, distills this into one rule that is worth memorizing: source code dependencies point inward, from frameworks toward use cases toward entities. Nothing in an inner circle may know anything about an outer one.

Why this is the highest-leverage move: every boundary you draw correctly is an option to defer or swap a decision later without a rewrite. If your business logic does not know whether data lives in SQLite or Postgres or DynamoDB, you can start with the simplest option and change your mind when you have evidence — which, for a new app, you always will. The database is a detail. The web framework is a detail. The payment provider is a detail. Details should be plugins to your business rules, not the skeleton your application is built around. The skill’s diagnostic question is the one to keep asking yourself: can you test your core business rules with no database running, no web server, and no framework loaded? If the answer is no, your dependencies point the wrong way, and you will feel it as pain on every future change.

The mechanism that enforces the rule is Dependency Inversion. Your use case defines an interface — say InvoiceRepository — and your infrastructure provides the implementation, PostgresInvoiceRepository. A controller translates an incoming HTTP request into a plain request object, calls the use case, and gets back a plain response object; an ORM entity or a framework type must never cross that boundary. Have your agent design this layering for a concrete feature in your actual app rather than in the abstract.

Prompt

Use the clean-architecture skill to design the Clean Architecture layering for the checkout flow of my marketplace app — define the entities, the PlaceOrder use case with its request and response models, the repository and payment-gateway interfaces it depends on, and show how the HTTP controller and the Postgres adapter sit in the outer ring without leaking framework or ORM types into the core

Clean Architecture
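
As a minimal sketch of the Dependency Inversion arrangement described above, assuming Python: only InvoiceRepository and PostgresInvoiceRepository come from the text, while the Invoice fields, the PayInvoice use case, and the in-memory adapter are hypothetical names chosen for illustration. The use case depends only on an interface the core owns, so it runs with no database, web server, or framework loaded.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass
class Invoice:
    """Entity: plain data plus a business rule, no ORM or framework types."""
    invoice_id: str
    amount_cents: int
    paid: bool = False

    def mark_paid(self) -> None:
        if self.paid:
            raise ValueError("invoice is already paid")
        self.paid = True


class InvoiceRepository(Protocol):
    """Interface owned by the core; infrastructure provides the implementation."""
    def get(self, invoice_id: str) -> Invoice: ...
    def save(self, invoice: Invoice) -> None: ...


@dataclass(frozen=True)
class PayInvoiceRequest:
    invoice_id: str


@dataclass(frozen=True)
class PayInvoiceResponse:
    invoice_id: str
    paid: bool


class PayInvoice:
    """Use case (hypothetical): knows the interface, never the database."""
    def __init__(self, invoices: InvoiceRepository) -> None:
        self._invoices = invoices

    def execute(self, request: PayInvoiceRequest) -> PayInvoiceResponse:
        invoice = self._invoices.get(request.invoice_id)
        invoice.mark_paid()
        self._invoices.save(invoice)
        return PayInvoiceResponse(invoice.invoice_id, invoice.paid)


class InMemoryInvoiceRepository:
    """Adapter used for tests. A PostgresInvoiceRepository would expose the
    same two methods and live in the infrastructure layer."""
    def __init__(self) -> None:
        self._rows: Dict[str, Invoice] = {}

    def get(self, invoice_id: str) -> Invoice:
        return self._rows[invoice_id]

    def save(self, invoice: Invoice) -> None:
        self._rows[invoice.invoice_id] = invoice


# The diagnostic question in practice: exercise the rule with nothing else running.
repo = InMemoryInvoiceRepository()
repo.save(Invoice("inv-1", 4200))
response = PayInvoice(repo).execute(PayInvoiceRequest("inv-1"))
```

A controller in the outer ring would build the PayInvoiceRequest from the HTTP payload and serialize the PayInvoiceResponse, so neither framework nor ORM types reach the use case.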

The trap to avoid is treating the four concentric circles as sacred geometry. They are not. The number of layers is not fixed, and a small app can collapse interface adapters and frameworks into a single outer layer without sinning. What matters is the direction of dependencies, not the count of folders. Draw full boundaries — with interfaces on both sides — only at points of genuine volatility: the database, external services, the delivery mechanism. Everywhere else, a partial boundary or no boundary at all is the correct, simpler choice. Ask your agent to keep you honest about where a boundary earns its cost.

Prompt

Use the clean-architecture skill to review my proposed module layout for over-engineering — I have separate interface, adapter, and infrastructure layers for every single feature including a trivial settings page — and tell me which boundaries are pulling their weight at points of real volatility and which are ceremony I should collapse

Clean Architecture

One more decision belongs here because it is genuinely hard to reverse: the monolith-versus-services question. The skill is blunt about it — a microservice with a fat shared data model is just a distributed monolith, which is worse than a clean monolith in almost every way. Services are units of deployment, not units of architecture. For a new app, the right default is a well-structured modular monolith with clean internal boundaries; you can extract a service later precisely because the boundaries already exist. Splitting first, before you know where the seams are, is the expensive mistake.

Phase 2: Model the domain so the boundaries fall where the business splits

Clean Architecture tells you that you need boundaries and which way dependencies should flow. It does not tell you where to put the boundaries. For that you need to understand the business, and the Domain-Driven Design skill, from Eric Evans’s book, is how you turn that understanding into structure. Its foundational claim is that the model is the code and the code is the model: when the words your team says out loud are the exact words in the codebase — a Ubiquitous Language — an entire class of translation bugs simply ceases to exist.

This is radically cheaper to do for a new app than at any later point, because you are inventing the vocabulary from a blank page. Name things after domain concepts, never technical roles. An Order that knows how to place() and cancel() is worlds better than an OrderManager with a process() method, because a domain expert can read the former and tell you precisely where it is wrong. The skill treats naming difficulty as a design signal in its own right: if a concept resists a clean name, your model is probably incorrect, and that is information, not an annoyance.

The two strategic ideas that do the most architectural work are bounded contexts and aggregates, and they map directly onto the boundaries from Phase 1. A bounded context is a region within which a word means exactly one thing. “Customer” in your billing context and “Customer” in your support context can be — should be — different models with different attributes, and trying to unify them into one omniscient Customer class is a classic mistake that produces a bloated, self-contradicting object. Context boundaries are where you will eventually consider splitting services, so finding them now is the same work as deciding your future architecture. Have your agent map them from the language your team actually uses.

Prompt

Use the domain-driven-design skill to build the ubiquitous language and bounded contexts for my marketplace app where the team uses listing, product, order, and fulfillment interchangeably across the seller dashboard, the buyer checkout, and the warehouse integration, and tell me exactly where the same word legitimately means different things in different contexts

Domain-Driven Design

Aggregates are the other pillar, and they are where data consistency starts. An aggregate is a cluster of objects governed by a single root that enforces invariants — everything inside the aggregate boundary is immediately consistent, and everything outside it is eventually consistent. This is not a minor implementation detail; it is the decision that determines your transactional boundaries and, downstream, how the system behaves under concurrency and scale. The skill’s hard rules: keep aggregates small (one root plus a minimal cluster), and reference other aggregates by ID rather than holding direct object references. The most common new-app failure here is the anemic domain model, where entities are passive data bags and all the real logic leaks into a swamp of service classes. Push the behavior back into the entities and value objects where it belongs.

Prompt

Use the domain-driven-design skill to design the Order aggregate for my checkout context — identify the aggregate root, which fields are value objects like Money and Address, what invariants the root must enforce such as total matching the sum of line items and an order never being confirmed without a payment reference, and which related concepts like Buyer and Listing should be referenced by ID instead of embedded

Domain-Driven Design
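
One way the aggregate rules above might look in code is sketched below, assuming Python. The field names, the status strings, and the exact invariants are illustrative assumptions, not output of the Domain-Driven Design skill. The total is derived from the line items rather than stored, which is one simple way to guarantee the two can never disagree.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class Money:
    """Value object: immutable and compared by value."""
    amount: Decimal
    currency: str

    def __add__(self, other: "Money") -> "Money":
        if other.currency != self.currency:
            raise ValueError("cannot add different currencies")
        return Money(self.amount + other.amount, self.currency)

    def times(self, quantity: int) -> "Money":
        return Money(self.amount * quantity, self.currency)


@dataclass(frozen=True)
class LineItem:
    listing_id: str  # the Listing aggregate is referenced by ID only
    unit_price: Money
    quantity: int


@dataclass
class Order:
    """Aggregate root: every change goes through it, so it can check
    the invariants on each call."""
    order_id: str
    buyer_id: str  # reference by ID, not an embedded Buyer object
    currency: str
    items: List[LineItem] = field(default_factory=list)
    payment_reference: Optional[str] = None
    status: str = "draft"

    def add_item(self, listing_id: str, unit_price: Money, quantity: int) -> None:
        if self.status != "draft":
            raise ValueError("cannot change a confirmed order")
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        if unit_price.currency != self.currency:
            raise ValueError("line item currency must match the order")
        self.items.append(LineItem(listing_id, unit_price, quantity))

    @property
    def total(self) -> Money:
        # Derived on demand, so it cannot drift from the line items.
        total = Money(Decimal("0"), self.currency)
        for item in self.items:
            total = total + item.unit_price.times(item.quantity)
        return total

    def confirm(self, payment_reference: str) -> None:
        if not self.items:
            raise ValueError("cannot confirm an empty order")
        if not payment_reference:
            raise ValueError("payment reference is required")
        self.payment_reference = payment_reference
        self.status = "confirmed"


order = Order("o-1", "buyer-7", "USD")
order.add_item("listing-3", Money(Decimal("12.50"), "USD"), 2)
```

Note that the behavior (add_item, confirm) lives on the entity itself, which is the opposite of the anemic model the text warns about.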

Finally, apply strategic design so you do not spread your best thinking thin. Not every part of the app deserves a deep model. Identify your core domain — the thing that is genuinely your competitive advantage — and concentrate your modeling effort there. For supporting subdomains, build something serviceable without over-engineering it; for generic subdomains like authentication, email delivery, or payments, buy or use open source rather than lovingly hand-crafting what is already a commodity. This is where DDD and the anti-gold-plating theme of this guide converge: depth where it differentiates, thin adapters everywhere else.

Prompt

Use the domain-driven-design skill to classify the subdomains of my marketplace app into core, supporting, and generic — given that our differentiator is a matching algorithm between buyers and niche sellers — and tell me where to invest deep domain modeling versus where to buy or use an off-the-shelf component instead of building

Domain-Driven Design

DDD and Clean Architecture are partners, not rivals. The DDD repository interface lives in the domain layer; its implementation lives in infrastructure — which is exactly where the Dependency Rule wants it. You are describing the same well-structured system from two angles: one gives you the direction of dependencies, the other gives you the location and meaning of the boundaries they cross.

Phase 3: Size the system honestly — then build far less than you fear

Now comes the discipline that cuts against every instinct you absorbed from reading architecture blog posts: prove how small your system can be before you build it big. The System Design skill, from Alex Xu’s System Design Interview, is as valuable for telling you what not to build as for telling you what to build. Its first principle is to start with requirements, not solutions, and its estimation tools let you demonstrate, on the back of an envelope, that you almost certainly do not need sharding, a service mesh, or a multi-region deployment yet.

Do the arithmetic before you provision anything. The skill hands you the formulas: average queries per second is daily active users times actions per user per day, divided by 86,400 seconds, with peak load typically two to five times the average; storage is records per day times record size times retention period. For a new app with a few hundred or a few thousand users, that calculation reliably says the same thing: a single well-indexed database with a cache in front of read-heavy paths will carry you comfortably for a long time. The skill names premature sharding as an explicit mistake: scale vertically first, then add read replicas, then cache aggressively, and shard last, only when an estimate or a real bottleneck demands it. Have your agent run the numbers and then tell you plainly which techniques you can skip.

Prompt

Use the system-design skill to do a back-of-the-envelope capacity estimate for my marketplace app assuming 5,000 daily active users who each view 20 listings and place 0.5 orders per day, with listings averaging 50KB including images, and then tell me honestly which scaling techniques — sharding, read replicas, CDN, message queues, multi-region — I do NOT need yet at this scale

System Design
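
The estimation formulas above are small enough to run directly. The sketch below, in Python, plugs in the figures from the prompt (5,000 daily active users, 20 listing views plus 0.5 orders each, 50KB per listing). The 500 new listings per day and the 365-day retention are assumptions added here purely to make the storage formula concrete.

```python
SECONDS_PER_DAY = 86_400


def average_qps(daily_active_users: int, actions_per_user_per_day: float) -> float:
    """Average queries per second across a whole day."""
    return daily_active_users * actions_per_user_per_day / SECONDS_PER_DAY


def peak_qps(avg_qps: float, peak_factor: float) -> float:
    """Peak load as a multiple (typically 2 to 5) of the average."""
    return avg_qps * peak_factor


def storage_bytes(records_per_day: int, record_size_bytes: int, retention_days: int) -> int:
    """Raw storage before indexes, replication, or backups."""
    return records_per_day * record_size_bytes * retention_days


avg = average_qps(5_000, 20 + 0.5)
peak = peak_qps(avg, 5)  # pessimistic end of the 2x to 5x range
listing_storage = storage_bytes(500, 50_000, 365)  # 500/day and 365 days are assumed

print(f"average ~{avg:.2f} QPS, peak ~{peak:.2f} QPS")
print(f"listing storage ~{listing_storage / 1e9:.1f} GB per year")
```

Under these assumptions the app averages a little over one query per second and peaks at about six, with roughly 9 GB of listing data per year. That is far below the point where sharding or multi-region deployment would enter the conversation.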

When you do hit a decision that the math justifies, the skill gives you the right building block for the specific problem rather than a kitchen sink. A read-heavy listing page wants a cache-aside layer with a sensible TTL and explicit invalidation on write. A spiky, slow background job — image processing, sending a batch of emails, generating a report — wants a message queue so the work is decoupled from the request path and the queue absorbs the spike. Global static assets want a CDN at the edge so your origin only serves the API. Reach for each block when a real bottleneck or a credible estimate points to it, and not a moment before. The skill’s four-step process — scope, high-level design, deep-dive on the riskiest components, then tradeoffs — keeps you from either staying too abstract or drowning in premature detail.

Prompt

Use the system-design skill to design the asynchronous flow for my listing-image processing step, which takes 6 seconds per upload and currently blocks the create-listing request — use a message queue and a worker pool, and include how the client is notified when processing completes and how I keep the create-listing happy path fast and responsive

System Design
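
For the read-heavy listing page mentioned above, a cache-aside layer with a TTL and explicit invalidation on write might be sketched as follows, assuming Python. The ListingCache class, the in-process dictionary standing in for a real cache, and the injectable clock are all hypothetical. A production version would more likely sit on an external cache, but the read and write paths would keep the same shape.

```python
import time
from typing import Callable, Dict, List, Tuple


class ListingCache:
    """Cache-aside: load from the source on a miss, invalidate on every write."""

    def __init__(
        self,
        load: Callable[[str], str],
        store: Callable[[str, str], None],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load
        self._store = store
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> str:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]  # fresh hit: the database is not touched
        value = self._load(key)  # miss or expired: read the source of truth
        self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def update(self, key: str, value: str) -> None:
        self._store(key, value)  # write to the database first
        self._entries.pop(key, None)  # then invalidate, so the next read reloads


# Stand-ins for a real database and a controllable clock.
database: Dict[str, str] = {"listing-1": "Vintage lamp"}
loads: List[str] = []
now = [0.0]


def load_from_db(key: str) -> str:
    loads.append(key)
    return database[key]


def save_to_db(key: str, value: str) -> None:
    database[key] = value


cache = ListingCache(load_from_db, save_to_db, ttl_seconds=60.0, clock=lambda: now[0])
```

Invalidating rather than updating the cached entry on write keeps the write path simple. The TTL then acts as a safety net for any write that bypasses this class.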

The point of this phase is restraint backed by evidence. You are allowed to build for scale — once you can show, with numbers, that you are about to need it. Until then, the most professional architecture decision is the boring one.

Phase 4: Make deliberate data decisions, because data outlives the code

If Phase 3 is about how much system you need, this phase is about getting the part that lasts the longest correct. The Data-Intensive Apps skill, from Martin Kleppmann’s Designing Data-Intensive Applications, opens with the observation that should govern this entire phase: data outlives code. Applications get rewritten and frameworks come and go, but the data persists for years, sometimes decades. A careless schema or an unconsidered consistency model is the kind of mistake you live with long after the framework you chose this week is forgotten.

Start with the data model and storage engine, chosen against your actual access patterns rather than habit or hype. The skill’s framing: relational models excel at many-to-many relationships and ad-hoc queries; document models reduce impedance mismatch for self-contained aggregates with strong locality; graph models win for recursive traversals over densely connected data. Underneath, storage engines trade reads against writes — log-structured (LSM-tree) engines like Cassandra’s give you excellent write throughput at the cost of read amplification, while page-oriented (B-tree) engines like Postgres’s give predictable read latency. The mistake the skill calls out by name is choosing a database by popularity instead of by fit. Notice, too, how cleanly this phase rests on Phase 2: your DDD aggregates often map naturally to documents, while your relational integrity constraints map to foreign keys. Have your agent reason it through per workload.

Prompt

Use the ddia-systems skill to evaluate the right data model and storage-engine characteristics for the four distinct data workloads in my marketplace app — user and seller profiles that are read-heavy and relational, an append-only event log of every listing view, buyer-to-seller messaging threads, and a recommendation graph of who-bought-what — and tell me whether to use one Postgres instance or introduce polyglot persistence

Data-Intensive Apps

Then make the consistency and replication decisions explicit, because the defaults will surprise you. The skill is emphatic that most databases default to read committed or snapshot isolation, not serializable, which means anomalies like write skew can appear in production even though your code looks correct. For a new app this is exactly the moment to decide, per operation, where you need a serializable transaction or an explicit SELECT ... FOR UPDATE, and where eventual consistency is genuinely fine. The classic example is inventory: two buyers racing to purchase the last item of stock can each read “one left” and both complete the purchase unless you lock the row or fold the check into the write itself. Strictly speaking, when both transactions write the same stock row this is a lost update; it is write skew when each checks the shared stock and then writes a different row, such as its own order. Replication has its own tradeoffs: single-leader with read replicas is the simple default for a read-heavy app, and reads from the leader stay strongly consistent, but replication lag introduces read-your-writes anomalies you must design around (a user updates their profile, then sees the stale version served from a replica).

Prompt

Use the ddia-systems skill to walk through the concurrency and consistency design for my checkout — specifically the case where two buyers try to purchase the last unit of a listing at the same time — explain the write-skew risk under snapshot isolation, show me the correct locking or serializable-transaction approach, and tell me which other operations in a marketplace can safely tolerate eventual consistency

Data-Intensive Apps
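
For the last-unit race described above, one option alongside SELECT ... FOR UPDATE is to make the stock check and the decrement a single atomic statement, so there is no gap between reading and writing. The sketch below assumes Python with the standard-library sqlite3 module so that it runs anywhere. SQLite does not support FOR UPDATE, so this shows the conditional-update technique, not row locking. The listings table and purchase_one function are hypothetical, and the two calls run one after the other to show the outcome, not truly concurrently.

```python
import sqlite3


def purchase_one(conn: sqlite3.Connection, listing_id: int) -> bool:
    """Try to take one unit of stock. Returns False when none is left."""
    with conn:  # commits on success, rolls back if an exception escapes
        cursor = conn.execute(
            # The WHERE clause is the guard: the row only changes if stock remains.
            "UPDATE listings SET stock = stock - 1 WHERE id = ? AND stock > 0",
            (listing_id,),
        )
        return cursor.rowcount == 1  # 0 rows changed means the buyer lost the race


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listings ("
    "id INTEGER PRIMARY KEY, "
    "stock INTEGER NOT NULL CHECK (stock >= 0))"  # last line of defense
)
conn.execute("INSERT INTO listings (id, stock) VALUES (1, 1)")
conn.commit()

first_buyer = purchase_one(conn, 1)
second_buyer = purchase_one(conn, 1)
remaining = conn.execute("SELECT stock FROM listings WHERE id = 1").fetchone()[0]
```

When the purchase involves more than one statement, for example creating an order row after checking stock, the guard no longer fits in one UPDATE. That is the case where an explicit row lock or a serializable transaction earns its place.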

A final, durable idea from this skill that pays off enormously for a new app: separate your system of record from your derived data. Your source-of-truth tables are one thing; your search index, your caches, and your materialized analytics views are derived from them and should be rebuildable from source. The skill points to change data capture and event sourcing as the mechanisms. You do not have to adopt full event sourcing on day one — but designing so that derived data is always reconstructable from the system of record means a future requirement change does not force a terrifying one-way data migration. Reach for this when you have a real second read pattern (a search index, an analytics dashboard), not preemptively.
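
The rebuildable-from-source property can be illustrated with a deliberately tiny sketch, assuming Python. The listings dictionary stands in for the system of record and the word index stands in for a search index; both are hypothetical. The point is only that the derived structure is a pure function of the source data.

```python
from collections import defaultdict
from typing import Dict, Set

# System of record (stand-in for the source-of-truth table).
listings: Dict[int, str] = {
    1: "Vintage brass lamp",
    2: "Desk lamp",
    3: "Oak desk",
}


def build_search_index(source: Dict[int, str]) -> Dict[str, Set[int]]:
    """Derive a word -> listing IDs index purely from the source data."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for listing_id, title in source.items():
        for word in title.lower().split():
            index[word].add(listing_id)
    return dict(index)


search_index = build_search_index(listings)

# After a change to the source (or a corrupted index), the derived data is
# thrown away and rebuilt instead of being migrated in place.
listings[4] = "Brass candlestick"
search_index = build_search_index(listings)
```

Change data capture and event sourcing replace the full rebuild with incremental updates, but they preserve the same guarantee: the index can always be regenerated from the system of record.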

Phase 5: Treat complexity as the enemy and keep modules deep

Every phase so far has been adding structure. This one is the counterweight that keeps the structure from becoming its own disease. The Software Design skill, from John Ousterhout’s A Philosophy of Software Design, rests on a single sentence worth taping to your monitor: the greatest limitation in building software is our own ability to understand the systems we are creating. Complexity — anything about the structure that makes the system hard to understand and modify — is the enemy, and the right test for every design decision is simply: does this make the overall system simpler or more complex?

The skill’s most useful concept for a new app is deep versus shallow modules. A module’s interface is the complexity it imposes on the rest of the system; its implementation is the functionality it provides. A deep module hides significant power behind a simple interface — think of how file.read(path) conceals disk blocks, buffering, caching, and encoding. A shallow module is the opposite: a complicated interface wrapping very little, so it adds more cognitive load than it removes. This is the precise corrective to a real risk in this guide — that you read Phases 1 and 2 and contract “classitis,” the disease of splitting everything into a swarm of tiny single-method classes and interfaces, each one a new boundary the next developer must learn. Clean layering and deep modules are allies; clean layering and classitis are not. Have your agent judge whether your abstractions are actually pulling their weight.

Prompt

Use the software-design-philosophy skill to review my proposed design for the order-processing module against deep-versus-shallow module principles — I have split it into OrderValidator, OrderEnricher, OrderPersister, and OrderNotifier classes that each expose one method and mostly pass data to the next — and tell me whether I have created shallow pass-through modules I should consolidate into one deep module with a simple interface

Software Design

Two more ideas from this skill sharpen the whole architecture. The first is information hiding and its evil twin, information leakage: when a single design decision is reflected in multiple modules, a change to it ripples everywhere, and that is one of the strongest red flags in all of software design. Each module should encapsulate a piece of knowledge — a data format, a protocol, a policy — that the rest of the system does not need to know. The second is the strategic-versus-tactical distinction, which is the deepest justification for this entire guide. Tactical programming optimizes for getting the next feature working right now and quietly accumulates complexity with every shortcut; strategic programming invests a steady ten to twenty percent extra effort to keep the design clean, and that investment compounds until the strategic codebase is faster to work in within months. The skill is pointed about startups specifically: early shortcuts compound into crippling debt exactly as the team grows and the cost of confusion multiplies.

Prompt

Use the software-design-philosophy skill to audit this module boundary for information leakage — my pricing logic reads a discount-rules format that is also parsed independently in the admin UI and the reporting job, so the same knowledge lives in three places — and show me how to encapsulate that knowledge in one deep module so a change to the rules format touches exactly one file

Software Design
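
To make the information-hiding fix concrete, here is a minimal sketch assuming Python and a hypothetical rules format of one CODE=percent pair per line (the actual format in the prompt is not specified). Only the DiscountRules class knows how the rules are written down. Pricing, the admin UI, and the reporting job would all call its methods instead of parsing the text themselves.

```python
from typing import Dict, List


class DiscountRules:
    """Deep module: a small interface hiding the rules format entirely."""

    def __init__(self, percent_by_code: Dict[str, int]) -> None:
        self._percent_by_code = dict(percent_by_code)

    @classmethod
    def parse(cls, text: str) -> "DiscountRules":
        # The only place in the system that understands the text format.
        percent_by_code: Dict[str, int] = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            code, percent = line.split("=", 1)
            percent_by_code[code.strip()] = int(percent)
        return cls(percent_by_code)

    def price_after_discount(self, price_cents: int, code: str) -> int:
        """Used by pricing. Unknown codes leave the price unchanged."""
        percent = self._percent_by_code.get(code, 0)
        return price_cents * (100 - percent) // 100

    def codes(self) -> List[str]:
        """Used by the admin UI and reporting, which need no parsing logic."""
        return sorted(self._percent_by_code)


rules = DiscountRules.parse("# spring campaign\nSPRING=10\nVIP = 25\n")
```

If the format later moves to JSON or a database table, only parse changes; callers keep using price_after_discount and codes.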

Your primary job is a great design that happens to work, not working code that happens to have a design.

Phase 6: Design for the failures production will absolutely send you

A design that only works when every dependency is healthy is not finished — it is a demo. The Release It! skill, from Michael Nygard’s book, starts from a premise every new-app builder underestimates: the software that passes QA is not the software that survives production, because production is actively hostile. Every system will eventually be pushed past its design limits; the only question is whether it degrades gracefully or collapses catastrophically. And the cheapest time to build in graceful degradation is now, in the architecture, not after your first 2 a.m. outage.

The most important lesson is that integration points are the number-one killer, and the most dangerous failure is not a crash but a slow response. A dependency that returns an error quickly is survivable; a dependency that hangs ties up your threads, exhausts your connection pools, and propagates the delay up the entire call chain until the whole system freezes — with no error in the logs to tell you why. The non-negotiable architectural decision that follows is that every outbound call gets a timeout — both connect and read timeouts — and the critical ones get a circuit breaker that trips after a threshold of failures and periodically tests for recovery. A circuit breaker that has tripped is not a bug; it is the system correctly protecting itself from a downstream failure. Bulkheads complete the picture: isolate connection and thread pools per dependency so one failing integration cannot drain the resources the rest of the app needs. For a new app integrating a payment provider, an email service, and maybe an LLM API, these patterns are the difference between a contained blip and a total outage. Have your agent design them around your real integration points.

Prompt

Use the release-it skill to design the resilience layer for my marketplace app, which makes outbound calls to Stripe for payments, SendGrid for email, and a third-party shipping-rate API during checkout — use timeouts, circuit breakers, and bulkheads, specify sensible connect and read timeout values and breaker thresholds for each, and show how checkout degrades gracefully when the shipping-rate API is slow or down rather than hanging the whole request

Release It!
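
The breaker behavior described above (trip after a threshold of failures, then periodically test for recovery) can be sketched in a few dozen lines, assuming Python. The CircuitBreaker class, its default thresholds, the 799-cent fallback rate, and the always_times_out stand-in are all hypothetical. The sketch is single-threaded and lets every call through while half-open, which a production breaker would restrict to one trial call.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised instead of calling a dependency that is known to be failing."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._failure_threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self._reset_timeout:
            return "half-open"  # time to let a trial call through
        return "open"

    def call(self, func: Callable[[], int]) -> int:
        if self.state == "open":
            raise CircuitOpenError("dependency is failing; call skipped")
        try:
            result = func()
        except Exception:
            self._failures += 1
            # A failed trial call, or reaching the threshold, (re)opens the breaker.
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self._opened_at = None  # success closes the breaker
        return result


def shipping_rate_cents(breaker: CircuitBreaker, fetch_rate: Callable[[], int],
                        fallback_cents: int = 799) -> int:
    """Degrade gracefully: quote a flat fallback rate when the API is unavailable."""
    try:
        return breaker.call(fetch_rate)
    except Exception:
        return fallback_cents


def always_times_out() -> int:
    attempts.append(1)
    raise TimeoutError("shipping API did not answer in time")


attempts = []
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=30.0, clock=lambda: now[0])
```

The breaker only helps if the wrapped call can fail fast, so the function passed in still needs its own connect and read timeouts. With the requests library, for instance, that is the `timeout=(connect, read)` argument.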

Two more Release It! ideas belong in the architecture from the start rather than bolted on later. The first is the pairing of unbounded result sets and steady-state hygiene: a list endpoint with no LIMIT is a harmless query in testing that becomes an out-of-memory crash once real data outgrows your assumptions, so paginate every list from day one and design automatic cleanup for the cruft that accumulates (old sessions, logs, temp files). The second is the decoupling of deployment from release: deploying code to servers and exposing it to users are separate operations, and separating them with feature flags and a fast rollback path means a bad change is a flag flip away from being undone rather than a thirty-minute roll-forward. Build your schema migrations to be backward-compatible (expand-contract) so old and new code can run side by side during a deploy. Have your agent design the readiness essentials.

Prompt

Use the release-it skill to design the production-readiness essentials for my new app before launch — pagination and result-set limits on every list endpoint, deep health checks that verify the database and queue are reachable rather than just that the process is alive, the RED metrics I should emit per endpoint, and a backward-compatible expand-contract migration plan so I can deploy without downtime

Release It!
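
As an illustration of paginating every list from day one, the sketch below caps the page size on the server and uses keyset pagination, where the cursor is the last ID seen. It assumes Python with the standard-library sqlite3 module; the listings table, the list_listings function, and the ceiling of 100 rows are hypothetical.

```python
import sqlite3
from typing import List, Optional, Tuple

MAX_PAGE_SIZE = 100  # hypothetical server-side ceiling


def list_listings(
    conn: sqlite3.Connection, after_id: int = 0, limit: int = 20
) -> Tuple[List[Tuple[int, str]], Optional[int]]:
    """Return one page of (id, title) rows plus the cursor for the next page."""
    limit = max(1, min(limit, MAX_PAGE_SIZE))  # never trust the caller's limit
    rows = conn.execute(
        "SELECT id, title FROM listings WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit + 1),  # fetch one extra row to learn whether more exist
    ).fetchall()
    page = rows[:limit]
    next_cursor = page[-1][0] if len(rows) > limit else None
    return page, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO listings (id, title) VALUES (?, ?)",
    [(i, f"Listing {i}") for i in range(1, 251)],
)
conn.commit()

# A caller asking for 1,000 rows still receives at most MAX_PAGE_SIZE.
first_page, cursor = list_listings(conn, limit=1000)
last_page, end_cursor = list_listings(conn, after_id=200, limit=100)
```

Keyset pagination keeps each query bounded no matter how deep the client pages, whereas a large OFFSET forces the database to walk past every skipped row first.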

Resist over-rotating here too. You do not need full chaos engineering, multi-region failover, and a service mesh for your first thousand users. You need timeouts on every call, circuit breakers and bulkheads on the critical ones, pagination everywhere, deep health checks, and a rollback you trust. That short list prevents the overwhelming majority of self-inflicted production disasters at a tiny fraction of the cost.

Phase 7: Apply the meta-disciplines that keep the whole thing reversible

The skills so far each govern a slice of the system. The Pragmatic Programmer skill, from Hunt and Thomas, operates one level up — the cross-cutting habits that determine whether the architecture stays easy to change over years rather than rotting into something nobody dares touch. Four of its ideas are load-bearing for a new app, and they tie the previous phases together.

First, the tracer bullet. Rather than building the system layer by layer and praying the pieces connect at the end, build one thin but completely real vertical slice through every layer — from the HTTP entry point through the use case to the database and back — and keep it as production code. It gives you end-to-end feedback on day two instead of month two, and it proves the boundaries you designed in Phase 1 actually link up before you invest in fleshing them out. This is the single best first thing to build once your boundaries are drawn.

Prompt

Use the pragmatic-programmer skill to design the thinnest possible tracer bullet for my marketplace app — one real end-to-end slice from an authenticated create-listing request through the use case and repository to a saved row and back to a JSON response — that exercises every architectural layer with minimal functionality so I can confirm the wiring and the Clean Architecture boundaries actually connect before building any real features

Pragmatic Programmer

Second, reversibility: there are no final decisions, so abstract third-party dependencies behind your own interfaces and never let a vendor’s API leak into your business logic. This is the same instinct as Clean Architecture’s adapters, stated as a habit — the forking-road test asks whether you could switch from Postgres to a different store, or one LLM provider to another, in a week. If not, you are coupled, and you should fix it while it is still cheap. Third, orthogonality: changing the database should not break the UI; changing the auth provider should not touch billing. The test is to ask how many modules a dramatic change to one requirement would affect, and to aim for the answer “one.” Fourth, DRY for knowledge, not for coincidence — two code blocks that look alike but encode different business rules are not duplication, and merging them couples concepts that should evolve independently. This is the subtle counterpoint to Phase 5’s information-hiding: deduplicate knowledge ruthlessly, but do not deduplicate things that merely resemble each other. Ask your agent to audit a planned design against these.

Prompt

Use the pragmatic-programmer skill to audit my plan to call the OpenAI SDK directly from four different service classes against the reversibility and orthogonality principles — show me the adapter interface that would let me swap the model provider later without touching business logic, and flag anywhere I would be coupling unrelated modules to a single vendor's API shape

Pragmatic Programmer
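
The adapter idea from the reversibility paragraph might be sketched as follows, assuming Python. TextGenerator, ListingDescriptionWriter, and CannedTextGenerator are hypothetical names, and no vendor SDK calls are shown: the real adapter would be the single class that imports the SDK and translates to and from this interface.

```python
from typing import List, Protocol


class TextGenerator(Protocol):
    """The app's own interface, shaped by what the business logic needs."""
    def generate(self, prompt: str) -> str: ...


class ListingDescriptionWriter:
    """Business logic: depends on TextGenerator, not on any vendor's API shape."""

    def __init__(self, generator: TextGenerator) -> None:
        self._generator = generator

    def draft(self, title: str, condition: str) -> str:
        prompt = f"Write a one-sentence description for a {condition} {title}."
        return self._generator.generate(prompt).strip()


class CannedTextGenerator:
    """Test double. A production adapter would wrap the chosen provider's SDK
    behind the same generate() method."""

    def __init__(self, reply: str) -> None:
        self.reply = reply
        self.prompts: List[str] = []

    def generate(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


fake = CannedTextGenerator("  A well-kept brass lamp from the 1960s.  ")
description = ListingDescriptionWriter(fake).draft("brass lamp", "used")
```

Swapping providers then means writing one new adapter class. The four service classes from the prompt would each receive a TextGenerator and never change.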

The cultural glue is the Broken Window theory: the first hack is the most expensive because it grants permission for every hack after it. In a brand-new codebase, establishing zero tolerance for unexplained broken windows — fix it now, or board it up with a tracked ticket — is one of the cheapest high-leverage habits available, and it protects every architectural decision you made in the earlier phases from slow erosion.

Phase 8: Cut scope ruthlessly so you build the essential version

The final discipline is the one that decides whether any of the above ever ships. The 37signals Way skill, distilled from Getting Real, Rework, and Shape Up, exists to counterbalance the natural tendency of an architecture document to grow without limit. Its core principle is the perfect closing note for this guide: build less. The best products do fewer things exceptionally well, and simplicity is the destination you fight toward, not the place you start. Half a product beats a half-assed product.

The mechanism is fixing time and flexing scope. Instead of estimating how long an open-ended architecture will take (an estimate that will balloon to fill whatever time you allow), set an appetite — how much time this work is genuinely worth — and then cut scope to fit it. For a new app, this is the antidote to architectural gold-plating: the message-queue infrastructure, the event-sourcing layer, the generic plugin system, the configurability for use cases you have invented in your head. The skill’s framing of every preference as “a decision the team could not or would not make” applies directly to architecture — every speculative abstraction is a decision you are deferring to an imaginary future at the cost of present complexity. Have your agent shape the architectural work into something bounded.

Prompt

Use the 37signals-way skill to shape the architecture work for the first version of my marketplace app into a six-week appetite — given the boundaries, domain model, data decisions, and resilience patterns I have planned, tell me what is genuinely essential for launch versus what is gold-plating I should cut or defer, and identify the rabbit holes that could blow the scope

The 37signals Way

This is also where YAGNI — you aren’t gonna need it — gets teeth. The skill’s curator mindset says to say no by default so the great decisions can breathe, and to make tiny, reversible decisions rather than big irreversible ones. Notice how neatly that closes the loop with the opening principle of this guide: architecture is the set of decisions expensive to reverse, so the fewer of those you make speculatively, the better. Every abstraction you don’t build is a decision you keep cheap. The discipline is not to avoid architecture — it is to do exactly enough of it, applied where it compounds, and to ship.

Prompt

Use the 37signals-way skill to review my v1 technical plan for premature abstraction and feature creep — I have designed a generic event-sourcing system, a plugin architecture, and a configurable multi-tenant layer for an app that currently has zero users — and tell me which of these to delete from v1, which to replace with the simplest thing that could possibly work, and what the genuinely essential architecture is

The 37signals Way

Your checklist

Common mistakes

Letting the framework be the architecture. The fastest way to a rewrite is to scaffold your app from a framework’s conventions and let its models, ORM entities, and request objects flow all the way into your business logic. The framework is a detail; when it dictates your structure, swapping or upgrading it means touching everything. Keep it at the edges. Clean Architecture’s Dependency Rule is the cure: source-code dependencies point inward, so framework-facing code depends on your business logic and never the reverse.

Over-engineering for scale you cannot prove you need. Sharding, microservices, event sourcing, a service mesh, and multi-region failover are all real tools — and all wrong for an app with a few hundred users. The System Design skill’s estimation math almost always shows that a single indexed database plus a cache is enough for a long time. Premature sharding and premature service-splitting are among the most expensive mistakes a new team makes. Earn complexity with evidence.
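To make the estimation step concrete, here is a minimal back-of-the-envelope sketch in Python. Every input below (user count, requests per user, peak multiplier, row size) is an invented assumption for illustration, not a figure from the System Design skill; substitute your own estimates.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class LoadAssumptions:
    """Hypothetical inputs; replace every value with your own estimates."""
    daily_active_users: int
    requests_per_user_per_day: int
    peak_multiplier: float        # how much busier the peak is than the average
    writes_per_user_per_day: int
    bytes_per_write: int


def average_qps(a: LoadAssumptions) -> float:
    """Average requests per second, spread evenly over a day."""
    return a.daily_active_users * a.requests_per_user_per_day / SECONDS_PER_DAY


def peak_qps(a: LoadAssumptions) -> float:
    """Average load scaled by the assumed peak multiplier."""
    return average_qps(a) * a.peak_multiplier


def storage_gb_per_year(a: LoadAssumptions) -> float:
    """New data written per year, in gigabytes (10**9 bytes)."""
    bytes_per_day = (
        a.daily_active_users * a.writes_per_user_per_day * a.bytes_per_write
    )
    return bytes_per_day * 365 / 1e9


# Illustrative numbers only.
example = LoadAssumptions(
    daily_active_users=10_000,
    requests_per_user_per_day=50,
    peak_multiplier=5.0,
    writes_per_user_per_day=10,
    bytes_per_write=1_000,
)
```

With these invented inputs, `peak_qps(example)` comes to roughly 29 requests per second and `storage_gb_per_year(example)` to 36.5 GB. Numbers of that order are well within what a single indexed database is normally expected to handle, which is the kind of evidence to collect before reaching for sharding.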

Over-correcting into classitis. The opposite failure is reading about clean layering and shattering the codebase into dozens of tiny single-method classes, each a shallow module that adds interface cost without hiding meaningful complexity. Ousterhout’s deep-module principle is the corrective: a few deep modules with simple interfaces beat a swarm of shallow ones. Boundaries should fall at real volatility and real domain seams, not between every verb.

Ignoring the database’s actual consistency guarantees. Assuming your transactions are serializable when your database defaults to a weaker level, such as read committed or snapshot isolation, leads to lost-update and write-skew bugs (two buyers each successfully purchasing the last unit of stock, for example) that appear only under concurrent load and rarely in single-user testing. Decide consistency per operation, know your default isolation level, and lock explicitly where invariants demand it.
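One common way to protect a stock invariant is to fold the check and the write into a single conditional statement instead of reading the stock in application code and writing it back. The sketch below uses SQLite only so that it runs as-is; the `inventory` table and the `purchase_one` function are hypothetical names, and a single in-memory connection does not exercise real concurrency. In a client-server database such as PostgreSQL, it is the row lock taken by the `UPDATE` that makes competing buyers queue up.

```python
import sqlite3


def purchase_one(conn: sqlite3.Connection, sku: str) -> bool:
    """Try to reserve one unit; return True only if stock was available."""
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            # Check and decrement happen in one statement, so there is no
            # window between "read stock" and "write stock" to race in.
            "UPDATE inventory SET stock = stock - 1 "
            "WHERE sku = ? AND stock > 0",
            (sku,),
        )
        return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    " sku TEXT PRIMARY KEY,"
    " stock INTEGER NOT NULL CHECK (stock >= 0))"  # last line of defence
)
conn.execute("INSERT INTO inventory VALUES ('widget', 1)")
conn.commit()

first = purchase_one(conn, "widget")    # takes the last unit
second = purchase_one(conn, "widget")   # finds nothing left to take
remaining = conn.execute(
    "SELECT stock FROM inventory WHERE sku = 'widget'"
).fetchone()[0]
```

The `CHECK (stock >= 0)` constraint is a deliberate second guard: if some other code path forgets the conditional update, the database still refuses to record negative stock.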

Treating resilience as a post-launch concern. Skipping timeouts, circuit breakers, and pagination because “we’ll add them when we need them” means you add them during your first outage, under pressure, at 2 a.m. A slow dependency with no timeout can freeze your entire app with nothing in the logs. These patterns are architecture, not operations, and they are cheapest to design in from the start.
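As an illustration of how small these patterns can start, here is a minimal circuit-breaker sketch. It is not the implementation from Release It! or from any library; the class name, thresholds, and injectable clock are assumptions chosen to keep the example short and testable, and it ignores thread safety.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without trying the dependency."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means closed

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                # Fail fast instead of waiting on a dependency known to be down.
                raise CircuitOpenError("dependency is failing; call skipped")
            # Cool-down elapsed: fall through and allow a trial call.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            trial_failed = self._opened_at is not None
            if trial_failed or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

The breaker only helps if the wrapped call can actually fail in bounded time, so `fn` should carry its own timeout, for example `urllib.request.urlopen(url, timeout=2)` from the standard library. A call with no timeout never raises, and the breaker never trips.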

Anemic domain models. When entities are passive data bags and all behavior lives in service classes, business rules scatter and duplicate, and the code stops resembling the domain. Push behavior into the entities and value objects that own the data. A domain expert should be able to read your core classes and tell you where they are wrong.
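A small sketch of what pushing behavior into the model can look like. The `Money` and `Order` names and their rules are hypothetical examples, not taken from the Domain-Driven Design skill; the point is only that the invariants live next to the data they protect.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """Value object: immutable, compared by value, refuses invalid states."""
    amount: Decimal
    currency: str

    def __post_init__(self) -> None:
        if self.amount < 0:
            raise ValueError("Money cannot be negative")

    def __add__(self, other: "Money") -> "Money":
        if other.currency != self.currency:
            raise ValueError("Cannot add different currencies")
        return Money(self.amount + other.amount, self.currency)


@dataclass
class Order:
    """Entity that enforces its own rules instead of exposing raw fields."""
    currency: str
    lines: list[Money] = field(default_factory=list)
    submitted: bool = False

    def add_line(self, price: Money) -> None:
        if self.submitted:
            raise ValueError("Cannot change a submitted order")
        if price.currency != self.currency:
            raise ValueError("Line currency must match the order")
        self.lines.append(price)

    def total(self) -> Money:
        total = Money(Decimal("0"), self.currency)
        for line in self.lines:
            total = total + line
        return total

    def submit(self) -> None:
        if not self.lines:
            raise ValueError("Cannot submit an empty order")
        self.submitted = True
```

In the anemic version, the "no changes after submission" rule would be an `if` repeated in every service that touches an order. Here there is one place to read it, and one place for a domain expert to say it is wrong.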

Confusing build-less with build-carelessly. Cutting scope is not permission to ship something broken or to skip the boundaries that keep options open. The 37signals discipline is to build the essential version well — half a product, not a half-assed one. Cut features and speculative abstractions; never cut the small set of decisions that are expensive to reverse.

Frequently asked questions

Isn’t this too much process for a small app or an MVP?

No, because most of it is subtraction, not addition. The expensive-feeling phases — system design, data-intensive decisions, resilience — are largely about proving you can build less: a single database instead of a sharded cluster, eventual consistency where it is fine, timeouts and pagination instead of a full chaos-engineering practice. The genuinely additive work is concentrated in Phases 1 and 2 — boundaries and a domain model — and that work pays for itself almost immediately because it is what keeps every later decision cheap to change. For a true weekend project, lean on Clean Architecture’s Dependency Rule, a quick Ubiquitous Language, timeouts on outbound calls, and the 37signals instinct to cut scope. That is a few hours of thinking that saves weeks.

How do I know which decisions are actually “expensive to reverse”?

Use the forking-road test from the Pragmatic Programmer skill: imagine the requirement changes, and ask how much of the system you would have to rewrite. Decisions that touch your data model, your transactional and consistency boundaries, and the seams between bounded contexts tend to be expensive, because data outlives code and migrations are painful. Decisions hidden behind an interface — which web framework, which database product, which payment provider, which queue — are cheap if you drew the boundary, because only the adapter changes. The whole strategy of this guide is to convert as many expensive decisions as possible into cheap ones by putting a boundary in front of them, so that the irreducibly expensive set is small enough to get right with care.
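The "only the adapter changes" claim can be sketched in a few lines. `PaymentGateway`, `CheckoutService`, and `FakeGateway` are hypothetical names invented for this example, and no real vendor SDK is called; a production adapter would wrap the vendor's client behind the same method.

```python
from dataclasses import dataclass
from typing import Protocol


class PaymentGateway(Protocol):
    """Port owned by the business logic; vendors are hidden behind it."""

    def charge(self, customer_id: str, amount_cents: int) -> str:
        """Charge the customer and return a receipt identifier."""
        ...


@dataclass
class CheckoutService:
    """Business logic: depends on the port, never on a vendor SDK."""
    gateway: PaymentGateway

    def checkout(self, customer_id: str, amount_cents: int) -> str:
        if amount_cents <= 0:
            raise ValueError("Amount must be positive")
        return self.gateway.charge(customer_id, amount_cents)


class FakeGateway:
    """Stand-in adapter; a real one would translate to a vendor's API here."""

    def __init__(self) -> None:
        self.charges: list[tuple[str, int]] = []

    def charge(self, customer_id: str, amount_cents: int) -> str:
        self.charges.append((customer_id, amount_cents))
        return f"receipt-{len(self.charges)}"


gateway = FakeGateway()
service = CheckoutService(gateway=gateway)
receipt = service.checkout("cust-1", 1999)
```

Switching providers under this shape means writing one new class that satisfies `PaymentGateway`; `CheckoutService` and its tests stay untouched. The same fake also doubles as the test double, which is a side benefit of drawing the boundary.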

Should I start with microservices or a monolith?

A modular monolith, almost always. The Clean Architecture skill is explicit that services are units of deployment, not units of architecture, and that a microservice with a fat shared data model is a distributed monolith — strictly worse than a clean one, because you have added network calls, partial-failure modes, and operational overhead without gaining real independence. The right sequence is to build a well-structured monolith whose internal bounded contexts have clean boundaries, then extract a service later if a real need appears — a piece that must scale independently, a team that needs to deploy on its own cadence. Because the boundaries already exist, the extraction is mechanical rather than a redesign. Splitting first, before you know where the seams are, forces you to guess the boundaries and almost guarantees you guess wrong.

How do these eight skills actually work together in practice?

They form a dependency chain that mirrors the system you are building. Domain-Driven Design tells you where the boundaries belong (the bounded contexts and aggregate seams); Clean Architecture tells you which way dependencies cross them and how to keep details swappable. Data-Intensive Apps decides what lives inside those boundaries at the persistence layer, and System Design tells you how much infrastructure that actually requires — usually far less than you feared. Software Design is the quality gate that keeps all those modules deep and leak-free instead of multiplying into shallow ceremony. Release It! hardens the integration points between everything against real-world failure. Pragmatic Programmer supplies the cross-cutting habits — tracer bullets, reversibility, orthogonality — that hold the structure together over time. And the 37signals Way is the governor on the whole thing, fixing time and cutting scope so you build the essential version and ship. You invoke each one by telling your agent which skill to use for the decision in front of you.

What if I already have a prototype I built fast and now need to make it solid?

Then this guide is still the right map, but your entry point is different. Read the dedicated guide on taking a vibe-coded prototype to production, which is written for exactly that situation — characterizing the existing code, finding the seams, and hardening it without a rewrite. Come back to this guide for the target state: the boundaries you are refactoring toward (Phase 1), the domain model the code should express (Phase 2), and the resilience patterns to add at the integration points (Phase 6). The difference is sequencing — you will be introducing boundaries into existing code rather than drawing them on a blank page — but the destination is the same.

Start designing your architecture

The best architecture for a new app is not the most sophisticated one. It is the one that gets the small number of expensive-to-reverse decisions right — the boundaries, the domain model, the data and consistency choices, the resilience at the edges — and stays aggressively simple everywhere else. These eight skills give your AI agent the actual frameworks behind each of those decisions, so the guidance you get is grounded in the canonical books rather than improvised.

Install the stack and start with the phase that matches your situation:

npx skills add wondelai/skills --all --global

Then open your agent and point it at the first decision in front of you — most teams should begin with Clean Architecture and Domain-Driven Design, since boundaries and the domain model are what make every later choice cheap. Work through the phases in order, pulling the data and resilience skills in as real bottlenecks and integration points appear.

When you are ready to move from architecture to building it out feature by feature with clean, well-tested code, continue with the sibling guide on how to create a new app with AI skills — it picks up the validation, implementation, and shipping cadence around the architecture you have just designed.

Get all 50 skills, free

Don’t guess your AI engineering level.
Measure it.

AI Developer Scorecard

CTO Scorecard

Related guides

How to Create a New Business with AI Skills

How to Create a New Website with AI Skills

How to Create a New App with AI Skills
