Designing a Multi-Tenant Database Schema: Patterns and Trade-offs
Multi-tenancy is one of the most consequential architectural decisions in SaaS database design. Learn the three main patterns, when to use each, and how to implement tenant isolation correctly.
When you build a SaaS application, one of the earliest architectural decisions you will make is how to handle multi-tenancy at the database level. This choice affects data isolation, query performance, operational complexity, compliance posture, and how difficult it will be to migrate individual tenants in the future. Getting it right is worth spending significant time on before writing any application code.
There are three fundamental approaches to multi-tenant database design, each with a distinct set of trade-offs. The right choice depends on your tenant count, compliance requirements, team size, and growth trajectory. ER Flow makes it easy to design and visualize whichever pattern you choose before committing to an implementation.
Approach 1: Shared Tables with `tenant_id`
This is the most common approach for high-growth SaaS products. Every tenant-scoped table gets a tenant_id column β a foreign key to a central tenants table. All tenants share the same tables, and tenant isolation is enforced at the application and query level. The schema looks like this: tenants(id, name, slug, plan_id, created_at), then every entity table adds tenant_id NOT NULL REFERENCES tenants(id). A users table becomes users(id, tenant_id, email, password_hash, created_at). An orders table becomes orders(id, tenant_id, user_id, status, total_cents, created_at).
Advantages: Simple to implement, easy to add new tenants (just insert a row in tenants), low operational overhead, and efficient use of database resources. One schema migration applies to all tenants simultaneously. Horizontal scaling is straightforward β you can shard by tenant_id when a single database instance reaches its limits.
Disadvantages: Application-level bugs can accidentally expose cross-tenant data. A single missing WHERE tenant_id = ? filter in a query is a potential security incident. One large tenant's heavy queries can degrade performance for other tenants sharing the same database instance. Enterprise customers with strict compliance requirements may refuse this model entirely.
Approach 2: Schema-per-Tenant
In PostgreSQL, a "schema" is a namespace within a database. Each tenant gets their own schema: tenant_acme.users, tenant_beta.users. The application sets the search_path to the correct schema at connection time. This approach provides significantly stronger isolation than shared tables β a bug in the application code would need to both skip the search_path setting and somehow access another tenant's schema.
Advantages: Strong isolation without the risk of cross-tenant data leakage from query bugs. Each tenant's schema can have slightly different structure if needed (useful during staged rollouts). Moving a single tenant to dedicated infrastructure is straightforward β copy the schema to the new instance and update the connection routing.
Disadvantages: PostgreSQL supports many schemas per database, but performance can degrade with thousands of them. Migrations must be applied to every tenant schema individually β your migration tooling needs to iterate over all tenant schemas, which is slower and harder to roll back safely. Backup and restore operations become more complex. This pattern is also MySQL-specific in its implementation; MySQL schemas are equivalent to databases, making schema-per-tenant effectively the same as database-per-tenant in that engine.
Approach 3: Database-per-Tenant
The most isolated approach: each tenant gets a completely separate database instance. The application has a routing layer that maps tenant_id to the correct database connection string. This is the model used by enterprise SaaS products with strict compliance requirements β HIPAA, SOC 2 Type II, FedRAMP. Each tenant's data is physically separate from all others.
Advantages: Perfect isolation β a database incident affecting one tenant has zero impact on others. You can run different database versions, different configurations, or different backup schedules per tenant. Makes it trivially easy to offer tenants a dedicated instance option at a premium price tier. Compliance certifications like HIPAA become straightforward to achieve and demonstrate.
Disadvantages: Enormous operational overhead. Migrations must be applied to every tenant database, often through an orchestration system that retries failures and tracks progress. Connection pooling becomes complex β a server with 1,000 tenants needs sophisticated connection management. Cost scales linearly: 1,000 tenants means at least 1,000 database instances. This pattern is only viable with heavy automation and a team large enough to manage it.
Row-Level Security in PostgreSQL
PostgreSQL's Row-Level Security (RLS) gives you shared-table isolation enforced at the database level rather than the application level. You define security policies on each table: ALTER TABLE orders ENABLE ROW LEVEL SECURITY, then CREATE POLICY tenant_isolation ON orders USING (tenant_id = current_setting('app.current_tenant_id')::uuid). Your application sets the session variable before each query: SET LOCAL app.current_tenant_id = 'tenant-uuid-here'.
PostgreSQL then automatically appends the policy condition to every query on that table β even if the application code forgets to filter by tenant_id. This is the best of both worlds: the simplicity of shared tables with database-enforced isolation. The downside is that RLS adds a small overhead to every query and requires careful testing to ensure policies work correctly for all query patterns including joins.
Indexing for Multi-Tenant Schemas
Every indexed column in a multi-tenant schema should include tenant_id as the first column in the index. Instead of CREATE INDEX idx_orders_status ON orders(status), use CREATE INDEX idx_orders_status ON orders(tenant_id, status). This is critical for query performance: the database uses the index to find all orders for a specific tenant with a given status, rather than scanning across all orders from all tenants.
For polymorphic references (commonly used in audit logs and activity feeds), the composite index should be (tenant_id, auditable_type, auditable_id). The tenant_id always comes first in compound indexes for multi-tenant tables β it is the highest-selectivity filter that eliminates the most rows in the earliest possible scan step. In ER Flow, you can add index definitions directly to each table in the diagram, making your indexing strategy visible and reviewable alongside the schema structure.
Cascade Deletes and Tenant Cleanup
When a tenant cancels and you need to delete all their data, cascade deletes are your most reliable tool. Define ON DELETE CASCADE on every foreign key that references tenants(id). This means a single DELETE FROM tenants WHERE id = ? will cascade through your entire schema β deleting all users, orders, invoices, and other tenant-scoped records in the correct dependency order.
Test your cascade behavior thoroughly in a staging environment before relying on it in production. For very large tenants, a cascade delete can become a long-running transaction that locks tables and impacts other tenants. The safer approach is to soft-delete the tenant first (tenants.deleted_at = NOW()), then batch-delete child records in the background over time, and finally hard-delete the tenant row when all children have been cleaned up. Document this deletion workflow in your ER diagram using ER Flow's note feature, so the operational procedure is attached to the schema design itself.
Whichever approach you choose, model it explicitly in ER Flow before implementing it. The visual diagram makes the tenant isolation pattern concrete β you can see at a glance whether every table has a tenant_id, whether foreign keys are correctly cascading from the tenants table, and whether indexes are properly structured for multi-tenant query patterns. Design decisions that seem abstract in a technical discussion become immediately verifiable on a shared canvas.