Back to Blog
SaaS

Multi-Tenancy Deep Dive: Row-Level vs Schema vs Database Isolation

Zyptr Admin
26 February 2024
9 min read

Why This Decision Matters So Much

Multi-tenancy architecture is one of those decisions that's easy to make and incredibly expensive to change. We've helped three companies migrate between multi-tenancy models, and each migration was a 3-5 month project involving data migration, query rewriting, and extensive testing. Get it right the first time if you can.

We've implemented all three approaches in production: row-level isolation (shared schema, shared database), schema-level isolation (separate schemas, shared database), and database-level isolation (separate databases per tenant). Here's what we've learned from operating each.

Row-Level Isolation: Our Default Recommendation

Every table has a tenant_id column. Every query includes a WHERE tenant_id = ? clause (or, better, PostgreSQL RLS policies enforce this automatically). All tenants share the same tables, same indexes, same connection pool.

Advantages: simplest to implement and operate, most cost-efficient (one database handles thousands of tenants), schema changes apply to all tenants simultaneously, and aggregate queries across tenants are straightforward (useful for admin dashboards and analytics). We use this for our SaaS products and most client projects.

The risk: a missing WHERE clause in one query could expose one tenant's data to another. This is why we strongly recommend PostgreSQL RLS — the database enforces isolation regardless of application code bugs. We set the current tenant at the connection level (SET app.current_tenant = '...') and RLS policies reference this variable. Even if a developer writes SELECT * FROM orders without a WHERE clause, RLS ensures only the current tenant's orders are returned.

Performance consideration: at scale, the tenant_id column needs to be part of every composite index. We've seen significant query performance degradation at 500K+ rows per table when tenant_id wasn't included in relevant indexes. Always create indexes as (tenant_id, other_columns) not just (other_columns).

Schema-Level Isolation: The Middle Ground

Each tenant gets their own PostgreSQL schema within a shared database. So tenant_abc has their own orders table in the tenant_abc schema, completely separate from tenant_xyz's orders table in the tenant_xyz schema. We set the search_path at connection time to route queries to the correct schema.

Advantages: stronger data isolation (schemas are truly separate), per-tenant schema customization is possible (if tenant A needs a custom field, you add it to their schema without affecting others), and data export per tenant is straightforward (dump one schema).

Disadvantages: schema migrations must be applied to every tenant's schema individually. At 500 tenants, running a migration means executing the same ALTER TABLE 500 times. We built a migration runner that parallelizes this, but it still takes 10-15 minutes for a large migration across 500 schemas. Connection pooling is also trickier — you need to manage search_path per connection, which means most connection poolers need custom configuration. We use PgBouncer in transaction mode with a custom query to set the schema at the start of each transaction.

We recommend schema isolation for SaaS products that serve a smaller number of larger tenants (think enterprise SaaS with 50-500 customers) where data isolation is a contractual requirement and per-tenant customization is needed.

Database-Level Isolation: Maximum Isolation, Maximum Cost

Each tenant gets their own database. Full stop. We've used this exactly once — for a financial services client where regulatory requirements mandated complete data separation between tenants, including separate encryption keys, separate backup policies, and the ability to physically host different tenants in different regions.

Advantages: absolute data isolation, per-tenant performance tuning, independent backup and restore, and compliance with strict data residency requirements (tenant A's database in Mumbai, tenant B's database in Singapore).

Disadvantages: operational complexity is enormous. Every new tenant means provisioning a new database, configuring connections, setting up monitoring, and managing schema migrations across all databases. We manage this with Terraform for provisioning and a custom migration orchestrator, but it's still 10x the operational overhead of shared-database approaches. Cost scales linearly with tenant count — 100 tenants means 100 database instances.

Our Decision Framework

Fewer than 10,000 tenants, no special compliance requirements, and you want simplicity? Row-level isolation with RLS. Fifty to 500 enterprise tenants with isolation requirements and per-tenant customization needs? Schema isolation. Strict regulatory requirements mandating physical data separation? Database isolation. And if you're genuinely unsure, start with row-level. You can always migrate to schema or database isolation later (it's easier to go from less isolation to more than the reverse).

multi-tenancypostgresqlsaas-architecturedatabase
Let's Work Together

Have a Project in Mind?
Great?

Let's talk about building your next product.