Multi-tenant SaaS: 7 pitfalls you only learn in production
Seven concrete mistakes I've made or watched happen building Salonnare - multi-tenant SaaS is hard, and you'd rather not learn these the way I did.
Building a multi-tenant SaaS feels like building a standard web app with one extra column in the database. In practice, between "works for one client" and "works for 100 clients without them seeing each other's data" lies a chasm that most junior teams fall into.
Here are seven concrete pitfalls from Salonnare's production journey that I recommend every team internalise in advance.
1. Forgetting tenant_id in one query is game over
The classic: one SELECT * FROM bookings WHERE status = 'scheduled' without a tenant_id filter and suddenly every salon sees every other salon's appointments. This isn't hypothetical - it's one of the most common bugs in multi-tenant systems, and one of the most damaging.
Solution: a tenant-scoped query wrapper. In Salonnare's Drizzle implementation every select(), update() and delete() automatically gets a tenant_id filter injected, and every insert() gets its tenant_id set by the wrapper. Developers simply can't read cross-tenant data by accident - the wrapper refuses the query.
// Route handler
const rows = await req.db.select().from(bookings).where(eq(bookings.status, 'scheduled'));
// Wrapper injects: AND bookings.tenant_id = ?
Plus 19 unit tests that explicitly verify the wrapper can't be bypassed. That's the level of paranoia required.
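To make the idea concrete without pulling in Drizzle, here is a minimal in-memory sketch of a tenant-scoped wrapper. It is not Salonnare's implementation: scopedBookings, the Booking shape and the salon IDs are hypothetical names chosen for illustration, and a real wrapper would add the filter to the SQL it builds rather than filter an array.

```typescript
// Hypothetical in-memory sketch: a real wrapper would add the filter to the SQL it builds.
interface Booking {
  id: number;
  tenantId: string;
  status: string;
}

type NewBooking = Omit<Booking, "tenantId">;

function scopedBookings(allRows: Booking[], tenantId: string) {
  // Refuse to create a wrapper without tenant context.
  if (!tenantId) throw new Error("tenantId is required");

  return {
    // Reads are narrowed to the current tenant before the caller's filter runs.
    select(where: (b: Booking) => boolean = () => true): Booking[] {
      return allRows.filter((b) => b.tenantId === tenantId && where(b));
    },
    // Inserts never accept a tenantId from the caller; the wrapper stamps it.
    insert(values: NewBooking): Booking {
      const row: Booking = { ...values, tenantId };
      allRows.push(row);
      return row;
    },
  };
}

// Usage: two salons share one table, but salon-a only ever sees its own rows.
const table: Booking[] = [
  { id: 1, tenantId: "salon-a", status: "scheduled" },
  { id: 2, tenantId: "salon-b", status: "scheduled" },
];
const salonA = scopedBookings(table, "salon-a");
const scheduled = salonA.select((b) => b.status === "scheduled");
```

The point of the design is that route handlers only ever receive the scoped object, so forgetting the filter is no longer something a developer can do by omission.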
2. Audit log without tenant_id
If your audit log table lacks tenant_id, an investigator can't tell which tenant did what after an incident. And if the column exists but developers forget to populate it - equally bad. Make tenant_id NOT NULL so the database itself, not developer discipline, enforces it.
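The database constraint is the backstop. An application-side guard can mirror it so the failure shows up earlier and with a clearer message. The sketch below is an assumption about how that could look: AuditEntry, buildAuditEntry and the field names are hypothetical.

```typescript
// Hypothetical shape of an audit row; tenantId is required by the type.
interface AuditEntry {
  tenantId: string;
  actorId: string;
  action: string;
  at: Date;
}

// Mirrors the NOT NULL constraint in application code: callers that lost
// their tenant context fail here, before anything reaches the database.
function buildAuditEntry(
  tenantId: string | undefined,
  actorId: string,
  action: string,
  now: Date = new Date(),
): AuditEntry {
  if (!tenantId) {
    throw new Error(`audit entry for "${action}" has no tenantId`);
  }
  return { tenantId, actorId, action, at: now };
}

// Usage
const entry = buildAuditEntry("salon-a", "user-42", "booking.cancelled");
```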
3. Background jobs without tenant context
Email workers, cron jobs and other async tasks run outside the HTTP request context. It's very easy to forget to pass tenantId, after which the job operates on the wrong dataset.
Solution: every enqueue function requires tenantId as a parameter. The job payload includes tenant_id, and the worker hydrates the tenant-scoped DB wrapper before calling the handler.
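A minimal sketch of that contract, using an in-memory queue instead of a real job system. InMemoryQueue, JobPayload and the handler signature are hypothetical; in production the worker would build the tenant-scoped DB wrapper where this sketch only passes a context object.

```typescript
interface JobPayload<T> {
  tenantId: string;
  data: T;
}

type JobHandler<T> = (ctx: { tenantId: string }, data: T) => void;

class InMemoryQueue<T> {
  private jobs: JobPayload<T>[] = [];

  // tenantId is a required parameter, so a job cannot be enqueued without it.
  enqueue(tenantId: string, data: T): void {
    if (!tenantId) throw new Error("tenantId is required to enqueue a job");
    this.jobs.push({ tenantId, data });
  }

  // Worker side: rebuild the tenant context from the payload before the handler runs.
  drain(handler: JobHandler<T>): void {
    for (const job of this.jobs.splice(0)) {
      handler({ tenantId: job.tenantId }, job.data);
    }
  }
}

// Usage
const emails = new InMemoryQueue<{ to: string }>();
emails.enqueue("salon-a", { to: "anna@example.com" });
const seen: string[] = [];
emails.drain((ctx, data) => {
  seen.push(`${ctx.tenantId}:${data.to}`);
});
```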
4. Subscription cancellation doesn't clean up data
Customer cancels, subscription ends - and the data sits in production for months. This is both a GDPR issue (retention) and a cost issue (storage + backups).
Solution: a retention job that soft-deletes 30 days after cancellation and hard-deletes 90 days after. Plus an export tool so customers can download their own data before the retention clock starts.
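The decision logic of such a retention job fits in one pure function, which also makes it easy to test. This sketch assumes both periods are counted from the cancellation date; if your policy counts the 90 days from the soft delete, adjust the constant. The function and constant names are hypothetical.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const SOFT_DELETE_AFTER_DAYS = 30;
// Assumption: counted from cancellation, not from the soft delete.
const HARD_DELETE_AFTER_DAYS = 90;

type RetentionAction = "keep" | "soft_delete" | "hard_delete";

// Decides what the retention job should do with a cancelled tenant's data.
function retentionAction(cancelledAt: Date, now: Date): RetentionAction {
  const days = Math.floor((now.getTime() - cancelledAt.getTime()) / DAY_MS);
  if (days >= HARD_DELETE_AFTER_DAYS) return "hard_delete";
  if (days >= SOFT_DELETE_AFTER_DAYS) return "soft_delete";
  return "keep";
}

// Usage: a tenant that cancelled on 1 January 2024, checked 30 days later.
const cancelledAt = new Date("2024-01-01T00:00:00Z");
const action = retentionAction(cancelledAt, new Date("2024-01-31T00:00:00Z"));
```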
5. Plan feature flags without centralised config
Hardcoding each plan limit in individual routes ends in a mess. "Pro tier gets 50 users" is scattered across six routes, and after four plan adjustments nobody knows what's where.
Solution: one plan_limits.ts config with PLAN_LIMITS[plan].maxStaff, plus middleware requireFeature('loyalty') and checkStaffLimit() on create routes. Plan changes require exactly one commit.
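A sketch of what such a config could look like. The plan names, the numbers and the sms_reminders feature are invented for illustration, and hasFeature and canAddStaff are hypothetical helpers that middleware like requireFeature() and checkStaffLimit() could call.

```typescript
type Plan = "starter" | "pro";
type Feature = "loyalty" | "sms_reminders";

interface PlanLimit {
  maxStaff: number;
  features: readonly Feature[];
}

// Single source of truth: changing a plan means editing this object only.
const PLAN_LIMITS: Record<Plan, PlanLimit> = {
  starter: { maxStaff: 3, features: [] },
  pro: { maxStaff: 50, features: ["loyalty", "sms_reminders"] },
};

function hasFeature(plan: Plan, feature: Feature): boolean {
  return PLAN_LIMITS[plan].features.includes(feature);
}

// True if the tenant may create one more staff member.
function canAddStaff(plan: Plan, currentStaffCount: number): boolean {
  return currentStaffCount < PLAN_LIMITS[plan].maxStaff;
}

// Usage
const starterHasLoyalty = hasFeature("starter", "loyalty");
const proCanHireOneMore = canAddStaff("pro", 49);
```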
6. Webhook delivery without idempotency
Stripe sometimes sends webhooks twice. So does Mollie. If your payment handler lacks an idempotency check, you register a €29 subscription as €58 - and the customer rightly demands their money back.
Solution: on every webhook event, store event.id in a processed_webhooks table. Before the handler runs, check whether that event has already been processed and skip it if so. Simple, critical, too often skipped.
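The check can be sketched as follows, with a Set standing in for the processed_webhooks table. WebhookProcessor and the event shape are hypothetical, not the Stripe or Mollie payload format. In a real database, a unique constraint on the event ID makes the check safe against two deliveries arriving at the same moment, which a separate read followed by a write is not.

```typescript
interface WebhookEvent {
  id: string;
  type: string;
  amountCents: number;
}

class WebhookProcessor {
  // Stands in for the processed_webhooks table.
  private processed = new Set<string>();

  handle(
    incoming: WebhookEvent,
    handler: (e: WebhookEvent) => void,
  ): "processed" | "duplicate" {
    if (this.processed.has(incoming.id)) return "duplicate";
    this.processed.add(incoming.id);
    try {
      handler(incoming);
    } catch (err) {
      // Forget the event again so the provider's retry is not swallowed.
      this.processed.delete(incoming.id);
      throw err;
    }
    return "processed";
  }
}

// Usage: the same event delivered twice is only charged once.
const processor = new WebhookProcessor();
const renewal: WebhookEvent = { id: "evt_1", type: "invoice.paid", amountCents: 2900 };
let chargedCents = 0;
const charge = (e: WebhookEvent): void => {
  chargedCents += e.amountCents;
};
const firstResult = processor.handle(renewal, charge);
const secondResult = processor.handle(renewal, charge);
```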
7. Forgetting subdomain SSL renewals in production
Each tenant on its own subdomain? Check. Automatic SSL via Let's Encrypt? Usually. But if you use wildcard certs, one missed renewal breaks all tenants at once. And the end of Let's Encrypt's 90-day certificate lifetime arrives faster than you think.
Solution: Certbot with DNS-01 challenge for wildcard certs, a cron job that checks daily, and a monitoring alert when the cert expires within 14 days. Two weeks of buffer gives you time to intervene manually.
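The alerting rule itself is a small date calculation. The sketch below assumes you already have the certificate's expiry date (for example from the valid_to field of the peer certificate that Node's tls module returns); the function names and the way the alert is delivered are left open and hypothetical.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;
const ALERT_THRESHOLD_DAYS = 14;

// Whole days left until the certificate expires; negative once it has expired.
function daysUntilExpiry(notAfter: Date, now: Date): number {
  return Math.floor((notAfter.getTime() - now.getTime()) / MS_PER_DAY);
}

// True when the daily check should raise an alert.
function shouldAlert(notAfter: Date, now: Date): boolean {
  return daysUntilExpiry(notAfter, now) <= ALERT_THRESHOLD_DAYS;
}

// Usage: checked on 1 June 2024 for a certificate that expires on 15 June 2024.
const checkedAt = new Date("2024-06-01T00:00:00Z");
const alertNow = shouldAlert(new Date("2024-06-15T00:00:00Z"), checkedAt);
```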
Conclusion
Building multi-tenant SaaS is a one-time architectural investment for a recurring product. The list above isn't theoretical - each mistake has caused production issues somewhere. The question isn't whether you'll encounter them, but whether you find them before or after launch.
Building a SaaS yourself? Get in touch for an architecture review before you go live. Two hours before delivery prevents two months of refactoring after.
