MariaDB on a 4GB VPS: what breaks and what actually works
What I learned from running a production multi-tenant MariaDB on a 4GB VPS - Docker memory limits, connection pool tuning, idempotent migrations and the settings that decide between running and swapping.
Salonnare runs on a 4GB RAM VPS. We host MariaDB plus a Next.js marketing site, a Node.js API, a Python worker, an nginx, a Docusaurus docs site and a DDNS daemon - all via Docker Compose. For a SaaS serving multiple customers this is tight but not impossible. Here's what I've learned keeping it in production for months without OOM kills or customer-visible downtime.
The mental model: everything is a budget
On a 4GB VPS every Docker process has a memory budget. The OS takes about 400 MB, so you have 3.6 GB to divide across all your services. My current split:
# docker-compose.saas.yml (simplified)
services:
mariadb:
deploy: { resources: { limits: { memory: 1g } } }
server:
deploy: { resources: { limits: { memory: 768m } } }
worker:
deploy: { resources: { limits: { memory: 256m } } }
web:
deploy: { resources: { limits: { memory: 128m } } }
docs:
deploy: { resources: { limits: { memory: 128m } } }
ddns:
deploy: { resources: { limits: { memory: 64m } } }
Total: ~2.4 GB allocated, ~1.2 GB headroom for file system cache and transient spikes. More important than the exact split: you set hard limits. Without a limit MariaDB will happily push all your RAM into its buffer pool, and at the first spike in API traffic your server container gets OOM-killed.
innodb_buffer_pool_size is your most important setting
InnoDB caches recently-read rows and indexes in the buffer pool. Default is around 128 MB. On a 1 GB MariaDB instance you can comfortably set it to 70-75%:
# my.cnf
[mysqld]
innodb_buffer_pool_size = 700M
innodb_buffer_pool_instances = 1
innodb_log_file_size = 128M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
O_DIRECT prevents double-buffering (kernel page cache + InnoDB cache of the same data). On a tight VPS that's literally hundreds of MB savings. innodb_flush_log_at_trx_commit = 2 flushes logs to OS on every commit, and to disk every second - a fraction less durability for noticeably better write throughput. For a SaaS without financial transactions acceptable; for a bank no.
Connection pool: more is not better
In my server config:
// Drizzle/mysql2 pool config
{
connectionLimit: 20, // hard cap
acquireTimeout: 10_000, // 10s max waiting for connection
idleTimeout: 30_000, // release unused connections after 30s
}
20 connections sounds low, but every MariaDB connection costs about 1-2 MB of thread memory. 100 connections = 200 MB you can't put in the buffer pool. Rule of thumb: connection pool size = roughly 2× your CPU core count, no higher. On a 2-core VPS 8-12 connections is enough, with 20 as a safety margin for traffic spikes.
Many ORMs (including older Knex configs) default to 100+ connections. That's a leftover from physical server days and today almost never the right choice.
Migrations must be idempotent
Salonnare now runs 104 migrations on every server start. They must all be idempotent - running twice gives the same result as running once. Otherwise you lose half an hour of uptime per server restart, or worse, get a crash mid-deploy.
-- Works
ALTER TABLE bookings ADD COLUMN IF NOT EXISTS deleted_at DATETIME NULL;
CREATE TABLE IF NOT EXISTS audit_log (
id INT AUTO_INCREMENT PRIMARY KEY,
-- ...
);
CREATE INDEX IF NOT EXISTS idx_bookings_status ON bookings(status);
-- Doesn't work
ALTER TABLE bookings ADD COLUMN deleted_at DATETIME NULL; -- crashes 2nd time
IF NOT EXISTS is your friend on ADD COLUMN, CREATE TABLE, CREATE INDEX. For column renames or type changes you have to get a bit creative:
-- Conditional ALTER via INFORMATION_SCHEMA
SET @stmt = (
SELECT IF(
EXISTS(
SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'clients' AND COLUMN_NAME = 'old_name'
),
'ALTER TABLE clients CHANGE old_name new_name VARCHAR(255)',
'SELECT 1'
)
);
PREPARE s FROM @stmt; EXECUTE s; DEALLOCATE PREPARE s;
Ugly, but idempotent. In production, idempotent migrations run on every server start without any risk.
Slow queries kill your whole VPS
On a big server you buy your way out of a slow query with more hardware. On a 4GB VPS, one report query with five non-indexed joins takes your entire server down - the buffer pool gets flushed, other queries fall back to disk i/o, and the whole stack slows for minutes.
What I do:
slow_query_log = 1on withlong_query_time = 2. Every query longer than 2 seconds lands in a log. Once a week I scan it.EXPLAINfor every new query with more than 1 join. No index = stop, add the index first, then ship.- Reporting endpoints run on a separate read replica. Trivial on big VPSes, just doable on 4GB via a second container with capped resources.
Backup without downtime
mysqldump --single-transaction on InnoDB tables gives a consistent snapshot without a write lock. Once a night via cron, output to a Cloudflare R2 bucket:
docker exec mariadb mariadb-dump \
--single-transaction \
--quick \
--routines \
--triggers \
--all-databases \
| gzip | rclone rcat r2:salonnare-backups/$(date +%F).sql.gz
7-day retention, then prune. Restore tested once per quarter - backups you never restore aren't backups, it's hope.
What I've learned to avoid in production
- MyISAM tables. No transactions, no row-level locking, looser recovery. On modern MariaDB there's no reason to use anything other than InnoDB.
- TEMPORARY tables for reporting. Eat memory, and spilled to disk are slow. Write your report query as a regular SELECT with CTEs or subqueries.
- Non-indexed
ORDER BYon large tables. Filesort uses disk space (/tmpin container) and blocks other queries. SELECT *on tables with BLOB columns. Even if you don't read the blob, MariaDB streams the entire row. Select explicit columns you need.
What I recommend
- Start with hard memory limits per container. No limit = OOM kill at first spike.
- Set
innodb_buffer_pool_sizeto ~70% of your MariaDB container limit, withO_DIRECT. - Cap your connection pool at 2× CPU count, not higher. More connections = more thread memory, no extra throughput.
- Keep every migration idempotent from day 1. Retrofitting later is painful.
- Slow query log on, scan once a week, add missing indexes.
A 4GB VPS isn't the end station for your SaaS, but it's a fine starting point. Salonnare runs stably on it for over a year with a growing tenant count, and an upgrade to 8GB only sits on the roadmap once the buffer pool no longer fits.
