sqlite
SQLite Configuration and Best Practices
This document describes the SQLite configuration used in this project, best practices, and maintenance procedures.
Current Configuration
Database Setup (internal/db/setup.go)
Our SQLite database is configured for optimal performance and safety with the following settings:
Connection Parameters
// URL parameters (applied at connection initialization)
_journal=WAL // Hint for initial journal mode
_timeout=20000 // 20 second timeout
_busy_timeout=20000 // 20 second busy timeout
// Note: These URL parameters provide hints during connection initialization.
// PRAGMA statements in enablePragmas() ensure settings are applied regardless
// of database state.
Connection Pool Settings
// Write connection (main database)
SetMaxOpenConns(1) // CRITICAL: Only 1 writer for SQLite WAL mode
SetMaxIdleConns(1)
SetConnMaxLifetime(0) // Connections never expire
SetConnMaxIdleTime(0)
// Read connection (optional read-only connection)
SetMaxOpenConns(25) // Multiple readers allowed in WAL mode
SetMaxIdleConns(25)
Why only 1 writer?
SQLite with WAL mode allows multiple concurrent readers but only one writer at a time. Having more than one write connection can cause SQLITE_BUSY errors.
PRAGMA Settings
Applied at startup via enablePragmas():
PRAGMA foreign_keys = ON; -- Enable foreign key constraints
PRAGMA synchronous = NORMAL; -- Balance between speed and safety
PRAGMA strict = ON; -- Strict aggregate functions
PRAGMA temp_store = MEMORY; -- Store temporary tables in RAM
PRAGMA mmap_size = 268435456; -- 256MB memory-mapped I/O
PRAGMA cache_size = -64000; -- 64MB page cache
PRAGMA wal_autocheckpoint = 1000; -- Checkpoint every 1000 pages
PRAGMA Settings Explained
foreign_keys = ON
CRITICAL: Enables foreign key constraint enforcement. Without this, foreign keys are ignored.
-- With foreign_keys = ON:
DELETE FROM users WHERE id = 1; -- Error if purchases reference this user
-- Without it:
DELETE FROM users WHERE id = 1; -- Succeeds, orphans purchases!
synchronous = NORMAL
Controls how often SQLite syncs to disk:
FULL(default): Very safe, slowerNORMAL: Good balance - safe in WAL modeOFF: Fast but risks corruption on power loss
With WAL mode, NORMAL is safe and recommended.
strict = ON
Strict mode for aggregate functions:
-- With strict = ON:
SELECT sum(text_column) FROM table; -- Error
-- Without it:
SELECT sum(text_column) FROM table; -- Returns 0 (silent failure)
temp_store = MEMORY
Stores temporary tables and indices in RAM instead of disk. Faster but uses more memory.
mmap_size = 268435456
Memory-mapped I/O (256MB). SQLite maps database pages directly to memory, reducing system calls. Improves read performance significantly.
cache_size = -64000
Page cache size in KB (negative = KB, positive = pages). 64MB cache improves performance for frequently accessed data.
wal_autocheckpoint = 1000
Automatically checkpoint WAL file after 1000 pages. Prevents WAL file from growing too large.
WAL Mode (Write-Ahead Logging)
What is WAL?
Instead of writing directly to the database file, SQLite writes changes to a separate WAL file:
database.db -- Main database file
database.db-wal -- Write-Ahead Log (pending changes)
database.db-shm -- Shared memory file (index for WAL)
Benefits
- Multiple concurrent readers - Readers don't block each other
- Readers don't block writer - Reads from checkpoint, writes to WAL
- Better performance - Sequential writes to WAL are faster
- Crash recovery - WAL file can be replayed on startup
Checkpointing
Periodically, WAL changes are merged back into the main database file. This is called "checkpointing".
Manual checkpoint:
PRAGMA wal_checkpoint(TRUNCATE); -- Full checkpoint, truncate WAL
In this project, checkpointing happens:
- Automatically: Every 1000 pages (
wal_autocheckpoint) - Weekly: During VACUUM cronjob (Sunday 3 AM)
Maintenance
VACUUM Cronjob
Schedule: Every Sunday at 3:00 AM (internal/case/cronjob/vacuumdatabase/)
What it does:
func VacuumDB(ctx context.Context) error {
// 1. Checkpoint WAL file
PRAGMA wal_checkpoint(TRUNCATE)
// 2. Reclaim unused space (defragment)
VACUUM
// 3. Update query optimizer statistics
ANALYZE
}
Why weekly?
VACUUMreclaims space from deleted rowsANALYZEupdates statistics for the query planner- Both operations lock the database, so run during low-traffic hours
Manual Maintenance
If needed, you can trigger manually via GraphQL admin API or:
sqlite3 database.db "PRAGMA wal_checkpoint(TRUNCATE);"
sqlite3 database.db "VACUUM;"
sqlite3 database.db "ANALYZE;"
Best Practices
1. Always Use Transactions for Writes
// Bad: Multiple statements without transaction
db.Exec("INSERT INTO users ...")
db.Exec("INSERT INTO purchases ...")
// Good: Single transaction
tx, _ := db.Begin()
tx.Exec("INSERT INTO users ...")
tx.Exec("INSERT INTO purchases ...")
tx.Commit()
Benefits:
- Atomicity: All-or-nothing
- Performance: 100x faster for bulk inserts
- Consistency: Readers see consistent state
2. Use Prepared Statements
// Bad: String concatenation (SQL injection risk)
db.Exec("SELECT * FROM users WHERE email = '" + email + "'")
// Good: Prepared statement (via sqlc)
queries.GetUserByEmail(ctx, email)
3. Add Indexes for Frequent Queries
-- Before adding index:
SELECT * FROM purchases WHERE user_id = ?; -- Table scan (slow)
-- Add index:
CREATE INDEX idx_purchases_user_id ON purchases(user_id);
-- After:
SELECT * FROM purchases WHERE user_id = ?; -- Index seek (fast)
Where to add indexes:
- Foreign key columns (for JOINs)
- WHERE clause columns
- ORDER BY columns
- Composite indexes for multi-column queries
Example composite index:
-- Query:
SELECT * FROM user_subgraph_accesses
WHERE user_id = ? AND subgraph_id = ?;
-- Index:
CREATE INDEX idx_user_subgraph_accesses_user_subgraph
ON user_subgraph_accesses(user_id, subgraph_id);
4. Use Partial Indexes for Filtered Queries
-- Instead of:
CREATE INDEX idx_api_keys_disabled_at ON api_keys(disabled_at);
-- Better (only active keys):
CREATE INDEX idx_api_keys_active
ON api_keys(id)
WHERE disabled_at IS NULL;
Smaller index, faster queries for common case.
5. Use CHECK Constraints for Validation
CREATE TABLE users (
email TEXT NOT NULL UNIQUE,
CHECK (email LIKE '%@%') -- Database-level validation
);
CREATE TABLE purchases (
status TEXT NOT NULL DEFAULT 'pending',
CHECK (status IN ('pending', 'completed', 'failed'))
);
6. Prefer NOT NULL Where Possible
-- Bad:
created_at DATETIME -- Can be NULL
-- Good:
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
NULL values complicate queries and can cause bugs.
7. Use Unique Constraints for Business Logic
-- Prevent duplicate subscriptions at database level:
CREATE UNIQUE INDEX idx_user_subgraph_accesses_unique
ON user_subgraph_accesses(user_id, subgraph_id)
WHERE revoke_id IS NULL;
Performance Tuning
Analyze Query Plans
EXPLAIN QUERY PLAN
SELECT * FROM users WHERE email = 'test@example.com';
-- Output shows if index is used:
-- SEARCH users USING INDEX idx_users_email (email=?) ✅ Good
-- SCAN users ❌ Bad (table scan)
Check Table Sizes
SELECT
name,
SUM(pgsize) / 1024 / 1024 AS size_mb
FROM dbstat
GROUP BY name
ORDER BY size_mb DESC
LIMIT 10;
Monitor WAL Size
-- Check WAL file size:
PRAGMA wal_checkpoint;
-- Returns (busy, log, checkpointed)
-- If log > 1000, consider manual checkpoint
Check Fragmentation
SELECT freelist_count FROM pragma_freelist_count();
-- If > 1000, run VACUUM to defragment
Find Unused Indexes
-- After running workload, check index usage:
SELECT * FROM sqlite_stat1;
-- If an index has low/no stats, consider dropping it
Common Pitfalls
1. ❌ Multiple Write Connections
// BAD: Multiple writers cause SQLITE_BUSY
db.SetMaxOpenConns(10) // Don't do this for write connection!
// GOOD: Single writer
db.SetMaxOpenConns(1)
2. ❌ Forgetting Foreign Keys
// Without PRAGMA foreign_keys = ON:
// Foreign key constraints are ignored!
DELETE FROM users WHERE id = 1; // Orphans dependent data
3. ❌ Long Transactions
// BAD: Long transaction blocks other writers
tx, _ := db.Begin()
// ... lots of work ...
time.Sleep(10 * time.Second) // Other writers blocked!
tx.Commit()
// GOOD: Short transactions
tx, _ := db.Begin()
tx.Exec(...) // Fast operations only
tx.Commit()
4. ❌ Not Using Transactions for Bulk Inserts
// BAD: 1000 individual transactions (very slow)
for i := 0; i < 1000; i++ {
db.Exec("INSERT INTO ...")
}
// GOOD: Single transaction (100x faster)
tx, _ := db.Begin()
for i := 0; i < 1000; i++ {
tx.Exec("INSERT INTO ...")
}
tx.Commit()
5. ❌ Using LIKE Without Index
-- Slow (can't use index):
WHERE email LIKE '%@gmail.com'
-- Fast (can use index):
WHERE email LIKE 'user@%' -- Prefix search
6. ❌ SELECT * Instead of Specific Columns
-- Bad: Reads all columns (slower, more memory)
SELECT * FROM users WHERE id = ?
-- Good: Only needed columns
SELECT id, email FROM users WHERE id = ?
Monitoring Queries
Database Statistics
-- Overall database info:
SELECT * FROM pragma_database_list;
-- Page count and size:
SELECT
page_count * page_size / 1024 / 1024 AS db_size_mb,
page_count,
page_size
FROM pragma_page_count(), pragma_page_size();
-- Integrity check:
PRAGMA integrity_check;
-- Should return: ok
Table Statistics
-- Index usage stats:
SELECT * FROM sqlite_stat1;
-- Table sizes:
SELECT
tbl AS table_name,
SUM(payload) / 1024 / 1024 AS data_mb,
SUM(pgsize) / 1024 / 1024 AS total_mb
FROM dbstat
GROUP BY tbl
ORDER BY total_mb DESC;
Performance Diagnostics
-- Check if indexes are being used:
EXPLAIN QUERY PLAN <your-query>;
-- Detailed execution plan:
EXPLAIN <your-query>;
-- Current connections (if using shared cache):
PRAGMA wal_checkpoint;
Backup Strategy
Online Backup (While Server Running)
# Using SQLite backup API (recommended)
sqlite3 database.db ".backup database-backup.db"
# Or using WAL checkpoint:
sqlite3 database.db "PRAGMA wal_checkpoint(TRUNCATE);"
cp database.db database-backup.db
Backup Files to Include
# Main database
database.db
# WAL files (if not checkpointed)
database.db-wal
database.db-shm
Note: With WAL mode, you must either:
- Checkpoint before copying (
PRAGMA wal_checkpoint(TRUNCATE)) - Copy both
.db,.db-wal, and.db-shmfiles atomically
Migration Best Practices
Safe Schema Changes
SQLite doesn't support ALTER TABLE for many operations. Use the recreate pattern:
-- 1. Create new table with desired schema
CREATE TABLE users_new (
id INTEGER PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
new_column TEXT
) /* NOTE: Don't add STRICT to existing tables */;
-- 2. Copy data
INSERT INTO users_new (id, email, new_column)
SELECT id, email, NULL FROM users;
-- 3. Drop old table
DROP TABLE users;
-- 4. Rename new table
ALTER TABLE users_new RENAME TO users;
-- 5. Recreate indexes
CREATE INDEX idx_users_email ON users(email);
Foreign Key Considerations
When recreating tables with foreign keys:
-- 1. Disable foreign key checks temporarily
PRAGMA foreign_keys = OFF;
-- 2. Recreate tables (as above)
-- 3. Re-enable foreign key checks
PRAGMA foreign_keys = ON;
-- 4. Verify no violations
PRAGMA foreign_key_check;
Performance Benchmarks
Typical performance characteristics:
- Reads: ~100k-500k reads/sec (with proper indexes)
- Writes: ~1k-10k writes/sec (single writer limit)
- Bulk inserts: ~50k-100k rows/sec (in transaction)
- Database size: Handles databases up to 281 TB (theoretical limit)
For this project (note-based application):
- Expected writes: Low-medium (user actions, note updates)
- Expected reads: High (serving note content)
- Verdict: SQLite is perfect for this workload
When to Consider PostgreSQL
SQLite is excellent for this project, but consider PostgreSQL if:
- Multiple simultaneous writers - Need >1 write/sec sustained
- Geographic distribution - Need read replicas in different regions
- Very large dataset - Multi-TB with complex analytics
- Network access - Need remote database access
For single-server applications with moderate write load, SQLite is often faster and simpler than PostgreSQL.
Resources
Summary
Our SQLite configuration is optimized for:
- ✅ Data integrity (foreign keys, constraints)
- ✅ Performance (WAL mode, indexes, caching)
- ✅ Reliability (automated maintenance, backups)
- ✅ Safety (transaction isolation, crash recovery)
Key takeaways:
- One writer connection - Critical for avoiding SQLITE_BUSY
- WAL mode - Enables concurrent reads
- Weekly VACUUM - Keeps database compact
- Proper indexes - Essential for query performance
- Foreign keys ON - Enforces data integrity