# Milestone 3: Database Selection

**Status**: Decided

**Goal**: Choose a database for developer accounts, app metadata, and analytics.

## Decision

**SQLite + Litestream** for self-hosted deployment on a Synology NAS.

```
Database:    SQLite 3.x (WAL mode)
Driver:      modernc.org/sqlite (pure Go, no CGO)
Backup:      Litestream continuous replication
Storage:     Synology volume (/volume1/mosis/)
```

### Rationale

1. **Single container** - No separate database service needed
2. **Minimal resources** - ~50MB RAM, well suited to a NAS
3. **Zero ops** - No connection pooling, no tuning
4. **Continuous backup** - Litestream replicates to local storage
5. **Point-in-time recovery** - Restore to any point within the 30-day retention window
6. **Sufficient scale** - Handles thousands of developers easily

### Architecture

```
┌─────────────────────────────────────────┐
│              Synology NAS               │
│  ┌─────────────────────────────────┐    │
│  │     mosis-portal container      │    │
│  │  ├── Go binary                  │    │
│  │  ├── SQLite (portal.db)         │    │
│  │  └── Litestream                 │    │
│  └──────────────┬──────────────────┘    │
│                 │                       │
│  ┌──────────────▼──────────────────┐    │
│  │  /volume1/mosis/                │    │
│  │  ├── data/portal.db             │    │
│  │  ├── data/portal.db-wal         │    │
│  │  ├── backups/ (litestream)      │    │
│  │  └── packages/ (app uploads)    │    │
│  └─────────────────────────────────┘    │
└─────────────────────────────────────────┘
```

### Litestream Configuration

```yaml
dbs:
  - path: /data/portal.db
    replicas:
      - type: file
        path: /backups/portal
        retention: 720h  # 30 days
```

---

## Overview

The database stores all persistent data: developer accounts, app metadata, versions, telemetry events, and audit logs.

---

## Requirements

### Data Characteristics

| Data Type | Volume | Access Pattern | Consistency |
|-----------|--------|----------------|-------------|
| Developers | 10K rows | Read-heavy, low write | Strong |
| Apps | 100K rows | Read-heavy | Strong |
| Versions | 500K rows | Read-heavy | Strong |
| API Keys | 50K rows | Read-heavy | Strong |
| Telemetry | 100M+ rows | Write-heavy, append | Eventual OK |
| Audit Logs | 10M+ rows | Write-heavy, append | Eventual OK |

### Query Patterns

- Get developer by email
- List apps by developer
- Get app with latest version
- Search apps by name/tags
- Aggregate telemetry by app/day
- Time-range queries on events

---

## Options Analysis

### Option A: PostgreSQL

#### Characteristics

```
Type:        Relational (SQL)
ACID:        Full
JSON:        Native JSONB support
Full-text:   Built-in tsvector
Scaling:     Vertical + read replicas
```

#### Pros

| Advantage | Details |
|-----------|---------|
| Battle-tested | Decades of reliability |
| ACID compliance | Strong consistency |
| JSON support | JSONB for flexible data |
| Full-text search | No separate search engine needed |
| Extensions | PostGIS, pg_trgm, etc. |
| Tooling | pgAdmin, great ORMs |

#### Cons

| Disadvantage | Details |
|--------------|---------|
| Ops overhead | Need connection pooling |
| Scaling writes | Vertical scaling limits |
| Time-series | Not optimized for telemetry |

#### Hosting Options

| Provider | Free Tier | Paid |
|----------|-----------|------|
| Supabase | 500MB | $25/mo |
| Neon | 512MB | $19/mo |
| Railway | 1GB | $5/mo |
| AWS RDS | - | $15/mo+ |
| Self-hosted | - | VPS cost |

---

### Option B: SQLite + Litestream

#### Characteristics

```
Type:        Embedded relational
ACID:        Full
Scaling:     Single writer
Backup:      Litestream to S3
```

#### Pros

| Advantage | Details |
|-----------|---------|
| Zero ops | No separate DB server |
| Fast reads | In-process, no network |
| Simple backup | Litestream handles replication |
| Low cost | Just storage costs |
| Portable | Easy local development |

#### Cons

| Disadvantage | Details |
|--------------|---------|
| Single writer | Limits write concurrency |
| No horizontal scale | One server only |
| Limited features | Full-text search only via the FTS5 extension |

#### Cost Estimate

| Component | Cost/month |
|-----------|------------|
| S3 storage (10GB) | $0.25 |
| Compute | Included in app server |

---

### Option C: PostgreSQL + TimescaleDB

#### Characteristics

```
Type:        Time-series extension
Base:        PostgreSQL
Scaling:     Automatic partitioning
Compression: Native
```

#### Pros

| Advantage | Details |
|-----------|---------|
| Best of both | Relational + time-series |
| Auto-partition | Handles telemetry scale |
| Compression | 90%+ compression ratio |
| Continuous aggregates | Pre-computed rollups |

#### Cons

| Disadvantage | Details |
|--------------|---------|
| Complexity | More to manage |
| Cost | Higher than plain Postgres |
| Learning curve | New concepts |

---

### Option D: Hybrid Approach

```
PostgreSQL         → Developers, Apps, Versions, API Keys
ClickHouse/QuestDB → Telemetry, Analytics
Redis              → Caching, Sessions
```

#### Pros

| Advantage | Details |
|-----------|---------|
| Right tool for job | Optimized for each use case |
| Scale independently | Telemetry won't affect main DB |
| Performance | Best possible for each workload |

#### Cons

| Disadvantage | Details |
|--------------|---------|
| Complexity | Multiple systems to manage |
| Cost | More infrastructure |
| Consistency | Cross-DB transactions hard |

---

## Schema Design (SQLite)

### Core Tables

```sql
-- Developers
CREATE TABLE developers (
    id TEXT PRIMARY KEY,  -- UUID as text
    email TEXT UNIQUE NOT NULL,
    name TEXT NOT NULL,
    password_hash TEXT,
    oauth_provider TEXT,
    oauth_id TEXT,
    verified INTEGER DEFAULT 0,
    created_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now'))
);

-- API Keys
CREATE TABLE api_keys (
    id TEXT PRIMARY KEY,
    developer_id TEXT NOT NULL REFERENCES developers(id) ON DELETE CASCADE,
    name TEXT NOT NULL,
    key_hash TEXT NOT NULL,
    key_prefix TEXT NOT NULL,  -- For display: "mk_abc..."
    permissions TEXT DEFAULT '[]',  -- JSON array
    last_used_at TEXT,
    expires_at TEXT,
    created_at TEXT DEFAULT (datetime('now'))
);

-- Apps
CREATE TABLE apps (
    id TEXT PRIMARY KEY,
    developer_id TEXT NOT NULL REFERENCES developers(id) ON DELETE CASCADE,
    package_id TEXT UNIQUE NOT NULL,  -- com.dev.app
    name TEXT NOT NULL,
    description TEXT,
    category TEXT,
    tags TEXT DEFAULT '[]',  -- JSON array
    status TEXT DEFAULT 'draft',  -- draft, published, suspended
    created_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now'))
);

-- App Versions
CREATE TABLE app_versions (
    id TEXT PRIMARY KEY,
    app_id TEXT NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
    version_code INTEGER NOT NULL,
    version_name TEXT NOT NULL,
    package_url TEXT NOT NULL,
    package_size INTEGER NOT NULL,
    signature TEXT NOT NULL,
    permissions TEXT DEFAULT '[]',  -- JSON array
    min_mosis_version TEXT,
    release_notes TEXT,
    status TEXT DEFAULT 'draft',  -- draft, review, approved, published, rejected
    review_notes TEXT,
    published_at TEXT,
    created_at TEXT DEFAULT (datetime('now')),
    UNIQUE(app_id, version_code)
);

-- Developer Signing Keys
CREATE TABLE signing_keys (
    id TEXT PRIMARY KEY,
    developer_id TEXT NOT NULL REFERENCES developers(id) ON DELETE CASCADE,
    name TEXT NOT NULL,
    public_key TEXT NOT NULL,
    fingerprint TEXT NOT NULL,
    is_active INTEGER DEFAULT 1,
    created_at TEXT DEFAULT (datetime('now'))
);
```

### Telemetry Tables

```sql
-- Telemetry Events (append-only, partition by month via separate tables)
CREATE TABLE telemetry_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    app_id TEXT NOT NULL,
    device_id TEXT NOT NULL,  -- Hashed for privacy
    event_type TEXT NOT NULL,
    event_data TEXT,  -- JSON string
    mosis_version TEXT,
    timestamp TEXT NOT NULL  -- ISO8601 format
);

-- Crash Reports
CREATE TABLE crash_reports (
    id TEXT PRIMARY KEY,
    app_id TEXT NOT NULL,
    app_version TEXT NOT NULL,
    device_id TEXT NOT NULL,
    crash_type TEXT NOT NULL,
    message TEXT,
    stack_trace TEXT,
    context TEXT,  -- JSON string
    mosis_version TEXT,
    timestamp TEXT NOT NULL,
    created_at TEXT DEFAULT (datetime('now'))
);

-- Daily aggregates (computed by background job)
CREATE TABLE telemetry_daily (
    app_id TEXT NOT NULL,
    date TEXT NOT NULL,  -- YYYY-MM-DD
    event_type TEXT NOT NULL,
    count INTEGER NOT NULL,
    unique_devices INTEGER NOT NULL,
    PRIMARY KEY (app_id, date, event_type)
);

-- Audit Logs
CREATE TABLE audit_logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    developer_id TEXT,
    action TEXT NOT NULL,
    resource_type TEXT,
    resource_id TEXT,
    details TEXT,  -- JSON string
    ip_address TEXT,
    user_agent TEXT,
    created_at TEXT DEFAULT (datetime('now'))
);
```

**Note**: For high-volume telemetry, consider:
- Separate SQLite database file for telemetry (isolates write load)
- Monthly table rotation with application-level partitioning
- Aggressive data retention (delete events older than 90 days)

### Indexes

```sql
-- Developers
-- Redundant: the UNIQUE constraint on email already creates an index
CREATE INDEX idx_developers_email ON developers(email);
CREATE INDEX idx_developers_oauth ON developers(oauth_provider, oauth_id);

-- API Keys
CREATE INDEX idx_api_keys_developer ON api_keys(developer_id);
CREATE INDEX idx_api_keys_prefix ON api_keys(key_prefix);

-- Apps
CREATE INDEX idx_apps_developer ON apps(developer_id);
-- Redundant: the UNIQUE constraint on package_id already creates an index
CREATE INDEX idx_apps_package ON apps(package_id);
CREATE INDEX idx_apps_status ON apps(status);
-- For prefix LIKE searches ('foo%'). LIKE is case-insensitive by default,
-- so the index must use NOCASE collation for SQLite to use it.
CREATE INDEX idx_apps_name ON apps(name COLLATE NOCASE);

-- Versions
CREATE INDEX idx_versions_app ON app_versions(app_id);
CREATE INDEX idx_versions_status ON app_versions(status);

-- Signing Keys
CREATE INDEX idx_signing_keys_developer ON signing_keys(developer_id);
CREATE INDEX idx_signing_keys_fingerprint ON signing_keys(fingerprint);

-- Telemetry
CREATE INDEX idx_telemetry_app ON telemetry_events(app_id, timestamp);
CREATE INDEX idx_telemetry_type ON telemetry_events(event_type, timestamp);

-- Crashes
CREATE INDEX idx_crashes_app ON crash_reports(app_id, timestamp);
CREATE INDEX idx_crashes_type ON crash_reports(crash_type);

-- Audit Logs
CREATE INDEX idx_audit_developer ON audit_logs(developer_id);
CREATE INDEX idx_audit_created ON audit_logs(created_at);
```

**Full-text Search**: For app search, use SQLite FTS5:

```sql
-- Create FTS5 virtual table for app search.
-- Caution: apps has a TEXT primary key, so this relies on the implicit rowid,
-- which VACUUM may renumber; rebuild the FTS index after a VACUUM.
CREATE VIRTUAL TABLE apps_fts USING fts5(
    name,
    description,
    tags,
    content='apps',
    content_rowid='rowid'
);

-- Triggers to keep FTS in sync
CREATE TRIGGER apps_ai AFTER INSERT ON apps BEGIN
    INSERT INTO apps_fts(rowid, name, description, tags)
    VALUES (NEW.rowid, NEW.name, NEW.description, NEW.tags);
END;

CREATE TRIGGER apps_ad AFTER DELETE ON apps BEGIN
    INSERT INTO apps_fts(apps_fts, rowid, name, description, tags)
    VALUES ('delete', OLD.rowid, OLD.name, OLD.description, OLD.tags);
END;

CREATE TRIGGER apps_au AFTER UPDATE ON apps BEGIN
    INSERT INTO apps_fts(apps_fts, rowid, name, description, tags)
    VALUES ('delete', OLD.rowid, OLD.name, OLD.description, OLD.tags);
    INSERT INTO apps_fts(rowid, name, description, tags)
    VALUES (NEW.rowid, NEW.name, NEW.description, NEW.tags);
END;
```

---

## Migration Strategy

### Approach: Incremental Migrations

```
migrations/
├── 001_create_developers.sql
├── 002_create_apps.sql
├── 003_create_versions.sql
├── 004_create_telemetry.sql
└── ...
```

### Tools

- **Go**: golang-migrate, goose
- **Node.js**: Prisma Migrate, Drizzle Kit
- **Rust**: sqlx migrate, refinery

### Rollback Strategy

- Every migration has up/down
- Test rollbacks in staging
- Keep migrations small and focused

---

## Backup Strategy

### PostgreSQL

```bash
# Daily full backup
pg_dump -Fc "$DATABASE_URL" > "backup_$(date +%Y%m%d).dump"

# Continuous WAL archiving to S3.
# These are postgresql.conf settings, not shell commands:
#   archive_mode = on
#   archive_command = 'aws s3 cp %p s3://backups/wal/%f'
```

### SQLite + Litestream

```yaml
# litestream.yml
# S3 variant; the NAS deployment in the Decision section uses a local file replica instead.
dbs:
  - path: /data/portal.db
    replicas:
      - url: s3://backups/mosis
        retention: 720h  # 30 days
```

### Recovery Time Objectives

| Scenario | RTO | RPO |
|----------|-----|-----|
| Hardware failure | 1 hour | 5 minutes |
| Data corruption | 4 hours | 1 hour |
| Disaster recovery | 24 hours | 24 hours |

---

## Recommendation

### For MVP/Early Stage

**SQLite + Litestream**
- Simplest to operate
- Lowest cost
- Good enough for initial scale
- Easy migration to PostgreSQL later

### For Production Scale

**PostgreSQL + TimescaleDB**
- Handles all data types well
- Time-series for telemetry
- Proven at scale
- Good tooling ecosystem

### Hybrid (If needed later)

```
PostgreSQL  → Core data (developers, apps)
TimescaleDB → Telemetry (same cluster, extension)
Redis       → Caching, rate limiting
```

---

## Deliverables

- [x] Final database selection (SQLite + Litestream)
- [x] Complete schema design (core + telemetry + FTS5)
- [ ] Migration scripts (golang-migrate)
- [x] Backup/restore procedures (Litestream to local storage)
- [x] ~~Connection pooling setup~~ (not needed for SQLite)
- [ ] Monitoring queries

---

## Open Questions

1. ~~Expected telemetry volume per day?~~ → Start simple, optimize if needed
2. ~~How long to retain raw telemetry?~~ → 90 days raw, daily aggregates indefinitely
3. ~~Need for real-time analytics vs batch?~~ → Batch is sufficient for MVP
4. ~~Multi-region requirements?~~ → Single NAS deployment for now

---

## References

- [PostgreSQL JSONB performance](https://www.postgresql.org/docs/current/datatype-json.html)
- [TimescaleDB vs InfluxDB](https://www.timescale.com/blog/timescaledb-vs-influxdb/)
- [Litestream documentation](https://litestream.io/)
- [SQLite at scale](https://www.sqlite.org/whentouse.html)