# Milestone 3: Database Selection

**Status**: Planning

**Goal**: Choose a database for developer accounts, app metadata, and analytics.

---

## Overview

The database stores all persistent data: developer accounts, app metadata, versions, telemetry events, and audit logs.

---

## Requirements

### Data Characteristics

| Data Type | Volume | Access Pattern | Consistency |
|-----------|--------|----------------|-------------|
| Developers | 10K rows | Read-heavy, low write | Strong |
| Apps | 100K rows | Read-heavy | Strong |
| Versions | 500K rows | Read-heavy | Strong |
| API Keys | 50K rows | Read-heavy | Strong |
| Telemetry | 100M+ rows | Write-heavy, append | Eventual OK |
| Audit Logs | 10M+ rows | Write-heavy, append | Eventual OK |

### Query Patterns

- Get developer by email
- List apps by developer
- Get app with latest version
- Search apps by name/tags
- Aggregate telemetry by app/day
- Time-range queries on events

---

## Options Analysis

### Option A: PostgreSQL

#### Characteristics

```
Type:      Relational (SQL)
ACID:      Full
JSON:      Native JSONB support
Full-text: Built-in tsvector
Scaling:   Vertical + read replicas
```

#### Pros

| Advantage | Details |
|-----------|---------|
| Battle-tested | Decades of reliability |
| ACID compliance | Strong consistency |
| JSON support | JSONB for flexible data |
| Full-text search | No separate search engine needed |
| Extensions | PostGIS, pg_trgm, etc. |
| Tooling | pgAdmin, great ORMs |

#### Cons

| Disadvantage | Details |
|--------------|---------|
| Ops overhead | Need connection pooling |
| Scaling writes | Vertical scaling limits |
| Time-series | Not optimized for telemetry |

#### Hosting Options

| Provider | Free Tier | Paid |
|----------|-----------|------|
| Supabase | 500MB | $25/mo |
| Neon | 512MB | $19/mo |
| Railway | 1GB | $5/mo |
| AWS RDS | - | $15/mo+ |
| Self-hosted | - | VPS cost |

---

### Option B: SQLite + Litestream

#### Characteristics

```
Type:    Embedded relational
ACID:    Full
Scaling: Single writer
Backup:  Litestream to S3
```

#### Pros

| Advantage | Details |
|-----------|---------|
| Zero ops | No separate DB server |
| Fast reads | In-process, no network |
| Simple backup | Litestream handles replication |
| Low cost | Just storage costs |
| Portable | Easy local development |

#### Cons

| Disadvantage | Details |
|--------------|---------|
| Single writer | Limits write concurrency |
| No horizontal scale | One server only |
| Limited features | Full-text search requires the FTS5 extension |

#### Cost Estimate

| Component | Cost/month |
|-----------|------------|
| S3 storage (10GB) | $0.25 |
| Compute | Included in app server |

---

### Option C: PostgreSQL + TimescaleDB

#### Characteristics

```
Type:        Time-series extension
Base:        PostgreSQL
Scaling:     Automatic partitioning
Compression: Native
```

#### Pros

| Advantage | Details |
|-----------|---------|
| Best of both | Relational + time-series |
| Auto-partition | Handles telemetry scale |
| Compression | 90%+ compression ratio |
| Continuous aggregates | Pre-computed rollups |

#### Cons

| Disadvantage | Details |
|--------------|---------|
| Complexity | More to manage |
| Cost | Higher than plain Postgres |
| Learning curve | New concepts |

---

### Option D: Hybrid Approach

```
PostgreSQL         → Developers, Apps, Versions, API Keys
ClickHouse/QuestDB → Telemetry, Analytics
Redis              → Caching, Sessions
```

#### Pros

| Advantage | Details |
|-----------|---------|
| Right tool for job | Optimized for each use case |
| Scale independently | Telemetry won't affect main DB |
| Performance | Best possible for each workload |

#### Cons

| Disadvantage | Details |
|--------------|---------|
| Complexity | Multiple systems to manage |
| Cost | More infrastructure |
| Consistency | Cross-DB transactions hard |

---

## Schema Design

### Core Tables

```sql
-- gen_random_uuid() is built in from PostgreSQL 13;
-- earlier versions need the pgcrypto extension.

-- Developers
CREATE TABLE developers (
    id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email          VARCHAR(255) UNIQUE NOT NULL,
    name           VARCHAR(100) NOT NULL,
    password_hash  VARCHAR(255),
    oauth_provider VARCHAR(50),
    oauth_id       VARCHAR(255),
    verified       BOOLEAN DEFAULT FALSE,
    created_at     TIMESTAMPTZ DEFAULT NOW(),
    updated_at     TIMESTAMPTZ DEFAULT NOW()
);

-- API Keys
CREATE TABLE api_keys (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    developer_id UUID REFERENCES developers(id) ON DELETE CASCADE,
    name         VARCHAR(100) NOT NULL,
    key_hash     VARCHAR(255) NOT NULL,
    key_prefix   VARCHAR(10) NOT NULL,  -- For display: "mk_abc..."
    permissions  JSONB DEFAULT '[]',
    last_used_at TIMESTAMPTZ,
    expires_at   TIMESTAMPTZ,
    created_at   TIMESTAMPTZ DEFAULT NOW()
);

-- Apps
CREATE TABLE apps (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    developer_id UUID REFERENCES developers(id) ON DELETE CASCADE,
    package_id   VARCHAR(255) UNIQUE NOT NULL,  -- com.dev.app
    name         VARCHAR(100) NOT NULL,
    description  TEXT,
    category     VARCHAR(50),
    tags         VARCHAR(50)[] DEFAULT '{}',
    status       VARCHAR(20) DEFAULT 'draft',  -- draft, published, suspended
    created_at   TIMESTAMPTZ DEFAULT NOW(),
    updated_at   TIMESTAMPTZ DEFAULT NOW()
);

-- App Versions
CREATE TABLE app_versions (
    id                UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    app_id            UUID REFERENCES apps(id) ON DELETE CASCADE,
    version_code      INTEGER NOT NULL,
    version_name      VARCHAR(20) NOT NULL,
    package_url       TEXT NOT NULL,
    package_size      BIGINT NOT NULL,
    signature         VARCHAR(512) NOT NULL,
    permissions       JSONB DEFAULT '[]',
    min_mosis_version VARCHAR(20),
    release_notes     TEXT,
    status            VARCHAR(20) DEFAULT 'draft',  -- draft, review, approved, published, rejected
    review_notes      TEXT,
    published_at      TIMESTAMPTZ,
    created_at        TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(app_id, version_code)
);

-- Developer Signing Keys
CREATE TABLE signing_keys (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    developer_id UUID REFERENCES developers(id) ON DELETE CASCADE,
    name         VARCHAR(100) NOT NULL,
    public_key   TEXT NOT NULL,
    fingerprint  VARCHAR(64) NOT NULL,
    is_active    BOOLEAN DEFAULT TRUE,
    created_at   TIMESTAMPTZ DEFAULT NOW()
);
```

### Telemetry Tables (if using PostgreSQL)

```sql
-- Telemetry Events (consider partitioning by time)
-- The primary key must include the partition key (timestamp).
CREATE TABLE telemetry_events (
    id            BIGSERIAL,
    app_id        UUID NOT NULL,
    device_id     VARCHAR(64) NOT NULL,  -- Hashed
    event_type    VARCHAR(50) NOT NULL,
    event_data    JSONB,
    mosis_version VARCHAR(20),
    timestamp     TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (timestamp, id)
) PARTITION BY RANGE (timestamp);

-- Create monthly partitions
CREATE TABLE telemetry_events_2024_01 PARTITION OF telemetry_events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- Crash Reports
CREATE TABLE crash_reports (
    id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    app_id        UUID NOT NULL,
    app_version   VARCHAR(20) NOT NULL,
    device_id     VARCHAR(64) NOT NULL,
    crash_type    VARCHAR(50) NOT NULL,
    message       TEXT,
    stack_trace   TEXT,
    context       JSONB,
    mosis_version VARCHAR(20),
    timestamp     TIMESTAMPTZ NOT NULL,
    created_at    TIMESTAMPTZ DEFAULT NOW()
);

-- Daily aggregates (materialized or computed)
CREATE TABLE telemetry_daily (
    app_id         UUID NOT NULL,
    date           DATE NOT NULL,
    event_type     VARCHAR(50) NOT NULL,
    count          BIGINT NOT NULL,
    unique_devices BIGINT NOT NULL,
    PRIMARY KEY (app_id, date, event_type)
);
```

### Indexes

```sql
-- Developers
-- developers.email is already indexed by its UNIQUE constraint.
CREATE INDEX idx_developers_oauth ON developers(oauth_provider, oauth_id);

-- Apps
-- apps.package_id is already indexed by its UNIQUE constraint.
CREATE INDEX idx_apps_developer ON apps(developer_id);
CREATE INDEX idx_apps_status ON apps(status);
CREATE INDEX idx_apps_search ON apps
    USING gin(to_tsvector('english', name || ' ' || COALESCE(description, '')));

-- Versions
CREATE INDEX idx_versions_app ON app_versions(app_id);
CREATE INDEX idx_versions_status ON app_versions(status);

-- Telemetry
CREATE INDEX idx_telemetry_app ON telemetry_events(app_id, timestamp);
CREATE INDEX idx_telemetry_type ON telemetry_events(event_type, timestamp);

-- Crashes
CREATE INDEX idx_crashes_app ON crash_reports(app_id, timestamp);
CREATE INDEX idx_crashes_type ON crash_reports(crash_type);
```

---

## Migration Strategy

### Approach: Incremental Migrations

```
migrations/
├── 001_create_developers.sql
├── 002_create_apps.sql
├── 003_create_versions.sql
├── 004_create_telemetry.sql
└── ...
```

### Tools

- **Go**: golang-migrate, goose
- **Node.js**: Prisma Migrate, Drizzle Kit
- **Rust**: sqlx migrate, refinery

### Rollback Strategy

- Every migration has up/down
- Test rollbacks in staging
- Keep migrations small and focused

---

## Backup Strategy

### PostgreSQL

```bash
# Daily full backup
pg_dump -Fc "$DATABASE_URL" > "backup_$(date +%Y%m%d).dump"

# Continuous WAL archiving to S3.
# These are postgresql.conf settings, not shell commands:
#   archive_mode = on
#   archive_command = 'aws s3 cp %p s3://backups/wal/%f'
```

### SQLite + Litestream

```yaml
# litestream.yml
dbs:
  - path: /data/mosis.db
    replicas:
      - url: s3://backups/mosis
        retention: 720h  # 30 days
```

### Recovery Time Objectives

| Scenario | RTO | RPO |
|----------|-----|-----|
| Hardware failure | 1 hour | 5 minutes |
| Data corruption | 4 hours | 1 hour |
| Disaster recovery | 24 hours | 24 hours |

---

## Recommendation

### For MVP/Early Stage

**SQLite + Litestream**

- Simplest to operate
- Lowest cost
- Good enough for initial scale
- Easy migration to PostgreSQL later

### For Production Scale

**PostgreSQL + TimescaleDB**

- Handles all data types well
- Time-series for telemetry
- Proven at scale
- Good tooling ecosystem

### Hybrid (If needed later)

```
PostgreSQL  → Core data (developers, apps)
TimescaleDB → Telemetry (same cluster, extension)
Redis       → Caching, rate limiting
```

---

## Deliverables

- [ ] Final database selection
- [ ] Complete schema design
- [ ] Migration scripts
- [ ] Backup/restore procedures
- [ ] Connection pooling setup (if PostgreSQL)
- [ ] Monitoring queries

---

## Open Questions

1. Expected telemetry volume per day?
2. How long to retain raw telemetry?
3. Need for real-time analytics vs batch?
4. Multi-region requirements?

---

## References

- [PostgreSQL JSONB performance](https://www.postgresql.org/docs/current/datatype-json.html)
- [TimescaleDB vs InfluxDB](https://www.timescale.com/blog/timescaledb-vs-influxdb/)
- [Litestream documentation](https://litestream.io/)
- [SQLite at scale](https://www.sqlite.org/whentouse.html)
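
---

## Appendix: Daily Rollup Sketch (SQLite)

The "Aggregate telemetry by app/day" query pattern and the `telemetry_daily` table can be tried out against the SQLite option using only Python's standard library. The following is a minimal sketch, not a proposed implementation: the `create_schema` and `rollup_day` helper names are hypothetical, the tables are trimmed to the columns the rollup needs, and it assumes timestamps are stored as UTC text in `YYYY-MM-DD HH:MM:SS` form (SQLite has no `TIMESTAMPTZ` or `UUID` types).

```python
import sqlite3

# Trimmed SQLite translation of telemetry_events and telemetry_daily.
# Assumption: UUIDs and timestamps are stored as TEXT.
SCHEMA = """
CREATE TABLE telemetry_events (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    app_id     TEXT NOT NULL,
    device_id  TEXT NOT NULL,   -- hashed device identifier
    event_type TEXT NOT NULL,
    timestamp  TEXT NOT NULL    -- UTC, 'YYYY-MM-DD HH:MM:SS'
);
CREATE TABLE telemetry_daily (
    app_id         TEXT NOT NULL,
    date           TEXT NOT NULL,
    event_type     TEXT NOT NULL,
    count          INTEGER NOT NULL,
    unique_devices INTEGER NOT NULL,
    PRIMARY KEY (app_id, date, event_type)
);
"""


def create_schema(conn):
    """Create the two telemetry tables on a fresh connection."""
    conn.executescript(SCHEMA)


def rollup_day(conn, day):
    """Recompute telemetry_daily rows for one UTC day ('YYYY-MM-DD').

    INSERT OR REPLACE overwrites rows with the same primary key,
    so re-running the rollup for the same day does not double count.
    """
    with conn:  # commit on success, roll back on error
        conn.execute(
            """
            INSERT OR REPLACE INTO telemetry_daily
                (app_id, date, event_type, count, unique_devices)
            SELECT app_id, date(timestamp), event_type,
                   COUNT(*), COUNT(DISTINCT device_id)
            FROM telemetry_events
            WHERE timestamp >= ? AND timestamp < date(?, '+1 day')
            GROUP BY app_id, date(timestamp), event_type
            """,
            (day, day),
        )
```

A scheduled job could call `rollup_day(conn, "2024-01-15")` once the day has closed. With PostgreSQL + TimescaleDB, continuous aggregates would cover the same need without a hand-written job.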