# Milestone 3: Database Selection

**Status**: Planning
**Goal**: Choose a database for developer accounts, app metadata, and analytics.

---

## Overview

The database stores all persistent data: developer accounts, app metadata, versions, telemetry events, and audit logs.

---

## Requirements

### Data Characteristics

| Data Type | Volume | Access Pattern | Consistency |
|-----------|--------|----------------|-------------|
| Developers | 10K rows | Read-heavy, low write | Strong |
| Apps | 100K rows | Read-heavy | Strong |
| Versions | 500K rows | Read-heavy | Strong |
| API Keys | 50K rows | Read-heavy | Strong |
| Telemetry | 100M+ rows | Write-heavy, append | Eventual OK |
| Audit Logs | 10M+ rows | Write-heavy, append | Eventual OK |
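
The row counts above can be turned into a rough storage estimate when comparing hosting tiers. The sketch below is back-of-the-envelope arithmetic only: the average row sizes and the 30% index allowance are assumptions made for illustration, not measurements, and `estimate_table_bytes` is a hypothetical helper.

```python
def estimate_table_bytes(rows: int, avg_row_bytes: int, index_overhead: float = 0.3) -> int:
    """Rough on-disk size: raw rows plus a fractional allowance for indexes."""
    return int(rows * avg_row_bytes * (1 + index_overhead))


# Row counts come from the table above; average row sizes are assumed, not measured.
ESTIMATES = {
    "telemetry": estimate_table_bytes(100_000_000, 200),
    "audit_logs": estimate_table_bytes(10_000_000, 300),
    "apps": estimate_table_bytes(100_000, 1_000),
}

for table, size in ESTIMATES.items():
    print(f"{table}: {size / 1e9:.2f} GB")
```

Under these assumptions the append-only tables dominate storage by orders of magnitude, which is why they are treated separately in the options below.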

### Query Patterns

- Get developer by email
- List apps by developer
- Get app with latest version
- Search apps by name/tags
- Aggregate telemetry by app/day
- Time-range queries on events
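
Two of the patterns above ("get app with latest version" and "aggregate telemetry by app/day") are the least obvious to express in SQL. The following is a minimal sketch using Python's built-in `sqlite3` module and a deliberately simplified schema (integer IDs, text timestamps); the helper names `open_demo_db`, `app_with_latest_version`, and `telemetry_by_day` are hypothetical. The same queries should carry over to PostgreSQL with minor changes, for example `timestamp::date` in place of `date(timestamp)`.

```python
import sqlite3

# Simplified stand-in schema, assumed for illustration only.
SCHEMA = """
CREATE TABLE apps (
    id INTEGER PRIMARY KEY,
    package_id TEXT UNIQUE NOT NULL,
    name TEXT NOT NULL
);
CREATE TABLE app_versions (
    id INTEGER PRIMARY KEY,
    app_id INTEGER NOT NULL REFERENCES apps(id),
    version_code INTEGER NOT NULL,
    version_name TEXT NOT NULL,
    UNIQUE (app_id, version_code)
);
CREATE TABLE telemetry_events (
    app_id INTEGER NOT NULL,
    device_id TEXT NOT NULL,
    event_type TEXT NOT NULL,
    timestamp TEXT NOT NULL  -- 'YYYY-MM-DD HH:MM:SS', UTC
);
"""


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def app_with_latest_version(conn, package_id):
    """Return (name, version_code, version_name), or None if the app is unknown.

    The version fields are None when the app has no versions yet.
    """
    return conn.execute(
        """
        SELECT a.name, v.version_code, v.version_name
        FROM apps AS a
        LEFT JOIN app_versions AS v
               ON v.app_id = a.id
              AND v.version_code = (SELECT MAX(version_code)
                                    FROM app_versions
                                    WHERE app_id = a.id)
        WHERE a.package_id = ?
        """,
        (package_id,),
    ).fetchone()


def telemetry_by_day(conn, app_id):
    """Return [(day, event_type, events, unique_devices), ...] for one app."""
    return conn.execute(
        """
        SELECT date(timestamp) AS day,
               event_type,
               COUNT(*) AS events,
               COUNT(DISTINCT device_id) AS unique_devices
        FROM telemetry_events
        WHERE app_id = ?
        GROUP BY day, event_type
        ORDER BY day, event_type
        """,
        (app_id,),
    ).fetchall()
```

The `LEFT JOIN` keeps apps that have no versions yet in the result, which a plain inner join would silently drop.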

---

## Options Analysis

### Option A: PostgreSQL

#### Characteristics
```
Type:      Relational (SQL)
ACID:      Full
JSON:      Native JSONB support
Full-text: Built-in tsvector
Scaling:   Vertical + read replicas
```

#### Pros
| Advantage | Details |
|-----------|---------|
| Battle-tested | Decades of reliability |
| ACID compliance | Strong consistency |
| JSON support | JSONB for flexible data |
| Full-text search | No separate search engine needed |
| Extensions | PostGIS, pg_trgm, etc. |
| Tooling | pgAdmin, great ORMs |

#### Cons
| Disadvantage | Details |
|--------------|---------|
| Ops overhead | Need connection pooling |
| Scaling writes | Vertical scaling limits |
| Time-series | Not optimized for telemetry |

#### Hosting Options
| Provider | Free Tier | Paid |
|----------|-----------|------|
| Supabase | 500MB | $25/mo |
| Neon | 512MB | $19/mo |
| Railway | 1GB | $5/mo |
| AWS RDS | - | $15/mo+ |
| Self-hosted | - | VPS cost |

---

### Option B: SQLite + Litestream

#### Characteristics
```
Type:    Embedded relational
ACID:    Full
Scaling: Single writer
Backup:  Litestream to S3
```

#### Pros
| Advantage | Details |
|-----------|---------|
| Zero ops | No separate DB server |
| Fast reads | In-process, no network |
| Simple backup | Litestream handles replication |
| Low cost | Just storage costs |
| Portable | Easy local development |

#### Cons
| Disadvantage | Details |
|--------------|---------|
| Single writer | Limits write concurrency |
| No horizontal scale | One server only |
| Limited features | Full-text search requires the FTS5 extension |

#### Cost Estimate
| Component | Cost/month |
|-----------|------------|
| S3 storage (10GB) | $0.25 |
| Compute | Included in app server |

---

### Option C: PostgreSQL + TimescaleDB

#### Characteristics
```
Type:        Time-series extension
Base:        PostgreSQL
Scaling:     Automatic partitioning
Compression: Native
```

#### Pros
| Advantage | Details |
|-----------|---------|
| Best of both | Relational + time-series |
| Auto-partition | Handles telemetry scale |
| Compression | 90%+ compression ratio |
| Continuous aggregates | Pre-computed rollups |

#### Cons
| Disadvantage | Details |
|--------------|---------|
| Complexity | More to manage |
| Cost | Higher than plain Postgres |
| Learning curve | New concepts |

---

### Option D: Hybrid Approach

```
PostgreSQL         → Developers, Apps, Versions, API Keys
ClickHouse/QuestDB → Telemetry, Analytics
Redis              → Caching, Sessions
```

#### Pros
| Advantage | Details |
|-----------|---------|
| Right tool for job | Optimized for each use case |
| Scale independently | Telemetry won't affect main DB |
| Performance | Best possible for each workload |

#### Cons
| Disadvantage | Details |
|--------------|---------|
| Complexity | Multiple systems to manage |
| Cost | More infrastructure |
| Consistency | Cross-DB transactions hard |

---

## Schema Design

### Core Tables

```sql
-- Developers
CREATE TABLE developers (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    name VARCHAR(100) NOT NULL,
    password_hash VARCHAR(255),
    oauth_provider VARCHAR(50),
    oauth_id VARCHAR(255),
    verified BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- API Keys
CREATE TABLE api_keys (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    developer_id UUID REFERENCES developers(id) ON DELETE CASCADE,
    name VARCHAR(100) NOT NULL,
    key_hash VARCHAR(255) NOT NULL,
    key_prefix VARCHAR(10) NOT NULL, -- For display: "mk_abc..."
    permissions JSONB DEFAULT '[]',
    last_used_at TIMESTAMPTZ,
    expires_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Apps
CREATE TABLE apps (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    developer_id UUID REFERENCES developers(id) ON DELETE CASCADE,
    package_id VARCHAR(255) UNIQUE NOT NULL, -- com.dev.app
    name VARCHAR(100) NOT NULL,
    description TEXT,
    category VARCHAR(50),
    tags VARCHAR(50)[] DEFAULT '{}',
    status VARCHAR(20) DEFAULT 'draft', -- draft, published, suspended
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- App Versions
CREATE TABLE app_versions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    app_id UUID REFERENCES apps(id) ON DELETE CASCADE,
    version_code INTEGER NOT NULL,
    version_name VARCHAR(20) NOT NULL,
    package_url TEXT NOT NULL,
    package_size BIGINT NOT NULL,
    signature VARCHAR(512) NOT NULL,
    permissions JSONB DEFAULT '[]',
    min_mosis_version VARCHAR(20),
    release_notes TEXT,
    status VARCHAR(20) DEFAULT 'draft', -- draft, review, approved, published, rejected
    review_notes TEXT,
    published_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(app_id, version_code)
);

-- Developer Signing Keys
CREATE TABLE signing_keys (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    developer_id UUID REFERENCES developers(id) ON DELETE CASCADE,
    name VARCHAR(100) NOT NULL,
    public_key TEXT NOT NULL,
    fingerprint VARCHAR(64) NOT NULL,
    is_active BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
```
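
The `api_keys` table stores only `key_hash` and a short `key_prefix`, which implies the raw key is shown to the developer once and never persisted. A minimal sketch of that flow follows. It assumes SHA-256 as the hash (the schema does not name one), takes the `mk_` prefix from the column comment, and uses hypothetical helper names `generate_api_key` and `verify_api_key`.

```python
import hashlib
import hmac
import secrets

KEY_PREFIX_LEN = 10  # matches key_prefix VARCHAR(10) in the schema


def generate_api_key():
    """Return (raw_key, key_hash, key_prefix).

    Only key_hash and key_prefix would be stored; raw_key is shown once.
    """
    raw_key = "mk_" + secrets.token_urlsafe(32)
    key_hash = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
    return raw_key, key_hash, raw_key[:KEY_PREFIX_LEN]


def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored hash in constant time."""
    candidate = hashlib.sha256(presented_key.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

A fast hash is a reasonable assumption here because the key is a long random token rather than a user-chosen password; `password_hash` in `developers` would need a dedicated password-hashing function instead.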

### Telemetry Tables (if using PostgreSQL)

```sql
-- Telemetry Events (consider partitioning by time)
CREATE TABLE telemetry_events (
    id BIGSERIAL,
    app_id UUID NOT NULL,
    device_id VARCHAR(64) NOT NULL, -- Hashed
    event_type VARCHAR(50) NOT NULL,
    event_data JSONB,
    mosis_version VARCHAR(20),
    timestamp TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (timestamp, id)
) PARTITION BY RANGE (timestamp);

-- Create monthly partitions
CREATE TABLE telemetry_events_2024_01 PARTITION OF telemetry_events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- Crash Reports
CREATE TABLE crash_reports (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    app_id UUID NOT NULL,
    app_version VARCHAR(20) NOT NULL,
    device_id VARCHAR(64) NOT NULL,
    crash_type VARCHAR(50) NOT NULL,
    message TEXT,
    stack_trace TEXT,
    context JSONB,
    mosis_version VARCHAR(20),
    timestamp TIMESTAMPTZ NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Daily aggregates (materialized or computed)
CREATE TABLE telemetry_daily (
    app_id UUID NOT NULL,
    date DATE NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    count BIGINT NOT NULL,
    unique_devices BIGINT NOT NULL,
    PRIMARY KEY (app_id, date, event_type)
);
```
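
With declarative partitioning, the single `telemetry_events_2024_01` partition has to be followed by a new partition for every later month, and inserts for a month with no partition fail. One way to keep up is to generate the DDL from a scheduled job. The sketch below only builds the statement text; `monthly_partition_ddl` is a hypothetical helper, and running the statement against the database is left to whichever driver the project adopts.

```python
from datetime import date


def monthly_partition_ddl(year: int, month: int, parent: str = "telemetry_events") -> str:
    """Build the CREATE TABLE statement for one monthly range partition.

    The upper bound is exclusive, so it is the first day of the next month.
    """
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{parent}_{start:%Y_%m}"
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {parent}\n"
        f"    FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


if __name__ == "__main__":
    # Print the statements for the first quarter of 2024.
    for m in (1, 2, 3):
        print(monthly_partition_ddl(2024, m))
```

Creating partitions a month or two ahead of time avoids a failure window at month boundaries. TimescaleDB (Option C) handles this partitioning automatically.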

### Indexes

```sql
-- Developers
-- developers.email is already indexed by its UNIQUE constraint; no separate index is needed.
CREATE INDEX idx_developers_oauth ON developers(oauth_provider, oauth_id);

-- Apps
-- apps.package_id is already indexed by its UNIQUE constraint; no separate index is needed.
CREATE INDEX idx_apps_developer ON apps(developer_id);
CREATE INDEX idx_apps_status ON apps(status);
CREATE INDEX idx_apps_search ON apps USING gin(to_tsvector('english', name || ' ' || COALESCE(description, '')));

-- Versions
CREATE INDEX idx_versions_app ON app_versions(app_id);
CREATE INDEX idx_versions_status ON app_versions(status);

-- Telemetry
CREATE INDEX idx_telemetry_app ON telemetry_events(app_id, timestamp);
CREATE INDEX idx_telemetry_type ON telemetry_events(event_type, timestamp);

-- Crashes
CREATE INDEX idx_crashes_app ON crash_reports(app_id, timestamp);
CREATE INDEX idx_crashes_type ON crash_reports(crash_type);
```

---

## Migration Strategy

### Approach: Incremental Migrations

```
migrations/
├── 001_create_developers.sql
├── 002_create_apps.sql
├── 003_create_versions.sql
├── 004_create_telemetry.sql
└── ...
```

### Tools
- **Go**: golang-migrate, goose
- **Node.js**: Prisma Migrate, Drizzle Kit
- **Rust**: sqlx migrate, refinery

### Rollback Strategy
- Every migration has up/down
- Test rollbacks in staging
- Keep migrations small and focused
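
To make the up/down idea concrete, here is a minimal sketch of what the tools listed above do internally: apply numbered migrations in order, record the applied version, and undo the most recent one on rollback. It uses Python's built-in `sqlite3` with in-memory stand-ins for the migration files. The `schema_migrations` table and the function names are assumptions made for illustration, not the behaviour of any specific tool.

```python
import sqlite3

# Hypothetical stand-ins for files such as 001_create_developers.sql.
# Each entry: (version, name, up_sql, down_sql)
MIGRATIONS = [
    (1, "create_developers",
     "CREATE TABLE developers (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)",
     "DROP TABLE developers"),
    (2, "create_apps",
     "CREATE TABLE apps (id INTEGER PRIMARY KEY, developer_id INTEGER NOT NULL, name TEXT NOT NULL)",
     "DROP TABLE apps"),
]


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    # isolation_level=None disables implicit transactions, so the explicit
    # BEGIN/COMMIT below also covers the DDL statements.
    return sqlite3.connect(path, isolation_level=None)


def current_version(conn) -> int:
    """Return the highest applied migration version, or 0 if none."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_migrations (version INTEGER PRIMARY KEY)")
    row = conn.execute("SELECT MAX(version) FROM schema_migrations").fetchone()
    return row[0] or 0


def _run_in_transaction(conn, statements):
    """Run (sql, params) pairs atomically; roll everything back on error."""
    conn.execute("BEGIN")
    try:
        for sql, params in statements:
            conn.execute(sql, params)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def migrate_up(conn) -> int:
    """Apply every migration newer than the recorded version."""
    applied = current_version(conn)
    for version, _name, up_sql, _down_sql in MIGRATIONS:
        if version > applied:
            _run_in_transaction(conn, [
                (up_sql, ()),
                ("INSERT INTO schema_migrations (version) VALUES (?)", (version,)),
            ])
    return current_version(conn)


def migrate_down(conn, steps: int = 1) -> int:
    """Undo the most recent `steps` migrations, newest first."""
    for _ in range(steps):
        applied = current_version(conn)
        if applied == 0:
            break
        down_sql = next(m[3] for m in MIGRATIONS if m[0] == applied)
        _run_in_transaction(conn, [
            (down_sql, ()),
            ("DELETE FROM schema_migrations WHERE version = ?", (applied,)),
        ])
    return current_version(conn)
```

Wrapping each migration and its bookkeeping row in one transaction means a failed migration leaves the recorded version unchanged, which is what makes small, focused migrations safe to retry.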

---

## Backup Strategy

### PostgreSQL
```bash
# Daily full backup
pg_dump -Fc "$DATABASE_URL" > "backup_$(date +%Y%m%d).dump"

# Continuous WAL archiving to S3
# (postgresql.conf settings, not shell commands; archive_mode must also be on)
# archive_mode = on
# archive_command = 'aws s3 cp %p s3://backups/wal/%f'
```

### SQLite + Litestream
```yaml
# litestream.yml
dbs:
  - path: /data/mosis.db
    replicas:
      - url: s3://backups/mosis
        retention: 720h  # 30 days
```

### Recovery Time Objectives
| Scenario | RTO | RPO |
|----------|-----|-----|
| Hardware failure | 1 hour | 5 minutes |
| Data corruption | 4 hours | 1 hour |
| Disaster recovery | 24 hours | 24 hours |

---

## Recommendation

### For MVP/Early Stage
**SQLite + Litestream**
- Simplest to operate
- Lowest cost
- Good enough for initial scale
- Migration path to PostgreSQL later (the PostgreSQL-specific types in the schema above, such as JSONB, arrays, and TIMESTAMPTZ, would need SQLite equivalents until then)

### For Production Scale
**PostgreSQL + TimescaleDB**
- Handles all data types well
- Time-series for telemetry
- Proven at scale
- Good tooling ecosystem

### Hybrid (If needed later)
```
PostgreSQL  → Core data (developers, apps)
TimescaleDB → Telemetry (same cluster, extension)
Redis       → Caching, rate limiting
```

---

## Deliverables

- [ ] Final database selection
- [ ] Complete schema design
- [ ] Migration scripts
- [ ] Backup/restore procedures
- [ ] Connection pooling setup (if PostgreSQL)
- [ ] Monitoring queries

---

## Open Questions

1. Expected telemetry volume per day?
2. How long to retain raw telemetry?
3. Need for real-time analytics vs batch?
4. Multi-region requirements?

---

## References

- [PostgreSQL JSONB performance](https://www.postgresql.org/docs/current/datatype-json.html)
- [TimescaleDB vs InfluxDB](https://www.timescale.com/blog/timescaledb-vs-influxdb/)
- [Litestream documentation](https://litestream.io/)
- [SQLite at scale](https://www.sqlite.org/whentouse.html)