MosisService/docs/SANDBOX_MILESTONE_19.md

# Milestone 19: Security Testing Suite

**Status**: Complete
**Goal**: Comprehensive security test coverage with fuzzing.

---

## Overview

This milestone formalizes the security testing infrastructure:
- Unit tests for all sandbox components (already implemented in Milestones 1-18)
- Integration tests for full app lifecycle
- Fuzzer for random Lua code testing
- Security audit checklist verification

### Key Deliverables

1. **LuaFuzzer class** - Generates random Lua code and verifies sandbox integrity
2. **Integration tests** - Full lifecycle tests
3. **Security audit tests** - Verify all SANDBOX.md security requirements

---

## File Structure

```
sandbox-test/
├── src/
│   ├── main.cpp           # Existing - all unit tests (135+)
│   ├── lua_fuzzer.h       # NEW - Fuzzer header
│   └── lua_fuzzer.cpp     # NEW - Fuzzer implementation
```

---

## Implementation Details

### 1. LuaFuzzer Class

```cpp
// lua_fuzzer.h
#pragma once

#include <cstddef>     // size_t
#include <cstdint>     // uint32_t
#include <functional>
#include <random>
#include <string>
#include <vector>

namespace mosis {

struct FuzzResult {
    bool crashed = false;
    bool sandbox_intact = true;
    std::string error;
    size_t iterations = 0;
    size_t errors_caught = 0;
};

class LuaFuzzer {
public:
    LuaFuzzer(uint32_t seed = 0);
    ~LuaFuzzer();

    // Run fuzzing for N iterations
    FuzzResult Run(size_t iterations);

    // Configuration
    void SetMaxCodeLength(size_t len) { m_max_code_length = len; }
    void SetMaxNesting(size_t depth) { m_max_nesting = depth; }

    // Statistics
    size_t GetTotalRuns() const { return m_total_runs; }
    size_t GetCrashes() const { return m_crashes; }
    size_t GetErrorsCaught() const { return m_errors_caught; }

private:
    std::mt19937 m_rng;
    size_t m_max_code_length = 1000;
    size_t m_max_nesting = 10;
    size_t m_total_runs = 0;
    size_t m_crashes = 0;
    size_t m_errors_caught = 0;

    // Code generators
    std::string GenerateRandomCode();
    std::string GenerateExpression(int depth);
    std::string GenerateStatement(int depth);
    std::string GenerateIdentifier();
    std::string GenerateLiteral();

    // Sandbox integrity verification
    bool VerifySandboxIntegrity();
};

} // namespace mosis
```

### 2. Fuzzer Code Generation

The fuzzer generates random Lua code including:
- Valid expressions (arithmetic, string, table)
- Control flow (if, while, for, repeat)
- Function definitions and calls
- Table operations
- Error-inducing patterns (intentional)
- Boundary conditions
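
To make the generation categories concrete, here is a minimal sketch of a depth-bounded recursive expression generator. It is an illustration only: `GenerateLiteralSketch` and `GenerateExpressionSketch` are hypothetical free functions, and the literal and operator sets are assumptions rather than the ones in `lua_fuzzer.cpp`.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Picks a leaf value, including boundary cases (empty string, huge number, nil).
std::string GenerateLiteralSketch(std::mt19937& rng) {
    static const char* const kLiterals[] = {
        "0", "1", "-1", "1e308", "\"\"", "\"x\"", "nil", "true", "{}"};
    constexpr std::size_t kCount = sizeof(kLiterals) / sizeof(kLiterals[0]);
    std::uniform_int_distribution<std::size_t> pick(0, kCount - 1);
    return kLiterals[pick(rng)];
}

// Builds a random Lua expression; recursion stops at max_depth so the
// output size stays bounded.
std::string GenerateExpressionSketch(std::mt19937& rng, int depth, int max_depth) {
    if (depth >= max_depth) {
        return GenerateLiteralSketch(rng);
    }
    std::uniform_int_distribution<int> pick_kind(0, 3);
    switch (pick_kind(rng)) {
        case 0: {  // binary operator; some combinations fail at runtime on purpose
            static const char* const kOps[] = {"+", "-", "*", "/", "%", ".."};
            std::uniform_int_distribution<std::size_t> pick_op(0, 5);
            // Evaluate operands in separate statements so the order of RNG
            // draws is fixed and a given seed always yields the same code.
            const std::string lhs = GenerateExpressionSketch(rng, depth + 1, max_depth);
            const std::string op = kOps[pick_op(rng)];
            const std::string rhs = GenerateExpressionSketch(rng, depth + 1, max_depth);
            return "(" + lhs + " " + op + " " + rhs + ")";
        }
        case 1:  // table constructor
            return "{" + GenerateExpressionSketch(rng, depth + 1, max_depth) + "}";
        case 2:  // function call
            return "tostring(" + GenerateExpressionSketch(rng, depth + 1, max_depth) + ")";
        default:
            return GenerateLiteralSketch(rng);
    }
}
```

Expressions such as `(nil + 1)` are syntactically valid but raise a runtime error, which is one way to cover the "error-inducing patterns" category without emitting unparseable code.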

### 3. Security Audit Tests

| Test | Description | Verifies |
|------|-------------|----------|
| `AuditNoOsAccess` | os.* blocked | SANDBOX.md §1 |
| `AuditNoIoAccess` | io.* blocked | SANDBOX.md §1 |
| `AuditNoLoadfile` | loadfile blocked | SANDBOX.md §1 |
| `AuditNoDofile` | dofile blocked | SANDBOX.md §1 |
| `AuditNoBytecode` | Bytecode rejected | SANDBOX.md §2 |
| `AuditMemoryLimit` | Memory limited | SANDBOX.md §3 |
| `AuditCPULimit` | CPU limited | SANDBOX.md §3 |
| `AuditMetatableProtected` | Metatables protected | SANDBOX.md §4 |
| `AuditNoStringDump` | string.dump removed | SANDBOX.md §5 |
| `AuditPathTraversal` | Path traversal blocked | SANDBOX.md §6 |
| `AuditPermissionEnforced` | Permissions checked | SANDBOX.md §7 |
| `AuditRateLimiting` | Rate limits work | SANDBOX.md §8 |
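
Several of the audits above share one shape: run a snippet and require that it does not succeed. The sketch below shows a table-driven runner for that shape. `AuditCase` and `RunBlockedAudits` are hypothetical names, and the executor is a callback so the example compiles without the sandbox.

```cpp
#include <functional>
#include <string>
#include <vector>

// One "must be blocked" audit: a name from the table and the Lua code to try.
struct AuditCase {
    std::string name;
    std::string lua_code;
};

// Runs every case and returns the names of audits whose code was NOT blocked.
// execute returns true when the snippet ran successfully.
std::vector<std::string> RunBlockedAudits(
    const std::vector<AuditCase>& cases,
    const std::function<bool(const std::string&)>& execute) {
    std::vector<std::string> failures;
    for (const auto& audit : cases) {
        if (execute(audit.lua_code)) {
            failures.push_back(audit.name);  // keep going to report every gap
        }
    }
    return failures;
}
```

Unlike a loop that returns on the first failure, this reports every unblocked snippet in one run. With the sandbox, the executor could be a lambda wrapping `sandbox.LoadString(code, "audit")`, assuming that call returns `false` for blocked code as it does in the tests in this document.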

---

## Test Cases

### Test 1: Fuzzer Runs Without Crashes

```cpp
bool Test_FuzzerNoCrashes(std::string& error_msg) {
    mosis::LuaFuzzer fuzzer(12345);  // Deterministic seed

    auto result = fuzzer.Run(1000);  // 1000 iterations

    EXPECT_FALSE(result.crashed);
    EXPECT_TRUE(result.sandbox_intact);
    EXPECT_TRUE(result.iterations == 1000);

    return true;
}
```

### Test 2: Fuzzer Catches Errors Gracefully

```cpp
bool Test_FuzzerCatchesErrors(std::string& error_msg) {
    mosis::LuaFuzzer fuzzer(54321);

    auto result = fuzzer.Run(500);

    // Some generated code should produce errors (caught gracefully)
    EXPECT_TRUE(result.errors_caught > 0);
    EXPECT_FALSE(result.crashed);

    return true;
}
```

### Test 3: Sandbox Integrity After Fuzzing

```cpp
bool Test_FuzzerSandboxIntegrity(std::string& error_msg) {
    mosis::LuaFuzzer fuzzer;

    // Run many iterations
    auto result = fuzzer.Run(2000);

    // Sandbox must still be intact
    EXPECT_TRUE(result.sandbox_intact);

    // Verify by running a normal script
    SandboxContext ctx = TestContext();
    LuaSandbox sandbox(ctx);
    EXPECT_TRUE(sandbox.LoadString("return 1 + 1", "verify"));

    return true;
}
```

### Test 4: Audit - Dangerous Globals Blocked

```cpp
bool Test_AuditDangerousGlobalsBlocked(std::string& error_msg) {
    SandboxContext ctx = TestContext();
    LuaSandbox sandbox(ctx);

    // All these must fail
    std::vector<std::string> dangerous = {
        "os.execute('ls')",
        "io.open('test.txt')",
        "loadfile('test.lua')",
        "dofile('test.lua')",
        "require('os')",
        "package.loadlib('test', 'func')"
    };

    for (const auto& code : dangerous) {
        bool ok = sandbox.LoadString(code, "audit");
        if (ok) {
            error_msg = "Dangerous code executed: " + code;
            return false;
        }
    }

    return true;
}
```

### Test 5: Audit - Resource Limits

```cpp
bool Test_AuditResourceLimits(std::string& error_msg) {
    SandboxContext ctx = TestContext();

    // Memory limit test
    {
        LuaSandbox sandbox(ctx);
        sandbox.SetMemoryLimit(1024 * 1024);  // 1MB
        bool ok = sandbox.LoadString(R"lua(
            local t = {}
            for i = 1, 10000000 do
                t[i] = string.rep("x", 1000)
            end
        )lua", "mem_test");
        EXPECT_FALSE(ok);  // Should fail due to memory limit
    }

    // CPU limit test
    {
        LuaSandbox sandbox(ctx);
        sandbox.SetInstructionLimit(10000);
        bool ok = sandbox.LoadString("while true do end", "cpu_test");
        EXPECT_FALSE(ok);  // Should fail due to CPU limit
    }

    return true;
}
```
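
Test 5 relies on `SetMemoryLimit`, whose implementation is not shown in this document. One common way to cap Lua memory is a custom allocator with the `lua_Alloc` signature that tracks usage and refuses growth past a budget. The sketch below assumes that approach; `MemoryBudget` and `LimitedAlloc` are hypothetical names, not the sandbox's actual code.

```cpp
#include <cstddef>
#include <cstdlib>

// Running total of bytes handed out, plus the cap to enforce.
struct MemoryBudget {
    std::size_t used = 0;
    std::size_t limit = 0;
};

// Same parameter list as lua_Alloc: (userdata, block, old size, new size).
void* LimitedAlloc(void* ud, void* ptr, std::size_t osize, std::size_t nsize) {
    auto* budget = static_cast<MemoryBudget*>(ud);
    if (ptr == nullptr) {
        osize = 0;  // for new blocks Lua may pass an object-kind tag here, not a size
    }
    if (nsize == 0) {  // free request
        std::free(ptr);
        budget->used -= osize;
        return nullptr;
    }
    if (nsize > osize && budget->used + (nsize - osize) > budget->limit) {
        return nullptr;  // refuse growth past the limit; the old block stays valid
    }
    void* block = std::realloc(ptr, nsize);
    if (block != nullptr) {
        budget->used = budget->used - osize + nsize;
    }
    return block;
}
```

An allocator like this would be handed to `lua_newstate` together with a pointer to the budget. When it returns `nullptr`, Lua raises a memory error, which a wrapper such as `LoadString` could then report as a failed load.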

### Test 6: Integration - Full App Lifecycle

```cpp
bool Test_IntegrationAppLifecycle(std::string& error_msg) {
    // Create app context
    SandboxContext ctx{
        .app_id = "lifecycle.test.app",
        .app_path = ".",
        .permissions = {"storage"},
        .is_system_app = false
    };

    // Create sandbox
    LuaSandbox sandbox(ctx);

    // Create the API backends the script relies on
    // (registering them with the sandbox is not shown here)
    mosis::PermissionGate gate("lifecycle.test.app");
    mosis::AuditLog audit;
    mosis::VirtualFS vfs("lifecycle.test.app", ".");
    mosis::TimerManager timers;

    // Load app script
    std::string script = R"lua(
        -- App initialization
        local data = json.encode({started = true})

        -- Use storage (has permission)
        storage.write("state.json", data)

        -- Read back
        local content = storage.read("state.json")
        local state = json.decode(content)

        -- Return success
        return state.started == true
    )lua";

    bool ok = sandbox.LoadString(script, "lifecycle_test");
    EXPECT_TRUE(ok);

    // Cleanup
    vfs.Delete("state.json");

    return true;
}
```

---

## Acceptance Criteria

All tests pass:

- [x] `Test_FuzzerNoCrashes` - Fuzzer runs 1000 iterations without crash
- [x] `Test_FuzzerCatchesErrors` - Fuzzer catches Lua errors gracefully
- [x] `Test_FuzzerSandboxIntegrity` - Sandbox intact after fuzzing
- [x] `Test_AuditDangerousGlobalsBlocked` - All dangerous globals blocked
- [x] `Test_AuditResourceLimits` - Memory and CPU limits enforced
- [x] `Test_IntegrationAppLifecycle` - Full app lifecycle works

---

## Dependencies

- All previous milestones (1-18)

---

## Notes

### Fuzzer Strategy

1. **Seed-based**: Deterministic with seed for reproducibility
2. **Incremental complexity**: Start simple, increase nesting
3. **Boundary testing**: Test edge cases (empty strings, huge numbers)
4. **Error injection**: Intentionally generate invalid code
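
As one possible reading of "incremental complexity", the sketch below ramps the nesting depth linearly from 1 on the first iteration to the configured maximum on the last. `NestingForIteration` is a hypothetical helper and the linear schedule is an assumption; the fuzzer may use a different ramp.

```cpp
#include <cstddef>

// Maps an iteration index in [0, total) to a nesting depth in [1, max_nesting].
std::size_t NestingForIteration(std::size_t iteration,
                                std::size_t total,
                                std::size_t max_nesting) {
    if (max_nesting == 0) {
        return 0;  // nesting disabled
    }
    if (total <= 1) {
        return 1;  // a single run starts at the simplest level
    }
    // Integer interpolation: iteration 0 gives 1, iteration total-1 gives max_nesting.
    return 1 + (iteration * (max_nesting - 1)) / (total - 1);
}
```

Because the schedule depends only on the iteration index, it stays reproducible under a fixed seed: rerunning with the same seed and iteration count replays the same depths in the same order.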

### Security Audit Coverage

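The audit tests verify all requirements from SANDBOX.md. The items below are a checklist; their numbering does not match the SANDBOX.md section numbers cited in the audit table: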
1. Dangerous standard library functions removed
2. Bytecode loading disabled
3. Memory limits enforced
4. CPU/instruction limits enforced
5. Metatables protected
6. Path traversal blocked
7. Permissions enforced
8. Rate limiting works

---

## Next Steps

After Milestone 19 passes:
1. Milestone 20: Final Integration