Milestone 19: Security Testing Suite

Status: Complete
Goal: Comprehensive security test coverage with fuzzing.


Overview

This milestone formalizes the security testing infrastructure:

  • Unit tests for all sandbox components (already implemented in Milestones 1-18)
  • Integration tests for full app lifecycle
  • Fuzzer for random Lua code testing
  • Security audit checklist verification

Key Deliverables

  1. LuaFuzzer class - Generates random Lua code and verifies sandbox integrity
  2. Integration tests - Full lifecycle tests
  3. Security audit tests - Verify all SANDBOX.md security requirements

File Structure

sandbox-test/
├── src/
│   ├── main.cpp           # Existing - all unit tests (135+)
│   ├── lua_fuzzer.h       # NEW - Fuzzer header
│   └── lua_fuzzer.cpp     # NEW - Fuzzer implementation

Implementation Details

1. LuaFuzzer Class

// lua_fuzzer.h
#pragma once

#include <cstddef>     // size_t
#include <cstdint>     // uint32_t
#include <functional>
#include <random>
#include <string>
#include <vector>

namespace mosis {

struct FuzzResult {
    bool crashed = false;
    bool sandbox_intact = true;
    std::string error;
    size_t iterations = 0;
    size_t errors_caught = 0;
};

class LuaFuzzer {
public:
    explicit LuaFuzzer(uint32_t seed = 0);
    ~LuaFuzzer();

    // Run fuzzing for N iterations
    FuzzResult Run(size_t iterations);

    // Configuration
    void SetMaxCodeLength(size_t len) { m_max_code_length = len; }
    void SetMaxNesting(size_t depth) { m_max_nesting = depth; }

    // Statistics
    size_t GetTotalRuns() const { return m_total_runs; }
    size_t GetCrashes() const { return m_crashes; }
    size_t GetErrorsCaught() const { return m_errors_caught; }

private:
    std::mt19937 m_rng;
    size_t m_max_code_length = 1000;
    size_t m_max_nesting = 10;
    size_t m_total_runs = 0;
    size_t m_crashes = 0;
    size_t m_errors_caught = 0;

    // Code generators
    std::string GenerateRandomCode();
    std::string GenerateExpression(int depth);
    std::string GenerateStatement(int depth);
    std::string GenerateIdentifier();
    std::string GenerateLiteral();

    // Sandbox integrity verification
    bool VerifySandboxIntegrity();
};

} // namespace mosis
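
The header declares Run but does not show its body. As a rough sketch of how such a loop might be structured, the example below counts caught errors and stops when an integrity check fails. The names Generator, Executor, IntegrityCheck, FuzzResultSketch and RunFuzzLoop are assumptions made for illustration; the callbacks stand in for GenerateRandomCode, a LuaSandbox load, and VerifySandboxIntegrity. Detecting a hard crash usually needs a separate process or signal handling, so the sketch leaves the crashed flag out.

```cpp
#include <cstddef>
#include <functional>
#include <random>
#include <string>

// Hypothetical stand-ins, not from the source:
//   Generator      produces one script from the RNG.
//   Executor       runs one script; returns false and fills `error` on failure.
//   IntegrityCheck plays the role of LuaFuzzer::VerifySandboxIntegrity().
using Generator = std::function<std::string(std::mt19937& rng)>;
using Executor = std::function<bool(const std::string& code, std::string& error)>;
using IntegrityCheck = std::function<bool()>;

struct FuzzResultSketch {
    bool sandbox_intact = true;
    std::string error;
    std::size_t iterations = 0;
    std::size_t errors_caught = 0;
};

FuzzResultSketch RunFuzzLoop(std::mt19937& rng, std::size_t iterations,
                             const Generator& generate, const Executor& execute,
                             const IntegrityCheck& verify) {
    FuzzResultSketch result;
    for (std::size_t i = 0; i < iterations; ++i) {
        const std::string code = generate(rng);
        std::string error;
        if (!execute(code, error)) {
            ++result.errors_caught;  // a rejected script is an expected outcome
        }
        ++result.iterations;
        if (!verify()) {
            // Stop at the first script that leaves the sandbox in a bad state.
            result.sandbox_intact = false;
            result.error = "integrity check failed after: " + code;
            break;
        }
    }
    return result;
}
```

Passing the generator and executor as callbacks keeps the loop testable without a Lua state; the real class presumably calls its own private members instead.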

2. Fuzzer Code Generation

The fuzzer generates random Lua code including:

  • Valid expressions (arithmetic, string, table)
  • Control flow (if, while, for, repeat)
  • Function definitions and calls
  • Table operations
  • Error-inducing patterns (intentional)
  • Boundary conditions

3. Security Audit Tests

Test                      Description            Verifies
AuditNoOsAccess           os.* blocked           SANDBOX.md §1
AuditNoIoAccess           io.* blocked           SANDBOX.md §1
AuditNoLoadfile           loadfile blocked       SANDBOX.md §1
AuditNoDofile             dofile blocked         SANDBOX.md §1
AuditNoBytecode           Bytecode rejected      SANDBOX.md §2
AuditMemoryLimit          Memory limited         SANDBOX.md §3
AuditCPULimit             CPU limited            SANDBOX.md §3
AuditMetatableProtected   Metatables protected   SANDBOX.md §4
AuditNoStringDump         string.dump removed    SANDBOX.md §5
AuditPathTraversal        Path traversal blocked SANDBOX.md §6
AuditPermissionEnforced   Permissions checked    SANDBOX.md §7
AuditRateLimiting         Rate limits work       SANDBOX.md §8

Test Cases

Test 1: Fuzzer Runs Without Crashes

bool Test_FuzzerNoCrashes(std::string& error_msg) {
    mosis::LuaFuzzer fuzzer(12345);  // Deterministic seed

    auto result = fuzzer.Run(1000);  // 1000 iterations

    EXPECT_TRUE(!result.crashed);
    EXPECT_TRUE(result.sandbox_intact);
    EXPECT_TRUE(result.iterations == 1000);

    return true;
}
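
The tests in this section use EXPECT_TRUE and EXPECT_FALSE together with an error_msg out-parameter, but the macros themselves are defined elsewhere in main.cpp. One plausible shape for them is sketched below. These definitions are an assumption, not a copy of the project's macros; MOSIS_EXPECT_IMPL and the two example tests are hypothetical names.

```cpp
#include <string>

// Hypothetical definitions: on failure, record what failed and where in the
// caller's `error_msg` variable, then return false from the test function.
#define MOSIS_EXPECT_IMPL(passed, text)                                  \
    do {                                                                 \
        if (!(passed)) {                                                 \
            error_msg = std::string(text) + " failed at " + __FILE__ +   \
                        ":" + std::to_string(__LINE__);                  \
            return false;                                                \
        }                                                                \
    } while (0)

#define EXPECT_TRUE(cond) MOSIS_EXPECT_IMPL((cond), "EXPECT_TRUE(" #cond ")")
#define EXPECT_FALSE(cond) MOSIS_EXPECT_IMPL(!(cond), "EXPECT_FALSE(" #cond ")")

// Usage, following the bool Test_X(std::string& error_msg) convention.
bool Test_ExamplePasses(std::string& error_msg) {
    EXPECT_TRUE(1 + 1 == 2);
    EXPECT_FALSE(1 + 1 == 3);
    return true;
}

bool Test_ExampleFails(std::string& error_msg) {
    EXPECT_TRUE(2 < 1);  // fails: sets error_msg and returns false
    return true;
}
```

Under this assumption a failed expectation ends the test at once, which is why each test only needs a final return true.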

Test 2: Fuzzer Catches Errors Gracefully

bool Test_FuzzerCatchesErrors(std::string& error_msg) {
    mosis::LuaFuzzer fuzzer(54321);

    auto result = fuzzer.Run(500);

    // Some generated code should produce errors (caught gracefully)
    EXPECT_TRUE(result.errors_caught > 0);
    EXPECT_TRUE(!result.crashed);

    return true;
}

Test 3: Sandbox Integrity After Fuzzing

bool Test_FuzzerSandboxIntegrity(std::string& error_msg) {
    mosis::LuaFuzzer fuzzer;

    // Run many iterations
    auto result = fuzzer.Run(2000);

    // Sandbox must still be intact
    EXPECT_TRUE(result.sandbox_intact);

    // Verify by running a normal script
    SandboxContext ctx = TestContext();
    LuaSandbox sandbox(ctx);
    EXPECT_TRUE(sandbox.LoadString("return 1 + 1", "verify"));

    return true;
}

Test 4: Audit - Dangerous Globals Blocked

bool Test_AuditDangerousGlobalsBlocked(std::string& error_msg) {
    SandboxContext ctx = TestContext();
    LuaSandbox sandbox(ctx);

    // All these must fail
    std::vector<std::string> dangerous = {
        "os.execute('ls')",
        "io.open('test.txt')",
        "loadfile('test.lua')",
        "dofile('test.lua')",
        "require('os')",
        "package.loadlib('test', 'func')"
    };

    for (const auto& code : dangerous) {
        bool ok = sandbox.LoadString(code, "audit");
        if (ok) {
            error_msg = "Dangerous code executed: " + code;
            return false;
        }
    }

    return true;
}

Test 5: Audit - Resource Limits

bool Test_AuditResourceLimits(std::string& error_msg) {
    SandboxContext ctx = TestContext();

    // Memory limit test
    {
        LuaSandbox sandbox(ctx);
        sandbox.SetMemoryLimit(1024 * 1024);  // 1MB
        bool ok = sandbox.LoadString(R"lua(
            local t = {}
            for i = 1, 10000000 do
                t[i] = string.rep("x", 1000)
            end
        )lua", "mem_test");
        EXPECT_FALSE(ok);  // Should fail due to memory limit
    }

    // CPU limit test
    {
        LuaSandbox sandbox(ctx);
        sandbox.SetInstructionLimit(10000);
        bool ok = sandbox.LoadString("while true do end", "cpu_test");
        EXPECT_FALSE(ok);  // Should fail due to CPU limit
    }

    return true;
}
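
The source does not say how SetMemoryLimit is enforced. A common approach with Lua is a custom allocator that tracks usage and refuses requests over the limit; the sketch below shows that idea as a standalone function with the same signature as lua_Alloc, the allocator type accepted by lua_newstate. MemoryBudget and LimitedAlloc are hypothetical names, and whether LuaSandbox works this way is an assumption.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical per-sandbox bookkeeping.
struct MemoryBudget {
    std::size_t used = 0;
    std::size_t limit = 0;
};

// Same shape as lua_Alloc:
//   void* (*)(void* ud, void* ptr, size_t osize, size_t nsize)
// Returning nullptr for a non-zero request signals an allocation failure.
void* LimitedAlloc(void* ud, void* ptr, std::size_t osize, std::size_t nsize) {
    auto* budget = static_cast<MemoryBudget*>(ud);
    if (ptr == nullptr) {
        osize = 0;  // for new blocks Lua 5.2+ passes a type tag here, not a size
    }
    if (nsize == 0) {
        std::free(ptr);  // free(nullptr) is a no-op
        budget->used -= osize;
        return nullptr;
    }
    if (nsize > osize && budget->used + (nsize - osize) > budget->limit) {
        return nullptr;  // over budget: refuse and leave the old block untouched
    }
    void* block = std::realloc(ptr, nsize);
    if (block != nullptr) {
        budget->used = budget->used - osize + nsize;
    }
    return block;
}
```

With an allocator like this, the memory test's loop would hit the 1 MB budget after roughly a thousand 1000-byte strings and Lua would raise a memory error, which matches the expected failed load.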

Test 6: Integration - Full App Lifecycle

bool Test_IntegrationAppLifecycle(std::string& error_msg) {
    // Create app context
    SandboxContext ctx{
        .app_id = "lifecycle.test.app",
        .app_path = ".",
        .permissions = {"storage"},
        .is_system_app = false
    };

    // Create sandbox
    LuaSandbox sandbox(ctx);

    // Create the services that back the APIs used by the script
    mosis::PermissionGate gate("lifecycle.test.app");
    mosis::AuditLog audit;
    mosis::VirtualFS vfs("lifecycle.test.app", ".");
    mosis::TimerManager timers;

    // Load app script
    std::string script = R"lua(
        -- App initialization
        local data = json.encode({started = true})

        -- Use storage (has permission)
        storage.write("state.json", data)

        -- Read back
        local content = storage.read("state.json")
        local state = json.decode(content)

        -- Return success
        return state.started == true
    )lua";

    bool ok = sandbox.LoadString(script, "lifecycle_test");
    EXPECT_TRUE(ok);

    // Cleanup
    vfs.Delete("state.json");

    return true;
}

Acceptance Criteria

All tests pass:

  • Test_FuzzerNoCrashes - Fuzzer runs 1000 iterations without crash
  • Test_FuzzerCatchesErrors - Fuzzer catches Lua errors gracefully
  • Test_FuzzerSandboxIntegrity - Sandbox intact after fuzzing
  • Test_AuditDangerousGlobalsBlocked - All dangerous globals blocked
  • Test_AuditResourceLimits - Memory and CPU limits enforced
  • Test_IntegrationAppLifecycle - Full app lifecycle works

Dependencies

  • All previous milestones (1-18)

Notes

Fuzzer Strategy

  1. Seed-based: A fixed seed makes every run deterministic and reproducible
  2. Incremental complexity: Start simple, increase nesting
  3. Boundary testing: Test edge cases (empty strings, huge numbers)
  4. Error injection: Intentionally generate invalid code

Security Audit Coverage

The audit tests verify all requirements from SANDBOX.md:

  1. Dangerous standard library functions removed
  2. Bytecode loading disabled
  3. Memory limits enforced
  4. CPU/instruction limits enforced
  5. Metatables protected
  6. Path traversal blocked
  7. Permissions enforced
  8. Rate limiting works
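
As an illustration of what the path traversal requirement means in practice, the sketch below checks that a requested path stays inside an app's root directory. IsInsideRoot is a hypothetical function, not the VirtualFS implementation, and the check is purely lexical: it does not touch the disk, so symlinks that point outside the root would need separate handling.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical check: returns true only when `requested`, resolved lexically
// against `root`, stays inside `root`.
bool IsInsideRoot(const fs::path& root, const std::string& requested) {
    const fs::path request(requested);
    if (request.is_absolute()) {
        return false;  // only paths relative to the app root are accepted
    }
    const fs::path base = root.lexically_normal();
    const fs::path target = (base / request).lexically_normal();
    const fs::path rel = target.lexically_relative(base);
    if (rel.empty()) {
        return false;  // no relative path exists between the two
    }
    // Any ".." component left after normalization means the path escapes.
    for (const auto& part : rel) {
        if (part == fs::path("..")) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, IsInsideRoot("/apps/demo", "state.json") is accepted, while "../other/secret.txt" and "/etc/passwd" are rejected.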

Next Steps

After Milestone 19 passes:

  1. Milestone 20: Final Integration