Milestone 12: Virtual Hardware - Microphone

Status: Complete

Goal: Audio recording with mandatory recording indicators and security controls.


Overview

This milestone implements secure microphone access for Lua apps:

  • Permission required (microphone permission)
  • User gesture required to start recording
  • Mandatory recording indicator (system-controlled, cannot be hidden)
  • Single recording session per app
  • Sample rate limiting
  • Automatic cleanup on app stop

Key Deliverables

  1. MicrophoneInterface class - Session management, permission checks
  2. RecordingSession class - Active recording session wrapper
  3. Lua microphone API - microphone.start(), session methods
  4. Recording indicator - System-level UI notification

File Structure

src/main/cpp/sandbox/
├── microphone_interface.h      # NEW - Microphone API header
└── microphone_interface.cpp    # NEW - Microphone implementation

Implementation Details

1. MicrophoneInterface Class

// microphone_interface.h
#pragma once

#include <cstdint>   // uint8_t, uint64_t used by AudioBuffer
#include <string>
#include <memory>
#include <functional>
#include <mutex>
#include <atomic>
#include <vector>
#include <chrono>

struct lua_State;

namespace mosis {

class PermissionGate;

enum class AudioFormat {
    PCM_16BIT,
    PCM_FLOAT
};

struct RecordingConfig {
    int sample_rate = 44100;    // 8000, 16000, 22050, 44100, 48000
    int channels = 1;           // 1 (mono) or 2 (stereo)
    AudioFormat format = AudioFormat::PCM_16BIT;
};

struct AudioBuffer {
    std::vector<uint8_t> data;
    int sample_rate;
    int channels;
    AudioFormat format;
    uint64_t timestamp_ms;
    int sample_count;
};

class RecordingSession {
public:
    using BufferCallback = std::function<void(const AudioBuffer& buffer)>;

    RecordingSession(int id, const RecordingConfig& config);
    ~RecordingSession();

    int GetId() const { return m_id; }
    const RecordingConfig& GetConfig() const { return m_config; }
    bool IsActive() const { return m_active; }

    // Get accumulated audio data
    AudioBuffer GetBuffer();

    // Set buffer callback (for streaming)
    void SetOnBuffer(BufferCallback cb) { m_on_buffer = std::move(cb); }

    // Stop recording
    void Stop();

    // For mock mode - simulate audio data arrival
    void SimulateBuffer(const AudioBuffer& buffer);

private:
    int m_id;
    RecordingConfig m_config;
    std::atomic<bool> m_active{true};
    BufferCallback m_on_buffer;
    AudioBuffer m_accumulated;
    mutable std::mutex m_mutex;
};

class MicrophoneInterface {
public:
    MicrophoneInterface(const std::string& app_id, PermissionGate* permissions);
    ~MicrophoneInterface();

    // Start recording session
    // Returns session on success, nullptr on failure (sets error)
    // Requires microphone permission and user gesture
    std::shared_ptr<RecordingSession> StartSession(const RecordingConfig& config, std::string& error);

    // Stop active session
    void StopSession();

    // Check if session is active
    bool HasActiveSession() const;

    // Check if recording indicator should be shown
    bool IsIndicatorVisible() const;

    // Cleanup on app stop
    void Shutdown();

    // For testing
    void SetMockMode(bool enabled) { m_mock_mode = enabled; }
    bool IsMockMode() const { return m_mock_mode; }

    // Simulate user gesture for testing
    void SimulateUserGesture();

private:
    std::string m_app_id;
    PermissionGate* m_permissions;
    std::shared_ptr<RecordingSession> m_active_session;
    mutable std::mutex m_mutex;
    bool m_mock_mode = true;
    std::atomic<bool> m_indicator_visible{false};
    int m_next_session_id = 1;

    // Track user gesture timing
    std::chrono::steady_clock::time_point m_last_gesture_time;
    bool m_has_gesture = false;
    static constexpr int GESTURE_VALIDITY_MS = 5000;  // 5 seconds

    bool HasRecentUserGesture() const;
    void ShowIndicator();
    void HideIndicator();
};

// Register microphone.* APIs as globals
void RegisterMicrophoneAPI(lua_State* L, MicrophoneInterface* microphone);

} // namespace mosis
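
The header declares GetBuffer() and SimulateBuffer() without showing their bodies. The following is a minimal sketch of how buffer accumulation under the mutex could work. MiniBuffer and MiniSession are hypothetical stand-ins for AudioBuffer and RecordingSession, and whether the real GetBuffer() drains or copies the accumulated data is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical, trimmed-down stand-in for AudioBuffer.
struct MiniBuffer {
    std::vector<uint8_t> data;
    int sample_count = 0;
};

// Hypothetical stand-in for the accumulation part of RecordingSession.
class MiniSession {
public:
    // Append an incoming chunk (e.g. a simulated buffer in mock mode).
    void Append(const MiniBuffer& chunk) {
        assert(chunk.sample_count >= 0);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_accumulated.data.insert(m_accumulated.data.end(),
                                  chunk.data.begin(), chunk.data.end());
        m_accumulated.sample_count += chunk.sample_count;
    }

    // Return everything accumulated so far and reset the internal buffer.
    // Assumption: draining semantics; the real GetBuffer() may copy instead.
    MiniBuffer Drain() {
        std::lock_guard<std::mutex> lock(m_mutex);
        MiniBuffer out = std::move(m_accumulated);
        m_accumulated = MiniBuffer{};
        return out;
    }

private:
    std::mutex m_mutex;
    MiniBuffer m_accumulated;
};
```

Draining keeps memory bounded when the app polls regularly. A copying variant would need a separate cap on the accumulated size.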

2. Permission Requirements

Microphone access requires:

  1. microphone permission declared in manifest
  2. Permission granted by user (dangerous permission)
  3. Recent user gesture (within 5 seconds)

// Permission check flow
std::shared_ptr<RecordingSession> MicrophoneInterface::StartSession(
        const RecordingConfig& config, std::string& error) {
    // 1. Check permission
    if (!m_permissions->HasPermission("microphone")) {
        error = "Microphone permission not granted";
        return nullptr;
    }

    // 2. Check user gesture
    if (!HasRecentUserGesture()) {
        error = "Microphone requires user gesture";
        return nullptr;
    }

    // 3. Check no existing session
    if (m_active_session) {
        error = "Recording session already active";
        return nullptr;
    }

    // ... create session
}
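
The gesture check in step 2 depends on HasRecentUserGesture(), whose body is not shown. As a rough sketch, the 5 second window could be tracked as below. GestureTracker is a hypothetical helper that mirrors m_last_gesture_time and m_has_gesture, and it takes the current time as a parameter so the window can be tested without sleeping. Whether a gesture is consumed by a successful start is not specified here, so this sketch leaves it valid until it expires.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper; not part of the MicrophoneInterface header.
class GestureTracker {
public:
    using Clock = std::chrono::steady_clock;

    explicit GestureTracker(std::chrono::milliseconds validity)
        : m_validity(validity) {
        assert(validity.count() > 0);
    }

    // Remember when the most recent user gesture happened.
    void Record(Clock::time_point now) {
        m_last = now;
        m_has_gesture = true;
    }

    // True if a gesture was recorded and is still inside the validity window.
    bool IsRecent(Clock::time_point now) const {
        if (!m_has_gesture) return false;
        return now - m_last <= m_validity;
    }

private:
    std::chrono::milliseconds m_validity;
    Clock::time_point m_last{};
    bool m_has_gesture = false;
};
```

Using steady_clock rather than system_clock means that changing the wall-clock time cannot extend the window.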

3. Recording Indicator

The recording indicator is mandatory and system-controlled:

  • Shown whenever microphone session is active
  • Cannot be hidden or obscured by app
  • Positioned in system UI area
  • Shows microphone icon with "Recording" text
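
One way to guarantee that the indicator state always follows the session is to tie it to an object's lifetime. The sketch below is illustrative only: IndicatorGuard is a hypothetical class, and the real ShowIndicator()/HideIndicator() may work differently. The flag plays the role of m_indicator_visible.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical RAII guard: the flag is true exactly while the guard is alive.
class IndicatorGuard {
public:
    explicit IndicatorGuard(std::atomic<bool>& flag) : m_flag(flag) {
        const bool was_visible = m_flag.exchange(true);
        // With a single session per app, the indicator is never shown twice.
        assert(!was_visible);
        (void)was_visible;
    }

    ~IndicatorGuard() { m_flag.store(false); }

    IndicatorGuard(const IndicatorGuard&) = delete;
    IndicatorGuard& operator=(const IndicatorGuard&) = delete;

private:
    std::atomic<bool>& m_flag;
};
```

Because the app never receives a reference to the flag or the guard, it has no path to clear the indicator while a session is running.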

4. Lua API

-- Start recording session (requires permission + user gesture)
local session, err = microphone.start({
    sampleRate = 44100,    -- 8000, 16000, 22050, 44100, 48000
    channels = 1,          -- 1 (mono) or 2 (stereo)
    format = "pcm16"       -- "pcm16" or "float"
})

if not session then
    print("Failed to start recording:", err)
    return
end

-- Set buffer callback (for streaming audio)
session:on("buffer", function(buffer)
    -- buffer.data, buffer.sampleRate, buffer.channels, buffer.sampleCount
end)

-- Get accumulated audio data
local audio = session:getBuffer()
if audio then
    print("Recorded", audio.sampleCount, "samples at", audio.sampleRate, "Hz")
end

-- Stop recording
session:stop()

-- Check if microphone is active
if microphone.isActive() then
    print("Microphone is recording")
end
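
On the native side, the format string passed to microphone.start() has to be mapped to an AudioFormat value. A minimal sketch of that mapping follows. ParseFormat and BytesPerSample are hypothetical helper names, rejecting unknown strings is an assumption, and the 4 byte size for "float" assumes 32-bit float samples.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Same enumerators as in microphone_interface.h.
enum class AudioFormat { PCM_16BIT, PCM_FLOAT };

// Hypothetical helper: map the Lua-side format string to AudioFormat.
// Returns std::nullopt for anything other than "pcm16" or "float".
std::optional<AudioFormat> ParseFormat(const std::string& name) {
    if (name == "pcm16") return AudioFormat::PCM_16BIT;
    if (name == "float") return AudioFormat::PCM_FLOAT;
    return std::nullopt;
}

// Hypothetical helper: size of one sample of one channel, in bytes.
// Assumes PCM_FLOAT means 32-bit float.
int BytesPerSample(AudioFormat format) {
    switch (format) {
        case AudioFormat::PCM_16BIT: return 2;
        case AudioFormat::PCM_FLOAT: return 4;
    }
    assert(false && "unhandled AudioFormat");
    return 0;
}
```

Returning an empty optional lets the binding report nil, err to Lua instead of silently falling back to a default format.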

5. Sample Rate Validation

To prevent resource abuse:

  • Allowed sample rates: 8000, 16000, 22050, 44100, 48000
  • Maximum: 48000 Hz
  • Channels: 1 (mono) or 2 (stereo)
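
The limits above can be checked before a session is created. The sketch below shows one possible validator. ValidateConfig and Pcm16BytesPerSecond are hypothetical names, the error strings are illustrative, and where the check sits inside StartSession() is not specified by this document.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>

// Trimmed-down copy of RecordingConfig (format omitted for brevity).
struct RecordingConfig {
    int sample_rate = 44100;
    int channels = 1;
};

// Hypothetical validator: returns true if the config is allowed, otherwise
// fills `error` with a human-readable reason.
bool ValidateConfig(const RecordingConfig& config, std::string& error) {
    static const std::array<int, 5> kAllowedRates = {{8000, 16000, 22050, 44100, 48000}};
    if (std::find(kAllowedRates.begin(), kAllowedRates.end(),
                  config.sample_rate) == kAllowedRates.end()) {
        error = "Unsupported sample rate: " + std::to_string(config.sample_rate);
        return false;
    }
    if (config.channels != 1 && config.channels != 2) {
        error = "Unsupported channel count: " + std::to_string(config.channels);
        return false;
    }
    return true;
}

// Data rate of 16-bit PCM for a valid config, in bytes per second.
int Pcm16BytesPerSecond(const RecordingConfig& config) {
    std::string unused;
    const bool valid = ValidateConfig(config, unused);
    assert(valid);
    (void)valid;
    return config.sample_rate * config.channels * 2;
}
```

With these limits, the largest 16-bit configuration (48000 Hz, stereo) produces 48000 × 2 × 2 = 192000 bytes per second, which bounds how fast a session can fill memory.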

6. Session Lifecycle

User Gesture ──► microphone.start() ──► Session Active ──► Indicator Shown
                     │                       │
                     │ error                 │ session:stop()
                     ▼                       ▼
                   nil, err              Session Closed ──► Indicator Hidden

App Stop ──► MicrophoneInterface::Shutdown() ──► Active Session Closed ──► Indicator Hidden
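
The lifecycle above can be modelled as a small state holder. This is a simplified sketch, not the actual implementation: LifecycleSession and LifecycleMic are hypothetical names, and the permission and gesture checks are left out. It shows that a session handle still held by the app becomes inactive after shutdown, and that stopping twice is harmless.

```cpp
#include <atomic>
#include <cassert>
#include <memory>

// Hypothetical stand-in for RecordingSession.
struct LifecycleSession {
    std::atomic<bool> active{true};
    void Stop() { active.store(false); }
};

// Hypothetical stand-in for the session bookkeeping in MicrophoneInterface.
class LifecycleMic {
public:
    // Returns nullptr if a session is already active (single session per app).
    std::shared_ptr<LifecycleSession> Start() {
        if (m_session) return nullptr;
        m_session = std::make_shared<LifecycleSession>();
        m_indicator = true;
        CheckInvariant();
        return m_session;
    }

    // Idempotent: does nothing when no session is active.
    void Stop() {
        if (!m_session) return;
        m_session->Stop();   // handles kept by the app see active == false
        m_session.reset();
        m_indicator = false;
        CheckInvariant();
    }

    // App stop goes through the same path as an explicit stop.
    void Shutdown() { Stop(); }

    bool IndicatorVisible() const { return m_indicator; }

private:
    // The indicator is visible if and only if a session exists.
    void CheckInvariant() const { assert(m_indicator == (m_session != nullptr)); }

    std::shared_ptr<LifecycleSession> m_session;
    bool m_indicator = false;
};
```

In the real API a direct session:stop() must also hide the indicator, so the session would need a way to notify its owner. That wiring is omitted here.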

Test Cases

Test 1: Requires Permission

bool Test_MicrophoneRequiresPermission(std::string& error_msg) {
    // Create context WITHOUT microphone permission
    SandboxContext ctx;
    ctx.app_id = "test.app";
    ctx.permissions = {};
    ctx.is_system_app = false;
    PermissionGate permissions(ctx);

    mosis::MicrophoneInterface mic("test.app", &permissions);
    mic.SimulateUserGesture();

    std::string err;
    mosis::RecordingConfig config;
    auto session = mic.StartSession(config, err);

    EXPECT_TRUE(session == nullptr);
    EXPECT_TRUE(err.find("permission") != std::string::npos);

    return true;
}

Test 2: Requires User Gesture

bool Test_MicrophoneRequiresUserGesture(std::string& error_msg) {
    SandboxContext ctx;
    ctx.app_id = "test.app";
    ctx.permissions = {"microphone"};
    ctx.is_system_app = false;
    PermissionGate permissions(ctx);
    permissions.GrantPermission("microphone");

    mosis::MicrophoneInterface mic("test.app", &permissions);
    // Note: NOT calling SimulateUserGesture()

    std::string err;
    mosis::RecordingConfig config;
    auto session = mic.StartSession(config, err);

    EXPECT_TRUE(session == nullptr);
    EXPECT_TRUE(err.find("gesture") != std::string::npos);

    return true;
}

Test 3: Shows Indicator

bool Test_MicrophoneShowsIndicator(std::string& error_msg) {
    SandboxContext ctx;
    ctx.app_id = "test.app";
    ctx.permissions = {"microphone"};
    ctx.is_system_app = false;
    PermissionGate permissions(ctx);
    permissions.GrantPermission("microphone");

    mosis::MicrophoneInterface mic("test.app", &permissions);
    mic.SimulateUserGesture();

    EXPECT_FALSE(mic.IsIndicatorVisible());

    std::string err;
    mosis::RecordingConfig config;
    auto session = mic.StartSession(config, err);
    EXPECT_TRUE(session != nullptr);

    EXPECT_TRUE(mic.IsIndicatorVisible());

    mic.StopSession();

    EXPECT_FALSE(mic.IsIndicatorVisible());

    return true;
}

Test 4: Single Session Only

bool Test_MicrophoneSingleSession(std::string& error_msg) {
    SandboxContext ctx;
    ctx.app_id = "test.app";
    ctx.permissions = {"microphone"};
    ctx.is_system_app = false;
    PermissionGate permissions(ctx);
    permissions.GrantPermission("microphone");

    mosis::MicrophoneInterface mic("test.app", &permissions);
    mic.SimulateUserGesture();

    std::string err;
    mosis::RecordingConfig config;

    // First session should succeed
    auto session1 = mic.StartSession(config, err);
    EXPECT_TRUE(session1 != nullptr);

    // Second session should fail
    mic.SimulateUserGesture();
    auto session2 = mic.StartSession(config, err);
    EXPECT_TRUE(session2 == nullptr);
    EXPECT_TRUE(err.find("active") != std::string::npos ||
                err.find("already") != std::string::npos);

    return true;
}

Test 5: Stops On App Stop

bool Test_MicrophoneStopsOnShutdown(std::string& error_msg) {
    SandboxContext ctx;
    ctx.app_id = "test.app";
    ctx.permissions = {"microphone"};
    ctx.is_system_app = false;
    PermissionGate permissions(ctx);
    permissions.GrantPermission("microphone");

    mosis::MicrophoneInterface mic("test.app", &permissions);
    mic.SimulateUserGesture();

    std::string err;
    mosis::RecordingConfig config;
    auto session = mic.StartSession(config, err);

    EXPECT_TRUE(session != nullptr);
    EXPECT_TRUE(mic.HasActiveSession());

    // Simulate app stop
    mic.Shutdown();

    EXPECT_FALSE(mic.HasActiveSession());
    EXPECT_FALSE(mic.IsIndicatorVisible());

    return true;
}

Test 6: Lua Integration

bool Test_MicrophoneLuaIntegration(std::string& error_msg) {
    SandboxContext ctx = TestContext();
    ctx.permissions = {"microphone"};
    LuaSandbox sandbox(ctx);

    PermissionGate permissions(ctx);
    permissions.GrantPermission("microphone");

    mosis::MicrophoneInterface mic("test.app", &permissions);
    mic.SimulateUserGesture();
    mosis::RegisterMicrophoneAPI(sandbox.GetState(), &mic);

    std::string script = R"lua(
        -- Test that microphone global exists
        if not microphone then
            error("microphone global not found")
        end
        if not microphone.start then
            error("microphone.start not found")
        end
        if not microphone.isActive then
            error("microphone.isActive not found")
        end

        -- isActive should be false initially
        if microphone.isActive() then
            error("microphone should not be active initially")
        end
    )lua";

    bool ok = sandbox.LoadString(script, "microphone_test");
    if (!ok) {
        error_msg = "Lua test failed: " + sandbox.GetLastError();
        return false;
    }
    return true;
}

Acceptance Criteria

All tests must pass:

  • Test_MicrophoneRequiresPermission - Permission check works
  • Test_MicrophoneRequiresUserGesture - User gesture required
  • Test_MicrophoneShowsIndicator - Recording indicator shown/hidden
  • Test_MicrophoneSingleSession - Only one session allowed
  • Test_MicrophoneStopsOnShutdown - Cleanup on shutdown
  • Test_MicrophoneLuaIntegration - Lua API works

Dependencies

  • Milestone 1 (LuaSandbox)
  • Milestone 2 (PermissionGate + user gesture)
  • Milestone 3 (AuditLog)

Notes

Desktop vs Android Implementation

For desktop testing, MicrophoneInterface operates in mock mode:

  • Permission and gesture checks run normally
  • Indicator state is tracked but not displayed
  • No actual microphone hardware access
  • Audio buffers can be simulated for testing
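
For mock-mode tests, a deterministic tone is a convenient payload to pass to SimulateBuffer(). The helper below is a hypothetical sketch that produces mono 16-bit PCM bytes. The little-endian byte order and the half-scale amplitude are assumptions, not something this document specifies.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical test helper: generate `sample_count` samples of a sine tone
// as mono 16-bit PCM, two bytes per sample, least significant byte first.
std::vector<uint8_t> MakeSinePcm16(int sample_rate, double frequency_hz,
                                   int sample_count) {
    assert(sample_rate > 0 && sample_count >= 0);
    const double kTwoPi = 6.283185307179586;
    std::vector<uint8_t> bytes;
    bytes.reserve(static_cast<std::size_t>(sample_count) * 2);
    for (int i = 0; i < sample_count; ++i) {
        const double phase = kTwoPi * frequency_hz * i / sample_rate;
        // Half of full scale, so the tone stays well inside the int16 range.
        const auto sample =
            static_cast<int16_t>(std::lround(std::sin(phase) * 32767.0 * 0.5));
        const auto raw = static_cast<uint16_t>(sample);
        bytes.push_back(static_cast<uint8_t>(raw & 0xFF));
        bytes.push_back(static_cast<uint8_t>(raw >> 8));
    }
    return bytes;
}
```

The returned bytes can be assigned to AudioBuffer::data, with sample_count, sample_rate and channels set to match the arguments.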

On Android, the real implementation would:

  1. Use AudioRecord API through JNI
  2. Display system-level recording indicator
  3. Handle audio hardware lifecycle
  4. Deliver audio buffers through native callbacks

Security Considerations

  1. Permission: Microphone access is a dangerous permission requiring user grant
  2. User gesture: Prevents background audio recording
  3. Indicator: User always knows when microphone is active
  4. Single session: Prevents resource abuse
  5. Cleanup: Sessions closed when app stops

Future Integration

The microphone will integrate with:

  1. Voice communication in multiplayer games
  2. Voice commands/speech recognition
  3. Audio messaging features

Next Steps

After Milestone 12 passes:

  1. Milestone 13: Virtual Hardware - Audio Output