video-v1/vav2/docs/working/NVDEC_State_Machine_Refactoring.md
2025-10-11 02:08:57 +09:00


NVDEC Decoder State Machine Refactoring Design

Problem Statement

The current NVDECAV1Decoder::DecodeToSurface() has excessive complexity:

  • 13+ state variables tracked across multiple atomic flags and mutexes
  • 9+ conditional branches with nested conditions
  • ~150 lines in a single function
  • High branching complexity: 9 independent binary branches allow up to 2^9 = 512 execution paths (a cyclomatic complexity of roughly 10)

This makes the code:

  • Hard to maintain and debug
  • Difficult to test comprehensively
  • Prone to race conditions and edge cases
  • Challenging to extend with new features

Solution: State Machine Pattern

Core Design Principle

Consolidate all decoder state into a single enum with clear transitions, replacing scattered atomic flags and conditional checks.

State Machine States

enum class DecoderState {
    UNINITIALIZED,              // Before Initialize() is called
    READY,                      // Initialized and ready for decoding
    BUFFERING,                  // Initial buffering (0-15 frames)
    DECODING,                   // Normal frame-by-frame decoding
    FLUSHING,                   // End-of-file reached, draining DPB
    FLUSH_COMPLETE,             // All frames drained
    ERROR                       // Unrecoverable error state
};

State Transitions

UNINITIALIZED → READY                 (Initialize() called successfully)
READY → BUFFERING                     (First DecodeToSurface() call)
BUFFERING → DECODING                  (Display queue has frames)
BUFFERING → FLUSHING                  (End-of-file reached before buffering completes)
DECODING → FLUSHING                   (End-of-file reached, NULL packet)
FLUSHING → FLUSH_COMPLETE             (Display queue empty)
FLUSH_COMPLETE → READY                (Reset() called)
BUFFERING, DECODING, FLUSHING → READY (Reset() called mid-stream)
* → ERROR                             (Any other state can transition to ERROR on failure)
ERROR → READY                         (Reset() called)

State Machine Class

class DecoderStateMachine {
public:
    DecoderStateMachine() : m_state(DecoderState::UNINITIALIZED) {}

    // State queries
    DecoderState GetState() const { return m_state.load(); }
    const char* GetStateString() const { return StateToString(m_state.load()); }
    bool IsState(DecoderState state) const { return m_state.load() == state; }
    bool CanDecode() const {
        auto state = m_state.load();
        return state == DecoderState::READY ||
               state == DecoderState::BUFFERING ||
               state == DecoderState::DECODING ||
               state == DecoderState::FLUSHING;
    }

    // State transitions
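    bool TransitionTo(DecoderState new_state) {
        DecoderState expected = m_state.load();
        // Validate and swap atomically. A separate load() and store() would let
        // another thread change the state between the check and the write.
        // On failure, compare_exchange_weak reloads `expected`, so the next
        // iteration validates against the state actually being replaced.
        while (IsValidTransition(expected, new_state)) {
            if (m_state.compare_exchange_weak(expected, new_state)) {
                LOGF_DEBUG("[DecoderStateMachine] State transition: %s → %s",
                           StateToString(expected), StateToString(new_state));
                return true;
            }
        }
        LOGF_ERROR("[DecoderStateMachine] Invalid transition: %s → %s",
                   StateToString(expected), StateToString(new_state));
        return false;
    }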

    // Specific transition helpers
    void OnInitializeSuccess() {
        TransitionTo(DecoderState::READY);
    }

    void OnFirstPacket() {
        if (IsState(DecoderState::READY)) {
            TransitionTo(DecoderState::BUFFERING);
        }
    }

    void OnBufferingComplete(size_t queue_size) {
        if (IsState(DecoderState::BUFFERING) && queue_size > 0) {
            TransitionTo(DecoderState::DECODING);
        }
    }

    void OnEndOfFile() {
        if (IsState(DecoderState::DECODING) || IsState(DecoderState::BUFFERING)) {
            TransitionTo(DecoderState::FLUSHING);
        }
    }

    void OnFlushComplete() {
        if (IsState(DecoderState::FLUSHING)) {
            TransitionTo(DecoderState::FLUSH_COMPLETE);
        }
    }

    void OnError() {
        TransitionTo(DecoderState::ERROR);
    }

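    void OnReset() {
        // Reset() is a no-op when already READY (READY → READY is not a valid
        // transition) and must not bypass Initialize() from UNINITIALIZED.
        if (IsState(DecoderState::READY) || IsState(DecoderState::UNINITIALIZED)) {
            return;
        }
        TransitionTo(DecoderState::READY);
    }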

private:
    std::atomic<DecoderState> m_state;

    bool IsValidTransition(DecoderState from, DecoderState to) const {
        // Define valid state transitions
        switch (from) {
            case DecoderState::UNINITIALIZED:
                return to == DecoderState::READY || to == DecoderState::ERROR;
            case DecoderState::READY:
                return to == DecoderState::BUFFERING || to == DecoderState::ERROR;
            case DecoderState::BUFFERING:
                return to == DecoderState::DECODING || to == DecoderState::FLUSHING ||
                       to == DecoderState::ERROR || to == DecoderState::READY;
            case DecoderState::DECODING:
                return to == DecoderState::FLUSHING || to == DecoderState::ERROR ||
                       to == DecoderState::READY;
            case DecoderState::FLUSHING:
                return to == DecoderState::FLUSH_COMPLETE || to == DecoderState::ERROR ||
                       to == DecoderState::READY;
            case DecoderState::FLUSH_COMPLETE:
                return to == DecoderState::READY || to == DecoderState::ERROR;
            case DecoderState::ERROR:
                return to == DecoderState::READY;
            default:
                return false;
        }
    }

    const char* StateToString(DecoderState state) const {
        switch (state) {
            case DecoderState::UNINITIALIZED: return "UNINITIALIZED";
            case DecoderState::READY: return "READY";
            case DecoderState::BUFFERING: return "BUFFERING";
            case DecoderState::DECODING: return "DECODING";
            case DecoderState::FLUSHING: return "FLUSHING";
            case DecoderState::FLUSH_COMPLETE: return "FLUSH_COMPLETE";
            case DecoderState::ERROR: return "ERROR";
            default: return "UNKNOWN";
        }
    }
};
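
As a quick sanity check of the transition rules above, the following sketch walks a decoder through one full lifecycle and confirms that decoding cannot start before initialization. It uses a condensed stand-in for DecoderStateMachine (same transition table, no logging). MiniStateMachine, ApplyAll, WalkLifecycle and RunLifecycleChecks are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <initializer_list>

enum class DecoderState {
    UNINITIALIZED, READY, BUFFERING, DECODING, FLUSHING, FLUSH_COMPLETE, ERROR
};

// Condensed stand-in for DecoderStateMachine: same transition rules, no logging.
class MiniStateMachine {
public:
    DecoderState GetState() const { return m_state.load(); }

    bool TransitionTo(DecoderState next) {
        DecoderState current = m_state.load();
        // compare_exchange_weak refreshes `current` on failure, so the
        // validity check always applies to the state being replaced.
        while (IsValidTransition(current, next)) {
            if (m_state.compare_exchange_weak(current, next)) {
                return true;
            }
        }
        return false;
    }

private:
    std::atomic<DecoderState> m_state{DecoderState::UNINITIALIZED};

    static bool IsValidTransition(DecoderState from, DecoderState to) {
        switch (from) {
            case DecoderState::UNINITIALIZED:
                return to == DecoderState::READY || to == DecoderState::ERROR;
            case DecoderState::READY:
                return to == DecoderState::BUFFERING || to == DecoderState::ERROR;
            case DecoderState::BUFFERING:
                return to == DecoderState::DECODING || to == DecoderState::FLUSHING ||
                       to == DecoderState::ERROR || to == DecoderState::READY;
            case DecoderState::DECODING:
                return to == DecoderState::FLUSHING || to == DecoderState::ERROR ||
                       to == DecoderState::READY;
            case DecoderState::FLUSHING:
                return to == DecoderState::FLUSH_COMPLETE || to == DecoderState::ERROR ||
                       to == DecoderState::READY;
            case DecoderState::FLUSH_COMPLETE:
                return to == DecoderState::READY || to == DecoderState::ERROR;
            case DecoderState::ERROR:
                return to == DecoderState::READY;
        }
        return false;
    }
};

// Applies each transition in order; stops at the first one that is rejected.
bool ApplyAll(MiniStateMachine& sm, std::initializer_list<DecoderState> path) {
    for (DecoderState next : path) {
        if (!sm.TransitionTo(next)) {
            return false;
        }
    }
    return true;
}

// Happy path: initialize, buffer, decode, flush, then reset back to READY.
bool WalkLifecycle() {
    MiniStateMachine sm;
    return ApplyAll(sm, {DecoderState::READY, DecoderState::BUFFERING,
                         DecoderState::DECODING, DecoderState::FLUSHING,
                         DecoderState::FLUSH_COMPLETE, DecoderState::READY}) &&
           sm.GetState() == DecoderState::READY;
}

// Example checks; a real suite would use the project's test framework.
void RunLifecycleChecks() {
    assert(WalkLifecycle());
    MiniStateMachine fresh;
    assert(!fresh.TransitionTo(DecoderState::DECODING));  // Initialize() must come first
    assert(fresh.GetState() == DecoderState::UNINITIALIZED);
}
```

A rejected transition leaves the state untouched, which is what lets callers treat a false return as "nothing happened" rather than as a partially applied change.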

Refactored DecodeToSurface()

Before (Complex Branching):

bool DecodeToSurface(...) {
    // Step 1: Check if initialized
    if (!m_initialized) { ... }

    // Handle NULL packet_data as flush mode
    if (!packet_data || packet_size == 0) {
        m_endOfFileReached = true;
    }

    // Step 2: Submit packet
    if (m_endOfFileReached) {
        // Flush mode logic
    } else {
        // Normal mode logic
    }

    // Step 3: Check initial buffering
    if (m_displayQueue.empty() && !m_initialBufferingComplete) {
        // Buffering logic
    }
    if (!m_displayQueue.empty() && !m_initialBufferingComplete) {
        m_initialBufferingComplete = true;
    }

    // Step 4: Pop from display queue
    if (m_displayQueue.empty()) {
        if (m_endOfFileReached) {
            // Flush complete logic
        } else {
            // Error - queue empty unexpectedly
        }
    }

    // ... (continues; roughly 150 lines in total)
}

After (State Machine):

bool DecodeToSurface(const uint8_t* packet_data, size_t packet_size,
                     VavCoreSurfaceType target_type,
                     void* target_surface,
                     VideoFrame& output_frame) {
    // State validation
    if (!m_stateMachine.CanDecode()) {
        LOGF_ERROR("[DecodeToSurface] Invalid state: %s",
                   m_stateMachine.GetStateString());
        return false;
    }

    // Handle end-of-file
    if (!packet_data || packet_size == 0) {
        return HandleFlushMode(output_frame);
    }

    // Delegate to state-specific handler
    switch (m_stateMachine.GetState()) {
        case DecoderState::READY:
        case DecoderState::BUFFERING:
            return HandleBufferingMode(packet_data, packet_size, target_type,
                                        target_surface, output_frame);
        case DecoderState::DECODING:
            return HandleDecodingMode(packet_data, packet_size, target_type,
                                       target_surface, output_frame);
        default:
            LOGF_ERROR("[DecodeToSurface] Unexpected state: %s",
                       m_stateMachine.GetStateString());
            return false;
    }
}

Helper Methods (State-Specific Logic):

bool HandleBufferingMode(const uint8_t* packet_data, size_t packet_size,
                         VavCoreSurfaceType target_type,
                         void* target_surface,
                         VideoFrame& output_frame) {
    // Transition to buffering on first packet
    if (m_stateMachine.IsState(DecoderState::READY)) {
        m_stateMachine.OnFirstPacket();
    }

    // Submit packet to NVDEC
    if (!SubmitPacketToParser(packet_data, packet_size)) {
        return false;
    }

    // Check if buffering is complete
    {
        std::lock_guard<std::mutex> lock(m_displayMutex);
        if (m_displayQueue.empty()) {
            // Still buffering
            return false; // No frame yet: packet accepted (VAVCORE_PACKET_ACCEPTED), not an error
        } else {
            // Buffering complete
            m_stateMachine.OnBufferingComplete(m_displayQueue.size());
            // Fall through to decode the first frame
        }
    }

    return RetrieveAndRenderFrame(target_type, target_surface, output_frame);
}

bool HandleDecodingMode(const uint8_t* packet_data, size_t packet_size,
                        VavCoreSurfaceType target_type,
                        void* target_surface,
                        VideoFrame& output_frame) {
    // Submit packet to NVDEC
    if (!SubmitPacketToParser(packet_data, packet_size)) {
        return false;
    }

    // Retrieve and render frame
    return RetrieveAndRenderFrame(target_type, target_surface, output_frame);
}

bool HandleFlushMode(VideoFrame& output_frame) {
    // Transition to flushing if not already
    if (!m_stateMachine.IsState(DecoderState::FLUSHING)) {
        m_stateMachine.OnEndOfFile();
    }

    // Submit end-of-stream packet
    if (!SubmitFlushPacket()) {
        return false;
    }

    // Check if flush is complete
    {
        std::lock_guard<std::mutex> lock(m_displayMutex);
        if (m_displayQueue.empty()) {
            m_stateMachine.OnFlushComplete();
            return false; // No frames left: end of stream (VAVCORE_END_OF_STREAM), not an error
        }
    }

    // Still have frames to drain
    return RetrieveAndRenderFrame(...);
}

Removed/Consolidated State Variables

Before:

// 13+ state variables
std::atomic<bool> m_initialBufferingComplete{false};
std::atomic<bool> m_endOfFileReached{false};
std::atomic<bool> m_converterNeedsReinit{false};
std::atomic<uint64_t> m_submissionCounter{0};
std::atomic<uint64_t> m_returnCounter{0};
std::atomic<bool> m_pollingRunning{false};
std::mutex m_frameQueueMutex;
std::mutex m_cudaContextMutex;
std::mutex m_submissionMutex;
std::mutex m_displayMutex;
std::queue<int> m_displayQueue;
FrameSlot m_frameSlots[16]; // Each has 5 atomic flags

After:

// Single state machine + minimal supporting variables
DecoderStateMachine m_stateMachine;

// Still needed (but usage clarified by state machine):
std::mutex m_displayMutex;
std::queue<int> m_displayQueue;
FrameSlot m_frameSlots[16]; // Frame-specific state (not global decoder state)
std::atomic<uint64_t> m_submissionCounter{0}; // Submission ordering
std::mutex m_submissionMutex;

Eliminated:

  • m_initialBufferingComplete → Replaced by DecoderState::BUFFERING vs DECODING
  • m_endOfFileReached → Replaced by DecoderState::FLUSHING
  • m_converterNeedsReinit → Moved to NV12ToRGBAConverter internal state
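
While call sites are being migrated, code that still asks the old questions ("is initial buffering done?", "has end-of-file been seen?") could be served by small predicates derived from the state. The following is a minimal sketch under that assumption. The function names are hypothetical, and the treatment of a decoder that entered FLUSHING directly from BUFFERING is a simplification, because the single enum no longer records whether buffering had completed before the flush.

```cpp
enum class DecoderState {
    UNINITIALIZED, READY, BUFFERING, DECODING, FLUSHING, FLUSH_COMPLETE, ERROR
};

// Hypothetical shim for the old m_initialBufferingComplete flag.
// Assumption: any state past BUFFERING counts as "buffering complete",
// including FLUSHING reached directly from BUFFERING.
constexpr bool InitialBufferingComplete(DecoderState s) {
    return s == DecoderState::DECODING || s == DecoderState::FLUSHING ||
           s == DecoderState::FLUSH_COMPLETE;
}

// Hypothetical shim for the old m_endOfFileReached flag.
constexpr bool EndOfFileReached(DecoderState s) {
    return s == DecoderState::FLUSHING || s == DecoderState::FLUSH_COMPLETE;
}

// Compile-time spot checks of the mapping.
static_assert(!InitialBufferingComplete(DecoderState::BUFFERING), "still buffering");
static_assert(InitialBufferingComplete(DecoderState::DECODING), "buffering done");
static_assert(!EndOfFileReached(DecoderState::DECODING), "mid-stream");
static_assert(EndOfFileReached(DecoderState::FLUSHING), "draining after EOF");
```

Because both predicates are pure functions of the state, they cannot drift out of sync with each other the way two independent atomic flags can.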

Benefits

1. Complexity Reduction

  • 13+ state variables → 1 state machine with 7 well-defined states
  • 9+ conditional branches → State-driven dispatch (1 switch statement)
  • ~150 lines → ~40 lines per state handler (modular functions)

2. Improved Maintainability

  • Clear state transitions with validation (illegal transitions are rejected and logged)
  • State-specific logic isolated in dedicated functions
  • Easy debugging with state transition logging

3. Better Testability

  • Test individual states independently
  • Verify state transitions explicitly
  • Mock state machine for unit tests

4. Enhanced Readability

  • Self-documenting code (state names describe decoder status)
  • Linear flow instead of nested conditions
  • Clear intent from state-specific handler names

Implementation Plan

Phase 1: Create State Machine Class (CURRENT)

  • Design state machine enum and transitions
  • Implement DecoderStateMachine class
  • Add state transition logging

Phase 2: Extract Helper Methods

  • Create SubmitPacketToParser()
  • Create RetrieveAndRenderFrame()
  • Create SubmitFlushPacket()

Phase 3: Refactor DecodeToSurface()

  • Replace state flags with state machine
  • Implement HandleBufferingMode()
  • Implement HandleDecodingMode()
  • Implement HandleFlushMode()

Phase 4: Update Other Methods

  • Update Initialize() → call m_stateMachine.OnInitializeSuccess()
  • Update Reset() → call m_stateMachine.OnReset()
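  • Update Cleanup() → return the state machine to UNINITIALIZED (IsValidTransition() currently rejects every transition into UNINITIALIZED, so this requires adding that transition or a dedicated cleanup helper)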

Phase 5: Remove Obsolete State Variables

  • Remove m_initialBufferingComplete
  • Remove m_endOfFileReached
  • Verify no regressions with existing tests

Testing Strategy

Unit Tests

  • State transition validation (legal/illegal transitions)
  • State-specific handler behavior
  • Error state recovery
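
One way to approach the transition-validation tests is to check structural invariants of the transition table instead of enumerating every pair by hand. The sketch below is illustrative only: the edge list is copied from the switch in IsValidTransition(), while CheckInvariants, ReferenceIsValid and AllowEverything are hypothetical names. Since IsValidTransition() is private in the design above, a real test would need it exposed to tests or would have to drive TransitionTo() instead.

```cpp
#include <cassert>

enum class DecoderState {
    UNINITIALIZED, READY, BUFFERING, DECODING, FLUSHING, FLUSH_COMPLETE, ERROR
};
using S = DecoderState;

constexpr S kAllStates[] = {S::UNINITIALIZED, S::READY, S::BUFFERING, S::DECODING,
                            S::FLUSHING, S::FLUSH_COMPLETE, S::ERROR};

struct Edge { S from; S to; };

// Legal edges, copied from the switch in IsValidTransition().
constexpr Edge kLegalEdges[] = {
    {S::UNINITIALIZED, S::READY},     {S::UNINITIALIZED, S::ERROR},
    {S::READY, S::BUFFERING},         {S::READY, S::ERROR},
    {S::BUFFERING, S::DECODING},      {S::BUFFERING, S::FLUSHING},
    {S::BUFFERING, S::ERROR},         {S::BUFFERING, S::READY},
    {S::DECODING, S::FLUSHING},       {S::DECODING, S::ERROR},
    {S::DECODING, S::READY},
    {S::FLUSHING, S::FLUSH_COMPLETE}, {S::FLUSHING, S::ERROR},
    {S::FLUSHING, S::READY},
    {S::FLUSH_COMPLETE, S::READY},    {S::FLUSH_COMPLETE, S::ERROR},
    {S::ERROR, S::READY},
};

// Reference predicate: a transition is valid if it appears in the edge list.
bool ReferenceIsValid(S from, S to) {
    for (const Edge& e : kLegalEdges) {
        if (e.from == from && e.to == to) return true;
    }
    return false;
}

// Deliberately broken predicate, used to show that the checks can fail.
bool AllowEverything(S, S) { return true; }

// Returns true if `is_valid` satisfies the invariants of the current design.
template <typename Pred>
bool CheckInvariants(Pred is_valid) {
    for (S from : kAllStates) {
        // No state may transition to itself.
        if (is_valid(from, from)) return false;
        // Nothing re-enters UNINITIALIZED (per the current table; revisit if
        // Cleanup() gains a path back to UNINITIALIZED).
        if (is_valid(from, S::UNINITIALIZED)) return false;
        // ERROR is reachable from every state except ERROR itself.
        if (is_valid(from, S::ERROR) != (from != S::ERROR)) return false;
    }
    // ERROR can only be left through Reset(), which leads to READY.
    for (S to : kAllStates) {
        if (is_valid(S::ERROR, to) != (to == S::READY)) return false;
    }
    return true;
}

// Example checks; a real suite would use the project's test framework.
void RunTransitionTableChecks() {
    assert(CheckInvariants(ReferenceIsValid));
    assert(!CheckInvariants(AllowEverything));
}
```

Invariant-style checks tend to survive small edits to the table, so a new legal edge only breaks the test when it violates one of the stated rules.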

Integration Tests

  • Full decode pipeline with state transitions
  • Edge cases (empty files, flush mode, errors)
  • Multi-threaded decoding with state machine
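
For the multi-threaded case, a useful property to test is that a transition is applied exactly once even when several threads attempt it at the same moment. The sketch below races threads on the READY → BUFFERING edge using a compare-and-swap on a bare std::atomic<DecoderState>. It is a simplified illustration, not the decoder's actual threading model, and TryStartBuffering and CountFirstPacketWinners are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

enum class DecoderState {
    UNINITIALIZED, READY, BUFFERING, DECODING, FLUSHING, FLUSH_COMPLETE, ERROR
};

// Only the READY -> BUFFERING edge is modelled here. The compare-and-swap
// succeeds for exactly one caller; every later caller sees BUFFERING and fails.
bool TryStartBuffering(std::atomic<DecoderState>& state) {
    DecoderState expected = DecoderState::READY;
    return state.compare_exchange_strong(expected, DecoderState::BUFFERING);
}

// Launches `thread_count` threads that all attempt the first-packet
// transition and returns how many of them performed it.
int CountFirstPacketWinners(std::size_t thread_count) {
    std::atomic<DecoderState> state{DecoderState::READY};
    std::atomic<int> winners{0};

    std::vector<std::thread> threads;
    threads.reserve(thread_count);
    for (std::size_t i = 0; i < thread_count; ++i) {
        threads.emplace_back([&state, &winners] {
            if (TryStartBuffering(state)) {
                winners.fetch_add(1);
            }
        });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return winners.load();
}

// Example check; a real suite would use the project's test framework.
void RunRaceCheck() {
    assert(CountFirstPacketWinners(8) == 1);
}
```

A transition implemented as a separate load() followed by store() can let more than one thread believe it performed the same transition, which is the kind of race this test is meant to catch.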

Regression Tests

  • Existing RedSurfaceNVDECTest
  • Vav2PlayerHeadless tests
  • Vav2Player GUI tests

Status: Design complete, implementation in progress

Last Updated: 2025-10-11