# NVDEC Decoder State Machine Refactoring Design
## Problem Statement
The current `NVDECAV1Decoder::DecodeToSurface()` has excessive complexity:
- **13+ state variables** tracked across multiple atomic flags and mutexes
- **9+ conditional branches** with nested conditions
- **~150 lines** in a single function
- **High branching complexity** (up to 2^9 = 512 possible code paths if the 9 branches are independent)
This makes the code:
- Hard to maintain and debug
- Difficult to test comprehensively
- Prone to race conditions and edge cases
- Challenging to extend with new features
## Solution: State Machine Pattern
### Core Design Principle
**Consolidate all decoder state into a single enum** with clear transitions, replacing scattered atomic flags and conditional checks.
### State Machine States
```cpp
enum class DecoderState {
    UNINITIALIZED,   // Before Initialize() is called
    READY,           // Initialized and ready for decoding
    BUFFERING,       // Initial buffering (0-15 frames)
    DECODING,        // Normal frame-by-frame decoding
    FLUSHING,        // End-of-file reached, draining DPB
    FLUSH_COMPLETE,  // All frames drained
    // Note: the name ERROR collides with the ERROR macro from <wingdi.h>
    // when <windows.h> is included without NOGDI.
    ERROR            // Error state; the only way out is Reset()
};
```
### State Transitions
```
UNINITIALIZED  → READY           (Initialize() called successfully)
READY          → BUFFERING       (First DecodeToSurface() call)
BUFFERING      → DECODING        (Display queue has frames)
BUFFERING      → FLUSHING        (End-of-file reached before buffering completes)
DECODING       → FLUSHING        (End-of-file reached, NULL packet)
FLUSHING       → FLUSH_COMPLETE  (Display queue empty)
BUFFERING / DECODING / FLUSHING / FLUSH_COMPLETE → READY (Reset() called)
*              → ERROR           (Any state except ERROR can transition to ERROR on failure)
ERROR          → READY           (Reset() called)
```
### State Machine Class
```cpp
#include <atomic>

class DecoderStateMachine {
public:
    DecoderStateMachine() : m_state(DecoderState::UNINITIALIZED) {}

    // State queries
    DecoderState GetState() const { return m_state.load(); }
    const char* GetStateString() const { return StateToString(m_state.load()); }
    bool IsState(DecoderState state) const { return m_state.load() == state; }
    bool CanDecode() const {
        auto state = m_state.load();
        return state == DecoderState::READY ||
               state == DecoderState::BUFFERING ||
               state == DecoderState::DECODING ||
               state == DecoderState::FLUSHING;
    }

    // State transitions
    bool TransitionTo(DecoderState new_state) {
        DecoderState expected = m_state.load();
        // Compare-and-swap so that a concurrent transition cannot slip in between
        // the validity check and the store. On failure, `expected` is refreshed
        // with the current state and the transition is validated again.
        while (IsValidTransition(expected, new_state)) {
            if (m_state.compare_exchange_weak(expected, new_state)) {
                LOGF_DEBUG("[DecoderStateMachine] State transition: %s → %s",
                           StateToString(expected), StateToString(new_state));
                return true;
            }
        }
        LOGF_ERROR("[DecoderStateMachine] Invalid transition: %s → %s",
                   StateToString(expected), StateToString(new_state));
        return false;
    }

    // Specific transition helpers
    void OnInitializeSuccess() {
        TransitionTo(DecoderState::READY);
    }
    void OnFirstPacket() {
        if (IsState(DecoderState::READY)) {
            TransitionTo(DecoderState::BUFFERING);
        }
    }
    void OnBufferingComplete(size_t queue_size) {
        if (IsState(DecoderState::BUFFERING) && queue_size > 0) {
            TransitionTo(DecoderState::DECODING);
        }
    }
    void OnEndOfFile() {
        if (IsState(DecoderState::DECODING) || IsState(DecoderState::BUFFERING)) {
            TransitionTo(DecoderState::FLUSHING);
        }
    }
    void OnFlushComplete() {
        if (IsState(DecoderState::FLUSHING)) {
            TransitionTo(DecoderState::FLUSH_COMPLETE);
        }
    }
    void OnError() {
        TransitionTo(DecoderState::ERROR);
    }
    void OnReset() {
        TransitionTo(DecoderState::READY);
    }

private:
    std::atomic<DecoderState> m_state;

    bool IsValidTransition(DecoderState from, DecoderState to) const {
        // Define valid state transitions
        switch (from) {
            case DecoderState::UNINITIALIZED:
                return to == DecoderState::READY || to == DecoderState::ERROR;
            case DecoderState::READY:
                return to == DecoderState::BUFFERING || to == DecoderState::ERROR;
            case DecoderState::BUFFERING:
                return to == DecoderState::DECODING || to == DecoderState::FLUSHING ||
                       to == DecoderState::ERROR || to == DecoderState::READY;
            case DecoderState::DECODING:
                return to == DecoderState::FLUSHING || to == DecoderState::ERROR ||
                       to == DecoderState::READY;
            case DecoderState::FLUSHING:
                return to == DecoderState::FLUSH_COMPLETE || to == DecoderState::ERROR ||
                       to == DecoderState::READY;
            case DecoderState::FLUSH_COMPLETE:
                return to == DecoderState::READY || to == DecoderState::ERROR;
            case DecoderState::ERROR:
                return to == DecoderState::READY;
            default:
                return false;
        }
    }

    const char* StateToString(DecoderState state) const {
        switch (state) {
            case DecoderState::UNINITIALIZED: return "UNINITIALIZED";
            case DecoderState::READY: return "READY";
            case DecoderState::BUFFERING: return "BUFFERING";
            case DecoderState::DECODING: return "DECODING";
            case DecoderState::FLUSHING: return "FLUSHING";
            case DecoderState::FLUSH_COMPLETE: return "FLUSH_COMPLETE";
            case DecoderState::ERROR: return "ERROR";
            default: return "UNKNOWN";
        }
    }
};
```
## Refactored DecodeToSurface()
### Before (Complex Branching):
```cpp
bool DecodeToSurface(...) {
    // Step 1: Check if initialized
    if (!m_initialized) { ... }

    // Handle NULL packet_data as flush mode
    if (!packet_data || packet_size == 0) {
        m_endOfFileReached = true;
    }

    // Step 2: Submit packet
    if (m_endOfFileReached) {
        // Flush mode logic
    } else {
        // Normal mode logic
    }

    // Step 3: Check initial buffering
    if (m_displayQueue.empty() && !m_initialBufferingComplete) {
        // Buffering logic
    }
    if (!m_displayQueue.empty() && !m_initialBufferingComplete) {
        m_initialBufferingComplete = true;
    }

    // Step 4: Pop from display queue
    if (m_displayQueue.empty()) {
        if (m_endOfFileReached) {
            // Flush complete logic
        } else {
            // Error - queue empty unexpectedly
        }
    }
    // ... (continues; ~150 lines in total)
}
```
### After (State Machine):
```cpp
bool DecodeToSurface(const uint8_t* packet_data, size_t packet_size,
                     VavCoreSurfaceType target_type,
                     void* target_surface,
                     VideoFrame& output_frame) {
    // State validation
    if (!m_stateMachine.CanDecode()) {
        LOGF_ERROR("[DecodeToSurface] Invalid state: %s",
                   m_stateMachine.GetStateString());
        return false;
    }

    // Handle end-of-file
    if (!packet_data || packet_size == 0) {
        return HandleFlushMode(output_frame);
    }

    // Delegate to state-specific handler
    switch (m_stateMachine.GetState()) {
        case DecoderState::READY:
        case DecoderState::BUFFERING:
            return HandleBufferingMode(packet_data, packet_size, target_type,
                                       target_surface, output_frame);
        case DecoderState::DECODING:
            return HandleDecodingMode(packet_data, packet_size, target_type,
                                      target_surface, output_frame);
        default:
            LOGF_ERROR("[DecodeToSurface] Unexpected state in DecodeToSurface");
            return false;
    }
}
```
### Helper Methods (State-Specific Logic):
```cpp
bool HandleBufferingMode(const uint8_t* packet_data, size_t packet_size,
                         VavCoreSurfaceType target_type,
                         void* target_surface,
                         VideoFrame& output_frame) {
    // Transition to buffering on first packet
    if (m_stateMachine.IsState(DecoderState::READY)) {
        m_stateMachine.OnFirstPacket();
    }

    // Submit packet to NVDEC
    if (!SubmitPacketToParser(packet_data, packet_size)) {
        return false;
    }

    // Check if buffering is complete
    {
        std::lock_guard<std::mutex> lock(m_displayMutex);
        if (m_displayQueue.empty()) {
            // Still buffering: no frame yet, but this is not an error
            return false; // VAVCORE_PACKET_ACCEPTED
        } else {
            // Buffering complete
            m_stateMachine.OnBufferingComplete(m_displayQueue.size());
            // Continue below to retrieve the first frame
        }
    }
    return RetrieveAndRenderFrame(target_type, target_surface, output_frame);
}

bool HandleDecodingMode(const uint8_t* packet_data, size_t packet_size,
                        VavCoreSurfaceType target_type,
                        void* target_surface,
                        VideoFrame& output_frame) {
    // Submit packet to NVDEC
    if (!SubmitPacketToParser(packet_data, packet_size)) {
        return false;
    }

    // Retrieve and render frame
    return RetrieveAndRenderFrame(target_type, target_surface, output_frame);
}

bool HandleFlushMode(VideoFrame& output_frame) {
    // Transition to flushing if not already.
    // Note: OnEndOfFile() is a no-op from READY (flush requested before any
    // packet was submitted), so in that case the state stays READY.
    if (!m_stateMachine.IsState(DecoderState::FLUSHING)) {
        m_stateMachine.OnEndOfFile();
    }

    // Submit end-of-stream packet
    if (!SubmitFlushPacket()) {
        return false;
    }

    // Check if flush is complete
    {
        std::lock_guard<std::mutex> lock(m_displayMutex);
        if (m_displayQueue.empty()) {
            m_stateMachine.OnFlushComplete();
            return false; // VAVCORE_END_OF_STREAM (no more frames, not an error)
        }
    }

    // Still have frames to drain.
    // Arguments are elided in this sketch; target_type and target_surface
    // would also have to be passed into this handler.
    return RetrieveAndRenderFrame(...);
}
```
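The `// VAVCORE_PACKET_ACCEPTED` and `// VAVCORE_END_OF_STREAM` comments above mark cases where the handlers return `false` without an actual failure, which a plain `bool` cannot express. The following is a minimal sketch of one way to keep those cases apart. `DecodeOutcome`, `ClassifyOutcome()`, and `IsError()` are hypothetical names introduced here for illustration only; they are not part of the existing VavCore API, and the real mapping to `VAVCORE_*` codes is left open.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type (illustrative names, not part of VavCore).
enum class DecodeOutcome {
    FrameReady,      // At least one frame is queued and can be retrieved
    PacketAccepted,  // Packet consumed, no frame available yet (still buffering)
    EndOfStream,     // Flush finished, display queue is empty
    Failed           // Packet or flush submission failed
};

// Classifies one handler call from three observations: whether the submission
// succeeded, whether the decoder is flushing, and how many frames are queued.
constexpr DecodeOutcome ClassifyOutcome(bool submit_ok, bool flushing,
                                        std::size_t queued_frames) {
    if (!submit_ok) return DecodeOutcome::Failed;
    if (queued_frames > 0) return DecodeOutcome::FrameReady;
    return flushing ? DecodeOutcome::EndOfStream : DecodeOutcome::PacketAccepted;
}

// Only a real failure should be reported to the caller as an error.
constexpr bool IsError(DecodeOutcome outcome) {
    return outcome == DecodeOutcome::Failed;
}

// Spot checks mirroring the three handlers above.
inline void CheckOutcomeClassification() {
    assert(ClassifyOutcome(true, false, 0) == DecodeOutcome::PacketAccepted);  // buffering
    assert(ClassifyOutcome(true, true, 0) == DecodeOutcome::EndOfStream);      // flush done
    assert(ClassifyOutcome(true, true, 3) == DecodeOutcome::FrameReady);       // draining
    assert(IsError(ClassifyOutcome(false, false, 5)));                         // submit failed
}
```

With a result type along these lines, "still buffering" and "flush complete" would no longer look like errors to the caller of `DecodeToSurface()`.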
## Removed/Consolidated State Variables
### Before:
```cpp
// 13+ state variables
std::atomic<bool> m_initialBufferingComplete{false};
std::atomic<bool> m_endOfFileReached{false};
std::atomic<bool> m_converterNeedsReinit{false};
std::atomic<uint64_t> m_submissionCounter{0};
std::atomic<uint64_t> m_returnCounter{0};
std::atomic<bool> m_pollingRunning{false};
std::mutex m_frameQueueMutex;
std::mutex m_cudaContextMutex;
std::mutex m_submissionMutex;
std::mutex m_displayMutex;
std::queue<int> m_displayQueue;
FrameSlot m_frameSlots[16]; // Each has 5 atomic flags
```
### After:
```cpp
// Single state machine + minimal supporting variables
DecoderStateMachine m_stateMachine;
// Still needed (but usage clarified by state machine):
std::mutex m_displayMutex;
std::queue<int> m_displayQueue;
FrameSlot m_frameSlots[16]; // Frame-specific state (not global decoder state)
std::atomic<uint64_t> m_submissionCounter{0}; // Submission ordering
std::mutex m_submissionMutex;
```
**Eliminated:**
- `m_initialBufferingComplete` → Replaced by `DecoderState::BUFFERING` vs `DECODING`
- `m_endOfFileReached` → Replaced by `DecoderState::FLUSHING`
- `m_converterNeedsReinit` → Moved to NV12ToRGBAConverter internal state
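
During migration, code that still reads the old flags could derive them from the state. The sketch below illustrates that mapping under stated assumptions: `InitialBufferingComplete()` and `EndOfFileReached()` are hypothetical helper names, and treating the flush states as "past buffering" is an assumption about the old flag's semantics rather than something the current code confirms.

```cpp
#include <cassert>

enum class DecoderState {
    UNINITIALIZED, READY, BUFFERING, DECODING, FLUSHING, FLUSH_COMPLETE, ERROR
};

// Stand-in for the old m_initialBufferingComplete flag (hypothetical helper).
// Assumption: once the decoder is decoding or flushing, buffering is over.
constexpr bool InitialBufferingComplete(DecoderState state) {
    return state == DecoderState::DECODING ||
           state == DecoderState::FLUSHING ||
           state == DecoderState::FLUSH_COMPLETE;
}

// Stand-in for the old m_endOfFileReached flag (hypothetical helper).
constexpr bool EndOfFileReached(DecoderState state) {
    return state == DecoderState::FLUSHING ||
           state == DecoderState::FLUSH_COMPLETE;
}

// Spot checks of the mapping.
inline void CheckFlagMapping() {
    assert(!InitialBufferingComplete(DecoderState::BUFFERING));
    assert(InitialBufferingComplete(DecoderState::DECODING));
    assert(!EndOfFileReached(DecoderState::DECODING));
    assert(EndOfFileReached(DecoderState::FLUSHING));
}
```

Because both helpers are pure functions of a single state value, the two old flags can no longer disagree with each other, which is the main point of the consolidation.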
## Benefits
### 1. Complexity Reduction
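- **13+ state variables → 1 state machine** with 7 well-defined states, plus a handful of supporting variables (display queue, frame slots, submission counter, two mutexes)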
- **9+ conditional branches → State-driven dispatch** (1 switch statement)
- **~150 lines → ~40 lines** per state handler (modular functions)
### 2. Improved Maintainability
- **Clear state transitions** with validation (illegal transitions are rejected and logged)
- **State-specific logic** isolated in dedicated functions
- **Easy debugging** with state transition logging
### 3. Better Testability
- **Test individual states** independently
- **Verify state transitions** explicitly
- **Mock state machine** for unit tests
### 4. Enhanced Readability
- **Self-documenting code** (state names describe decoder status)
- **Linear flow** instead of nested conditions
- **Clear intent** from state-specific handler names
## Implementation Plan
### Phase 1: Create State Machine Class (CURRENT)
- [x] Design state machine enum and transitions
- [ ] Implement DecoderStateMachine class
- [ ] Add state transition logging
### Phase 2: Extract Helper Methods
- [ ] Create `SubmitPacketToParser()`
- [ ] Create `RetrieveAndRenderFrame()`
- [ ] Create `SubmitFlushPacket()`
### Phase 3: Refactor DecodeToSurface()
- [ ] Replace state flags with state machine
- [ ] Implement `HandleBufferingMode()`
- [ ] Implement `HandleDecodingMode()`
- [ ] Implement `HandleFlushMode()`
### Phase 4: Update Other Methods
- [ ] Update `Initialize()` → call `m_stateMachine.OnInitializeSuccess()`
- [ ] Update `Reset()` → call `m_stateMachine.OnReset()`
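- [ ] Update `Cleanup()` → call `m_stateMachine.TransitionTo(DecoderState::UNINITIALIZED)` (requires adding a `* → UNINITIALIZED` rule to `IsValidTransition()`, which currently rejects every transition into `UNINITIALIZED`)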
### Phase 5: Remove Obsolete State Variables
- [ ] Remove `m_initialBufferingComplete`
- [ ] Remove `m_endOfFileReached`
- [ ] Verify no regressions with existing tests
## Testing Strategy
### Unit Tests
- State transition validation (legal/illegal transitions)
- State-specific handler behavior
- Error state recovery
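
A minimal sketch of what the transition-validation test could look like. The rules are restated from `IsValidTransition()` in this design as a free `constexpr` function so the example compiles on its own; `CountLegalTargets()` and `TestTransitionRules()` are hypothetical names, and plain `assert` stands in for whatever test framework the project actually uses.

```cpp
#include <cassert>
#include <cstddef>

enum class DecoderState {
    UNINITIALIZED, READY, BUFFERING, DECODING, FLUSHING, FLUSH_COMPLETE, ERROR
};
constexpr std::size_t kStateCount = 7;

// Same rules as IsValidTransition() in the state machine class, restated here.
constexpr bool IsValidTransition(DecoderState from, DecoderState to) {
    switch (from) {
        case DecoderState::UNINITIALIZED:
            return to == DecoderState::READY || to == DecoderState::ERROR;
        case DecoderState::READY:
            return to == DecoderState::BUFFERING || to == DecoderState::ERROR;
        case DecoderState::BUFFERING:
            return to == DecoderState::DECODING || to == DecoderState::FLUSHING ||
                   to == DecoderState::ERROR || to == DecoderState::READY;
        case DecoderState::DECODING:
        case DecoderState::FLUSHING:
            // Both states share ERROR and READY; they differ in the forward step.
            return to == DecoderState::ERROR || to == DecoderState::READY ||
                   to == (from == DecoderState::DECODING ? DecoderState::FLUSHING
                                                         : DecoderState::FLUSH_COMPLETE);
        case DecoderState::FLUSH_COMPLETE:
            return to == DecoderState::READY || to == DecoderState::ERROR;
        case DecoderState::ERROR:
            return to == DecoderState::READY;
    }
    return false;
}

// Counts how many target states are reachable from `from` in one step.
constexpr int CountLegalTargets(DecoderState from) {
    int count = 0;
    for (std::size_t i = 0; i < kStateCount; ++i) {
        if (IsValidTransition(from, static_cast<DecoderState>(i))) ++count;
    }
    return count;
}

inline void TestTransitionRules() {
    // Legal: the normal decode lifecycle.
    assert(IsValidTransition(DecoderState::UNINITIALIZED, DecoderState::READY));
    assert(IsValidTransition(DecoderState::READY, DecoderState::BUFFERING));
    assert(IsValidTransition(DecoderState::BUFFERING, DecoderState::DECODING));
    assert(IsValidTransition(DecoderState::DECODING, DecoderState::FLUSHING));
    assert(IsValidTransition(DecoderState::FLUSHING, DecoderState::FLUSH_COMPLETE));
    // Illegal: skipping buffering, or leaving ERROR without a reset.
    assert(!IsValidTransition(DecoderState::READY, DecoderState::DECODING));
    assert(!IsValidTransition(DecoderState::ERROR, DecoderState::DECODING));
    // Error recovery: READY is the only exit from ERROR.
    assert(CountLegalTargets(DecoderState::ERROR) == 1);
}
```

Looping over all 7 × 7 state pairs, as `CountLegalTargets()` does for one row, is cheap enough that a test can pin down the full transition table and catch accidental rule changes.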
### Integration Tests
- Full decode pipeline with state transitions
- Edge cases (empty files, flush mode, errors)
- Multi-threaded decoding with state machine
### Regression Tests
- Existing RedSurfaceNVDECTest
- Vav2PlayerHeadless tests
- Vav2Player GUI tests
---
**Status**: Design complete, implementation in progress
**Last Updated**: 2025-10-11