# NVDECAV1Decoder C++ Refactoring Design

**Date**: 2025-10-03
**Status**: Design Phase
**Goal**: Refactor NVDECAV1Decoder internal C++ code for readability and maintainability

---

## Problem Analysis

### Current State
- **File**: `vav2/platforms/windows/vavcore/src/Decoder/NVDECAV1Decoder.cpp`
- **Lines**: 1,722 lines (too large)
- **Main Method**: `DecodeToSurface()` is 500+ lines with deeply nested logic

### Key Issues
1. **Monolithic Method**: `DecodeToSurface()` handles CPU, D3D11, D3D12, and CUDA surfaces in one giant function
2. **Mixed Responsibilities**: Decoding, surface copying, memory management, and fence signaling are all interleaved
3. **Hard to Debug**: Pitch/stride bugs are difficult to trace through the complex nesting
4. **Difficult to Test**: Individual components cannot be unit tested in isolation
5. **Poor Readability**: Excessive debug logging makes the logic hard to follow

---

## Design Goals

### Primary Goals
1. **Readability**: Each method should do ONE thing clearly
2. **Maintainability**: Easy to locate and fix bugs (like the current NV12 stride issue)
3. **Testability**: Each component can be tested independently
4. **Performance**: No added overhead - use inline functions where appropriate

### Non-Goals
- NOT creating a C API (VavCore already provides that)
- NOT changing the external interface of NVDECAV1Decoder
- NOT over-engineering with complex patterns

---

## Proposed Architecture

### File Structure

```
NVDECAV1Decoder.h          (Public interface - unchanged)
NVDECAV1Decoder.cpp        (Main decoder - ~600 lines)
  └── Uses helper classes below

D3D12SurfaceHandler.h      (D3D12-specific logic - ~300 lines)
D3D12SurfaceHandler.cpp
  ├── CopyNV12Frame()
  ├── GetD3D12CUDAPointer()
  └── SignalD3D12Fence()

ExternalMemoryCache.h      (CUDA-D3D12 interop cache - ~200 lines)
ExternalMemoryCache.cpp
  ├── GetOrCreateExternalMemory()
  ├── Release()
  └── ReleaseAll()
```

### Class Diagram

```
NVDECAV1Decoder (Main decoder)
  ├── CUvideodecoder m_decoder
  ├── CUvideoparser m_parser
  ├── CUcontext m_cudaContext
  ├── std::unique_ptr<D3D12SurfaceHandler> m_d3d12Handler (on-demand)
  └── ExternalMemoryCache* m_memoryCache (on-demand)

D3D12SurfaceHandler
  ├── ID3D12Device* m_device
  ├── CUcontext m_cudaContext
  ├── std::unique_ptr<ExternalMemoryCache> m_cache
  └── Methods:
      ├── CopyNV12Frame(src, srcPitch, dst, width, height)
      ├── GetD3D12CUDAPointer(ID3D12Resource*, out_ptr)
      └── SignalD3D12Fence(value)

ExternalMemoryCache
  ├── std::map<ID3D12Resource*, CachedEntry>
  └── Methods:
      ├── GetOrCreateExternalMemory(resource, out_ptr)
      ├── Release(resource)
      └── ReleaseAll()
```

---

## Refactored Code Structure

### 1. NVDECAV1Decoder.cpp (Main decoder - simplified)

**Before**: 500+ lines in DecodeToSurface()

**After**: Clean routing logic

```cpp
bool NVDECAV1Decoder::DecodeToSurface(const uint8_t* packet_data, size_t packet_size,
                                      void* target_surface, SurfaceType target_type)
{
    // Step 1: Decode packet to NVDEC internal buffer
    if (!DecodePacket(packet_data, packet_size)) {
        return false;
    }

    // Step 2: Get decoded frame info (maps the frame on success)
    DecodedFrameInfo frame_info;
    if (!GetDecodedFrame(&frame_info)) {
        return false;
    }

    // Step 3: Copy to target surface based on type
    bool result = false;
    switch (target_type) {
    case SURFACE_TYPE_CPU:
        result = CopyToCPUSurface(frame_info, target_surface);
        break;
    case SURFACE_TYPE_D3D12:
        result = CopyToD3D12Surface(frame_info, target_surface);
        break;
    case SURFACE_TYPE_D3D11:
        result = CopyToD3D11Surface(frame_info, target_surface);
        break;
    case SURFACE_TYPE_CUDA:
        result = CopyToCUDASurface(frame_info, target_surface);
        break;
    default:
        // Unknown surface type: fall through to cleanup with result == false
        LogError("Unsupported surface type: %d", static_cast<int>(target_type));
        break;
    }

    // Step 4: Cleanup (unmap the frame mapped in Step 2, even if the copy failed)
    cuvidUnmapVideoFrame(m_decoder, frame_info.device_ptr);
    return result;
}
```
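
Step 4 above runs only when control reaches the end of the function, so any early return added later between Step 2 and Step 4 would skip the unmap. One possible way to tie the cleanup to the scope is a small guard object. The following is a minimal sketch under stated assumptions, not part of the proposed design: `ScopeExit` and `CopyWithGuaranteedUnmap` are hypothetical names, and the counter stands in for the real `cuvidUnmapVideoFrame` call.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not in the original design): runs a callable when the
// guard goes out of scope, so cleanup happens on every return path.
template <typename F>
class ScopeExit {
public:
    explicit ScopeExit(F fn) : m_fn(std::move(fn)) {}
    ~ScopeExit() { m_fn(); }

    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;

private:
    F m_fn;
};

// Stand-in for Steps 3 and 4: `unmap_count` is incremented where the real
// code would unmap the frame.
inline bool CopyWithGuaranteedUnmap(bool copy_succeeds, int& unmap_count)
{
    auto unmap = [&unmap_count] { ++unmap_count; };
    ScopeExit<decltype(unmap)> unmap_guard(unmap);

    if (!copy_succeeds) {
        return false;  // early return: the guard still runs
    }
    return true;
}
```

Whether this is worth the extra type is a judgment call; with a single exit point, as in the routing function above, the explicit call is equally correct.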

### 2. Private Helper Methods (in NVDECAV1Decoder.cpp)

```cpp
// Decode packet using cuvidParseVideoData
// Returns: true on success
// Complexity: ~30 lines
private:
bool DecodePacket(const uint8_t* data, size_t size)
{
    CUVIDSOURCEDATAPACKET packet = {};  // zero-initialized, so timestamp is 0
    packet.payload = data;
    packet.payload_size = static_cast<unsigned long>(size);
    packet.flags = CUVID_PKT_TIMESTAMP;

    CUresult result = cuvidParseVideoData(m_parser, &packet);
    if (result != CUDA_SUCCESS) {
        LogError("cuvidParseVideoData failed: %d", result);
        return false;
    }
    return true;
}

// Get decoded frame from internal queue
// Returns: true if frame available (the frame is mapped and must be unmapped by the caller)
// Complexity: ~40 lines
private:
struct DecodedFrameInfo {
    CUdeviceptr device_ptr;
    uint32_t pitch;
    uint32_t width;
    uint32_t height;
};

bool GetDecodedFrame(DecodedFrameInfo* out_info)
{
    if (m_frameQueue.empty()) {
        return false;
    }

    int frame_index = m_frameQueue.front();
    m_frameQueue.pop();

    CUVIDPROCPARAMS proc_params = {};
    proc_params.progressive_frame = 1;

    CUdeviceptr device_ptr = 0;
    unsigned int pitch = 0;
    CUresult result = cuvidMapVideoFrame(m_decoder, frame_index,
                                         &device_ptr, &pitch, &proc_params);

    if (result != CUDA_SUCCESS) {
        LogError("cuvidMapVideoFrame failed: %d", result);
        return false;
    }

    out_info->device_ptr = device_ptr;
    out_info->pitch = pitch;
    out_info->width = m_width;
    out_info->height = m_height;

    return true;
}

// Copy to D3D12 surface (delegates to handler)
// Returns: true on success
// Complexity: ~20 lines
private:
bool CopyToD3D12Surface(const DecodedFrameInfo& frame, void* surface)
{
    auto* d3d12_resource = static_cast<ID3D12Resource*>(surface);

    // Create handler on-demand
    if (!m_d3d12Handler) {
        m_d3d12Handler = std::make_unique<D3D12SurfaceHandler>(
            m_d3d12Device, m_cudaContext
        );
    }

    return m_d3d12Handler->CopyNV12Frame(
        frame.device_ptr,
        frame.pitch,
        d3d12_resource,
        frame.width,
        frame.height
    );
}
```

### 3. D3D12SurfaceHandler.h (D3D12-specific operations)

```cpp
#pragma once

#include <d3d12.h>
#include <cuda.h>
#include <cuda_runtime.h>
#include <cstdint>
#include <memory>

namespace VavCore {

// Forward declaration
class ExternalMemoryCache;

class D3D12SurfaceHandler {
public:
    D3D12SurfaceHandler(ID3D12Device* device, CUcontext cuda_context);
    ~D3D12SurfaceHandler();

    // Copy NV12 frame from CUDA to D3D12 texture
    // Returns: true on success
    bool CopyNV12Frame(CUdeviceptr src_frame,
                       uint32_t src_pitch,
                       ID3D12Resource* dst_texture,
                       uint32_t width,
                       uint32_t height);

    // Signal D3D12 fence from CUDA stream
    // Returns: true on success
    bool SignalD3D12Fence(uint64_t fence_value);

private:
    // Get CUDA device pointer for D3D12 resource (uses cache)
    bool GetD3D12CUDAPointer(ID3D12Resource* resource, CUdeviceptr* out_ptr);

    // Copy Y plane (8-bit single channel)
    bool CopyYPlane(CUdeviceptr src, uint32_t src_pitch,
                    CUdeviceptr dst, uint32_t dst_pitch,
                    uint32_t width, uint32_t height);

    // Copy UV plane (8-bit dual channel, interleaved)
    bool CopyUVPlane(CUdeviceptr src, uint32_t src_pitch,
                     CUdeviceptr dst, uint32_t dst_pitch,
                     uint32_t width, uint32_t height);

private:
    ID3D12Device* m_device;
    CUcontext m_cudaContext;
    std::unique_ptr<ExternalMemoryCache> m_cache;
};

} // namespace VavCore
```

### 4. D3D12SurfaceHandler.cpp (Implementation)

```cpp
#include "D3D12SurfaceHandler.h"
#include "ExternalMemoryCache.h"
#include <cstdio>

namespace VavCore {

D3D12SurfaceHandler::D3D12SurfaceHandler(ID3D12Device* device, CUcontext cuda_context)
    : m_device(device)
    , m_cudaContext(cuda_context)
    , m_cache(std::make_unique<ExternalMemoryCache>(device, cuda_context))
{
}

// Defined here (not in the header) so ExternalMemoryCache is a complete type
// when the unique_ptr member is destroyed.
D3D12SurfaceHandler::~D3D12SurfaceHandler()
{
}

bool D3D12SurfaceHandler::CopyNV12Frame(CUdeviceptr src_frame,
                                        uint32_t src_pitch,
                                        ID3D12Resource* dst_texture,
                                        uint32_t width,
                                        uint32_t height)
{
    // Get CUDA pointer for D3D12 resource
    CUdeviceptr dst_ptr = 0;
    if (!GetD3D12CUDAPointer(dst_texture, &dst_ptr)) {
        return false;
    }

    // Get D3D12 texture layout (subresource 0 = Y plane, subresource 1 = UV plane)
    D3D12_RESOURCE_DESC desc = dst_texture->GetDesc();
    D3D12_PLACED_SUBRESOURCE_FOOTPRINT layouts[2];
    UINT num_rows[2] = {0};
    UINT64 row_sizes[2] = {0};
    UINT64 total_bytes = 0;

    m_device->GetCopyableFootprints(&desc, 0, 2, 0,
                                    layouts, num_rows, row_sizes, &total_bytes);

    // Copy Y plane
    if (!CopyYPlane(src_frame, src_pitch,
                    dst_ptr, layouts[0].Footprint.RowPitch,
                    width, height)) {
        return false;
    }

    // Copy UV plane (this code assumes it starts src_pitch * height bytes into the source)
    CUdeviceptr src_uv = src_frame + static_cast<CUdeviceptr>(src_pitch) * height;
    CUdeviceptr dst_uv = dst_ptr + layouts[1].Offset;

    if (!CopyUVPlane(src_uv, src_pitch,
                     dst_uv, layouts[1].Footprint.RowPitch,
                     width, height / 2)) {
        return false;
    }

    return true;
}

bool D3D12SurfaceHandler::GetD3D12CUDAPointer(ID3D12Resource* resource,
                                              CUdeviceptr* out_ptr)
{
    return m_cache->GetOrCreateExternalMemory(resource, out_ptr);
}

bool D3D12SurfaceHandler::CopyYPlane(CUdeviceptr src, uint32_t src_pitch,
                                     CUdeviceptr dst, uint32_t dst_pitch,
                                     uint32_t width, uint32_t height)
{
    // `width` is the number of bytes copied per row; pitches are in bytes
    cudaError_t err = cudaMemcpy2D(
        reinterpret_cast<void*>(dst), dst_pitch,
        reinterpret_cast<const void*>(src), src_pitch,
        width, height,  // Copy only valid pixels, not padding
        cudaMemcpyDeviceToDevice
    );

    if (err != cudaSuccess) {
        printf("[D3D12] Y plane copy failed: %d\n", err);
        return false;
    }

    return true;
}

bool D3D12SurfaceHandler::CopyUVPlane(CUdeviceptr src, uint32_t src_pitch,
                                      CUdeviceptr dst, uint32_t dst_pitch,
                                      uint32_t width, uint32_t height)
{
    // NV12 UV plane: interleaved U and V, so width in bytes = width of Y plane
    cudaError_t err = cudaMemcpy2D(
        reinterpret_cast<void*>(dst), dst_pitch,
        reinterpret_cast<const void*>(src), src_pitch,
        width, height,  // Same width in bytes; caller passes half the Y height
        cudaMemcpyDeviceToDevice
    );

    if (err != cudaSuccess) {
        printf("[D3D12] UV plane copy failed: %d\n", err);
        return false;
    }

    return true;
}

} // namespace VavCore
```
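
The `width` argument in `CopyYPlane()` and `CopyUVPlane()` is a byte count per row, while each pitch is the distance in bytes between row starts. A CPU-only sketch of the same arithmetic may make that relationship easier to check without a GPU. The names `CopyPlane2D` and `CopyNV12OnCPU` are hypothetical and not part of the proposed design. The sketch assumes 8-bit NV12 and a UV plane that starts at `src_pitch * height`, as the handler code above does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// CPU stand-in for a pitched 2D copy: copies `row_bytes` from each of `rows`
// rows, stepping source and destination by their own pitches.
inline void CopyPlane2D(const uint8_t* src, size_t src_pitch,
                        uint8_t* dst, size_t dst_pitch,
                        size_t row_bytes, size_t rows)
{
    assert(row_bytes <= src_pitch && row_bytes <= dst_pitch);
    for (size_t y = 0; y < rows; ++y) {
        std::memcpy(dst + y * dst_pitch, src + y * src_pitch, row_bytes);
    }
}

// Copies an 8-bit NV12 frame into a destination that has its own pitch and
// its own UV plane offset (assumption: source UV starts at src_pitch * height).
inline void CopyNV12OnCPU(const uint8_t* src, size_t src_pitch,
                          uint8_t* dst, size_t dst_pitch, size_t dst_uv_offset,
                          size_t width, size_t height)
{
    // Y plane: `width` bytes per row, `height` rows.
    CopyPlane2D(src, src_pitch, dst, dst_pitch, width, height);

    // UV plane: interleaved U/V, so still `width` bytes per row, `height / 2` rows.
    CopyPlane2D(src + src_pitch * height, src_pitch,
                dst + dst_uv_offset, dst_pitch, width, height / 2);
}
```

Filling the source padding with a sentinel value and checking that it never appears in the destination is one way to confirm that only valid bytes are copied.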

### 5. ExternalMemoryCache.h (CUDA-D3D12 interop cache)

```cpp
#pragma once

#include <d3d12.h>
#include <cuda.h>
#include <cuda_runtime.h>
#include <cstddef>
#include <map>

namespace VavCore {

class ExternalMemoryCache {
public:
    ExternalMemoryCache(ID3D12Device* device, CUcontext cuda_context);
    ~ExternalMemoryCache();

    // Get or create CUDA device pointer for D3D12 resource
    // Returns: true on success
    bool GetOrCreateExternalMemory(ID3D12Resource* resource, CUdeviceptr* out_ptr);

    // Release specific resource
    void Release(ID3D12Resource* resource);

    // Release all cached resources
    void ReleaseAll();

private:
    struct CachedEntry {
        cudaExternalMemory_t external_memory;
        CUdeviceptr device_ptr;
        size_t size;
    };

    bool ImportD3D12Resource(ID3D12Resource* resource,
                             cudaExternalMemory_t* out_ext_mem,
                             CUdeviceptr* out_ptr);

private:
    ID3D12Device* m_device;
    CUcontext m_cudaContext;
    std::map<ID3D12Resource*, CachedEntry> m_cache;
};

} // namespace VavCore
```
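
The header above leaves the get-or-create behavior to the implementation file. A minimal sketch of that lookup pattern follows, written without D3D12 or CUDA headers so it can be compiled anywhere. Everything in it is a hypothetical stand-in: `PointerCacheSketch`, `ResourceKey`, `DevicePtr`, and the injected `Importer` callback are illustration names, and the callback plays the role that `ImportD3D12Resource()` would play in the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical stand-ins so the sketch builds without D3D12 or CUDA headers.
using ResourceKey = const void*;    // plays the role of ID3D12Resource*
using DevicePtr   = std::uint64_t;  // plays the role of CUdeviceptr

class PointerCacheSketch {
public:
    // `importer` performs the expensive import; it returns false on failure.
    using Importer = std::function<bool(ResourceKey, DevicePtr*)>;

    explicit PointerCacheSketch(Importer importer) : m_importer(std::move(importer)) {}

    bool GetOrCreate(ResourceKey resource, DevicePtr* out_ptr)
    {
        assert(out_ptr != nullptr);
        auto it = m_cache.find(resource);
        if (it != m_cache.end()) {  // cache hit: no import
            *out_ptr = it->second;
            return true;
        }
        DevicePtr ptr = 0;
        if (!m_importer(resource, &ptr)) {
            return false;  // failed imports are not cached
        }
        m_cache.emplace(resource, ptr);
        *out_ptr = ptr;
        return true;
    }

    // The real class would also free the imported memory here.
    void Release(ResourceKey resource) { m_cache.erase(resource); }

    std::size_t Size() const { return m_cache.size(); }

private:
    Importer m_importer;
    std::map<ResourceKey, DevicePtr> m_cache;
};
```

One design point to keep in mind with a raw pointer as the key: if a resource is destroyed without `Release()` being called and a new resource is later allocated at the same address, the lookup would return the stale entry.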

---

## Key Improvements

### Readability
**Before**:
- `DecodeToSurface()`: 500+ lines with 5 levels of nesting
- Mixed concerns: decoding, copying, caching, signaling

**After**:
- `DecodeToSurface()`: 40 lines, clear 4-step process
- Each helper method: 20-60 lines, single responsibility

### Debugging
**Before**:
- NV12 stride bug hidden in 500 lines of mixed logic
- Hard to locate which `cudaMemcpy2D` call is wrong

**After**:
- `CopyYPlane()` and `CopyUVPlane()` are separate methods
- Easy to add breakpoint and inspect parameters
- Clear separation of Y and UV plane logic

### Testing
**Before**:
- Cannot test D3D12 copying without full decoder setup
- Cannot mock CUDA operations

**After**:
- Can unit test `D3D12SurfaceHandler` independently
- Can test `ExternalMemoryCache` in isolation
- Easy to add mock implementations
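
As an illustration of what a mock implementation could look like, the sketch below routes the plane copy through a small interface and substitutes a recording fake for the GPU call. This seam is an assumption for illustration only: `IPlaneCopier`, `RecordingCopier`, `PlaneCopyCall`, and `CopyNV12Via` are hypothetical names that do not appear in the proposed headers, and device pointers are modeled as plain 64-bit integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical seam (not in the original design): one pitched copy request.
struct PlaneCopyCall {
    std::uint64_t src, dst;
    std::size_t src_pitch, dst_pitch, row_bytes, rows;
};

class IPlaneCopier {
public:
    virtual ~IPlaneCopier() = default;
    virtual bool Copy2D(const PlaneCopyCall& call) = 0;
};

// Fake that records each request instead of touching GPU memory.
class RecordingCopier : public IPlaneCopier {
public:
    bool Copy2D(const PlaneCopyCall& call) override
    {
        calls.push_back(call);
        return true;
    }
    std::vector<PlaneCopyCall> calls;
};

// Same plane arithmetic as CopyNV12Frame(), expressed against the seam.
inline bool CopyNV12Via(IPlaneCopier& copier,
                        std::uint64_t src, std::size_t src_pitch,
                        std::uint64_t dst, std::size_t dst_pitch,
                        std::size_t dst_uv_offset,
                        std::size_t width, std::size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);  // 4:2:0 needs even dimensions
    if (!copier.Copy2D({src, dst, src_pitch, dst_pitch, width, height})) {
        return false;
    }
    return copier.Copy2D({src + src_pitch * height, dst + dst_uv_offset,
                          src_pitch, dst_pitch, width, height / 2});
}
```

A test can then assert on the recorded offsets, pitches, and row counts without a CUDA context. The cost is a virtual call per plane, which should be weighed against the "no over-engineering" non-goal.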

### Maintenance
**Before**:
- Adding D3D11 support requires modifying a 500+ line method
- Risk of breaking existing D3D12 code

**After**:
- Add new `D3D11SurfaceHandler` class
- Existing D3D12 code untouched
- Clean separation of concerns

---

## File Size Comparison

| File | Before | After |
|------|--------|-------|
| NVDECAV1Decoder.cpp | 1,722 lines | ~600 lines |
| D3D12SurfaceHandler.cpp | - | ~300 lines |
| ExternalMemoryCache.cpp | - | ~200 lines |
| **Total** | 1,722 lines | ~1,100 lines |

**Reduction**: about 36% fewer lines (1,722 to ~1,100) while improving readability

---

## Implementation Plan

### Phase 1: Extract D3D12 Handler (2-3 hours)
1. Create `D3D12SurfaceHandler.h/.cpp`
2. Move D3D12 resource import logic
3. Move NV12 plane copying logic
4. Test with existing Vav2Player

**Acceptance Criteria**:
- Vav2Player displays video correctly
- No memory leaks
- Performance same or better

### Phase 2: Extract External Memory Cache (1-2 hours)
1. Create `ExternalMemoryCache.h/.cpp`
2. Move external memory caching logic
3. Add proper cleanup on resource release
4. Test memory management

**Acceptance Criteria**:
- Cache hit/miss working correctly
- No memory leaks on repeated loads
- Cache cleared on decoder cleanup

### Phase 3: Refactor Main Decoder (1-2 hours)
1. Simplify `DecodeToSurface()` to routing logic
2. Extract `DecodePacket()` method
3. Extract `GetDecodedFrame()` method
4. Extract `CopyToCPUSurface()` method
5. Test all surface types

**Acceptance Criteria**:
- All surface types working
- Code passes all existing tests
- Debug logging reduced

### Phase 4: Fix NV12 Stride Bug (30 minutes)
1. Fix `CopyYPlane()` width parameter
2. Fix `CopyUVPlane()` width parameter
3. Verify with test video

**Acceptance Criteria**:
- No stripe pattern in displayed video
- Correct colors displayed
- Performance maintained

---

## Testing Strategy

### Unit Tests
```cpp
TEST(D3D12SurfaceHandler, CopiesNV12FrameCorrectly)
{
    auto handler = CreateTestHandler();
    auto src_frame = CreateTestNV12Frame(1920, 1080);
    auto dst_texture = CreateTestD3D12Texture(1920, 1080);

    bool result = handler->CopyNV12Frame(
        src_frame.device_ptr, src_frame.pitch,
        dst_texture, 1920, 1080
    );

    EXPECT_TRUE(result);
    VerifyNV12Data(dst_texture);
}

TEST(ExternalMemoryCache, ReusesExistingEntry)
{
    auto cache = CreateTestCache();
    auto resource = CreateTestD3D12Resource();

    CUdeviceptr ptr1 = 0, ptr2 = 0;
    ASSERT_TRUE(cache->GetOrCreateExternalMemory(resource, &ptr1));
    ASSERT_TRUE(cache->GetOrCreateExternalMemory(resource, &ptr2));

    EXPECT_EQ(ptr1, ptr2);  // Should return same pointer
}
```
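
The tests above rely on helpers such as `CreateTestNV12Frame` and `VerifyNV12Data` that this document does not define. As a rough idea of what the CPU side of such helpers might do, the sketch below builds a deterministic NV12 pattern with padded rows and verifies it after readback. The names `MakeNV12Pattern`, `VerifyNV12Pattern`, `PatternValue`, and `kPaddingSentinel` are hypothetical, and the sketch assumes the destination has already been read back into host memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Padding bytes get a sentinel that PatternValue() can never produce, so a
// copy that uses the wrong pitch or drags padding along is detected.
constexpr std::uint8_t kPaddingSentinel = 0xEE;

inline std::uint8_t PatternValue(std::size_t x, std::size_t row)
{
    return static_cast<std::uint8_t>((x * 7 + row * 13) % 200);  // 0..199
}

// Rows 0..height-1 hold Y; the following height/2 rows hold interleaved UV.
inline std::vector<std::uint8_t> MakeNV12Pattern(std::size_t width,
                                                 std::size_t height,
                                                 std::size_t pitch)
{
    assert(pitch >= width && height % 2 == 0);
    const std::size_t rows = height + height / 2;
    std::vector<std::uint8_t> buf(pitch * rows, kPaddingSentinel);
    for (std::size_t row = 0; row < rows; ++row) {
        for (std::size_t x = 0; x < width; ++x) {
            buf[row * pitch + x] = PatternValue(x, row);
        }
    }
    return buf;
}

// Returns true if every valid byte matches the pattern for the given pitch.
inline bool VerifyNV12Pattern(const std::vector<std::uint8_t>& buf,
                              std::size_t width, std::size_t height,
                              std::size_t pitch)
{
    const std::size_t rows = height + height / 2;
    if (pitch < width || buf.size() < pitch * rows) {
        return false;
    }
    for (std::size_t row = 0; row < rows; ++row) {
        for (std::size_t x = 0; x < width; ++x) {
            if (buf[row * pitch + x] != PatternValue(x, row)) {
                return false;
            }
        }
    }
    return true;
}
```

Using a source pitch that differs from the destination pitch in the test frame is what makes a stride mistake visible; with equal pitches, a wrong stride can pass unnoticed.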

### Integration Tests
- Load video file
- Decode multiple frames
- Verify no memory leaks
- Verify correct video display

---

## Success Criteria

- [x] Design document complete
- [ ] Phase 1 complete: D3D12SurfaceHandler working
- [ ] Phase 2 complete: ExternalMemoryCache working
- [ ] Phase 3 complete: Main decoder simplified
- [ ] Phase 4 complete: NV12 stride bug fixed
- [ ] All existing tests passing
- [ ] No performance regression
- [ ] Code review passed
- [ ] Documentation updated

---

**Next Step**: Start Phase 1 - Extract D3D12SurfaceHandler

**Last Updated**: 2025-10-03