# D3D12VideoRenderer Layered Architecture - Final Design v3
**Date**: 2025-10-06

**Status**: ✅ **FINAL APPROVED DESIGN** - Format + Method Naming Convention

**Supersedes**: SimpleGPURenderer_Layered_Architecture_Design_v2.md

**Key Decision**: Use Surface/Upload/Direct method naming (NO Hardware/Software)

---
## 🎯 Final Naming Convention
**Format**: `{PixelFormat}{Method}Backend`
**Approved Methods**:
- **Surface**: CUDA Surface Objects for tiled texture write
- **Upload**: CPU upload buffers + GPU compute shader
- **Direct**: Direct GPU rendering (future)
**Rejected Methods**:
- ❌ Hardware/Software - Too implementation-focused, not descriptive
---
## 📊 Final Backend Architecture
```
D3D12VideoRenderer (orchestrator)
├── RGBASurfaceBackend (handles VAVCORE_COLOR_SPACE_RGB32)
├── YUV420PUploadBackend (handles VAVCORE_COLOR_SPACE_YUV420P)
└── NV12DirectBackend (handles VAVCORE_COLOR_SPACE_NV12) [future]
```
**File Mapping**:
| Old Code | New Backend | Format + Method | Implementation |
|----------|-------------|-----------------|----------------|
| SimpleGPURenderer RGBA | `RGBASurfaceBackend` | RGB32 + Surface | NVDEC → CUDA RGBA → surf2Dwrite() → D3D12 |
| D3D12VideoRenderer (old) | `YUV420PUploadBackend` | YUV420P + Upload | dav1d → CPU upload → GPU YUV→RGB shader |
| Future NV12 | `NV12DirectBackend` | NV12 + Direct | NVDEC → D3D12 NV12 → Direct rendering |
**Benefits**:
- **Format clarity**: First word = pixel format (RGBA, YUV420P, NV12)
- **Method clarity**: Second word = rendering method (Surface, Upload, Direct)
- **Direct mapping**: Easy to map `VavCoreColorSpace` → backend class
- **No ambiguity**: "Surface" = CUDA Surface Objects, "Upload" = CPU buffers, "Direct" = GPU-direct
**Code Example**:
```cpp
void D3D12VideoRenderer::SelectBackend(const VavCoreVideoFrame& frame) {
    switch (frame.color_space) {
        case VAVCORE_COLOR_SPACE_RGB32:
            m_activeBackend = m_rgbaSurfaceBackend.get();    // Surface method
            break;
        case VAVCORE_COLOR_SPACE_YUV420P:
            m_activeBackend = m_yuv420pUploadBackend.get();  // Upload method
            break;
        case VAVCORE_COLOR_SPACE_NV12:
            m_activeBackend = m_nv12DirectBackend.get();     // Direct method
            break;
        default:
            m_activeBackend = nullptr;                       // Unsupported format
            break;
    }
}
```
---
## 🚫 Rejected Naming Approaches
### ❌ Hardware/Software Naming
```cpp
// REJECTED - Too implementation-focused
RGBAHardwareBackend // What "hardware"? GPU? NVDEC? Confusing
YUV420PSoftwareBackend // Still uses GPU shaders, not really "software"
```
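**Why rejected**: Every backend in this design uses the GPU at some stage, so "Hardware/Software" does not say which data path a backend takes. The label hints at where decoding happened rather than naming the rendering method.

---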
## ✅ Why Surface/Upload/Direct Works Better
**Surface (CUDA Surface Objects)**:
- Describes the actual mechanism: Writing to D3D12 tiled textures via CUDA surfaces
- Clear technical distinction from linear buffers
- Indicates GPU-direct write capability
**Upload (CPU Upload Buffers)**:
- Describes the actual mechanism: CPU writes to upload heaps → GPU copy
- Familiar concept in graphics programming
- Indicates CPU involvement in data transfer
**Direct (Direct GPU Rendering)**:
- Describes the actual mechanism: GPU renders the decoder's NV12 output directly, with no intermediate RGBA conversion pass or CPU copy (YUV→RGB happens in the final shader)
- Future-proof naming for hardware-decoded NV12
- Indicates zero-copy GPU pipeline
---
## 📐 Architecture Diagram
```
┌─────────────────────────────────────────────────────────────┐
│                       IVideoRenderer                        │
│                  (Public API - unchanged)                   │
└─────────────────────────────────────────────────────────────┘
                            │ implements
┌─────────────────────────────────────────────────────────────┐
│                     D3D12VideoRenderer                      │
│              (Orchestrator - format-agnostic)               │
│                                                             │
│  Responsibilities:                                          │
│  - D3D12 device, command queue, swap chain                  │
│  - Backend selection by color_space                         │
│  - Delegation to active backend                             │
│  - ~300 lines                                               │
└─────────────────────────────────────────────────────────────┘
                            │ delegates to
        ┌───────────────────┼──────────────────────┐
        │                   │                      │
┌───────▼─────────┐  ┌──────▼────────────┐  ┌──────▼──────────┐
│ RGBASurface     │  │ YUV420PUpload     │  │ NV12Direct      │
│ Backend         │  │ Backend           │  │ Backend         │
│                 │  │                   │  │                 │
│ Format: RGB32   │  │ Format: YUV420P   │  │ Format: NV12    │
│ Method: Surface │  │ Method: Upload    │  │ Method: Direct  │
│                 │  │                   │  │                 │
│ Source:         │  │ Source:           │  │ Source:         │
│ SimpleGPU       │  │ D3D12Video        │  │ Future          │
│ Renderer        │  │ Renderer (old)    │  │                 │
│ RGBA path       │  │                   │  │                 │
│                 │  │                   │  │                 │
│ Pipeline:       │  │ Pipeline:         │  │ Pipeline:       │
│ NVDEC NV12 →    │  │ dav1d YUV →       │  │ NVDEC NV12 →    │
│ CUDA RGBA →     │  │ CPU upload →      │  │ D3D12 NV12 →    │
│ surf2Dwrite() → │  │ GPU YUV→RGB →     │  │ Direct render → │
│ D3D12 RGBA →    │  │ Render            │  │ Present         │
│ Sampling        │  │                   │  │                 │
│                 │  │                   │  │                 │
│ ~400 lines      │  │ ~2000 lines       │  │ TBD             │
└─────────────────┘  └───────────────────┘  └─────────────────┘
```
---
## 📂 Final File Structure
```
src/Rendering/
├── IVideoRenderer.h                 # Public interface
├── D3D12VideoRenderer.h/.cpp        # Orchestrator (~300 lines)
├── IVideoBackend.h                  # Internal backend interface
├── RGBASurfaceBackend.h/.cpp        # RGBA Surface backend (~400 lines)
│   │   Extracted from: SimpleGPURenderer RGBA path
│   │   Handles: VAVCORE_COLOR_SPACE_RGB32
│   │   Method: CUDA Surface Objects (surf2Dwrite)
│   │   Pipeline: NVDEC → CUDA RGBA → surf2Dwrite() → D3D12 RGBA → sampling
├── YUV420PUploadBackend.h/.cpp      # YUV420P Upload backend (~2000 lines)
│   │   Renamed from: D3D12VideoRenderer (old)
│   │   Handles: VAVCORE_COLOR_SPACE_YUV420P
│   │   Method: CPU upload buffers + GPU shader
│   │   Pipeline: dav1d → CPU upload → GPU YUV→RGB shader → render
└── NV12DirectBackend.h/.cpp         # NV12 Direct backend (future)
    │   Handles: VAVCORE_COLOR_SPACE_NV12
    │   Method: Direct GPU rendering (zero-copy)
    │   Pipeline: NVDEC → D3D12 NV12 → Direct render → present

Legacy/ (archived)
└── SimpleGPURenderer_Legacy.h/.cpp  # Old mixed-format renderer
```
---
## 🎯 Backend Responsibilities
### IVideoBackend Interface
```cpp
class IVideoBackend {
public:
    virtual ~IVideoBackend() = default;

    // Lifecycle
    virtual HRESULT Initialize(
        ID3D12Device* device,
        ID3D12CommandQueue* commandQueue,
        uint32_t width, uint32_t height) = 0;
    virtual void Shutdown() = 0;
    virtual bool IsInitialized() const = 0;

    // Video texture for CUDA interop (nullptr if not applicable)
    virtual HRESULT CreateVideoTexture(uint32_t width, uint32_t height) = 0;
    virtual ID3D12Resource* GetVideoTexture() const = 0;

    // Render frame to back buffer
    virtual HRESULT RenderToBackBuffer(
        const VavCoreVideoFrame& frame,
        ID3D12Resource* backBuffer,
        ID3D12GraphicsCommandList* commandList) = 0;

    // Format this backend handles
    virtual VavCoreColorSpace GetSupportedFormat() const = 0;
};
```
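
Because the orchestrator only talks to backends through this interface, a backend contract can be checked without a GPU. The following is a minimal sketch of that idea, not the project's test code. It assumes a reduced, platform-independent version of the contract: `ColorSpace`, `Frame`, `IBackendSketch`, `FakeBackend`, and `CheckBackendContract` are hypothetical stand-ins for `VavCoreColorSpace`, `VavCoreVideoFrame`, and `IVideoBackend`, and the D3D12 parameters are dropped so the block builds on any platform.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the real VavCore / D3D12 types (illustration only).
enum class ColorSpace { RGB32, YUV420P, NV12 };
struct Frame {
    ColorSpace color_space;
    uint32_t width;
    uint32_t height;
};

// Reduced form of the backend contract: lifecycle, format query, render.
class IBackendSketch {
public:
    virtual ~IBackendSketch() = default;
    virtual bool Initialize(uint32_t width, uint32_t height) = 0;
    virtual void Shutdown() = 0;
    virtual bool IsInitialized() const = 0;
    virtual bool Render(const Frame& frame) = 0;
    virtual ColorSpace GetSupportedFormat() const = 0;
};

// Test double that counts render calls instead of touching a GPU.
class FakeBackend : public IBackendSketch {
public:
    explicit FakeBackend(ColorSpace format) : m_format(format) {}

    bool Initialize(uint32_t width, uint32_t height) override {
        m_initialized = (width > 0 && height > 0);
        return m_initialized;
    }
    void Shutdown() override { m_initialized = false; }
    bool IsInitialized() const override { return m_initialized; }

    bool Render(const Frame& frame) override {
        // Reject frames before Initialize() or in a format this backend does not own.
        if (!m_initialized || frame.color_space != m_format) {
            return false;
        }
        ++m_renderCount;
        return true;
    }
    ColorSpace GetSupportedFormat() const override { return m_format; }
    int RenderCount() const { return m_renderCount; }

private:
    ColorSpace m_format;
    bool m_initialized = false;
    int m_renderCount = 0;
};

// Lifecycle contract any backend is expected to satisfy; leaves the backend shut down.
inline void CheckBackendContract(IBackendSketch& backend, uint32_t width, uint32_t height) {
    const bool initOk = backend.Initialize(width, height);
    assert(initOk && backend.IsInitialized());

    const Frame frame{backend.GetSupportedFormat(), width, height};
    const bool rendered = backend.Render(frame);
    assert(rendered);

    backend.Shutdown();
    assert(!backend.IsInitialized());
    (void)initOk;
    (void)rendered;
}
```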
---
### RGBASurfaceBackend
**Handles**: `VAVCORE_COLOR_SPACE_RGB32`
**Method**: CUDA Surface Objects (surf2Dwrite)
```cpp
class RGBASurfaceBackend : public IVideoBackend {
public:
    // Remaining IVideoBackend overrides (Initialize, Shutdown, ...) omitted for brevity

    VavCoreColorSpace GetSupportedFormat() const override {
        return VAVCORE_COLOR_SPACE_RGB32;
    }

    // Creates: DXGI_FORMAT_R8G8B8A8_UNORM texture with D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS
    // Enables CUDA Surface Object creation via cudaExternalMemoryGetMappedMipmappedArray
    HRESULT CreateVideoTexture(uint32_t width, uint32_t height) override;

    // Pipeline: Simple RGBA texture sampling (no YUV conversion needed)
    HRESULT RenderToBackBuffer(
        const VavCoreVideoFrame& frame,
        ID3D12Resource* backBuffer,
        ID3D12GraphicsCommandList* commandList) override;

private:
    ComPtr<ID3D12Resource> m_rgbaTexture;         // Tiled RGBA texture
    ComPtr<ID3D12PipelineState> m_pipelineState;  // Simple texture sampling shader (no YUV conversion)
    ComPtr<ID3D12RootSignature> m_rootSignature;
};
```
**Source**: Extracted from `SimpleGPURenderer` RGBA path
**Size**: ~400 lines
**Key Feature**: Uses CUDA Surface Objects for tiled texture write (surf2Dwrite)
---
### YUV420PUploadBackend
**Handles**: `VAVCORE_COLOR_SPACE_YUV420P`
**Method**: CPU upload buffers + GPU shader
```cpp
class YUV420PUploadBackend : public IVideoBackend {
public:
    // Remaining IVideoBackend overrides (Initialize, Shutdown, ...) omitted for brevity

    VavCoreColorSpace GetSupportedFormat() const override {
        return VAVCORE_COLOR_SPACE_YUV420P;
    }

    // Creates: Separate Y/U/V textures + CPU upload buffers (ring buffer system)
    HRESULT CreateVideoTexture(uint32_t width, uint32_t height) override;

    // Pipeline:
    // 1. CPU writes to upload buffers (ring buffer system, persistent mapped memory)
    // 2. GPU copies upload → textures (CopyTextureRegion)
    // 3. YUV→RGB compute shader (GPU conversion)
    // 4. Render to back buffer
    HRESULT RenderToBackBuffer(
        const VavCoreVideoFrame& frame,
        ID3D12Resource* backBuffer,
        ID3D12GraphicsCommandList* commandList) override;

    // Legacy D3D12VideoRenderer methods (preserved for compatibility)
    uint8_t* GetYMappedBuffer(uint32_t bufferIndex) const;
    uint8_t* GetUMappedBuffer(uint32_t bufferIndex) const;
    uint8_t* GetVMappedBuffer(uint32_t bufferIndex) const;

private:
    // Ring buffer system (from old D3D12VideoRenderer)
    struct RingBufferSlot {
        ComPtr<ID3D12Resource> yUploadBuffer;  // D3D12_HEAP_TYPE_UPLOAD
        ComPtr<ID3D12Resource> uUploadBuffer;
        ComPtr<ID3D12Resource> vUploadBuffer;
        uint8_t* yMappedData = nullptr;        // Persistent CPU mapping
        uint8_t* uMappedData = nullptr;
        uint8_t* vMappedData = nullptr;
    };
    std::vector<RingBufferSlot> m_ringBuffers;

    ComPtr<ID3D12Resource> m_yTexture;  // GPU textures (D3D12_HEAP_TYPE_DEFAULT)
    ComPtr<ID3D12Resource> m_uTexture;
    ComPtr<ID3D12Resource> m_vTexture;

    ComPtr<ID3D12PipelineState> m_yuvToRgbPipeline;  // YUV→RGB compute shader
};
```
**Source**: Renamed from `D3D12VideoRenderer` (old)
**Size**: ~2000 lines (preserves all existing logic)
**Key Feature**: Persistent CPU mapped upload buffers with ring buffer system
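
The upload path needs one CPU-visible buffer per plane, and each buffer has to respect the row pitch D3D12 expects when copying from a buffer into a texture. The sketch below shows one way to size the three planes. It assumes 8-bit samples and hard-codes the 256-byte value of `D3D12_TEXTURE_DATA_PITCH_ALIGNMENT` so it builds without `d3d12.h`. `PlaneLayout`, `Yuv420pLayout`, `MakePlane`, and `ComputeYuv420pLayout` are hypothetical names; the existing backend may size its buffers differently, for example by querying `ID3D12Device::GetCopyableFootprints`.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors D3D12_TEXTURE_DATA_PITCH_ALIGNMENT (256 bytes); duplicated here so the
// sketch compiles on any platform.
constexpr uint32_t kRowPitchAlignment = 256;

constexpr uint32_t AlignUp(uint32_t value, uint32_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

struct PlaneLayout {
    uint32_t width;      // visible bytes per row (8-bit samples assumed)
    uint32_t height;     // number of rows
    uint32_t rowPitch;   // bytes per row in the upload buffer, 256-byte aligned
    uint64_t sizeBytes;  // rowPitch * height
};

struct Yuv420pLayout {
    PlaneLayout y;
    PlaneLayout u;
    PlaneLayout v;
};

inline PlaneLayout MakePlane(uint32_t width, uint32_t height) {
    const uint32_t pitch = AlignUp(width, kRowPitchAlignment);
    return PlaneLayout{width, height, pitch, static_cast<uint64_t>(pitch) * height};
}

inline Yuv420pLayout ComputeYuv420pLayout(uint32_t width, uint32_t height) {
    assert(width > 0 && height > 0);
    // 4:2:0 subsampling: each chroma plane is half size in both dimensions (rounded up).
    const uint32_t chromaWidth = (width + 1) / 2;
    const uint32_t chromaHeight = (height + 1) / 2;
    return Yuv420pLayout{MakePlane(width, height),
                         MakePlane(chromaWidth, chromaHeight),
                         MakePlane(chromaWidth, chromaHeight)};
}
```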
---
### NV12DirectBackend (Future)
**Handles**: `VAVCORE_COLOR_SPACE_NV12`
**Method**: Direct GPU rendering (zero-copy)
```cpp
class NV12DirectBackend : public IVideoBackend {
public:
    // Remaining IVideoBackend overrides (Initialize, Shutdown, ...) omitted for brevity

    VavCoreColorSpace GetSupportedFormat() const override {
        return VAVCORE_COLOR_SPACE_NV12;
    }

    // Creates: DXGI_FORMAT_NV12 texture (when D3D12 tiled NV12 is viable)
    // Zero-copy: NVDEC writes directly to D3D12 texture
    HRESULT CreateVideoTexture(uint32_t width, uint32_t height) override;

    // Pipeline: NVDEC → D3D12 NV12 → Direct YUV→RGB shader → Render
    // No CPU involvement and no intermediate conversion pass: the shader samples NV12 directly
    HRESULT RenderToBackBuffer(
        const VavCoreVideoFrame& frame,
        ID3D12Resource* backBuffer,
        ID3D12GraphicsCommandList* commandList) override;

private:
    ComPtr<ID3D12Resource> m_nv12Texture;             // Tiled NV12 texture
    ComPtr<ID3D12PipelineState> m_nv12ToRgbPipeline;  // Direct YUV→RGB shader
};
```
**Status**: Not implemented yet (requires D3D12 tiled NV12 support resolution)
**Key Feature**: Zero-copy GPU pipeline (NVDEC → D3D12 direct write)
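
For readers less familiar with NV12, the sketch below shows the semi-planar layout the format name refers to: a full-resolution Y plane followed by one half-height plane of interleaved U/V byte pairs. It describes the linear layout of a decoder output buffer under the assumption of 8-bit samples and a pitch shared by both planes. A D3D12 `DXGI_FORMAT_NV12` texture exposes the two planes as separate plane slices and its tiled memory layout is opaque, so these offsets do not apply to the texture itself. `Nv12Layout`, `ComputeNv12Layout`, `LumaOffset`, and `ChromaOffset` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct Nv12Layout {
    uint32_t pitch;      // bytes per row, shared by the Y and UV planes
    uint64_t uvOffset;   // byte offset where the interleaved UV plane starts
    uint64_t totalSize;  // total bytes for both planes
};

// Computes plane offsets for a linear NV12 buffer with the given row pitch.
inline Nv12Layout ComputeNv12Layout(uint32_t width, uint32_t height, uint32_t pitch) {
    assert(width % 2 == 0 && height % 2 == 0);  // NV12 requires even dimensions
    assert(pitch >= width);
    const uint64_t ySize = static_cast<uint64_t>(pitch) * height;
    const uint64_t uvSize = static_cast<uint64_t>(pitch) * (height / 2);
    return Nv12Layout{pitch, ySize, ySize + uvSize};
}

// Byte offset of the luma sample for pixel (x, y).
inline uint64_t LumaOffset(const Nv12Layout& layout, uint32_t x, uint32_t y) {
    return static_cast<uint64_t>(y) * layout.pitch + x;
}

// Byte offset of the U (Cb) sample shared by the 2x2 block containing (x, y).
// The matching V (Cr) sample is the next byte.
inline uint64_t ChromaOffset(const Nv12Layout& layout, uint32_t x, uint32_t y) {
    return layout.uvOffset +
           static_cast<uint64_t>(y / 2) * layout.pitch +
           static_cast<uint64_t>(x / 2) * 2;
}
```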
---
## 🔄 Backend Selection Logic
```cpp
class D3D12VideoRenderer : public IVideoRenderer {
public:
    HRESULT RenderVideoFrame(const VavCoreVideoFrame& frame) override {
        // Select backend based on frame color space
        IVideoBackend* backend = SelectBackend(frame.color_space);
        if (!backend) {
            return E_FAIL;
        }

        // Get current back buffer
        ID3D12Resource* backBuffer = m_renderTargets[m_frameIndex].Get();

        // Delegate rendering to backend
        return backend->RenderToBackBuffer(frame, backBuffer, m_commandList.Get());
    }

private:
    // Lazily creates the backend for a color space.
    // Returns nullptr for unknown formats or when Initialize() fails.
    IVideoBackend* SelectBackend(VavCoreColorSpace colorSpace) {
        switch (colorSpace) {
            case VAVCORE_COLOR_SPACE_RGB32:
                if (!m_rgbaSurfaceBackend) {
                    auto backend = std::make_unique<RGBASurfaceBackend>();
                    if (FAILED(backend->Initialize(m_device.Get(), m_commandQueue.Get(),
                                                   m_width, m_height))) {
                        return nullptr;  // Do not cache a backend that failed to initialize
                    }
                    m_rgbaSurfaceBackend = std::move(backend);
                }
                return m_rgbaSurfaceBackend.get();

            case VAVCORE_COLOR_SPACE_YUV420P:
                if (!m_yuv420pUploadBackend) {
                    auto backend = std::make_unique<YUV420PUploadBackend>();
                    if (FAILED(backend->Initialize(m_device.Get(), m_commandQueue.Get(),
                                                   m_width, m_height))) {
                        return nullptr;
                    }
                    m_yuv420pUploadBackend = std::move(backend);
                }
                return m_yuv420pUploadBackend.get();

            case VAVCORE_COLOR_SPACE_NV12:
                // Future: NV12DirectBackend
                if (!m_nv12DirectBackend) {
                    auto backend = std::make_unique<NV12DirectBackend>();
                    if (FAILED(backend->Initialize(m_device.Get(), m_commandQueue.Get(),
                                                   m_width, m_height))) {
                        return nullptr;
                    }
                    m_nv12DirectBackend = std::move(backend);
                }
                return m_nv12DirectBackend.get();

            default:
                return nullptr;
        }
    }

    std::unique_ptr<RGBASurfaceBackend> m_rgbaSurfaceBackend;      // Surface method
    std::unique_ptr<YUV420PUploadBackend> m_yuv420pUploadBackend;  // Upload method
    std::unique_ptr<NV12DirectBackend> m_nv12DirectBackend;        // Direct method (future)
};
```
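
The lazy-creation pattern can also be tried out without D3D12, which is useful for checking dynamic backend switching. The sketch below is one possible shape under stated assumptions: `ColorSpace`, `BackendSketch`, the three `*Sketch` backends, and `OrchestratorSketch` are hypothetical stand-ins, backends are cached in a map instead of three named members, and a backend whose `Initialize` fails is not cached. The NV12 stand-in always fails to initialize, only to model a backend that is not available yet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins so the sketch builds without VavCore or D3D12 headers.
enum class ColorSpace { RGB32, YUV420P, NV12 };

struct BackendSketch {
    virtual ~BackendSketch() = default;
    virtual bool Initialize(uint32_t width, uint32_t height) = 0;
    virtual std::string Name() const = 0;
};

struct RgbaSurfaceSketch : BackendSketch {
    bool Initialize(uint32_t, uint32_t) override { return true; }
    std::string Name() const override { return "RGBASurfaceBackend"; }
};

struct Yuv420pUploadSketch : BackendSketch {
    bool Initialize(uint32_t, uint32_t) override { return true; }
    std::string Name() const override { return "YUV420PUploadBackend"; }
};

// Models a backend that is not available yet: initialization always fails.
struct Nv12DirectSketch : BackendSketch {
    bool Initialize(uint32_t, uint32_t) override { return false; }
    std::string Name() const override { return "NV12DirectBackend"; }
};

class OrchestratorSketch {
public:
    OrchestratorSketch(uint32_t width, uint32_t height) : m_width(width), m_height(height) {
        assert(width > 0 && height > 0);
    }

    // Returns the backend for a color space, creating it on first use.
    // Returns nullptr for unknown formats or when initialization fails.
    BackendSketch* SelectBackend(ColorSpace colorSpace) {
        const auto cached = m_backends.find(colorSpace);
        if (cached != m_backends.end()) {
            return cached->second.get();
        }
        std::unique_ptr<BackendSketch> backend = Create(colorSpace);
        if (!backend || !backend->Initialize(m_width, m_height)) {
            return nullptr;  // Failed backends are not cached
        }
        BackendSketch* raw = backend.get();
        m_backends.emplace(colorSpace, std::move(backend));
        return raw;
    }

    std::size_t CachedBackendCount() const { return m_backends.size(); }

private:
    static std::unique_ptr<BackendSketch> Create(ColorSpace colorSpace) {
        switch (colorSpace) {
            case ColorSpace::RGB32:   return std::make_unique<RgbaSurfaceSketch>();
            case ColorSpace::YUV420P: return std::make_unique<Yuv420pUploadSketch>();
            case ColorSpace::NV12:    return std::make_unique<Nv12DirectSketch>();
        }
        return nullptr;  // Unknown color space
    }

    uint32_t m_width;
    uint32_t m_height;
    std::map<ColorSpace, std::unique_ptr<BackendSketch>> m_backends;
};
```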
---
## 📊 Naming Consistency Table
| Backend Class | Format Enum | Method | Pixel Layout | Pipeline | File Origin |
|---------------|-------------|--------|--------------|----------|-------------|
| `RGBASurfaceBackend` | `VAVCORE_COLOR_SPACE_RGB32` | Surface | RGBA (4 bytes/pixel) | NVDEC → CUDA surf2Dwrite() → D3D12 | SimpleGPURenderer |
| `YUV420PUploadBackend` | `VAVCORE_COLOR_SPACE_YUV420P` | Upload | Planar YUV 4:2:0 | dav1d → CPU upload → GPU shader | D3D12VideoRenderer (old) |
| `NV12DirectBackend` | `VAVCORE_COLOR_SPACE_NV12` | Direct | Semi-planar NV12 | NVDEC → D3D12 direct → Render | Future |
**Naming Rule**: `{PixelFormat}{Method}Backend`
- **Format-first**: Clear pixel format (RGBA, YUV420P, NV12)
- **Method-second**: Rendering method (Surface, Upload, Direct)
- **Direct 1:1 mapping**: VavCoreColorSpace enum → backend class
- **No ambiguity**: Method names describe actual mechanism, not implementation details
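
The naming rule is mechanical enough to write down as code. The sketch below composes a class name from a format and a method; `PixelFormat`, `Method`, `ToString`, and `BackendClassName` are hypothetical helpers used only to illustrate the rule and are not part of the design.

```cpp
#include <cassert>
#include <string>

// Hypothetical enums naming the two halves of the convention.
enum class PixelFormat { RGBA, YUV420P, NV12 };
enum class Method { Surface, Upload, Direct };

inline std::string ToString(PixelFormat format) {
    switch (format) {
        case PixelFormat::RGBA:    return "RGBA";
        case PixelFormat::YUV420P: return "YUV420P";
        case PixelFormat::NV12:    return "NV12";
    }
    assert(false && "unknown pixel format");
    return std::string();
}

inline std::string ToString(Method method) {
    switch (method) {
        case Method::Surface: return "Surface";
        case Method::Upload:  return "Upload";
        case Method::Direct:  return "Direct";
    }
    assert(false && "unknown method");
    return std::string();
}

// Applies the rule {PixelFormat}{Method}Backend.
inline std::string BackendClassName(PixelFormat format, Method method) {
    return ToString(format) + ToString(method) + "Backend";
}
```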
---
## 📝 Implementation Plan
### Phase 1: Create Backend Infrastructure
**Goal**: Establish base interfaces and RGBA Surface backend
**Tasks**:
1. Create `IVideoBackend.h` interface
2. Create `RGBASurfaceBackend.h/.cpp`
3. Extract RGBA Surface logic from SimpleGPURenderer
4. Test RGBASurfaceBackend independently
**Estimated Time**: 2 hours
---
### Phase 2: Transform D3D12VideoRenderer → YUV420PUploadBackend
**Goal**: Repurpose existing code as Upload backend
**Tasks**:
1. Rename files: `D3D12VideoRenderer.*` → `YUV420PUploadBackend.*`
2. Rename class: `D3D12VideoRenderer` → `YUV420PUploadBackend`
3. Implement `IVideoBackend` interface
4. Remove swap chain ownership (delegate to orchestrator)
5. Test YUV420PUploadBackend independently
**Estimated Time**: 1.5 hours
---
### Phase 3: Create New D3D12VideoRenderer Orchestrator
**Goal**: Build thin orchestrator from scratch
**Tasks**:
1. Create new `D3D12VideoRenderer.h/.cpp`
2. Implement IVideoRenderer interface
3. Implement backend selection logic
4. Test with RGBASurfaceBackend
5. Test with YUV420PUploadBackend
6. Test dynamic backend switching
**Estimated Time**: 1.5 hours
---
### Phase 4: Archive Legacy Code
**Goal**: Clean up old SimpleGPURenderer
**Tasks**:
1. Create `src/Rendering/Legacy/` directory
2. Move `SimpleGPURenderer` → `SimpleGPURenderer_Legacy`
3. Update all references to new `D3D12VideoRenderer`
4. Verify all tests pass
5. Update documentation
**Estimated Time**: 1 hour
**Total Estimated Time**: 6 hours
---
## ✅ Success Criteria
### Functional
- ✅ NVDEC RGBA rendering works (via RGBASurfaceBackend)
- ✅ CPU YUV rendering works (via YUV420PUploadBackend)
- ✅ Backend auto-selection by color_space
- ✅ No visual regressions
- ✅ All existing tests pass
### Code Quality
- ✅ D3D12VideoRenderer < 400 lines
- ✅ Each backend handles exactly 1 format with 1 method
- ✅ Consistent format+method naming (Surface/Upload/Direct)
- ✅ No format-specific if/else in orchestrator
### Maintainability
- ✅ Adding new format = add `{Format}{Method}Backend` class only
- ✅ Each backend independently testable
- ✅ Clear mapping: `VavCoreColorSpace` → Backend class → Rendering method
---
## 🎯 Why This Design Wins
### 1. Naming Clarity
```cpp
// Clear from class name what format AND method it uses:
// RGBASurfaceBackend   → RGB32 format + CUDA Surface write
// YUV420PUploadBackend → YUV420P format + CPU upload buffers
// NV12DirectBackend    → NV12 format + Direct GPU rendering
```
### 2. Code Reuse
```cpp
// Zero rewrite of proven code:
// D3D12VideoRenderer (old, 2581 lines) → YUV420PUploadBackend (~2000 lines, same logic)
```
### 3. Extensibility
```cpp
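// Adding a new format+method takes one new backend class plus one case.
// VAVCORE_COLOR_SPACE_P010 and m_p010UploadBackend are hypothetical names for illustration:
case VAVCORE_COLOR_SPACE_P010:
    return m_p010UploadBackend.get();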
```
### 4. Testability
```cpp
// Each backend tests independently:
TEST(RGBASurfaceBackend, RenderFrame) {
    VavCoreVideoFrame frame{};  // Value-initialize so unused fields are not garbage
    frame.color_space = VAVCORE_COLOR_SPACE_RGB32;
    // Test RGBA Surface rendering in isolation
}
```
---
## 📚 References
- **VavCore Color Space**: `VavCore/VavCore.h` → `VavCoreColorSpace` enum
- **Old Code**: `D3D12VideoRenderer.cpp` (2581 lines, YUV420P)
- **Old Code**: `SimpleGPURenderer.cpp` (2105 lines, mixed RGBA/YUV)
- **Previous Design**: `SimpleGPURenderer_Layered_Architecture_Design_v2.md`
---
**Status**: ✅ **FINAL DESIGN APPROVED (v3)**
**Key Decision**: Format + Method naming (`{PixelFormat}{Method}Backend`)
**Approved Methods**: Surface, Upload, Direct (NO Hardware/Software)
**Next Step**: Begin Phase 1 - Create IVideoBackend + RGBASurfaceBackend
**Total Estimated Time**: 6 hours (4 phases)
---
**Document Revision History**:
- **v1**: Initial format-based naming (CPUVideoBackend - rejected)
- **v2**: Reuse D3D12VideoRenderer as backend (approved structure)
- **v3**: Final naming with Surface/Upload/Direct methods (current) ✅