# D3D12VideoRenderer Layered Architecture - Final Design v3

**Date**: 2025-10-06
**Status**: ✅ **FINAL APPROVED DESIGN** - Format + Method Naming Convention
**Supersedes**: SimpleGPURenderer_Layered_Architecture_Design_v2.md
**Key Decision**: Use Surface/Upload/Direct method naming (NO Hardware/Software)

---

## 🎯 Final Naming Convention

**Format**: `{PixelFormat}{Method}Backend`

**Approved Methods**:
- **Surface**: CUDA Surface Objects for tiled texture write
- **Upload**: CPU upload buffers + GPU compute shader
- **Direct**: Direct GPU rendering (future)

**Rejected Methods**:
- ❌ Hardware/Software - Too implementation-focused, not descriptive

---

## 📊 Final Backend Architecture

```
D3D12VideoRenderer (orchestrator)
├── RGBASurfaceBackend   (handles VAVCORE_COLOR_SPACE_RGB32)
├── YUV420PUploadBackend (handles VAVCORE_COLOR_SPACE_YUV420P)
└── NV12DirectBackend    (handles VAVCORE_COLOR_SPACE_NV12) [future]
```

**File Mapping**:

| Old Code | New Backend | Format + Method | Implementation |
|----------|-------------|-----------------|----------------|
| SimpleGPURenderer RGBA | `RGBASurfaceBackend` | RGB32 + Surface | NVDEC → CUDA RGBA → surf2Dwrite() → D3D12 |
| D3D12VideoRenderer (old) | `YUV420PUploadBackend` | YUV420P + Upload | dav1d → CPU upload → GPU YUV→RGB shader |
| Future NV12 | `NV12DirectBackend` | NV12 + Direct | NVDEC → D3D12 NV12 → Direct rendering |

**Benefits**:
- ✅ **Format clarity**: First word = pixel format (RGBA, YUV420P, NV12)
- ✅ **Method clarity**: Second word = rendering method (Surface, Upload, Direct)
- ✅ **Direct mapping**: Easy to map `VavCoreColorSpace` → backend class
- ✅ **No ambiguity**: "Surface" = CUDA Surface Objects, "Upload" = CPU buffers, "Direct" = GPU-direct

**Code Example**:

```cpp
// Simplified view of the color space → backend mapping.
// The full version with lazy initialization is in "Backend Selection Logic".
void D3D12VideoRenderer::SelectBackend(const VavCoreVideoFrame& frame) {
    switch (frame.color_space) {
        case VAVCORE_COLOR_SPACE_RGB32:
            m_activeBackend = m_rgbaSurfaceBackend.get();    // Surface method
            break;
        case VAVCORE_COLOR_SPACE_YUV420P:
            m_activeBackend = m_yuv420pUploadBackend.get();  // Upload method
            break;
        case VAVCORE_COLOR_SPACE_NV12:
            m_activeBackend = m_nv12DirectBackend.get();     // Direct method
            break;
        default:
            m_activeBackend = nullptr;                       // Unsupported format
            break;
    }
}
```

---

## 🚫 Rejected Naming Approaches

### ❌ Hardware/Software Naming

```cpp
// REJECTED - Too implementation-focused
RGBAHardwareBackend      // What "hardware"? GPU? NVDEC? Confusing
YUV420PSoftwareBackend   // Still uses GPU shaders, not really "software"
```

**Why rejected**: "Hardware/Software" describes where decoding happens, not how a frame reaches the screen. Both backends use the GPU for rendering, so the labels do not distinguish them.

---

## ✅ Why Surface/Upload/Direct Works Better

**Surface (CUDA Surface Objects)**:
- Describes the actual mechanism: Writing to D3D12 tiled textures via CUDA surfaces
- Clear technical distinction from linear buffers
- Indicates GPU-direct write capability

**Upload (CPU Upload Buffers)**:
- Describes the actual mechanism: CPU writes to upload heaps → GPU copy
- Familiar concept in graphics programming
- Indicates CPU involvement in data transfer

**Direct (Direct GPU Rendering)**:
- Describes the actual mechanism: GPU renders the decoded NV12 texture directly, with no intermediate copy or conversion pass (YUV→RGB happens in the render shader)
- Future-proof naming for hardware-decoded NV12
- Indicates zero-copy GPU pipeline

---

## 📐 Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│                       IVideoRenderer                        │
│                  (Public API - unchanged)                   │
└─────────────────────────────────────────────────────────────┘
                               ▲
                               │ implements
                               │
┌─────────────────────────────────────────────────────────────┐
│                     D3D12VideoRenderer                      │
│              (Orchestrator - format-agnostic)               │
│                                                             │
│  Responsibilities:                                          │
│  - D3D12 device, command queue, swap chain                  │
│  - Backend selection by color_space                         │
│  - Delegation to active backend                             │
│  - ~300 lines                                               │
└─────────────────────────────────────────────────────────────┘
                               │
                               │ delegates to
                               ▼
         ┌─────────────────────┴─────────────────────┐
         │                     │                     │
┌────────▼────────┐  ┌─────────▼─────────┐  ┌────────▼────────┐
│ RGBASurface     │  │ YUV420PUpload     │  │ NV12Direct      │
│ Backend         │  │ Backend           │  │ Backend         │
│                 │  │                   │  │                 │
│ Format: RGB32   │  │ Format: YUV420P   │  │ Format: NV12    │
│ Method: Surface │  │ Method: Upload    │  │ Method: Direct  │
│                 │  │                   │  │                 │
│ Source:         │  │ Source:           │  │ Source:         │
│ SimpleGPU       │  │ D3D12Video        │  │ Future          │
│ Renderer        │  │ Renderer (old)    │  │                 │
│ RGBA path       │  │                   │  │                 │
│                 │  │                   │  │                 │
│ Pipeline:       │  │ Pipeline:         │  │ Pipeline:       │
│ NVDEC NV12 →    │  │ dav1d YUV →       │  │ NVDEC NV12 →    │
│ CUDA RGBA →     │  │ CPU upload →      │  │ D3D12 NV12 →    │
│ surf2Dwrite() → │  │ GPU YUV→RGB →     │  │ Direct render → │
│ D3D12 RGBA →    │  │ Render            │  │ Present         │
│ Sampling        │  │                   │  │                 │
│                 │  │                   │  │                 │
│ ~400 lines      │  │ ~2000 lines       │  │ TBD             │
└─────────────────┘  └───────────────────┘  └─────────────────┘
```

---

## 📂 Final File Structure

```
src/Rendering/
├── IVideoRenderer.h                  # Public interface
├── D3D12VideoRenderer.h/.cpp         # Orchestrator (~300 lines)
├── IVideoBackend.h                   # Internal backend interface
│
├── RGBASurfaceBackend.h/.cpp         # RGBA Surface backend (~400 lines)
│   │   Extracted from: SimpleGPURenderer RGBA path
│   │   Handles: VAVCORE_COLOR_SPACE_RGB32
│   │   Method: CUDA Surface Objects (surf2Dwrite)
│   │   Pipeline: NVDEC → CUDA RGBA → surf2Dwrite() → D3D12 RGBA → sampling
│
├── YUV420PUploadBackend.h/.cpp       # YUV420P Upload backend (~2000 lines)
│   │   Renamed from: D3D12VideoRenderer (old)
│   │   Handles: VAVCORE_COLOR_SPACE_YUV420P
│   │   Method: CPU upload buffers + GPU shader
│   │   Pipeline: dav1d → CPU upload → GPU YUV→RGB shader → render
│
└── NV12DirectBackend.h/.cpp          # NV12 Direct backend (future)
    │   Handles: VAVCORE_COLOR_SPACE_NV12
    │   Method: Direct GPU rendering (zero-copy)
    │   Pipeline: NVDEC → D3D12 NV12 → Direct render → present

Legacy/ (archived)
└── SimpleGPURenderer_Legacy.h/.cpp   # Old mixed-format renderer
```

---

## 🎯 Backend Responsibilities

### IVideoBackend Interface

```cpp
class IVideoBackend {
public:
    virtual ~IVideoBackend() = default;

    // Lifecycle
    virtual HRESULT Initialize(
        ID3D12Device* device,
        ID3D12CommandQueue* commandQueue,
        uint32_t width,
        uint32_t height) = 0;
    virtual void Shutdown() = 0;
    virtual bool IsInitialized() const = 0;

    // Video texture for CUDA interop (nullptr if not applicable)
    virtual HRESULT CreateVideoTexture(uint32_t width, uint32_t height) = 0;
    virtual ID3D12Resource* GetVideoTexture() const = 0;

    // Render frame to back buffer
    virtual HRESULT RenderToBackBuffer(
        const VavCoreVideoFrame& frame,
        ID3D12Resource* backBuffer,
        ID3D12GraphicsCommandList* commandList) = 0;

    // Format this backend handles
    virtual VavCoreColorSpace GetSupportedFormat() const = 0;
};
```

---

### RGBASurfaceBackend

**Handles**: `VAVCORE_COLOR_SPACE_RGB32`
**Method**: CUDA Surface Objects (surf2Dwrite)

```cpp
class RGBASurfaceBackend : public IVideoBackend {
public:
    VavCoreColorSpace GetSupportedFormat() const override {
        return VAVCORE_COLOR_SPACE_RGB32;
    }

    HRESULT CreateVideoTexture(uint32_t width, uint32_t height) override;
    // Creates: DXGI_FORMAT_R8G8B8A8_UNORM texture with D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS
    // Enables CUDA Surface Object creation via cudaExternalMemoryGetMappedMipmappedArray

    HRESULT RenderToBackBuffer(
        const VavCoreVideoFrame& frame,
        ID3D12Resource* backBuffer,
        ID3D12GraphicsCommandList* commandList) override;
    // Pipeline: Simple RGBA texture sampling (no YUV conversion needed)

private:
    ComPtr<ID3D12Resource> m_rgbaTexture;          // Tiled RGBA texture
    ComPtr<ID3D12PipelineState> m_pipelineState;
    ComPtr<ID3D12RootSignature> m_rootSignature;
    // Simple texture sampling shader (no YUV conversion)
};
```

**Source**: Extracted from `SimpleGPURenderer` RGBA path
**Size**: ~400 lines
**Key Feature**: Uses CUDA Surface Objects for tiled texture write (surf2Dwrite)

---

### YUV420PUploadBackend

**Handles**: `VAVCORE_COLOR_SPACE_YUV420P`
**Method**: CPU upload buffers + GPU shader

```cpp
class YUV420PUploadBackend : public IVideoBackend {
public:
    VavCoreColorSpace GetSupportedFormat() const override {
        return VAVCORE_COLOR_SPACE_YUV420P;
    }

    HRESULT CreateVideoTexture(uint32_t width, uint32_t height) override;
    // Creates: Separate Y/U/V textures + CPU upload buffers (ring buffer system)

    HRESULT RenderToBackBuffer(
        const VavCoreVideoFrame& frame,
        ID3D12Resource* backBuffer,
        ID3D12GraphicsCommandList* commandList) override;
    // Pipeline:
    // 1. CPU writes to upload buffers (ring buffer system, persistent mapped memory)
    // 2. GPU copies upload → textures (CopyTextureRegion)
    // 3. YUV→RGB compute shader (GPU conversion)
    // 4. Render to back buffer

    // Legacy D3D12VideoRenderer methods (preserved for compatibility)
    uint8_t* GetYMappedBuffer(uint32_t bufferIndex) const;
    uint8_t* GetUMappedBuffer(uint32_t bufferIndex) const;
    uint8_t* GetVMappedBuffer(uint32_t bufferIndex) const;

private:
    // Ring buffer system (from old D3D12VideoRenderer)
    struct RingBufferSlot {
        ComPtr<ID3D12Resource> yUploadBuffer;  // D3D12_HEAP_TYPE_UPLOAD
        ComPtr<ID3D12Resource> uUploadBuffer;
        ComPtr<ID3D12Resource> vUploadBuffer;
        uint8_t* yMappedData = nullptr;        // Persistent CPU mapping
        uint8_t* uMappedData = nullptr;
        uint8_t* vMappedData = nullptr;
    };
    std::vector<RingBufferSlot> m_ringBuffers;

    ComPtr<ID3D12Resource> m_yTexture;  // GPU textures (D3D12_HEAP_TYPE_DEFAULT)
    ComPtr<ID3D12Resource> m_uTexture;
    ComPtr<ID3D12Resource> m_vTexture;

    ComPtr<ID3D12PipelineState> m_yuvToRgbPipeline;  // YUV→RGB compute shader
};
```

**Source**: Renamed from `D3D12VideoRenderer` (old)
**Size**: ~2000 lines (preserves all existing rendering logic)
**Key Feature**: Persistent CPU mapped upload buffers with ring buffer system

---

### NV12DirectBackend (Future)

**Handles**: `VAVCORE_COLOR_SPACE_NV12`
**Method**: Direct GPU rendering (zero-copy)

```cpp
class NV12DirectBackend : public IVideoBackend {
public:
    VavCoreColorSpace GetSupportedFormat() const override {
        return VAVCORE_COLOR_SPACE_NV12;
    }

    HRESULT CreateVideoTexture(uint32_t width, uint32_t height) override;
    // Creates: DXGI_FORMAT_NV12 texture (when D3D12 tiled NV12 is viable)
    // Zero-copy: NVDEC writes directly to D3D12 texture

    HRESULT RenderToBackBuffer(
        const VavCoreVideoFrame& frame,
        ID3D12Resource* backBuffer,
        ID3D12GraphicsCommandList* commandList) override;
    // Pipeline: NVDEC → D3D12 NV12 → Direct YUV→RGB shader → Render
    // No CPU involvement and no intermediate conversion pass; pure GPU path

private:
    ComPtr<ID3D12Resource> m_nv12Texture;             // Tiled NV12 texture
    ComPtr<ID3D12PipelineState> m_nv12ToRgbPipeline;  // Direct YUV→RGB shader
};
```

**Status**: Not implemented yet (requires D3D12 tiled NV12 support resolution)
**Key Feature**: Zero-copy GPU pipeline (NVDEC → D3D12 direct write)

---

## 🔄 Backend Selection Logic

```cpp
class D3D12VideoRenderer : public IVideoRenderer {
public:
    HRESULT RenderVideoFrame(const VavCoreVideoFrame& frame) override {
        // Select backend based on frame color space
        IVideoBackend* backend = SelectBackend(frame.color_space);
        if (!backend) {
            return E_FAIL;  // Unsupported format or backend initialization failed
        }

        // Get current back buffer
        ID3D12Resource* backBuffer = m_renderTargets[m_frameIndex].Get();

        // Delegate rendering to backend
        return backend->RenderToBackBuffer(frame, backBuffer, m_commandList.Get());
    }

private:
    // Backends are created lazily. A backend that fails Initialize() is not cached,
    // so the next frame retries instead of rendering through a half-built backend.
    IVideoBackend* SelectBackend(VavCoreColorSpace colorSpace) {
        switch (colorSpace) {
            case VAVCORE_COLOR_SPACE_RGB32:
                if (!m_rgbaSurfaceBackend) {
                    auto backend = std::make_unique<RGBASurfaceBackend>();
                    if (FAILED(backend->Initialize(m_device.Get(), m_commandQueue.Get(), m_width, m_height))) {
                        return nullptr;
                    }
                    m_rgbaSurfaceBackend = std::move(backend);
                }
                return m_rgbaSurfaceBackend.get();

            case VAVCORE_COLOR_SPACE_YUV420P:
                if (!m_yuv420pUploadBackend) {
                    auto backend = std::make_unique<YUV420PUploadBackend>();
                    if (FAILED(backend->Initialize(m_device.Get(), m_commandQueue.Get(), m_width, m_height))) {
                        return nullptr;
                    }
                    m_yuv420pUploadBackend = std::move(backend);
                }
                return m_yuv420pUploadBackend.get();

            case VAVCORE_COLOR_SPACE_NV12:
                // Future: NV12DirectBackend
                if (!m_nv12DirectBackend) {
                    auto backend = std::make_unique<NV12DirectBackend>();
                    if (FAILED(backend->Initialize(m_device.Get(), m_commandQueue.Get(), m_width, m_height))) {
                        return nullptr;
                    }
                    m_nv12DirectBackend = std::move(backend);
                }
                return m_nv12DirectBackend.get();

            default:
                return nullptr;
        }
    }

    std::unique_ptr<RGBASurfaceBackend> m_rgbaSurfaceBackend;      // Surface method
    std::unique_ptr<YUV420PUploadBackend> m_yuv420pUploadBackend;  // Upload method
    std::unique_ptr<NV12DirectBackend> m_nv12DirectBackend;        // Direct method (future)
};
```

---

## 📊 Naming Consistency Table

| Backend Class | Format Enum | Method | Pixel Layout | Pipeline | File Origin |
|---------------|-------------|--------|--------------|----------|-------------|
| `RGBASurfaceBackend` | `VAVCORE_COLOR_SPACE_RGB32` | Surface | RGBA (4 bytes/pixel) | NVDEC → CUDA surf2Dwrite() → D3D12 | SimpleGPURenderer |
| `YUV420PUploadBackend` | `VAVCORE_COLOR_SPACE_YUV420P` | Upload | Planar YUV 4:2:0 | dav1d → CPU upload → GPU shader | D3D12VideoRenderer (old) |
| `NV12DirectBackend` | `VAVCORE_COLOR_SPACE_NV12` | Direct | Semi-planar NV12 | NVDEC → D3D12 direct → Render | Future |

**Naming Rule**: `{PixelFormat}{Method}Backend`
- **Format-first**: Clear pixel format (RGBA, YUV420P, NV12)
- **Method-second**: Rendering method (Surface, Upload, Direct)
- **Direct 1:1 mapping**: VavCoreColorSpace enum → backend class
- **No ambiguity**: Method names describe how data reaches the GPU texture, not where decoding runs

---

## 📝 Implementation Plan

### Phase 1: Create Backend Infrastructure

**Goal**: Establish base interfaces and RGBA Surface backend

**Tasks**:
1. Create `IVideoBackend.h` interface
2. Create `RGBASurfaceBackend.h/.cpp`
3. Extract RGBA Surface logic from SimpleGPURenderer
4. Test RGBASurfaceBackend independently

**Estimated Time**: 2 hours

---

### Phase 2: Transform D3D12VideoRenderer → YUV420PUploadBackend

**Goal**: Repurpose existing code as Upload backend

**Tasks**:
1. Rename files: `D3D12VideoRenderer.*` → `YUV420PUploadBackend.*`
2. Rename class: `D3D12VideoRenderer` → `YUV420PUploadBackend`
3. Implement `IVideoBackend` interface
4. Remove swap chain ownership (delegate to orchestrator)
5. Test YUV420PUploadBackend independently

**Estimated Time**: 1.5 hours

---

### Phase 3: Create New D3D12VideoRenderer Orchestrator

**Goal**: Build thin orchestrator from scratch

**Tasks**:
1. Create new `D3D12VideoRenderer.h/.cpp`
2. Implement IVideoRenderer interface
3. Implement backend selection logic
4. Test with RGBASurfaceBackend
5. Test with YUV420PUploadBackend
6. Test dynamic backend switching

**Estimated Time**: 1.5 hours

---

### Phase 4: Archive Legacy Code

**Goal**: Clean up old SimpleGPURenderer

**Tasks**:
1. Create `src/Rendering/Legacy/` directory
2. Move `SimpleGPURenderer` → `SimpleGPURenderer_Legacy`
3. Update all references to new `D3D12VideoRenderer`
4. Verify all tests pass
5. Update documentation

**Estimated Time**: 1 hour

**Total Estimated Time**: 6 hours

---

## ✅ Success Criteria

### Functional
- ✅ NVDEC RGBA rendering works (via RGBASurfaceBackend)
- ✅ CPU YUV rendering works (via YUV420PUploadBackend)
- ✅ Backend auto-selection by color_space
- ✅ No visual regressions
- ✅ All existing tests pass

### Code Quality
- ✅ D3D12VideoRenderer < 400 lines
- ✅ Each backend handles exactly 1 format with 1 method
- ✅ Consistent format+method naming (Surface/Upload/Direct)
- ✅ No format-specific if/else in orchestrator beyond the backend selection switch

### Maintainability
- ✅ Adding new format = add `{Format}{Method}Backend` class plus one selection case
- ✅ Each backend independently testable
- ✅ Clear mapping: `VavCoreColorSpace` → Backend class → Rendering method

---

## 🎯 Why This Design Wins

### 1. Naming Clarity

```cpp
// Clear from class name what format AND method it uses:
RGBASurfaceBackend    → RGB32 format + CUDA Surface write
YUV420PUploadBackend  → YUV420P format + CPU upload buffers
NV12DirectBackend     → NV12 format + Direct GPU rendering
```

### 2. Code Reuse

```cpp
// Proven rendering logic is kept as-is; only swap chain ownership moves to the orchestrator:
D3D12VideoRenderer (old, 2581 lines) → YUV420PUploadBackend (~2000 lines, same rendering logic)
```

### 3. Extensibility

```cpp
// Adding a new format+method touches the orchestrator in one place.
// VAVCORE_COLOR_SPACE_P010 and P010UploadBackend are hypothetical names for illustration.
case VAVCORE_COLOR_SPACE_P010:
    return m_p010UploadBackend.get();  // One new case, plus the new backend class
```

### 4. Testability

```cpp
// Each backend tests independently:
TEST(RGBASurfaceBackend, RenderFrame) {
    VavCoreVideoFrame frame{};  // Zero-initialize so unused fields are well-defined
    frame.color_space = VAVCORE_COLOR_SPACE_RGB32;
    // Test RGBA Surface rendering in isolation
}
```

---

## 📚 References

- **VavCore Color Space**: `VavCore/VavCore.h` → `VavCoreColorSpace` enum
- **Old Code**: `D3D12VideoRenderer.cpp` (2581 lines, YUV420P)
- **Old Code**: `SimpleGPURenderer.cpp` (2105 lines, mixed RGBA/YUV)
- **Previous Design**: `SimpleGPURenderer_Layered_Architecture_Design_v2.md`

---

**Status**: ✅ **FINAL DESIGN APPROVED (v3)**
**Key Decision**: Format + Method naming (`{PixelFormat}{Method}Backend`)
**Approved Methods**: Surface, Upload, Direct (NO Hardware/Software)
**Next Step**: Begin Phase 1 - Create IVideoBackend + RGBASurfaceBackend
**Total Estimated Time**: 6 hours (4 phases)

---

**Document Revision History**:
- **v1**: Initial format-based naming (CPUVideoBackend - rejected)
- **v2**: Reuse D3D12VideoRenderer as backend (approved structure)
- **v3**: Final naming with Surface/Upload/Direct methods (current) ✅
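
---

## 🧪 Appendix: Standalone Selection Sketch

The backend selection described in this document (one backend per color space, created lazily, never cached if initialization fails) can be exercised without D3D12 or CUDA. The sketch below is a minimal illustration under stated assumptions, not the project's implementation: `ColorSpace`, `IBackend`, `StubBackend`, `BackendSelector`, and `MakeDefaultSelector` are hypothetical stand-ins for `VavCoreColorSpace`, `IVideoBackend`, the real backends, and the orchestrator. It swaps the `switch` statement for a small factory table, which is one possible way to keep format-specific code out of the orchestrator.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for VavCoreColorSpace; the real enum lives in VavCore.h.
enum class ColorSpace { RGB32 = 0, YUV420P = 1, NV12 = 2, Count = 3 };

// Reduced stand-in for IVideoBackend with no D3D12 types, so it compiles anywhere.
class IBackend {
public:
    virtual ~IBackend() = default;
    virtual bool Initialize(unsigned width, unsigned height) = 0;
    virtual ColorSpace GetSupportedFormat() const = 0;
    virtual std::string Name() const = 0;
};

// Stub that only records which format it serves and what it is called.
class StubBackend final : public IBackend {
public:
    StubBackend(ColorSpace format, std::string name)
        : m_format(format), m_name(std::move(name)) {}
    // Stand-in for real resource creation: only rejects an empty surface.
    bool Initialize(unsigned width, unsigned height) override {
        return width > 0 && height > 0;
    }
    ColorSpace GetSupportedFormat() const override { return m_format; }
    std::string Name() const override { return m_name; }
private:
    ColorSpace m_format;
    std::string m_name;
};

// Table-driven selector: one slot per color space, filled on first use.
class BackendSelector {
public:
    using Factory = std::function<std::unique_ptr<IBackend>()>;

    void Register(ColorSpace format, Factory factory) {
        m_slots[Index(format)].factory = std::move(factory);
    }

    // Returns the backend for `format`, or nullptr if the format is not
    // registered or the backend failed to initialize.
    IBackend* Select(ColorSpace format, unsigned width, unsigned height) {
        if (Index(format) >= m_slots.size()) return nullptr;
        Slot& slot = m_slots[Index(format)];
        if (slot.backend) return slot.backend.get();   // Already created
        if (!slot.factory) return nullptr;             // e.g. NV12 (future)
        std::unique_ptr<IBackend> candidate = slot.factory();
        if (!candidate || !candidate->Initialize(width, height)) return nullptr;
        assert(candidate->GetSupportedFormat() == format);
        slot.backend = std::move(candidate);           // Cache only on success
        return slot.backend.get();
    }

private:
    struct Slot {
        Factory factory;
        std::unique_ptr<IBackend> backend;
    };
    static std::size_t Index(ColorSpace f) { return static_cast<std::size_t>(f); }
    std::array<Slot, static_cast<std::size_t>(ColorSpace::Count)> m_slots;
};

// Registers the two backends this design implements first; NV12 stays empty.
inline BackendSelector MakeDefaultSelector() {
    BackendSelector selector;
    selector.Register(ColorSpace::RGB32, [] {
        return std::make_unique<StubBackend>(ColorSpace::RGB32, "RGBASurfaceBackend");
    });
    selector.Register(ColorSpace::YUV420P, [] {
        return std::make_unique<StubBackend>(ColorSpace::YUV420P, "YUV420PUploadBackend");
    });
    return selector;
}
```

Calling `MakeDefaultSelector().Select(ColorSpace::RGB32, 1920, 1080)` yields the RGBA stub, a second call with the same format returns the same instance, and `ColorSpace::NV12` yields `nullptr` until a factory is registered. Whether a table like this is preferable to the explicit `switch` is a judgment call: the table makes registration data-driven, while the `switch` keeps the mapping visible in one place.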