D3D12VideoRenderer Layered Architecture - Final Design v3

Date: 2025-10-06
Status: FINAL APPROVED DESIGN - Format + Method Naming Convention
Supersedes: SimpleGPURenderer_Layered_Architecture_Design_v2.md
Key Decision: Use Surface/Upload/Direct method naming (NO Hardware/Software)


🎯 Final Naming Convention

Format: {PixelFormat}{Method}Backend

Approved Methods:

  • Surface: CUDA Surface Objects for tiled texture write
  • Upload: CPU upload buffers + GPU compute shader
  • Direct: Direct GPU rendering (future)

Rejected Methods:

  • Hardware/Software - Ambiguous: both paths run GPU shaders, so the label does not say how frame data reaches the texture

📊 Final Backend Architecture

D3D12VideoRenderer (orchestrator)
├── RGBASurfaceBackend       (handles VAVCORE_COLOR_SPACE_RGB32)
├── YUV420PUploadBackend     (handles VAVCORE_COLOR_SPACE_YUV420P)
└── NV12DirectBackend        (handles VAVCORE_COLOR_SPACE_NV12) [future]

File Mapping:

| Old Code | New Backend | Format + Method | Implementation |
|---|---|---|---|
| SimpleGPURenderer RGBA | RGBASurfaceBackend | RGB32 + Surface | NVDEC → CUDA RGBA → surf2Dwrite() → D3D12 |
| D3D12VideoRenderer (old) | YUV420PUploadBackend | YUV420P + Upload | dav1d → CPU upload → GPU YUV→RGB shader |
| Future NV12 | NV12DirectBackend | NV12 + Direct | NVDEC → D3D12 NV12 → Direct rendering |

Benefits:

  • Format clarity: First word = pixel format (RGBA, YUV420P, NV12)
  • Method clarity: Second word = rendering method (Surface, Upload, Direct)
  • Direct mapping: Easy to map VavCoreColorSpace → backend class
  • No ambiguity: "Surface" = CUDA Surface Objects, "Upload" = CPU buffers, "Direct" = GPU-direct

Code Example:

void D3D12VideoRenderer::SelectBackend(const VavCoreVideoFrame& frame) {
    switch (frame.color_space) {
        case VAVCORE_COLOR_SPACE_RGB32:
            m_activeBackend = m_rgbaSurfaceBackend.get();  // Surface method
            break;
        case VAVCORE_COLOR_SPACE_YUV420P:
            m_activeBackend = m_yuv420pUploadBackend.get();  // Upload method
            break;
        case VAVCORE_COLOR_SPACE_NV12:
            m_activeBackend = m_nv12DirectBackend.get();  // Direct method
            break;
        default:
            m_activeBackend = nullptr;  // Unsupported color space
            break;
    }
}

🚫 Rejected Naming Approaches

Hardware/Software Naming

// REJECTED - Too implementation-focused
RGBAHardwareBackend  // What "hardware"? GPU? NVDEC? Confusing
YUV420PSoftwareBackend  // Still uses GPU shaders, not really "software"

Why rejected: "Hardware/Software" is ambiguous about the rendering path. Both backends render on the GPU, so the label says nothing about how frame data reaches the texture.


Why Surface/Upload/Direct Works Better

Surface (CUDA Surface Objects):

  • Describes the actual mechanism: Writing to D3D12 tiled textures via CUDA surfaces
  • Clear technical distinction from linear buffers
  • Indicates GPU-direct write capability

Upload (CPU Upload Buffers):

  • Describes the actual mechanism: CPU writes to upload heaps → GPU copy
  • Familiar concept in graphics programming
  • Indicates CPU involvement in data transfer

Direct (Direct GPU Rendering):

  • Describes the actual mechanism: GPU samples the decoded NV12 texture directly, with no intermediate copy or separate conversion pass
  • Future-proof naming for hardware-decoded NV12
  • Indicates zero-copy GPU pipeline

📐 Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│                    IVideoRenderer                           │
│  (Public API - unchanged)                                   │
└─────────────────────────────────────────────────────────────┘
                            ▲
                            │ implements
                            │
┌─────────────────────────────────────────────────────────────┐
│                  D3D12VideoRenderer                         │
│  (Orchestrator - format-agnostic)                          │
│                                                             │
│  Responsibilities:                                          │
│  - D3D12 device, command queue, swap chain                 │
│  - Backend selection by color_space                        │
│  - Delegation to active backend                            │
│  - ~300 lines                                              │
└─────────────────────────────────────────────────────────────┘
                            │
                            │ delegates to
                            ▼
        ┌───────────────────┴───────────────────────────┐
        │                   │                           │
┌───────▼─────────┐  ┌──────▼────────────┐  ┌──────▼──────────┐
│ RGBASurface     │  │ YUV420PUpload     │  │ NV12Direct      │
│ Backend         │  │ Backend           │  │ Backend         │
│                 │  │                   │  │                 │
│ Format: RGB32   │  │ Format: YUV420P   │  │ Format: NV12    │
│ Method: Surface │  │ Method: Upload    │  │ Method: Direct  │
│                 │  │                   │  │                 │
│ Source:         │  │ Source:           │  │ Source:         │
│ SimpleGPU       │  │ D3D12Video        │  │ Future          │
│ Renderer        │  │ Renderer (old)    │  │                 │
│ RGBA path       │  │                   │  │                 │
│                 │  │                   │  │                 │
│ Pipeline:       │  │ Pipeline:         │  │ Pipeline:       │
│ NVDEC NV12 →    │  │ dav1d YUV →       │  │ NVDEC NV12 →    │
│ CUDA RGBA →     │  │ CPU upload →      │  │ D3D12 NV12 →    │
│ surf2Dwrite() → │  │ GPU YUV→RGB →     │  │ Direct render → │
│ D3D12 RGBA →    │  │ Render            │  │ Present         │
│ Sampling        │  │                   │  │                 │
│                 │  │                   │  │                 │
│ ~400 lines      │  │ ~2000 lines       │  │ TBD             │
└─────────────────┘  └───────────────────┘  └─────────────────┘

📂 Final File Structure

src/Rendering/
├── IVideoRenderer.h                    # Public interface
├── D3D12VideoRenderer.h/.cpp          # Orchestrator (~300 lines)
├── IVideoBackend.h                     # Internal backend interface
│
├── RGBASurfaceBackend.h/.cpp          # RGBA Surface backend (~400 lines)
│   │ Extracted from: SimpleGPURenderer RGBA path
│   │ Handles: VAVCORE_COLOR_SPACE_RGB32
│   │ Method: CUDA Surface Objects (surf2Dwrite)
│   │ Pipeline: NVDEC → CUDA RGBA → surf2Dwrite() → D3D12 RGBA → sampling
│
├── YUV420PUploadBackend.h/.cpp        # YUV420P Upload backend (~2000 lines)
│   │ Renamed from: D3D12VideoRenderer (old)
│   │ Handles: VAVCORE_COLOR_SPACE_YUV420P
│   │ Method: CPU upload buffers + GPU shader
│   │ Pipeline: dav1d → CPU upload → GPU YUV→RGB shader → render
│
└── NV12DirectBackend.h/.cpp           # NV12 Direct backend (future)
    │ Handles: VAVCORE_COLOR_SPACE_NV12
    │ Method: Direct GPU rendering (zero-copy)
    │ Pipeline: NVDEC → D3D12 NV12 → Direct render → present

Legacy/ (archived)
└── SimpleGPURenderer_Legacy.h/.cpp    # Old mixed-format renderer

🎯 Backend Responsibilities

IVideoBackend Interface

class IVideoBackend {
public:
    virtual ~IVideoBackend() = default;

    // Lifecycle
    virtual HRESULT Initialize(
        ID3D12Device* device,
        ID3D12CommandQueue* commandQueue,
        uint32_t width, uint32_t height) = 0;

    virtual void Shutdown() = 0;
    virtual bool IsInitialized() const = 0;

    // Video texture for CUDA interop (nullptr if not applicable)
    virtual HRESULT CreateVideoTexture(uint32_t width, uint32_t height) = 0;
    virtual ID3D12Resource* GetVideoTexture() const = 0;

    // Render frame to back buffer
    virtual HRESULT RenderToBackBuffer(
        const VavCoreVideoFrame& frame,
        ID3D12Resource* backBuffer,
        ID3D12GraphicsCommandList* commandList) = 0;

    // Format this backend handles
    virtual VavCoreColorSpace GetSupportedFormat() const = 0;
};
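One way to exercise this interface without a GPU is a recording test double. The sketch below is a minimal illustration under stated assumptions: the `HRESULT`, `S_OK`, `E_FAIL`, D3D12 and `VavCore` declarations are stand-ins so the snippet compiles without `<d3d12.h>` or `VavCore.h`, and `RecordingBackend` is a hypothetical name that does not appear in the design.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins (assumptions) so the sketch builds without <d3d12.h> / VavCore.h.
using HRESULT = std::int32_t;
constexpr HRESULT S_OK = 0;
constexpr HRESULT E_FAIL = -2147467259;  // 0x80004005 as a signed 32-bit value
struct ID3D12Device;
struct ID3D12CommandQueue;
struct ID3D12Resource;
struct ID3D12GraphicsCommandList;
enum VavCoreColorSpace {
    VAVCORE_COLOR_SPACE_RGB32, VAVCORE_COLOR_SPACE_YUV420P, VAVCORE_COLOR_SPACE_NV12
};
struct VavCoreVideoFrame { VavCoreColorSpace color_space; };

// Same shape as the interface above.
class IVideoBackend {
public:
    virtual ~IVideoBackend() = default;
    virtual HRESULT Initialize(ID3D12Device*, ID3D12CommandQueue*,
                               std::uint32_t width, std::uint32_t height) = 0;
    virtual void Shutdown() = 0;
    virtual bool IsInitialized() const = 0;
    virtual HRESULT CreateVideoTexture(std::uint32_t width, std::uint32_t height) = 0;
    virtual ID3D12Resource* GetVideoTexture() const = 0;
    virtual HRESULT RenderToBackBuffer(const VavCoreVideoFrame& frame,
                                       ID3D12Resource* backBuffer,
                                       ID3D12GraphicsCommandList* commandList) = 0;
    virtual VavCoreColorSpace GetSupportedFormat() const = 0;
};

// Hypothetical test double: counts frames instead of touching the GPU.
class RecordingBackend final : public IVideoBackend {
public:
    explicit RecordingBackend(VavCoreColorSpace format) : m_format(format) {}

    HRESULT Initialize(ID3D12Device*, ID3D12CommandQueue*,
                       std::uint32_t width, std::uint32_t height) override {
        if (width == 0 || height == 0) return E_FAIL;
        m_initialized = true;
        return S_OK;
    }
    void Shutdown() override { m_initialized = false; }
    bool IsInitialized() const override { return m_initialized; }
    HRESULT CreateVideoTexture(std::uint32_t, std::uint32_t) override { return S_OK; }
    ID3D12Resource* GetVideoTexture() const override { return nullptr; }

    // Rejects frames before Initialize() and frames of a different format.
    HRESULT RenderToBackBuffer(const VavCoreVideoFrame& frame, ID3D12Resource*,
                               ID3D12GraphicsCommandList*) override {
        if (!m_initialized || frame.color_space != m_format) return E_FAIL;
        ++m_framesRendered;
        return S_OK;
    }
    VavCoreColorSpace GetSupportedFormat() const override { return m_format; }
    int FramesRendered() const { return m_framesRendered; }

private:
    VavCoreColorSpace m_format;
    bool m_initialized = false;
    int m_framesRendered = 0;
};

// Usage: a backend only accepts frames of its own format once initialized.
inline void RecordingBackendExample() {
    RecordingBackend backend(VAVCORE_COLOR_SPACE_RGB32);
    VavCoreVideoFrame rgba{VAVCORE_COLOR_SPACE_RGB32};
    assert(backend.RenderToBackBuffer(rgba, nullptr, nullptr) == E_FAIL);
    assert(backend.Initialize(nullptr, nullptr, 64, 64) == S_OK);
    assert(backend.RenderToBackBuffer(rgba, nullptr, nullptr) == S_OK);
}
```

A double like this lets the orchestrator's selection and delegation logic be tested separately from the real Surface and Upload backends.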

RGBASurfaceBackend

Handles: VAVCORE_COLOR_SPACE_RGB32
Method: CUDA Surface Objects (surf2Dwrite)

class RGBASurfaceBackend : public IVideoBackend {
public:
    VavCoreColorSpace GetSupportedFormat() const override {
        return VAVCORE_COLOR_SPACE_RGB32;
    }

    HRESULT CreateVideoTexture(uint32_t width, uint32_t height) override;
    // Creates: DXGI_FORMAT_R8G8B8A8_UNORM texture with D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS
    // Enables CUDA Surface Object creation via cudaExternalMemoryGetMappedMipmappedArray

    HRESULT RenderToBackBuffer(...) override;
    // Pipeline: Simple RGBA texture sampling (no YUV conversion needed)

private:
    ComPtr<ID3D12Resource> m_rgbaTexture;  // Tiled RGBA texture
    ComPtr<ID3D12PipelineState> m_pipelineState;
    ComPtr<ID3D12RootSignature> m_rootSignature;
    // Simple texture sampling shader (no YUV conversion)
};

Source: Extracted from SimpleGPURenderer RGBA path
Size: ~400 lines
Key Feature: Uses CUDA Surface Objects for tiled texture write (surf2Dwrite)


YUV420PUploadBackend

Handles: VAVCORE_COLOR_SPACE_YUV420P
Method: CPU upload buffers + GPU shader

class YUV420PUploadBackend : public IVideoBackend {
public:
    VavCoreColorSpace GetSupportedFormat() const override {
        return VAVCORE_COLOR_SPACE_YUV420P;
    }

    HRESULT CreateVideoTexture(uint32_t width, uint32_t height) override;
    // Creates: Separate Y/U/V textures + CPU upload buffers (ring buffer system)

    HRESULT RenderToBackBuffer(...) override;
    // Pipeline:
    // 1. CPU writes to upload buffers (ring buffer system, persistent mapped memory)
    // 2. GPU copies upload → textures (CopyTextureRegion)
    // 3. YUV→RGB compute shader (GPU conversion)
    // 4. Render to back buffer

    // Legacy D3D12VideoRenderer methods (preserved for compatibility)
    uint8_t* GetYMappedBuffer(uint32_t bufferIndex) const;
    uint8_t* GetUMappedBuffer(uint32_t bufferIndex) const;
    uint8_t* GetVMappedBuffer(uint32_t bufferIndex) const;

private:
    // Ring buffer system (from old D3D12VideoRenderer)
    struct RingBufferSlot {
        ComPtr<ID3D12Resource> yUploadBuffer;  // D3D12_HEAP_TYPE_UPLOAD
        ComPtr<ID3D12Resource> uUploadBuffer;
        ComPtr<ID3D12Resource> vUploadBuffer;
        uint8_t* yMappedData;  // Persistent CPU mapping
        uint8_t* uMappedData;
        uint8_t* vMappedData;
    };
    std::vector<RingBufferSlot> m_ringBuffers;

    ComPtr<ID3D12Resource> m_yTexture;  // GPU textures (D3D12_HEAP_TYPE_DEFAULT)
    ComPtr<ID3D12Resource> m_uTexture;
    ComPtr<ID3D12Resource> m_vTexture;
    ComPtr<ID3D12PipelineState> m_yuvToRgbPipeline;  // YUV→RGB compute shader
};

Source: Renamed from D3D12VideoRenderer (old)
Size: ~2000 lines (preserves all existing logic)
Key Feature: Persistent CPU mapped upload buffers with ring buffer system
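Step 1 of the Upload pipeline (CPU writes into a ring-buffer slot) can be sketched on the CPU side alone. D3D12 requires the row pitch of buffer-to-texture copies to be a multiple of 256 bytes (`D3D12_TEXTURE_DATA_PITCH_ALIGNMENT`), so decoder planes usually need a row-by-row copy into the mapped memory. The helper names below (`AlignUp`, `RingSlotForFrame`, `CopyPlaneToUpload`, `MakeStagingPlane`) are hypothetical, the modulo slot policy is an assumption, and whether the existing renderer copies exactly this way is not stated in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for D3D12_TEXTURE_DATA_PITCH_ALIGNMENT from <d3d12.h>.
constexpr std::size_t kPitchAlignment = 256;

// Rounds value up to the next multiple of alignment.
constexpr std::size_t AlignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Picks the ring slot for a frame (assumed policy: simple modulo).
inline std::size_t RingSlotForFrame(std::uint64_t frameNumber, std::size_t slotCount) {
    assert(slotCount > 0);
    return static_cast<std::size_t>(frameNumber % slotCount);
}

// Copies one plane row by row into a pitch-aligned destination.
// In the backend, dst would be a persistently mapped pointer such as yMappedData.
inline void CopyPlaneToUpload(const std::uint8_t* src, std::size_t srcStride,
                              std::uint8_t* dst, std::size_t dstPitch,
                              std::size_t widthBytes, std::size_t rows) {
    assert(srcStride >= widthBytes && dstPitch >= widthBytes);
    for (std::size_t row = 0; row < rows; ++row) {
        std::memcpy(dst + row * dstPitch, src + row * srcStride, widthBytes);
    }
}

// CPU-only stand-in for an upload buffer, sized for one pitch-aligned plane.
inline std::vector<std::uint8_t> MakeStagingPlane(std::size_t widthBytes, std::size_t rows) {
    return std::vector<std::uint8_t>(AlignUp(widthBytes, kPitchAlignment) * rows);
}
```

For a 1920-pixel-wide Y plane the aligned pitch is 2048 bytes, and for the 960-pixel-wide U and V planes it is 1024 bytes. A tightly packed decoder plane therefore cannot be copied into the upload buffer with a single `memcpy` unless its stride already matches.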


NV12DirectBackend (Future)

Handles: VAVCORE_COLOR_SPACE_NV12
Method: Direct GPU rendering (zero-copy)

class NV12DirectBackend : public IVideoBackend {
public:
    VavCoreColorSpace GetSupportedFormat() const override {
        return VAVCORE_COLOR_SPACE_NV12;
    }

    HRESULT CreateVideoTexture(uint32_t width, uint32_t height) override;
    // Creates: DXGI_FORMAT_NV12 texture (when D3D12 tiled NV12 is viable)
    // Zero-copy: NVDEC writes directly to D3D12 texture

    HRESULT RenderToBackBuffer(...) override;
    // Pipeline: NVDEC → D3D12 NV12 → Direct YUV→RGB shader → Render
    // No CPU involvement and no intermediate copy; YUV→RGB happens in the render shader

private:
    ComPtr<ID3D12Resource> m_nv12Texture;  // Tiled NV12 texture
    ComPtr<ID3D12PipelineState> m_nv12ToRgbPipeline;  // Direct YUV→RGB shader
};

Status: Not implemented yet (requires D3D12 tiled NV12 support resolution)
Key Feature: Zero-copy GPU pipeline (NVDEC → D3D12 direct write)
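As a CPU reference for what the NV12 shader has to compute per pixel, the sketch below reads one NV12 sample and converts it to RGB. It assumes BT.709 limited-range video and a shared pitch for both planes. This document does not say which matrix or range the shaders use, so treat the coefficients as an assumption. `Rgb8`, `YuvToRgbBt709` and `SampleNv12` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

struct Rgb8 { std::uint8_t r, g, b; };

// Rounds to nearest and clamps to the 0..255 range.
inline std::uint8_t ClampToByte(double v) {
    return static_cast<std::uint8_t>(std::lround(std::min(255.0, std::max(0.0, v))));
}

// BT.709, limited range (Y 16..235, U/V 16..240) to full-range RGB.
// Assumption: the matrix and range are not specified by the design.
inline Rgb8 YuvToRgbBt709(std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    const double luma = (y - 16) * (255.0 / 219.0);
    const double cb = (u - 128) * (255.0 / 224.0);
    const double cr = (v - 128) * (255.0 / 224.0);
    return { ClampToByte(luma + 1.5748 * cr),
             ClampToByte(luma - 0.1873 * cb - 0.4681 * cr),
             ClampToByte(luma + 1.8556 * cb) };
}

// NV12 layout: a full-resolution Y plane, then one half-height plane of
// interleaved U,V byte pairs. Each 2x2 block of pixels shares one U,V pair.
inline Rgb8 SampleNv12(const std::uint8_t* yPlane, const std::uint8_t* uvPlane,
                       std::size_t pitch, std::size_t x, std::size_t y) {
    assert(yPlane != nullptr && uvPlane != nullptr);
    const std::uint8_t luma = yPlane[y * pitch + x];
    const std::uint8_t* uv = uvPlane + (y / 2) * pitch + (x / 2) * 2;
    return YuvToRgbBt709(luma, uv[0], uv[1]);
}
```

A reference like this is mainly useful for comparing a few known pixels against rendered output when checking for visual regressions.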


🔄 Backend Selection Logic

class D3D12VideoRenderer : public IVideoRenderer {
public:
    HRESULT RenderVideoFrame(const VavCoreVideoFrame& frame) override {
        // Select backend based on frame color space
        IVideoBackend* backend = SelectBackend(frame.color_space);
        if (!backend) {
            return E_FAIL;
        }

        // Get current back buffer
        ID3D12Resource* backBuffer = m_renderTargets[m_frameIndex].Get();

        // Delegate rendering to backend
        return backend->RenderToBackBuffer(frame, backBuffer, m_commandList.Get());
    }

private:
    IVideoBackend* SelectBackend(VavCoreColorSpace colorSpace) {
        switch (colorSpace) {
            case VAVCORE_COLOR_SPACE_RGB32:
                if (!m_rgbaSurfaceBackend) {
                    auto backend = std::make_unique<RGBASurfaceBackend>();
                    if (FAILED(backend->Initialize(m_device.Get(), m_commandQueue.Get(),
                                                   m_width, m_height))) {
                        return nullptr;  // Initialization failed; caller returns E_FAIL
                    }
                    m_rgbaSurfaceBackend = std::move(backend);
                }
                return m_rgbaSurfaceBackend.get();

            case VAVCORE_COLOR_SPACE_YUV420P:
                if (!m_yuv420pUploadBackend) {
                    auto backend = std::make_unique<YUV420PUploadBackend>();
                    if (FAILED(backend->Initialize(m_device.Get(), m_commandQueue.Get(),
                                                   m_width, m_height))) {
                        return nullptr;  // Initialization failed; caller returns E_FAIL
                    }
                    m_yuv420pUploadBackend = std::move(backend);
                }
                return m_yuv420pUploadBackend.get();

            case VAVCORE_COLOR_SPACE_NV12:
                // Future: NV12DirectBackend
                if (!m_nv12DirectBackend) {
                    auto backend = std::make_unique<NV12DirectBackend>();
                    if (FAILED(backend->Initialize(m_device.Get(), m_commandQueue.Get(),
                                                   m_width, m_height))) {
                        return nullptr;  // Initialization failed; caller returns E_FAIL
                    }
                    m_nv12DirectBackend = std::move(backend);
                }
                return m_nv12DirectBackend.get();

            default:
                return nullptr;  // Unsupported color space
        }
    }

    std::unique_ptr<RGBASurfaceBackend> m_rgbaSurfaceBackend;      // Surface method
    std::unique_ptr<YUV420PUploadBackend> m_yuv420pUploadBackend;  // Upload method
    std::unique_ptr<NV12DirectBackend> m_nv12DirectBackend;        // Direct method (future)
};
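The lazy create-on-first-use pattern above can also be expressed with a factory table, which keeps the selection code unchanged when a backend is added. This is an alternative sketch rather than the approved design: `IBackendSketch` is a reduced stand-in for `IVideoBackend`, the enum is a stand-in for the one in `VavCore.h`, and `BackendSelector`, `AlwaysOkBackend` and `AlwaysFailBackend` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <utility>

// Stand-in for the enum declared in VavCore.h (assumption).
enum VavCoreColorSpace {
    VAVCORE_COLOR_SPACE_RGB32, VAVCORE_COLOR_SPACE_YUV420P, VAVCORE_COLOR_SPACE_NV12
};

// Reduced stand-in for IVideoBackend: only what selection needs.
struct IBackendSketch {
    virtual ~IBackendSketch() = default;
    virtual bool Initialize(std::uint32_t width, std::uint32_t height) = 0;
};
struct AlwaysOkBackend final : IBackendSketch {
    bool Initialize(std::uint32_t, std::uint32_t) override { return true; }
};
struct AlwaysFailBackend final : IBackendSketch {
    bool Initialize(std::uint32_t, std::uint32_t) override { return false; }
};

class BackendSelector {
public:
    using Factory = std::function<std::unique_ptr<IBackendSketch>()>;

    BackendSelector(std::uint32_t width, std::uint32_t height)
        : m_width(width), m_height(height) {}

    void Register(VavCoreColorSpace format, Factory factory) {
        assert(factory && "factory must be callable");
        m_factories[format] = std::move(factory);
    }

    // Returns the cached backend, creating it on first use. Returns nullptr
    // when the format is unregistered or initialization fails; a failed
    // backend is not cached, so a later call retries.
    IBackendSketch* Select(VavCoreColorSpace format) {
        auto cached = m_backends.find(format);
        if (cached != m_backends.end()) return cached->second.get();

        auto factory = m_factories.find(format);
        if (factory == m_factories.end()) return nullptr;

        std::unique_ptr<IBackendSketch> backend = factory->second();
        if (!backend || !backend->Initialize(m_width, m_height)) return nullptr;

        IBackendSketch* raw = backend.get();
        m_backends[format] = std::move(backend);
        return raw;
    }

private:
    std::uint32_t m_width;
    std::uint32_t m_height;
    std::map<VavCoreColorSpace, Factory> m_factories;
    std::map<VavCoreColorSpace, std::unique_ptr<IBackendSketch>> m_backends;
};
```

The trade-off is one extra indirection per lookup in exchange for registering backends in one place. The explicit `switch` is easier to read while there are only three formats.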

📊 Naming Consistency Table

| Backend Class | Format Enum | Method | Pixel Layout | Pipeline | File Origin |
|---|---|---|---|---|---|
| RGBASurfaceBackend | VAVCORE_COLOR_SPACE_RGB32 | Surface | RGBA (4 bytes/pixel) | NVDEC → CUDA surf2Dwrite() → D3D12 | SimpleGPURenderer |
| YUV420PUploadBackend | VAVCORE_COLOR_SPACE_YUV420P | Upload | Planar YUV 4:2:0 | dav1d → CPU upload → GPU shader | D3D12VideoRenderer (old) |
| NV12DirectBackend | VAVCORE_COLOR_SPACE_NV12 | Direct | Semi-planar NV12 | NVDEC → D3D12 direct → Render | Future |
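To make the pixel layouts concrete, the sketch below computes the size of one tightly packed frame for each format, ignoring any row padding. The enum is a stand-in for the one in `VavCore.h`, and `PackedFrameBytes` is a hypothetical helper used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the enum declared in VavCore.h (assumption).
enum VavCoreColorSpace {
    VAVCORE_COLOR_SPACE_RGB32, VAVCORE_COLOR_SPACE_YUV420P, VAVCORE_COLOR_SPACE_NV12
};

// Bytes in one tightly packed frame (no row padding).
inline std::size_t PackedFrameBytes(VavCoreColorSpace format,
                                    std::size_t width, std::size_t height) {
    // 4:2:0 chroma is subsampled 2x in both directions; assume even dimensions.
    assert(width % 2 == 0 && height % 2 == 0);
    switch (format) {
        case VAVCORE_COLOR_SPACE_RGB32:
            return width * height * 4;  // 4 bytes per pixel
        case VAVCORE_COLOR_SPACE_YUV420P:
            // Y plane plus separate quarter-size U and V planes.
            return width * height + 2 * ((width / 2) * (height / 2));
        case VAVCORE_COLOR_SPACE_NV12:
            // Y plane plus one half-height plane of interleaved U,V pairs.
            return width * height + width * (height / 2);
    }
    return 0;  // Unknown format
}
```

At 1920x1080 this gives 8,294,400 bytes for RGB32 and 3,110,400 bytes for both YUV420P and NV12. The two YUV formats carry the same amount of data and differ only in how the chroma samples are arranged.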

Naming Rule: {PixelFormat}{Method}Backend

  • Format-first: Clear pixel format (RGBA, YUV420P, NV12)
  • Method-second: Rendering method (Surface, Upload, Direct)
  • Direct 1:1 mapping: VavCoreColorSpace enum → backend class
  • No ambiguity: Method names describe the actual data path to the texture, not a vague hardware/software split

📝 Implementation Plan

Phase 1: Create Backend Infrastructure

Goal: Establish base interfaces and RGBA Surface backend

Tasks:

  1. Create IVideoBackend.h interface
  2. Create RGBASurfaceBackend.h/.cpp
  3. Extract RGBA Surface logic from SimpleGPURenderer
  4. Test RGBASurfaceBackend independently

Estimated Time: 2 hours


Phase 2: Transform D3D12VideoRenderer → YUV420PUploadBackend

Goal: Repurpose existing code as Upload backend

Tasks:

  1. Rename files: D3D12VideoRenderer.* → YUV420PUploadBackend.*
  2. Rename class: D3D12VideoRenderer → YUV420PUploadBackend
  3. Implement IVideoBackend interface
  4. Remove swap chain ownership (delegate to orchestrator)
  5. Test YUV420PUploadBackend independently

Estimated Time: 1.5 hours


Phase 3: Create New D3D12VideoRenderer Orchestrator

Goal: Build thin orchestrator from scratch

Tasks:

  1. Create new D3D12VideoRenderer.h/.cpp
  2. Implement IVideoRenderer interface
  3. Implement backend selection logic
  4. Test with RGBASurfaceBackend
  5. Test with YUV420PUploadBackend
  6. Test dynamic backend switching

Estimated Time: 1.5 hours


Phase 4: Archive Legacy Code

Goal: Clean up old SimpleGPURenderer

Tasks:

  1. Create src/Rendering/Legacy/ directory
  2. Move SimpleGPURenderer → SimpleGPURenderer_Legacy
  3. Update all references to new D3D12VideoRenderer
  4. Verify all tests pass
  5. Update documentation

Estimated Time: 1 hour

Total Estimated Time: 6 hours


Success Criteria

Functional

  • NVDEC RGBA rendering works (via RGBASurfaceBackend)
  • CPU YUV rendering works (via YUV420PUploadBackend)
  • Backend auto-selection by color_space
  • No visual regressions
  • All existing tests pass

Code Quality

  • D3D12VideoRenderer < 400 lines
  • Each backend handles exactly 1 format with 1 method
  • Consistent format+method naming (Surface/Upload/Direct)
  • No format-specific rendering logic in orchestrator (only the backend-selection switch)

Maintainability

  • Adding new format = add one {Format}{Method}Backend class plus one case in SelectBackend
  • Each backend independently testable
  • Clear mapping: VavCoreColorSpace → Backend class → Rendering method

🎯 Why This Design Wins

1. Naming Clarity

// Clear from class name what format AND method it uses:
RGBASurfaceBackend      → RGB32 format + CUDA Surface write
YUV420PUploadBackend    → YUV420P format + CPU upload buffers
NV12DirectBackend       → NV12 format + Direct GPU rendering

2. Code Reuse

// Zero rewrite of proven code:
D3D12VideoRenderer (old, 2581 lines) → YUV420PUploadBackend (~2000 lines, same logic)

3. Extensibility

// Adding a new format+method needs one backend class plus one case:
case VAVCORE_COLOR_SPACE_YUV444P:  // hypothetical enum value, for illustration only
    return m_yuv444pUploadBackend.get();

4. Testability

// Each backend tests independently:
TEST(RGBASurfaceBackend, RenderFrame) {
    VavCoreVideoFrame frame;
    frame.color_space = VAVCORE_COLOR_SPACE_RGB32;
    // Test RGBA Surface rendering in isolation
}

📚 References

  • VavCore Color Space: VavCore/VavCore.h → VavCoreColorSpace enum
  • Old Code: D3D12VideoRenderer.cpp (2581 lines, YUV420P)
  • Old Code: SimpleGPURenderer.cpp (2105 lines, mixed RGBA/YUV)
  • Previous Design: SimpleGPURenderer_Layered_Architecture_Design_v2.md

Status: FINAL DESIGN APPROVED (v3)
Key Decision: Format + Method naming ({PixelFormat}{Method}Backend)
Approved Methods: Surface, Upload, Direct (NO Hardware/Software)
Next Step: Begin Phase 1 - Create IVideoBackend + RGBASurfaceBackend
Total Estimated Time: 6 hours (4 phases)


Document Revision History:

  • v1: Initial format-based naming (CPUVideoBackend - rejected)
  • v2: Reuse D3D12VideoRenderer as backend (approved structure)
  • v3: Final naming with Surface/Upload/Direct methods (current)