1. ✅ Deferred D3D12SurfaceHandler creation to InitializeCUDA() when SetD3DDevice is called before Initialize
2. ✅ Fixed NV12ToRGBAConverter repeated reinitialization by adding IsInitialized() check before calling Initialize()
3. ✅ Test now successfully decodes 24 frames without resource thrashing
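The guard described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the actual VavCore code: ConverterSketch, EnsureConverterReady, InitCount() and the width/height parameters are hypothetical stand-ins, and only the IsInitialized() and Initialize() names come from the notes above.

```cpp
#include <cassert>

// Hypothetical stand-in for NV12ToRGBAConverter. The real class and its
// Initialize() signature are not shown in these notes.
class ConverterSketch {
public:
    bool IsInitialized() const { return initialized_; }

    // Pretend this is the expensive setup (buffers, kernels, and so on).
    bool Initialize(int width, int height) {
        assert(width > 0 && height > 0);
        width_ = width;
        height_ = height;
        initialized_ = true;
        ++init_count_;  // counts how often the expensive path actually runs
        return true;
    }

    int InitCount() const { return init_count_; }

private:
    bool initialized_ = false;
    int width_ = 0;
    int height_ = 0;
    int init_count_ = 0;
};

// Per-frame entry point (hypothetical): initialize only on first use
// instead of unconditionally on every decoded frame.
inline bool EnsureConverterReady(ConverterSketch& converter, int width, int height) {
    if (converter.IsInitialized()) {
        return true;
    }
    return converter.Initialize(width, height);
}
```

Calling EnsureConverterReady once per frame leaves the setup cost on the first frame only. A real converter would probably also need to reinitialize when the resolution changes, which this sketch does not handle.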
Remaining issue (in test app, not VavCore):
- RedSurfaceNVDECTest creates a new D3D12Resource for every frame instead of reusing a pool
- This causes ExternalMemoryCache to create an unbounded number of surface objects
- Fix: Test app should reuse a small pool of textures (e.g., 3-5 textures for buffering)
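One way the suggested pool could look, as a rough sketch only. TextureSketch and TexturePoolSketch are hypothetical names, and the plain struct stands in for a real D3D12 texture, whose creation is left out here. The point is that the set of resources stays fixed, so a cache keyed on the resource would see at most N distinct entries.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a D3D12 texture. In the test app this would wrap
// the real resource, created once when the pool is built.
struct TextureSketch {
    int id = -1;
    bool in_use = false;
};

// Fixed-size pool: N textures are created up front and handed out in
// round-robin order, so no new resource is created per frame.
template <std::size_t N>
class TexturePoolSketch {
public:
    TexturePoolSketch() {
        for (std::size_t i = 0; i < N; ++i) {
            textures_[i].id = static_cast<int>(i);
        }
    }

    // Returns a free texture, or nullptr if every texture is still in use.
    TextureSketch* Acquire() {
        for (std::size_t i = 0; i < N; ++i) {
            TextureSketch& candidate = textures_[(next_ + i) % N];
            if (!candidate.in_use) {
                candidate.in_use = true;
                next_ = (next_ + i + 1) % N;
                return &candidate;
            }
        }
        return nullptr;
    }

    // Hands a texture back once the frame has been consumed.
    void Release(TextureSketch* texture) {
        assert(texture != nullptr && texture->in_use);
        texture->in_use = false;
    }

private:
    std::array<TextureSketch, N> textures_{};
    std::size_t next_ = 0;
};
```

With a pool of 3, decoding 24 frames touches only texture ids 0 to 2. When Acquire() returns nullptr, the caller would have to wait for a frame to be released or drop the frame; which of those fits the test app is not settled by these notes.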