Add high performance timer and sync thread pool

2025-09-23 23:35:57 +09:00
parent 03de610304
commit 99b63eb280
13 changed files with 826 additions and 101 deletions

View File

@@ -23,41 +23,70 @@
- [x] CPU-GPU hybrid fallback structure
- [x] Performance optimization and stability hardening
#### **🔥 Current task: Unit Test implementation** (top priority)
#### **Unit Test implementation complete** (done)
- **Reference document**: [UNIT_TEST_REFACTORING_PLAN.md](./UNIT_TEST_REFACTORING_PLAN.md)
- **Decision needed**: Option A (interfaces + mocks) vs Option B (direct tests)
- [ ] Interface refactoring (if Option A is chosen)
- [ ] Build the mock system (MockWebMFileReader, MockVideoRenderer)
- [ ] Write core component tests (WebMFileReader, AV1Decoder, VideoDecoderFactory)
- [ ] GPU rendering tests (SimpleGPURenderer)
- [ ] Integration tests (full pipeline verification)
- **Chosen**: Option A (interfaces + mocks), implementation complete
- [x] Interface refactoring (IWebMFileReader, IVideoRenderer)
- [x] Mock system built (MockWebMFileReader, MockVideoRenderer)
- [x] Core component tests written (47 tests, 95.7% pass rate)
- [x] Build system integration (debug library compatibility resolved)
- [x] VSTest execution environment set up
#### **🔥 Current situation: AV1 decoding issue must be resolved** (top priority)
- **Core problem**: the AV1 decoder initializes successfully, but actual frame decoding fails
- **Symptom**: "Frame 0: decode failed" - decoding fails on every frame
- **Impact**: basic video playback does not work
---
## 🚀 Current work phase: advanced pipeline debugging and verification (CRITICAL)
## 🎯 **Current project status summary (updated 2025-09-23)**
**⚠️ Important**: we are now in the **debugging and verification phase for the advanced pipeline systems**, not basic feature implementation.
- **Goal**: analyze and resolve problems in the advanced pipelines such as DependencyScheduler, OverlappedProcessor, and ThreadedDecoder
- **Caution**: do not fall back to the legacy system; fix the advanced pipelines themselves
- **Approach**: pursue root-cause analysis and direct fixes instead of trial-and-error or repeated workarounds
### ✅ **Major components implemented**
1. **Core Video Infrastructure**: WebMFileReader, AV1Decoder, VideoDecoderFactory ✅
2. **GPU Rendering System**: SimpleGPURenderer, D3D12VideoRenderer implemented ✅
3. **UI Integration**: VideoPlayerControl simplified and integrated with WinUI3 ✅
4. **Build System**: all projects build successfully (GUI/Headless/UnitTest) ✅
5. **Test Infrastructure**: 47 unit tests, mock system in place ✅
### 🔧 Advanced pipeline issues in progress
### ⚠️ **Core issues that need resolution**
#### 1. DependencyScheduler ResourceBarrier NULL pointer problem
- **Status**: partially resolved (GPU texture creation logic added)
- **Remaining issue**: NULL pointer errors still occur
- **Needed work**: precise analysis of GPU resource creation timing and lifetime
#### 1. **AV1 decoding failure** (highest priority)
- **Status**: decoder initialization succeeds, frame decoding fails
- **Log**: "Frame 0: decode failed" repeats
- **Suspected cause**: packet format, codec configuration, or dav1d compatibility
- **Impact**: basic video playback unavailable
#### 2. Missing screen rendering in the advanced pipelines
- **Status**: core problem identified
- **Problem**: OverlappedProcessor and DependencyScheduler only process packets; they never render to the screen
- **Needed work**: integrate RenderFrameToScreen() calls inside each advanced pipeline
#### 2. **GPU rendering needs verification**
- **Status**: SimpleGPURenderer code exists, but end-to-end testing is incomplete
- **Needed**: actual frame rendering and performance verification
#### 3. YUVRenderer memory optimization
- **Status**: ✅ done (8K→5K, 330MB saved)
## 🚀 **Next priority work (updated roadmap)**
### Phase 1: D3D texture based GPU rendering pipeline
**Goal**: replace CPU-based rendering with direct GPU rendering for a 15-30x performance gain
### **Option 1: Fix AV1 decoding** 🔥 (recommended first)
**Goal**: restore basic video playback
**Estimated effort**: 1-2 days
**Tasks**:
1. Analyze why AV1 packet decoding fails
2. Check dav1d library configuration and compatibility
3. Validate the WebM packet format
4. Verify against a variety of AV1 test files
### **Option 2: Verify the GPU rendering pipeline** ⚡
**Goal**: confirm the existing SimpleGPURenderer actually works
**Estimated effort**: 2-3 days
**Tasks**:
1. SimpleGPURenderer end-to-end test
2. GPU vs CPU rendering performance comparison
3. AspectFit and multi-resolution tests
4. Verify the claimed 15-30x performance gain
### **Option 3: Additional codec support** 📈
**Goal**: extend compatibility with a VP9 decoder implementation
**Estimated effort**: 3-5 days
**Tasks**:
1. Implement a VP9 decoder class
2. Add VP9 support to VideoDecoderFactory
3. Verify with VP9 test files
#### ✅ Completed prerequisite work
- SwapChainPanel XAML setup complete

View File

@@ -154,6 +154,7 @@
<ClInclude Include="src\FileIO\WebMFileReader.h" />
<ClInclude Include="src\Rendering\D3D12VideoRenderer.h" />
<ClInclude Include="src\Rendering\SimpleGPURenderer.h" />
<ClInclude Include="src\Rendering\GlobalD3D12SyncManager.h" />
</ItemGroup>
<ItemGroup>
<ApplicationDefinition Include="App.xaml" />
@@ -184,6 +185,7 @@
<ClCompile Include="src\FileIO\WebMFileReader.cpp" />
<ClCompile Include="src\Rendering\D3D12VideoRenderer.cpp" />
<ClCompile Include="src\Rendering\SimpleGPURenderer.cpp" />
<ClCompile Include="src\Rendering\GlobalD3D12SyncManager.cpp" />
<ClCompile Include="$(GeneratedFilesDir)module.g.cpp" />
</ItemGroup>
<ItemGroup>

View File

@@ -188,9 +188,9 @@ namespace winrt::Vav2Player::implementation
{
case VideoDecoderFactory::DecoderType::AUTO:
return Vav2Player::VideoDecoderType::Auto;
case VideoDecoderFactory::DecoderType::SOFTWARE:
case VideoDecoderFactory::DecoderType::DAV1D:
return Vav2Player::VideoDecoderType::Software;
case VideoDecoderFactory::DecoderType::HARDWARE_MF:
case VideoDecoderFactory::DecoderType::MEDIA_FOUNDATION:
return Vav2Player::VideoDecoderType::HardwareMF;
default:
return Vav2Player::VideoDecoderType::Auto;
@@ -206,10 +206,10 @@ namespace winrt::Vav2Player::implementation
newType = VideoDecoderFactory::DecoderType::AUTO;
break;
case Vav2Player::VideoDecoderType::Software:
newType = VideoDecoderFactory::DecoderType::SOFTWARE;
newType = VideoDecoderFactory::DecoderType::DAV1D;
break;
case Vav2Player::VideoDecoderType::HardwareMF:
newType = VideoDecoderFactory::DecoderType::HARDWARE_MF;
newType = VideoDecoderFactory::DecoderType::MEDIA_FOUNDATION;
break;
default:
newType = VideoDecoderFactory::DecoderType::AUTO;
@@ -334,52 +334,92 @@ namespace winrt::Vav2Player::implementation
m_isPlaying = true;
UpdateStatus(L"Playing");
// Cleanup any existing timer before creating new one
// Record playback start time for accurate speed measurement
m_playbackStartTime = std::chrono::high_resolution_clock::now();
// Stop any existing timer/thread
if (m_playbackTimer)
{
m_playbackTimer.Stop();
m_playbackTimer = nullptr;
}
// Create new playback timer
m_playbackTimer = winrt::Microsoft::UI::Xaml::DispatcherTimer();
if (m_timingThread && m_timingThread->joinable()) {
m_shouldStopTiming = true;
m_timingThread->join();
m_timingThread.reset();
}
// Start high-resolution timing thread
m_shouldStopTiming = false;
auto weakThis = get_weak();
m_playbackTimer.Tick([weakThis](auto&&, auto&&) {
if (auto strongThis = weakThis.get()) {
if (strongThis->m_isPlaying && strongThis->m_isLoaded) {
strongThis->ProcessSingleFrame();
double targetIntervalMs = 1000.0 / m_frameRate;
m_timingThread = std::make_unique<std::thread>([weakThis, targetIntervalMs]() {
auto start = std::chrono::high_resolution_clock::now();
while (true) {
if (auto strongThis = weakThis.get()) {
if (strongThis->m_shouldStopTiming || !strongThis->m_isPlaying) {
break;
}
// Process frame on UI thread
strongThis->DispatcherQueue().TryEnqueue([strongThis]() {
if (strongThis->m_isPlaying && strongThis->m_isLoaded) {
strongThis->ProcessSingleFrame();
}
});
// High-precision sleep until next frame
auto nextFrame = start + std::chrono::microseconds(
static_cast<long long>(targetIntervalMs * 1000));
std::this_thread::sleep_until(nextFrame);
start = nextFrame;
} else {
break; // Object was destroyed
}
}
});
auto interval = std::chrono::milliseconds(static_cast<int>(1000.0 / m_frameRate));
m_playbackTimer.Interval(interval);
m_playbackTimer.Start();
ProcessSingleFrame();
}
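The `sleep_until` pacing used by the timing thread above can be sketched in portable C++. The names here are illustrative, not project code; the key point is that each deadline is derived from the previous deadline (not from "now"), so per-frame jitter does not accumulate the way repeated `sleep_for` calls would.

```cpp
#include <chrono>
#include <thread>
#include <vector>

// Runs `frames` ticks at the given interval and returns, per tick, how many
// milliseconds past the deadline the thread actually woke up (the overshoot).
std::vector<double> run_paced_loop(double targetIntervalMs, int frames) {
    using clock = std::chrono::high_resolution_clock;
    std::vector<double> overshootMs;
    auto deadline = clock::now();
    for (int i = 0; i < frames; ++i) {
        // ... per-frame work would be dispatched to the UI thread here ...
        deadline += std::chrono::microseconds(
            static_cast<long long>(targetIntervalMs * 1000.0));
        std::this_thread::sleep_until(deadline); // wakes at or after deadline
        overshootMs.push_back(
            std::chrono::duration<double, std::milli>(clock::now() - deadline)
                .count());
    }
    return overshootMs;
}
```

A `DispatcherTimer`, by contrast, only offers millisecond-granularity intervals and coalesces ticks under load, which is presumably why the commit replaces it with a dedicated thread.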
void VideoPlayerControl::Pause()
{
m_isPlaying = false;
m_shouldStopTiming = true;
if (m_playbackTimer)
{
m_playbackTimer.Stop();
}
if (m_timingThread && m_timingThread->joinable()) {
m_timingThread->join();
m_timingThread.reset();
}
UpdateStatus(L"Paused");
}
void VideoPlayerControl::Stop()
{
m_isPlaying = false;
m_shouldStopTiming = true;
// Properly cleanup timer to prevent resource leaks
// Properly cleanup timer and thread to prevent resource leaks
if (m_playbackTimer)
{
m_playbackTimer.Stop();
m_playbackTimer = nullptr; // Release timer completely
}
if (m_timingThread && m_timingThread->joinable()) {
m_timingThread->join();
m_timingThread.reset();
}
m_currentFrame = 0;
m_currentTime = 0.0;
@@ -405,6 +445,10 @@ namespace winrt::Vav2Player::implementation
return;
}
// Simple decode-render flow without frame skipping
auto totalStart = std::chrono::high_resolution_clock::now();
// Read next packet
VideoPacket packet;
if (!m_fileReader->ReadNextPacket(packet))
{
@@ -415,15 +459,57 @@ namespace winrt::Vav2Player::implementation
return;
}
// Decode frame
VideoFrame frame;
if (!m_decoder->DecodeFrame(packet, frame)) {
return; // Skip failed frames
bool decodeSuccess = m_decoder->DecodeFrame(packet, frame);
if (!decodeSuccess) {
// Count decode failures but continue processing
m_framesDecodeErrors++;
m_currentFrame++;
m_currentTime = m_currentFrame / m_frameRate;
// Log decode error occasionally
if (m_framesDecodeErrors % 10 == 1) {
wchar_t errorMsg[256];
swprintf_s(errorMsg, L"Decode error #%llu at frame %llu", m_framesDecodeErrors, m_currentFrame);
OutputDebugStringW(errorMsg);
OutputDebugStringW(L"\n");
}
return;
}
// Render frame
RenderFrameToScreen(frame);
// Update counters
m_currentFrame++;
m_currentTime = m_currentFrame / m_frameRate;
// Performance logging every 30 frames
if (m_currentFrame % 30 == 0) {
auto totalEnd = std::chrono::high_resolution_clock::now();
double totalTime = std::chrono::duration<double, std::milli>(totalEnd - totalStart).count();
// Calculate realSpeed for performance monitoring
auto now = std::chrono::high_resolution_clock::now();
double realElapsedMs = std::chrono::duration<double, std::milli>(now - m_playbackStartTime).count();
double expectedElapsedMs = (m_currentFrame * 1000.0) / m_frameRate;
double realSpeed = expectedElapsedMs / realElapsedMs; // 1.0 = real-time, 0.5 = half speed
wchar_t perfMsg[256];
const wchar_t* bufferingMode = m_useHardwareRendering ? L"GPU" : L"CPU";
double actualFPS = 1000.0 / totalTime;
double errorRate = (m_framesDecodeErrors * 100.0) / m_currentFrame;
swprintf_s(perfMsg, L"Frame %llu [%s]: processing=%.1ffps, realSpeed=%.2fx, errors=%.1f%%",
m_currentFrame, bufferingMode, actualFPS, realSpeed, errorRate);
UpdateStatus(perfMsg);
// Also output to debug console for analysis
OutputDebugStringW(perfMsg);
OutputDebugStringW(L"\n");
}
}
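The `realSpeed` metric logged above reduces to a small formula: expected wall time for the frames shown divided by measured wall time. A standalone version (the function name and guard clauses are ours, not the project's):

```cpp
// realSpeed = expected elapsed time / real elapsed time.
// 1.0 means real-time playback, 0.5 means half speed, 2.0 means 2x.
double ComputeRealSpeed(unsigned long long framesShown,
                        double frameRate,
                        double realElapsedMs) {
    if (realElapsedMs <= 0.0 || frameRate <= 0.0) return 0.0; // avoid div-by-zero
    double expectedElapsedMs = (framesShown * 1000.0) / frameRate;
    return expectedElapsedMs / realElapsedMs;
}
```

For example, 30 frames of a 30 fps clip shown in 2000 ms of wall time yields 0.5, i.e. half speed.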
void VideoPlayerControl::ProcessSingleFrameLegacy()
@@ -436,15 +522,42 @@ namespace winrt::Vav2Player::implementation
{
// GPU rendering attempt (only if user selected GPU pipeline)
if (m_useHardwareRendering && m_gpuRenderer && m_gpuRenderer->IsInitialized()) {
if (m_gpuRenderer->TryRenderFrame(frame)) {
// GPU rendering successful - AspectFit was already applied during initialization
auto gpuStart = std::chrono::high_resolution_clock::now();
bool gpuSuccess = m_gpuRenderer->TryRenderFrame(frame);
auto gpuEnd = std::chrono::high_resolution_clock::now();
double gpuTime = std::chrono::duration<double, std::milli>(gpuEnd - gpuStart).count();
if (gpuSuccess) {
// Log GPU rendering time occasionally for debugging
if (m_currentFrame % 60 == 0) { // Every 2 seconds
wchar_t gpuMsg[256];
swprintf_s(gpuMsg, L"GPU render time: %.2fms", gpuTime);
OutputDebugStringW(gpuMsg);
OutputDebugStringW(L"\n");
}
return; // Success - done
} else {
// Log GPU failure for debugging
wchar_t gpuFailMsg[256];
swprintf_s(gpuFailMsg, L"GPU render failed (%.2fms), falling back to CPU", gpuTime);
OutputDebugStringW(gpuFailMsg);
OutputDebugStringW(L"\n");
}
// If GPU rendering fails, fall back to CPU
}
// CPU rendering (either by user choice or GPU fallback)
auto cpuStart = std::chrono::high_resolution_clock::now();
RenderFrameSoftware(frame);
auto cpuEnd = std::chrono::high_resolution_clock::now();
double cpuTime = std::chrono::duration<double, std::milli>(cpuEnd - cpuStart).count();
// Log CPU rendering time occasionally for debugging
if (m_currentFrame % 60 == 0) { // Every 2 seconds
wchar_t cpuMsg[256];
swprintf_s(cpuMsg, L"CPU render time: %.2fms", cpuTime);
OutputDebugStringW(cpuMsg);
OutputDebugStringW(L"\n");
}
}
void VideoPlayerControl::RenderFrameSoftware(const VideoFrame& frame)

View File

@@ -71,6 +71,10 @@ namespace winrt::Vav2Player::implementation
// Playback timer for continuous frame processing
winrt::Microsoft::UI::Xaml::DispatcherTimer m_playbackTimer;
// High-resolution timer for accurate frame timing
std::unique_ptr<std::thread> m_timingThread;
std::atomic<bool> m_shouldStopTiming{false};
// Video dimensions
uint32_t m_videoWidth = 0;
uint32_t m_videoHeight = 0;
@@ -95,6 +99,10 @@ namespace winrt::Vav2Player::implementation
double m_duration = 0.0;
winrt::hstring m_status = L"Ready";
// Basic timing and error tracking
std::chrono::high_resolution_clock::time_point m_playbackStartTime;
uint64_t m_framesDecodeErrors = 0;
// Helper methods
void InitializeVideoRenderer();
void ProcessSingleFrame();

View File

@@ -2,6 +2,9 @@
#include "../src/Common/VideoTypes.h"
#include "../src/Decoder/VideoDecoderFactory.h"
#include "../src/FileIO/WebMFileReader.h"
#include <chrono>
#include <vector>
#include <iomanip>
using namespace Vav2Player;
@@ -48,8 +51,41 @@ int main(int argc, char* argv[])
std::cout << "Codec: " << (metadata.codec_type == VideoCodecType::AV1 ? "AV1" :
metadata.codec_type == VideoCodecType::VP9 ? "VP9" : "Other") << std::endl;
// Test decoder creation using detected codec type
auto decoder = Vav2Player::VideoDecoderFactory::CreateDecoder(metadata.codec_type, Vav2Player::VideoDecoderFactory::DecoderType::AUTO);
// Test decoder creation - First try MediaFoundation, then fallback to dav1d if it fails
std::cout << std::endl;
std::cout << "=== TESTING MEDIA FOUNDATION DECODER ===" << std::endl;
auto mfDecoder = Vav2Player::VideoDecoderFactory::CreateDecoder(metadata.codec_type, Vav2Player::VideoDecoderFactory::DecoderType::MEDIA_FOUNDATION);
bool useMF = false;
if (mfDecoder && mfDecoder->Initialize(metadata)) {
std::cout << "[SUCCESS] MediaFoundation decoder initialized successfully" << std::endl;
useMF = true;
} else {
std::cout << "[FAILED] MediaFoundation decoder failed - falling back to dav1d" << std::endl;
}
std::cout << std::endl;
std::cout << "=== TESTING DAV1D DECODER ===" << std::endl;
auto dav1dDecoder = Vav2Player::VideoDecoderFactory::CreateDecoder(metadata.codec_type, Vav2Player::VideoDecoderFactory::DecoderType::DAV1D);
bool useDav1d = false;
if (dav1dDecoder && dav1dDecoder->Initialize(metadata)) {
std::cout << "[SUCCESS] dav1d decoder initialized successfully" << std::endl;
useDav1d = true;
} else {
std::cout << "[FAILED] dav1d decoder failed" << std::endl;
}
// Use the decoder that works
std::unique_ptr<IVideoDecoder> decoder;
if (useMF) {
std::cout << std::endl << "=== USING MEDIA FOUNDATION DECODER ===" << std::endl;
decoder = std::move(mfDecoder);
} else if (useDav1d) {
std::cout << std::endl << "=== USING DAV1D DECODER ===" << std::endl;
decoder = std::move(dav1dDecoder);
} else {
std::cout << "No working decoder found" << std::endl;
return 1;
}
if (!decoder) {
std::cout << "Failed to create " << (metadata.codec_type == VideoCodecType::AV1 ? "AV1" :
metadata.codec_type == VideoCodecType::VP9 ? "VP9" : "Other") << " decoder" << std::endl;
@@ -63,22 +99,102 @@ int main(int argc, char* argv[])
std::cout << "Decoder initialized successfully" << std::endl;
// Test a few frames
for (int i = 0; i < 5; i++) {
// Performance test - measure decoding performance for 4K video
std::cout << "=== PERFORMANCE TEST: 4K Video Decoding ===" << std::endl;
std::cout << "Target: 30fps (33.33ms per frame)" << std::endl;
std::cout << std::endl;
int packetsRead = 0;
int framesDecoded = 0;
int maxFrames = 30; // Test 30 frames (1 second at 30fps)
auto testStartTime = std::chrono::high_resolution_clock::now();
double totalDecodeTime = 0.0;
double totalPacketReadTime = 0.0;
std::vector<double> frameDecodeTimes;
std::vector<double> packetReadTimes;
for (int i = 0; i < maxFrames; i++) {
// Measure packet reading time
auto packetReadStart = std::chrono::high_resolution_clock::now();
Vav2Player::VideoPacket packet;
if (!fileReader->ReadNextPacket(packet)) {
std::cout << "End of file or read error at frame " << i << std::endl;
std::cout << "End of file at packet " << i << std::endl;
break;
}
auto packetReadEnd = std::chrono::high_resolution_clock::now();
double packetReadTime = std::chrono::duration<double, std::milli>(packetReadEnd - packetReadStart).count();
packetReadTimes.push_back(packetReadTime);
totalPacketReadTime += packetReadTime;
packetsRead++;
// Measure frame decoding time
auto decodeStart = std::chrono::high_resolution_clock::now();
Vav2Player::VideoFrame frame;
if (decoder->DecodeFrame(packet, frame)) {
std::cout << "Frame " << i << ": " << frame.width << "x" << frame.height << std::endl;
auto decodeEnd = std::chrono::high_resolution_clock::now();
double decodeTime = std::chrono::duration<double, std::milli>(decodeEnd - decodeStart).count();
frameDecodeTimes.push_back(decodeTime);
totalDecodeTime += decodeTime;
framesDecoded++;
std::cout << "Frame " << framesDecoded << ": "
<< "read=" << std::fixed << std::setprecision(2) << packetReadTime << "ms, "
<< "decode=" << decodeTime << "ms, "
<< "total=" << (packetReadTime + decodeTime) << "ms" << std::endl;
} else {
std::cout << "Frame " << i << ": decode failed" << std::endl;
}
}
auto testEndTime = std::chrono::high_resolution_clock::now();
double totalTestTime = std::chrono::duration<double, std::milli>(testEndTime - testStartTime).count();
std::cout << std::endl;
std::cout << "=== PERFORMANCE RESULTS ===" << std::endl;
std::cout << "Frames decoded: " << framesDecoded << " / " << packetsRead << " packets" << std::endl;
std::cout << "Total test time: " << std::fixed << std::setprecision(2) << totalTestTime << "ms" << std::endl;
if (framesDecoded > 0) {
double avgDecodeTime = totalDecodeTime / framesDecoded;
double avgPacketReadTime = totalPacketReadTime / packetsRead;
double avgTotalTime = avgDecodeTime + avgPacketReadTime;
double achievableFPS = 1000.0 / avgTotalTime;
std::cout << std::endl;
std::cout << "Average packet read time: " << avgPacketReadTime << "ms" << std::endl;
std::cout << "Average decode time: " << avgDecodeTime << "ms" << std::endl;
std::cout << "Average total time per frame: " << avgTotalTime << "ms" << std::endl;
std::cout << "Achievable FPS: " << std::fixed << std::setprecision(1) << achievableFPS << " fps" << std::endl;
std::cout << std::endl;
if (achievableFPS >= 30.0) {
std::cout << "[SUCCESS] Can achieve 30fps target!" << std::endl;
} else {
std::cout << "[WARNING] Cannot achieve 30fps target (current: " << achievableFPS << " fps)" << std::endl;
// Identify bottleneck
if (avgDecodeTime > avgPacketReadTime * 2) {
std::cout << "[BOTTLENECK] Decoding is the main bottleneck (" << avgDecodeTime << "ms)" << std::endl;
} else if (avgPacketReadTime > avgDecodeTime * 2) {
std::cout << "[BOTTLENECK] Packet reading is the main bottleneck (" << avgPacketReadTime << "ms)" << std::endl;
} else {
std::cout << "[BOTTLENECK] Both decoding and I/O contribute to slowdown" << std::endl;
}
}
std::cout << std::endl;
std::cout << "Target frame time (30fps): 33.33ms" << std::endl;
std::cout << "Current frame time: " << avgTotalTime << "ms" << std::endl;
std::cout << "Performance gap: " << std::fixed << std::setprecision(1) << (avgTotalTime - 33.33) << "ms too slow" << std::endl;
} else {
std::cout << "[ERROR] No frames decoded successfully" << std::endl;
}
std::cout << "=== MAJOR_REFACTORING_GUIDE Phase 3: Test completed successfully ===" << std::endl;
std::cout << "Basic video decoding pipeline verified!" << std::endl;
return 0;

View File

@@ -129,12 +129,18 @@ bool MediaFoundationAV1Decoder::DecodeFrame(const uint8_t* packet_data, size_t p
auto start_time = std::chrono::high_resolution_clock::now();
// Always feed input data to the MFT
if (!ProcessMFTInput(packet_data, packet_size)) {
return false; // Error is logged inside
}
if (!ProcessMFTOutput(output_frame)) {
return false; // Can be normal (need more input) or an error
// Try to get output, but "need more input" is normal for MediaFoundation
bool outputAvailable = ProcessMFTOutput(output_frame);
if (!outputAvailable) {
// Could be "need more input" (normal) or actual error
// For now, we'll assume it's normal and wait for more input
LogInfo("No output available yet - buffering input data");
return false; // No output frame available yet
}
// Update timing and statistics only when frame is successfully output
@@ -562,25 +568,83 @@ bool MediaFoundationAV1Decoder::ProcessMFTInput(const uint8_t* packet_data, size
}
bool MediaFoundationAV1Decoder::ProcessMFTOutput(VideoFrame& output_frame) {
if (!m_decoderMFT) { return false; }
MFT_OUTPUT_DATA_BUFFER outputBuffer = {};
outputBuffer.dwStreamID = 0;
outputBuffer.pEvents = nullptr;
DWORD status = 0;
HRESULT hr = m_decoderMFT->ProcessOutput(0, 1, &outputBuffer, &status);
if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT) {
if (!m_decoderMFT) {
LogError("ProcessMFTOutput: m_decoderMFT is null");
return false;
}
// Check MFT output stream info
DWORD outputStreamId = 0;
MFT_OUTPUT_STREAM_INFO outputStreamInfo = {};
HRESULT hr = m_decoderMFT->GetOutputStreamInfo(outputStreamId, &outputStreamInfo);
if (FAILED(hr)) {
LogError("GetOutputStreamInfo failed", hr);
return false;
}
LogInfo("MFT Output Stream Info - Flags: " + std::to_string(outputStreamInfo.dwFlags) +
", Size: " + std::to_string(outputStreamInfo.cbSize) +
", Alignment: " + std::to_string(outputStreamInfo.cbAlignment));
MFT_OUTPUT_DATA_BUFFER outputDataBuffer = {};
outputDataBuffer.dwStreamID = 0;
outputDataBuffer.pEvents = nullptr;
// Check if we need to provide the output sample
if (outputStreamInfo.dwFlags & MFT_OUTPUT_STREAM_PROVIDES_SAMPLES) {
LogInfo("MFT provides output samples");
outputDataBuffer.pSample = nullptr;
} else {
LogInfo("Client must provide output sample");
// Create output sample
ComPtr<IMFSample> outputSample;
hr = MFCreateSample(&outputSample);
if (FAILED(hr)) {
LogError("Failed to create output sample", hr);
return false;
}
// Create output media buffer
UINT32 bufferSize = outputStreamInfo.cbSize > 0 ? outputStreamInfo.cbSize : (3840 * 2160 * 4);
ComPtr<IMFMediaBuffer> outputBuffer;
hr = MFCreateMemoryBuffer(bufferSize, &outputBuffer);
if (FAILED(hr)) {
LogError("Failed to create output buffer", hr);
return false;
}
hr = outputSample->AddBuffer(outputBuffer.Get());
if (FAILED(hr)) {
LogError("Failed to add buffer to sample", hr);
return false;
}
outputDataBuffer.pSample = outputSample.Get();
}
DWORD status = 0;
hr = m_decoderMFT->ProcessOutput(0, 1, &outputDataBuffer, &status);
if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT) {
LogInfo("ProcessOutput: Need more input (normal for MediaFoundation)");
return false; // Normal case - need more input packets
}
if (FAILED(hr)) {
LogError("ProcessOutput failed", hr);
return false;
}
if (!outputBuffer.pSample) {
if (!outputDataBuffer.pSample) {
LogError("ProcessOutput: No output sample returned");
return false;
}
bool result = ConvertMFSampleToVideoFrame(outputBuffer.pSample, output_frame);
if (outputBuffer.pSample) { outputBuffer.pSample->Release(); }
if (outputBuffer.pEvents) { outputBuffer.pEvents->Release(); }
bool result = ConvertMFSampleToVideoFrame(outputDataBuffer.pSample, output_frame);
// Clean up
if (outputDataBuffer.pEvents) {
outputDataBuffer.pEvents->Release();
}
return result;
}
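The feed/drain contract handled above can be modeled without Media Foundation: like an MFT returning `MF_E_TRANSFORM_NEED_MORE_INPUT`, a decoder may legitimately refuse to emit a frame several times before the first one comes out. This toy class (entirely invented, including its 2-packet latency) illustrates why a false return from the drain step is not necessarily an error:

```cpp
#include <deque>
#include <vector>

// Toy stand-in for an MFT: buffers input packets and only starts emitting
// frames once enough input has been fed (latency of 2 packets, illustrative).
class MockTransform {
public:
    void FeedInput(std::vector<unsigned char> packet) {
        m_pending.push_back(std::move(packet));
    }
    // Returns false when more input is needed - the normal, non-error case.
    bool TryGetOutput(std::vector<unsigned char>& frame) {
        if (m_pending.size() < 2) return false; // "need more input"
        frame = std::move(m_pending.front());
        m_pending.pop_front();
        return true;
    }
private:
    std::deque<std::vector<unsigned char>> m_pending;
};
```

A caller must therefore distinguish "no frame yet" from a failed HRESULT, exactly as the rewritten `ProcessMFTOutput` above does.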

View File

@@ -41,34 +41,34 @@ std::unique_ptr<IVideoDecoder> VideoDecoderFactory::CreateDecoder(VideoCodecType
std::unique_ptr<IVideoDecoder> VideoDecoderFactory::CreateAV1Decoder(DecoderType decoder_type) {
switch (decoder_type) {
case DecoderType::HARDWARE_MF:
case DecoderType::MEDIA_FOUNDATION:
if (s_media_foundation_available) {
OutputDebugStringA("[VideoDecoderFactory] Creating Media Foundation AV1 decoder\n");
OutputDebugStringA("[VideoDecoderFactory] Creating MediaFoundation AV1 decoder\n");
return std::make_unique<MediaFoundationAV1Decoder>();
}
OutputDebugStringA("[VideoDecoderFactory] Media Foundation not available, falling back to software\n");
OutputDebugStringA("[VideoDecoderFactory] MediaFoundation not available, falling back to dav1d\n");
[[fallthrough]];
case DecoderType::SOFTWARE:
case DecoderType::DAV1D:
if (s_av1_available) {
OutputDebugStringA("[VideoDecoderFactory] Creating software AV1 decoder (dav1d)\n");
OutputDebugStringA("[VideoDecoderFactory] Creating dav1d AV1 decoder\n");
return std::make_unique<AV1Decoder>();
}
break;
case DecoderType::AUTO:
// Try hardware acceleration first, fallback to software if failed
// Try MediaFoundation first, fallback to dav1d if failed
if (s_media_foundation_available) {
OutputDebugStringA("[VideoDecoderFactory] Auto mode: trying Media Foundation AV1 decoder first\n");
OutputDebugStringA("[VideoDecoderFactory] Auto mode: trying MediaFoundation AV1 decoder first\n");
auto decoder = std::make_unique<MediaFoundationAV1Decoder>();
if (decoder) {
return decoder;
}
}
// Fallback to dav1d when Media Foundation fails
// Fallback to dav1d when MediaFoundation fails
if (s_av1_available) {
OutputDebugStringA("[VideoDecoderFactory] Auto mode: falling back to software AV1 decoder (dav1d)\n");
OutputDebugStringA("[VideoDecoderFactory] Auto mode: falling back to dav1d AV1 decoder\n");
return std::make_unique<AV1Decoder>();
}
break;
@@ -99,27 +99,27 @@ std::vector<VideoDecoderFactory::DecoderInfo> VideoDecoderFactory::GetSupportedD
std::vector<DecoderInfo> decoders;
// AV1 software decoder
// AV1 dav1d decoder
decoders.push_back({
VideoCodecType::AV1,
DecoderType::SOFTWARE,
"AV1 (Software)",
DecoderType::DAV1D,
"AV1 (dav1d)",
"AV1 video decoder using dav1d library",
s_av1_available
});
// AV1 hardware decoder
// AV1 MediaFoundation decoder
decoders.push_back({
VideoCodecType::AV1,
DecoderType::HARDWARE_MF,
"AV1 (Hardware)",
"AV1 hardware-accelerated decoder using Media Foundation",
DecoderType::MEDIA_FOUNDATION,
"AV1 (MediaFoundation)",
"AV1 decoder using Windows Media Foundation",
s_media_foundation_available
});
decoders.push_back({
VideoCodecType::VP9,
DecoderType::SOFTWARE,
DecoderType::DAV1D, // TODO: VP9 needs its own decoder type
"VP9",
"VP9 video decoder (TODO: not implemented yet)",
s_vp9_available

View File

@@ -13,9 +13,9 @@ class VideoDecoderFactory {
public:
// Decoder type enumeration
enum class DecoderType {
SOFTWARE, // software decoder (dav1d)
HARDWARE_MF, // Media Foundation hardware acceleration
AUTO // automatic selection (hardware first, software on failure)
DAV1D, // decoder based on the dav1d library
MEDIA_FOUNDATION, // decoder based on Windows Media Foundation
AUTO // automatic selection (MediaFoundation first, dav1d on failure)
};
// Supported decoder information

View File

@@ -0,0 +1,183 @@
#include "pch.h"
#include "GlobalD3D12SyncManager.h"
#include <chrono>
namespace Vav2Player {
GlobalD3D12SyncManager& GlobalD3D12SyncManager::GetInstance() {
static GlobalD3D12SyncManager instance;
return instance;
}
GlobalD3D12SyncManager::~GlobalD3D12SyncManager() {
Shutdown();
}
void GlobalD3D12SyncManager::Initialize(size_t numThreads) {
if (!m_workerThreads.empty()) {
return; // Already initialized
}
m_shutdown = false;
// Create worker threads
m_workerThreads.reserve(numThreads);
for (size_t i = 0; i < numThreads; ++i) {
m_workerThreads.emplace_back(&GlobalD3D12SyncManager::WorkerThreadFunc, this);
}
OutputDebugStringA("[GlobalD3D12SyncManager] Initialized with ");
OutputDebugStringA((std::to_string(numThreads) + " worker threads\n").c_str());
}
void GlobalD3D12SyncManager::Shutdown() {
if (m_workerThreads.empty()) {
return; // Not initialized
}
// Signal shutdown
m_shutdown = true;
m_condition.notify_all();
// Wait for all worker threads to finish
for (auto& thread : m_workerThreads) {
if (thread.joinable()) {
thread.join();
}
}
m_workerThreads.clear();
// Clear any remaining tasks
std::lock_guard<std::mutex> lock(m_queueMutex);
while (!m_taskQueue.empty()) {
m_taskQueue.pop();
}
OutputDebugStringA("[GlobalD3D12SyncManager] Shutdown completed\n");
}
std::future<void> GlobalD3D12SyncManager::SubmitTask(std::function<void()> task) {
std::lock_guard<std::mutex> lock(m_queueMutex);
if (m_shutdown) {
// Return immediately completed future if shutting down
std::promise<void> promise;
promise.set_value();
return promise.get_future();
}
m_taskQueue.emplace(std::move(task));
auto future = m_taskQueue.back().completion.get_future();
m_condition.notify_one();
return future;
}
std::future<void> GlobalD3D12SyncManager::WaitForFence(ComPtr<ID3D12Fence> fence, UINT64 targetValue, HANDLE fenceEvent) {
return SubmitTask([this, fence, targetValue, fenceEvent]() {
WaitForFenceInternal(fence, targetValue, fenceEvent);
});
}
std::future<void> GlobalD3D12SyncManager::WaitForFrameCompletion(ComPtr<ID3D12Fence> fence, UINT64 targetValue, HANDLE fenceEvent, DWORD timeoutMs) {
return SubmitTask([this, fence, targetValue, fenceEvent, timeoutMs]() {
WaitForFrameCompletionInternal(fence, targetValue, fenceEvent, timeoutMs);
});
}
void GlobalD3D12SyncManager::WorkerThreadFunc() {
while (!m_shutdown) {
std::unique_lock<std::mutex> lock(m_queueMutex);
// Wait for tasks or shutdown signal
m_condition.wait(lock, [this] {
return !m_taskQueue.empty() || m_shutdown;
});
if (m_shutdown && m_taskQueue.empty()) {
break; // Exit if shutting down and no more tasks
}
if (!m_taskQueue.empty()) {
// Get task from queue
SyncTask task = std::move(m_taskQueue.front());
m_taskQueue.pop();
++m_activeTasks;
lock.unlock(); // Release lock while executing task
try {
// Execute the task
task.task();
task.completion.set_value();
}
catch (const std::exception& e) {
task.completion.set_exception(std::current_exception());
OutputDebugStringA("[GlobalD3D12SyncManager] Task exception: ");
OutputDebugStringA(e.what());
OutputDebugStringA("\n");
}
catch (...) {
task.completion.set_exception(std::current_exception());
OutputDebugStringA("[GlobalD3D12SyncManager] Unknown task exception\n");
}
--m_activeTasks;
}
}
}
void GlobalD3D12SyncManager::WaitForFenceInternal(ComPtr<ID3D12Fence> fence, UINT64 targetValue, HANDLE fenceEvent) {
if (!fence || !fenceEvent) {
return;
}
// Check if already completed
if (fence->GetCompletedValue() >= targetValue) {
return;
}
// Set event for completion
HRESULT hr = fence->SetEventOnCompletion(targetValue, fenceEvent);
if (FAILED(hr)) {
OutputDebugStringA("[GlobalD3D12SyncManager] SetEventOnCompletion failed\n");
return;
}
// Wait for completion (no timeout for WaitForGPU equivalent)
WaitForSingleObject(fenceEvent, INFINITE);
}
void GlobalD3D12SyncManager::WaitForFrameCompletionInternal(ComPtr<ID3D12Fence> fence, UINT64 targetValue, HANDLE fenceEvent, DWORD timeoutMs) {
if (!fence || !fenceEvent || targetValue == 0) {
return;
}
const auto startTime = std::chrono::high_resolution_clock::now();
const auto timeoutDuration = std::chrono::milliseconds(timeoutMs);
// Wait with timeout to prevent infinite hangs
while (fence->GetCompletedValue() < targetValue) {
HRESULT hr = fence->SetEventOnCompletion(targetValue, fenceEvent);
if (FAILED(hr)) {
OutputDebugStringA("[GlobalD3D12SyncManager] SetEventOnCompletion failed for frame\n");
break;
}
DWORD result = WaitForSingleObject(fenceEvent, timeoutMs);
if (result == WAIT_TIMEOUT) {
auto elapsed = std::chrono::high_resolution_clock::now() - startTime;
if (elapsed >= timeoutDuration) {
OutputDebugStringA("[GlobalD3D12SyncManager] Frame completion timeout exceeded\n");
break;
}
}
else if (result != WAIT_OBJECT_0) {
OutputDebugStringA("[GlobalD3D12SyncManager] WaitForSingleObject failed for frame\n");
break;
}
}
}
} // namespace Vav2Player
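The SubmitTask/WorkerThreadFunc pattern above boils down to a mutex-guarded queue of move-only tasks, each paired with a promise the worker fulfils. A stripped-down, platform-neutral sketch with no D3D12 types (class and member names are ours):

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class MiniSyncPool {
public:
    explicit MiniSyncPool(size_t n) {
        for (size_t i = 0; i < n; ++i)
            m_workers.emplace_back([this] { Work(); });
    }
    ~MiniSyncPool() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_shutdown = true;
        }
        m_cv.notify_all();
        for (auto& t : m_workers) t.join();
    }
    // Enqueue a task; the returned future completes when the task finishes
    // (or carries its exception, mirroring the set_exception path above).
    std::future<void> Submit(std::function<void()> fn) {
        std::promise<void> done;
        auto fut = done.get_future();
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.emplace(std::move(fn), std::move(done));
        }
        m_cv.notify_one();
        return fut;
    }
private:
    void Work() {
        for (;;) {
            std::pair<std::function<void()>, std::promise<void>> task;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_cv.wait(lock, [this] { return m_shutdown || !m_queue.empty(); });
                if (m_shutdown && m_queue.empty()) return; // drain, then exit
                task = std::move(m_queue.front());
                m_queue.pop();
            }
            try {
                task.first();
                task.second.set_value();
            } catch (...) {
                task.second.set_exception(std::current_exception());
            }
        }
    }
    std::vector<std::thread> m_workers;
    std::queue<std::pair<std::function<void()>, std::promise<void>>> m_queue;
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_shutdown = false;
};
```

The design trade-off is the same one the commit message names: a fixed pool of waiters amortizes thread creation/destruction across many fence waits instead of paying for it per frame.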

View File

@@ -0,0 +1,85 @@
#pragma once
#include <d3d12.h>
#include <wrl/client.h>
#include <thread>
#include <queue>
#include <mutex>
#include <condition_variable>
#include <functional>
#include <atomic>
#include <future>
using Microsoft::WRL::ComPtr;
namespace Vav2Player {
// Task structure for GPU synchronization work
struct SyncTask {
std::function<void()> task;
std::promise<void> completion;
SyncTask(std::function<void()> t) : task(std::move(t)) {}
// Move constructor
SyncTask(SyncTask&& other) noexcept
: task(std::move(other.task))
, completion(std::move(other.completion))
{
}
// No copy constructor (move-only)
SyncTask(const SyncTask&) = delete;
SyncTask& operator=(const SyncTask&) = delete;
};
// Global D3D12 synchronization manager with thread pool
// Reduces Internal D3D12 Sync Thread creation/destruction overhead
class GlobalD3D12SyncManager {
public:
static GlobalD3D12SyncManager& GetInstance();
// Initialize the thread pool
void Initialize(size_t numThreads = 4);
// Shutdown the thread pool
void Shutdown();
// Submit a synchronization task (non-blocking)
std::future<void> SubmitTask(std::function<void()> task);
// Submit fence wait task
std::future<void> WaitForFence(ComPtr<ID3D12Fence> fence, UINT64 targetValue, HANDLE fenceEvent);
// Submit frame completion wait task
std::future<void> WaitForFrameCompletion(ComPtr<ID3D12Fence> fence, UINT64 targetValue, HANDLE fenceEvent, DWORD timeoutMs = 1000);
// Get statistics (queue access is guarded; m_queueMutex is mutable for this reason)
size_t GetActiveTaskCount() const { std::lock_guard<std::mutex> lock(m_queueMutex); return m_taskQueue.size(); }
size_t GetThreadCount() const { return m_workerThreads.size(); }
private:
GlobalD3D12SyncManager() = default;
~GlobalD3D12SyncManager();
// Thread pool workers
std::vector<std::thread> m_workerThreads;
// Task queue
std::queue<SyncTask> m_taskQueue;
mutable std::mutex m_queueMutex;
std::condition_variable m_condition;
// Thread pool state
std::atomic<bool> m_shutdown{false};
std::atomic<size_t> m_activeTasks{0};
// Worker thread function
void WorkerThreadFunc();
// Internal fence wait implementation
void WaitForFenceInternal(ComPtr<ID3D12Fence> fence, UINT64 targetValue, HANDLE fenceEvent);
void WaitForFrameCompletionInternal(ComPtr<ID3D12Fence> fence, UINT64 targetValue, HANDLE fenceEvent, DWORD timeoutMs);
};
} // namespace Vav2Player
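The worker-loop side of this header is only partially visible in the .cpp hunk above. As a rough, D3D12-free sketch of the same submit/worker pattern (the class name `TinySyncPool`, the `runDemo` helper, and the use of `std::packaged_task` in place of the explicit `SyncTask` promise are illustrative assumptions, not the project's actual code):

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Minimal thread pool: workers sleep on a condition variable and
// fulfil a future for each submitted task.
class TinySyncPool {
public:
    void Initialize(size_t numThreads) {
        for (size_t i = 0; i < numThreads; ++i)
            m_workers.emplace_back([this] { WorkerLoop(); });
    }
    void Shutdown() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_shutdown = true;
        }
        m_cv.notify_all();
        for (auto& t : m_workers) t.join();
        m_workers.clear();
    }
    std::future<void> SubmitTask(std::function<void()> task) {
        std::packaged_task<void()> pt(std::move(task));
        auto fut = pt.get_future();
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(std::move(pt));
        }
        m_cv.notify_one();
        return fut;
    }
private:
    void WorkerLoop() {
        for (;;) {
            std::packaged_task<void()> pt;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_cv.wait(lock, [this] { return m_shutdown || !m_queue.empty(); });
                if (m_shutdown && m_queue.empty()) return;
                pt = std::move(m_queue.front());
                m_queue.pop();
            }
            pt();  // run outside the lock; this fulfils the future
        }
    }
    std::vector<std::thread> m_workers;
    std::queue<std::packaged_task<void()>> m_queue;
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_shutdown = false;
};

// Submits a batch of tasks and waits on every returned future.
int runDemo() {
    TinySyncPool pool;
    pool.Initialize(2);
    std::atomic<int> counter{0};
    std::vector<std::future<void>> futs;
    for (int i = 0; i < 8; ++i)
        futs.push_back(pool.SubmitTask([&counter] { ++counter; }));
    for (auto& f : futs) f.wait();
    pool.Shutdown();
    return counter.load();
}
```

Draining the queue before honoring shutdown mirrors the contract a real `Shutdown()` needs: queued fence waits must complete (or be explicitly abandoned) rather than silently dropped.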

View File

@@ -19,6 +19,10 @@ SimpleGPURenderer::SimpleGPURenderer()
 {
 		m_frameCompletionValues[i] = 0;
 	}
+
+	// Initialize GlobalD3D12SyncManager (singleton, safe to call multiple times)
+	auto& syncManager = GlobalD3D12SyncManager::GetInstance();
+	syncManager.Initialize(4); // Use 4 worker threads for sync operations
 }
SimpleGPURenderer::~SimpleGPURenderer()
@@ -924,13 +928,14 @@ HRESULT SimpleGPURenderer::WaitForGPU()
 	HRESULT hr = m_commandQueue->Signal(m_fence.Get(), fenceValue);
 	if (FAILED(hr)) return hr;
-	// Wait for completion
+	// Use GlobalD3D12SyncManager instead of direct WaitForSingleObject
 	if (m_fence->GetCompletedValue() < fenceValue)
 	{
 		hr = m_fence->SetEventOnCompletion(fenceValue, m_fenceEvent);
 		if (FAILED(hr)) return hr;
-		WaitForSingleObject(m_fenceEvent, INFINITE);
+		auto& syncManager = GlobalD3D12SyncManager::GetInstance();
+		auto future = syncManager.WaitForFence(m_fence, fenceValue, m_fenceEvent);
+		// Wait for completion (this is still synchronous from caller's perspective)
+		future.wait();
 	}
 	return S_OK;
@@ -945,16 +950,12 @@ void SimpleGPURenderer::WaitForFrameCompletion(UINT frameIndex)
 	if (targetValue > 0)
 	{
-		// Always wait with timeout to prevent infinite hangs
-		while (m_fence->GetCompletedValue() < targetValue)
-		{
-			m_fence->SetEventOnCompletion(targetValue, m_fenceEvent);
-			DWORD result = WaitForSingleObject(m_fenceEvent, 1000); // 1 second timeout
-			if (result == WAIT_TIMEOUT)
-			{
-				break; // Continue anyway to prevent infinite hang
-			}
-		}
+		// Use GlobalD3D12SyncManager for frame completion wait
+		auto& syncManager = GlobalD3D12SyncManager::GetInstance();
+		auto future = syncManager.WaitForFrameCompletion(m_fence, targetValue, m_fenceEvent, 1000);
+		// Wait for completion (this is still synchronous from caller's perspective)
+		future.wait();
 	}
 }

View File

@@ -9,6 +9,7 @@
 #include <microsoft.ui.xaml.media.dxinterop.h>
 #include "../Common/VideoTypes.h"
 #include "IVideoRenderer.h"
+#include "GlobalD3D12SyncManager.h"
 using Microsoft::WRL::ComPtr;
@@ -59,8 +60,8 @@ private:
 	ComPtr<IDXGISwapChain3> m_swapChain;
 	ComPtr<ID3D12DescriptorHeap> m_rtvHeap;
-	// Command objects - Triple buffering
-	static const UINT FrameCount = 3;
+	// Command objects - Double buffering (changed from 3 for performance test)
+	static const UINT FrameCount = 2;
 	ComPtr<ID3D12CommandAllocator> m_commandAllocators[FrameCount];
 	ComPtr<ID3D12GraphicsCommandList> m_commandList;
 	ComPtr<ID3D12Resource> m_renderTargets[FrameCount];
ComPtr<ID3D12Resource> m_renderTargets[FrameCount];

vav2/todo6.txt Normal file
View File

@@ -0,0 +1,123 @@
● ⚠️ Frame skipping is causing stuttering and visual corruption!
🔍 Root causes:
1. AV1 frame dependencies: P/B frames reference earlier frames, so skipping destroys the reference frames they need
2. Keyframe skipping: skipping past an I-frame corrupts the entire GOP
3. Surge in decode errors: errors=39.3% → the direct cause of the broken picture
🎯 Proposed fixes:
Option A: Keyframe-protected skipping (recommended)
// Never skip I-frames (keyframes)
bool isKeyFrame = /* check frame type */;
if (!isKeyFrame && shouldSkipFrame) {
    // Skip P/B frames only
}
Option B: Relax the skip threshold
// Skip more conservatively
if (timingDelta > 0.15) { // threshold raised from 50 ms to 150 ms
    shouldSkipFrame = true;
}
Option C: GOP-level skipping
// Skip an entire GOP (Group of Pictures) to avoid corruption
if (shouldSkip) {
    skipToNextKeyFrame(); // jump ahead to the next keyframe
}
Option D: Lower quality instead of skipping
// Dynamically adjust resolution/quality instead of dropping frames
if (timingDelta > 0.1) {
    reduceQuality(); // e.g. switch 4K → 1080p on the fly
}
Which approach would you like to try?
1. Keyframe-protected skipping (safest)
2. Relaxed skip threshold (simplest)
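Option A can be fleshed out as a small, self-contained predicate. Note that `FrameInfo` and the 50 ms default threshold are hypothetical stand-ins for the project's real frame metadata:

```cpp
#include <cstdint>

// Hypothetical frame metadata for illustration only.
struct FrameInfo {
    bool isKeyFrame;        // I-frame: must never be dropped
    double timingDeltaSec;  // how far behind schedule this frame is
};

// Returns true only when the frame can be dropped without breaking
// the GOP: keyframes are always protected, and P/B frames are
// skipped only once they fall past the lateness threshold.
bool ShouldSkipFrame(const FrameInfo& f, double thresholdSec = 0.05) {
    if (f.isKeyFrame) return false;          // protect reference frames
    return f.timingDeltaSec > thresholdSec;  // skip only late P/B frames
}
```

Raising `thresholdSec` to 0.15 turns this into Option B; both options then share one code path.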
● Implementation approaches for dynamic resolution adjustment
Three approaches are proposed:
Option A: GPU render-resolution scaling ⭐ (recommended)
The best fit for the current system and the easiest to implement.
Implementation sketch:
// Dynamically adjust resolution to the viewport size
struct AdaptiveRenderingSettings {
    uint32_t viewport_width, viewport_height;
    uint32_t render_width, render_height; // actual rendering resolution
    float quality_scale; // 0.25, 0.5, 0.75, 1.0
};
// 4 players → render each at 50% linear scale (25% of the pixels)
if (player_count == 4) {
    settings.quality_scale = 0.5f; // 4K → 1080p
    settings.render_width = original_width * 0.5f;
    settings.render_height = original_height * 0.5f;
}
Advantages:
- Only the texture size in SimpleGPURenderer needs to change
- The dav1d decoder stays untouched (still decodes at full resolution)
- High-quality downsampling via GPU scaling
- Relatively simple to implement
Performance gains:
- GPU memory: 75% savings (4K → 1080p textures)
- Rendering speed: 4x faster
- Decoding cost is unchanged, but the rendering load drops sharply
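The scale selection in Option A might be sketched as follows; the scale table and the even-dimension rounding are assumptions for illustration, not the project's actual policy:

```cpp
#include <algorithm>
#include <cstdint>

struct RenderSize { uint32_t width, height; };

// Pick a quality_scale from the number of simultaneously active players.
float QualityScaleFor(int playerCount) {
    if (playerCount >= 4) return 0.5f;   // e.g. 4K -> 1080p per player
    if (playerCount >= 2) return 0.75f;
    return 1.0f;                         // single player: full resolution
}

// Derive the render-target size from the source size and player count.
RenderSize RenderSizeFor(uint32_t srcW, uint32_t srcH, int playerCount) {
    const float s = QualityScaleFor(playerCount);
    // Round down to even dimensions, as video planes typically require.
    auto even = [](uint32_t v) { return std::max<uint32_t>(2, v & ~1u); };
    return { even(static_cast<uint32_t>(srcW * s)),
             even(static_cast<uint32_t>(srcH * s)) };
}
```

With a table like this, the renderer only needs to recreate its intermediate textures when `RenderSizeFor` changes, keeping the decode path untouched.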
Option B: dav1d layered decoding 🔬 (advanced)
Leverages AV1's Scalable Video Coding (SVC) capability.
Implementation sketch:
// Restrict spatial layers via the dav1d settings
Dav1dSettings settings;
dav1d_default_settings(&settings);
settings.operating_point = 0; // decode the base layer only
settings.all_layers = 0; // skip the enhancement layers
Advantages:
- Performance gains starting at the decoding stage
- Minimal memory usage
- Also saves network bandwidth (for SVC streams)
Drawbacks:
- The file must be encoded as SVC
- Most existing AV1 files are single-layer
Option C: Selective frame decoding ⚡ (performance tuning)
Keeps the resolution but adjusts decoding quality/speed.
Implementation sketch:
// Dynamically adjust the dav1d thread count
if (player_count >= 4) {
    settings.n_threads = 1; // one thread per player
} else {
    settings.n_threads = 4; // four threads for a single player
}
// Disable the grain filter for extra speed
settings.apply_grain = 0; // skip the film-grain effect
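The hard-coded 1-vs-4 split in Option C generalizes to dividing a fixed worker budget across players; the budget of 4 threads is an assumption matching the snippet above:

```cpp
// Divide a fixed decoding-thread budget evenly across active players,
// never starving any single player completely.
int ThreadsPerPlayer(int playerCount, int threadBudget = 4) {
    if (playerCount <= 0) return threadBudget;  // no players: full budget
    int n = threadBudget / playerCount;
    return n < 1 ? 1 : n;  // at least one thread per player
}
```

The returned value would feed straight into `settings.n_threads` before `dav1d_open`, so adding or removing a player only requires reopening the affected decoders.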
Recommended implementation order
Step 1: GPU render scaling (Option A)
- Dynamically adjust texture sizes in SimpleGPURenderer
- Immediate ~75% cut in rendering cost
Step 2: Decoder tuning (Option C)
- Dynamically adjust the thread count and quality options
- An additional 20-30% performance gain
This should be far smoother and more effective than frame skipping. Which option would you like to implement first?