
# MediaCodec Vulkan Surface Integration Design
**Date**: 2025-10-11
**Target**: Android MediaCodec → Vulkan Zero-Copy Pipeline
**References**: MediaCodec_Improvement_Analysis.md, Android_GPU_Surface_Pipeline_Design.md
**Status**: 📋 **Design & Implementation Ready**
---
## 📋 Executive Summary
**Goal**: Pass MediaCodec's hardware-decoded output directly to a Vulkan VkImage through AHardwareBuffer, implementing a zero-copy GPU pipeline.
**Key strategy**:
1. Configure MediaCodec to output into an AHardwareBuffer
2. Import the AHardwareBuffer as a VkImage (VK_ANDROID_external_memory_android_hardware_buffer)
3. Hand the VkImage to the app's Vulkan renderer
4. Synchronization: wait for decode completion with a VkFence
**Reference implementation**: the MediaCodec counterpart of the Windows NVDEC-CUDA-D3D12 pipeline
---
## 🏗️ 1. Architecture Overview
### 1.1 Current Implementation (CPU Path)
```
MediaCodec Decoder → CPU Memory (YUV420P)
        ↓ (memcpy)
Vulkan Upload (vkCmdCopyBufferToImage)
        ↓
GPU Rendering
```
**Problems**:
- 2 memory copies per frame (decode→CPU, CPU→GPU)
- High CPU usage (30-40%)
- 5-10ms of extra latency per frame
### 1.2 Target Implementation (Zero-Copy GPU Path)
```
MediaCodec Decoder → AHardwareBuffer (GPU memory)
        ↓ (VK_ANDROID_external_memory_android_hardware_buffer)
VkImage (imported)
        ↓
Vulkan Sampler (direct binding)
        ↓
GPU Rendering
```
**Advantages**:
- Zero memory copies (GPU-to-GPU)
- Low CPU usage (10-15%)
- 1-2ms latency per frame
---
## 🔍 2. AHardwareBuffer Integration
### 2.1 Creating and Configuring the AHardwareBuffer
**File**: `MediaCodecSurfaceManager.cpp`
```cpp
bool MediaCodecSurfaceManager::SetupAHardwareBuffer() {
    if (!m_vk_device || !m_vk_instance) {
        LogError("Vulkan device not set - call SetVulkanDevice first");
        return false;
    }

    // Step 1: Allocate AHardwareBuffer for decoded video frames
    AHardwareBuffer_Desc desc = {};
    desc.width = m_video_width;
    desc.height = m_video_height;
    desc.layers = 1;
    desc.format = AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420;  // NV12-compatible YUV 4:2:0
    desc.usage = AHARDWAREBUFFER_USAGE_GPU_SAMPLED_IMAGE |
                 AHARDWAREBUFFER_USAGE_GPU_COLOR_OUTPUT;
    // For DRM-protected streams, additionally OR in
    // AHARDWAREBUFFER_USAGE_PROTECTED_CONTENT (requires a protected Vulkan context).

    int result = AHardwareBuffer_allocate(&desc, &m_ahardware_buffer);
    if (result != 0) {
        LogError("Failed to allocate AHardwareBuffer: " + std::to_string(result));
        return false;
    }
    LogInfo("AHardwareBuffer allocated: " + std::to_string(desc.width) + "x" +
            std::to_string(desc.height));

    // Step 2: Create an ANativeWindow from the AHardwareBuffer.
    // This surface will be set as the MediaCodec output.
    if (!CreateSurfaceFromAHardwareBuffer(m_ahardware_buffer)) {
        AHardwareBuffer_release(m_ahardware_buffer);
        m_ahardware_buffer = nullptr;
        return false;
    }

    m_current_surface_type = SurfaceType::HARDWARE_BUFFER;
    return true;
}
```
### 2.2 Creating an ANativeWindow from an AHardwareBuffer
```cpp
bool MediaCodecSurfaceManager::CreateSurfaceFromAHardwareBuffer(AHardwareBuffer* buffer) {
    if (!buffer) {
        LogError("Invalid AHardwareBuffer");
        return false;
    }

    // Get JNI environment
    JNIEnv* env = GetJNIEnv();
    if (!env) {
        LogError("Failed to get JNI environment");
        return false;
    }

    // Step 1: Get the HardwareBuffer class (Android API 26+)
    jclass hardwareBufferClass = env->FindClass("android/hardware/HardwareBuffer");
    if (!hardwareBufferClass) {
        LogError("Failed to find HardwareBuffer class");
        return false;
    }

    // Step 2: Look up HardwareBuffer.createSurface.
    // CAUTION: createSurface is NOT part of the public Android SDK, so this lookup
    // may fail on production devices. The standard NDK route is instead to create an
    // AImageReader with GPU usage flags, hand its ANativeWindow to the codec, and
    // pull AHardwareBuffers from dequeued AImages.
    jmethodID createSurfaceMethod = env->GetStaticMethodID(
        hardwareBufferClass,
        "createSurface",
        "(Landroid/hardware/HardwareBuffer;)Landroid/view/Surface;"
    );
    if (!createSurfaceMethod) {
        LogError("Failed to find createSurface method");
        env->DeleteLocalRef(hardwareBufferClass);
        return false;
    }

    // Step 3: Convert the AHardwareBuffer to a Java HardwareBuffer object
    jobject javaHardwareBuffer = AHardwareBuffer_toHardwareBuffer(env, buffer);
    if (!javaHardwareBuffer) {
        LogError("Failed to convert AHardwareBuffer to Java object");
        env->DeleteLocalRef(hardwareBufferClass);
        return false;
    }

    // Step 4: Call HardwareBuffer.createSurface
    jobject javaSurface = env->CallStaticObjectMethod(
        hardwareBufferClass,
        createSurfaceMethod,
        javaHardwareBuffer
    );
    if (!javaSurface) {
        LogError("Failed to create Surface from HardwareBuffer");
        env->DeleteLocalRef(javaHardwareBuffer);
        env->DeleteLocalRef(hardwareBufferClass);
        return false;
    }

    // Step 5: Convert the Java Surface to an ANativeWindow
    m_native_window = ANativeWindow_fromSurface(env, javaSurface);
    if (!m_native_window) {
        LogError("Failed to get ANativeWindow from Surface");
        env->DeleteLocalRef(javaSurface);
        env->DeleteLocalRef(javaHardwareBuffer);
        env->DeleteLocalRef(hardwareBufferClass);
        return false;
    }

    // Keep a Java reference for cleanup
    m_java_surface = env->NewGlobalRef(javaSurface);

    // Clean up local references
    env->DeleteLocalRef(javaSurface);
    env->DeleteLocalRef(javaHardwareBuffer);
    env->DeleteLocalRef(hardwareBufferClass);

    LogInfo("Surface created from AHardwareBuffer successfully");
    return true;
}
```
---
## 🔗 3. Vulkan Image Import from AHardwareBuffer
### 3.1 Creating the VkImage (External Memory)
**File**: `MediaCodecSurfaceManager.cpp`
```cpp
bool MediaCodecSurfaceManager::CreateVulkanImage(void* vk_device, void* vk_instance) {
    if (!m_ahardware_buffer) {
        LogError("AHardwareBuffer not allocated - call SetupAHardwareBuffer first");
        return false;
    }
    VkDevice device = static_cast<VkDevice>(vk_device);

    // Step 1: Get AHardwareBuffer properties
    AHardwareBuffer_Desc ahb_desc;
    AHardwareBuffer_describe(m_ahardware_buffer, &ahb_desc);

    // Step 2: Query the buffer's Vulkan properties
    VkAndroidHardwareBufferFormatPropertiesANDROID ahb_format_props = {};
    ahb_format_props.sType = VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_FORMAT_PROPERTIES_ANDROID;
    VkAndroidHardwareBufferPropertiesANDROID ahb_props = {};
    ahb_props.sType = VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_PROPERTIES_ANDROID;
    ahb_props.pNext = &ahb_format_props;

    VkResult result = vkGetAndroidHardwareBufferPropertiesANDROID(
        device, m_ahardware_buffer, &ahb_props);
    if (result != VK_SUCCESS) {
        LogError("vkGetAndroidHardwareBufferPropertiesANDROID failed: " + std::to_string(result));
        return false;
    }
    LogInfo("AHardwareBuffer Vulkan properties:");
    LogInfo("  allocationSize: " + std::to_string(ahb_props.allocationSize));
    LogInfo("  memoryTypeBits: " + std::to_string(ahb_props.memoryTypeBits));
    LogInfo("  format: " + std::to_string(ahb_format_props.format));

    // Step 3: Create the VkImage with external memory.
    // NOTE: if ahb_format_props.format is VK_FORMAT_UNDEFINED, the driver uses an
    // implementation-defined format; a VkExternalFormatANDROID struct must then be
    // chained and sampling requires a VkSamplerYcbcrConversion. Not handled here yet.
    VkExternalMemoryImageCreateInfo external_mem_info = {};
    external_mem_info.sType = VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO;
    external_mem_info.handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID;

    VkImageCreateInfo image_info = {};
    image_info.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
    image_info.pNext = &external_mem_info;
    image_info.imageType = VK_IMAGE_TYPE_2D;
    image_info.format = ahb_format_props.format;  // Usually VK_FORMAT_G8_B8R8_2PLANE_420_UNORM
    image_info.extent.width = ahb_desc.width;
    image_info.extent.height = ahb_desc.height;
    image_info.extent.depth = 1;
    image_info.mipLevels = 1;
    image_info.arrayLayers = 1;
    image_info.samples = VK_SAMPLE_COUNT_1_BIT;
    image_info.tiling = VK_IMAGE_TILING_OPTIMAL;
    image_info.usage = VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT;
    image_info.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
    image_info.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;

    VkImage vk_image;
    result = vkCreateImage(device, &image_info, nullptr, &vk_image);
    if (result != VK_SUCCESS) {
        LogError("vkCreateImage failed: " + std::to_string(result));
        return false;
    }

    // Step 4: Import the AHardwareBuffer memory (dedicated allocation is required)
    VkImportAndroidHardwareBufferInfoANDROID import_ahb_info = {};
    import_ahb_info.sType = VK_STRUCTURE_TYPE_IMPORT_ANDROID_HARDWARE_BUFFER_INFO_ANDROID;
    import_ahb_info.buffer = m_ahardware_buffer;

    VkMemoryDedicatedAllocateInfo dedicated_alloc_info = {};
    dedicated_alloc_info.sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO;
    dedicated_alloc_info.pNext = &import_ahb_info;
    dedicated_alloc_info.image = vk_image;

    // Step 5: Find a compatible memory type.
    // For AHB imports the spec requires using the memoryTypeBits returned by
    // vkGetAndroidHardwareBufferPropertiesANDROID; querying
    // vkGetImageMemoryRequirements before binding is not valid for such images.
    uint32_t memory_type_index = FindMemoryType(
        ahb_props.memoryTypeBits,
        VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
    );
    if (memory_type_index == UINT32_MAX) {
        LogError("Failed to find compatible memory type");
        vkDestroyImage(device, vk_image, nullptr);
        return false;
    }

    // Step 6: Allocate and bind memory
    VkMemoryAllocateInfo alloc_info = {};
    alloc_info.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    alloc_info.pNext = &dedicated_alloc_info;
    alloc_info.allocationSize = ahb_props.allocationSize;
    alloc_info.memoryTypeIndex = memory_type_index;

    VkDeviceMemory vk_memory;
    result = vkAllocateMemory(device, &alloc_info, nullptr, &vk_memory);
    if (result != VK_SUCCESS) {
        LogError("vkAllocateMemory failed: " + std::to_string(result));
        vkDestroyImage(device, vk_image, nullptr);
        return false;
    }

    result = vkBindImageMemory(device, vk_image, vk_memory, 0);
    if (result != VK_SUCCESS) {
        LogError("vkBindImageMemory failed: " + std::to_string(result));
        vkFreeMemory(device, vk_memory, nullptr);
        vkDestroyImage(device, vk_image, nullptr);
        return false;
    }

    // Store for later use
    m_vk_image = vk_image;
    m_vk_memory = vk_memory;
    LogInfo("Vulkan image created and bound to AHardwareBuffer memory");
    return true;
}
```
### 3.2 Helper: Memory Type Lookup
```cpp
uint32_t MediaCodecSurfaceManager::FindMemoryType(uint32_t type_filter,
                                                  VkMemoryPropertyFlags properties) {
    VkPhysicalDevice physical_device = GetPhysicalDevice();  // From m_vk_instance
    VkPhysicalDeviceMemoryProperties mem_properties;
    vkGetPhysicalDeviceMemoryProperties(physical_device, &mem_properties);

    for (uint32_t i = 0; i < mem_properties.memoryTypeCount; i++) {
        if ((type_filter & (1 << i)) &&
            (mem_properties.memoryTypes[i].propertyFlags & properties) == properties) {
            return i;
        }
    }
    return UINT32_MAX;  // Not found
}
```
---
## 🎬 4. MediaCodec Configuration
### 4.1 Configuring the MediaCodec Output Surface
**File**: `MediaCodecAV1Decoder.cpp` - modify `Initialize()`
```cpp
bool MediaCodecAV1Decoder::Initialize(const VideoMetadata& metadata) {
    // ... existing initialization ...

    // If a Vulkan device is set, configure for AHardwareBuffer output
    if (m_surface_manager->GetVulkanDevice()) {
        LogInfo("Vulkan device detected - setting up AHardwareBuffer output");

        // Set up the AHardwareBuffer with the video dimensions
        m_surface_manager->SetVideoDimensions(metadata.width, metadata.height);
        if (!m_surface_manager->SetupAHardwareBuffer()) {
            LogError("Failed to setup AHardwareBuffer");
            return false;
        }

        // Create a Vulkan image from the AHardwareBuffer
        if (!m_surface_manager->CreateVulkanImage(
                m_surface_manager->GetVulkanDevice(),
                m_surface_manager->GetVulkanInstance())) {
            LogError("Failed to create Vulkan image");
            return false;
        }

        // Get the Surface for MediaCodec
        m_surface = m_surface_manager->GetAndroidSurface();
        if (!m_surface) {
            LogError("Failed to get ANativeWindow from AHardwareBuffer");
            return false;
        }
        LogInfo("MediaCodec configured for Vulkan zero-copy output");
    }

    // Configure MediaCodec with the surface
    if (m_surface) {
        media_status_t status = AMediaCodec_configure(
            m_codec,
            m_format,
            m_surface,  // Output to surface (AHardwareBuffer-backed)
            nullptr,    // No crypto
            0           // Flags: 0 = decoder mode
        );
        if (status != AMEDIA_OK) {
            LogError("Failed to configure MediaCodec with surface: " + std::to_string(status));
            return false;
        }
        LogInfo("MediaCodec configured with surface output");
    }

    // ... rest of initialization ...
}
```
---
## 🔄 5. DecodeToSurface Implementation
### 5.1 Implementing the Vulkan Surface Path
**File**: `MediaCodecAV1Decoder.cpp`
```cpp
bool MediaCodecAV1Decoder::DecodeToSurface(const uint8_t* packet_data, size_t packet_size,
                                           VavCoreSurfaceType target_type,
                                           void* target_surface,
                                           VideoFrame& output_frame) {
    if (!m_initialized) {
        LogError("Decoder not initialized");
        return false;
    }

    // Handle Vulkan image output
    if (target_type == VAVCORE_SURFACE_VULKAN_IMAGE) {
        // Step 1: Process input buffer (feed packet to MediaCodec)
        if (m_state != DecoderState::FLUSHING) {
            if (!ProcessInputBuffer(packet_data, packet_size)) {
                LogError("Failed to process input buffer");
                return false;
            }
        }

        // Step 2: Check decoder state transition
        {
            std::lock_guard<std::mutex> lock(m_stateMutex);
            if (m_state == DecoderState::READY) {
                m_state = DecoderState::BUFFERING;
                LOGF_DEBUG("[DecodeToSurface] State transition: READY → BUFFERING");
            }
        }

        // Step 3: Try to dequeue an output buffer
        bool hasFrame = ProcessOutputBuffer(output_frame);
        if (!hasFrame) {
            std::lock_guard<std::mutex> lock(m_stateMutex);
            if (m_state == DecoderState::BUFFERING) {
                LOGF_DEBUG("[DecodeToSurface] BUFFERING: packet accepted, no output yet");
                return false;  // VAVCORE_PACKET_ACCEPTED
            }
            if (m_state == DecoderState::FLUSHING) {
                LOGF_INFO("[DecodeToSurface] Flush complete");
                return false;  // VAVCORE_END_OF_STREAM
            }
            return false;  // VAVCORE_PACKET_ACCEPTED
        }

        // Step 4: Frame received - transition to DECODING
        {
            std::lock_guard<std::mutex> lock(m_stateMutex);
            if (m_state == DecoderState::BUFFERING) {
                m_state = DecoderState::DECODING;
                LOGF_INFO("[DecodeToSurface] State transition: BUFFERING → DECODING");
            }
        }

        // Step 5: Get the VkImage from the surface manager
        void* vk_image = m_surface_manager->GetVulkanImage();
        void* vk_memory = m_surface_manager->GetVulkanMemory();
        if (!vk_image) {
            LogError("Failed to get VkImage from surface manager");
            return false;
        }

        // Step 6: Fill the output frame with Vulkan surface data
        output_frame.width = m_width;
        output_frame.height = m_height;
        output_frame.surface_type = VAVCORE_SURFACE_VULKAN_IMAGE;
        output_frame.surface_data.vulkan.vk_image = vk_image;
        output_frame.surface_data.vulkan.vk_device_memory = vk_memory;
        output_frame.surface_data.vulkan.memory_offset = 0;

        // Step 7: Wait for MediaCodec to finish rendering to the AHardwareBuffer.
        // This is implicit - MediaCodec ensures the frame is ready when dequeued.
        IncrementFramesDecoded();
        LOGF_DEBUG("[DecodeToSurface] Vulkan frame %llu decoded", m_stats.frames_decoded);
        return true;
    }

    // ... existing CPU/OpenGL paths ...
}
```
---
## 🔒 6. Synchronization Strategy
### 6.1 MediaCodec Implicit Synchronization
**Good News**: MediaCodec provides implicit synchronization!
```cpp
// When AMediaCodec_dequeueOutputBuffer returns >= 0:
// - Frame is FULLY DECODED and written to AHardwareBuffer
// - Safe to use VkImage imported from that AHardwareBuffer
// - No additional fence needed from MediaCodec side
// Vulkan must still wait before rendering:
// - Use VkFence or VkSemaphore when submitting render commands
// - This ensures Vulkan waits for previous frame's rendering
```
### 6.2 Vulkan Rendering Synchronization
**File**: `vulkan_renderer.cpp` - already implemented in Phase 3!
```cpp
bool VulkanVideoRenderer::RenderVulkanImage(VkImage sourceImage, ...) {
    // ...
    // Begin frame with fence wait
    if (!BeginFrame(imageIndex)) {  // Waits on m_inFlightFences[m_currentFrame]
        return false;
    }

    // ... render commands ...

    // End frame signals the fence
    if (!EndFrame(imageIndex)) {  // Signals m_inFlightFences[m_currentFrame]
        return false;
    }

    // The next call to BeginFrame will wait on this fence
    return true;
}
```
---
## 📊 7. Implementation Checklist
### Phase 1: AHardwareBuffer Setup ⏳
- [ ] Implement `MediaCodecSurfaceManager::SetupAHardwareBuffer()`
- [ ] `AHardwareBuffer_allocate()` with NV12 format
- [ ] `CreateSurfaceFromAHardwareBuffer()` JNI call
- [ ] Verify ANativeWindow creation
### Phase 2: Vulkan Import ⏳
- [ ] Implement `MediaCodecSurfaceManager::CreateVulkanImage()`
- [ ] Call `vkGetAndroidHardwareBufferPropertiesANDROID`
- [ ] Create VkImage with external memory
- [ ] Memory import and bind
### Phase 3: MediaCodec Integration ⏳
- [ ] Modify `MediaCodecAV1Decoder::Initialize()` (Vulkan path)
- [ ] Set up the Surface before `AMediaCodec_configure`
- [ ] Implement the Vulkan path in `DecodeToSurface()`
- [ ] Return the VkImage handle
### Phase 4: VavCore C API ⏳
- [ ] Implement `vavcore_set_vulkan_device()` for real
- [ ] `vavcore_supports_surface_type()` Vulkan support check
- [ ] Return a Vulkan surface from `vavcore_decode_next_frame()`
### Phase 5: Testing & Validation ⏳
- [ ] Test on a Samsung Galaxy S24
- [ ] Verify in logcat: Vulkan device registration
- [ ] Verify in logcat: AHardwareBuffer allocation
- [ ] Verify in logcat: VkImage creation
- [ ] Test actual video playback
---
## ⚠️ 8. Known Limitations & Considerations
### 8.1 Android API Level Requirements
- **Android 8.0 (API 26)+**: AHardwareBuffer basic support
- **Android 10 (API 29)+**: Better Vulkan interop
- **Android 11 (API 30)+**: Recommended for stability
### 8.2 Device Compatibility
**Supported SoCs**:
- Qualcomm Snapdragon 845+ (Adreno 630+)
- Samsung Exynos 9810+ (Mali G72+)
- MediaTek Dimensity 1000+
- Google Tensor G1+
**Unsupported SoCs**: Will fail at `vavcore_supports_surface_type()` check
### 8.3 Format Limitations
- **Only NV12**: `AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420`
- **No HDR**: P010/P016 formats not yet supported
- **No 10-bit**: Limited to 8-bit color depth
### 8.4 Memory Overhead
- **AHardwareBuffer size**: ~3MB per 1080p frame (width × height × 1.5 bytes for NV12)
- **Recommended buffer count**: 3-4 frames for smooth playback
- **Total memory**: ~9-12MB for triple/quadruple buffering
---
## 🚀 9. Expected Performance
### 9.1 Latency Improvements
| Metric | CPU Path | GPU Path (Zero-Copy) | Improvement |
|--------|----------|---------------------|-------------|
| Decode | 5-10ms | 5-10ms | - |
| Upload | 3-5ms | 0ms | **100%** |
| Total | 8-15ms | 5-10ms | **40-67%** |
### 9.2 CPU Usage Reduction
| Phase | CPU Path | GPU Path | Improvement |
|-------|----------|----------|-------------|
| Decode | 20-25% | 20-25% | - |
| Upload | 10-15% | 0% | **100%** |
| Total | 30-40% | 20-25% | **33-50%** |
### 9.3 Battery Life
- **Estimated Improvement**: 20-30% longer video playback time
- **Reason**: Reduced CPU cycles and memory bandwidth
---
## 📝 10. Next Steps
### Immediate Actions
1. ✅ Design document review
2. ⏳ Implement Phase 1-2 (AHardwareBuffer + Vulkan import)
3. ⏳ Implement Phase 3-4 (MediaCodec integration)
4. ⏳ Test on actual Android device
### Short-term
1. Add error handling and fallback to CPU
2. Optimize buffer allocation strategy
3. Add performance metrics logging
4. Document API usage patterns
### Long-term
1. Support HDR formats (P010)
2. Multi-buffer pool for better performance
3. External sync primitives (AHB fences)
4. Cross-vendor compatibility testing
---
**Document version**: 1.0
**Last modified**: 2025-10-11
**Author**: Claude Code (Sonnet 4.5)
**Reference documents**: MediaCodec_Improvement_Analysis.md, Android_GPU_Surface_Pipeline_Design.md