MediaCodec Vulkan Surface Integration Design
Date: 2025-10-11 · Scope: Android MediaCodec → Vulkan Zero-Copy Pipeline · References: MediaCodec_Improvement_Analysis.md, Android_GPU_Surface_Pipeline_Design.md · Status: 📋 Design & Implementation Ready
📋 Executive Summary
Goal: Build a zero-copy GPU pipeline by delivering MediaCodec's hardware-decoded output directly to a Vulkan VkImage via AHardwareBuffer.
Core strategy:
- Configure MediaCodec to output into an AHardwareBuffer
- AHardwareBuffer → VkImage import (VK_ANDROID_external_memory_android_hardware_buffer)
- Hand the VkImage to the app's Vulkan renderer
- Synchronization: wait for decode completion with a VkFence
Reference implementation: the MediaCodec counterpart of the Windows NVDEC-CUDA-D3D12 pipeline
🏗️ 1. Architecture Overview
1.1 Current Implementation (CPU Path)
MediaCodec Decoder → CPU Memory (YUV420P)
↓ (memcpy)
Vulkan Upload (vkCmdCopyBufferToImage)
↓
GPU Rendering
Problems:
- 2 memory copies (decode→CPU, CPU→GPU)
- High CPU usage (30-40%)
- 5-10 ms of extra latency per frame
1.2 Target Implementation (Zero-Copy GPU Path)
MediaCodec Decoder → AHardwareBuffer (GPU memory)
↓ (VK_ANDROID_external_memory_android_hardware_buffer)
VkImage (imported)
↓
Vulkan Sampler (direct binding)
↓
GPU Rendering
Benefits (the prerequisite Vulkan device extensions are sketched after this list):
- 0 memory copies (GPU-to-GPU)
- Low CPU usage (10-15%)
- 1-2 ms latency per frame
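The zero-copy path above only works when the Vulkan device exposes the AHardwareBuffer import extension family. A minimal probe sketch follows; the helper name `SupportsAHardwareBufferImport` is illustrative and not part of the existing code, while the extension strings are the ones required by VK_ANDROID_external_memory_android_hardware_buffer.

```cpp
#define VK_USE_PLATFORM_ANDROID_KHR
#include <vulkan/vulkan.h>
#include <cstring>
#include <vector>

// Sketch: probe for the device extensions the zero-copy import path relies on.
static bool SupportsAHardwareBufferImport(VkPhysicalDevice physical_device) {
    static const char* const kRequired[] = {
        VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_EXTENSION_NAME,
        VK_KHR_SAMPLER_YCBCR_CONVERSION_EXTENSION_NAME,
        VK_KHR_EXTERNAL_MEMORY_EXTENSION_NAME,
        VK_KHR_DEDICATED_ALLOCATION_EXTENSION_NAME,
        VK_EXT_QUEUE_FAMILY_FOREIGN_EXTENSION_NAME,
    };

    uint32_t count = 0;
    vkEnumerateDeviceExtensionProperties(physical_device, nullptr, &count, nullptr);
    std::vector<VkExtensionProperties> available(count);
    vkEnumerateDeviceExtensionProperties(physical_device, nullptr, &count, available.data());

    for (const char* name : kRequired) {
        bool found = false;
        for (const VkExtensionProperties& ext : available) {
            if (std::strcmp(ext.extensionName, name) == 0) { found = true; break; }
        }
        if (!found) return false;  // fall back to the CPU upload path
    }
    return true;
}
```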
🔍 2. AHardwareBuffer Integration
2.1 AHardwareBuffer Allocation and Setup
File: MediaCodecSurfaceManager.cpp
bool MediaCodecSurfaceManager::SetupAHardwareBuffer() {
if (!m_vk_device || !m_vk_instance) {
LogError("Vulkan device not set - call SetVulkanDevice first");
return false;
}
// Step 1: Allocate AHardwareBuffer for decoded video frames
AHardwareBuffer_Desc desc = {};
desc.width = m_video_width;
desc.height = m_video_height;
desc.layers = 1;
desc.format = AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420; // Flexible YUV 4:2:0 (typically NV12 on device)
desc.usage = AHARDWAREBUFFER_USAGE_GPU_SAMPLED_IMAGE |
AHARDWAREBUFFER_USAGE_GPU_COLOR_OUTPUT;
// Add AHARDWAREBUFFER_USAGE_PROTECTED_CONTENT only when decoding DRM-protected content
int result = AHardwareBuffer_allocate(&desc, &m_ahardware_buffer);
if (result != 0) {
LogError("Failed to allocate AHardwareBuffer: " + std::to_string(result));
return false;
}
LogInfo("AHardwareBuffer allocated: " + std::to_string(desc.width) + "x" + std::to_string(desc.height));
// Step 2: Create ANativeWindow from AHardwareBuffer
// This surface will be set as MediaCodec output
if (!CreateSurfaceFromAHardwareBuffer(m_ahardware_buffer)) {
AHardwareBuffer_release(m_ahardware_buffer);
m_ahardware_buffer = nullptr;
return false;
}
m_current_surface_type = SurfaceType::HARDWARE_BUFFER;
return true;
}
2.2 Creating an ANativeWindow from an AHardwareBuffer
bool MediaCodecSurfaceManager::CreateSurfaceFromAHardwareBuffer(AHardwareBuffer* buffer) {
if (!buffer) {
LogError("Invalid AHardwareBuffer");
return false;
}
// Get JNI environment
JNIEnv* env = GetJNIEnv();
if (!env) {
LogError("Failed to get JNI environment");
return false;
}
// Step 1: Get HardwareBuffer class (Android API 28+)
jclass hardwareBufferClass = env->FindClass("android/hardware/HardwareBuffer");
if (!hardwareBufferClass) {
LogError("Failed to find HardwareBuffer class");
return false;
}
// Step 2: Get HardwareBuffer.createSurface method
jmethodID createSurfaceMethod = env->GetStaticMethodID(
hardwareBufferClass,
"createSurface",
"(Landroid/hardware/HardwareBuffer;)Landroid/view/Surface;"
);
if (!createSurfaceMethod) {
LogError("Failed to find createSurface method");
env->DeleteLocalRef(hardwareBufferClass);
return false;
}
// Step 3: Convert AHardwareBuffer to Java HardwareBuffer object
jobject javaHardwareBuffer = AHardwareBuffer_toHardwareBuffer(env, buffer);
if (!javaHardwareBuffer) {
LogError("Failed to convert AHardwareBuffer to Java object");
env->DeleteLocalRef(hardwareBufferClass);
return false;
}
// Step 4: Call HardwareBuffer.createSurface
jobject javaSurface = env->CallStaticObjectMethod(
hardwareBufferClass,
createSurfaceMethod,
javaHardwareBuffer
);
if (!javaSurface) {
LogError("Failed to create Surface from HardwareBuffer");
env->DeleteLocalRef(javaHardwareBuffer);
env->DeleteLocalRef(hardwareBufferClass);
return false;
}
// Step 5: Convert Java Surface to ANativeWindow
m_native_window = ANativeWindow_fromSurface(env, javaSurface);
if (!m_native_window) {
LogError("Failed to get ANativeWindow from Surface");
env->DeleteLocalRef(javaSurface);
env->DeleteLocalRef(javaHardwareBuffer);
env->DeleteLocalRef(hardwareBufferClass);
return false;
}
// Keep Java references for cleanup
m_java_surface = env->NewGlobalRef(javaSurface);
// Cleanup local references
env->DeleteLocalRef(javaSurface);
env->DeleteLocalRef(javaHardwareBuffer);
env->DeleteLocalRef(hardwareBufferClass);
LogInfo("Surface created from AHardwareBuffer successfully");
return true;
}
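An alternative worth noting: instead of going through JNI, the NDK `AImageReader` (API 26+) can supply the `ANativeWindow` for MediaCodec and hand back each decoded frame as an `AHardwareBuffer`. A minimal sketch, assuming hypothetical additions to `MediaCodecSurfaceManager` (`m_image_reader`, `SetupImageReaderSurface()`, and `AcquireLatestHardwareBuffer()` are not in the existing class):

```cpp
#include <media/NdkImageReader.h>
#include <android/hardware_buffer.h>

// Sketch: obtain a MediaCodec output surface whose frames arrive as AHardwareBuffers.
bool MediaCodecSurfaceManager::SetupImageReaderSurface() {
    media_status_t status = AImageReader_newWithUsage(
        m_video_width, m_video_height,
        AIMAGE_FORMAT_PRIVATE,                    // let the decoder pick its native layout
        AHARDWAREBUFFER_USAGE_GPU_SAMPLED_IMAGE,  // frames must be samplable by the GPU
        4,                                        // small queue for pipelining
        &m_image_reader);                         // hypothetical member: AImageReader*
    if (status != AMEDIA_OK) {
        LogError("AImageReader_newWithUsage failed: " + std::to_string(status));
        return false;
    }
    // The window returned here is what gets passed to AMediaCodec_configure().
    return AImageReader_getWindow(m_image_reader, &m_native_window) == AMEDIA_OK;
}

// Sketch: per decoded frame, pull the AHardwareBuffer to import into Vulkan.
// The AImage must stay alive while the buffer is in use; ownership handling is omitted here.
bool MediaCodecSurfaceManager::AcquireLatestHardwareBuffer(AHardwareBuffer** out_buffer) {
    AImage* image = nullptr;
    if (AImageReader_acquireLatestImage(m_image_reader, &image) != AMEDIA_OK) {
        return false;
    }
    return AImage_getHardwareBuffer(image, out_buffer) == AMEDIA_OK;
}
```

With this path the buffer delivered by MediaCodec changes from frame to frame, so the VkImage/VkDeviceMemory import in Section 3 would be repeated per frame (or cached per buffer) rather than performed once.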
🔗 3. Vulkan Image Import from AHardwareBuffer
3.1 VkImage Creation (External Memory)
File: MediaCodecSurfaceManager.cpp
bool MediaCodecSurfaceManager::CreateVulkanImage(void* vk_device, void* vk_instance) {
if (!m_ahardware_buffer) {
LogError("AHardwareBuffer not allocated - call SetupAHardwareBuffer first");
return false;
}
VkDevice device = static_cast<VkDevice>(vk_device);
// Step 1: Get AHardwareBuffer properties
AHardwareBuffer_Desc ahb_desc;
AHardwareBuffer_describe(m_ahardware_buffer, &ahb_desc);
// Step 2: Query Android Hardware Buffer properties for Vulkan
VkAndroidHardwareBufferFormatPropertiesANDROID ahb_format_props = {};
ahb_format_props.sType = VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_FORMAT_PROPERTIES_ANDROID;
VkAndroidHardwareBufferPropertiesANDROID ahb_props = {};
ahb_props.sType = VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_PROPERTIES_ANDROID;
ahb_props.pNext = &ahb_format_props;
VkResult result = vkGetAndroidHardwareBufferPropertiesANDROID(
device,
m_ahardware_buffer,
&ahb_props
);
if (result != VK_SUCCESS) {
LogError("vkGetAndroidHardwareBufferPropertiesANDROID failed: " + std::to_string(result));
return false;
}
LogInfo("AHardwareBuffer Vulkan properties:");
LogInfo(" allocationSize: " + std::to_string(ahb_props.allocationSize));
LogInfo(" memoryTypeBits: " + std::to_string(ahb_props.memoryTypeBits));
LogInfo(" format: " + std::to_string(ahb_format_props.format));
// Step 3: Create VkImage with external memory
VkExternalMemoryImageCreateInfo external_mem_info = {};
external_mem_info.sType = VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO;
external_mem_info.handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID;
VkImageCreateInfo image_info = {};
image_info.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
image_info.pNext = &external_mem_info;
image_info.imageType = VK_IMAGE_TYPE_2D;
image_info.format = ahb_format_props.format; // Usually VK_FORMAT_G8_B8R8_2PLANE_420_UNORM
image_info.extent.width = ahb_desc.width;
image_info.extent.height = ahb_desc.height;
image_info.extent.depth = 1;
image_info.mipLevels = 1;
image_info.arrayLayers = 1;
image_info.samples = VK_SAMPLE_COUNT_1_BIT;
image_info.tiling = VK_IMAGE_TILING_OPTIMAL;
image_info.usage = VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT;
image_info.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
image_info.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
VkImage vk_image;
result = vkCreateImage(device, &image_info, nullptr, &vk_image);
if (result != VK_SUCCESS) {
LogError("vkCreateImage failed: " + std::to_string(result));
return false;
}
// Step 4: Import AHardwareBuffer memory
VkImportAndroidHardwareBufferInfoANDROID import_ahb_info = {};
import_ahb_info.sType = VK_STRUCTURE_TYPE_IMPORT_ANDROID_HARDWARE_BUFFER_INFO_ANDROID;
import_ahb_info.buffer = m_ahardware_buffer;
VkMemoryDedicatedAllocateInfo dedicated_alloc_info = {};
dedicated_alloc_info.sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO;
dedicated_alloc_info.pNext = &import_ahb_info;
dedicated_alloc_info.image = vk_image;
// Step 5: Find compatible memory type
// Note: memory requirements of an image created with the AHardwareBuffer external
// handle type must not be queried before the image is bound, so use
// ahb_props.memoryTypeBits directly
uint32_t memory_type_index = FindMemoryType(
ahb_props.memoryTypeBits,
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
);
if (memory_type_index == UINT32_MAX) {
LogError("Failed to find compatible memory type");
vkDestroyImage(device, vk_image, nullptr);
return false;
}
// Step 6: Allocate and bind memory
VkMemoryAllocateInfo alloc_info = {};
alloc_info.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
alloc_info.pNext = &dedicated_alloc_info;
alloc_info.allocationSize = ahb_props.allocationSize;
alloc_info.memoryTypeIndex = memory_type_index;
VkDeviceMemory vk_memory;
result = vkAllocateMemory(device, &alloc_info, nullptr, &vk_memory);
if (result != VK_SUCCESS) {
LogError("vkAllocateMemory failed: " + std::to_string(result));
vkDestroyImage(device, vk_image, nullptr);
return false;
}
result = vkBindImageMemory(device, vk_image, vk_memory, 0);
if (result != VK_SUCCESS) {
LogError("vkBindImageMemory failed: " + std::to_string(result));
vkFreeMemory(device, vk_memory, nullptr);
vkDestroyImage(device, vk_image, nullptr);
return false;
}
// Store for later use
m_vk_image = vk_image;
m_vk_memory = vk_memory;
LogInfo("Vulkan image created and bound to AHardwareBuffer memory");
return true;
}
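On some devices `vkGetAndroidHardwareBufferPropertiesANDROID` reports `VK_FORMAT_UNDEFINED` for the decoder's buffers; in that case the Vulkan spec requires describing the image through `VkExternalFormatANDROID` and sampling it via a `VkSamplerYcbcrConversion`. A minimal sketch of the adjustment to Step 3 above, reusing the variable names from that listing:

```cpp
// Sketch: handle the VK_FORMAT_UNDEFINED case reported by
// vkGetAndroidHardwareBufferPropertiesANDROID on some drivers.
VkExternalFormatANDROID external_format = {};
external_format.sType = VK_STRUCTURE_TYPE_EXTERNAL_FORMAT_ANDROID;

if (ahb_format_props.format == VK_FORMAT_UNDEFINED) {
    // The driver describes the layout only through an opaque external format value.
    external_format.externalFormat = ahb_format_props.externalFormat;
    external_format.pNext = &external_mem_info;   // keep the external-memory chain intact
    image_info.pNext = &external_format;
    image_info.format = VK_FORMAT_UNDEFINED;
    // External-format images must be sampled through a VkSamplerYcbcrConversion created
    // with the same VkExternalFormatANDROID chained into VkSamplerYcbcrConversionCreateInfo;
    // they are also restricted to OPTIMAL tiling and SAMPLED usage.
}
```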
3.2 Helper: Memory Type Lookup
uint32_t MediaCodecSurfaceManager::FindMemoryType(uint32_t type_filter,
VkMemoryPropertyFlags properties) {
VkPhysicalDevice physical_device = GetPhysicalDevice(); // From m_vk_instance
VkPhysicalDeviceMemoryProperties mem_properties;
vkGetPhysicalDeviceMemoryProperties(physical_device, &mem_properties);
for (uint32_t i = 0; i < mem_properties.memoryTypeCount; i++) {
if ((type_filter & (1 << i)) &&
(mem_properties.memoryTypes[i].propertyFlags & properties) == properties) {
return i;
}
}
return UINT32_MAX; // Not found
}
🎬 4. MediaCodec Configuration
4.1 MediaCodec Output Surface Configuration
File: MediaCodecAV1Decoder.cpp - modifications to Initialize()
bool MediaCodecAV1Decoder::Initialize(const VideoMetadata& metadata) {
// ... existing initialization ...
// If Vulkan device is set, configure for AHardwareBuffer output
if (m_surface_manager->GetVulkanDevice()) {
LogInfo("Vulkan device detected - setting up AHardwareBuffer output");
// Setup AHardwareBuffer with video dimensions
m_surface_manager->SetVideoDimensions(metadata.width, metadata.height);
if (!m_surface_manager->SetupAHardwareBuffer()) {
LogError("Failed to setup AHardwareBuffer");
return false;
}
// Create Vulkan image from AHardwareBuffer
if (!m_surface_manager->CreateVulkanImage(
m_surface_manager->GetVulkanDevice(),
m_surface_manager->GetVulkanInstance())) {
LogError("Failed to create Vulkan image");
return false;
}
// Get the Surface for MediaCodec
m_surface = m_surface_manager->GetAndroidSurface();
if (!m_surface) {
LogError("Failed to get ANativeWindow from AHardwareBuffer");
return false;
}
LogInfo("MediaCodec configured for Vulkan zero-copy output");
}
// Configure MediaCodec with surface
if (m_surface) {
media_status_t status = AMediaCodec_configure(
m_codec,
m_format,
m_surface, // Output to surface (AHardwareBuffer-backed)
nullptr, // No crypto
0 // Decoder mode
);
if (status != AMEDIA_OK) {
LogError("Failed to configure MediaCodec with surface: " + std::to_string(status));
return false;
}
LogInfo("MediaCodec configured with surface output");
}
// ... rest of initialization ...
}
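For context, a minimal sketch of how `m_format` might be populated for an AV1 stream before the configure call above; the keys are standard NDK ones (from `<media/NdkMediaFormat.h>`), and the exact set used by the existing decoder may differ:

```cpp
// Sketch: build the input format handed to AMediaCodec_configure().
m_format = AMediaFormat_new();
AMediaFormat_setString(m_format, AMEDIAFORMAT_KEY_MIME, "video/av01");   // AV1 MIME type
AMediaFormat_setInt32(m_format, AMEDIAFORMAT_KEY_WIDTH, metadata.width);
AMediaFormat_setInt32(m_format, AMEDIAFORMAT_KEY_HEIGHT, metadata.height);
// With a Surface attached, MediaCodec ignores any requested CPU color format and
// writes frames in the device's native layout directly into the surface buffers.
```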
🔄 5. DecodeToSurface Implementation
5.1 Vulkan Surface Path Implementation
File: MediaCodecAV1Decoder.cpp
bool MediaCodecAV1Decoder::DecodeToSurface(const uint8_t* packet_data, size_t packet_size,
VavCoreSurfaceType target_type,
void* target_surface,
VideoFrame& output_frame) {
if (!m_initialized) {
LogError("Decoder not initialized");
return false;
}
// Handle Vulkan image output
if (target_type == VAVCORE_SURFACE_VULKAN_IMAGE) {
// Step 1: Process input buffer (feed packet to MediaCodec)
if (m_state != DecoderState::FLUSHING) {
if (!ProcessInputBuffer(packet_data, packet_size)) {
LogError("Failed to process input buffer");
return false;
}
}
// Step 2: Check decoder state transition
{
std::lock_guard<std::mutex> lock(m_stateMutex);
if (m_state == DecoderState::READY) {
m_state = DecoderState::BUFFERING;
LOGF_DEBUG("[DecodeToSurface] State transition: READY → BUFFERING");
}
}
// Step 3: Try to dequeue output buffer
bool hasFrame = ProcessOutputBuffer(output_frame);
if (!hasFrame) {
std::lock_guard<std::mutex> lock(m_stateMutex);
if (m_state == DecoderState::BUFFERING) {
LOGF_DEBUG("[DecodeToSurface] BUFFERING: packet accepted, no output yet");
return false; // VAVCORE_PACKET_ACCEPTED
}
if (m_state == DecoderState::FLUSHING) {
LOGF_INFO("[DecodeToSurface] Flush complete");
return false; // VAVCORE_END_OF_STREAM
}
return false; // VAVCORE_PACKET_ACCEPTED
}
// Step 4: Frame received - transition to DECODING
{
std::lock_guard<std::mutex> lock(m_stateMutex);
if (m_state == DecoderState::BUFFERING) {
m_state = DecoderState::DECODING;
LOGF_INFO("[DecodeToSurface] State transition: BUFFERING → DECODING");
}
}
// Step 5: Get VkImage from surface manager
void* vk_image = m_surface_manager->GetVulkanImage();
void* vk_memory = m_surface_manager->GetVulkanMemory();
if (!vk_image) {
LogError("Failed to get VkImage from surface manager");
return false;
}
// Step 6: Setup output frame with Vulkan surface data
output_frame.width = m_width;
output_frame.height = m_height;
output_frame.surface_type = VAVCORE_SURFACE_VULKAN_IMAGE;
output_frame.surface_data.vulkan.vk_image = vk_image;
output_frame.surface_data.vulkan.vk_device_memory = vk_memory;
output_frame.surface_data.vulkan.memory_offset = 0;
// Step 7: Wait for MediaCodec to finish rendering to AHardwareBuffer
// This is implicit - MediaCodec ensures frame is ready when dequeued
IncrementFramesDecoded();
LOGF_DEBUG("[DecodeToSurface] Vulkan frame %llu decoded", m_stats.frames_decoded);
return true;
}
// ... existing CPU/OpenGL paths ...
}
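`ProcessOutputBuffer()` is referenced above but not shown. In surface mode the essential detail is releasing the output buffer with `render = true`, which is what actually queues the decoded frame into the AHardwareBuffer-backed surface. A minimal sketch under that assumption, with error handling and state bookkeeping trimmed:

```cpp
// Sketch: dequeue one decoded frame and queue it onto the output surface.
bool MediaCodecAV1Decoder::ProcessOutputBuffer(VideoFrame& output_frame) {
    AMediaCodecBufferInfo info;
    ssize_t index = AMediaCodec_dequeueOutputBuffer(m_codec, &info, 0 /* don't block */);
    if (index < 0) {
        // Covers AMEDIACODEC_INFO_TRY_AGAIN_LATER and format/buffer change notifications.
        return false;
    }
    output_frame.width = m_width;
    output_frame.height = m_height;
    // render = true queues the frame onto the configured surface, so the decoded
    // pixels land in the AHardwareBuffer backing it rather than in a CPU ByteBuffer.
    AMediaCodec_releaseOutputBuffer(m_codec, static_cast<size_t>(index), true /* render */);
    return true;
}
```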
🔒 6. Synchronization Strategy
6.1 MediaCodec Implicit Synchronization
Good News: MediaCodec provides implicit synchronization!
// When AMediaCodec_dequeueOutputBuffer returns >= 0:
// - Frame is FULLY DECODED and written to AHardwareBuffer
// - Safe to use VkImage imported from that AHardwareBuffer
// - No additional fence needed from MediaCodec side
// Vulkan must still wait before rendering:
// - Use VkFence or VkSemaphore when submitting render commands
// - This ensures Vulkan waits for previous frame's rendering
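Even with MediaCodec's implicit guarantee, the imported image still needs a layout transition before it is sampled, and with VK_EXT_queue_family_foreign an acquire from the foreign (non-Vulkan) producer. A sketch of the commonly used barrier, assuming `cmd`, `m_vk_image`, and `m_graphics_queue_family` exist in the renderer (these names are illustrative):

```cpp
// Sketch: acquire the externally written image from MediaCodec/gralloc and
// move it into a shader-readable layout before the draw that samples it.
VkImageMemoryBarrier barrier = {};
barrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
barrier.srcAccessMask = 0;
barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
barrier.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;              // layout is unknown to Vulkan
barrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_FOREIGN_EXT;  // producer outside Vulkan
barrier.dstQueueFamilyIndex = m_graphics_queue_family;      // our graphics queue
barrier.image = m_vk_image;
barrier.subresourceRange = {VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1};

vkCmdPipelineBarrier(cmd,
                     VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
                     VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                     0, 0, nullptr, 0, nullptr, 1, &barrier);
```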
6.2 Vulkan Rendering Synchronization
File: vulkan_renderer.cpp (already implemented in Phase 3)
bool VulkanVideoRenderer::RenderVulkanImage(VkImage sourceImage, ...) {
// ...
// Begin frame with fence wait
if (!BeginFrame(imageIndex)) { // Waits on m_inFlightFences[m_currentFrame]
return false;
}
// ... render commands ...
// End frame signals fence
if (!EndFrame(imageIndex)) { // Signals m_inFlightFences[m_currentFrame]
return false;
}
// Next call to BeginFrame will wait on this fence
return true;
}
📊 7. Implementation Checklist
Phase 1: AHardwareBuffer Setup ⏳
- Implement MediaCodecSurfaceManager::SetupAHardwareBuffer()
- AHardwareBuffer_allocate() with NV12 format
- CreateSurfaceFromAHardwareBuffer() JNI call
- Verify ANativeWindow creation
Phase 2: Vulkan Import ⏳
- Implement MediaCodecSurfaceManager::CreateVulkanImage()
- Call vkGetAndroidHardwareBufferPropertiesANDROID
- Create VkImage with external memory
- Memory import and bind
Phase 3: MediaCodec Integration ⏳
- Modify MediaCodecAV1Decoder::Initialize() (Vulkan path)
- Set up the Surface before MediaCodec configure
- Implement the Vulkan path in DecodeToSurface()
- Return the VkImage handle
Phase 4: VavCore C API ⏳
- Implement vavcore_set_vulkan_device()
- vavcore_supports_surface_type(): report Vulkan support
- vavcore_decode_next_frame(): return the Vulkan surface
Phase 5: Testing & Validation ⏳
- Test on Samsung Galaxy S24
- Verify in logcat: Vulkan device registration
- Verify in logcat: AHardwareBuffer allocation
- Verify in logcat: VkImage creation
- Real video playback test
⚠️ 8. Known Limitations & Considerations
8.1 Android API Level Requirements
- Android 8.0 (API 26)+: AHardwareBuffer basic support
- Android 10 (API 29)+: Better Vulkan interop
- Android 11 (API 30)+: Recommended for stability (a runtime guard is sketched after this list)
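A minimal runtime guard based on the levels above; the helper name is illustrative and the threshold can be tuned:

```cpp
#include <android/api-level.h>

// Sketch: gate the zero-copy path on the runtime API level and fall back to the
// existing CPU upload path on older devices.
bool ShouldUseVulkanZeroCopy() {
    return android_get_device_api_level() >= 29;  // Android 10+ for more reliable interop
}
```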
8.2 Device Compatibility
Supported SoCs:
- Qualcomm Snapdragon 845+ (Adreno 630+)
- Samsung Exynos 9810+ (Mali G72+)
- MediaTek Dimensity 1000+
- Google Tensor G1+
Unsupported SoCs: will fail the vavcore_supports_surface_type() check (see the capability probe sketch below)
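A sketch of the kind of capability probe such a check could perform, using the standard Vulkan 1.1 external-memory query; the function and constant names are from the Vulkan API, while the helper itself is hypothetical:

```cpp
// Sketch: ask the driver whether an NV12 AHardwareBuffer can be imported as a sampled image.
static bool CanImportNV12HardwareBuffer(VkPhysicalDevice physical_device) {
    VkPhysicalDeviceExternalImageFormatInfo external_info = {};
    external_info.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO;
    external_info.handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID;

    VkPhysicalDeviceImageFormatInfo2 format_info = {};
    format_info.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2;
    format_info.pNext = &external_info;
    format_info.format = VK_FORMAT_G8_B8R8_2PLANE_420_UNORM;  // NV12-equivalent Vulkan format
    format_info.type = VK_IMAGE_TYPE_2D;
    format_info.tiling = VK_IMAGE_TILING_OPTIMAL;
    format_info.usage = VK_IMAGE_USAGE_SAMPLED_BIT;

    VkExternalImageFormatProperties external_props = {};
    external_props.sType = VK_STRUCTURE_TYPE_EXTERNAL_IMAGE_FORMAT_PROPERTIES;
    VkImageFormatProperties2 props = {};
    props.sType = VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2;
    props.pNext = &external_props;

    if (vkGetPhysicalDeviceImageFormatProperties2(physical_device, &format_info, &props) != VK_SUCCESS) {
        return false;  // format/usage combination not supported at all
    }
    return (external_props.externalMemoryProperties.externalMemoryFeatures &
            VK_EXTERNAL_MEMORY_FEATURE_IMPORTABLE_BIT) != 0;
}
```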
8.3 Format Limitations
- Only NV12: AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420
- No HDR: P010/P016 formats not yet supported
- No 10-bit: limited to 8-bit color depth
8.4 Memory Overhead
- AHardwareBuffer Size: ~3MB for 1080p (width × height × 1.5 bytes for NV12; 1920 × 1080 × 1.5 ≈ 3.1 MB)
- Recommended Buffer Count: 3-4 frames for smooth playback
- Total Memory: ~9-12MB for 3-4 buffered frames
🚀 9. Expected Performance
9.1 Latency Improvements
| Metric | CPU Path | GPU Path (Zero-Copy) | Improvement |
|---|---|---|---|
| Decode | 5-10ms | 5-10ms | - |
| Upload | 3-5ms | 0ms | 100% |
| Total | 8-15ms | 5-10ms | 40-67% |
9.2 CPU Usage Reduction
| Phase | CPU Path | GPU Path | Improvement |
|---|---|---|---|
| Decode | 20-25% | 20-25% | - |
| Upload | 10-15% | 0% | 100% |
| Total | 30-40% | 20-25% | 33-50% |
9.3 Battery Life
- Estimated Improvement: 20-30% longer video playback time
- Reason: Reduced CPU cycles and memory bandwidth
📝 10. Next Steps
Immediate Actions
- ✅ Design document review
- ⏳ Implement Phase 1-2 (AHardwareBuffer + Vulkan import)
- ⏳ Implement Phase 3-4 (MediaCodec integration)
- ⏳ Test on actual Android device
Short-term
- Add error handling and fallback to CPU
- Optimize buffer allocation strategy
- Add performance metrics logging
- Document API usage patterns
Long-term
- Support HDR formats (P010)
- Multi-buffer pool for better performance
- External sync primitives (AHB fences)
- Cross-vendor compatibility testing
Document version: 1.0 · Last updated: 2025-10-11 · Author: Claude Code (Sonnet 4.5) · Reference documents: MediaCodec_Improvement_Analysis.md, Android_GPU_Surface_Pipeline_Design.md