video-v1/vav2/docs/working/MediaCodec_Vulkan_Integration_Design.md


MediaCodec Vulkan Surface Integration Design

Date: 2025-10-11
Target: Android MediaCodec → Vulkan Zero-Copy Pipeline
References: MediaCodec_Improvement_Analysis.md, Android_GPU_Surface_Pipeline_Design.md
Status: 📋 Design & Implementation Ready


📋 Executive Summary

Goal: Deliver MediaCodec's hardware-decoded output directly to a Vulkan VkImage through AHardwareBuffer, forming a zero-copy GPU pipeline.

Core strategy:

  1. Configure MediaCodec to output to an AHardwareBuffer
  2. Import the AHardwareBuffer as a VkImage (VK_ANDROID_external_memory_android_hardware_buffer)
  3. Hand the VkImage to the app's Vulkan renderer
  4. Synchronization: wait for decode completion with a VkFence

Reference implementation: the MediaCodec counterpart of the Windows NVDEC-CUDA-D3D12 pipeline


🏗️ 1. Architecture Overview

1.1 Current Implementation (CPU Path)

MediaCodec Decoder → CPU Memory (YUV420P)
                        ↓ (memcpy)
                   Vulkan Upload (vkCmdCopyBufferToImage)
                        ↓
                   GPU Rendering

Problems:

  • Two memory copies per frame (decode→CPU, CPU→GPU)
  • High CPU usage (30-40%)
  • 5-10 ms of additional latency per frame
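To put the two copies in perspective, here is a rough back-of-the-envelope estimate of the memory traffic they cause. It assumes 8-bit YUV 4:2:0 frames (1.5 bytes per pixel) and uses hypothetical helper names. The figures are illustrative, not measurements.

```cpp
#include <cassert>
#include <cstdint>

// Bytes in one 8-bit YUV 4:2:0 frame: a full-resolution luma plane plus
// chroma subsampled 2x2 (together half the size of the luma plane).
constexpr std::uint64_t Yuv420FrameBytes(std::uint32_t width, std::uint32_t height) {
    return static_cast<std::uint64_t>(width) * height * 3 / 2;
}

// Memory traffic caused purely by copying decoded frames around.
// copies_per_frame is 2 for the CPU path above and 0 for a zero-copy path.
constexpr std::uint64_t CopyBytesPerSecond(std::uint32_t width, std::uint32_t height,
                                           std::uint32_t fps,
                                           std::uint32_t copies_per_frame) {
    return Yuv420FrameBytes(width, height) * fps * copies_per_frame;
}

// 1080p at 30 fps with two copies per frame moves about 187 MB every second.
static_assert(CopyBytesPerSecond(1920, 1080, 30, 2) == 186624000ULL,
              "3,110,400 bytes x 30 fps x 2 copies");
```

Real traffic is likely somewhat higher, since decoder output rows are often padded to a hardware-specific stride.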

1.2 Target Implementation (Zero-Copy GPU Path)

MediaCodec Decoder → AHardwareBuffer (GPU memory)
                        ↓ (VK_ANDROID_external_memory_android_hardware_buffer)
                   VkImage (imported)
                        ↓
                   Vulkan Sampler (direct binding)
                        ↓
                   GPU Rendering

Advantages:

  • Zero memory copies (frames stay in GPU memory)
  • Low CPU usage (10-15%)
  • 1-2 ms latency per frame

🔍 2. AHardwareBuffer Integration

2.1 AHardwareBuffer Creation and Setup

File: MediaCodecSurfaceManager.cpp

bool MediaCodecSurfaceManager::SetupAHardwareBuffer() {
    if (!m_vk_device || !m_vk_instance) {
        LogError("Vulkan device not set - call SetVulkanDevice first");
        return false;
    }

    // Step 1: Allocate AHardwareBuffer for decoded video frames
    AHardwareBuffer_Desc desc = {};
    desc.width = m_video_width;
    desc.height = m_video_height;
    desc.layers = 1;
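    desc.format = AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420;  // 8-bit YUV 4:2:0; the device picks the plane layout (often NV12)
    desc.usage = AHARDWAREBUFFER_USAGE_GPU_SAMPLED_IMAGE |
                 AHARDWAREBUFFER_USAGE_GPU_COLOR_OUTPUT;
    // Add AHARDWAREBUFFER_USAGE_PROTECTED_CONTENT only for DRM streams: the imported
    // VkImage would then need VK_IMAGE_CREATE_PROTECTED_BIT and a protected-capable queue.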

    int result = AHardwareBuffer_allocate(&desc, &m_ahardware_buffer);
    if (result != 0) {
        LogError("Failed to allocate AHardwareBuffer: " + std::to_string(result));
        return false;
    }

    LogInfo("AHardwareBuffer allocated: " + std::to_string(desc.width) + "x" + std::to_string(desc.height));

    // Step 2: Create ANativeWindow from AHardwareBuffer
    // This surface will be set as MediaCodec output
    if (!CreateSurfaceFromAHardwareBuffer(m_ahardware_buffer)) {
        AHardwareBuffer_release(m_ahardware_buffer);
        m_ahardware_buffer = nullptr;
        return false;
    }

    m_current_surface_type = SurfaceType::HARDWARE_BUFFER;
    return true;
}

2.2 Creating an ANativeWindow from the AHardwareBuffer

bool MediaCodecSurfaceManager::CreateSurfaceFromAHardwareBuffer(AHardwareBuffer* buffer) {
    if (!buffer) {
        LogError("Invalid AHardwareBuffer");
        return false;
    }

    // Get JNI environment
    JNIEnv* env = GetJNIEnv();
    if (!env) {
        LogError("Failed to get JNI environment");
        return false;
    }

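    // Step 1: Get HardwareBuffer class (Java class available since API 26)
    jclass hardwareBufferClass = env->FindClass("android/hardware/HardwareBuffer");
    if (!hardwareBufferClass) {
        env->ExceptionClear();  // FindClass leaves a pending exception on failure
        LogError("Failed to find HardwareBuffer class");
        return false;
    }

    // Step 2: Get HardwareBuffer.createSurface method
    // NOTE: the public android.hardware.HardwareBuffer SDK class does not declare
    // createSurface, so expect this lookup to fail on stock Android. The NDK route to a
    // decoder-writable ANativeWindow is AImageReader_newWithUsage + AImageReader_getWindow,
    // with each frame's buffer obtained through AImage_getHardwareBuffer.
    jmethodID createSurfaceMethod = env->GetStaticMethodID(
        hardwareBufferClass,
        "createSurface",
        "(Landroid/hardware/HardwareBuffer;)Landroid/view/Surface;"
    );

    if (!createSurfaceMethod) {
        env->ExceptionClear();  // GetStaticMethodID throws NoSuchMethodError on failure
        LogError("Failed to find createSurface method");
        env->DeleteLocalRef(hardwareBufferClass);
        return false;
    }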

    // Step 3: Convert AHardwareBuffer to Java HardwareBuffer object
    jobject javaHardwareBuffer = AHardwareBuffer_toHardwareBuffer(env, buffer);
    if (!javaHardwareBuffer) {
        LogError("Failed to convert AHardwareBuffer to Java object");
        env->DeleteLocalRef(hardwareBufferClass);
        return false;
    }

    // Step 4: Call HardwareBuffer.createSurface
    jobject javaSurface = env->CallStaticObjectMethod(
        hardwareBufferClass,
        createSurfaceMethod,
        javaHardwareBuffer
    );

    if (!javaSurface) {
        LogError("Failed to create Surface from HardwareBuffer");
        env->DeleteLocalRef(javaHardwareBuffer);
        env->DeleteLocalRef(hardwareBufferClass);
        return false;
    }

    // Step 5: Convert Java Surface to ANativeWindow
    m_native_window = ANativeWindow_fromSurface(env, javaSurface);
    if (!m_native_window) {
        LogError("Failed to get ANativeWindow from Surface");
        env->DeleteLocalRef(javaSurface);
        env->DeleteLocalRef(javaHardwareBuffer);
        env->DeleteLocalRef(hardwareBufferClass);
        return false;
    }

    // Keep Java references for cleanup
    m_java_surface = env->NewGlobalRef(javaSurface);

    // Cleanup local references
    env->DeleteLocalRef(javaSurface);
    env->DeleteLocalRef(javaHardwareBuffer);
    env->DeleteLocalRef(hardwareBufferClass);

    LogInfo("Surface created from AHardwareBuffer successfully");
    return true;
}

🔗 3. Vulkan Image Import from AHardwareBuffer

3.1 VkImage Creation (External Memory)

File: MediaCodecSurfaceManager.cpp

bool MediaCodecSurfaceManager::CreateVulkanImage(void* vk_device, void* vk_instance) {
    if (!m_ahardware_buffer) {
        LogError("AHardwareBuffer not allocated - call SetupAHardwareBuffer first");
        return false;
    }

    VkDevice device = static_cast<VkDevice>(vk_device);

    // Step 1: Get AHardwareBuffer properties
    AHardwareBuffer_Desc ahb_desc;
    AHardwareBuffer_describe(m_ahardware_buffer, &ahb_desc);

    // Step 2: Query Android Hardware Buffer properties for Vulkan
    VkAndroidHardwareBufferFormatPropertiesANDROID ahb_format_props = {};
    ahb_format_props.sType = VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_FORMAT_PROPERTIES_ANDROID;

    VkAndroidHardwareBufferPropertiesANDROID ahb_props = {};
    ahb_props.sType = VK_STRUCTURE_TYPE_ANDROID_HARDWARE_BUFFER_PROPERTIES_ANDROID;
    ahb_props.pNext = &ahb_format_props;

    VkResult result = vkGetAndroidHardwareBufferPropertiesANDROID(
        device,
        m_ahardware_buffer,
        &ahb_props
    );

    if (result != VK_SUCCESS) {
        LogError("vkGetAndroidHardwareBufferPropertiesANDROID failed: " + std::to_string(result));
        return false;
    }

    LogInfo("AHardwareBuffer Vulkan properties:");
    LogInfo("  allocationSize: " + std::to_string(ahb_props.allocationSize));
    LogInfo("  memoryTypeBits: " + std::to_string(ahb_props.memoryTypeBits));
    LogInfo("  format: " + std::to_string(ahb_format_props.format));

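    // Step 3: Create VkImage with external memory
    // Decoder output frequently has no matching VkFormat (format == VK_FORMAT_UNDEFINED).
    // In that case the image must be created with the driver-specific external format
    // and sampled through a VkSamplerYcbcrConversion built from the same value.
    VkExternalFormatANDROID external_format = {};
    external_format.sType = VK_STRUCTURE_TYPE_EXTERNAL_FORMAT_ANDROID;
    if (ahb_format_props.format == VK_FORMAT_UNDEFINED) {
        external_format.externalFormat = ahb_format_props.externalFormat;
    }

    VkExternalMemoryImageCreateInfo external_mem_info = {};
    external_mem_info.sType = VK_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMAGE_CREATE_INFO;
    external_mem_info.pNext = &external_format;
    external_mem_info.handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID;

    VkImageCreateInfo image_info = {};
    image_info.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
    image_info.pNext = &external_mem_info;
    image_info.imageType = VK_IMAGE_TYPE_2D;
    image_info.format = ahb_format_props.format;  // e.g. VK_FORMAT_G8_B8R8_2PLANE_420_UNORM, or UNDEFINED with an external format
    image_info.extent.width = ahb_desc.width;
    image_info.extent.height = ahb_desc.height;
    image_info.extent.depth = 1;
    image_info.mipLevels = 1;
    image_info.arrayLayers = 1;
    image_info.samples = VK_SAMPLE_COUNT_1_BIT;
    image_info.tiling = VK_IMAGE_TILING_OPTIMAL;
    image_info.usage = VK_IMAGE_USAGE_SAMPLED_BIT;  // External-format images may only be sampled
    image_info.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
    image_info.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;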

    VkImage vk_image;
    result = vkCreateImage(device, &image_info, nullptr, &vk_image);
    if (result != VK_SUCCESS) {
        LogError("vkCreateImage failed: " + std::to_string(result));
        return false;
    }

    // Step 4: Import AHardwareBuffer memory
    VkImportAndroidHardwareBufferInfoANDROID import_ahb_info = {};
    import_ahb_info.sType = VK_STRUCTURE_TYPE_IMPORT_ANDROID_HARDWARE_BUFFER_INFO_ANDROID;
    import_ahb_info.buffer = m_ahardware_buffer;

    VkMemoryDedicatedAllocateInfo dedicated_alloc_info = {};
    dedicated_alloc_info.sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO;
    dedicated_alloc_info.pNext = &import_ahb_info;
    dedicated_alloc_info.image = vk_image;

    // Step 5: Find compatible memory type
    // For AHardwareBuffer-backed images, vkGetImageMemoryRequirements must not be
    // called before the image is bound, so the candidate memory types come solely
    // from vkGetAndroidHardwareBufferPropertiesANDROID.
    uint32_t memory_type_index = FindMemoryType(
        ahb_props.memoryTypeBits,
        VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
    );

    if (memory_type_index == UINT32_MAX) {
        LogError("Failed to find compatible memory type");
        vkDestroyImage(device, vk_image, nullptr);
        return false;
    }

    // Step 6: Allocate and bind memory
    VkMemoryAllocateInfo alloc_info = {};
    alloc_info.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    alloc_info.pNext = &dedicated_alloc_info;
    alloc_info.allocationSize = ahb_props.allocationSize;
    alloc_info.memoryTypeIndex = memory_type_index;

    VkDeviceMemory vk_memory;
    result = vkAllocateMemory(device, &alloc_info, nullptr, &vk_memory);
    if (result != VK_SUCCESS) {
        LogError("vkAllocateMemory failed: " + std::to_string(result));
        vkDestroyImage(device, vk_image, nullptr);
        return false;
    }

    result = vkBindImageMemory(device, vk_image, vk_memory, 0);
    if (result != VK_SUCCESS) {
        LogError("vkBindImageMemory failed: " + std::to_string(result));
        vkFreeMemory(device, vk_memory, nullptr);
        vkDestroyImage(device, vk_image, nullptr);
        return false;
    }

    // Store for later use
    m_vk_image = vk_image;
    m_vk_memory = vk_memory;

    LogInfo("Vulkan image created and bound to AHardwareBuffer memory");
    return true;
}

3.2 Helper: Memory Type Search

uint32_t MediaCodecSurfaceManager::FindMemoryType(uint32_t type_filter,
                                                  VkMemoryPropertyFlags properties) {
    VkPhysicalDevice physical_device = GetPhysicalDevice();  // From m_vk_instance

    VkPhysicalDeviceMemoryProperties mem_properties;
    vkGetPhysicalDeviceMemoryProperties(physical_device, &mem_properties);

    for (uint32_t i = 0; i < mem_properties.memoryTypeCount; i++) {
        if ((type_filter & (1u << i)) &&  // unsigned literal: shifting a signed 1 by 31 is undefined
            (mem_properties.memoryTypes[i].propertyFlags & properties) == properties) {
            return i;
        }
    }

    return UINT32_MAX;  // Not found
}
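The bitmask search above can be exercised without a device. The sketch below mirrors the same loop over plain stand-in structs. The type names and the `kNotFound` constant are hypothetical, and the two flag constants only copy the numeric values of the corresponding Vulkan bits.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the Vulkan types so the sketch compiles without the SDK.
using MemoryPropertyFlags = std::uint32_t;
constexpr MemoryPropertyFlags kDeviceLocal = 0x1;  // value of VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr MemoryPropertyFlags kHostVisible = 0x2;  // value of VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT

struct MemoryType { MemoryPropertyFlags property_flags; };
struct MemoryProperties {
    std::uint32_t memory_type_count;
    MemoryType memory_types[32];
};

constexpr std::uint32_t kNotFound = UINT32_MAX;

// Returns the first memory type that is both allowed by type_filter
// (bit i set means type i is acceptable) and has every required flag.
std::uint32_t FindMemoryTypeIndex(const MemoryProperties& props,
                                  std::uint32_t type_filter,
                                  MemoryPropertyFlags required) {
    assert(props.memory_type_count <= 32);
    for (std::uint32_t i = 0; i < props.memory_type_count; ++i) {
        const bool allowed = (type_filter & (1u << i)) != 0;
        const bool has_flags =
            (props.memory_types[i].property_flags & required) == required;
        if (allowed && has_flags) {
            return i;
        }
    }
    return kNotFound;
}
```

Passing `0` as the required flags accepts any allowed type, which can serve as a fallback when the buffer's memory types do not include a device-local one.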

🎬 4. MediaCodec Configuration

4.1 MediaCodec Output Surface Setup

File: MediaCodecAV1Decoder.cpp, changes to Initialize()

bool MediaCodecAV1Decoder::Initialize(const VideoMetadata& metadata) {
    // ... existing initialization ...

    // If Vulkan device is set, configure for AHardwareBuffer output
    if (m_surface_manager->GetVulkanDevice()) {
        LogInfo("Vulkan device detected - setting up AHardwareBuffer output");

        // Setup AHardwareBuffer with video dimensions
        m_surface_manager->SetVideoDimensions(metadata.width, metadata.height);

        if (!m_surface_manager->SetupAHardwareBuffer()) {
            LogError("Failed to setup AHardwareBuffer");
            return false;
        }

        // Create Vulkan image from AHardwareBuffer
        if (!m_surface_manager->CreateVulkanImage(
                m_surface_manager->GetVulkanDevice(),
                m_surface_manager->GetVulkanInstance())) {
            LogError("Failed to create Vulkan image");
            return false;
        }

        // Get the Surface for MediaCodec
        m_surface = m_surface_manager->GetAndroidSurface();
        if (!m_surface) {
            LogError("Failed to get ANativeWindow from AHardwareBuffer");
            return false;
        }

        LogInfo("MediaCodec configured for Vulkan zero-copy output");
    }

    // Configure MediaCodec with surface
    if (m_surface) {
        media_status_t status = AMediaCodec_configure(
            m_codec,
            m_format,
            m_surface,  // Output to surface (AHardwareBuffer-backed)
            nullptr,    // No crypto
            0           // Decoder mode
        );

        if (status != AMEDIA_OK) {
            LogError("Failed to configure MediaCodec with surface: " + std::to_string(status));
            return false;
        }

        LogInfo("MediaCodec configured with surface output");
    }

    // ... rest of initialization ...
}

🔄 5. DecodeToSurface Implementation

5.1 Vulkan Surface Path Implementation

File: MediaCodecAV1Decoder.cpp

bool MediaCodecAV1Decoder::DecodeToSurface(const uint8_t* packet_data, size_t packet_size,
                                          VavCoreSurfaceType target_type,
                                          void* target_surface,
                                          VideoFrame& output_frame) {
    if (!m_initialized) {
        LogError("Decoder not initialized");
        return false;
    }

    // Handle Vulkan image output
    if (target_type == VAVCORE_SURFACE_VULKAN_IMAGE) {
        // Step 1: Process input buffer (feed packet to MediaCodec)
        if (m_state != DecoderState::FLUSHING) {
            if (!ProcessInputBuffer(packet_data, packet_size)) {
                LogError("Failed to process input buffer");
                return false;
            }
        }

        // Step 2: Check decoder state transition
        {
            std::lock_guard<std::mutex> lock(m_stateMutex);

            if (m_state == DecoderState::READY) {
                m_state = DecoderState::BUFFERING;
                LOGF_DEBUG("[DecodeToSurface] State transition: READY → BUFFERING");
            }
        }

        // Step 3: Try to dequeue output buffer
        bool hasFrame = ProcessOutputBuffer(output_frame);

        if (!hasFrame) {
            std::lock_guard<std::mutex> lock(m_stateMutex);

            if (m_state == DecoderState::BUFFERING) {
                LOGF_DEBUG("[DecodeToSurface] BUFFERING: packet accepted, no output yet");
                return false;  // VAVCORE_PACKET_ACCEPTED
            }

            if (m_state == DecoderState::FLUSHING) {
                LOGF_INFO("[DecodeToSurface] Flush complete");
                return false;  // VAVCORE_END_OF_STREAM
            }

            return false;  // VAVCORE_PACKET_ACCEPTED
        }

        // Step 4: Frame received - transition to DECODING
        {
            std::lock_guard<std::mutex> lock(m_stateMutex);
            if (m_state == DecoderState::BUFFERING) {
                m_state = DecoderState::DECODING;
                LOGF_INFO("[DecodeToSurface] State transition: BUFFERING → DECODING");
            }
        }

        // Step 5: Get VkImage from surface manager
        void* vk_image = m_surface_manager->GetVulkanImage();
        void* vk_memory = m_surface_manager->GetVulkanMemory();

        if (!vk_image) {
            LogError("Failed to get VkImage from surface manager");
            return false;
        }

        // Step 6: Setup output frame with Vulkan surface data
        output_frame.width = m_width;
        output_frame.height = m_height;
        output_frame.surface_type = VAVCORE_SURFACE_VULKAN_IMAGE;
        output_frame.surface_data.vulkan.vk_image = vk_image;
        output_frame.surface_data.vulkan.vk_device_memory = vk_memory;
        output_frame.surface_data.vulkan.memory_offset = 0;

        // Step 7: Make sure the frame has actually reached the AHardwareBuffer
        // Dequeueing alone does not write to the Surface; this relies on ProcessOutputBuffer
        // calling AMediaCodec_releaseOutputBuffer(codec, index, true) for the frame

        IncrementFramesDecoded();
        LOGF_DEBUG("[DecodeToSurface] Vulkan frame %llu decoded", m_stats.frames_decoded);
        return true;
    }

    // ... existing CPU/OpenGL paths ...
}
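DecodeToSurface returns `false` for three different situations (packet accepted, end of stream, and real errors), which the comments distinguish but the `bool` cannot. A minimal sketch of the same state transitions with an explicit result type follows. `DecodeResult` and `ClassifyDecodeStep` are hypothetical names taken from the comments, not existing VavCore API.

```cpp
#include <cassert>

enum class DecoderState { READY, BUFFERING, DECODING, FLUSHING };

// Hypothetical result codes named after the comments in DecodeToSurface.
enum class DecodeResult { FRAME_READY, PACKET_ACCEPTED, END_OF_STREAM };

// Advances the state in place and reports what the caller should do next.
// has_frame is the outcome of trying to dequeue an output buffer.
DecodeResult ClassifyDecodeStep(DecoderState& state, bool has_frame) {
    if (state == DecoderState::READY) {
        state = DecoderState::BUFFERING;  // first packet has been submitted
    }
    if (!has_frame) {
        // No output yet: either the pipeline is still filling, or a flush has drained it.
        return state == DecoderState::FLUSHING ? DecodeResult::END_OF_STREAM
                                               : DecodeResult::PACKET_ACCEPTED;
    }
    if (state == DecoderState::BUFFERING) {
        state = DecoderState::DECODING;  // pipeline is primed
    }
    return DecodeResult::FRAME_READY;
}
```

With a result type like this, the caller can keep feeding packets on `PACKET_ACCEPTED` and stop on `END_OF_STREAM` without inspecting decoder state.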

🔒 6. Synchronization Strategy

6.1 MediaCodec Implicit Synchronization

MediaCodec handles most of the decoder-side synchronization, with one caveat for Surface output:

// When AMediaCodec_dequeueOutputBuffer returns an index >= 0:
// - The frame has been decoded, but in Surface mode it is not yet in the output buffer
// - AMediaCodec_releaseOutputBuffer(codec, index, true) queues it to the Surface
// - Only after that release (and any acquire fence the buffer queue attaches)
//   is the VkImage imported from that AHardwareBuffer safe to sample

// Vulkan must still wait before rendering:
// - Use VkFence or VkSemaphore when submitting render commands
// - This ensures Vulkan waits for the previous frame's rendering

6.2 Vulkan Rendering Synchronization

File: vulkan_renderer.cpp (already implemented in Phase 3)

bool VulkanVideoRenderer::RenderVulkanImage(VkImage sourceImage, ...) {
    // ...

    // Begin frame with fence wait
    if (!BeginFrame(imageIndex)) {  // Waits on m_inFlightFences[m_currentFrame]
        return false;
    }

    // ... render commands ...

    // End frame signals fence
    if (!EndFrame(imageIndex)) {  // Signals m_inFlightFences[m_currentFrame]
        return false;
    }

    // Next call to BeginFrame will wait on this fence
    return true;
}
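The BeginFrame/EndFrame pairing can be modeled without a GPU to make the fence rotation explicit. The sketch below replaces each VkFence with a flag. `InFlightRing` and `SignalFence` are hypothetical, and a real renderer would block in vkWaitForFences rather than return `false`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Models the per-frame fences as flags: true means GPU work for that slot is pending.
class InFlightRing {
public:
    explicit InFlightRing(std::size_t frames_in_flight)
        : pending_(frames_in_flight, false) {
        assert(frames_in_flight > 0);
    }

    // Mirrors BeginFrame: the slot may only be reused once its fence has signaled.
    bool BeginFrame() const { return !pending_[current_]; }

    // Mirrors EndFrame: submit work guarded by the current slot's fence, then advance.
    void EndFrame() {
        pending_[current_] = true;
        current_ = (current_ + 1) % pending_.size();
    }

    // Stands in for the GPU signaling the fence of a slot when rendering completes.
    void SignalFence(std::size_t slot) { pending_.at(slot) = false; }

    std::size_t current() const { return current_; }

private:
    std::vector<bool> pending_;
    std::size_t current_ = 0;
};
```

With two frames in flight, the third BeginFrame call reports the slot as busy until slot 0 is signaled. The same constraint applies to the decoder output: a buffer the renderer is still sampling must not be handed back for overwriting.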

📊 7. Implementation Checklist

Phase 1: AHardwareBuffer Setup

  • Implement MediaCodecSurfaceManager::SetupAHardwareBuffer()
  • AHardwareBuffer_allocate() with NV12 format
  • CreateSurfaceFromAHardwareBuffer() JNI call
  • Verify ANativeWindow creation

Phase 2: Vulkan Import

  • Implement MediaCodecSurfaceManager::CreateVulkanImage()
  • Call vkGetAndroidHardwareBufferPropertiesANDROID
  • Create the VkImage with external memory
  • Memory import and bind

Phase 3: MediaCodec Integration

  • Modify MediaCodecAV1Decoder::Initialize() (Vulkan path)
  • Set up the Surface before MediaCodec configure
  • Implement the DecodeToSurface() Vulkan path
  • Return the VkImage handle

Phase 4: VavCore C API

  • Provide a real implementation of vavcore_set_vulkan_device()
  • vavcore_supports_surface_type(): report Vulkan support
  • vavcore_decode_next_frame(): return the Vulkan surface

Phase 5: Testing & Validation

  • Test on Samsung Galaxy S24
  • Logcat verification: Vulkan device registration
  • Logcat verification: AHardwareBuffer allocation
  • Logcat verification: VkImage creation
  • Playback test with real video

⚠️ 8. Known Limitations & Considerations

8.1 Android API Level Requirements

  • Android 8.0 (API 26)+: AHardwareBuffer basic support
  • Android 10 (API 29)+: Better Vulkan interop
  • Android 11 (API 30)+: Recommended for stability

8.2 Device Compatibility

Supported SoCs:

  • Qualcomm Snapdragon 845+ (Adreno 630+)
  • Samsung Exynos 9810+ (Mali G72+)
  • MediaTek Dimensity 1000+
  • Google Tensor G1+

Unsupported SoCs: Will fail at vavcore_supports_surface_type() check
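A capability check like this is most useful when it drives a fallback rather than a hard failure. The sketch below picks the best available output path. The `supports` callback stands in for a query such as vavcore_supports_surface_type(), whose exact signature is not shown in this document, and the enum is a simplified hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <initializer_list>

enum class SurfaceType { CPU, OPENGL_TEXTURE, VULKAN_IMAGE };

// Tries the GPU paths in order of preference and falls back to the CPU path,
// which is assumed to be available on every device.
SurfaceType SelectSurfaceType(const std::function<bool(SurfaceType)>& supports) {
    for (SurfaceType candidate :
         {SurfaceType::VULKAN_IMAGE, SurfaceType::OPENGL_TEXTURE}) {
        if (supports(candidate)) {
            return candidate;
        }
    }
    return SurfaceType::CPU;
}
```

On a device outside the supported list, the callback returns `false` for the Vulkan path and playback continues on the next path instead of aborting.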

8.3 Format Limitations

  • Only NV12: AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420
  • No HDR: P010/P016 formats not yet supported
  • No 10-bit: Limited to 8-bit color depth

8.4 Memory Overhead

  • AHardwareBuffer Size: ~3 MB for 1080p (width × height × 1.5 = 3,110,400 bytes, before any stride padding)
  • Recommended Buffer Count: 3-4 frames for smooth playback
  • Total Memory: ~9-12 MB for 3-4 buffers
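The width × height × 1.5 estimate can be checked with a short calculation. The sketch below also accounts for row padding, since real allocations may round each row up to a hardware-specific stride. The helper names and the alignment parameter are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Rounds value up to the next multiple of alignment (alignment must be > 0).
constexpr std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Size of one 8-bit YUV 4:2:0 buffer whose rows are padded to stride_alignment bytes.
constexpr std::uint64_t PaddedYuv420Bytes(std::uint32_t width, std::uint32_t height,
                                          std::uint32_t stride_alignment) {
    return AlignUp(width, stride_alignment) * height * 3 / 2;
}

// Total memory for a pool of identically sized buffers.
constexpr std::uint64_t PoolBytes(std::uint32_t width, std::uint32_t height,
                                  std::uint32_t stride_alignment,
                                  std::uint32_t buffer_count) {
    return PaddedYuv420Bytes(width, height, stride_alignment) * buffer_count;
}

// Unpadded 1080p: 1920 x 1080 x 1.5 = 3,110,400 bytes, roughly 3 MB per buffer.
static_assert(PaddedYuv420Bytes(1920, 1080, 1) == 3110400ULL, "1080p frame size");
```

For 1080p this gives 9,331,200 bytes for three buffers and 12,441,600 bytes for four.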

🚀 9. Expected Performance

9.1 Latency Improvements

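| Metric | CPU Path | GPU Path (Zero-Copy) | Improvement |
|--------|----------|----------------------|-------------|
| Decode | 5-10 ms | 5-10 ms | - |
| Upload | 3-5 ms | 0 ms | 100% |
| Total | 8-15 ms | 5-10 ms | ~23-50% |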

9.2 CPU Usage Reduction

| Phase | CPU Path | GPU Path | Improvement |
|-------|----------|----------|-------------|
| Decode | 20-25% | 20-25% | - |
| Upload | 10-15% | 0% | 100% |
| Total | 30-40% | 20-25% | ~29-43% |
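Because decode cost is the same on both paths, the total improvement equals the upload stage's share of the CPU-path total. A small sketch of that arithmetic, with a hypothetical helper name:

```cpp
#include <cassert>
#include <cmath>

// Relative reduction, in percent, when a metric drops from before to after.
double ImprovementPercent(double before, double after) {
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}

// Improvement from removing the upload stage while decode cost stays unchanged.
double UploadRemovalPercent(double decode_cost, double upload_cost) {
    return ImprovementPercent(decode_cost + upload_cost, decode_cost);
}
```

For latency, the bounds are a slow decode with a fast upload (10 ms + 3 ms, about 23%) and a fast decode with a slow upload (5 ms + 5 ms, 50%). For CPU usage, the same reasoning gives about 29% (25% + 10%) to about 43% (20% + 15%).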

9.3 Battery Life

  • Estimated Improvement: 20-30% longer video playback time
  • Reason: Reduced CPU cycles and memory bandwidth

📝 10. Next Steps

Immediate Actions

  1. Design document review
  2. Implement Phase 1-2 (AHardwareBuffer + Vulkan import)
  3. Implement Phase 3-4 (MediaCodec integration)
  4. Test on actual Android device

Short-term

  1. Add error handling and fallback to CPU
  2. Optimize buffer allocation strategy
  3. Add performance metrics logging
  4. Document API usage patterns

Long-term

  1. Support HDR formats (P010)
  2. Multi-buffer pool for better performance
  3. External sync primitives (AHB fences)
  4. Cross-vendor compatibility testing

Document version: 1.0
Last modified: 2025-10-11
Author: Claude Code (Sonnet 4.5)
References: MediaCodec_Improvement_Analysis.md, Android_GPU_Surface_Pipeline_Design.md