
fix: Fmp4::init_audio() doesn't populate description for AAC, causing downstream format mismatch #1024

Open: wanjohiryan wants to merge 1 commit into moq-dev:main from nestrilabs:main

Conversation

@wanjohiryan (Contributor)

In rs/moq-mux/src/import/fmp4.rs, init_audio() sets description: None for AAC tracks (marked with // TODO?):

mp4_atom::Codec::Mp4a(mp4a) => {
    // ...
    AudioConfig {
        codec: AAC { profile: desc.dec_specific.profile }.into(),
        sample_rate: mp4a.audio.sample_rate.integer() as _,
        channel_count: mp4a.audio.channel_count as _,
        bitrate: Some(bitrate.into()),
        container,
        jitter: None,
        description: None, // TODO?
    }
}

The problem: Fmp4::extract() writes raw AAC frames (no ADTS headers) into the track — that's correct for fMP4, where frames are always raw. But without description, consumers have no AudioSpecificConfig to initialize their decoder. They're forced to assume ADTS framing (which includes the config inline), but the actual payload is headerless raw AAC. The decoder either fails or produces silence/garbage.
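
To make the consumer side concrete, here is a minimal sketch of what a decoder setup step could do with the description bytes once they are present. The names `AacConfig` and `parse_audio_specific_config` are hypothetical and not part of moq-mux; the sketch only handles the direct 5-bit audioObjectType (no escape coding) and ignores any extension fields after channelConfiguration.

```rust
/// Hypothetical container for the fields a decoder needs (not a moq-mux type).
#[derive(Debug, PartialEq)]
pub struct AacConfig {
    pub object_type: u8, // audioObjectType, e.g. 2 = AAC-LC
    pub sample_rate: u32,
    pub channels: u8, // channelConfiguration
}

/// samplingFrequencyIndex table from ISO 14496-3; indices 13 and 14 are reserved.
const SAMPLE_RATES: [u32; 13] = [
    96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050, 16000, 12000, 11025, 8000, 7350,
];

/// Parses the leading fields of an AudioSpecificConfig.
/// Returns None for truncated input, reserved indices, or escape-coded object types.
pub fn parse_audio_specific_config(data: &[u8]) -> Option<AacConfig> {
    if data.len() < 2 {
        return None;
    }
    let object_type = data[0] >> 3;
    if object_type == 31 {
        return None; // escape-coded object types are out of scope for this sketch
    }
    // The 4-bit frequency index straddles the first two bytes.
    let freq_index = ((data[0] & 0x07) << 1) | (data[1] >> 7);
    if freq_index == 0x0F {
        // Explicit 24-bit sample rate follows the index (5-byte form).
        if data.len() < 5 {
            return None;
        }
        let sample_rate = ((data[1] as u32 & 0x7F) << 17)
            | ((data[2] as u32) << 9)
            | ((data[3] as u32) << 1)
            | ((data[4] as u32) >> 7);
        let channels = (data[4] >> 3) & 0x0F;
        Some(AacConfig { object_type, sample_rate, channels })
    } else {
        let sample_rate = *SAMPLE_RATES.get(freq_index as usize)?;
        let channels = (data[1] >> 3) & 0x0F;
        Some(AacConfig { object_type, sample_rate, channels })
    }
}
```

With `description: None` there is nothing to feed into a function like this, which is why the consumer has to guess at the framing.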

The fix: Build the AudioSpecificConfig from the already-parsed ESDS fields. This is a 2-byte blob for standard sample rates (e.g. 0x12 0x10 for AAC-LC / 44100 Hz / stereo):

mp4_atom::Codec::Mp4a(mp4a) => {
    let desc = &mp4a.esds.es_desc.dec_config;

    if desc.object_type_indication != 0x40 {
        anyhow::bail!("unsupported codec: MPEG2");
    }

    let bitrate = desc.avg_bitrate.max(desc.max_bitrate);
    let profile = desc.dec_specific.profile;
    let sample_rate = mp4a.audio.sample_rate.integer() as u32;
    let channel_count = mp4a.audio.channel_count as u32;

    AudioConfig {
        codec: AAC { profile }.into(),
        sample_rate,
        channel_count,
        bitrate: Some(bitrate.into()),
        container, // unchanged from the existing code
        jitter: None,
        description: Some(build_audio_specific_config(profile, sample_rate, channel_count)),
    }
}

/// ISO 14496-3 §1.6.2.1 AudioSpecificConfig
fn build_audio_specific_config(profile: u8, sample_rate: u32, channels: u32) -> Bytes {
    let freq_index: u8 = match sample_rate {
        96000 => 0, 88200 => 1, 64000 => 2, 48000 => 3,
        44100 => 4, 32000 => 5, 24000 => 6, 22050 => 7,
        16000 => 8, 12000 => 9, 11025 => 10, 8000 => 11,
        7350 => 12, _ => 0xF,
    };

    if freq_index != 0xF {
        // 5 + 4 + 4 = 13 bits → 2 bytes
        let b0 = (profile << 3) | (freq_index >> 1);
        let b1 = ((freq_index & 1) << 7) | ((channels as u8 & 0x0F) << 3);
        Bytes::from(vec![b0, b1])
    } else {
        // 5 + 4 + 24 + 4 = 37 bits → 5 bytes
        let mut bits: u64 = 0;
        bits |= (profile as u64) << 35;
        bits |= 0xF_u64 << 31;
        bits |= (sample_rate as u64) << 7;
        bits |= ((channels as u64) & 0xF) << 3;
        Bytes::copy_from_slice(&bits.to_be_bytes()[3..8])
    }
}
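
As a sanity check on the bit packing, the helper can be exercised in isolation. The copy below is a sketch that returns `Vec<u8>` instead of `Bytes` so it compiles without the `bytes` crate; that substitution, and the extra mask on the 24-bit sample rate, are assumptions made for this example and not part of the PR.

```rust
/// Standalone copy of the helper, returning Vec<u8> (assumption: avoids the `bytes` dependency).
fn build_audio_specific_config(profile: u8, sample_rate: u32, channels: u32) -> Vec<u8> {
    let freq_index: u8 = match sample_rate {
        96000 => 0, 88200 => 1, 64000 => 2, 48000 => 3,
        44100 => 4, 32000 => 5, 24000 => 6, 22050 => 7,
        16000 => 8, 12000 => 9, 11025 => 10, 8000 => 11,
        7350 => 12, _ => 0xF,
    };

    if freq_index != 0xF {
        // 5-bit object type, 4-bit frequency index, 4-bit channel config, 3 padding bits.
        let b0 = (profile << 3) | (freq_index >> 1);
        let b1 = ((freq_index & 1) << 7) | ((channels as u8 & 0x0F) << 3);
        vec![b0, b1]
    } else {
        // Same layout, with the explicit 24-bit rate inserted after the 0xF escape index.
        let mut bits: u64 = 0;
        bits |= (profile as u64) << 35;
        bits |= 0xF_u64 << 31;
        bits |= ((sample_rate as u64) & 0xFF_FFFF) << 7; // mask added in this sketch
        bits |= ((channels as u64) & 0xF) << 3;
        bits.to_be_bytes()[3..8].to_vec()
    }
}

/// Example inputs: AAC-LC stereo at two standard rates and one non-standard rate.
fn example_configs() -> (Vec<u8>, Vec<u8>, Vec<u8>) {
    (
        build_audio_specific_config(2, 44100, 2), // [0x12, 0x10]
        build_audio_specific_config(2, 48000, 2), // [0x11, 0x90]
        build_audio_specific_config(2, 44000, 2), // 5-byte form
    )
}
```

The first tuple element matches the 0x12 0x10 value quoted above; the third shows the 5-byte path being taken for a rate that is not in the index table.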

Note: The same // TODO? exists for Opus. Opus doesn't strictly need it (the essential config is in the codec string + sample rate + channels), but for completeness it could carry the OpusHead bytes.
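
If the Opus description were filled in, one possible shape is the 19-byte identification header from RFC 7845. The sketch below is an assumption about how that could look, not code from this PR: `build_opus_head` is a hypothetical helper, its inputs would have to come from the parsed Opus sample entry, and it only covers channel mapping family 0 (mono or stereo).

```rust
/// Hypothetical helper: builds an RFC 7845 OpusHead for channel mapping family 0.
/// All multi-byte fields are little-endian, unlike most MP4 box fields.
fn build_opus_head(channels: u8, pre_skip: u16, input_sample_rate: u32, output_gain: i16) -> Vec<u8> {
    let mut head = Vec::with_capacity(19);
    head.extend_from_slice(b"OpusHead"); // magic signature
    head.push(1); // version
    head.push(channels); // output channel count
    head.extend_from_slice(&pre_skip.to_le_bytes());
    head.extend_from_slice(&input_sample_rate.to_le_bytes());
    head.extend_from_slice(&output_gain.to_le_bytes());
    head.push(0); // channel mapping family 0: no mapping table follows
    head
}
```

Streams with more than two channels use a different mapping family and need the channel mapping table appended, which this sketch does not attempt.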

Comparison with video: init_video already does this correctly — it calls avcc.encode_body() to populate description for H.264. Audio just needs the equivalent treatment.

@coderabbitai (Contributor)

coderabbitai bot commented Mar 2, 2026

Walkthrough

The change modifies AAC audio handling in the FMP4 import module. It introduces a helper function that reconstructs the AudioSpecificConfig for MP4A audio by extracting profile, sample rate, and channel count, then building the appropriate codec data according to ISO 14496-3 §1.6.2.1 specifications. The AudioConfig now uses this computed codec description instead of direct inline expressions. The helper function generates either a 2-byte or 5-byte AudioSpecificConfig encoding based on the sample rate frequency index.

🚥 Pre-merge checks: ✅ 3 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and specifically describes the main bug fix: AAC audio tracks lack description population, causing downstream format mismatch. |
| Description check | ✅ Passed | The description thoroughly explains the problem, the root cause, the specific fix with code examples, and references related work (video codec handling, Opus TODO). |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
rs/moq-mux/src/import/fmp4.rs (1)

688-721: Bit manipulation and frequency index table are correct per ISO 14496-3.

The implementation correctly handles both the 2-byte (standard sample rates) and 5-byte (extended) forms.

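One minor defensive consideration: if profile >= 32, the value no longer fits the 5-bit audioObjectType field. On line 708, profile << 3 on a u8 does not panic (Rust only panics in debug builds when the shift amount is at least the bit width); the high bits are silently discarded, so the emitted config would advertise the wrong object type. Standard AAC profiles (1=Main, 2=LC, 3=SSR, 4=LTP, etc.) are well within range, but malformed input could trigger this.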

🛡️ Optional: Add defensive bounds check
 fn build_aac_audio_specific_config(profile: u8, sample_rate: u32, channels: u32) -> Bytes {
+	// audioObjectType is 5 bits; values >= 32 would require escape coding per ISO 14496-3
+	debug_assert!(profile < 32, "profile exceeds 5-bit audioObjectType range");
+
 	let freq_index: u8 = match sample_rate {
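
For readers weighing the bounds check, the sketch below shows how an out-of-range value behaves under a left shift and one way to reject it up front. `first_config_byte` is a hypothetical helper written for this illustration, not a function in the PR; it treats 31 as out of range too, since that value is the escape code in ISO 14496-3.

```rust
/// Hypothetical helper: packs the first AudioSpecificConfig byte,
/// or returns None when the object type cannot be coded directly in 5 bits.
fn first_config_byte(profile: u8, freq_index: u8) -> Option<u8> {
    if profile >= 31 || freq_index > 0x0F {
        return None;
    }
    Some((profile << 3) | (freq_index >> 1))
}

/// Shows what an unchecked shift does with profile = 32.
/// checked_shl only fails when the shift amount is >= 8 for a u8,
/// so this returns Some(0): the top bit is dropped rather than reported.
fn shifted_out_of_range() -> Option<u8> {
    32u8.checked_shl(3)
}
```

Returning `None` (or an error in the real code path) makes malformed input visible instead of producing a plausible-looking but wrong description.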
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rs/moq-mux/src/import/fmp4.rs` around lines 688 - 721, build_aac_audio_specific_config
silently drops the high bits of profile when profile >= 32, because the left
shifts truncate rather than panic. Mask the profile to 5 bits before any shifts
(e.g., compute let prof5 = profile & 0x1F) and use prof5 in the byte
construction (replace uses in b0 = (profile << 3) and in bits |= (profile as
u64) << 35), or validate the input and return early, so that an out-of-range
profile is handled deliberately.


📥 Commits

Reviewing files that changed from the base of the PR and between a2763f2 and 792c349.

📒 Files selected for processing (1)
  • rs/moq-mux/src/import/fmp4.rs

@kixelated (Collaborator)

We'll have to test this and figure out why Chrome works anyway
