fix: Fmp4::init_audio() doesn't populate description for AAC, causing downstream format mismatch#1024
fix: Fmp4::init_audio() doesn't populate description for AAC, causing downstream format mismatch#1024wanjohiryan wants to merge 1 commit intomoq-dev:mainfrom
Fmp4::init_audio() doesn't populate description for AAC, causing downstream format mismatch#1024Conversation
… downstream format mismatch
WalkthroughThe change modifies AAC audio handling in the FMP4 import module. It introduces a helper function that reconstructs the AudioSpecificConfig for MP4A audio by extracting profile, sample rate, and channel count, then building the appropriate codec data according to ISO 14496-3 §1.6.2.1 specifications. The AudioConfig now uses this computed codec description instead of direct inline expressions. The helper function generates either a 2-byte or 5-byte AudioSpecificConfig encoding based on the sample rate frequency index. 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
rs/moq-mux/src/import/fmp4.rs (1)
688-721: Bit manipulation and frequency index table are correct per ISO 14496-3.The implementation correctly handles both the 2-byte (standard sample rates) and 5-byte (extended) forms.
One minor defensive consideration: if
profile >= 32, line 708 would panic in debug builds due to shift overflow (profile << 3on au8). While standard AAC profiles (1=Main, 2=LC, 3=SSR, 4=LTP, etc.) are well within range, malformed input could theoretically trigger this.🛡️ Optional: Add defensive bounds check
fn build_aac_audio_specific_config(profile: u8, sample_rate: u32, channels: u32) -> Bytes { + // audioObjectType is 5 bits; values >= 32 would require escape coding per ISO 14496-3 + debug_assert!(profile < 32, "profile exceeds 5-bit audioObjectType range"); + let freq_index: u8 = match sample_rate {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rs/moq-mux/src/import/fmp4.rs` around lines 688 - 721, The code can panic if profile >= 32 due to left shifts in build_aac_audio_specific_config; clamp or mask the profile to 5 bits before any shifts (e.g., compute let prof5 = profile & 0x1F) and use prof5 in the byte construction (replace uses in b0 = (profile << 3) and in bits |= (profile as u64) << 35) so shift operations cannot overflow; alternatively validate and early-return or sanitize input, but ensure all shifts use the masked 5-bit value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@rs/moq-mux/src/import/fmp4.rs`:
- Around line 688-721: The code can panic if profile >= 32 due to left shifts in
build_aac_audio_specific_config; clamp or mask the profile to 5 bits before any
shifts (e.g., compute let prof5 = profile & 0x1F) and use prof5 in the byte
construction (replace uses in b0 = (profile << 3) and in bits |= (profile as
u64) << 35) so shift operations cannot overflow; alternatively validate and
early-return or sanitize input, but ensure all shifts use the masked 5-bit
value.
|
We'll have to test this and figure out why Chrome works anyway |
In
rs/moq-mux/src/import/fmp4.rs,init_audio()sets description: None for AAC tracks (marked with// TODO?):The problem:
Fmp4::extract()writes raw AAC frames (no ADTS headers) into the track — that's correct for fMP4, where frames are always raw. But without description, consumers have noAudioSpecificConfigto initialize their decoder. They're forced to assume ADTS framing (which includes the config inline), but the actual payload is headerless raw AAC. The decoder either fails or produces silence/garbage.The fix: Build the AudioSpecificConfig from the already-parsed ESDS fields. This is a 2-byte blob for standard sample rates (e.g. 0x12 0x10 for AAC-LC / 44100 Hz / stereo):
Note: The same
// TODO?exists for Opus. Opus doesn't strictly need it (the essential config is in the codec string + sample rate + channels), but for completeness it could carry the OpusHead bytes.Comparison with video:
init_videoalready does this correctly — it callsavcc.encode_body()to populate description for H.264. Audio just needs the equivalent treatment.