[upstream test] ASoC: SOF: Fix IPC reliability and post-resume SoundWire init #5671
Open
ujfalusi wants to merge 2 commits into thesofproject:topic/sof-dev from
The SOF IPC4 platform send_msg functions (hda_dsp_ipc4_send_msg, mtl_ipc_send_msg, cnl_ipc4_send_msg) previously stored the message in delayed_ipc_tx_msg and returned 0 when the TX register was busy. The deferred message was supposed to be dispatched from the IRQ handler when the DSP acknowledged the previous message.

This mechanism silently drops messages during D0i3 power transitions because the IRQ handler never fires while the DSP is in a low-power state. The caller then hangs in wait_event_timeout() for up to 500ms per IPC chunk, causing multi-second audio stalls under CPU load.

Fix this by making the platform send_msg functions return -EBUSY immediately when the TX register is busy (safe since they execute under spin_lock_irq in sof_ipc_send_msg), and adding a bounded retry loop with usleep_range() in ipc4_tx_msg_unlocked() which only holds the tx_mutex (a sleepable context). The retry loop attempts up to 50 iterations with 100-200us delays, bounding the maximum busy-wait to approximately 10ms instead of the previous 500ms timeout.

Also remove the now-dead delayed_ipc_tx_msg field from sof_intel_hda_dev, the dispatch code, and the ack_received tracking variable from all three IRQ thread handlers (hda_dsp_ipc4_irq_thread, mtl_ipc_irq_thread, cnl_ipc4_irq_thread).

Signed-off-by: Cole Leavitt <cole@unwrap.rs>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
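The bounded retry described in this commit message can be sketched in plain userspace C. This is only an illustration of the control flow, not the code from the patch: `fake_dsp`, `fake_send_msg()`, `fake_delay()` and `tx_msg_with_retry()` are hypothetical names, and the delay is a no-op standing in for `usleep_range(100, 200)`. The retry count of 50 is taken from the commit message, which gives a worst case of roughly 50 x 200us = 10ms.

```c
#include <assert.h>
#include <errno.h>

#define TX_MAX_RETRIES 50 /* iteration count quoted in the commit message */

/* Hypothetical stand-in for the DSP TX register state. */
struct fake_dsp {
	int busy_polls_left; /* send attempts that will still see "busy" */
	int sent;            /* messages actually written to the register */
};

/* Models a platform send_msg: it never defers, it reports -EBUSY. */
static int fake_send_msg(struct fake_dsp *dsp)
{
	if (dsp->busy_polls_left > 0) {
		dsp->busy_polls_left--;
		return -EBUSY;
	}
	dsp->sent++;
	return 0;
}

/* Placeholder for usleep_range(100, 200) in a sleepable kernel context. */
static void fake_delay(void)
{
}

/*
 * Retry the send while the register is busy, at most max_retries times.
 * Returns 0 on success or -EBUSY if every attempt saw a busy register.
 * *attempts receives the number of send attempts that were made.
 */
static int tx_msg_with_retry(struct fake_dsp *dsp, int max_retries,
			     int *attempts)
{
	int ret = -EBUSY;
	int i;

	assert(max_retries > 0);

	for (i = 0; i < max_retries; i++) {
		ret = fake_send_msg(dsp);
		if (ret != -EBUSY)
			break;
		fake_delay();
	}

	*attempts = (i < max_retries) ? i + 1 : max_retries;
	return ret;
}
```

The point of the sketch is that the caller learns about a busy register right away and decides how long to keep trying, instead of parking the message and relying on an interrupt that may never arrive.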
After suspend/resume (D3->D0), the SOF firmware is reloaded fresh and pipelines are recreated lazily when userspace opens a PCM. However, SoundWire slave re-enumeration runs asynchronously via a 100ms delayed work item (SDW_INTEL_DELAYED_ENUMERATION_MS). If userspace attempts to play audio before SoundWire slaves finish re-enumerating, the firmware returns error 9 (resource not found) when creating ALH copier modules, leaving the DSP in an unrecoverable wedged state requiring reboot.

Add a new optional dai_link_hw_ready callback to struct snd_sof_dsp_ops that allows platform-specific code to wait for DAI link hardware to become ready before pipeline setup. The generic ipc4-topology.c calls this callback (when set) in sof_ipc4_prepare_copier_module() before configuring DAI copiers, maintaining SOF's platform abstraction.

The Intel HDA implementation (hda_sdw_dai_hw_ready) waits for all attached SoundWire slaves to complete initialization using wait_for_completion_interruptible_timeout() with a 2-second timeout. This is safe for multiple waiters since the SoundWire subsystem uses complete_all() for initialization_complete. Unattached slaves (declared in ACPI but not physically present) are skipped to avoid false timeouts.

The function returns -ETIMEDOUT on timeout (instead of warn-and-continue) to prevent the DSP from entering a wedged state. On non-resume paths the completions are already done, so the wait returns immediately.

Link: thesofproject/sof#8662
Link: thesofproject/sof#9308
Signed-off-by: Cole Leavitt <cole@unwrap.rs>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
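The shape of the optional callback can be illustrated with a small userspace model. Everything here is an assumption made for the sketch: `fake_ops`, `fake_slave`, `fake_sdw_ctx`, `fake_sdw_dai_hw_ready()` and `prepare_dai_copier()` are hypothetical names, and a boolean `init_done` flag stands in for waiting on the real completion with a timeout. Only the callback name `dai_link_hw_ready`, the skip-unattached rule and the -ETIMEDOUT return come from the commit message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one SoundWire peripheral. */
struct fake_slave {
	bool attached;  /* false: declared in ACPI but not physically present */
	bool init_done; /* stands in for a finished initialization completion */
};

/* Hypothetical platform context handed to the callback. */
struct fake_sdw_ctx {
	struct fake_slave *slaves;
	size_t num_slaves;
};

/* Hypothetical ops table; the callback is optional and may be NULL. */
struct fake_ops {
	int (*dai_link_hw_ready)(void *ctx);
};

/*
 * Platform side: skip unattached peripherals, and fail with -ETIMEDOUT
 * if an attached one has not finished initializing. A real version
 * would block on the completion with a timeout instead of reading a flag.
 */
static int fake_sdw_dai_hw_ready(void *ctx)
{
	struct fake_sdw_ctx *sdw = ctx;
	size_t i;

	for (i = 0; i < sdw->num_slaves; i++) {
		if (!sdw->slaves[i].attached)
			continue;
		if (!sdw->slaves[i].init_done)
			return -ETIMEDOUT;
	}
	return 0;
}

/* Generic side: call the hook only when the platform provides one. */
static int prepare_dai_copier(const struct fake_ops *ops, void *ctx)
{
	int ret;

	if (ops->dai_link_hw_ready) {
		ret = ops->dai_link_hw_ready(ctx);
		if (ret)
			return ret; /* do not configure the copier */
	}
	/* DAI copier configuration would follow here. */
	return 0;
}
```

Keeping the hook optional means platforms without SoundWire leave the pointer NULL and the generic path is unchanged, while a failure propagates to the caller before any copier is created in firmware.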
Test for https://lore.kernel.org/linux-sound/20260214064054.19961-1-cole@unwrap.rs/T/#t series
Cover letter:
Two fixes for SOF IPC4 reliability issues observed on Lenovo ThinkPad
P16 Gen 3 (Arrow Lake-S, CS42L43 + CS35L56 over SoundWire):
1. Replace the broken delayed_ipc_tx_msg mechanism with a bounded retry
   loop. The old deferred dispatch silently drops messages during D0i3
   transitions, causing 500ms+ hangs per IPC chunk.

2. Add a platform ops callback (dai_link_hw_ready) so Intel HDA
   platforms can wait for SoundWire slave initialization before ALH
   copier setup. Without this, the DSP enters an unrecoverable wedged
   state when userspace opens a PCM before slaves finish re-enumerating
   after resume.
Tested on ThinkPad P16 Gen 3 with repeated suspend/resume cycles
and concurrent audio playback.
Cole Leavitt (2):
  ASoC: SOF: Replace IPC TX busy deferral with bounded retry
  ASoC: SOF: Add platform ops callback for DAI link hardware readiness

 sound/soc/sof/intel/cnl.c            | 17 ++---------
 sound/soc/sof/intel/hda-common-ops.c |  1 +
 sound/soc/sof/intel/hda-ipc.c        | 17 ++---------
 sound/soc/sof/intel/hda.c            | 44 ++++++++++++++++++++++++++++
 sound/soc/sof/intel/hda.h            | 14 ++++-----
 sound/soc/sof/intel/mtl.c            | 17 ++---------
 sound/soc/sof/ipc4-topology.c        |  8 +++++
 sound/soc/sof/ipc4.c                 | 17 +++++++++--
 sound/soc/sof/sof-priv.h             |  3 ++
 9 files changed, 83 insertions(+), 55 deletions(-)
base-commit: 2687c84
2.52.0