sched: disable preemption around blk_flush_plug in sched_submit_work #765
blktests-ci[bot] wants to merge 1 commit into linus-master_base
On preemptible kernels, a three-way deadlock can occur involving
blk_mq_freeze_queue and blk_mq_dispatch_list:

- Task A holds a filesystem lock (e.g., f2fs io_rwsem) and enters
  __bio_queue_enter(), waiting for mq_freeze_depth == 0

- Task B holds mq_freeze_depth=1 (elevator_change) and waits for
  q_usage_counter to reach zero in blk_mq_freeze_queue_wait()

- Task C is going to sleep waiting for the filesystem lock. Before
  sleeping, schedule() calls sched_submit_work() -> blk_flush_plug() ->
  blk_mq_dispatch_list(), which acquires q_usage_counter via
  percpu_ref_get(). If Task C gets preempted before percpu_ref_put(),
  it will not be scheduled back because the task is already in
  uninterruptible sleep state (TASK_UNINTERRUPTIBLE). This means it
  holds the percpu_ref indefinitely, preventing freeze from completing.

This is fundamentally an ABBA deadlock between queue freeze and the
filesystem lock, exposed by preemption creating an artificial hold on
q_usage_counter during the plug flush.

Fix by disabling preemption around blk_flush_plug() in
sched_submit_work(). The _notrace variants are used since this runs in
scheduler context. preempt_enable_no_resched_notrace() is correct
because we are already inside __schedule() and about to pick the next
task.

Fixes: 73c1010 ("block: initial patch for on-stack per-task plugging")
Reported-by: Michael Wu <michael@allwinnertech.com>
Tested-by: Michael Wu <michael@allwinnertech.com>
Link: https://lore.kernel.org/linux-block/20260417082744.30124-1-michael@allwinnertech.com/
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
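
The dependency chain in the commit message can be checked with a small userspace model. This is only an illustrative sketch, not kernel code: the task indices, build_waits() and has_wait_cycle() are hypothetical names introduced here, and the model assumes each task waits on at most one other task. Task A waits on Task B (mq_freeze_depth), Task B waits on Task C only while C holds the q_usage_counter reference, and Task C waits on Task A (the filesystem lock).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical indices for the three tasks named in the commit message. */
enum { TASK_A, TASK_B, TASK_C, NR_TASKS };

/*
 * Fill in the wait-for edges. waits_on[t] is the task that must make
 * progress before t can continue, or -1 if t is not blocked on a task.
 * c_holds_ref models Task C being preempted between percpu_ref_get()
 * and percpu_ref_put() while already in TASK_UNINTERRUPTIBLE.
 */
static void build_waits(int waits_on[NR_TASKS], bool c_holds_ref)
{
	waits_on[TASK_A] = TASK_B;                    /* needs mq_freeze_depth == 0 */
	waits_on[TASK_B] = c_holds_ref ? TASK_C : -1; /* needs q_usage_counter == 0 */
	waits_on[TASK_C] = TASK_A;                    /* needs the filesystem lock */
}

/* Return true if following the wait-for edges leads back to the start. */
static bool has_wait_cycle(const int waits_on[NR_TASKS])
{
	for (int start = 0; start < NR_TASKS; start++) {
		int cur = start;

		/* A cycle through 'start' closes within NR_TASKS hops. */
		for (int hops = 0; hops < NR_TASKS; hops++) {
			cur = waits_on[cur];
			if (cur < 0)
				break;
			assert(cur < NR_TASKS);
			if (cur == start)
				return true;
		}
	}
	return false;
}
```

In this model, build_waits(w, true) produces the cycle A -> B -> C -> A, while build_waits(w, false) does not: once Task C cannot be preempted while holding the reference, Task B's freeze can complete and the remaining waits resolve in order.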
Upstream branch: 66edb90
Pull request for series with
subject: sched: disable preemption around blk_flush_plug in sched_submit_work
version: 1
url: https://patchwork.kernel.org/project/linux-block/list/?series=1084708