CHROMIUM: usb: dwc3: rockchip: avoid removing hcd while system is frozen
author William wu <wulf@rock-chips.com>
Tue, 8 Nov 2016 07:15:34 +0000 (15:15 +0800)
committer Huang, Tao <huangtao@rock-chips.com>
Sun, 13 Nov 2016 09:21:14 +0000 (17:21 +0800)
commit ba1c0c58ac1567a6918b77244a86aa959c58eb69
tree e5ae9bb28034c35b56ba05c06c5464f7e233c7f9
parent 6554e715f53aae3f0ef75e2170be9035412b86c5

As described in commit 85fbd722ad0f ("libata, freezer: avoid
block device removal while system is frozen"), when the system
enters suspend it may freeze kthreads and workqueues, and it
does not thaw them until PM resume of all devices has completed.

If we remove the XHCI hcd while the system is frozen, it may call
usb_disconnect() to remove a USB block device which was plugged
in before suspend but has since gone missing. Unfortunately, removing
the block device can race with the rest of device resume. Since
freezable kthreads and workqueues are thawed only after all
device resumes have completed, and block device removal depends
on freezable workqueues and kthreads (e.g. bdi_wq) to make
progress, this can lead to deadlock: block device removal
can't proceed because kthreads and workqueues are frozen, and
they can't be thawed because device resume is blocked behind
block device removal.
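
The idea in the subject line (do not remove the hcd while the system
is frozen) can be pictured with a small user-space simulation. This is
only a sketch under stated assumptions, not the driver code: struct
otg_sim, sim_remove_hcd(), sim_extcon_evt_work() and sim_thaw() are
hypothetical names invented for illustration, and the real change in
dwc3-rockchip.c may detect the frozen state and defer the work
differently.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from the patch itself. */
struct otg_sim {
	bool system_frozen;   /* true from the suspend freeze until thaw */
	bool host_registered; /* true while the simulated hcd exists */
	bool removal_pending; /* set when a removal had to be deferred */
	int removals;         /* number of simulated hcd removals */
};

/* Simulates the host teardown that would end in block device removal. */
static void sim_remove_hcd(struct otg_sim *s)
{
	s->host_registered = false;
	s->removals++;
}

/* Simulated extcon work: the cable is gone, so the host should go away. */
static void sim_extcon_evt_work(struct otg_sim *s)
{
	if (!s->host_registered)
		return;

	if (s->system_frozen) {
		/*
		 * Removing now would wait on frozen workqueues while
		 * device resume waits on us, so only record the request.
		 */
		s->removal_pending = true;
		return;
	}

	sim_remove_hcd(s);
	s->removal_pending = false;
}

/* Simulated thaw: freezable work can run again, so replay the request. */
static void sim_thaw(struct otg_sim *s)
{
	s->system_frozen = false;
	if (s->removal_pending)
		sim_extcon_evt_work(s);
}
```

In this model, calling sim_extcon_evt_work() while system_frozen is set
leaves the host registered and only marks the removal as pending; the
teardown runs once sim_thaw() is called, after the point where the
frozen workqueues would be usable again.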

This patch must be used and tested together with commit bc68c26eff86
("CHROMIUM: usb: dwc3: rockchip: fix NULL pointer dereference
when resume"). The issue can be easily reproduced with a USB-C
hub and a USB2/3 flash drive, resulting in the following backtrace.

[  360.201135] INFO: task kworker/u12:3:122 blocked for more than 120 seconds.
[  360.208094]       Not tainted 4.4.21 #185
[  360.212102] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.219923] kworker/u12:3   D ffffffc000204fd8     0   122      2 0x00000000
[  360.227007] Workqueue: events_unbound async_run_entry_fn
[  360.232326] Call trace:
[  360.234776] [<ffffffc000204fd8>] __switch_to+0x9c/0xa8
[  360.239918] [<ffffffc000915bf4>] __schedule+0x440/0x6d8
[  360.245139] [<ffffffc000915f20>] schedule+0x94/0xb4
[  360.250016] [<ffffffc00091909c>] schedule_timeout+0x44/0x27c
[  360.255670] [<ffffffc000916b78>] wait_for_common+0xf8/0x198
[  360.261237] [<ffffffc000916c40>] wait_for_completion+0x28/0x34
[  360.267067] [<ffffffc0005f3f4c>] dpm_wait+0x40/0x4c
[  360.271942] [<ffffffc0005f4770>] device_resume+0x60/0x1a4
[  360.277337] [<ffffffc0005f48e4>] async_resume+0x30/0x60
[  360.282558] [<ffffffc000242fc4>] async_run_entry_fn+0x50/0x104
[  360.288387] [<ffffffc0002397f0>] process_one_work+0x240/0x424
[  360.294128] [<ffffffc00023a28c>] worker_thread+0x2fc/0x424
[  360.299608] [<ffffffc00023f5fc>] kthread+0x10c/0x114
[  360.304570] [<ffffffc000203dd0>] ret_from_fork+0x10/0x40
[  360.309876]   task                        PC stack   pid father
[  360.315789] init            D ffffffc000204fd8     0     1      0 0x00400009
...
[  360.564124] [<ffffffc000204fd8>] __switch_to+0x9c/0xa8
[  360.569259] [<ffffffc000915bf4>] __schedule+0x440/0x6d8
[  360.574481] [<ffffffc000915f20>] schedule+0x94/0xb4
[  360.579355] [<ffffffc00091909c>] schedule_timeout+0x44/0x27c
[  360.585010] [<ffffffc000916b78>] wait_for_common+0xf8/0x198
[  360.590580] [<ffffffc000916c40>] wait_for_completion+0x28/0x34
[  360.596408] [<ffffffc000239270>] flush_work+0x168/0x1a4
[  360.601629] [<ffffffc0002395a4>] flush_delayed_work+0x44/0x50
[  360.607371] [<ffffffc000322f48>] bdi_unregister+0xa8/0xfc
[  360.612766] [<ffffffc00049afdc>] blk_cleanup_queue+0xf4/0x10c
[  360.618508] [<ffffffc000625d7c>] __scsi_remove_device+0x80/0xc8
[  360.624423] [<ffffffc000623dec>] scsi_forget_host+0x5c/0x74
[  360.629991] [<ffffffc000619a98>] scsi_remove_host+0x90/0x110
[  360.635646] [<ffffffc000692940>] usb_stor_disconnect+0x78/0xec
[  360.641474] [<ffffffc0006545e4>] usb_unbind_interface+0xa0/0x1f8
[  360.647477] [<ffffffc0005e70cc>] __device_release_driver+0xb4/0x114
[  360.653746] [<ffffffc0005e7158>] device_release_driver+0x2c/0x40
[  360.659748] [<ffffffc0005e61f8>] bus_remove_device+0x110/0x128
[  360.665575] [<ffffffc0005e3178>] device_del+0x164/0x1f4
[  360.670797] [<ffffffc000652094>] usb_disable_device+0x94/0x1c8
[  360.676625] [<ffffffc000649b74>] usb_disconnect+0x9c/0x1d0
[  360.682106] [<ffffffc000649b60>] usb_disconnect+0x88/0x1d0
[  360.687587] [<ffffffc00064e0e4>] usb_remove_hcd+0xc8/0x1e0
[  360.693068] [<ffffffc000664b4c>] dwc3_rockchip_otg_extcon_evt_work+0x14c/0x198
[  360.700284] [<ffffffc0002397f0>] process_one_work+0x240/0x424
[  360.706026] [<ffffffc00023a28c>] worker_thread+0x2fc/0x424
[  360.711506] [<ffffffc00023f5fc>] kthread+0x10c/0x114
[  360.716467] [<ffffffc000203dd0>] ret_from_fork+0x10/0x40

BUG=chrome-os-partner:58705, chrome-os-partner:59103
TEST=Plug in a USB-C hub and a USB2/3 flash drive, then put the
system into S3. After the system has suspended, unplug the
USB-C hub first, then press a keyboard key or the power key and
check that the system wakes up successfully.

Change-Id: I6cb8ea1a4399b9b69b522ec0ed5f0f7810118850
Signed-off-by: William wu <wulf@rock-chips.com>
Reviewed-on: https://chromium-review.googlesource.com/408499
Commit-Ready: Guenter Roeck <groeck@chromium.org>
Tested-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: William Wu <wulf@rock-chips.com>
drivers/usb/dwc3/dwc3-rockchip.c