USB: EHCI: support running URB giveback in tasklet context
author Ming Lei <ming.lei@canonical.com>
Wed, 3 Jul 2013 14:53:11 +0000 (22:53 +0800)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mon, 12 Aug 2013 18:43:49 +0000 (11:43 -0700)
All 4 transfer types work well on the EHCI HCD after switching to
running URB giveback in tasklet context, so mark all EHCI HCD drivers
as supporting it.

Also we don't need to release ehci->lock during URB giveback any more.
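
As a rough userspace illustration of why the lock no longer has to be
dropped, the sketch below models giveback as "queue now, complete later".
All names here (toy_urb, toy_hcd, toy_giveback, toy_giveback_bh) are
hypothetical and are not kernel APIs; lock_held is just a flag standing in
for ehci->lock. The point it tries to show is that the completion callback
never runs while the lock is held, so it can safely re-enter the HCD.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a URB: a list link, a callback, a status. */
struct toy_urb {
	struct toy_urb *next;
	void (*complete)(struct toy_urb *urb);
	int status;
};

/* Hypothetical stand-in for the HCD: a lock flag and a pending list. */
struct toy_hcd {
	int lock_held;            /* models ehci->lock being held */
	struct toy_urb *bh_head;  /* URBs waiting for deferred giveback */
	struct toy_urb **bh_tail;
};

static int toy_completed;     /* counts callbacks that have run */

static void toy_count_complete(struct toy_urb *urb)
{
	(void)urb;
	toy_completed++;
}

static void toy_hcd_init(struct toy_hcd *hcd)
{
	hcd->lock_held = 0;
	hcd->bh_head = NULL;
	hcd->bh_tail = &hcd->bh_head;
}

/* "Interrupt" side: called with the lock held, only queues the URB. */
static void toy_giveback(struct toy_hcd *hcd, struct toy_urb *urb, int status)
{
	assert(hcd->lock_held);
	urb->status = status;
	urb->next = NULL;
	*hcd->bh_tail = urb;
	hcd->bh_tail = &urb->next;
}

/* "Tasklet" side: runs later without the lock and calls the callbacks. */
static int toy_giveback_bh(struct toy_hcd *hcd)
{
	int n = 0;

	assert(!hcd->lock_held);
	struct toy_urb *urb = hcd->bh_head;

	/* Detach the whole list first so callbacks may queue new work. */
	hcd->bh_head = NULL;
	hcd->bh_tail = &hcd->bh_head;

	while (urb) {
		struct toy_urb *next = urb->next;

		urb->complete(urb);
		n++;
		urb = next;
	}
	return n;
}
```

In this model the interrupt path never has to unlock around the callback,
which mirrors the removal of the spin_unlock/spin_lock pair in
ehci_urb_done().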

From the test results below on 3 machines (2 ARM and one x86), the time
consumed by the EHCI interrupt handler dropped significantly without
performance loss.

1 test description
1.1 mass storage performance test:
- run the command below 10 times and compute the average performance

    dd if=/dev/sdN iflag=direct of=/dev/null bs=200M count=1

- two USB mass storage devices:
A: SanDisk Extreme USB 3.0 16G (used in test case 1 & case 2)
B: Kingston DataTraveler G2 4GB (only used in test case 2)

1.2 uvc function test:
- run the simple capture program at the link below

   http://kernel.ubuntu.com/~ming/up/capture.c

- capture format 640*480 and results in High Bandwidth mode on the
uvc device: Z-Star 0x0ac8/0x3450

- on T410(x86) laptop, also use guvcview to watch video capture/playback

1.3 about test2 and test4
- the two devices involved are tested concurrently using the test items above

1.4 how to compute irq time(the time consumed by ehci_irq)
- use the irq:irq_handler_entry and irq:irq_handler_exit trace points
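
The avg/max figures reported below can be derived from paired entry/exit
timestamps. The following is only a minimal sketch of that arithmetic, not
the tooling actually used for these measurements: struct irq_sample,
struct irq_stats and irq_time_stats() are hypothetical names, and the
timestamps are assumed to have already been extracted from the trace and
converted to microseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One ehci_irq invocation; timestamps assumed to be in microseconds. */
struct irq_sample {
	uint64_t entry_us;  /* time of the irq:irq_handler_entry event */
	uint64_t exit_us;   /* time of the matching irq:irq_handler_exit */
};

struct irq_stats {
	uint64_t avg_us;    /* integer average of handler durations */
	uint64_t max_us;    /* longest single handler duration */
};

/* Compute average and maximum handler time over n paired samples. */
static struct irq_stats irq_time_stats(const struct irq_sample *s, size_t n)
{
	struct irq_stats st = { 0, 0 };
	uint64_t total = 0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		/* An exit must not precede its entry. */
		assert(s[i].exit_us >= s[i].entry_us);
		uint64_t d = s[i].exit_us - s[i].entry_us;

		total += d;
		if (d > st.max_us)
			st.max_us = d;
	}
	st.avg_us = total / n;
	return st;
}
```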

1.5 kernel
3.10.0-rc3-next-20130528

1.6 test machines
Pandaboard A1: ARM CortexA9 dual core
Arndale board: ARM CortexA15 dual core
T410: i5 CPU 2.67GHz quad core

2 test result
2.1 test case1: single mass storage device performance test
---------------------------------------------------------------------
               | upstream                 | patched
               | perf(MB/s)+irq time(us)  | perf(MB/s)+irq time(us)
---------------------------------------------------------------------
Pandaboard A1: | 25.280(avg:145, max:772) | 25.540(avg:14, max:75)
Arndale board: | 29.700(avg:33, max:129)  | 29.700(avg:10, max:50)
T410:          | 34.430(avg:17, max:154*) | 34.660(avg:12, max:155)
---------------------------------------------------------------------

2.2 test case2: two mass storage devices' performance test
------------------------------------------------------------------------------------
               | upstream                         | patched
               | perf(MB/s)+irq time(us)          | perf(MB/s)+irq time(us)
------------------------------------------------------------------------------------
Pandaboard A1: | 15.840/15.580(avg:158, max:1216) | 16.500/16.160(avg:15, max:139)
Arndale board: | 17.370/16.220(avg:33, max:234)   | 17.480/16.200(avg:11, max:91)
T410:          | 21.180/19.820(avg:18, max:160)   | 21.220/19.880(avg:11, max:149)
------------------------------------------------------------------------------------

2.3 test case3: one uvc streaming test
- the uvc device works well (on x86, luvcview can be used too and gives
the same result as uvc capture)
---------------------------------------------------------------------
               | upstream            | patched
               | irq time(us)        | irq time(us)
---------------------------------------------------------------------
Pandaboard A1: | (avg:445, max:873)  | (avg:33, max:44)
Arndale board: | (avg:316, max:630)  | (avg:20, max:27)
T410:          | (avg:39, max:107)   | (avg:10, max:65)
---------------------------------------------------------------------

2.4 test case4: one uvc streaming plus one mass storage device test
---------------------------------------------------------------------
               | upstream                  | patched
               | perf(MB/s)+irq time(us)   | perf(MB/s)+irq time(us)
---------------------------------------------------------------------
Pandaboard A1: | 20.340(avg:259, max:1704) | 20.390(avg:24, max:101)
Arndale board: | 23.460(avg:124, max:726)  | 23.370(avg:15, max:52)
T410:          | 28.520(avg:27, max:169)   | 28.630(avg:13, max:160)
---------------------------------------------------------------------

2.5 test case5: read single mass storage device with small transfer
- run the command below 10 times and compute the average speed

 dd if=/dev/sdN iflag=direct of=/dev/null bs=4K count=4000

1), test device A:
---------------------------------------------------------------------
               | upstream                 | patched
               | perf(MB/s)+irq time(us)  | perf(MB/s)+irq time(us)
---------------------------------------------------------------------
Pandaboard A1: | 6.5(avg:21, max:64)      | 6.5(avg:10, max:24)
Arndale board: | 8.13(avg:12, max:23)     | 8.06(avg:7, max:17)
T410:          | 6.66(avg:13, max:131)    | 6.84(avg:11, max:149)
---------------------------------------------------------------------

2), test device B:
---------------------------------------------------------------------
               | upstream                 | patched
               | perf(MB/s)+irq time(us)  | perf(MB/s)+irq time(us)
---------------------------------------------------------------------
Pandaboard A1: | 5.5(avg:21, max:43)      | 5.49(avg:10, max:24)
Arndale board: | 5.9(avg:12, max:22)      | 5.9(avg:7, max:17)
T410:          | 5.48(avg:13, max:155)    | 5.48(avg:7, max:140)
---------------------------------------------------------------------

* On the T410, reading the EHCI status register in ehci_irq sometimes
takes more than 100us; the problem has been reported at the link below:

http://marc.info/?t=137065867300001&r=1&w=2

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
14 files changed:
drivers/usb/host/ehci-fsl.c
drivers/usb/host/ehci-grlib.c
drivers/usb/host/ehci-hcd.c
drivers/usb/host/ehci-mv.c
drivers/usb/host/ehci-octeon.c
drivers/usb/host/ehci-pmcmsp.c
drivers/usb/host/ehci-ppc-of.c
drivers/usb/host/ehci-ps3.c
drivers/usb/host/ehci-q.c
drivers/usb/host/ehci-sead3.c
drivers/usb/host/ehci-sh.c
drivers/usb/host/ehci-tilegx.c
drivers/usb/host/ehci-w90x900.c
drivers/usb/host/ehci-xilinx-of.c

index 45eee6e9c6c858630f8d8a3a75a2d01a869917f2..e44f442e2fb73b8ce6ce1a5bae0a884d9d540791 100644 (file)
@@ -669,7 +669,7 @@ static const struct hc_driver ehci_fsl_hc_driver = {
         * generic hardware linkage
         */
        .irq = ehci_irq,
-       .flags = HCD_USB2 | HCD_MEMORY,
+       .flags = HCD_USB2 | HCD_MEMORY | HCD_BH,
 
        /*
         * basic lifecycle operations
index 83ab51af250f158e735373760f8c008ce6fe0ad8..b52a66ce92e8592b123239aa24b724dddfd085fa 100644 (file)
@@ -43,7 +43,7 @@ static const struct hc_driver ehci_grlib_hc_driver = {
         * generic hardware linkage
         */
        .irq                    = ehci_irq,
-       .flags                  = HCD_MEMORY | HCD_USB2,
+       .flags                  = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
        /*
         * basic lifecycle operations
index 7ad9ef8aedee27d98087f864f78bb5147325e663..73c72997cd4f0c363ae386ecaf65b1609c99cc06 100644 (file)
@@ -1166,7 +1166,7 @@ static const struct hc_driver ehci_hc_driver = {
         * generic hardware linkage
         */
        .irq =                  ehci_irq,
-       .flags =                HCD_MEMORY | HCD_USB2,
+       .flags =                HCD_MEMORY | HCD_USB2 | HCD_BH,
 
        /*
         * basic lifecycle operations
index 35cdbd88bbbef62a93a3aa869c12dbfda8183bf2..417c10da945078e37ddf20e7be290e996936074e 100644 (file)
@@ -96,7 +96,7 @@ static const struct hc_driver mv_ehci_hc_driver = {
         * generic hardware linkage
         */
        .irq = ehci_irq,
-       .flags = HCD_MEMORY | HCD_USB2,
+       .flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
        /*
         * basic lifecycle operations
index 45cc00158412ac8a380cda88a28a4bb7536d62fa..ab0397e4d8f3eadae916d07434f4431f3d59def3 100644 (file)
@@ -51,7 +51,7 @@ static const struct hc_driver ehci_octeon_hc_driver = {
         * generic hardware linkage
         */
        .irq                    = ehci_irq,
-       .flags                  = HCD_MEMORY | HCD_USB2,
+       .flags                  = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
        /*
         * basic lifecycle operations
index 601e208bd782c07e9d0bb1b60d238ccbb7774758..893b707f0000abf0e323f39b28b6060a3293bcef 100644 (file)
@@ -286,7 +286,7 @@ static const struct hc_driver ehci_msp_hc_driver = {
 #else
        .irq =                  ehci_irq,
 #endif
-       .flags =                HCD_MEMORY | HCD_USB2,
+       .flags =                HCD_MEMORY | HCD_USB2 | HCD_BH,
 
        /*
         * basic lifecycle operations
index 932293fa32de657de2e36ba091127e50009a6ff7..6cc5567bf9c87faaa8c4a21dcb4202e8e6e7fb94 100644 (file)
@@ -28,7 +28,7 @@ static const struct hc_driver ehci_ppc_of_hc_driver = {
         * generic hardware linkage
         */
        .irq                    = ehci_irq,
-       .flags                  = HCD_MEMORY | HCD_USB2,
+       .flags                  = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
        /*
         * basic lifecycle operations
index fd983771b02559cb56c6210e6813d9b223a80f7a..8188542ba17ea01214a3ab0f269fe07cb6cb1744 100644 (file)
@@ -71,7 +71,7 @@ static const struct hc_driver ps3_ehci_hc_driver = {
        .product_desc           = "PS3 EHCI Host Controller",
        .hcd_priv_size          = sizeof(struct ehci_hcd),
        .irq                    = ehci_irq,
-       .flags                  = HCD_MEMORY | HCD_USB2,
+       .flags                  = HCD_MEMORY | HCD_USB2 | HCD_BH,
        .reset                  = ps3_ehci_hc_reset,
        .start                  = ehci_run,
        .stop                   = ehci_stop,
index d34b399b78e2c30665792857a34615b6c9a0c227..b637a65e1e5268978dfcdd0bc9bff8bc468276ac 100644 (file)
@@ -254,8 +254,6 @@ static int qtd_copy_status (
 
 static void
 ehci_urb_done(struct ehci_hcd *ehci, struct urb *urb, int status)
-__releases(ehci->lock)
-__acquires(ehci->lock)
 {
        if (usb_pipetype(urb->pipe) == PIPE_INTERRUPT) {
                /* ... update hc-wide periodic stats */
@@ -281,11 +279,8 @@ __acquires(ehci->lock)
                urb->actual_length, urb->transfer_buffer_length);
 #endif
 
-       /* complete() can reenter this HCD */
        usb_hcd_unlink_urb_from_ep(ehci_to_hcd(ehci), urb);
-       spin_unlock (&ehci->lock);
        usb_hcd_giveback_urb(ehci_to_hcd(ehci), urb, status);
-       spin_lock (&ehci->lock);
 }
 
 static int qh_schedule (struct ehci_hcd *ehci, struct ehci_qh *qh);
index b2de52d3961488f249aeb9d4026efe2d603b72bb..8a734498079bc176938c57a6b132c146e9be00dd 100644 (file)
@@ -55,7 +55,7 @@ const struct hc_driver ehci_sead3_hc_driver = {
         * generic hardware linkage
         */
        .irq                    = ehci_irq,
-       .flags                  = HCD_MEMORY | HCD_USB2,
+       .flags                  = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
        /*
         * basic lifecycle operations
index 93e59a13bc1fec919ef81690bbce9f4bdcd3e951..dc899eb2b86183561351d78e8dba1ceffc9cbb18 100644 (file)
@@ -36,7 +36,7 @@ static const struct hc_driver ehci_sh_hc_driver = {
         * generic hardware linkage
         */
        .irq                            = ehci_irq,
-       .flags                          = HCD_USB2 | HCD_MEMORY,
+       .flags                          = HCD_USB2 | HCD_MEMORY | HCD_BH,
 
        /*
         * basic lifecycle operations
index cca4be90a864dba009c852f606653fb5fe60d568..67026ffbf9a871c9780a4b0b7f0d6c739c904e73 100644 (file)
@@ -61,7 +61,7 @@ static const struct hc_driver ehci_tilegx_hc_driver = {
         * Generic hardware linkage.
         */
        .irq                    = ehci_irq,
-       .flags                  = HCD_MEMORY | HCD_USB2,
+       .flags                  = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
        /*
         * Basic lifecycle operations.
index 59e0e24c753febfb76369be8f872cb3731c2f365..1c370dfbee0d35e6a3cf2b1288836df2116599c5 100644 (file)
@@ -108,7 +108,7 @@ static const struct hc_driver ehci_w90x900_hc_driver = {
         * generic hardware linkage
         */
        .irq = ehci_irq,
-       .flags = HCD_USB2|HCD_MEMORY,
+       .flags = HCD_USB2|HCD_MEMORY|HCD_BH,
 
        /*
         * basic lifecycle operations
index eba962e6ebfbbd8ffdd46a85e0fed5ec9b18b78a..95979f9f4381d8e8e573c7e0fe254d5585d23ab8 100644 (file)
@@ -79,7 +79,7 @@ static const struct hc_driver ehci_xilinx_of_hc_driver = {
         * generic hardware linkage
         */
        .irq                    = ehci_irq,
-       .flags                  = HCD_MEMORY | HCD_USB2,
+       .flags                  = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
        /*
         * basic lifecycle operations