[SCSI] libiscsi: fix cmd timeout/completion race
author Mike Christie <michaelc@cs.wisc.edu>
Fri, 27 Jan 2012 03:13:11 +0000 (21:13 -0600)
committer James Bottomley <JBottomley@Parallels.com>
Sun, 19 Feb 2012 14:09:00 +0000 (08:09 -0600)
If the driver/lib has called scsi_done and cleaned up internally, but the
scsi layer has not yet called blk_mark_rq_complete when the command times
out, we hit a problem if the timeout code calls blk_mark_rq_complete first.
When the timeout code then called into the driver, we were returning
BLK_EH_RESET_TIMER, which just causes the timeout code to call us again
later.

We need to return BLK_EH_HANDLED so the timeout code can finish the
completion process: it has already called blk_mark_rq_complete on the
command and now owns its processing.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
drivers/scsi/libiscsi.c

index 8582d7c257325a899e165e234783aa80711331ad..82c3fd4bc938511e5242fbbef85cbec077b3c119 100644
@@ -1909,6 +1909,16 @@ static enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
        ISCSI_DBG_EH(session, "scsi cmd %p timedout\n", sc);
 
        spin_lock(&session->lock);
+       task = (struct iscsi_task *)sc->SCp.ptr;
+       if (!task) {
+               /*
+                * Raced with completion. Blk layer has taken ownership
+                * so let timeout code complete it now.
+                */
+               rc = BLK_EH_HANDLED;
+               goto done;
+       }
+
        if (session->state != ISCSI_STATE_LOGGED_IN) {
                /*
                 * We are probably in the middle of iscsi recovery so let
@@ -1925,16 +1935,6 @@ static enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc)
                goto done;
        }
 
-       task = (struct iscsi_task *)sc->SCp.ptr;
-       if (!task) {
-               /*
-                * Raced with completion. Just reset timer, and let it
-                * complete normally
-                */
-               rc = BLK_EH_RESET_TIMER;
-               goto done;
-       }
-
        /*
         * If we have sent (at least queued to the network layer) a pdu or
         * recvd one for the task since the last timeout ask for