zfcp: close window with unblocked rport during rport gone
authorSteffen Maier <maier@linux.vnet.ibm.com>
Wed, 10 Aug 2016 16:30:46 +0000 (18:30 +0200)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Fri, 28 Oct 2016 07:01:29 +0000 (03:01 -0400)
commita01fae0faf1dd6819c67873a73ee1635df163742
treef420c79ab8f7cca8e065386c703f4c3bfd9d17fd
parentc9f34228ac085b7ecb680e7b5d4f4d807feb3b48
zfcp: close window with unblocked rport during rport gone

commit 4eeaa4f3f1d6c47b69f70e222297a4df4743363e upstream.

On a successful end of reopen port forced,
zfcp_erp_strategy_followup_success() re-uses the port erp_action
and the subsequent zfcp_erp_action_cleanup() now
sees ZFCP_ERP_SUCCEEDED with
erp_action->action==ZFCP_ERP_ACTION_REOPEN_PORT
instead of ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
but must not perform zfcp_scsi_schedule_rport_register().

We can detect this because the fresh port reopen erp_action
is in its very first step ZFCP_ERP_STEP_UNINITIALIZED.

Otherwise this opens a time window with unblocked rport
(until the followup port reopen recovery would block it again).
If a scsi_cmnd timeout occurs during this time window
fc_timed_out() cannot work as desired and such command
would indeed time out and trigger scsi_eh. This prevents
a clean and timely path failover.
This should not happen if the path issue can be recovered
on FC transport layer such as path issues involving RSCNs.

Also, unnecessary and repeated DID_IMM_RETRY for pending and
undesired new requests occur because internally zfcp still
has its zfcp_port blocked.

As follow-on errors with scsi_eh, it can cause,
in the worst case, permanently lost paths due to one of:
sd <scsidev>: [<scsidisk>] Medium access timeout failure. Offlining disk!
sd <scsidev>: Device offlined - not ready after error recovery

For fix validation and to aid future debugging with other recoveries
we now also trace (un)blocking of rports.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 5767620c383a ("[SCSI] zfcp: Do not unblock rport from REOPEN_PORT_FORCED")
Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors")
Fixes: 5f852be9e11d ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
Fixes: 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
Fixes: 3859f6a248cb ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/s390/scsi/zfcp_dbf.h
drivers/s390/scsi/zfcp_erp.c
drivers/s390/scsi/zfcp_scsi.c