You may have seen this: you unmount or detach a LUN from an ESXi host, and some time later a few hosts in your environment show as Inaccessible or Not Responding. If you check the VMkernel logs on an affected host, you find APD (all-paths-down) or PDL (permanent device loss) entries.
This usually means the proper procedure was not followed when the LUN was detached. Skipping steps during a LUN/datastore unmount or detach can put the host into an APD/PDL state.
In this post I summarize the best practice for unmounting a LUN from ESXi 5.x or 6.0.
Before doing anything, ensure that:
- The host has no registered virtual machines or templates residing on this datastore, and all CD/DVD images located on the VMFS datastore are unregistered from any virtual machines.
- The datastore is not used for vSphere HA heartbeat.
- The datastore is not part of a datastore cluster (managed by Storage DRS).
- The datastore is not configured as a diagnostic coredump partition.
- Storage I/O Control is disabled for the datastore.
- If the LUN is being used as an RDM, remove the RDM from the virtual machine. Click Edit Settings, highlight the RDM hard disk, and click Remove. Select Delete from disk and click OK.
Note: This destroys the mapping file but not the LUN content.
- The LUN/datastore is not used as the persistent scratch location for the host.
Note: When using the vSphere Web Client with vSphere 5.1, 5.5, and 6.0, only the following checks are required during datastore unmount:
- The host should not have any virtual machines residing on this datastore.
- The host should not use the datastore for HA heartbeats.
Obtaining the NAA ID of the LUN to be removed
From the vSphere Client, this information is visible in the
Properties window of the datastore.
From the ESXi host, run this command:
# esxcli storage vmfs extent list
You see output similar to:
Volume Name  VMFS UUID                            Extent Number  Device Name                           Partition
-----------  -----------------------------------  -------------  ------------------------------------  ---------
datastore1   4de4cb24-4cff750f-85f5-0019b9f1ecf6              0  naa.6001c230d8abfe000ff76c198ddbc13e          3
Storage2     4c5fbff6-f4069088-af4f-0019b9f1ecf4              0  naa.6001c230d8abfe000ff76c2e7384fc9a          1
Storage4     4c5fc023-ea0d4203-8517-0019b9f1ecf4              0  naa.6001c230d8abfe000ff76c51486715db          1
LUN01        4e414917-a8d75514-6bae-0019b9f1ecf4              0  naa.60a98000572d54724a34655733506751          1
Make a note of the NAA ID of the datastore; you will need it later in this procedure.
Note: Alternatively, you can run the esxcli storage filesystem list command, which lists all file systems recognized by the ESXi host. To find the unique identifier of the LUN housing the datastore to be removed, run this command:
# esxcfg-scsidevs -m
This command generates a list of VMFS datastore volumes and their related unique identifiers. Make a note of the unique identifier (NAA_ID) for the datastore you want to unmount, as it is used later on.
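If you prefer not to pick the NAA ID out of the table by eye, a small filter can do it. This is only a sketch: the function name naa_for_datastore is made up for this post, and it assumes the volume name contains no spaces, so that the device name is the fourth whitespace-separated column of the esxcli storage vmfs extent list output.

```shell
# Print the device name (NAA ID) backing a VMFS datastore.
# Reads `esxcli storage vmfs extent list` output on stdin.
# Assumption: the volume name has no spaces, so the columns are
# Volume Name, VMFS UUID, Extent Number, Device Name, Partition.
naa_for_datastore() {
    awk -v name="$1" '$1 == name { print $4 }'
}

# On an ESXi host this would be used as:
#   esxcli storage vmfs extent list | naa_for_datastore LUN01
```

A datastore that spans several extents would print one line per extent, so check that you get the number of devices you expect.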
Unmounting and detaching a LUN using the vSphere Client
To detach a storage device using the vSphere Client, first unmount the datastore and then detach the LUN. The process is as follows:
1. If the LUN is an RDM, skip to step 2. Otherwise, in the Configuration tab of the ESXi host, click Storage. Right-click the datastore being removed and click Unmount.
A Confirm Datastore Unmount window appears. When the prerequisite criteria have been passed, click OK.
Note: To unmount a datastore from multiple hosts in the vSphere Client, click Hosts and Clusters > Datastores and Datastore Clusters view (Ctrl+Shift+D). Perform the unmount task and select the hosts that should no longer access the datastore.
2. Click the Devices view (under Configuration > Storage).
3. Right-click the NAA ID of the LUN (as noted above) and click Detach. A Confirm Device Unmount window is displayed. When the prerequisite criteria are passed, click OK. Under the Operational State of the device, the LUN is listed as Unmounted.
Note: The Detach function must be performed on a per-host basis and does not propagate to other hosts in vCenter Server. If a LUN is presented to an initiator group or storage group on the SAN, the Detach function must be performed on every host in that initiator group before unmapping the LUN from the group on the SAN. Failing to follow this step results in an all-paths-down (APD) state for those hosts in the storage group on which Detach was not performed for the LUN being unmapped.
4. Confirm that the LUN is successfully detached. The LUN can then be safely unpresented from the SAN.
5. Perform a rescan on all ESXi hosts that had visibility to the LUN. The device is automatically removed from the Storage Adapters.
When the device is detached, it stays in an unmounted state even if the device is re-presented (that is, the detached state is persistent). To bring the device back online, the device must be attached.
If you want to permanently decommission the device from an ESXi host, manually remove the NAA entries from the host configuration:
- To list the permanently detached devices, run this command:
# esxcli storage core device detached list
You see output similar to:
Device UID                            State
------------------------------------  -----
naa.50060160c46036df50060160c46036df  off
naa.6006016094602800c8e3e1c5d3c8e011  off
- To permanently remove the device configuration information from the system, run this command:
# esxcli storage core device detached remove -d NAA_ID
For example:
# esxcli storage core device detached remove -d naa.50060160c46036df50060160c46036df
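When several devices are listed, typing each remove command by hand gets tedious. The sketch below turns the detached list output into the matching remove commands and only prints them, so you can review the list before running anything. The function name detached_remove_commands is invented for this post, and it assumes every device ID starts with naa. as in the sample output; other identifier prefixes are ignored.

```shell
# Turn `esxcli storage core device detached list` output (stdin) into the
# matching remove commands. Nothing is executed; the commands are only
# printed so they can be reviewed first.
# Assumption: device IDs start with "naa." as in the sample output.
detached_remove_commands() {
    awk '$1 ~ /^naa\./ && $2 == "off" {
        print "esxcli storage core device detached remove -d " $1
    }'
}

# On an ESXi host this would be used as:
#   esxcli storage core device detached list | detached_remove_commands
```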
That's it for the vSphere Client method.
Unmounting a LUN using the command line
To unmount a LUN from an ESXi 5.x/6.0 host using the command line:
- As before, obtain the NAA ID of the LUN to be removed.
- Unmount the datastore by running this command:
# esxcli storage filesystem unmount [-u UUID | -l label | -p path]
For example, use one of these commands to unmount the LUN01 datastore:
# esxcli storage filesystem unmount -l LUN01
# esxcli storage filesystem unmount -u 4e414917-a8d75514-6bae-0019b9f1ecf4
# esxcli storage filesystem unmount -p /vmfs/volumes/4e414917-a8d75514-6bae-0019b9f1ecf4
Note: If the VMFS filesystem you are attempting to unmount has active I/O or has not fulfilled the prerequisites to unmount the VMFS datastore, you see an error in the VMkernel logs similar to:
WARNING: VC: 637: unmounting opened volume ('4e414917-a8d75514-6bae-0019b9f1ecf4' 'LUN01') is not allowed.
VC: 802: Unmount VMFS volume f530 28 2 4e414917a8d7551419006bae f4ecf19b 4 1 0 0 0 0 0 : Busy
- To verify that the datastore is unmounted, run this command:
# esxcli storage filesystem list
You see output similar to:
Mount Point                                        Volume Name  UUID                                 Mounted  Type                          Size          Free
-------------------------------------------------  -----------  -----------------------------------  -------  --------------------  ------------  ------------
/vmfs/volumes/4de4cb24-4cff750f-85f5-0019b9f1ecf6  datastore1   4de4cb24-4cff750f-85f5-0019b9f1ecf6  true     VMFS-5                140660178944   94577360896
/vmfs/volumes/4c5fbff6-f4069088-af4f-0019b9f1ecf4  Storage2     4c5fbff6-f4069088-af4f-0019b9f1ecf4  true     VMFS-3                146028888064    7968129024
/vmfs/volumes/4c5fc023-ea0d4203-8517-0019b9f1ecf4  Storage4     4c5fc023-ea0d4203-8517-0019b9f1ecf4  true     VMFS-3                146028888064  121057050624
                                                   LUN01        4e414917-a8d75514-6bae-0019b9f1ecf4  false    VMFS-unknown version             0             0
For the unmounted datastore, the Mounted field is set to false, the Type field is set to VMFS-unknown version, and no Mount Point exists.
Note: The unmounted state of the VMFS datastore persists across reboots. This is the default behavior. If you need to unmount a datastore temporarily, you can do so by appending the --no-persist flag to the unmount command.
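The Mounted check can also be scripted. One catch is that an unmounted volume has no Mount Point, so its columns shift left and a fixed column number does not work. The sketch below (mounted_state is a name made up for this post) looks for the UUID as a field of its own and prints whatever follows it, which is the Mounted value in the esxcli storage filesystem list layout shown above.

```shell
# Print the Mounted column ("true" or "false") for a VMFS UUID.
# Reads `esxcli storage filesystem list` output on stdin.
# An unmounted volume has no Mount Point, so the columns shift left; the
# UUID field is located first and the field right after it is printed.
# The Mount Point column never matches because it carries the
# /vmfs/volumes/ prefix.
mounted_state() {
    awk -v uuid="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == uuid) { print $(i + 1); exit }
    }'
}

# On an ESXi host this would be used as:
#   esxcli storage filesystem list | mounted_state 4e414917-a8d75514-6bae-0019b9f1ecf4
```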
- To detach the device/LUN, run this command:
# esxcli storage core device set --state=off -d NAA_ID
- To verify that the device is offline, run this command:
# esxcli storage core device list -d NAA_ID
You see output similar to the following, which shows that the Status of the disk is off:
naa.60a98000572d54724a34655733506751
Display Name: NETAPP Fibre Channel Disk (naa.60a98000572d54724a34655733506751)
Has Settable Display Name: true
Size: 1048593
Device Type: Direct-Access
Multipath Plugin: NMP
Devfs Path: /vmfs/devices/disks/naa.60a98000572d54724a34655733506751
Vendor: NETAPP
Model: LUN
Revision: 7330
SCSI Level: 4
Is Pseudo: false
Status: off
Is RDM Capable: true
Is Local: false
Is Removable: false
Is SSD: false
Is Offline: false
Is Perennially Reserved: false
Thin Provisioning Status: yes
Attached Filters:
VAAI Status: unknown
Other UIDs: vml.020000000060a98000572d54724a346557335067514c554e202020
This device is now successfully detached from the host. It remains visible in the UI at this point.
If the device is to be permanently decommissioned, it is now possible to unpresent the LUN from the SAN.
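If you want to check the Status line from a script before unpresenting the LUN, note that the output also contains "Thin Provisioning Status:" and "VAAI Status:", so a plain search for "Status:" matches three lines. A minimal sketch, with device_status as an invented helper name, anchors the match to the start of the line:

```shell
# Print the Status field from `esxcli storage core device list -d NAA_ID`
# output (stdin). The pattern is anchored to the start of the line
# (allowing leading whitespace) so that "Thin Provisioning Status:" and
# "VAAI Status:" do not match.
device_status() {
    awk '/^[[:space:]]*Status:/ { print $2 }'
}

# On an ESXi host this would be used as:
#   esxcli storage core device list -d NAA_ID | device_status
```

The expected result for a detached device is off.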
- To rescan all devices on the ESXi host, run this command:
# esxcli storage core adapter rescan [ -A vmhba# | --all ]
The devices are automatically removed from the Storage Adapters.
Notes:
- A rescan must be run on all hosts that had visibility to the removed LUN.
- When the device is detached, it stays in an unmounted state even if the device is re-presented (that is, the detached state is persistent). To bring the device back online, the device must be attached. To do this via the command line, run this command:
# esxcli storage core device set --state=on -d NAA_ID
- If the device is to be permanently decommissioned from an ESXi host (that is, the LUN has been or will be destroyed), remove the NAA entries from the host configuration by running these commands:
  - To list the permanently detached devices:
  # esxcli storage core device detached list
  You see output similar to:
  Device UID                            State
  ------------------------------------  -----
  naa.50060160c46036df50060160c46036df  off
  naa.6006016094602800c8e3e1c5d3c8e011  off
  - To permanently remove the device configuration information from the system:
  # esxcli storage core device detached remove -d NAA_ID
  For example:
  # esxcli storage core device detached remove -d naa.50060160c46036df50060160c46036df
  The reference to the device configuration is permanently removed from the ESXi host's configuration.
Note: If the device is detached but still presented (that is, the LUN was not unpresented from the SAN), the preceding command fails to permanently remove the device from the system, and the device is automatically re-attached. You must unpresent the LUN from the SAN for the device to be permanently removed.
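To keep the order of operations straight, here is the command-line procedure condensed into a dry-run helper. It is only a sketch: print_detach_plan is a name invented for this post, it takes the datastore label and NAA ID as arguments, and it prints the commands instead of running them. Every command it prints is one already shown in this post.

```shell
# Print, in order, the per-host commands for unmounting and detaching a LUN.
# Nothing is executed. Arguments: datastore label, NAA ID.
print_detach_plan() {
    label=$1
    naa=$2
    echo "esxcli storage filesystem unmount -l $label"
    echo "esxcli storage core device set --state=off -d $naa"
    echo "# repeat the two commands above on every host, then unpresent the LUN on the SAN"
    echo "esxcli storage core adapter rescan --all"
    echo "esxcli storage core device detached remove -d $naa"
}

# Example:
#   print_detach_plan LUN01 naa.60a98000572d54724a34655733506751
```

The comment line in the middle of the output marks the manual SAN step; the rescan and the final remove only make sense after the LUN has been unpresented.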
Reference: VMware KB# 2004605, 2004684
That's it... :)