The Fibre Channel NVMe cookbook: QED from a storage whizz's POV
Let's nerd out with Greg Scherer again
Interview OK, you think Fibre Channel-based NVMe fabric access is a good idea. How do you implement it? Where is the FC-NVMe cookbook?
Consulting technologist Greg Scherer told us about NVMe over Fibre Channel and Ethernet in what has become the first of two interviews. He said FC-NVMe was a practical proposition, and we asked him how it could be implemented, starting from an existing Fibre Channel SAN infrastructure point of view.
El Reg SANs have controllers, which operate a storage IO and management stack, and present LUNs to accessing servers. What happens with FC-NVMe, which treats a storage array like remote direct-attached storage (remote DAS)?
Greg Scherer NVMe has no separate concepts of controller and LUN; the two are merged into an NVMe namespace. Where services live is undefined, just as it is in SCSI: the disk drives themselves (SAS, for example) typically don't provide services (RAID, replication, etc.); rather, an array controller sitting above the disks provides them. Services could also be distributed in host software, but that model is not widely used.
El Reg So what will storage arrays do?
Greg Scherer Array controllers will layer services behind each NVMe namespace (software and hardware); there are a few startups attempting to create a distributed services software model, but the more typical "cloud-like" approach will be to treat NVMe-oF as DAS (Direct Attach Storage) and use a global file system that handles replication/data protection at the file-system layer (HDFS, MongoDB, Cassandra, NFS, etc.).
For the array case, for all practical purposes arrays will advertise multiple namespaces, typically one per exported storage area (read this as controller + LUN). This will be virtualised storage, so each exported storage area will be RAIDed, replicated or erasure-coded.
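As a rough sketch of that mapping, an array's existing controller + LUN exports can each be advertised as one namespace ID. The model below is purely illustrative; the class and method names are hypothetical and do not correspond to any vendor's actual API.

```python
# Illustrative model only: how an array might map its existing
# (controller, LUN) exports onto NVMe namespace IDs.
# All names here are hypothetical, not a real vendor API.

class ArrayTarget:
    def __init__(self):
        self._next_nsid = 1      # NVMe namespace IDs start at 1
        self._namespaces = {}    # nsid -> (controller, lun)

    def export_lun(self, controller: str, lun: int) -> int:
        """Advertise one virtualised storage area as one namespace."""
        nsid = self._next_nsid
        self._next_nsid += 1
        self._namespaces[nsid] = (controller, lun)
        return nsid

    def resolve(self, nsid: int):
        """Look up which controller/LUN backs a given namespace."""
        return self._namespaces[nsid]

array = ArrayTarget()
ns1 = array.export_lun("ctrl-A", 0)   # e.g. a RAIDed, replicated volume
ns2 = array.export_lun("ctrl-A", 1)
print(ns1, ns2)                       # -> 1 2
print(array.resolve(ns1))             # -> ('ctrl-A', 0)
```

The point of the sketch is the one-to-one correspondence Scherer describes: each virtualised storage area behind the array becomes exactly one advertised namespace.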
El Reg How will a Fibre Channel SAN user adopt FC-NVMe?
Greg Scherer It should be a relatively minor change:
On the FC infrastructure-side:
- It does require a new FC-NVMe driver per OS; this differs from NVMe-oF, which depends on "built-in" OS/hypervisor support, including the RNIC (RDMA NIC) driver
- Both FC HBA providers (Emulex and QLogic) support FC-NVMe with their recent adapters, though the drivers are still in beta form
- As yet, NVMe-oF host software doesn't support multi-pathing (MPIO); this means that any FC-NVMe drivers that leverage the NVMe-oF software stack don't support failover in the way FC/SCSI and iSCSI stacks do
MPIO support is being worked on in Linux now, and the belief is that it will arrive later this year.
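Conceptually, what MPIO adds is the ability to retry I/O over an alternate path to the same namespace when one path dies. The sketch below is a hypothetical illustration of that failover behaviour, not real driver code; the path names are invented.

```python
# Hypothetical sketch of MPIO-style failover: given several paths to
# the same namespace, I/O moves to the next path when one fails.
# Path names are invented; this is not real multipath driver code.

class PathDown(Exception):
    pass

def issue_io(paths, op):
    """Try each path in order; fail over to the next on a dead path."""
    for path in paths:
        try:
            return op(path)
        except PathDown:
            continue  # this path failed: try the next one
    raise RuntimeError("all paths down")

def read_block(path):
    if path == "hba0:port1":          # simulate a failed fabric path
        raise PathDown(path)
    return f"data via {path}"

paths = ["hba0:port1", "hba1:port2"]  # dual-fabric paths, illustrative
print(issue_io(paths, read_block))    # -> data via hba1:port2
```

Without this layer, a path failure surfaces directly as an I/O error to the host, which is why its absence matters for anyone used to FC/SCSI or iSCSI failover.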
El Reg What does it need from HBA and switch suppliers?
Greg Scherer FC HBA providers have the choice of providing an FC-NVMe driver that leverages the NVMe-oF stack (preferred, due to more parallel processing enhancements and deeper queues) or leveraging the SCSI stack for "legacy" OSes. The latter is virtually transparent to the host and applications, but doesn't take advantage of the performance enhancements in the NVMe-oF stack.
On the FC switch side, minor modifications need to occur to support FC-NVMe, mostly for the name server. Brocade has already released FC-NVMe support with its newest switch, and even supports multiple enhancements aimed at FC-NVMe, including latency measurements.
El Reg And what needs to happen with storage arrays?
Greg Scherer On the array side, all recent FC controllers from Emulex and QLogic support FC-NVMe target mode, so array vendors don't need to change out their FC hardware, but they do need to support a new FC-NVMe driver and integrate it into their storage software ecosystem. Many arrays already have active support plans under way.
Essentially the effort is to export a virtualised LUN in the form of an NVMe namespace, and register this name with the FC switch name server so it can be discovered by FC-NVMe initiators.
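That registration and discovery flow can be modelled very simply: the array registers each exported namespace against its target port with the switch name server, and initiators query the name server to find targets. This is a toy model of the flow; the WWN and NQN values are illustrative only.

```python
# Toy model of FC-NVMe discovery: the array registers each exported
# namespace with the switch name server; initiators query it to find
# targets. The WWN and NQN values below are illustrative only.

class NameServer:
    def __init__(self):
        self._registry = {}   # target port WWN -> list of namespace names

    def register(self, target_wwn: str, ns_name: str):
        """Called by the array target when it exports a namespace."""
        self._registry.setdefault(target_wwn, []).append(ns_name)

    def discover(self):
        """An initiator's view: all registered FC-NVMe targets."""
        return dict(self._registry)

ns_server = NameServer()
ns_server.register("50:06:01:60:aa:bb:cc:dd", "nqn.2017-05.com.example:vol0")
ns_server.register("50:06:01:60:aa:bb:cc:dd", "nqn.2017-05.com.example:vol1")
print(ns_server.discover())
```

The appeal of the FC route is visible even in this sketch: the name server role already exists in every FC fabric, so FC-NVMe discovery is an extension of it rather than a new service.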
El Reg What about the network server side?
Greg Scherer The host dependency is FC-NVMe drivers, supplied by Emulex or QLogic; the rest of the infrastructure relies on tried and true FC capabilities.
The implication is that, upstream of the FC-NVMe drivers, the existing FC capabilities remain the same, so applications which use them need not change. They already use an FC infrastructure layer between themselves and the destination storage array's drives, and they carry on doing so after FC-NVMe is in place.
The destination storage array will provide data services by using "a global file system that handles replication/data protection at the file-system layer (HDFS, MongoDB, Cassandra, NFS, etc.)". Comment from storage vendors will be most welcome. ®