Boot from NVMe-over-TCP
I have previously written about NVMe over TCP, including comparisons against other protocols. While the ecosystem is still growing, another significant hurdle has been crossed. Many enterprise customers boot their servers from storage arrays via Fibre Channel or iSCSI. SNIA hosted a presentation last March talking about the finalization of the NVMe specification for booting over the protocol; the talk said both Dell and HPE were working on integration into their servers' UEFI BIOS.
If you're acquainted with configuring the operating system side of iSCSI storage, NVMe over TCP will feel like a familiar friend. The terminology is different, but the principle is the same. Similarly, the boot spec used the iSCSI method as its baseline, essentially substituting NVMe over TCP for iSCSI: LUNs became namespaces, targets became subsystems, iBFT became nBFT (more on that later), and IQNs became NQNs.
Boot-from-SAN operates similarly across the board: you tell the host BIOS where it needs to go to find its boot volume across some fabric transport mechanism. For Fibre Channel, zoning must first be in place, then you tell the host which WWN and LUN ID holds its boot volume. With iSCSI, you need to bring up the Ethernet interface with an IP address, assign it an IQN, then tell it which IP and target IQN to connect to, and finally the correct LUN ID.
The BIOS configuration (PowerEdge 16G shown here) is straightforward. First, enable NVMe-oF. In this section, you can set the host's NVMe Qualified Name (NQN), which is analogous to iSCSI's IQN. This needs to match the NQN you have configured for the host on the storage system.
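Since a mistyped host NQN is an easy way to end up with a server that cannot see its namespace, a quick format check before entering the value in the BIOS can help. The following is a minimal sketch, assuming the two NQN forms I understand the NVMe base specification to define (a UUID-based form under `nqn.2014-08.org.nvmexpress` and a date-plus-reverse-domain form) and a 223-byte length limit. The function name `looks_like_nqn` and the example NQNs are made up for illustration, and the check is looser than the spec itself.

```python
import re

MAX_NQN_BYTES = 223  # assumed upper bound on NQN length from the NVMe base spec

# Form 1: nqn.2014-08.org.nvmexpress:uuid:<8-4-4-4-12 hex UUID>
_UUID_NQN = re.compile(
    r"^nqn\.2014-08\.org\.nvmexpress:uuid:"
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)
_UUID_PREFIX = "nqn.2014-08.org.nvmexpress:uuid:"

# Form 2: nqn.yyyy-mm.<reverse domain>:<user-defined string>
_DOMAIN_NQN = re.compile(r"^nqn\.\d{4}-(0[1-9]|1[0-2])\.[A-Za-z0-9.-]+:.+$")


def looks_like_nqn(nqn: str) -> bool:
    """Return True if the string has the rough shape of an NVMe Qualified Name."""
    if len(nqn.encode("utf-8")) > MAX_NQN_BYTES:
        return False
    if nqn.startswith(_UUID_PREFIX):
        # UUID-style names must carry a well-formed UUID after the prefix.
        return _UUID_NQN.match(nqn) is not None
    return _DOMAIN_NQN.match(nqn) is not None
```

A passing check only says the string is plausibly an NQN. It still has to be the exact string registered for the host on the array.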
In the next section, configure the IP interface and, at a minimum, the subsystem address, setting the subsystem port to 8009. This will let the server query the array's discovery controller and connect to all target ports in that network. To explicitly connect to a single target port instead, set the subsystem address and subsystem NQN, and use 4420 for the TCP port.
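The two BIOS choices above (discovery on port 8009 versus a direct connection on port 4420) have direct equivalents in nvme-cli on Linux, which is handy for confirming from another host that the array answers before you reboot the server. Here is a small sketch that only builds the command lines and does not run them. The helper names, the 192.0.2.10 address, and the NQNs in the usage note are placeholders, not values from any real setup.

```python
from typing import List, Optional

DISCOVERY_PORT = 8009  # discovery controller port used in the BIOS setup above
IO_PORT = 4420         # port used for an explicit connection to one target port


def nvme_discover_cmd(traddr: str, hostnqn: Optional[str] = None) -> List[str]:
    """Build an `nvme discover` command that queries the discovery controller."""
    cmd = [
        "nvme", "discover",
        "--transport=tcp",
        f"--traddr={traddr}",
        f"--trsvcid={DISCOVERY_PORT}",
    ]
    if hostnqn:
        cmd.append(f"--hostnqn={hostnqn}")
    return cmd


def nvme_connect_cmd(traddr: str, subsys_nqn: str,
                     hostnqn: Optional[str] = None) -> List[str]:
    """Build an `nvme connect` command for one explicit subsystem port."""
    cmd = [
        "nvme", "connect",
        "--transport=tcp",
        f"--traddr={traddr}",
        f"--trsvcid={IO_PORT}",
        f"--nqn={subsys_nqn}",
    ]
    if hostnqn:
        cmd.append(f"--hostnqn={hostnqn}")
    return cmd
```

For example, `" ".join(nvme_discover_cmd("192.0.2.10"))` gives a command you can paste into a root shell on a host with nvme-cli installed. Passing the host NQN you plan to set in the BIOS is a reasonable way to check that the array's host mapping is in place.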
After that, save your settings and reboot. You're ready to install an operating system, so tell the system to boot from your operating system ISO. Upon boot, the installer *should* find the namespace you've allocated and install to it.
When the system boots, the BIOS hands the relevant connection information to the operating system via the NVMe Boot Firmware Table (nBFT), similar to iSCSI's iBFT. iBFT and nBFT are ACPI firmware tables that allow pre-OS information to be shared with the operating system when it starts booting. The operating system uses this connection info to continue booting under its own power.
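Because the nBFT is an ordinary ACPI table, it starts with the standard 36-byte ACPI table header, so you can sanity-check that the firmware actually published one. The sketch below parses only that generic header and verifies the checksum; it does not decode the NBFT-specific contents. The path `/sys/firmware/acpi/tables/NBFT` is an assumption about where Linux exposes the table (reading it typically requires root), and the names `AcpiHeader`, `parse_acpi_header`, and `read_nbft_header` are mine.

```python
import struct
from typing import NamedTuple

# Standard ACPI table header: signature, length, revision, checksum, OEM ID,
# OEM table ID, OEM revision, creator ID, creator revision (36 bytes total).
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")


class AcpiHeader(NamedTuple):
    signature: str
    length: int
    revision: int
    oem_id: str
    checksum_ok: bool


def parse_acpi_header(table: bytes) -> AcpiHeader:
    """Parse the generic ACPI header and validate the whole-table checksum."""
    if len(table) < ACPI_HEADER.size:
        raise ValueError("buffer is shorter than an ACPI table header")
    (sig, length, revision, _checksum, oem_id,
     _oem_table_id, _oem_rev, _creator_id, _creator_rev) = ACPI_HEADER.unpack_from(table)
    # All bytes of a valid ACPI table sum to zero modulo 256.
    checksum_ok = (ACPI_HEADER.size <= length <= len(table)
                   and sum(table[:length]) % 256 == 0)
    return AcpiHeader(
        signature=sig.decode("ascii", "replace"),
        length=length,
        revision=revision,
        oem_id=oem_id.decode("ascii", "replace").strip(),
        checksum_ok=checksum_ok,
    )


def read_nbft_header(path: str = "/sys/firmware/acpi/tables/NBFT") -> AcpiHeader:
    """Read the table from sysfs (assumed path) and parse its header."""
    with open(path, "rb") as f:
        return parse_acpi_header(f.read())
```

If `read_nbft_header()` raises `FileNotFoundError` on a host you expected to boot from NVMe over TCP, the firmware probably did not hand over a table, which is worth checking before debugging the OS side.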
As I've discussed in previous articles and above, NVMe over TCP is an evolving ecosystem. Each OEM will have qualified a slightly different set of components, so please check with your vendor for the complete path requirements. They'll all likely support ESXi 7.0u3 and newer as well as a few different Linux distributions, with various multipath options within the latter.