Data is everywhere and must land somewhere. Every bit of data that is created inhabits a piece of IT hardware: a storage array, a cloud service server, a desktop storage device, a deep freeze-like digital archive. No surprise, solid-state drives (SSDs) and all-flash arrays (AFAs) now dominate high-value, time-sensitive storage in data centers. The related but well-kept secret is the innovation below the surface that will profoundly change both public and private data centers.
Here are the updates:
- Topping the innovation list is NVM Express (NVMe), an open standard and specification for accessing non-volatile storage media attached via a PCI Express (PCIe) bus. The acronym NVM stands for non-volatile memory. Think NAND flash memory in SSDs, PCIe add-in cards, M.2 cards and other forms.
- The data storage industry’s NVMe professional working group has been highly prolific, especially lately. The five recent specification updates I have chosen to spotlight in this eWEEK Data Points article are significant; I consider them nothing short of revolutionary.
Data Point No. 1: NVMe Sets in Cloud Service Deployments
NVMe Sets serve to isolate so-called “noisy neighbors” by separating and allocating NAND media so that workloads (or containers) using one NVM Set do not impact workloads on other sets. Noisy neighbors prevent cloud service providers from offering container services on shared hardware with a service-level agreement. NVMe Sets solve this problem.
Picture servers running cloud containers. Now picture one of those containers getting blasted with writes. When this happens, other containers on the same SSD stop responding.
The cloud business impact is obvious. Cloud service providers (and private cloud operators) cannot offer containers on shared hardware with a guaranteed quality of service. This noisy neighbor problem is so pronounced that cloud companies have embarked on projects to entirely rewrite SSD firmware. NVMe Sets solve this problem.
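The isolation idea can be sketched with a toy model. This is only an illustration, not how any SSD firmware or the NVMe specification implements sets: the eight-die layout, the container names and the `contended_fraction` helper are all hypothetical.

```python
# Toy model: which NAND dies can each workload's I/O land on?
# The layouts and names below are hypothetical, for illustration only.

def contended_fraction(victim, noisy, layout):
    """Fraction of the victim's dies that the noisy workload also writes to."""
    victim_dies = set(layout[victim])
    noisy_dies = set(layout[noisy])
    return len(victim_dies & noisy_dies) / len(victim_dies)

# Without sets: both containers are spread over all eight dies.
shared_layout = {"container_a": range(8), "container_b": range(8)}

# With sets: each container is confined to its own four dies.
sets_layout = {"container_a": range(0, 4), "container_b": range(4, 8)}

print(contended_fraction("container_b", "container_a", shared_layout))  # 1.0
print(contended_fraction("container_b", "container_a", sets_layout))    # 0.0
```

In the shared layout, every die that container B reads from can be tied up by container A's writes. With separate sets, none can.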
Data Point No. 2: NVMe Deterministic IO
This eliminates read latency outliers caused by SSD housekeeping. One chunk of time is allocated to deliver predictable read latency (this is deterministic mode). Another chunk of time is allocated for housekeeping, during which read latency is unpredictable (this is non-deterministic mode).
Solving unpredictable latency in one typical cloud database use case:
- Picture a cloud database that spans a dozen servers, and each server has a dozen SSDs. Now picture a database query that hits these dozen servers and SSDs. The cloud database query completes only as fast as the slowest response. If SSDs happen to be otherwise busy with housekeeping … then ouch. And when hundreds of SSDs are involved, the probability of encountering an SSD in housekeeping mode is magnified.
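To put rough numbers on that magnification, here is a minimal sketch. It assumes each SSD is independently in housekeeping mode with the same probability at the moment a query arrives. The 1 percent figure is an assumption chosen for illustration, not a measured value.

```python
def p_any_in_housekeeping(num_ssds: int, p_single: float) -> float:
    """Probability that at least one of num_ssds SSDs is in housekeeping.

    Assumes every SSD is busy independently with probability p_single.
    """
    return 1.0 - (1.0 - p_single) ** num_ssds

# Assumed 1% chance that any one SSD is doing housekeeping at query time.
for n in (1, 12, 144):
    print(n, round(p_any_in_housekeeping(n, 0.01), 3))
```

Under these assumptions, a query touching a dozen SSDs hits at least one busy device about 11 percent of the time. A query touching all 144 SSDs (a dozen servers with a dozen SSDs each) does so about 76 percent of the time.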
When IO determinism is coordinated across a group of SSDs, reads are served by SSDs in deterministic mode while SSDs in non-deterministic mode are temporarily left out of service. This remedies the “unpredictable” latency problem.
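The coordination described above can be sketched as a simple host-side read scheduler. This is a toy model under stated assumptions, not the NVMe command interface: the `Ssd` class, the window lengths and the staggered offsets are all hypothetical, and the data is assumed to be replicated on every SSD in the group.

```python
from dataclasses import dataclass

DET_MS = 80       # assumed length of the deterministic window
NONDET_MS = 20    # assumed length of the housekeeping window
PERIOD_MS = DET_MS + NONDET_MS

@dataclass(frozen=True)
class Ssd:
    name: str
    offset_ms: int  # hypothetical start time of this SSD's cycle

def is_deterministic(ssd: Ssd, now_ms: int) -> bool:
    """True while the SSD is inside its deterministic window."""
    phase = (now_ms - ssd.offset_ms) % PERIOD_MS
    return phase < DET_MS

def pick_read_targets(replicas: list[Ssd], now_ms: int) -> list[Ssd]:
    """Keep only the replicas that can serve a predictable read right now."""
    return [ssd for ssd in replicas if is_deterministic(ssd, now_ms)]

# Offsets are staggered so the housekeeping windows never overlap.
replicas = [Ssd("ssd_a", 0), Ssd("ssd_b", 40), Ssd("ssd_c", 70)]
print([s.name for s in pick_read_targets(replicas, now_ms=85)])  # ['ssd_b', 'ssd_c']
```

Because the housekeeping windows are staggered, at least two of the three replicas are always available for predictable reads in this model.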
Data Point No. 3: NVMe Over Fabrics
This decouples storage from servers without a high latency penalty. The latest spec adds a TCP transport, which makes traffic routable and makes NVMe over Fabrics more viable as a replication target.
Why decouple compute from storage? There is little performance penalty, and TCP routability makes edge storage practical for replication, media caching, IoT and AI/ML.
Data centers are dynamic places with either lots of storage and very little compute or lots of compute and very little storage. Over time, this has led to disaggregated storage.
Now, NVMe over Fabrics allows disaggregated storage with the performance of a local NVMe SSD. TCP routability makes an NVMe over Fabrics storage target acceptable for remote replication. Why, then, should we need to wrap a server around a group of SSDs for replication?
Data Point No. 4: NVMe Management Interface
NVMe Management Interface (MI) has optional PCI commands that allow specific control of device reads/writes; this is significant when used with NVMe over Fabrics in points of presence (POPs).
Data Point No. 5: NVMe Namespace Sharing
NVMe namespace sharing is exactly what it sounds like. Create a namespace, then allow two or more completely independent PCI Express paths between a single host and that namespace. This enables significantly better storage at the edge for media caching, IoT, AI/ML and other functions.
Having one PCI Express path writing to a device and another PCI Express path reading from that same device has serious value when used for endpoint POP media caching, IoT or replication use cases.
————————–
We need to give credit where credit is due: Facebook published the critical information that led to the standardization of NVMe Sets and NVMe deterministic IO. Kazan Networks (now part of WDC) pioneered NVMe over Fabrics with the Onyx chip and the Fuji chip.
Hubbert Smith, a respected storage and memory expert and consultant who’s worked at Samsung, Toshiba and NetApp, is a member of the NVMe working group.