This year HP made some cool announcements again on the 3PAR storage platform in Las Vegas.
The two most important ones were the new 1.92TB SSD and hardware-accelerated deduplication.
The result of these two updates is that HP can deliver the performance of SSD storage for the (street) price of 15k SAS drives: around $2/GB.
Thanks to several features and software updates over the last year, HP has lowered the cost of all-flash storage by 85%. So you get all the Tier 1 data services (including six-nines availability) for the price of a midrange storage box.
New 1.92TB SSD
How does HP get to this 1.92TB size? This is not a typical size we would expect, right?
Well, SSDs are known for their poor ratio of usable to raw capacity. An SSD that has a usable capacity of 400GB typically has 512GB of raw capacity. That is 112GB of (overprovisioned) capacity that is not available to users, because the SSD reserves all that extra capacity for things like garbage collection, limiting write amplification and so on.
Garbage collection? Well for more in-depth information on that I propose to check out the following page on Wikipedia.
Now HP did two things. First, they worked together with the SSD vendors to lower the overprovisioned capacity on SSDs, resulting in 480GB and 920GB SSDs instead of the typical 400GB and 800GB. That's 15% to 20% more capacity for customers to use. Important feature!
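To make the numbers above concrete, here is a small back-of-the-envelope sketch in Python. The function names are made up for this illustration; the capacities are the ones quoted in the text.

```python
def reserved_share(raw_gb: float, usable_gb: float) -> float:
    """Fraction of the raw NAND capacity that is held back as overprovisioning."""
    return (raw_gb - usable_gb) / raw_gb


def capacity_gain(old_usable_gb: float, new_usable_gb: float) -> float:
    """Relative increase in usable capacity when less capacity is held back."""
    return (new_usable_gb - old_usable_gb) / old_usable_gb


# A typical 400GB SSD built on 512GB of raw capacity
print(f"reserved on a 400GB drive: {reserved_share(512, 400):.1%}")  # 21.9%

# The larger usable sizes HP gets from the same class of drive
print(f"400GB -> 480GB: {capacity_gain(400, 480):.1%}")  # 20.0%
print(f"800GB -> 920GB: {capacity_gain(800, 920):.1%}")  # 15.0%
```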
But! With less reserved capacity, how can the SSD still do garbage collection so that its endurance stays acceptable?
Well, that is why HP developed the second thing: Adaptive Sparing.
Remember that 3PAR arrays use 1GB chunklets on all drives in the array. Most of them are data chunklets, but a certain percentage are spare chunklets. Adaptive Sparing releases these spare chunklets to the SSD so that the drive can use them as over-provisioned capacity to do its geeky stuff like garbage collection… Innovative, right? Customers get more usable capacity with the same endurance, which lowers the price per GB.
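A toy model may help to picture the idea. This is only a sketch of the principle, not how the 3PAR OS implements it: the class name, the 512GB raw size paired with a 480GB drive, and the number of spare chunklets are all assumptions made for the example. Only the 1GB chunklet size comes from the text.

```python
from dataclasses import dataclass

CHUNKLET_GB = 1  # 3PAR chunklet size mentioned above


@dataclass
class SsdSketch:
    raw_gb: int           # physical NAND on the drive (assumed value)
    exported_gb: int      # capacity the drive exposes to the array
    spare_chunklets: int  # chunklets the array keeps as spares (assumed value)

    def factory_op_gb(self) -> int:
        """Overprovisioning built into the drive itself."""
        return self.raw_gb - self.exported_gb

    def effective_op_gb(self, adaptive_sparing: bool) -> int:
        """Space the drive can use for garbage collection and similar work."""
        lent = self.spare_chunklets * CHUNKLET_GB if adaptive_sparing else 0
        return self.factory_op_gb() + lent


drive = SsdSketch(raw_gb=512, exported_gb=480, spare_chunklets=48)
print(drive.effective_op_gb(adaptive_sparing=False))  # 32
print(drive.effective_op_gb(adaptive_sparing=True))   # 80
```

In this made-up example the drive has only 32GB of its own reserve, but as long as the spare chunklets are not needed for a rebuild it can work with 80GB.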
Because HP has full control over its SSDs, it says it can guarantee their endurance, and it backs that up with a 5-year warranty on its latest SSD drives.
The 1.92TB SSD will be available first on the all-flash 7450; later in 2014 it will also be available in the other 7000 and 10000 3PAR models.
Hardware-accelerated Inline Deduplication
Until now, HP StoreOnce was the HP platform known for its excellent deduplication ratios, but StoreOnce is a backup target platform, not a SAN. With the announced software update, HP can now also deliver inline deduplication on 3PAR. And HP was really lucky that the Gen4 ASIC in the latest generation of 3PAR arrays is perfectly suited to do this. So it is effectively a hardware-accelerated solution and not a software-based solution like most competitors offer.
Some basics on deduplication: Deduplication in primary storage arrays has become increasingly important with the addition of SSDs to the supported media in these arrays. The cost differential between SSDs and HDDs requires solutions like deduplication and compression to reduce the cost per byte of these storage arrays. Primary storage arrays have to meet the high performance demands placed on them by host operating systems in terms of low latency and high throughput. The impact of deduplication on IO performance is determined by various parameters, such as whether we deduplicate inline or in the background, and the granularity of deduplication. Deduplicating data at a smaller granularity (such as 4KB), while providing better space savings, requires a lot of CPU processing and memory.
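The granularity trade-off is easy to show with a few lines of Python. This is a generic fixed-block sketch with made-up test data, not 3PAR code: the same 32KB of data is deduplicated at 4KB and at 16KB granularity.

```python
import hashlib

KB = 1024


def dedup_stats(data: bytes, block_size: int) -> tuple:
    """Return (blocks hashed, unique blocks) for fixed-size blocks."""
    seen = set()
    total = 0
    for offset in range(0, len(data), block_size):
        seen.add(hashlib.sha256(data[offset:offset + block_size]).digest())
        total += 1
    return total, len(seen)


# Four distinct 4KB pages, written twice in a different order
a, b, c, d = (bytes([i]) * (4 * KB) for i in range(4))
data = a + b + c + d + a + c + b + d

print(dedup_stats(data, 4 * KB))   # (8, 4): half of the pages are duplicates
print(dedup_stats(data, 16 * KB))  # (2, 2): no duplicates found
```

At 4KB the array has to compute and look up four times as many hashes, but it finds the duplicates that the 16KB view misses completely.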
How does 3PAR achieve this deduplication? When HP was asked whether it is the same engine as in StoreOnce, the answer was clear: NO.
3PAR deduplication is fixed block deduplication oriented towards block-based primary use cases where there is one copy of the data.
StoreOnce uses a variable chunking algorithm that is suited to data streams (backup) that repeat over and over. The dedupe pipeline also includes compression.
At backup time, when the StoreOnce host agent reads data from a 3PAR LUN that has been deduplicated at the block layer, the data is rehydrated and then deduplicated and compressed again on its way to the StoreOnce appliance.
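Why does a backup stream want variable chunking? The sketch below compares fixed blocks with a very simple content-defined chunker when one byte is inserted at the front of the data. It is a generic illustration and not the StoreOnce algorithm: the window size, the divisor and the use of CRC32 as the boundary test are all choices made for this example.

```python
import random
import zlib

WINDOW = 16   # bytes of context that decide a chunk boundary (example value)
DIVISOR = 64  # on random data, roughly one boundary every 64 bytes (example value)


def fixed_chunks(data: bytes, size: int = 64) -> list:
    """Cut the data at fixed offsets."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data: bytes) -> list:
    """Cut the data wherever the last WINDOW bytes hash to a chosen value."""
    chunks, start = [], 0
    for i in range(WINDOW - 1, len(data)):
        window = data[i - WINDOW + 1:i + 1]
        if zlib.crc32(window) % DIVISOR == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared(first: list, second: list) -> int:
    """Number of distinct chunks that appear in both lists."""
    return len(set(first) & set(second))


rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(4096))
shifted = b"X" + original  # same data with one extra byte in front

fixed_shared = shared(fixed_chunks(original), fixed_chunks(shifted))
variable_shared = shared(variable_chunks(original), variable_chunks(shifted))
print("fixed blocks still matching:   ", fixed_shared)
print("variable chunks still matching:", variable_shared)
```

With fixed blocks every block shifts by one byte and nothing matches anymore. The variable chunker cuts on content, so after the first boundary it falls back in step and nearly all chunks match again. For a primary array that addresses data by LBA, blocks do not shift like that, which is why fixed blocks are good enough there.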
HP created a mechanism called Express Indexing to detect duplicate pages very efficiently. This technology uses the computed hash signature as an index to look up whether a match exists in the Dedup Store, using a three-level translation mechanism that already existed in the 3PAR hardware. When a new write I/O request comes in, the Logical Block Address (LBA) is used as an index into three different page tables, as for a regular TPVV. But instead of allocating a new page right away, the HP 3PAR ASIC computes the hash signature of the incoming data page and compares it to the signatures of the data already stored in the CPG. If a match is found, the L3 page table entry is set to point to the existing copy of the data page. Only if no match is found is a new page allocated.
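The write path described above can be sketched in Python. This is a simplified model to show the flow, not HP's implementation: the class names, the SHA-256 hash, the bit widths used to split the page number and the plain dictionaries are all assumptions. The sketch also trusts the hash completely and leaves out overwrites and reference counting.

```python
import hashlib


class DedupStoreSketch:
    """Pages shared by all dedup volumes in one CPG (toy model)."""

    def __init__(self):
        self.by_signature = {}  # hash signature -> page id
        self.pages = []         # page id -> page data

    def put(self, data: bytes) -> int:
        signature = hashlib.sha256(data).digest()  # done by the ASIC in 3PAR
        page_id = self.by_signature.get(signature)
        if page_id is None:
            # No match: only now is a new page allocated
            page_id = len(self.pages)
            self.pages.append(data)
            self.by_signature[signature] = page_id
        return page_id


class DedupVolumeSketch:
    """A volume whose three-level page table points into the shared store."""

    def __init__(self, store: DedupStoreSketch):
        self.store = store
        self.l1 = {}  # L1 index -> L2 table -> L3 table -> page id

    @staticmethod
    def _split(page_no: int) -> tuple:
        # Hypothetical split of the page number into L1/L2/L3 indices
        return page_no >> 16, (page_no >> 8) & 0xFF, page_no & 0xFF

    def write(self, page_no: int, data: bytes) -> None:
        i1, i2, i3 = self._split(page_no)
        l3 = self.l1.setdefault(i1, {}).setdefault(i2, {})
        l3[i3] = self.store.put(data)  # L3 entry points to new or existing page

    def read(self, page_no: int) -> bytes:
        i1, i2, i3 = self._split(page_no)
        return self.store.pages[self.l1[i1][i2][i3]]


store = DedupStoreSketch()
vol1, vol2 = DedupVolumeSketch(store), DedupVolumeSketch(store)
vol1.write(0, b"A" * 16384)
vol1.write(1, b"B" * 16384)
vol2.write(70000, b"A" * 16384)  # duplicate of a page written by vol1
print(len(store.pages))  # 2
```

Three pages were written, but only two are stored: the write to the second volume just sets its L3 entry to the existing copy.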
Where and how do we get this new feature?
It will be available as part of the Base OS Suite, so there is no additional cost. It will be supported on Gen4 ASICs only, and only on volumes and snapshots pinned to an SSD tier. Dedup volumes use a new volume type called TDVV. TDVVs in the same CPG share deduplicated pages. Finally, TDVVs cannot be part of an AO configuration.
Some useful SSD terms
- SSD – Solid State Drive
- SLC – Single Level Cell, the earliest NANDs used, higher write endurance than MLC, but more expensive
- MLC – Multi Level Cell, have taken over the SSD industry, can be configured for high to low endurance using various techniques, are less expensive than SLC
- eMLC SSD – Enterprise MLC SSD, a generally accepted term, not officially defined, but meant to indicate high write endurance
- cMLC SSD – Commercial (or Consumer) MLC SSD, a generally accepted term, not officially defined, but meant to indicate low write endurance
- High Endurance (HE) – Typically used to indicate SSDs in the 25 DWPD write endurance level
- Mainstream Endurance (ME) – Typically used to indicate SSDs in the 10 DWPD write endurance level
- Light Endurance (LE) – Typically used to indicate SSDs in the 3 DWPD write endurance level
- Value Endurance (VE) – Typically used to indicate SSDs below 1 DWPD write endurance level
- Read Intensive (RI) SSD – Meant to indicate low write endurance suitable for Read Intensive usage
- V-NAND – Vertical NAND, a newer memory technology emerging, which should improve performance and write endurance
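To get a feeling for what these DWPD (drive writes per day) levels mean, you can convert them to the total amount of data written over the life of a drive. The function name below is made up, and putting a 1.92TB drive in each endurance class is purely hypothetical; the 5 years match the warranty period mentioned earlier.

```python
def lifetime_writes_tb(capacity_tb: float, dwpd: float, years: float = 5.0) -> float:
    """Total data written if the drive is overwritten dwpd times per day."""
    return capacity_tb * dwpd * 365 * years


ENDURANCE_CLASSES = {"HE": 25, "ME": 10, "LE": 3}  # DWPD levels from the list above

for name, dwpd in ENDURANCE_CLASSES.items():
    total = lifetime_writes_tb(1.92, dwpd)
    print(f"{name}: {total:,.0f} TB written in 5 years")
```

For the LE class this works out to 1.92 x 3 x 365 x 5 = 10,512 TB, or roughly 10.5 PB written over five years.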
Around The Storage Blog from @HPStorageGuy aka Calvin Zito : URL
3PAR guru Ivan Iannaccone blog : URL
Thin Deduplication overview : PDF
3CVGuy blog : URL
All the other HP Discover bloggers : URL