
ZFS Capacity Usage - Optimizing Compression and Record Size Settings

I have migrated some data to ZFS filesystems recently, and the capacity consumed has surprised me a couple of times. In general, the data has appeared to use more capacity when stored on ZFS. This prompted me to do a little investigating. Is ZFS actually using more capacity, or is it simply a reporting anomaly? Where is that space going? Does the ZFS record size have a major impact? Does enabling compression make a significant difference?

In part, the extra space use is a result of ZFS reporting space utilization differently than other filesystems. When a ZFS filesystem is created, almost no capacity is used; a df command will show nearly the entire raw capacity as available. Many other filesystems take a portion of the raw capacity off the top and reserve it for metadata, and that reserve never shows up in df. As data is added to a ZFS filesystem, blocks are allocated for both data and metadata, and both show up as used capacity. In many other filesystems, at least some of the metadata blocks are taken from the reserve, so only the data blocks show as consumed capacity. For example, on Solaris, the du command will return the capacity used by the data blocks in a file; on ZFS, du returns the total space consumed by the file, including metadata and after compression. So the question at hand is: when storing a given set of files, does ZFS use more total space than other filesystems? That is difficult to test, given all the variables, but we can test various ZFS configuration options to determine the best settings for minimizing block use.
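
The difference is easy to see from the shell. A minimal sketch, assuming a pool named tank exists and a sample PDF is available (the pool name, filesystem name, and file path are all hypothetical):

```shell
# Sketch only: requires root and an existing ZFS pool.
# The pool name "tank" and the sample file path are hypothetical.
zfs create -o compression=lzjb tank/demo
cp /var/tmp/sample.pdf /tank/demo/

# ls -l reports the logical (uncompressed) file size:
ls -lh /tank/demo/sample.pdf

# du reports the space actually consumed on disk, after
# compression and including metadata blocks:
du -h /tank/demo/sample.pdf

# The filesystem-level view, including the compression ratio:
zfs list -o name,used,compressratio tank/demo
```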

All of our testing was done on RAID-Z2. In a RAID-Z2 pool, each data block requires at least two 512-byte sectors of parity information. With a larger record size this overhead is barely noticeable, but with a small record size it can really add up. Imagine the impact if the filesystem is using a 1KB record size: the parity data alone could double the capacity consumed! So, is the solution to use the largest possible block size? Unfortunately, it is never that simple.
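
The math behind that claim is simple: with at least two 512-byte parity sectors (1KB) accompanying every data block, the parity overhead is roughly 1KB divided by the record size. A quick back-of-the-envelope sketch:

```shell
# Estimate RAID-Z2 parity overhead: at least two 512-byte
# parity sectors (1KB) accompany every data block.
for rs_kb in 1 4 16 128; do
    awk -v rs="$rs_kb" 'BEGIN {
        printf "%3dKB record: %6.2f%% parity overhead\n", rs, 100 * 1 / rs
    }'
done
```

At a 1KB record size the parity matches the data byte for byte, doubling consumption; at 128KB it drops below 1%.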

The last block of any file will be, on average, 50% utilized. With a 128KB block size, each file is going to have an average of 64KB of wasted space. The unused portion of the block is zero-filled, and with compression enabled those zeros compress extremely well. To test this out, I created filesystems with block sizes from 1KB to 128KB, using no compression and lzjb, gzip-2, gzip-6, and gzip-9 compression. Then I copied a data set to each of these filesystems. The test data set consisted of 179,559 PDF files totaling approximately 111GB uncompressed. Nearly all of the files are larger than the largest 128KB block size. The results would be very different if the data set consisted of thousands of very small files. The intent of this test is to simulate the file sizes that might exist in a home directory environment. The goal is to examine the "wasted" capacity, not the impact of compression on the overall data set.
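
Creating the test filesystems is just a loop over the two properties. A sketch of the setup, where the pool name tank, the filesystem naming scheme, and the specific record sizes shown are illustrative:

```shell
# Sketch only: requires root and an existing ZFS pool ("tank" is
# hypothetical). One filesystem per recordsize/compression pair.
for rs in 1k 8k 32k 128k; do
    for comp in off lzjb gzip-2 gzip-6 gzip-9; do
        zfs create -o recordsize=$rs -o compression=$comp "tank/test-$rs-$comp"
    done
done

# After copying the data set into each, compare consumed space:
zfs list -r -o name,used,compressratio tank
```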


So with all of that said, let's take a look at the data:

ZFS Block Size & Compression Comparison

The additional capacity consumed by the metadata is most obvious in the 1KB block size results. The PDF files are not very compressible, so a large portion of the data reduction between no compression and lzjb is likely due to saving an average of 50% of the last block of each file. For 128KB blocks, there will be 179,559 files that waste an average of 64KB of space each. If the last blocks averaged exactly 50% utilization (unlikely) and that wasted capacity compressed down to take no space (not quite true), it would save roughly 10.95GB of capacity. Interestingly, that is in the region of the savings between 128KB with no compression and 128KB with lzjb.
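
That back-of-the-envelope figure is easy to reproduce: 179,559 files, each wasting an average of 64KB in its last 128KB block:

```shell
# 179,559 files x 64KB average waste in the last 128KB record.
wasted_gb=$(awk 'BEGIN { printf "%.2f", 179559 * 64 * 1024 / (1024 ^ 3) }')
echo "Estimated last-block waste: ${wasted_gb}GB"   # just under 11GB
```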

UPDATE: I analyzed the file sizes for this specific data set and there is ~12.3GB of space wasted in the last blocks of the 128KB no compression test. That means we are not averaging 50% utilization in the last block, but it is reasonably close.

These results would vary dramatically if the data were highly compressible or if there were many small files. Also, performance was completely ignored in these tests: the better the compression rate on the chart above, the more CPU was required. The goal here was to talk a bit about ZFS and the effect of compression on space utilization. Watch for a detailed discussion on selecting the correct block size for your application in a future post.

You can find more information about ZFS in the ZFS FAQ.

Here are a couple more charts that show the makeup of the data set that was used for these tests.

ZFS - PDF File Distribution by Capacity

ZFS - PDF File Distribution by Count

Originally posted at http://ctistrategy.com
