Thursday, September 15, 2011

Basic guide to using ZFS in Solaris


To use and configure ZFS, you first need to create at least one zpool.
After that, you should have something like this:
# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
zfstest                23.8G     91K   23.8G     0%  ONLINE     -
The pool “zfstest” also contains one zfs filesystem, created together with the pool. To manipulate filesystems there is the “zfs” command. So keep in mind: zpool manages pool storage, zfs creates filesystems and sets their options. Try this:
# zfs list
NAME      USED  AVAIL  REFER  MOUNTPOINT
zfstest    88K  23.4G  24.5K  /zfstest
As you can see, the pool “zfstest” also has a filesystem on it, mounted automatically at mountpoint /zfstest.
You may create a new filesystem by using “zfs create”:
# zfs create zfstest/king
# zfs list
NAME           USED  AVAIL  REFER  MOUNTPOINT
zfstest        118K  23.4G  25.5K  /zfstest
zfstest/king  24.5K  23.4G  24.5K  /zfstest/king
New filesystems within a pool are always named “poolname/filesystemname”. Without any additional options, a new filesystem is also mounted automatically at “/poolname/filesystemname”. Let’s create another one:
# zfs create zfstest/queen
# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
zfstest         147K  23.4G  25.5K  /zfstest
zfstest/king   24.5K  23.4G  24.5K  /zfstest/king
zfstest/queen  24.5K  23.4G  24.5K  /zfstest/queen
Here we see a difference between old-fashioned filesystems and zfs: usable storage is shared among all filesystems in a pool. “zfstest/king” has 23.4G available, and so do “zfstest/queen” and the master pool filesystem “zfstest”.
So why create filesystems at all? Couldn’t we just use subdirectories in our master pool filesystem “zfstest” (mounted on /zfstest)?
The “trick” with zfs filesystems is that each one can be given its own options, so they can be treated differently. We will see that later.
First, let’s write some dummy data to our newly created filesystem:
# dd if=/dev/zero bs=128k count=5000 of=/zfstest/king/bigfile
5000+0 records in
5000+0 records out
This command creates a file “bigfile” in directory /zfstest/king, consisting of 5000 times 128 kilobytes. That’s big enough for our purpose.
“zfs list” now reads:
# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
zfstest         625M  22.8G  27.5K  /zfstest
zfstest/king    625M  22.8G   625M  /zfstest/king
zfstest/queen  24.5K  22.8G  24.5K  /zfstest/queen
625 megabytes are used by filesystem zfstest/king, as expected. Notice also that every other filesystem in the pool can now allocate only 22.8G, as 625M are taken (compare with 23.4G above, before we created the big file).
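As a quick cross-check of the numbers, here is a small Python sketch of the arithmetic. It assumes that dd’s “128k” means 128 × 1024 bytes and that zfs reports sizes in units of 1024; the variable names are only illustrative.

```python
# Size of the file written by: dd if=/dev/zero bs=128k count=5000
BLOCK_SIZE = 128 * 1024   # assumption: dd's "128k" is 128 * 1024 bytes
COUNT = 5000

total_bytes = BLOCK_SIZE * COUNT
total_mib = total_bytes / (1024 * 1024)

print(total_bytes)  # 655360000
print(total_mib)    # 625.0
```

This matches both the 625M shown by “zfs list” and the file size of 655360000 bytes.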
You CAN also look up free space in your zfs filesystems with “df -k”, but I wouldn’t recommend it: you won’t see snapshots, and the numbers can get very big.
Example for our zpool “zfstest”:
# df -k
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0d0s0      14951508 5725184 9076809    39%    /
/devices                   0       0       0     0%    /devices
ctfs                       0       0       0     0%    /system/contract
[... lines omitted ...]
zfstest              24579072      27 23938789     1%    /zfstest
zfstest/king         24579072  640149 23938789     3%    /zfstest/king
zfstest/queen        24579072      24 23938789     1%    /zfstest/queen
So 22.8G corresponds to 23938789 kilobytes (“df -k” reports units of 1024 bytes, not bytes). Sun uses 1K = 1024 bytes, 1M = 1024K, 1G = 1024M, 1T = 1024G. They’re a computer company and not an ISO metric organization…
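To see how the “df -k” figures line up with “zfs list”, here is a minimal Python sketch of the conversion. The helper name kib_to_human is made up for this example, and it always prints one decimal place, which is a simplification of how zfs rounds its output.

```python
def kib_to_human(kib):
    """Convert a "df -k" value (units of 1024 bytes) to a short K/M/G/T string."""
    value = float(kib)
    for unit in ("K", "M", "G", "T"):
        # Stop at the first unit where the value drops below 1024
        if value < 1024 or unit == "T":
            return f"{value:.1f}{unit}"
        value /= 1024

print(kib_to_human(23938789))  # 22.8G  (the "avail" column)
print(kib_to_human(24579072))  # 23.4G  (the "kbytes" column)
print(kib_to_human(640149))    # 625.1M (used by zfstest/king)
```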
So let’s try out our first option: “quota”.
As you can imagine, “quota” limits storage. You know this from nearly every mailbox provider, which imposes a quota on your storage, as do file space providers.
First: To set and get options, you need to use “zfs set” and “zfs get”, respectively.
So here we define a quota on zfstest/queen:
# zfs set quota=5G zfstest/queen
Result:
# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
zfstest         625M  22.8G  27.5K  /zfstest
zfstest/king    625M  22.8G   625M  /zfstest/king
zfstest/queen  24.5K  5.00G  24.5K  /zfstest/queen
Only 5G are left to use at mountpoint /zfstest/queen. Note that you may still gobble up 22.8G in /zfstest/king, which would make it impossible to put 5G in /zfstest/queen. So a quota does not guarantee any storage, it only limits it.
To guarantee a certain amount of storage, use the option “reservation”:
# zfs set reservation=5G zfstest/queen
Now we simulated a classical “partition” – we reserved the same amount of storage as the quota implies, 5G:
# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
zfstest        5.61G  17.8G  27.5K  /zfstest
zfstest/king    625M  17.8G   625M  /zfstest/king
zfstest/queen  24.5K  5.00G  24.5K  /zfstest/queen
The other filesystems only have 17.8G left, as 5G are really reserved for zfstest/queen. This is also why zfstest now shows 5.61G used: 625M of data plus the 5G reservation.
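The interplay of quota and reservation can be sketched with a toy model in Python. This is only a simplified illustration, not how zfs computes its numbers internally: it ignores snapshots, nested filesystems and metadata, works in megabytes (units of 1024K), and the function and dictionary keys are invented for the example.

```python
GIB = 1024  # work in MiB so all values stay integers

def available(pool_free, datasets, name):
    """Space a filesystem may still write, in a simplified model.

    pool_free: MiB not yet written anywhere in the pool
    datasets:  {name: {"used": ..., "quota": ... or None, "reservation": ...}}
    """
    # Unfilled reservations of the *other* filesystems are off limits
    held_for_others = sum(
        max(d["reservation"] - d["used"], 0)
        for other, d in datasets.items()
        if other != name
    )
    free = pool_free - held_for_others
    # A quota caps what this filesystem may use, but guarantees nothing
    quota = datasets[name]["quota"]
    if quota is not None:
        free = min(free, quota - datasets[name]["used"])
    return free

pool_free = 23347  # roughly 22.8G unwritten in the pool, in MiB
datasets = {
    "zfstest/king":  {"used": 625, "quota": None,    "reservation": 0},
    "zfstest/queen": {"used": 0,   "quota": 5 * GIB, "reservation": 5 * GIB},
}

print(available(pool_free, datasets, "zfstest/king"))   # 18227 MiB, about 17.8G
print(available(pool_free, datasets, "zfstest/queen"))  # 5120 MiB, exactly 5G
```

Setting the reservation of zfstest/queen to 0 in this model gives king the full 23347 MiB again, which is the “quota only” situation shown earlier.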
Now, let’s try another nice option: compression.
Perhaps you are now thinking about compression nightmares on old DOS and Windows systems, like DoubleSpace, Stacker and all those other parasitic programs which killed performance, not storage. Forget them! zfs compression IS reliable, and fast!
With today’s CPU power, the cost of compressing and decompressing data is small and won’t significantly harm your overall performance. It can even boost performance, as compressed data needs less I/O.
As with many other zfs options, changing the compression setting only affects newly written blocks. Uncompressed blocks can still be read. It’s transparent to the application: fseek() et al. do not even notice that files are compressed.
# zfs set compression=on zfstest/queen
Now, compression is activated on /zfstest/queen (as “zfstest/queen” is mounted on /zfstest/queen, we did not change the mountpoint – and yes, you’re right, the mountpoint is also just another zfs option…).
Let’s copy our “bigfile” from king to queen:
# cp /zfstest/king/bigfile /zfstest/queen
OK, THIS is unfair: as our file consists only of zeroes, zfs does not really compress it. It only sets a marker saying that 655360000 bytes of zeroes have to be generated. It looks like some kind of “benchmark” hook to get nice results, and it avoids wasting space on “hole files”:
# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
zfstest        5.61G  17.8G  27.5K  /zfstest
zfstest/king    625M  17.8G   625M  /zfstest/king
zfstest/queen  24.5K  5.00G  24.5K  /zfstest/queen
No space needed in zfstest/queen… You may check it with “ls -las” (option “s” prints out the number of needed disk blocks to store the file):
# ls -las /zfstest/queen
total 7
   3 drwxr-xr-x   2 root     sys            3 Apr 23 06:17 .
   3 drwxr-xr-x   4 root     sys            4 Apr 23 06:05 ..
   1 -rw-r--r--   1 root     root     655360000 Apr 23 06:18 bigfile
One block. On our uncompressed king filesystem the situation looks like this:
# ls -las /zfstest/king
total 1280257
   3 drwxr-xr-x   2 root     sys            4 Apr 23 06:19 .
   3 drwxr-xr-x   4 root     sys            4 Apr 23 06:05 ..
1280251 -rw-r--r--   1 root     root     655360000 Apr 23 06:10 bigfile
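The block counts can be checked against the file size with a few lines of Python. This sketch assumes that “ls -s” counts blocks of 512 bytes; the guess that the few extra blocks hold filesystem metadata is also an assumption, not something the listing itself tells us.

```python
BLOCK = 512               # assumption: "ls -s" reports 512-byte blocks
file_size = 655360000     # bytes, as shown by ls
king_blocks = 1280251     # uncompressed copy
queen_blocks = 1          # copy on the compressed filesystem

data_blocks = file_size // BLOCK             # blocks needed for the data alone
overhead_blocks = king_blocks - data_blocks  # presumably metadata (assumption)

print(data_blocks)      # 1280000
print(overhead_blocks)  # 251
print(king_blocks // queen_blocks)  # 1280251 times fewer blocks on queen
```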
To create a more “real world” file, we will use the output of the “zfs get all” command, which prints ALL options of a zfs filesystem:
# zfs get all zfstest/queen
NAME           PROPERTY       VALUE                  SOURCE
zfstest/queen  type           filesystem             -
zfstest/queen  creation       Wed Apr 23  6:05 2008  -
zfstest/queen  used           24.5K                  -
zfstest/queen  available      5.00G                  -
zfstest/queen  referenced     24.5K                  -
zfstest/queen  compressratio  1.00x                  -
zfstest/queen  mounted        yes                    -
zfstest/queen  quota          5G                     local
zfstest/queen  reservation    5G                     local
zfstest/queen  recordsize     128K                   default
zfstest/queen  mountpoint     /zfstest/queen         default
zfstest/queen  sharenfs       off                    default
zfstest/queen  checksum       on                     default
zfstest/queen  compression    on                     local
zfstest/queen  atime          on                     default
zfstest/queen  devices        on                     default
zfstest/queen  exec           on                     default
zfstest/queen  setuid         on                     default
zfstest/queen  readonly       off                    default
zfstest/queen  zoned          off                    default
zfstest/queen  snapdir        hidden                 default
zfstest/queen  aclmode        groupmask              default
zfstest/queen  aclinherit     secure                 default
zfstest/queen  canmount       on                     default
zfstest/queen  shareiscsi     off                    default
zfstest/queen  xattr          on                     default
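A listing like the one above is easy to post-process. The following Python sketch parses a shortened copy of it and picks out the options that were set by hand (SOURCE “local”). The function name and the whitespace-splitting rule are assumptions made for this example; note that values containing spaces, such as the creation date, get their inner whitespace normalized.

```python
SAMPLE = """\
NAME           PROPERTY       VALUE                  SOURCE
zfstest/queen  type           filesystem             -
zfstest/queen  creation       Wed Apr 23  6:05 2008  -
zfstest/queen  quota          5G                     local
zfstest/queen  reservation    5G                     local
zfstest/queen  recordsize     128K                   default
zfstest/queen  compression    on                     local
"""

def parse_zfs_get(text):
    """Return a list of (name, property, value, source) tuples."""
    rows = []
    for line in text.splitlines()[1:]:      # skip the header line
        parts = line.split()
        if len(parts) < 4:
            continue                         # ignore blank or short lines
        # First two and last columns are single words; the value may have spaces
        rows.append((parts[0], parts[1], " ".join(parts[2:-1]), parts[-1]))
    return rows

local = {prop: value
         for _, prop, value, source in parse_zfs_get(SAMPLE)
         if source == "local"}
print(local)  # {'quota': '5G', 'reservation': '5G', 'compression': 'on'}
```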
As you can see, the “compressratio” option (a read-only option, so you may only use “zfs get” on it and not “zfs set”) gives the compression ratio of your filesystem. Our “zero file” does not count, so it remains at 1.00x! So let’s create another file now in our compressed queen filesystem:
# zfs get all zfstest/queen > /zfstest/queen/outputfile
Our file will use 3 disk blocks:
# ls -las /zfstest/queen
total 10
   3 drwxr-xr-x   2 root     sys            4 Apr 23 06:18 .
   3 drwxr-xr-x   4 root     sys            4 Apr 23 06:05 ..
   1 -rw-r--r--   1 root     root     655360000 Apr 23 06:18 bigfile
   3 -rw-r--r--   1 root     root        1598 Apr 23 06:18 outputfile
Let’s copy it to our uncompressed king filesystem:
# cp /zfstest/queen/outputfile /zfstest/king/
Here it will use 5 blocks:
# ls -las /zfstest/king
total 1280262
   3 drwxr-xr-x   2 root     sys            4 Apr 23 06:19 .
   3 drwxr-xr-x   4 root     sys            4 Apr 23 06:05 ..
1280251 -rw-r--r--   1 root     root     655360000 Apr 23 06:10 bigfile
   5 -rw-r--r--   1 root     root        1598 Apr 23 06:19 outputfile
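Why does such a small text file shrink at all, and why is a file full of zeroes the extreme case? The Python sketch below gives a feel for it. It uses zlib purely as a stand-in compressor, since this post does not say which algorithm zfs uses, and the sample text is invented to resemble the “zfs get all” output.

```python
import zlib

# Repetitive text, similar in spirit to "zfs get all" output (made-up sample)
text = "".join(
    f"zfstest/queen  {prop:<13}  on   default\n"
    for prop in ("checksum", "atime", "devices", "exec",
                 "setuid", "canmount", "xattr")
).encode("ascii")

packed = zlib.compress(text)

zeros = bytes(128 * 1024)          # one 128K record consisting only of zeroes
packed_zeros = zlib.compress(zeros)

print(len(text), "->", len(packed))          # repetitive text shrinks
print(len(zeros), "->", len(packed_zeros))   # zeroes shrink to almost nothing

# Compression is lossless: the original comes back unchanged
assert zlib.decompress(packed) == text
```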
These were the basic steps to create zfs filesystems, but at least one command is missing: how do you destroy filesystems? Use “zfs destroy”:
# zfs destroy zfstest/king
# zfs destroy zfstest/queen
Note that the filesystem must not be in use, otherwise the command won’t work (just as unmounting (umount) a classical filesystem won’t work while it’s in use).
Note also that you may NOT destroy “zfstest” this way, because that’s the master filesystem of your pool. Destroy your pool if you want to get rid of it:
# zfs destroy zfstest
cannot destroy 'zfstest': operation does not apply to pools
use 'zfs destroy -r zfstest' to destroy all datasets in the pool
use 'zpool destroy zfstest' to destroy the pool itself
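If you script such cleanups, a small guard can catch the mistake before zfs does. The Python sketch below only builds the argument list for “zfs destroy” and refuses a bare pool name; the function name is hypothetical, and the “contains a slash” check is a deliberate simplification that also rejects other names without a slash.

```python
def destroy_command(dataset):
    """Build the argv for "zfs destroy", refusing a bare pool name."""
    # A filesystem inside a pool is named "poolname/filesystemname"
    if "/" not in dataset:
        raise ValueError(
            f"{dataset!r} looks like a pool; use 'zpool destroy' for pools"
        )
    return ["zfs", "destroy", dataset]

print(destroy_command("zfstest/king"))  # ['zfs', 'destroy', 'zfstest/king']

try:
    destroy_command("zfstest")
except ValueError as err:
    print("refused:", err)
```

The returned list could then be handed to something like subprocess.run() on a machine where zfs is installed.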
