This is internal documentation. There is a good chance you’re looking for something else. See Disclaimer.

BTRFS

Work through this document from top to bottom. A BTRFS filesystem is set up first, then various features are shown and, finally, the filesystem is removed again.

Preparations

Create storage for virtual disks:

truncate -s 2G disk1 disk2 disk3

Create virtual disks:

l1=$(sudo losetup --show -f disk1)
l2=$(sudo losetup --show -f disk2)
l3=$(sudo losetup --show -f disk3)
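If losetup fails (for example because no free loop device is available), the variables above stay empty and later commands receive a blank argument. The following is a minimal sketch of a guard; require_block_devices is a hypothetical helper name, not part of losetup or btrfs-progs.

```shell
# Hypothetical helper (not part of losetup): succeed only if every
# argument is an existing block device.
require_block_devices() {
    rbd_rc=0
    for rbd_dev in "$@"; do
        if [ ! -b "$rbd_dev" ]; then
            echo "not a block device: '$rbd_dev'" >&2
            rbd_rc=1
        fi
    done
    return "$rbd_rc"
}

# Intended use with the variables set above (commented out so the
# definition can be sourced safely):
# require_block_devices "$l1" "$l2" "$l3" && echo "loop devices ready"
```

Running the check before mkfs.btrfs avoids formatting with an empty device argument.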

Format and Mount

Create single-disk filesystem:

mkfs.btrfs --csum xxhash $l1

Mount:

mkdir mnt
mount $l1 mnt -o noatime

CoW and Snapshots

Create a subvolume:

btrfs subvolume create mnt/vol1

Note

Subvolumes behave mostly like directories, but snapshots can only be taken of subvolumes, not of plain directories.

The root directory, mnt/, is a subvolume too.

Add some data:

echo "Some simple, meaningless text." >mnt/vol1/text
dd if=/dev/urandom of=mnt/vol1/bulk bs=1M count=300

Create a snapshot:

btrfs subvolume snapshot mnt/vol1 mnt/vol2

vol1 and vol2 now have the same content:

$ ls -lh mnt/vol1 mnt/vol2
mnt/vol1:
total 301M
-rw-r--r-- 1 root root 300M Oct  2 09:26 bulk
-rw-r--r-- 1 root root   31 Oct  2 09:26 text

mnt/vol2:
total 301M
-rw-r--r-- 1 root root 300M Oct  2 09:26 bulk
-rw-r--r-- 1 root root   31 Oct  2 09:26 text

But the data was not physically copied (note the used space):

$ df -h mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0      2.0G  308M  1.5G  17% /home/user/dumps/btrfs/mnt
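The rounded df -h figures make small differences hard to see. As a sketch, the used space can be read in KiB and compared before and after a step; used_kib is a hypothetical helper name, and the sync call is an assumption that helps the reported numbers settle before they are read.

```shell
# Hypothetical helper: print the "Used" column of POSIX-format df
# output, in KiB, for the filesystem that contains the given path.
used_kib() {
    df -P -k "$1" | awk 'NR == 2 { print $3 }'
}

# Intended use around any step of this walkthrough (commented out
# because it needs the mounted filesystem from above):
# before=$(used_kib mnt)
# btrfs subvolume snapshot mnt/vol1 mnt/vol2
# sync
# after=$(used_kib mnt)
# echo "step consumed $((after - before)) KiB"
```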

Note

Snapshots leverage the copy-on-write (CoW) principle. That is, when a snapshot is created, the content of the subvolume isn’t copied; the snapshot references the same content instead. Only once a file is modified does its content, or parts of it, get copied.

Modify files:

sed -i 's/meaningless/meaningful/' mnt/vol2/text
dd if=/dev/urandom of=mnt/vol2/bulk count=1 conv=notrunc

The files have been modified:

$ cat mnt/vol1/text
Some simple, meaningless text.
$ cat mnt/vol2/text
Some simple, meaningful text.
$ cmp mnt/vol1/bulk mnt/vol2/bulk
mnt/vol1/bulk mnt/vol2/bulk differ: byte 1, line 1

Again, most data was not physically copied (note the used space):

$ df -h mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0      2.0G  308M  1.5G  17% /home/user/dumps/btrfs/mnt

Note

When files are modified, only the blocks that were touched are rewritten (copying content as needed). Other blocks remain shared with the other subvolumes.

Only very little additional disk space is used as a result of the modification:

$ btrfs filesystem du -s mnt/vol1
     Total   Exclusive  Set shared  Filename
 300.00MiB       0.00B   300.00MiB  mnt/vol1
$ btrfs filesystem du -s mnt/vol2
     Total   Exclusive  Set shared  Filename
 300.00MiB     4.00KiB   300.00MiB  mnt/vol2

Note

Exclusive counts the bytes that are not shared with other files. Set shared counts the bytes that are shared.
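To use the Exclusive value in a script, the column can be cut out of the summary output. This is a sketch that assumes the four-column layout shown above; exclusive_of is a hypothetical helper name.

```shell
# Hypothetical helper: read `btrfs filesystem du -s` output on stdin
# and print the "Exclusive" column of every data row.
# The header row is skipped; it is the only row where "Set shared"
# splits into two fields.
exclusive_of() {
    awk 'NR > 1 && NF >= 4 { print $2 }'
}

# Intended use (needs the mounted filesystem from above):
# btrfs filesystem du -s mnt/vol2 | exclusive_of
```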

It’s also possible to create a read-only snapshot:

$ btrfs subvolume snapshot -r mnt/vol1 mnt/vol3
$ touch mnt/vol3/new_file
touch: cannot touch 'mnt/vol3/new_file': Read-only file system

List subvolumes:

$ btrfs subvolume list mnt
ID 260 gen 25 top level 5 path vol1
ID 261 gen 25 top level 5 path vol2
ID 262 gen 24 top level 5 path vol3

Remove subvolumes:

btrfs subvolume delete mnt/vol[123]

CoW is also applied while copying:

dd if=/dev/urandom of=mnt/data1 bs=1M count=512
cp mnt/data1 mnt/data2
cp mnt/data1 mnt/data3
cp mnt/data1 mnt/data4
cp mnt/data1 mnt/data5

Note

For older versions of GNU cp (before coreutils 9.0), use cp --reflink=auto <source> <target> to enable CoW support.
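A script that has to run against both old and new cp versions can request the reflink explicitly. The following is a sketch; cow_copy is a hypothetical helper name, and the fallback to a plain copy is an assumption for cp implementations that do not know the --reflink option.

```shell
# Hypothetical helper: copy a file, sharing extents when the
# filesystem and cp support it, and fall back to a plain copy when
# the --reflink option is rejected.
cow_copy() {
    cp --reflink=auto -- "$1" "$2" 2>/dev/null || cp -- "$1" "$2"
}

# Intended use:
# cow_copy mnt/data1 mnt/data2
```

With --reflink=auto, cp silently performs a normal copy on filesystems without CoW support, so the helper is safe to use outside BTRFS as well.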

Note the disk usage:

$ df -h mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0      2.0G  521M  1.3G  29% /home/user/dumps/btrfs/mnt

Remove files:

rm mnt/data[1-5]

Tip

When to use subvolumes:

  • When you want to be able to quickly create atomic snapshots of a subvolume/directory.

  • When you want to be able to quickly remove a subvolume/directory. Cleanup after btrfs subvolume delete happens in the background.

When not to use subvolumes:

  • For workloads that heavily modify files in random places, notably databases and VM disk images. For such workloads, disable CoW entirely:

    chattr +C ${directory_or_file}
    

    The flag is inherited when set on a directory. When setting it on a file, it must be set while the file is still empty.
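Because the flag only takes effect on empty files, it is usually easiest to set it on a fresh directory and create the files inside afterwards. A sketch, assuming a hypothetical helper name make_nocow_dir and a hypothetical target directory mnt/db:

```shell
# Hypothetical helper: create a directory and mark it No_COW so that
# files created inside it later inherit the flag.
# chattr +C fails on filesystems that do not support the flag; the
# directory is still created in that case.
make_nocow_dir() {
    mkdir -p "$1" && chattr +C "$1"
}

# Intended use (mnt/db is a made-up example path):
# make_nocow_dir mnt/db
# lsattr -d mnt/db      # the attribute list should include "C"
# touch mnt/db/table    # new, empty file inherits the flag
```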

Compression

Enable compression on a directory:

mkdir mnt/compressed
btrfs property set mnt/compressed compression zstd

Add some easily compressible content:

yes | dd bs=1M count=500 iflag=fullblock of=mnt/compressed/file

Used disk space is lower than file size:

$ ls -lh mnt/compressed/file
-rw-r--r-- 1 root root 500M Oct  2 10:33 mnt/compressed/file
$ df -h mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0      2.0G   23M  1.8G   2% /home/user/dumps/btrfs/mnt

Note

Use compsize(8) to show detailed compression stats for a file or directory.

Recreate directory:

rm -rf mnt/compressed
mkdir mnt/compressed

Create uncompressed file:

$ yes | dd bs=1M count=500 iflag=fullblock of=mnt/compressed/file
$ df -h mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0      2.0G  509M  1.3G  28% /home/user/dumps/btrfs/mnt

It’s also possible to compress existing files:

$ btrfs filesystem defragment -czstd mnt/compressed/file
$ df -h mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0      2.0G   23M  1.8G   2% /home/user/dumps/btrfs/mnt

Note

This does not enable compression for later modifications to the file. To enable that, set the property:

btrfs property set mnt/compressed/file compression zstd


Multi-Device (RAID)

Add second device:

btrfs device add $l2 mnt

Now there is an empty device:

$ btrfs device usage mnt
/dev/loop0, ID: 1
   Device size:             2.00GiB
   Device slack:              0.00B
   Data,single:           208.00MiB
   Metadata,DUP:          204.75MiB
   System,DUP:             16.00MiB
   Unallocated:             1.58GiB

/dev/loop1, ID: 2
   Device size:             2.00GiB
   Device slack:              0.00B
   Unallocated:             2.00GiB

Convert to RAID1:

btrfs balance start -mconvert=raid1 -dconvert=raid1 mnt

Note

You can also create a RAID1 filesystem directly:

mkfs.btrfs -draid1 -mraid1 --csum xxhash ${device1} ${device2}

Both devices are used now:

$ btrfs device usage mnt
/dev/loop0, ID: 1
   Device size:             2.00GiB
   Device slack:              0.00B
   Data,RAID1:            416.00MiB
   Metadata,RAID1:        512.00MiB
   System,RAID1:           32.00MiB
   Unallocated:             1.06GiB

/dev/loop1, ID: 2
   Device size:             2.00GiB
   Device slack:              0.00B
   Data,RAID1:            416.00MiB
   Metadata,RAID1:        512.00MiB
   System,RAID1:           32.00MiB
   Unallocated:             1.06GiB
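To confirm that no chunk was left in the old single or DUP profile after the conversion, the profile names can be extracted from the usage output. This sketch assumes the output layout shown above; profiles is a hypothetical helper name.

```shell
# Hypothetical helper: read `btrfs device usage` output on stdin and
# print each distinct "<type>=<profile>" pair once.
profiles() {
    awk -F'[,:]' '/^ +(Data|Metadata|System),/ {
        gsub(/ /, "", $1)      # strip the indentation from the type
        print $1 "=" $2
    }' | LC_ALL=C sort -u
}

# Intended use (needs the mounted filesystem from above):
# btrfs device usage mnt | profiles
```

After a complete conversion, every printed line should end in RAID1.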

Replace device with ID 1:

$ btrfs replace start 1 $l3 mnt
$ btrfs replace status mnt
Started on  2.Oct 11:24:34, finished on  2.Oct 11:24:34, 0 write errs, 0 uncorr. read errs
$ btrfs device usage mnt
/dev/loop2, ID: 1
   Device size:             2.00GiB
   Device slack:              0.00B
   Data,RAID1:            416.00MiB
   Metadata,RAID1:        256.00MiB
   System,RAID1:           64.00MiB
   Unallocated:             1.28GiB

/dev/loop1, ID: 2
   Device size:             2.00GiB
   Device slack:              0.00B
   Data,RAID1:            416.00MiB
   Metadata,RAID1:        256.00MiB
   System,RAID1:           64.00MiB
   Unallocated:             1.28GiB

Clean up:

rm -rf mnt/compressed

Defragmentation

Defragment directory recursively:

btrfs filesystem defragment -r mnt

Note

CoW can lead to fragmentation, particularly when existing files and directories are modified. Defragmentation can be used to improve performance.

However, this warning from the manpage should be considered:

Defragmenting […] will break up the reflinks of COW data (for example files copied with cp --reflink, snapshots or de-duplicated data). This may cause considerable increase of space usage depending on the broken up reflinks.

Deduplication

There are various tools that allow deduplication of files. That is, tools that allow you to find identical files (or chunks) and merge them together so you end up with just one copy on disk.

Create two identical files:

$ yes content | dd of=mnt/a bs=1M count=50 iflag=fullblock
$ yes content | dd of=mnt/b bs=1M count=50 iflag=fullblock
$ df -h mnt/
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop2      2.0G  106M  1.7G   6% /home/user/dumps/btrfs/mnt

Deduplicate directory recursively:

$ duperemove -rhd mnt
$ df -h mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop2      2.0G   56M  1.7G   4% /home/user/dumps/btrfs/mnt

Clean up:

rm mnt/[ab]

FS Repair / Scrubbing

Scrubbing can be used to check the integrity of the filesystem. BTRFS stores checksums with everything it writes to disk. Scrubbing verifies these checksums, together with a series of other integrity checks. Scrubbing also tries to recover from checksum mismatches by replacing the bad copy with a good one (e.g. from a RAID1 mirror).

Scrub:

$ btrfs scrub start mnt
$ btrfs scrub status mnt

Note

There is also a more thorough btrfs check, but it currently requires the filesystem to be unmounted.

BTRFS also has some self-healing properties where it tries to repair invalid items with good copies. BTRFS tracks stats about such events as well as uncorrectable errors:

$ btrfs device stats mnt/
[/dev/loop2].write_io_errs    0
[/dev/loop2].read_io_errs     0
[/dev/loop2].flush_io_errs    0
[/dev/loop2].corruption_errs  0
[/dev/loop2].generation_errs  0
[/dev/loop1].write_io_errs    0
[/dev/loop1].read_io_errs     0
[/dev/loop1].flush_io_errs    0
[/dev/loop1].corruption_errs  0
[/dev/loop1].generation_errs  0
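For monitoring, it is useful to reduce this output to a pass/fail result. A sketch that assumes the two-column layout shown above; nonzero_stats is a hypothetical helper name.

```shell
# Hypothetical helper: read `btrfs device stats` output on stdin,
# print every counter that is not zero, and return a non-zero exit
# status if at least one such counter exists.
nonzero_stats() {
    awk 'NF >= 2 && $2 + 0 != 0 { print; bad = 1 }
         END { exit (bad ? 1 : 0) }'
}

# Intended use (needs the mounted filesystem from above):
# btrfs device stats mnt/ | nonzero_stats || echo "device errors found" >&2
```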

Mount on Boot

Find UUID:

blkid -s UUID -o value --probe $l2

Create entry in /etc/fstab:

UUID=<uuid>  <target_dir>  btrfs  noatime,x-systemd.growfs  0  1

Note

Options

noatime:

Always use noatime to disable access times. Updating access times will heavily fragment the filesystem due to the use of CoW.

x-systemd.growfs:

Automatically grow FS to maximum after mounting.

nofail:

Continue the boot process if the device is missing or the mount fails.
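The two steps above can be combined so that the UUID does not have to be copied by hand. This is a sketch; fstab_line is a hypothetical helper name, /srv/data is a made-up mount point, and the default options are the ones from the entry above.

```shell
# Hypothetical helper: print an fstab entry for a BTRFS filesystem.
#   $1  filesystem UUID
#   $2  mount point
#   $3  mount options (optional; defaults to the options used above)
fstab_line() {
    printf 'UUID=%s  %s  btrfs  %s  0  1\n' \
        "$1" "$2" "${3:-noatime,x-systemd.growfs}"
}

# Intended use (prints the line only; review it before adding it to
# /etc/fstab):
# fstab_line "$(blkid -s UUID -o value --probe "$l2")" /srv/data
# fstab_line "$(blkid -s UUID -o value --probe "$l2")" /srv/data noatime,nofail
```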

Clean Up

Clean up virtual disks:

umount mnt
losetup -d $l1 $l2 $l3
rm disk[1-3]

Notable Mentions

btrfs send(8), btrfs receive(8):

Send read-only snapshots between filesystems.

btrfs filesystem resize:

Grow or shrink a filesystem while it is mounted.

btrfs balance(8):

Rebalance devices (e.g. after adding a drive).

btrfs-convert(8):

Convert an ext4 filesystem to BTRFS.