This is internal documentation. There is a good chance you’re looking for something else. See Disclaimer.
BTRFS¶
Work through this document from top to bottom. A BTRFS filesystem is set up first, then various features are shown and, finally, the filesystem is removed again.
Preparations¶
Create storage for virtual disks:
truncate -s 2G disk1 disk2 disk3
Create virtual disks:
l1=$(sudo losetup --show -f disk1)
l2=$(sudo losetup --show -f disk2)
l3=$(sudo losetup --show -f disk3)
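Before going further, it may help to confirm that the tools used in this walkthrough are installed. The following is a minimal sketch; missing_tools is a hypothetical helper name and not part of btrfs-progs or util-linux.

```shell
# Print every command from the argument list that is not found in PATH.
# Returns 0 when nothing is missing, 1 otherwise.
# (missing_tools is a hypothetical helper, not an existing utility.)
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}
```

For example, missing_tools truncate losetup mkfs.btrfs btrfs prints nothing when everything is available. Tools installed under /usr/sbin may only be found when the check is run as root.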
Format and Mount¶
Create single-disk filesystem (as root, like most of the commands that follow):
mkfs.btrfs --csum xxhash $l1
Mount:
mkdir mnt
mount $l1 mnt -o noatime
CoW and Snapshots¶
Create a subvolume:
btrfs subvolume create mnt/vol1
Note
Subvolumes mostly behave like directories, but snapshots can only be created of subvolumes, not of plain directories.
The root directory, mnt/, is a subvolume too.
Add some data:
echo "Some simple, meaningless text." >mnt/vol1/text
dd if=/dev/urandom of=mnt/vol1/bulk bs=1M count=300
Create a snapshot:
btrfs subvolume snapshot mnt/vol1 mnt/vol2
vol1 and vol2 now have the same content:
$ ls -lh mnt/vol1 mnt/vol2
mnt/vol1:
total 301M
-rw-r--r-- 1 root root 300M Oct 2 09:26 bulk
-rw-r--r-- 1 root root 31 Oct 2 09:26 text
mnt/vol2:
total 301M
-rw-r--r-- 1 root root 300M Oct 2 09:26 bulk
-rw-r--r-- 1 root root 31 Oct 2 09:26 text
But the data was not physically copied (note the used space):
$ df -h mnt
Filesystem Size Used Avail Use% Mounted on
/dev/loop0 2.0G 308M 1.5G 17% /home/user/dumps/btrfs/mnt
Note
Snapshots leverage the copy-on-write (CoW) principle. That is, when a snapshot is created, the content of a subvolume isn’t copied; the same content is referenced instead. Only when a file is modified is its content, or part of it, copied.
Modify files:
sed -i 's/meaningless/meaningful/' mnt/vol2/text
dd if=/dev/urandom of=mnt/vol2/bulk count=1 conv=notrunc
The files have been modified:
$ cat mnt/vol1/text
Some simple, meaningless text.
$ cat mnt/vol2/text
Some simple, meaningful text.
$ cmp mnt/vol1/bulk mnt/vol2/bulk
mnt/vol1/bulk mnt/vol2/bulk differ: byte 1, line 1
Again, most data was not physically copied (note the used space):
$ df -h mnt
Filesystem Size Used Avail Use% Mounted on
/dev/loop0 2.0G 308M 1.5G 17% /home/user/dumps/btrfs/mnt
Note
When files are modified, only the blocks that were touched are rewritten (copying content as needed). Other blocks remain shared with other subvolumes.
Only very little additional disk space is used as a result of the modification:
$ btrfs filesystem du -s mnt/vol1
Total Exclusive Set shared Filename
300.00MiB 0.00B 300.00MiB mnt/vol1
$ btrfs filesystem du -s mnt/vol2
Total Exclusive Set shared Filename
300.00MiB 4.00KiB 300.00MiB mnt/vol2
Note
Exclusive counts bytes that are not shared with any other file. Set shared counts the shared bytes referenced by the files under the given path.
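To compare the exclusive usage of several subvolumes in a script, the Exclusive column can be pulled out of the output. A minimal sketch, assuming the header row and column order shown above and paths without whitespace; exclusive_column is a hypothetical helper name.

```shell
# Read `btrfs filesystem du -s` output on stdin and print
# "<path> <exclusive>" for every data row (the header row is skipped).
# Assumes four whitespace-separated columns as in the output above.
exclusive_column() {
    awk 'NR > 1 && NF >= 4 { print $4, $2 }'
}
```

Usage: btrfs filesystem du -s mnt/vol1 mnt/vol2 | exclusive_column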
It’s also possible to create a read-only snapshot:
$ btrfs subvolume snapshot -r mnt/vol1 mnt/vol3
$ touch mnt/vol3/new_file
touch: cannot touch 'mnt/vol3/new_file': Read-only file system
List subvolumes:
$ btrfs subvolume list mnt
ID 260 gen 25 top level 5 path vol1
ID 261 gen 25 top level 5 path vol2
ID 262 gen 24 top level 5 path vol3
Remove subvolumes:
btrfs subvolume delete mnt/vol[123]
CoW is also applied while copying:
dd if=/dev/urandom of=mnt/data1 bs=1M count=512
cp mnt/data1 mnt/data2
cp mnt/data1 mnt/data3
cp mnt/data1 mnt/data4
cp mnt/data1 mnt/data5
Note
For older versions of GNU cp (before coreutils 9.0), use cp --reflink=auto <source> <target> to enable CoW support.
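When it matters whether a copy actually shares data, cp --reflink=always can be used, since it fails instead of silently duplicating the data. The sketch below wraps that in a hypothetical helper, reflink_copy, which reports which path was taken and falls back to a plain copy.

```shell
# Copy src to dst, sharing extents when the filesystem supports it.
# Prints "reflink" if a CoW clone was made, "copy" if data was duplicated.
# (reflink_copy is a hypothetical helper name.)
reflink_copy() {
    src=$1
    dst=$2
    if cp --reflink=always -- "$src" "$dst" 2>/dev/null; then
        echo reflink
    else
        # Cloning not possible (e.g. not on BTRFS, or across filesystems).
        # Remove a possibly empty leftover, then do a regular copy.
        rm -f -- "$dst"
        cp -- "$src" "$dst" && echo copy
    fi
}
```

On the BTRFS mount from this walkthrough, reflink_copy mnt/data1 mnt/data2 should print reflink; on a filesystem without reflink support it prints copy.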
Note the disk usage:
$ df -h mnt
Filesystem Size Used Avail Use% Mounted on
/dev/loop0 2.0G 521M 1.3G 29% /home/user/dumps/btrfs/mnt
Remove files:
rm mnt/data[1-5]
Tip
When to use subvolumes:
When you want to be able to quickly create atomic snapshots of a subvolume/directory.
When you want to be able to quickly remove a subvolume/directory. Cleanup after btrfs subvolume delete happens in the background.
When not to use subvolumes:
For workloads that heavily modify files in random places, notably databases and VM disk images. For such workloads, disable CoW entirely:
chattr +C ${directory_or_file}
The flag is inherited when set on a directory. When setting it on a file, it must be set while the file is still empty.
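Because the flag only takes effect on empty files, a common pattern is to set it on a fresh directory and verify it with lsattr before placing data there. A sketch under the assumption that the directory lives on BTRFS (chattr +C fails elsewhere); both function names are hypothetical.

```shell
# Succeed if an lsattr flag string (first column of lsattr output)
# contains the uppercase C (No_COW) attribute.
flags_have_nocow() {
    case $1 in
        *C*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Create a directory with CoW disabled so that new files inherit the flag,
# then confirm that the flag is really set.
make_nocow_dir() {
    mkdir -p "$1" &&
    chattr +C "$1" &&
    flags_have_nocow "$(lsattr -d "$1" | awk '{ print $1 }')"
}
```

Note that the lowercase c shown by lsattr is the unrelated compression attribute, which is why the match is case-sensitive.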
Compression¶
Enable compression on a directory:
mkdir mnt/compressed
btrfs property set mnt/compressed compression zstd
Add some easily compressible content:
yes | dd bs=1M count=500 iflag=fullblock of=mnt/compressed/file
Used disk space is lower than file size:
$ ls -lh mnt/compressed/file
-rw-r--r-- 1 root root 500M Oct 2 10:33 mnt/compressed/file
$ df -h mnt
Filesystem Size Used Avail Use% Mounted on
/dev/loop0 2.0G 23M 1.8G 2% /home/user/dumps/btrfs/mnt
Note
Use compsize(8) to show detailed compression stats for a file or directory.
Recreate directory:
rm -rf mnt/compressed
mkdir mnt/compressed
Create uncompressed file:
$ yes | dd bs=1M count=500 iflag=fullblock of=mnt/compressed/file
$ df -h mnt
Filesystem Size Used Avail Use% Mounted on
/dev/loop0 2.0G 509M 1.3G 28% /home/user/dumps/btrfs/mnt
It’s also possible to compress existing files:
$ btrfs filesystem defragment -czstd mnt/compressed/file
$ df -h mnt
Filesystem Size Used Avail Use% Mounted on
/dev/loop0 2.0G 23M 1.8G 2% /home/user/dumps/btrfs/mnt
Note
This does not enable compression for later modifications to the file. To enable it, set the compression property:
btrfs property set mnt/compressed/file compression zstd
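The property can be read back with btrfs property get <path> compression. The sketch below parses that output; it assumes the compression=<algorithm> output format and that nothing is printed when the property is unset. compression_algo is a hypothetical helper name.

```shell
# Read `btrfs property get <path> compression` output on stdin and print
# the algorithm name, or "none" when the output is empty (assumed to mean
# that the property is not set).
compression_algo() {
    algo=$(sed -n 's/^compression=//p')
    printf '%s\n' "${algo:-none}"
}
```

Usage: btrfs property get mnt/compressed/file compression | compression_algo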
Multi-Device (RAID)¶
Add second device:
btrfs device add $l2 mnt
Now there is an empty device:
$ btrfs device usage mnt
/dev/loop0, ID: 1
Device size: 2.00GiB
Device slack: 0.00B
Data,single: 208.00MiB
Metadata,DUP: 204.75MiB
System,DUP: 16.00MiB
Unallocated: 1.58GiB
/dev/loop1, ID: 2
Device size: 2.00GiB
Device slack: 0.00B
Unallocated: 2.00GiB
Convert to RAID1:
btrfs balance start -mconvert=raid1 -dconvert=raid1 mnt
Note
You can also create a RAID1 filesystem directly:
mkfs.btrfs -draid1 -mraid1 --csum xxhash ${device1} ${device2}
Both devices are used now:
$ btrfs device usage mnt
/dev/loop0, ID: 1
Device size: 2.00GiB
Device slack: 0.00B
Data,RAID1: 416.00MiB
Metadata,RAID1: 512.00MiB
System,RAID1: 32.00MiB
Unallocated: 1.06GiB
/dev/loop1, ID: 2
Device size: 2.00GiB
Device slack: 0.00B
Data,RAID1: 416.00MiB
Metadata,RAID1: 512.00MiB
System,RAID1: 32.00MiB
Unallocated: 1.06GiB
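To check from a script that the conversion is complete, the profiles reported by btrfs device usage can be collected per block group type. A minimal sketch, assuming the Type,PROFILE: size line format shown above; profiles_in_use is a hypothetical helper name.

```shell
# Read `btrfs device usage` output on stdin and print the distinct profiles
# in use for the given block group type (Data, Metadata or System).
profiles_in_use() {
    awk -F '[,:]' -v type="$1" '
        { gsub(/^[ \t]+/, "", $1) }   # strip indentation from the type name
        $1 == type { print $2 }
    ' | sort -u
}
```

After a finished conversion, btrfs device usage mnt | profiles_in_use Data should print only RAID1. Seeing single as well would indicate that some block groups still use the old profile.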
Replace device with ID 1:
$ btrfs replace start 1 $l3 mnt
$ btrfs replace status mnt
Started on 2.Oct 11:24:34, finished on 2.Oct 11:24:34, 0 write errs, 0 uncorr. read errs
$ btrfs device usage mnt
/dev/loop2, ID: 1
Device size: 2.00GiB
Device slack: 0.00B
Data,RAID1: 416.00MiB
Metadata,RAID1: 256.00MiB
System,RAID1: 64.00MiB
Unallocated: 1.28GiB
/dev/loop1, ID: 2
Device size: 2.00GiB
Device slack: 0.00B
Data,RAID1: 416.00MiB
Metadata,RAID1: 256.00MiB
System,RAID1: 64.00MiB
Unallocated: 1.28GiB
Clean up:
rm -rf mnt/compressed
Defragmentation¶
Defragment directory recursively:
btrfs filesystem defragment -r mnt
Note
CoW can lead to fragmentation, particularly when existing files and directories are modified. Defragmentation can be used to improve performance.
However, this warning from the manpage should be considered:
Defragmenting […] will break up the reflinks of COW data (for example files copied with cp --reflink, snapshots or de-duplicated data). This may cause considerable increase of space usage depending on the broken up reflinks.
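To see how much space a defragmentation run costs on a filesystem with snapshots or reflinked files, the used space can be compared before and after. The sketch below uses the POSIX df output format; used_kib and space_delta are hypothetical helper names.

```shell
# Print the used space of the filesystem containing the given path, in KiB.
used_kib() {
    df -P -k "$1" | awk 'NR == 2 { print $3 }'
}

# Run a command and report how much the used space of a mount point changed.
# Arguments: mount point, then the command to run with its arguments.
space_delta() {
    mountpoint=$1
    shift
    before=$(used_kib "$mountpoint")
    "$@" || return
    sync    # flush pending writes so that df reflects the new state
    after=$(used_kib "$mountpoint")
    echo "$((after - before)) KiB"
}
```

Usage: space_delta mnt btrfs filesystem defragment -r mnt. The same helper works for other operations in this document, for example a duperemove run, where the delta would be expected to be negative.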
Deduplication¶
There are various tools that allow deduplication of files. That is, tools that allow you to find identical files (or chunks) and merge them together so you end up with just one copy on disk.
Create two identical files:
$ yes content | dd of=mnt/a bs=1M count=50 iflag=fullblock
$ yes content | dd of=mnt/b bs=1M count=50 iflag=fullblock
$ df -h mnt/
Filesystem Size Used Avail Use% Mounted on
/dev/loop2 2.0G 106M 1.7G 6% /home/user/dumps/btrfs/mnt
Deduplicate directory recursively:
$ duperemove -rhd mnt
$ df -h mnt
Filesystem Size Used Avail Use% Mounted on
/dev/loop2 2.0G 56M 1.7G 4% /home/user/dumps/btrfs/mnt
Clean up:
rm mnt/[ab]
FS Repair / Scrubbing¶
Scrubbing can be used to check the integrity of the filesystem. BTRFS stores checksums with everything it writes to disk. Scrubbing verifies these checksums and performs a series of other integrity checks. Scrubbing will also try to recover from checksum mismatches by replacing the bad copy with a good one (e.g. from a RAID1 mirror).
Scrub:
$ btrfs scrub start mnt
$ btrfs scrub status mnt
Note
There is also a more thorough btrfs check, but it currently requires the filesystem to be unmounted.
BTRFS also has some self-healing properties where it tries to repair invalid items with good copies. BTRFS tracks stats about such events as well as uncorrectable errors:
$ btrfs device stats mnt/
[/dev/loop2].write_io_errs 0
[/dev/loop2].read_io_errs 0
[/dev/loop2].flush_io_errs 0
[/dev/loop2].corruption_errs 0
[/dev/loop2].generation_errs 0
[/dev/loop1].write_io_errs 0
[/dev/loop1].read_io_errs 0
[/dev/loop1].flush_io_errs 0
[/dev/loop1].corruption_errs 0
[/dev/loop1].generation_errs 0
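For monitoring, it is usually enough to know whether any counter is non-zero. btrfs-progs documents a --check option for btrfs device stats that returns a non-zero exit status in that case. The sketch below is a hypothetical alternative that additionally prints the offending counters; it assumes the two-column output format shown above, and nonzero_stats is not an existing tool.

```shell
# Read `btrfs device stats` output on stdin, print every counter whose
# value is not zero, and return 1 if at least one such counter was found.
nonzero_stats() {
    awk 'NF == 2 && $2 != 0 { print; bad = 1 } END { exit bad }'
}
```

Usage: btrfs device stats mnt | nonzero_stats || echo "device errors found"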
Mount on Boot¶
Find UUID:
blkid -s UUID -o value --probe $l2
Create entry in /etc/fstab:
UUID=<uuid> <target_dir> btrfs noatime,x-systemd.growfs 0 1
Note
Options
noatime:
Always use noatime to disable access times. Updating access times will heavily fragment the filesystem due to the use of CoW.
x-systemd.growfs:
Automatically grow FS to maximum after mounting.
nofail:
Continue the boot process if the device is missing or the mount fails.
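The entry can also be generated from the UUID and the mount point. A minimal sketch that reproduces the line above and optionally appends further options such as nofail; fstab_entry is a hypothetical helper, and the UUID and path in the usage example are placeholders.

```shell
# Print an /etc/fstab line for a BTRFS filesystem.
# Arguments: filesystem UUID, mount point, optional extra mount options.
fstab_entry() {
    opts="noatime,x-systemd.growfs${3:+,$3}"
    printf 'UUID=%s %s btrfs %s 0 1\n' "$1" "$2" "$opts"
}
```

Usage: fstab_entry "$(blkid -s UUID -o value --probe $l2)" /srv/data nofail prints a line that can be reviewed and then appended to /etc/fstab.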
Clean Up¶
Clean up virtual disks:
umount mnt
losetup -d $l1 $l2 $l3
rm disk[1-3]
Notable Mentions¶
btrfs send(8), btrfs receive(8):
Send read-only snapshots between filesystems.
btrfs filesystem resize:
Online grow or shrink filesystem.
btrfs balance(8):
Rebalance devices (e.g. after adding a drive).
btrfs-convert(8):
Convert an ext4 filesystem to BTRFS.