fsck errors after shrinking an unmounted ext4 with resize2fs

This post was written by eli on November 29, 2018
Posted Under: Linux

Motivation

I use resize2fs a lot when backing up to a USB stick. The procedure is to create an image of an encrypted ext4 file system and write it raw to the USB flash device. To save time writing to the USB stick, the image is shrunk to its minimal size with resize2fs -M.

Uh-oh

This had been working great for years with my old resize2fs 1.41.9, but after upgrading my computer (to Linux Mint 19) and starting to use 1.44.1, things began to go wrong:

# e2fsck -f /dev/mapper/temporary_18395
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/temporary_18395: 1078201/7815168 files (0.1% non-contiguous), 27434779/31249871 blocks

# resize2fs -M -p /dev/mapper/temporary_18395
resize2fs 1.44.1 (24-Mar-2018)
Resizing the filesystem on /dev/mapper/temporary_18395 to 27999634 (4k) blocks.
Begin pass 2 (max = 1280208)
Relocating blocks             XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 3 (max = 954)
Scanning inode table          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 4 (max = 89142)
Updating inode references     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The filesystem on /dev/mapper/temporary_18395 is now 27999634 (4k) blocks long.

# e2fsck -f /dev/mapper/temporary_18395
e2fsck 1.44.1 (24-Mar-2018)
Pass 1: Checking inodes, blocks, and sizes
Inode 85354 extent block passes checks, but checksum does not match extent
	(logical block 237568, physical block 11929600, len 24454)
Fix<y>? yes
Inode 85942 extent block passes checks, but checksum does not match extent
	(logical block 129024, physical block 12890112, len 7954)
Fix<y>? yes
Inode 117693 extent block passes checks, but checksum does not match extent
	(logical block 53248, physical block 391168, len 8310)
Fix<y>? yes
Inode 122577 extent block passes checks, but checksum does not match extent
	(logical block 61440, physical block 399478, len 607)
Fix<y>? yes
Inode 129597 extent block passes checks, but checksum does not match extent
	(logical block 409600, physical block 14016512, len 12918)
Fix<y>? yes
Inode 129599 extent block passes checks, but checksum does not match extent
	(logical block 274432, physical block 13640964, len 1570)
Fix<y>? yes
Inode 129600 extent block passes checks, but checksum does not match extent
	(logical block 120832, physical block 14653440, len 13287)
Fix<y>? yes
Inode 129606 extent block passes checks, but checksum does not match extent
	(logical block 133120, physical block 14870528, len 16556)
Fix<y>? yes
Inode 129613 extent block passes checks, but checksum does not match extent
	(logical block 75776, physical block 15054848, len 23962)
Fix<y>? yes
Inode 129617 extent block passes checks, but checksum does not match extent
	(logical block 284672, physical block 15716352, len 7504)
Fix ('a' enables 'yes' to all) <y>? yes
Inode 129622 extent block passes checks, but checksum does not match extent
	(logical block 86016, physical block 15532032, len 18477)
Fix ('a' enables 'yes' to all) <y>? yes
Inode 129626 extent block passes checks, but checksum does not match extent
	(logical block 145408, physical block 16967680, len 5536)
Fix ('a' enables 'yes' to all) <y>? yes
Inode 129630 extent block passes checks, but checksum does not match extent
	(logical block 165888, physical block 17125376, len 29036)
Fix ('a' enables 'yes' to all) <y>? yes
Inode 129677 extent block passes checks, but checksum does not match extent
	(logical block 126976, physical block 17100800, len 24239)
Fix<y>? yes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/temporary_18395: 1078201/7004160 files (0.1% non-contiguous), 27383882/27999634 blocks
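
For a sense of scale, the block counts in the two e2fsck summary lines above can be turned into sizes with plain shell arithmetic. This is only a sketch: the variable names are made up for the example, and the 4096-byte block size is taken from the "(4k)" in the resize2fs output.

```shell
# Block counts copied from the e2fsck summary lines before and after the shrink.
blocks_before=31249871
blocks_after=27999634
block_size=4096   # bytes, per the "(4k)" in the resize2fs output

# Bytes that no longer need to be written, and the size of the shrunk filesystem.
saved_bytes=$(( (blocks_before - blocks_after) * block_size ))
fs_bytes=$(( blocks_after * block_size ))

echo "saved:      $(( saved_bytes / 1048576 )) MiB"
echo "filesystem: $(( fs_bytes / 1048576 )) MiB"
```

That comes to 12696 MiB saved (about 12.4 GiB) out of a filesystem that is still 109373 MiB after the shrink.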

Not the end of the world

This bug has been reported and fixed. Judging by the change made, it only affected the checksums, so although the bug makes fsck detect (and properly fix) errors, no data is lost. I ran into the same problem when shrinking a 5.7 TB partition by 40 GB: fsck reported errors, but I checked every single file, ~3 TB in total, and all was fine.

I beg to differ with the commit message's claim that it’s a “relatively rare case”, as it happened to me every single time in two completely different settings, neither of which was special in any way. However, we all use journaled filesystems, so fsck checks have become rare, which may explain how this went unnoticed: unless resize2fs fails outright, it leaves the filesystem marked as clean. Only “e2fsck -f” will reveal the problem.
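
Since a clean flag proves nothing here, one way to avoid being caught out is to always follow the shrink with a forced check. The following is a minimal sketch of that idea, not a tested backup script: shrink_and_verify is a hypothetical helper name, and it assumes the filesystem is unmounted and that a nonzero e2fsck exit status should be treated as "needs attention".

```shell
# Hypothetical helper: shrink an unmounted ext4 filesystem to its minimal
# size, then force a check even though the filesystem is marked clean.
shrink_and_verify() {
    dev="$1"

    # Forced check before shrinking, as in the session above.
    e2fsck -f "$dev" || { echo "pre-shrink check failed on $dev" >&2; return 1; }

    # -M shrinks the filesystem to the minimum size resize2fs can manage.
    resize2fs -M "$dev" || { echo "resize2fs failed on $dev" >&2; return 1; }

    # -f forces the check despite the clean flag; -n opens the filesystem
    # read-only and answers "no" to every question, so this only reports.
    e2fsck -f -n "$dev"
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "post-shrink check reported problems on $dev (exit $rc)" >&2
        return 2
    fi
    echo "clean after shrink: $dev"
}
```

It would be called as shrink_and_verify /dev/mapper/temporary_18395. A return value of 2 means the post-shrink check found something, at which point an interactive e2fsck -f like the one shown earlier is the next step.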

I would speculate that the reason for this bug is this commit (end of 2014), which speeds up the checksum rewrite after moving an inode. It’s somewhat worrying that a program of this sensitive type isn’t tested properly before being released for everyone’s use.

My own remedy was to compile an updated revision (1.44.4) from the repository, commit ID 75da66777937dc16629e4aea0b436e4cffaa866e. Actually, I first tried reverting to resize2fs 1.41.9, but that one failed to shrink a 128 GB filesystem with only 8 GB left, saying it had run out of space.
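
For anyone wanting to do the same, the build can be sketched roughly as follows. This is not a record of the exact commands I ran: build_e2fsprogs and the directory name are hypothetical, the URL is assumed to be the e2fsprogs repository on git.kernel.org, and the usual build tools (git, a C compiler, make) are assumed to be installed.

```shell
# Hypothetical helper: fetch e2fsprogs, check out a given commit and build
# it in-tree, without installing anything system-wide.
build_e2fsprogs() {
    commit="$1"
    dest="${2:-e2fsprogs-src}"   # assumed directory name for the checkout
    (
        git clone https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git "$dest" &&
        cd "$dest" &&
        git checkout "$commit" &&
        ./configure &&
        make
    ) || return 1
    # After an in-tree build, the resize2fs binary should be under resize/.
    echo "$dest/resize/resize2fs"
}
```

Calling build_e2fsprogs 75da66777937dc16629e4aea0b436e4cffaa866e prints the path of the freshly built resize2fs, which can then be run directly from the source tree instead of the distribution's version.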

Conclusion

It’s almost 2019, the word is that shrinking an ext4 filesystem is dangerous, and guess what, it’s probably a bit true. One could wish it weren’t, but unfortunately the utilities don’t seem to be maintained with the level of care one could hope for, given the damage they can do.
