Files on modern filesystems are not always stored contiguously. Instead, they are broken up into chunks (called
extents
), each of which may be located almost anywhere on the underlying disk. Each extent takes up one or more contiguous
blocks
on the disk, which represent the underlying physical disk structure. Files may not share blocks or extents between themselves, although there may be (and hopefully are) blocks not currently in use on the filesystem.
Despite all of the advances in modern disk technology, this disparate layout still makes for slower access than when all of a file's blocks are in order and adjacent (and therefore occupying a single extent). Because of this, many modern filesystems support defragmentation: the reorganizing of files on the disk so that the blocks of any given file are in order and adjacent.
With RAD's Awesome Dynamic Filesystem (commonly referred to as RADfs), a file can occupy any number of extents, each of which consists of at least two adjacent blocks. The first block in an extent holds metadata about that extent; the rest contain some portion of the file. Consider this simple representation of the extents and blocks occupied by a file:
RADfs.doc: 57-58,102-114,23-47
The file RADfs.doc currently occupies 40 blocks on the disk, even though the file itself is only 37 blocks in size. The smallest on-disk size that it could have is 38 blocks (37 plus a single metadata block), and that can only happen if the entire file is in a single extent. One potential single-extent representation of the file is:
RADfs.doc: 115-152
Here, the single metadata block (115) is followed by the 37 data blocks in a single extent.
The developers of RADfs have also developed the RADical Defragmentation Daemon, or RADDD. RADDD works with a simple two-step algorithm; despite its simplicity, it still manages to significantly reduce the number of extents consumed by files, given enough free space on the disk.
A RADDD pass works as follows:
- First Step ("to the back"):
- For every file on the filesystem that hasn't been run through this step on this pass, sorted in ascending order by the first block it occupies on the disk:
- Find the series of adjacent unused blocks nearest the end of the disk that can hold the file plus a single metadata block
- If such a series exists:
- Move the file to those blocks
- Mark the original blocks as unused
- Else do nothing to the file
- For every file on the filesystem that hasn't been run through this step on this pass, sorted in ascending order by the first block it occupies on the disk:
- Second Step ("to the front"):
- For every file on the filesystem that hasn't been run through this step on this pass, sorted in descending order by the last block it occupies on the disk:
- Find the series of adjacent unused blocks nearest the beginning of the disk that can hold the file plus a single metadata block
- If such a series exists:
- Move the file to those blocks
- Mark the original blocks as unused
- Else do nothing to the file
- For every file on the filesystem that hasn't been run through this step on this pass, sorted in descending order by the last block it occupies on the disk:
After one pass on a filesystem with some free space, some files have hopefully been reduced to a single extent. Multiple passes can often defragment the filesystem even further.
Of course, it's not that simple; some files simply cannot be moved, perhaps due to being in use at the time of the run. These are marked on the filesystem as immobile, and are ignored by RADDD.
Given the size of a disk, the current layout of files on the disk, and a number of passes of RADDD to run, can you determine the final filesystem layout?