Write-Ahead-Logging

One of the widely used techniques in computer science is Write-Ahead-Logging (WAL), also known as journaling, an approach used for crash consistency and recovery in both file systems and database management systems (DBMS).

To understand how WAL works, let’s first take a quick peek at the basic makings of a file system. A file system has to keep some metadata about each file, which is stored in a structure called an inode. Directories (a different type of file) also need to be tracked, since they store metadata about the directory itself and key-value pairs of name→inode for the files and directories under this directory. The file system itself keeps track of which inodes and data blocks are used and which are free.

In the file system world, deleting or creating a file is a multi-step process where anything can go wrong. Let’s go through a simple version of deleting a file:

1. Removing its key-value pair from its parent directory’s inode;
2. Removing the inode from inode tables (maps inode numbers to the file’s metadata);
3. Releasing the data blocks once claimed by the file.

A crash can happen at any step during this process, leaving the file system’s data structures in an inconsistent state. The result can be a storage leak, data corruption, or inaccessible files.
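To make the failure modes concrete, here is a minimal sketch of the three deletion steps on a toy in-memory file system. Everything in it (the ToyFS class, the Crash exception, the crash_after parameter, and the inode and block numbers) is made up for illustration; a real file system is far more involved.

```python
class Crash(Exception):
    """Simulated power loss (hypothetical, for illustration only)."""


class ToyFS:
    def __init__(self):
        # Root directory: maps file names to inode numbers.
        self.root = {"a.txt": 1}
        # Inode table: maps inode numbers to metadata (here, just the data blocks).
        self.inodes = {1: {"blocks": [7, 8]}}
        # Data blocks currently marked as in use.
        self.used_blocks = {7, 8}

    def delete(self, name, crash_after=None):
        ino = self.root.pop(name)                # step 1: remove the directory entry
        if crash_after == 1:
            raise Crash
        blocks = self.inodes.pop(ino)["blocks"]  # step 2: remove the inode
        if crash_after == 2:
            raise Crash
        self.used_blocks -= set(blocks)          # step 3: release the data blocks

    def leaked_inodes(self):
        # Inodes that no directory entry points to any more.
        return set(self.inodes) - set(self.root.values())

    def leaked_blocks(self):
        # Blocks marked as used that no inode owns any more.
        owned = {b for meta in self.inodes.values() for b in meta["blocks"]}
        return self.used_blocks - owned
```

In this sketch, a crash after step 1 leaves inode 1 unreachable (leaked_inodes() returns {1}), and a crash after step 2 leaves blocks 7 and 8 marked as used with no owner (leaked_blocks() returns {7, 8}). Both are storage leaks of the kind described above.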

One approach to solving this problem is Write-Ahead-Logging. The idea is to set aside a space (a log or a journal) in which we record, ahead of time, the changes we are about to make. When a crash happens, we can simply go back, read the journal, and reapply the changes until a consistent state is reached. This is what a simple log would look like in a file system:

What we do is this: we issue a transaction begin, then we write the log entries, and we finish with a transaction end. Once the transaction is safely written to the disk, we are ready to update the file system’s data structures.

A crash can happen at any point in this sequence, and there are two cases:

If it happens before the transaction is committed (before the transaction end is written), the update is simply ignored during recovery.
If it happens after the commit but before the actual disk update completes, the logged transaction is replayed until a consistent state is reached.
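The protocol and the two crash cases above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how any particular file system implements its journal: the journal is a plain Python list, the disk is a dict of block number to contents, and the names TXB, TXE, log_transaction, recover_journal, and crash_before_commit are all hypothetical.

```python
TXB, TXE = "TxB", "TxE"  # transaction begin / end markers (names are assumptions)


def log_transaction(journal, tx_id, updates, crash_before_commit=False):
    """Append one transaction to the journal: begin marker, entries, end marker."""
    journal.append((TXB, tx_id))
    for block_no, data in updates:
        journal.append(("write", tx_id, block_no, data))
    if crash_before_commit:
        return  # simulated crash: the end marker never reaches the journal
    journal.append((TXE, tx_id))


def recover_journal(journal, disk):
    """Replay committed transactions in log order; ignore uncommitted ones."""
    committed = {rec[1] for rec in journal if rec[0] == TXE}
    for rec in journal:
        if rec[0] == "write" and rec[1] in committed:
            _, _, block_no, data = rec
            disk[block_no] = data  # overwriting a block is safe to repeat
    return disk


journal, disk = [], {}
log_transaction(journal, 1, [(10, "directory entry removed"), (11, "inode freed")])
log_transaction(journal, 2, [(12, "half-finished update")], crash_before_commit=True)
recover_journal(journal, disk)  # applies transaction 1, skips transaction 2
```

Because each entry records the full new contents of a block, replaying the journal a second time (for example, after a crash during recovery itself) produces the same result.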

There are, of course, many more details involved. If you want to know more, both chapter 40 (File System Implementation) and chapter 42 (Crash Consistency: FSCK and Journaling) from the famous “Operating Systems: Three Easy Pieces” are great entry points.

Another form of logging appears in DBMS, where consistency and durability are the main concerns. For example, you do not want the $100 bill transferred from your account to your buddy’s account to vanish because the database failed due to a crash, power loss, or disk failure.

In the database world, a buffer caches disk pages in memory, where they are updated before being flushed back to the disk. The buffer reduces the number of disk accesses, which are expensive compared to memory accesses.

During this process, many things can go wrong, which can lead to you losing your $100 bill. Write-ahead-logging lets the database keep updated pages in the buffer while first persisting a record of every operation to a log on disk, so the changes are not lost even if the cached pages have not yet been flushed. These log entries contain sufficient information to perform the necessary undo and redo actions to restore the database in case of a crash. Here is an example of a database WAL:

When a crash happens, recovery algorithms examine the WAL starting from the last checkpoint (a point at which the database is consistent), redo the logged changes, and undo the actions of transactions that were not committed before the crash.
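The redo-then-undo idea can be sketched as follows. This is a deliberately simplified illustration, not ARIES or any real DBMS's recovery code: it assumes each update record stores both the old and the new value, it treats the start of the list as the last checkpoint, and the record layout, the account names, and the recover_database function are all hypothetical.

```python
def recover_database(log, pages):
    """Bring `pages` (the on-disk state after a crash) to a consistent state."""
    committed = {rec["txn"] for rec in log if rec["type"] == "COMMIT"}

    # Redo phase: reapply every logged update in log order.
    for rec in log:
        if rec["type"] == "UPDATE":
            pages[rec["key"]] = rec["new"]

    # Undo phase: walk the log backwards and restore the old value
    # for every update made by a transaction that never committed.
    for rec in reversed(log):
        if rec["type"] == "UPDATE" and rec["txn"] not in committed:
            pages[rec["key"]] = rec["old"]
    return pages


# T1 moves $100 from alice to bob and commits.
# T2 starts debiting alice, but the crash happens before it commits.
log = [
    {"type": "UPDATE", "txn": "T1", "key": "alice", "old": 500, "new": 400},
    {"type": "UPDATE", "txn": "T1", "key": "bob", "old": 200, "new": 300},
    {"type": "COMMIT", "txn": "T1"},
    {"type": "UPDATE", "txn": "T2", "key": "alice", "old": 400, "new": 350},
]

# Assumed disk state at crash time: T2's dirty page for alice was flushed,
# while T1's update to bob was still only in the buffer.
pages = {"alice": 350, "bob": 200}
recover_database(log, pages)
```

After recovery in this example, alice holds 400 and bob holds 300: the committed transfer survives even though bob's page was never flushed, and the uncommitted debit is rolled back.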

However, logging can be expensive, since every change is written twice: once to the log and once to the data itself. How much gets written depends on the type of logging, and there are two types: physical logging and logical logging.

In physical logging, we write down the exact physical content of the change we are making to the file system (or to the table, in the case of a DBMS), which carries a significant overhead. For example:

<T1: I'm writing to data block number x at offset y with 1 KB of data, which is [some 1 KB worth of data]>.

A loose analogy for physical logging is the output of the git diff command, which shows the differences between two versions of a file or a directory. It displays the exact lines that have been added, deleted, or modified. This is useful for tracking the changes made to a project, but it can also be verbose and hard to read.

The alternative is logical logging, where we put a compact logical representation of the update in the log. A logical log entry could look like this:

<T1, Query="UPDATE foo SET val=XYZ WHERE id=1">
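To compare the two kinds of entries side by side, here is a small sketch using Python's standard library. The PhysicalRecord and LogicalRecord classes and the two apply functions are assumptions made for illustration; real systems use compact binary formats, and many combine both approaches. The logical record is replayed against an in-memory SQLite database purely as a stand-in for "re-execute the operation".

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class PhysicalRecord:
    txn: str
    block: int
    offset: int
    data: bytes  # the exact bytes to place at (block, offset)


@dataclass
class LogicalRecord:
    txn: str
    query: str  # the operation to re-execute


def apply_physical(rec, blocks):
    """Copy the logged bytes straight into the target block."""
    buf = blocks[rec.block]
    buf[rec.offset:rec.offset + len(rec.data)] = rec.data


def apply_logical(rec, conn):
    """Re-run the logged statement against the database."""
    conn.execute(rec.query)


# Physical: the entry carries the full 1 KB payload.
physical = PhysicalRecord("T1", block=3, offset=0, data=b"\x01" * 1024)
blocks = {3: bytearray(4096)}
apply_physical(physical, blocks)

# Logical: the entry carries only the statement itself.
logical = LogicalRecord("T1", "UPDATE foo SET val='XYZ' WHERE id=1")
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO foo VALUES (1, 'old')")
apply_logical(logical, conn)

print(len(physical.data), len(logical.query))  # payload size of each entry
```

The trade-off shows up in the two apply functions: the physical entry is large but replaying it is a plain byte copy, while the logical entry is small but replaying it means executing the operation again.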

If you want to know more about logging in the database world, the lectures on “Database logging” and “Database Recovery with ARIES” from CMU's “Intro to Database Systems” are some of the nicest materials I have studied.

In conclusion, Write-ahead-logging is a technique that ensures crash consistency and recovery in both file systems and databases. It works by recording the changes in a log before applying them to the data structures. Its advantages include more concurrency and less vulnerability to system failures. Its main drawback is the logging overhead.