ZFS - Structure - ZIL (ZFS Intent Log)

There are two major categories of write operations:

  • Synchronous (sync)
  • Asynchronous (async)

For most workloads, the vast majority of write operations are asynchronous.

  • The filesystem is allowed to aggregate them and commit them in batches, reducing fragmentation and tremendously increasing throughput.

Sync writes are an entirely different animal.

  • When an application requests a sync write, it is telling the filesystem to stop everything else and commit this data to non-volatile storage straight away; the call does not return until that has happened.
  • Sync writes must therefore be committed to disk immediately, and if that increases fragmentation or decreases throughput, so be it.
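From the application's side, the difference between the two categories is whether the call waits for the data to reach stable storage. A minimal Python sketch, assuming a POSIX-like system; the helper names and file names are hypothetical, and it shows the request an application makes rather than anything specific to ZFS:

```python
import os
import tempfile


def write_async(path, data):
    """Buffered write: returns once the OS has accepted the data.

    The filesystem is free to commit it later, batched with other writes.
    """
    with open(path, "wb") as f:
        f.write(data)


def write_sync(path, data):
    """Sync write: returns only after the OS reports the data as committed."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's userspace buffer to the OS
        os.fsync(f.fileno())  # ask the OS to commit the file to stable storage


workdir = tempfile.mkdtemp()
async_path = os.path.join(workdir, "async.dat")  # hypothetical file name
sync_path = os.path.join(workdir, "sync.dat")    # hypothetical file name

write_async(async_path, b"can be batched")
write_sync(sync_path, b"must be durable now")
```

Both files end up with the same kind of content; only `write_sync` guarantees the data was handed to stable storage before the function returned.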

ZFS handles sync writes differently from normal filesystems.

  • Instead of flushing out sync writes to normal storage immediately, ZFS commits them to a special storage area called the ZFS Intent Log, or ZIL.
  • The trick is that those writes also remain in memory, where they are aggregated along with normal asynchronous write requests and later flushed out to storage as perfectly normal TXGs (Transaction Groups).

In normal operation, the ZIL is written to and never read from again.

  • When writes saved to the ZIL are committed to main storage from RAM in normal TXGs a few moments later, they are unlinked from the ZIL.
  • The only time the ZIL is ever read from is upon pool import.

If ZFS crashes, the operating system crashes, or there is an unhandled power outage while there is data in the ZIL, that data will be read during the next pool import (e.g. when a crashed system is restarted).

  • Whatever is in the ZIL will be read in, aggregated into TXGs, committed to main storage, and then unlinked from the ZIL during the import process.
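The life cycle described above (log the sync write, keep it in RAM, commit it in a TXG, unlink it, and replay only on import) can be sketched as a toy model. This is an illustration under simplifying assumptions, not how ZFS is implemented; the `ToyPool` class and its method names are hypothetical.

```python
class ToyPool:
    """Toy model of the ZIL life cycle; not the real ZFS implementation."""

    def __init__(self):
        self.zil = []    # log records on stable storage (survive a crash)
        self.ram = []    # pending writes held in memory (lost on a crash)
        self.main = []   # data committed to main storage (survives a crash)

    def write(self, data, sync=False):
        if sync:
            self.zil.append(data)  # sync writes are logged before returning
        self.ram.append(data)      # every write is also aggregated in RAM

    def commit_txg(self):
        """Flush pending writes from RAM to main storage, then unlink the log."""
        self.main.extend(self.ram)
        self.ram.clear()
        self.zil.clear()           # log records are not needed once committed

    def crash(self):
        self.ram.clear()           # memory contents are gone

    def import_pool(self):
        """The only place the log is read: replay leftover records on import."""
        self.ram.extend(self.zil)
        self.commit_txg()


pool = ToyPool()
pool.write("a")              # async
pool.write("b", sync=True)   # sync: logged, and kept in RAM
pool.commit_txg()            # normal operation: the log is never read
pool.write("c")              # async, not yet committed
pool.write("d", sync=True)   # sync: logged
pool.crash()                 # RAM is lost before the next TXG commit
pool.import_pool()           # replay recovers "d"
print(pool.main)             # ['a', 'b', 'd']
```

In this model the async write `"c"` is lost in the crash because it only ever existed in RAM, while the sync write `"d"` survives because it was in the log.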

One of the supported vdev classes is LOG, also known as SLOG, or Separate LOG device.

  • All the SLOG does is provide the pool with a separate vdev (hopefully far faster, and with very high write endurance) to store the ZIL in, instead of keeping the ZIL on the main storage vdevs.
  • In all respects, the ZIL behaves the same whether it is on main storage, or on a LOG vdev, but if the LOG vdev has very high write performance, then sync write returns will happen very quickly.

Adding a LOG vdev to a pool absolutely cannot and will not directly improve asynchronous write performance. Even if you force all writes into the ZIL using zfs set sync=always, they still get committed to main storage in TXGs in the same way and at the same pace they would have without the LOG.

  • The only direct performance improvement is in synchronous write latency (since the LOG's greater speed enables the sync call to return faster).
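For reference, attaching a LOG vdev and changing the sync property are each a single command. The sketch below only builds and prints the command lines and does not run them; the pool name `tank`, the dataset `tank/data`, the device path, and the helper function names are hypothetical placeholders.

```python
import shlex


def add_log_vdev_cmd(pool, device):
    """Build the command that adds `device` to `pool` as a LOG vdev."""
    return ["zpool", "add", pool, "log", device]


def set_sync_cmd(dataset, mode):
    """Build the command that sets the `sync` property on a dataset."""
    if mode not in ("standard", "always", "disabled"):
        raise ValueError(f"unknown sync mode: {mode}")
    return ["zfs", "set", f"sync={mode}", dataset]


# Hypothetical names, for illustration only.
commands = [
    add_log_vdev_cmd("tank", "/dev/disk/by-id/example-slog"),
    set_sync_cmd("tank/data", "always"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

Building the commands as argument lists keeps them safe to pass to a process launcher later, and printing them first makes it easy to review what would be run against a real pool.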

However, in an environment that already requires lots of sync writes, a LOG vdev can indirectly accelerate asynchronous writes and uncached reads as well.

  • Offloading ZIL writes to a separate LOG vdev means less contention for IOPS on primary storage, thereby increasing performance for all reads and writes to some degree.
zfs/structure/zil_zfs_intent_log.txt · Last modified: 2021/10/13 02:06 by peter
