When you mix bandwidth-limited and IOPS-limited (or latency-limited) workloads on a single shared file system, the IOPS-limited workload is disproportionately affected because of head-of-line blocking.
The cause is easiest to see with an example:

- If writes are large and bandwidth-limited (e.g., 1 MiB each), you get 5 MiB/s (5 IOPS)
- If writes are small and latency-limited (e.g., 256 KiB each), you get 25 IOPS (which is still roughly the same bandwidth, about 6 MiB/s)
- If writes are mixed (say, alternating one large and one small), each pair takes about 240 ms (200 ms for the 1 MiB write plus 40 ms for the 256 KiB write stuck behind it), so you get roughly (see the sketch after this list):
  - 4 MiB/s (a 20% bandwidth loss)
  - 4 IOPS (an 84% IOPS loss)
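
To make the mixed-case numbers concrete, here is a back-of-the-envelope model. The per-I/O costs (200 ms for a 1 MiB write at 5 MiB/s, 40 ms per small write at 25 IOPS) follow from the figures above; the strict one-for-one alternation is an assumption made purely for illustration.

```python
# Back-of-the-envelope model of mixing bandwidth-limited and latency-limited
# writes on one shared queue. Per-I/O times are derived from the 5 MiB/s and
# 25 IOPS figures; one-for-one alternation is an assumption.
MIB = 1024 * 1024

large_size = 1 * MIB                    # bandwidth-limited write
large_time = large_size / (5 * MIB)     # 1 MiB at 5 MiB/s -> 0.200 s
small_time = 1 / 25                     # 25 IOPS          -> 0.040 s

pair_time = large_time + small_time     # one large + one small write
pairs_per_sec = 1 / pair_time           # ~4.17 pairs per second

print(f"large-write bandwidth: {pairs_per_sec * large_size / MIB:.1f} MiB/s")  # ~4.2
print(f"small-write IOPS:      {pairs_per_sec:.1f}")                           # ~4.2
```

Rounding down to 4 gives the 20% bandwidth loss and the 84% IOPS loss quoted above.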
Well-designed systems give these workloads their own queues so that latency-sensitive I/Os are not stuck behind bandwidth-limited I/Os that take a long time to complete.
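
A minimal sketch of that idea, assuming a hypothetical dispatcher that owns two queues (the names `latency_q` and `bulk_q` and the drain-latency-first policy are illustrative, not any real block-layer API):

```python
import queue

latency_q: queue.Queue = queue.Queue()  # small, latency-sensitive I/Os
bulk_q: queue.Queue = queue.Queue()     # large, bandwidth-heavy I/Os

def submit(io, latency_sensitive: bool) -> None:
    """Classify each I/O at submission time instead of sharing one FIFO."""
    (latency_q if latency_sensitive else bulk_q).put(io)

def next_io():
    """Pick the next I/O to hand to the device: drain the latency queue
    first, so a small write waits behind at most the single bulk I/O
    already in flight, never a whole backlog of them."""
    for q in (latency_q, bulk_q):
        try:
            return q.get_nowait()
        except queue.Empty:
            continue
    return None
```

With a single shared FIFO, a small write can sit behind every outstanding 1 MiB write; with this split, the worst case is the one bulk I/O currently on the device.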
This concept applies to both networking and storage, although the explanation above is a simplification of how this works in practice: modern networking and storage systems have multiple queues per device and many devices connected in parallel.
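
As one concrete illustration, the Linux multi-queue block layer (blk-mq) exposes one directory per hardware queue under /sys/block/<device>/mq/; a short script can count them. This assumes a Linux host with sysfs mounted and blk-mq devices such as NVMe drives.

```python
from pathlib import Path

# Count blk-mq hardware queues per block device. Assumes Linux with sysfs;
# devices that predate blk-mq simply have no mq/ directory and are skipped.
for dev in sorted(Path("/sys/block").iterdir()):
    mq = dev / "mq"
    if mq.is_dir():
        n_queues = sum(1 for entry in mq.iterdir() if entry.is_dir())
        print(f"{dev.name}: {n_queues} hardware queue(s)")
```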