Knowledgebase:
Notes on I/O schedulers
19 May 2020 09:46 AM

Summary

I/O schedulers determine the order in which block I/O operations are passed to the storage subsystem.

Linux Kernel 4.x

The Linux 4.x Kernel, used by Red Hat Enterprise Linux 8 (RHEL 8), CentOS 8 and Amazon Linux 2, has 3 I/O schedulers that can be used with MarkLogic Server:

  • none - No reordering of I/O operations (Default for Amazon Linux 2)
  • mq-deadline - Reordering to minimize I/O latency (Default for RHEL 8)
  • kyber - Token based I/O scheduling

Linux Kernel 3.x

The Linux 3.x Kernel, used by RHEL 7 and CentOS 7, also has 3 I/O schedulers that can be used with MarkLogic Server:

  • none - No reordering of I/O operations
  • deadline - Reordering to minimize I/O latency
  • noop - Inserts I/O requests into a FIFO queue

Recommended I/O Schedulers

Three I/O schedulers are recommended for use with MarkLogic Server:

  • deadline / mq-deadline
    • configured by setting elevator=deadline as a kernel boot parameter
  • noop
    • configured by setting elevator=noop as a kernel boot parameter
  • none
    • configured by setting elevator=none as a kernel boot parameter

Note: none is recommended for SSDs, NVMe devices [1] and guest OS virtual machines [2].
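
To confirm which elevator value a host was booted with, the kernel command line can be inspected. The following is a minimal sketch; boot_elevator is a hypothetical helper name, and the command line in the example is made up for illustration.

```shell
# Print the value of any elevator= argument found in a kernel command line.
# boot_elevator is a hypothetical helper name, not a system command.
boot_elevator() {
    # Intentionally unquoted so the command line is split into words.
    for arg in $1; do
        case "$arg" in
            elevator=*) printf '%s\n' "${arg#elevator=}" ;;
        esac
    done
}

# Example with a literal command line in the same format as /proc/cmdline.
boot_elevator 'BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro elevator=deadline quiet'
```

On a live host the same function can be given the real command line with boot_elevator "$(cat /proc/cmdline)". No output means that no elevator parameter was set at boot.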

Choosing a Scheduler

If your MarkLogic host has intelligent I/O controllers (hardware RAID) or uses only SSDs or NVMe devices, choose none or noop. If you're unsure, choose deadline or mq-deadline.
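
The guidance above can be sketched as a small decision helper. This is only an illustration: suggest_scheduler and its two arguments are hypothetical, and the first argument follows the 0/1 format of /sys/block/[device-name]/queue/rotational.

```shell
# Suggest a scheduler following the guidance in this section.
# Arguments (both are assumptions made for this sketch):
#   $1  rotational: 1 for spinning disks, 0 for SSD/NVMe
#   $2  hw_raid: "yes" if an intelligent I/O controller manages the disks
suggest_scheduler() {
    rotational="$1"
    hw_raid="${2:-no}"
    if [ "$hw_raid" = "yes" ] || [ "$rotational" = "0" ]; then
        echo "none or noop"
    else
        echo "deadline or mq-deadline"
    fi
}

suggest_scheduler 0 no    # SSD/NVMe only
suggest_scheduler 1 yes   # spinning disks behind hardware RAID
suggest_scheduler 1 no    # spinning disks, no intelligent controller
```

The helper only encodes the rule of thumb. It does not detect hardware RAID, so that flag has to be supplied by whoever knows the host.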

The deadline Scheduler

The deadline scheduler attempts to minimize I/O latency by enforcing start service times for each incoming request. As I/O requests come in, each is assigned an expiry time (the deadline for that request). When the expiry time for a request is reached, the scheduler forces that request to be serviced at its location on the disk. While doing so, it also services any other requests within easy reach (those that do not require much head movement). Where possible, the scheduler attempts to complete every I/O request before its expiry time is reached.
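
The expiry times described above are exposed as tunables in sysfs when deadline or mq-deadline is active on a device: read_expire and write_expire, both in milliseconds. The sketch below prints them. show_deadline_expiry is a hypothetical helper name, sdb is only an example device, and the optional second argument exists only so the sysfs root can be overridden.

```shell
# Print the expiry tunables of the deadline (or mq-deadline) scheduler
# for one block device. show_deadline_expiry is a hypothetical helper.
show_deadline_expiry() {
    dev="$1"
    root="${2:-/sys/block}"
    for name in read_expire write_expire; do
        f="$root/$dev/queue/iosched/$name"
        if [ -r "$f" ]; then
            printf '%s=%s ms\n' "$name" "$(cat "$f")"
        else
            # The file is absent when another scheduler is active.
            printf '%s: not available\n' "$name"
        fi
    done
}

show_deadline_expiry sdb
```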

The deadline scheduler suits hosts that are not concerned with "fairness" across all processes on the system, and whose main requirement is that I/O requests are not stalled for long periods.

The deadline scheduler can be considered the best choice given a host where one process dominates disk I/O. Most database servers are a natural fit for this category.

The mq-deadline Scheduler

The mq-deadline scheduler is an adaptation of the deadline scheduler for the multi-queue block layer (blk-mq).

The noop Scheduler

The noop scheduler performs no scheduling optimizations, but does support request merging.

All incoming I/O requests are pushed onto a FIFO queue and left to the block device to manage. Intelligent disk controllers will manage the priority from there. In any situation where a hardware controller (an HBA or similar controller attached to a SAN) can manage scheduling - or where disk seek times are not important (such as on SSDs) - any extra work performed by the scheduler at Linux kernel level is wasted.

The noop scheduler can be considered the best choice when MarkLogic Server is hosted on VMware.

The Kyber Scheduler

Kyber is a recent scheduler inspired by active queue management techniques used for network routing. It uses "tokens" as a mechanism for limiting requests. A queueing token is required to allocate a request, which prevents starvation of requests. A dispatch token is also needed, and limits the operations of a certain priority on a given device. Finally, the scheduler defines a target read latency and tunes itself to reach that target. The implementation of the algorithm is relatively simple and it is considered efficient for fast devices.

Finding the Active Scheduler

The active scheduler is identified in the file /sys/block/[device-name]/queue/scheduler, and is the option surrounded by square brackets.
The example below shows that the 'noop' scheduler is currently configured for the block device sdb:

> cat /sys/block/sdb/queue/scheduler

[noop] anticipatory deadline cfq
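
The bracketed entry can also be extracted in a script. The following is a minimal sketch; active_scheduler and list_schedulers are hypothetical helper names, and the optional argument to list_schedulers exists only so the sysfs root can be overridden.

```shell
# Print the active scheduler from a line in the format used by
# /sys/block/<device>/queue/scheduler.
# active_scheduler is a hypothetical helper name, not a system command.
active_scheduler() {
    # Keep only the text between the first '[' and the following ']'.
    printf '%s\n' "$1" | sed -n 's/^[^[]*\[\([^]]*\)].*$/\1/p'
}

# Print "<device>: <active scheduler>" for every block device found.
list_schedulers() {
    root="${1:-/sys/block}"
    for f in "$root"/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev="${f#"$root"/}"
        dev="${dev%%/*}"
        printf '%s: %s\n' "$dev" "$(active_scheduler "$(cat "$f")")"
    done
}

# Example using the output shown above; prints: noop
active_scheduler '[noop] anticipatory deadline cfq'
```

Calling list_schedulers with no arguments on a Linux host reports the active scheduler for each device under /sys/block.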

References

[1] Unable to change IO scheduler on nvme device
[2] What is the suggested I/O scheduler to improve disk performance when using Red Hat Enterprise Linux with virtualization?
