Disk Configuration
Storage Capacity
Mount the following partitions on dedicated disks or on LVM-provisioned logical volumes so they can be expanded later.
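For example, a dedicated logical volume for etcd data can be provisioned with LVM and grown later without reprovisioning the node. This is a minimal sketch; the device path, volume group and volume names, sizes, and filesystem are assumptions to adjust for your environment:

```
# Provision a dedicated logical volume for etcd data.
# /dev/sdb, vg_data, lv_etcd, and the sizes are placeholders.
pvcreate /dev/sdb
vgcreate vg_data /dev/sdb
lvcreate -L 64G -n lv_etcd vg_data
mkfs.xfs /dev/vg_data/lv_etcd
mount /dev/vg_data/lv_etcd /var/lib/etcd

# Later, when more space is needed, extend the volume and
# grow the filesystem in one step.
lvextend --resizefs -L +32G /dev/vg_data/lv_etcd
```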
Recommended etcd Practices
Fast storage is essential for etcd to perform reliably. etcd depends on durable, low-latency disk operations to persist proposals to its write-ahead log (WAL).
If disk writes take too long, fsync delays can cause the member to miss heartbeats, fail to commit proposals promptly, and experience request timeouts or temporary leader changes. These issues can also slow the Kubernetes API and degrade overall cluster responsiveness.
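One way to check whether fsync latency is already a problem is to inspect etcd's Prometheus metrics, which include the etcd_disk_wal_fsync_duration_seconds histogram. A minimal sketch, assuming a local member listening on the default client port with the certificate paths shown (adjust both for your deployment):

```
# Query the etcd metrics endpoint and filter for WAL fsync latency.
# The endpoint and certificate paths are assumptions; adjust for your cluster.
curl -s --cacert /etc/etcd/ca.crt \
     --cert /etc/etcd/client.crt \
     --key /etc/etcd/client.key \
     https://127.0.0.1:2379/metrics | grep etcd_disk_wal_fsync_duration_seconds
```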
In short, HDDs are a poor choice and are not recommended. If you must use HDDs for etcd, choose the fastest available (for example, 15,000 RPM drives).
The following storage practices help provide optimal etcd performance:
- Prefer SSDs or NVMe as etcd drives. When write endurance and stability are priorities, consider server-grade single-level cell (SLC) SSDs. Avoid NAS, SAN, and HDDs.
- Prefer drives with high write throughput to accelerate compaction and defragmentation.
- Prefer drives with strong read bandwidth to reduce recovery time after failures.
- Prefer drives with consistently low latency to ensure fast read and write operations.
- Avoid distributed block storage systems such as Ceph RADOS Block Device (RBD), Network File System (NFS), and other network-attached backends, because they introduce unpredictable latency.
- Keep etcd data on a dedicated drive or a dedicated logical volume; a quick way to verify this is shown after this list.
- Do not place I/O-intensive workloads (such as logging) or other heavy filesystem activity on control-plane hosts; at minimum, do not let them share the same underlying storage with etcd.
- Continuously benchmark with tools like fio and use the results to track performance as the cluster grows. For more information, refer to the disk benchmarking guide, Validating the hardware for etcd.
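To verify the dedicated-volume recommendation above, you can confirm which block device backs etcd's data directory. The sketch below assumes etcd's default data directory, /var/lib/etcd:

```
# Show the device, filesystem, and mount options backing the etcd data directory.
findmnt /var/lib/etcd

# List all block devices to confirm nothing else shares that device.
lsblk
```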
Benchmarking with fio
To measure actual sequential IOPS and throughput, we suggest using the disk benchmarking tool fio. Follow these instructions:
Do not run these tests against any nodes of a running cluster. Instead, run them against a dedicated VM that has the same setup as the control plane nodes.
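A minimal sketch of such a run, modeled on the widely used etcd-style fio check that issues an fdatasync after every small sequential write to approximate etcd's WAL pattern (the test directory, size, and block size are assumptions; point --directory at a path on the disk you intend to use for etcd):

```
# Sequential writes with an fdatasync after each write, approximating
# etcd's WAL behavior. Run on a scratch directory on the candidate disk.
fio --name=etcd-wal-test \
    --rw=write --ioengine=sync --fdatasync=1 \
    --directory=/mnt/etcd-test --size=22m --bs=2300
```

In the output, focus on the fdatasync latency percentiles; a commonly cited guideline for etcd is a 99th percentile fdatasync latency below 10 ms.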