Large server setups are quite typical for KVM on z. One common pitfall is that system and application defaults that work well for a small number of servers may not work well for a large number of virtual servers, or in combination with huge host resources. The following sections present common snags, along with suggestions on how to resolve each of them.
System-wide I/O Limits
To achieve the best possible performance, QEMU should be configured to use Linux native asynchronous I/O (AIO). With a large number of virtual servers running on the KVM host, the number of outstanding asynchronous I/O requests can exceed the system-wide limit, which is exposed through /proc/sys/fs/aio-max-nr and controlled via sysctl fs.aio-max-nr. If this limit is reached, it might not be possible to start any more virtual servers. Therefore, use
$ sysctl fs.aio-nr fs.aio-max-nr
fs.aio-nr = 0
fs.aio-max-nr = 65536

to check the current and maximum number of asynchronous I/O requests.
To prevent any issues, we recommend increasing the maximum to a higher value, for example:
$ sysctl fs.aio-max-nr=4194304
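Note that a limit set this way does not persist across reboots. One possible way to make the change permanent is a drop-in file under /etc/sysctl.d; the file name below is merely an example:

$ echo "fs.aio-max-nr = 4194304" > /etc/sysctl.d/90-kvm-aio.conf
$ sysctl -p /etc/sysctl.d/90-kvm-aio.conf
fs.aio-max-nr = 4194304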
Paging Performance
The storage servers typically used by z Systems do not suffer from seek latencies, and are similar to SSDs in this respect. However, FCP LUNs, for example, are treated as regular SCSI disks, and the I/O scheduler will therefore apply strategies that avoid seeks. This can adversely affect performance, for example when paging. We therefore recommend configuring all disks as non-rotational. Use
$ cat /sys/block/<sdx>/queue/rotational

to check the current setting. Use the following command to mark the respective device as non-rotational:
$ echo 0 > /sys/block/<sdx>/queue/rotational
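As with the sysctl setting above, a value written to sysfs is lost on reboot. One possible way to apply it automatically is a udev rule; the rule file name below is only an example:

$ cat /etc/udev/rules.d/10-rotational.rules
ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd*[!0-9]", ATTR{queue/rotational}="0"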
QEMU User Limits
QEMU instances started by libvirt typically run as the dedicated user qemu and are subject to the host system's per-user limits. When deploying many guests, the limits on both the number of user processes and the number of open file descriptors can be exceeded. We therefore recommend increasing QEMU's limits for the number of processes and files. Use the following commands to determine the current number of QEMU processes and their open files:
# Number of QEMU processes
$ ps -afem | grep qemu | grep -v grep | wc -l
463
# Number of open files
$ for i in `ps -afe | grep qemu | grep -v grep | awk \
  '{print($2)}'`; do ls -1 /proc/$i/fd; done | wc -l
13891

We recommend editing /etc/libvirt/qemu.conf and using the following values to avoid running into resource limits:
max_processes = 10000
max_files = 100000

Note that these numbers were chosen to be on the safe side for large deployments. If you have only a small number of virtual servers, the defaults might be just fine.
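The new limits apply to QEMU processes started after the change. To verify that a running virtual server picked them up, its /proc entry can be inspected; the process ID and the values shown below are just an example:

$ grep -E "Max (processes|open files)" /proc/4711/limits
Max processes             10000                10000                processes
Max open files            100000               100000               files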
Libvirt Client Connections
libvirt clients like virsh or virt-manager communicate with the libvirt daemon via remote procedure calls (RPCs). The number of concurrent RPC requests is limited, both per client and globally. The default limits should suffice in many situations where a single human administrator uses standard tools to manage virtual servers. However, in certain situations many libvirt requests may need to be processed in parallel, for example when a script starts a large number of virtual servers after the host has been booted.
For large deployments, we recommend allowing for up to 200 requests to be processed in parallel, of which up to 10 per client connection. Edit /etc/libvirt/libvirtd.conf and use the following settings:
max_workers = 200
max_requests = 200
max_client_requests = 10
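Note that libvirtd has to be restarted for these settings to take effect; running virtual servers are not affected by the restart:

$ systemctl restart libvirtd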