I use KVM on my production servers, and running it raises some critical questions. How to back up running virtual machines is probably the most common one among KVM users. Although many other solutions have been described elsewhere, this article discusses a more defensive method. Before getting into the details of backing up VMs, I will first describe the building blocks this method uses.
KVM
KVM is the hypervisor. It supports live snapshots, which are crucial for the backup process.
Snapshots
A snapshot of a virtual machine holds the state of the virtual machine at the time the snapshot is taken. This state mainly concerns the hard drive, but in the case of a live snapshot the VM's registers, memory, and kernel buffers are also stored in the snapshot.
There are two types of snapshots in KVM. An internal snapshot is stored in the virtual machine's disk file: for instance, if the VM's disk is named centos.qcow2, the snapshot is saved within centos.qcow2. An external snapshot is not saved in the VM's disk file (centos.qcow2); it is saved to the location specified in the snapshot command.
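The difference between the two snapshot types can be sketched as the command lines one would hand to libvirt's `virsh snapshot-create-as`. This is a minimal sketch under assumptions that are not in the text above: the domain name `centos`, the disk target `vda`, and the overlay path are hypothetical, and the helper `snapshot_command` only builds the argument list and does not run it.

```python
from typing import List, Optional


def snapshot_command(domain: str, name: str,
                     external_file: Optional[str] = None,
                     disk_target: str = "vda") -> List[str]:
    """Build a virsh command line for an internal or an external snapshot.

    domain, name, disk_target and external_file are illustrative values.
    """
    cmd = ["virsh", "snapshot-create-as", domain, name]
    if external_file is not None:
        # External snapshot: later writes go to a separate overlay file
        # instead of the VM's original disk image.
        diskspec = f"{disk_target},snapshot=external,file={external_file}"
        cmd += ["--disk-only", "--diskspec", diskspec]
    # Without --disk-only/--diskspec, libvirt stores the snapshot inside
    # the qcow2 disk file itself (internal snapshot).
    return cmd
```

The returned list could be passed to `subprocess.run(cmd, check=True)` on a host where libvirt is installed.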
Both internal and external snapshots support online (live) and offline (powered-off) backups.
Offline Backups:
When you plan to take offline backups, you do not need to take snapshots, but they can make the backup process more efficient.
An offline backup can be taken by copying the qcow2 file (or a disk image in any other format) to the backup destination. Let's walk through the process.
time (hours) | snapshot-name | note(s) |
0 | current | install vm; update vm; install httpd |
1 | backup1 | install mysql |
2 | backup2 | install oracle-java |
3 | backup3 | mistakenly fdisk / partition |
The table shows that at time zero we installed the operating system on the VM, updated its packages, and then installed httpd.
One hour later we took a backup. Since this is an offline backup, the VM is turned off, so no writes are taking place to the VM's backing store (qcow2 or any other format). A simple copy command is therefore enough to take the backup. The usual backup rule still applies: the backup should be stored on a different storage device, server, or location.
With a plain copy, however, every backup produces a huge file. To reduce the size of the backups, we use two other tools: rsync and internal snapshots of the VM. rsync transfers only the differences between the remote (backup) copy and the actual file.
We can go further with internal snapshots. With an internal snapshot, the base is locked against writes, and all writes after the snapshot go to the snapshot, which is contained in the same backing store (qcow2). This improves rsync's performance and decreases the size of each new backup. Without snapshots, the VM can write to any part of the base image, so rsync's delta is larger and takes longer to compute. With the base locked by a snapshot, the base never changes and only the snapshot data differs.
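The offline copy step described above can be sketched as follows. This is a hedged sketch, not a definitive implementation: the helper names, the domain name, and the paths are hypothetical, the `--inplace` rsync option is my own assumption (it updates the existing backup file instead of writing a fresh copy), and the state check assumes `virsh domstate` prints its English output `shut off`.

```python
import subprocess
from typing import Callable, List


def is_shut_off(domstate_output: str) -> bool:
    """Interpret the output of `virsh domstate` (English locale assumed)."""
    return domstate_output.strip() == "shut off"


def rsync_command(disk_path: str, destination: str) -> List[str]:
    """Build an rsync call that sends only the changed parts of the disk."""
    return ["rsync", "-a", "--inplace", disk_path, destination]


def backup_offline(domain: str, disk_path: str, destination: str,
                   run: Callable = subprocess.run) -> List[str]:
    """Refuse to copy a running VM, then rsync its disk to the destination."""
    state = run(["virsh", "domstate", domain],
                capture_output=True, text=True, check=True)
    if not is_shut_off(state.stdout):
        raise RuntimeError(f"{domain} is not shut off; offline backup aborted")
    cmd = rsync_command(disk_path, destination)
    run(cmd, check=True)
    return cmd
```

A call such as `backup_offline("centos", "/var/lib/libvirt/images/centos.qcow2", "backuphost:/srv/backups/")` would then copy the disk only when the VM is powered off. The `run` parameter exists so that the command runner can be swapped out for a dry run.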
So rather than just copying the backing store, taking an internal snapshot and then running rsync to the backup location is the more effective way to take a backup. After a while, however, too many snapshots will accumulate. We recommend running blockcommit on both the original VM and the backups to shorten a long snapshot chain.
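The snapshot-then-rsync order can be sketched as a small plan builder. This is only an illustration under stated assumptions: the `backupN` naming follows the table above, the helper names and paths are hypothetical, and `qemu-img snapshot -c` is used here as one way to create an internal snapshot in a qcow2 file while the VM is powered off.

```python
import re
from typing import Iterable, List


def next_snapshot_name(existing: Iterable[str], prefix: str = "backup") -> str:
    """Return the prefix followed by the highest existing index plus one."""
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    indexes = [int(m.group(1)) for m in map(pattern.match, existing) if m]
    return f"{prefix}{max(indexes, default=0) + 1}"


def snapshot_then_rsync(disk_path: str, destination: str,
                        existing: Iterable[str]) -> List[List[str]]:
    """Return the two commands in the order they should run."""
    name = next_snapshot_name(existing)
    return [
        # 1. Internal snapshot inside the qcow2 file (the VM must be shut off).
        ["qemu-img", "snapshot", "-c", name, disk_path],
        # 2. Transfer only what changed since the previous backup.
        ["rsync", "-a", "--inplace", disk_path, destination],
    ]
```

With the snapshots `current`, `backup1`, and `backup2` already present, the plan would create `backup3` and then synchronize the disk file.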
An offline external snapshot backup is no different from the internal snapshot case: the same rsync, snapshot, and blockcommit tools are required to take backups effectively.
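For an external chain on a powered-off VM, the merge step might look like the sketch below. It rests on an assumption of mine rather than on the text: `virsh blockcommit` runs as a block job on a running domain, so for the offline case this sketch substitutes `qemu-img commit`, which merges an overlay into its backing file. The helper names and the overlay path are hypothetical.

```python
from typing import List


def chain_info_command(overlay_path: str) -> List[str]:
    """Show every image in the backing chain, from the overlay down to the base."""
    return ["qemu-img", "info", "--backing-chain", overlay_path]


def offline_commit_command(overlay_path: str) -> List[str]:
    """Merge the overlay's changes into its backing file (VM must be shut off)."""
    return ["qemu-img", "commit", overlay_path]
```

After such a commit, the VM definition would still reference the overlay, so it would need to point at the base image again before the overlay is removed.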
That is all for this article. We’ll discuss external online snapshots in our next article.