multi-clone.py: Multi-threaded cloning of a template to multiple VMs

February 3, 2014 in Development, Virtualization

In December, VMware released pyVmomi, a Python SDK for the vSphere API. In the past I created a script to clone virtual machines using pySphere. That script has helped me (and others, if the mails I received are any indication), but I haven’t updated or improved it in a while.

So I decided this was a good moment to revisit the script and recreate it completely using pyVmomi. And not just recreate it, but improve a lot on it as well. The result is an all new multi-clone.py script which allows for the following list of capabilities:

Deploy a specified amount of virtual machines
Deploy in a specified folder
Deploy in a specified resource pool
Specify if the cloned virtual machines need to be powered on
Print information about the main network interface (MAC and IP address, either IPv4 or IPv6)
Run a post-processing script with 3 parameters (virtual machine name, MAC address and IP address)
Print logging to a log file or stdout
Do this in a threaded way
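
To give an idea of what a post-processing script could look like, here is a minimal sketch. It assumes only what is stated above: the script is called with three arguments (virtual machine name, MAC address and IP address). The function names and the output file deployed-vms.csv are hypothetical and not part of multi-clone.py.

```python
#!/usr/bin/env python
"""Hypothetical post-processing script: logs every deployed VM to a text file."""
import sys


def format_record(name, mac, ip):
    """Build one comma-separated line for a deployed virtual machine."""
    return "{0},{1},{2}".format(name, mac, ip)


def append_record(path, name, mac, ip):
    """Append the record to a file; the path is an assumption of this sketch."""
    with open(path, "a") as handle:
        handle.write(format_record(name, mac, ip) + "\n")


# Only act when exactly three arguments (name, mac, ip) were passed in.
if __name__ == "__main__" and len(sys.argv) == 4:
    append_record("deployed-vms.csv", *sys.argv[1:4])
```

Such a script would be passed to multi-clone.py with the -s option.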

Threads

The previous version only worked sequentially: a clone task had to complete, the network information had to be gathered and post-processing had to finish before the next deployment could start. Combined with pySphere, which could be rather slow at gathering information, this made the old script sluggish.

Using pyVmomi improved the speed of the script from the start: it’s a lot quicker at gathering information and executing commands. Adding the option to use threads improves the speed even more, although how much will depend on the IOPS of your datastore.

If you specify the number of threads, the creation of the virtual machines and the gathering of the MAC and IP information are threaded separately, each in its own pool. This is best explained with an example:

Imagine you want to create 4 virtual machines from a template and you set the number of threads to 2. At the start, two clone tasks begin at about the same time. When the first clone task is done, a third clone task is started in a new thread to replace the one that finished. At the same time, a first information gathering (MAC, IP) and post-processing thread is started for the clone that has just finished. At this point two clone threads and one information gathering and post-processing thread are running. Once the second clone task is finished, the fourth clone task starts, so two clone threads and two information gathering and post-processing threads are running. If those two information gathering and post-processing threads aren’t finished when the two final clone tasks finish, two new information gathering and post-processing jobs are put in the queue but not started, as only two threads can be running in that pool.
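
The example above can be sketched with two thread pools. This is not the code of multi-clone.py itself, just an illustration of the idea using concurrent.futures from the standard library. The functions clone_vm and gather_and_post_process are hypothetical stand-ins that only sleep; a real version would talk to vCenter.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def clone_vm(name):
    """Stand-in for the clone task; a real one would call the vSphere API."""
    time.sleep(0.05)  # simulated disk copy
    return name


def gather_and_post_process(name):
    """Stand-in for collecting MAC/IP and running the post-script."""
    time.sleep(0.02)  # simulated wait for the guest to report its address
    return (name, "mac-of-" + name, "ip-of-" + name)


def deploy(names, threads):
    """Run clones in one pool and hand each finished clone to a second pool."""
    with ThreadPoolExecutor(max_workers=threads) as clone_pool, \
            ThreadPoolExecutor(max_workers=threads) as info_pool:
        info_futures = []

        def clone_then_queue(name):
            cloned = clone_vm(name)
            # Queued right away; it runs once an info_pool thread is free.
            info_futures.append(
                info_pool.submit(gather_and_post_process, cloned))

        # Consuming the map result waits until every clone has finished.
        list(clone_pool.map(clone_then_queue, names))
        # Collecting the results waits for the information gathering jobs.
        return sorted(future.result() for future in info_futures)


if __name__ == "__main__":
    for record in deploy(["vm-1", "vm-2", "vm-3", "vm-4"], threads=2):
        print(record)
```

Each pool has its own limit, so at most two clones and two information gathering jobs run at once, and anything beyond that waits in the queue of its pool.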

Deciding how many threads work best for your case will require a bit of experimenting and depends greatly on the IOPS of your datastore. The clone task takes the most time because it needs to copy the virtual disks to the newly created virtual machine.

Remember that vCenter, by default, will queue any clone tasks if more than 8 are started. So setting the number of threads above 8 won’t really help, although you are always able to do so.
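
As a rough back-of-the-envelope helper, the effect of that limit can be expressed in a few lines. This is only a sketch: it assumes the default limit of 8 mentioned above and that all clone tasks take about equally long, which a real datastore will not guarantee. The function name clone_waves is made up for this illustration.

```python
import math

VCENTER_DEFAULT_CLONE_LIMIT = 8  # the default limit mentioned in the text


def clone_waves(amount, threads, limit=VCENTER_DEFAULT_CLONE_LIMIT):
    """Return (clones really running at once, rounds needed for all clones)."""
    effective = max(1, min(threads, limit))
    return effective, math.ceil(amount / effective)


if __name__ == "__main__":
    # 20 VMs with 16 threads: only 8 clones run at once, in 3 rounds.
    print(clone_waves(20, 16))
```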

Usage

Here is the output of the -h option, which provides an overview of the possible arguments and what they do:

usage: multi-clone.py [-h] [-6] -b BASENAME [-c COUNT] [-d] [-f FOLDER] -H
                      HOST [-i] [-m] [-l LOGFILE] [-n AMOUNT] [-o PORT]
                      [-p PASSWORD] [-P] [-r RESOURCE_POOL] [-s POST_SCRIPT]
                      -t TEMPLATE [-T THREADS] -u USERNAME [-v] [-w MAXWAIT]

Deploy a template into multiple VM's. You can get information returned with
the name of the virtual machine created and it's main mac and ip address.
Either in IPv4 or IPv6 format. You can specify which folder and/or resource
pool the clone should be placed in. Verbose and debug output can either be
send to stdout, or saved to a log file. A post-script can be specified for
post-processing. And it can all be done in a number of parallel threads you
specify.

optional arguments:
  -h, --help            show this help message and exit
  -6, --six             Get IPv6 address for VMs instead of IPv4
  -b BASENAME, --basename BASENAME
                        Basename of the newly deployed VMs
  -c COUNT, --count COUNT
                        Starting count, the name of the first VM deployed will
                        be <basename>-<count>, the second will be
                        <basename>-<count+1> (default = 1)
  -d, --debug           Enable debug output
  -f FOLDER, --folder FOLDER
                        The folder in which the new VMs should reside (default
                        = same folder as source virtual machine)
  -H HOST, --host HOST  The vCenter or ESXi host to connect to
  -i, --print-ips       Enable IP output
  -m, --print-macs      Enable MAC output
  -l LOGFILE, --log-file LOGFILE
                        File to log to (default = stdout)
  -n AMOUNT, --number AMOUNT
                        Amount of VMs to deploy (default = 1)
  -o PORT, --port PORT  Server port to connect to (default = 443)
  -p PASSWORD, --password PASSWORD
                        The password with which to connect to the host. If not
                        specified, the user is prompted at runtime for a
                        password
  -P, --disable-power-on
                        Disable power on of cloned VMs
  -r RESOURCE_POOL, --resource-pool RESOURCE_POOL
                        The resource pool in which the new VMs should reside,
                        (default = Resources, the root resource pool)
  -s POST_SCRIPT, --post-script POST_SCRIPT
                        Script to be called after each VM is created and
                        booted. Arguments passed: name mac-address ip-address
  -t TEMPLATE, --template TEMPLATE
                        Template to deploy
  -T THREADS, --threads THREADS
                        Amount of threads to use. Choose the amount of threads
                        with the speed of your datastore in mind, each thread
                        starts the creation of a virtual machine. (default =
                        1)
  -u USERNAME, --user USERNAME
                        The username with which to connect to the host
  -v, --verbose         Enable verbose output
  -w MAXWAIT, --wait-max MAXWAIT
                        Maximum amount of seconds to wait when gathering
                        information (default = 120)
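
To illustrate how the -b, -c and -n options combine, the naming scheme described in the help text (first VM is <basename>-<count>, the second <basename>-<count+1>) can be sketched as follows. The function vm_names is a hypothetical helper for this example, not a function taken from the script.

```python
def vm_names(basename, count=1, amount=1):
    """List the VM names for a basename, starting count and amount of VMs."""
    return ["{0}-{1}".format(basename, number)
            for number in range(count, count + amount)]


if __name__ == "__main__":
    # Equivalent to running with: -b web -c 5 -n 3
    print(vm_names("web", count=5, amount=3))
```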

Issues and feature requests

Feel free to use the GitHub issue tracker of the repository to post issues and feature requests.

Documentation

You can find all the documentation on GitHub.

Tags: python, pyvmomi, vmware, vsphere
