Parallel task execution in Ansible
At work, I have a playbook that uses the Ansible ec2 module to provision a number of EC2 instances. The task in question looks something like this:
    - name: Set up EC2 instances
      ec2:
        region: "{{ item.region }}"
        instance_type: "{{ item.type }}"
        …
        wait: yes
      with_items: instances
      register: ec2_instances
Later tasks use the instance ids and other provisioning data, so the task has to wait for each instance to come up before it returns (hence wait: yes). But provisioning an instance can take a long time (up to several minutes for spot instances), so creating a 32-node cluster one instance at a time is painfully slow. The obvious solution is to create the instances in parallel.
Ansible will, of course, dispatch tasks to multiple hosts in parallel, but in this case all the tasks must run against localhost. Besides, although each iteration of a loop is executed separately, it's not possible to dispatch them in parallel. Multiple hosts can be made to execute the entire loop in parallel, but it's not possible to hand off one iteration to one host and another to a different host in parallel.
You can get close with “delegate_to: {{item}}”, but each step of the loop will be completed before the next is executed (with Ansible 2, it's possible that a custom strategy plugin could dispatch delegated loop iterations in parallel, but the included free execution strategy doesn't work this way). The solution is to use “fire-and-forget” asynchronous tasks and wait for them to complete:
    - name: Set up EC2 instances
      ec2:
        …
        wait: yes
      with_items: instances
      register: ec2_instances
      async: 7200
      poll: 0

    - name: Wait for instance creation to complete
      async_status: jid={{ item.ansible_job_id }}
      register: ec2_jobs
      until: ec2_jobs.finished
      retries: 300
      with_items: ec2_instances.results
This will move on immediately from each iteration without waiting for the task to complete, and separately wait for the tasks to complete using async_status. The 7200 and 300 are arbitrary “longer than it could possibly take” choices. Note that we are polling the completion status one by one, so we'll start polling for the completion of iteration #2 only after #1 is complete, no matter how long either task takes. But in this case, since I have to wait for all of the tasks to complete anyway, it doesn't matter.
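One consequence worth spelling out: with the asynchronous version, ec2_instances only holds the job ids, so the provisioning data that later tasks need should be read from ec2_jobs.results instead. As a rough sketch, assuming each finished job result carries the ec2 module's usual instance_ids list, and using new_instance_ids as a made-up variable name, the ids could be gathered like this:

```yaml
# Sketch only: assumes every entry in ec2_jobs.results includes the
# ec2 module's `instance_ids` list; `new_instance_ids` is a hypothetical name.
- name: Collect the ids of all newly created instances
  set_fact:
    # map() pulls out one list of ids per job; sum(start=[]) joins them
    # into a single flat list.
    new_instance_ids: "{{ ec2_jobs.results | map(attribute='instance_ids') | sum(start=[]) }}"

- name: Show what was created
  debug:
    var: new_instance_ids
```

Flattening into one list here keeps the later tasks from having to know that the instances were created by separate jobs.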