Parallel task execution in Ansible

By Abhijit Menon-Sen <ams@toroid.org>

2015-11-12

At work, I have a playbook that uses the Ansible ec2 module to provision a number of EC2 instances. The task in question looks something like this:

- name: Set up EC2 instances
  ec2:
    region: "{{ item.region }}"
    instance_type: "{{ item.type }}"
    …
    wait: yes
  with_items: "{{ instances }}"
  register: ec2_instances

Later tasks use the instance ids and other provisioning data, so the task must wait for each instance to be created (hence “wait: yes”); but provisioning an instance can take a long time (up to several minutes for spot instances), so creating a 32-node cluster this way, one instance after another, is painfully slow. The obvious solution is to create the instances in parallel.

Ansible will, of course, dispatch tasks to multiple hosts in parallel, but in this case all the tasks must run against localhost. Besides, although each iteration of a loop is executed separately, it's not possible to dispatch them in parallel. Multiple hosts can be made to execute the entire loop in parallel, but it's not possible to hand off one iteration to one host and another to a different host in parallel.
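
To make the distinction concrete, here is a minimal sketch of a loop in a play that targets several hosts. The group name “workers” and the item values are made up for illustration and have nothing to do with my actual playbook.

```yaml
# Hypothetical play: "workers" is an assumed inventory group.
- hosts: workers
  tasks:
    - name: Every host runs every iteration of the loop
      debug:
        msg: "{{ inventory_hostname }} is handling {{ item }}"
      with_items:
        - one
        - two
        - three
```

Each host in the group runs all three iterations itself, in order, while the other hosts do the same in parallel (subject to the forks setting). Nothing here lets one host take “one” while another takes “two”.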

You can get close with “delegate_to: {{item}}”, but each iteration of the loop will be completed before the next is executed. (With Ansible 2, it's possible that a custom strategy plugin could dispatch delegated loop iterations in parallel, but the included free execution strategy doesn't work this way.) The solution is to use “fire-and-forget” asynchronous tasks and then wait for them to complete:

- name: Set up EC2 instances
  ec2:
    …
    wait: yes
  with_items: "{{ instances }}"
  register: ec2_instances
  async: 7200
  poll: 0

- name: Wait for instance creation to complete
  async_status: jid={{ item.ansible_job_id }}
  register: ec2_jobs
  until: ec2_jobs.finished
  retries: 300
  with_items: "{{ ec2_instances.results }}"

This moves on from each iteration immediately, without waiting for the instance to be created, and a separate task then uses async_status to wait for each job to finish. The 7200 (the async timeout, in seconds) and 300 (the number of retries; with the default delay of five seconds between checks, that is about 25 minutes per job) are arbitrary “longer than it could possibly take” choices. Note that we are polling the completion status one by one, so we'll start polling for the completion of iteration #2 only after #1 is complete, no matter how long either task takes. But in this case, since I have to wait for all of the tasks to complete anyway, it doesn't matter.
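
One consequence worth spelling out: once the tasks are asynchronous, ec2_instances.results holds only job ids, and the actual output of each ec2 invocation should turn up in ec2_jobs.results when its job finishes. Here is a minimal sketch of gathering the instance ids for later tasks. It assumes every job succeeded and that each result carries the ec2 module's instance_ids list; the fact name new_instance_ids is my own invention.

```yaml
# Assumption: each finished job's result includes the ec2 module's
# "instance_ids" list. "new_instance_ids" is a hypothetical fact name.
- name: Collect the ids of all newly created instances
  set_fact:
    # sum() with start=[] concatenates the per-job lists into one flat list
    new_instance_ids: "{{ ec2_jobs.results | sum(attribute='instance_ids', start=[]) }}"
```

Later tasks can then refer to new_instance_ids, much as they would have used the registered results of the synchronous version.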