Manage your OpenStack cloud with Ansible: Day two operations

Automate upgrades, backups, and scaling with Ansible playbooks.

Managing an application on OpenStack presents a host of challenges for the system administrator, and reducing complexity while producing consistent results is key to success. With Ansible, an agentless IT automation technology, a system administrator can write playbooks that do both.

OpenStack provides a rich API for managing resources, which has led to the creation of dozens of Ansible modules that fit easily into any automation workflow. Because Ansible can also automate tasks inside OpenStack instances, an operator can work both inside and out to coordinate complex operations against an environment.

"Day one" operations refer to tasks that are executed during the initial configuration and deployment of an environment. The index of OpenStack modules for Ansible lists many of the common modules used to complete tasks during day one. Creating all manner of resources, such as networks, volumes, and instances are covered. "Day two" deals with what happens next:

How will upgrades happen?
How are backups maintained?
How does the environment scale up with demand?

Ansible can easily handle these use cases.

For example, consider a cluster of web servers that need to be upgraded, all sitting behind an OpenStack load balancer. With the ability to manage both the infrastructure and tasks within the VMs themselves, an operator can ensure the sequence of events executed always happens in a particular order. Here is a simple example of a playbook to perform a rolling upgrade:

- hosts: web
  gather_facts: true
  user: centos
  serial: 1  # ensures only one server will update/reboot at a time
  tasks:
  - name: check for pending updates
    yum:
      list: updates
    register: yum_update # check if there are updates before going any further
  - block: 
      - name: remove web server from pool
        os_member:
          state: absent
          name: '{{ ansible_hostname }}'
          pool: weblb_80_pool
        delegate_to: localhost
      - name: update packages
        package:
          name: '*'
          state: latest
        become: true
      - name: reboot server
        shell: sleep 5 && reboot &
        async: 1
        poll: 0
        become: true  # rebooting requires root; the play connects as centos
      - name: wait for server
        wait_for_connection:
          connect_timeout: 20
          sleep: 5
          delay: 5
          timeout: 600
        become: true
      - name: put server back in pool
        os_member:
          state: present
          name: '{{ ansible_hostname }}'
          pool: weblb_80_pool
          address: '{{ ansible_default_ipv4.address }}'
          protocol_port: 80
        delegate_to: localhost
    when:
    - yum_update.results | length > 0 # only execute the block if there are updates

This playbook first checks to see whether there are any updates to apply. If so, the playbook removes the node from the pool, applies the updates, and reboots the node. Once the node is back online, it gets added back into the pool. The Ansible playbook uses the serial keyword to ensure only one node is removed from the pool at a time.
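The serial keyword is not limited to a single host at a time. It also accepts a percentage or a list of batch sizes, and it can be paired with max_fail_percentage to stop the rollout as soon as a batch goes wrong. The following is a minimal sketch of that idea, assuming the same web group as above; the batch sizes are illustrative choices, not values from the original playbook:

```yaml
- hosts: web
  user: centos
  serial:                   # illustrative batches: one canary host first,
    - 1                     # then a quarter of the group at a time
    - "25%"
  max_fail_percentage: 0    # abort the play if any host in a batch fails
  tasks:
    - name: check for pending updates
      yum:
        list: updates
      register: yum_update
    # the remove-from-pool, update, reboot, and re-add tasks would follow here
```

Larger batches finish sooner but take more capacity out of the load balancer pool at once, so the right size depends on how much headroom the cluster has.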

If a database is running in the OpenStack cloud, occasionally a backup will have to be restored—either to refresh some test data or perhaps in the event of a data corruption incident. Orchestrating tasks between the database server and Cinder is easily accomplished with Ansible:

- hosts: db
  gather_facts: true
  user: centos
  tasks:
  - name: stop database
    systemd:
      name: mongod
      state: stopped
    become: true
  - name: unmount db volume
    mount:
      path: /var/lib/mongodb
      state: unmounted
    become: true
  - name: detach volume from server
    os_server_volume: 
      state: absent
      server: db0
      volume: dbvol
    delegate_to: localhost
  - name: restore cinder backup
    command: openstack volume backup restore dbvol_backup dbvol
    delegate_to: localhost
    register: vol_restore
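    failed_when:  # list items are ANDed: fail only on a nonzero exit whose stderr does not mention 'VolumeBackupsRestore'
    - vol_restore.rc > 0
    - "'VolumeBackupsRestore' not in vol_restore.stderr"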
  - name: wait for restore to finish
    command: openstack volume show -c status -f value dbvol
    register: restore_progress
    until: restore_progress.stdout is search("available")
    retries: 60
    delay: 5
    delegate_to: localhost
  - name: reattach volume to server
    os_server_volume: 
      state: present
      server: db0
      volume: dbvol
      device: /dev/vdb
    delegate_to: localhost
  - name: mount db volume
    mount:
      path: /var/lib/mongodb
      state: mounted
      src: LABEL=dbvol
      fstype: xfs
    become: true
  - name: start database
    systemd:
      name: mongod
      state: started
    become: true

Looking closely at the playbook, you may have noticed that the restore is done via the OpenStack command line and not a proper Ansible module. In some cases, a module for a task might not exist, but Ansible is flexible enough to allow calling arbitrary commands within a playbook until a module is developed. Feel like you could write the missing module? Consider creating it by contributing to the Ansible project.
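One side effect of calling the command line is that Ansible cannot tell whether a command changed anything, so every command task is reported as changed. For read-only commands, the changed_when keyword can correct that. As a sketch, here is the status check from the playbook above with that one keyword added (the addition is a suggestion, not part of the original playbook):

```yaml
- name: wait for restore to finish
  command: openstack volume show -c status -f value dbvol
  register: restore_progress
  until: restore_progress.stdout is search("available")
  retries: 60
  delay: 5
  changed_when: false   # read-only status query, so never report a change
  delegate_to: localhost
```

Keeping the changed count honest makes it easier to see at the end of a run which tasks actually modified the environment.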

These are just a couple of day-two operations a system administrator may need to orchestrate in their cloud. Roger Lopez and I will offer a hands-on lab at OpenStack Summit in Berlin with real-world scenarios and associated Ansible playbooks to automate them. We'll also upload our examples and materials to GitHub the week of the conference for the benefit of anyone who can't attend.

Roger Lopez and David Critch will present Simplifying Day Two Operations with Ansible (A Hands-on Lab) at the OpenStack Summit, November 13-15 in Berlin.