Managing an application on OpenStack presents a host of challenges for the system administrator, and finding ways to reduce complexity and produce consistency is key to success. With Ansible, an agentless IT automation technology, a system administrator can write playbooks that do both.
OpenStack's rich API for managing resources has led to the creation of dozens of Ansible modules that fit into any automation workflow. Because Ansible can also automate tasks inside OpenStack instances, an operator can work both inside and out to coordinate complex operations against an environment.
"Day one" operations are the tasks executed during the initial configuration and deployment of an environment. The index of OpenStack modules for Ansible lists many of the common modules used to complete day-one tasks; creating all manner of resources, such as networks, volumes, and instances, is covered. "Day two" deals with what happens next:
- How will upgrades happen?
- How are backups maintained?
- How does the environment scale up with demand?
Ansible can easily handle these use cases.
For example, consider a cluster of web servers, all sitting behind an OpenStack load balancer, that need to be upgraded. Because Ansible can manage both the infrastructure and the tasks within the VMs themselves, an operator can ensure the steps always run in the same order. Here is a simple example of a playbook to perform a rolling upgrade:
- hosts: web
  gather_facts: true
  user: centos
  serial: 1  # ensures only one server will update/reboot at a time
  tasks:
    - name: check for pending updates
      yum:
        list: updates
      register: yum_update  # check if there are updates before going any further

    - block:
        - name: remove web server from pool
          os_member:
            state: absent
            name: '{{ ansible_hostname }}'
            pool: weblb_80_pool
          delegate_to: localhost

        - name: update packages
          package:
            name: '*'
            state: latest
          become: true

        - name: reboot server
          shell: sleep 5 && reboot &
          async: 1
          poll: 0
          become: true  # rebooting over SSH as the centos user requires root

        - name: wait for server
          wait_for_connection:
            connect_timeout: 20
            sleep: 5
            delay: 5
            timeout: 600
          become: true

        - name: put server back in pool
          os_member:
            state: present
            name: '{{ ansible_hostname }}'
            pool: weblb_80_pool
            address: '{{ ansible_default_ipv4.address }}'
            protocol_port: 80
          delegate_to: localhost
      when:
        - yum_update.results | length > 0  # only execute the block if there are updates
This playbook first checks to see whether there are any updates to apply. If so, the playbook removes the node from the pool, applies the updates, and reboots the node. Once the node is back online, it gets added back into the pool. The Ansible playbook uses the serial keyword to ensure only one node is removed from the pool at a time.
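For a larger pool, updating one server at a time may be too slow. As a rough sketch, `serial` also accepts a list of batch sizes, and `max_fail_percentage` can stop the run when a batch goes wrong. The batch sizes, the failure threshold, and the extra health check below are illustrative assumptions, not part of the playbook above; the health check would sit just before the "put server back in pool" task.

```yaml
- hosts: web
  user: centos
  serial:
    - 1        # first batch: a single canary server
    - "25%"    # every later batch: a quarter of the group (assumed value)
  max_fail_percentage: 0  # abort the play if any host in a batch fails
  tasks:
    # ... remove from pool, update, and reboot as in the playbook above ...

    - name: confirm the web server is listening again
      wait_for:
        port: 80      # checked on the managed host itself
        timeout: 60   # assumed upper bound for the service to come up
```

Starting with a single server limits the blast radius of a bad update, while the percentage batches keep the total run time reasonable once the first server has come back cleanly.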
If a database is running in the OpenStack cloud, a backup will occasionally have to be restored, either to refresh some test data or to recover from a data corruption incident. Orchestrating tasks between the database server and Cinder is easily accomplished with Ansible:
- hosts: db
  gather_facts: true
  user: centos
  tasks:
    - name: stop database
      systemd:
        name: mongod
        state: stopped
      become: true

    - name: unmount db volume
      mount:
        path: /var/lib/mongodb
        state: unmounted
      become: true

    - name: detach volume from server
      os_server_volume:
        state: absent
        server: db0
        volume: dbvol
      delegate_to: localhost

    - name: restore cinder backup
      command: openstack volume backup restore dbvol_backup dbvol
      delegate_to: localhost
      register: vol_restore
      failed_when:
        - vol_restore.rc > 0
        - "'VolumeBackupsRestore' not in vol_restore.stderr"

    - name: wait for restore to finish
      command: openstack volume show -c status -f value dbvol
      register: restore_progress
      until: restore_progress.stdout is search("available")
      retries: 60
      delay: 5
      delegate_to: localhost

    - name: reattach volume to server
      os_server_volume:
        state: present
        server: db0
        volume: dbvol
        device: /dev/vdb
      delegate_to: localhost

    - name: mount db volume
      mount:
        path: /var/lib/mongodb
        state: mounted
        src: LABEL=dbvol
        fstype: xfs
      become: true

    - name: start database
      systemd:
        name: mongod
        state: started
      become: true
Looking closely at the playbook, you may have noticed that the restore is done via the OpenStack command line and not a proper Ansible module. In some cases, a module for a task might not exist, but Ansible is flexible enough to allow calling arbitrary commands within a playbook until a module is developed. Feel like you could write the missing module? Consider creating it by contributing to the Ansible project.
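The same `command` pattern could cover the other half of the backup story: creating the backup that the restore play consumes. The following is a minimal sketch under stated assumptions. The names `dbvol` and `dbvol_backup` are taken from the restore playbook, `--force` is used because the volume is attached to a running server, and the retry counts are arbitrary. A production version would likely quiesce the database first.

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: create cinder backup of the attached volume
      command: openstack volume backup create --name dbvol_backup --force dbvol

    - name: wait for backup to finish
      command: openstack volume backup show -c status -f value dbvol_backup
      register: backup_progress
      until: backup_progress.stdout is search("available")
      retries: 60            # assumed: up to 5 minutes in total
      delay: 5
      changed_when: false    # read-only status check, never reports a change
```

If a backup with the same name already exists, the lookup by name may become ambiguous, so a real play would probably add a timestamp to the backup name.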
These are just a couple of day-two operations a system administrator may need to orchestrate in their cloud. Roger Lopez and I will offer a hands-on lab at OpenStack Summit in Berlin with real-world scenarios and associated Ansible playbooks to automate them. We'll also upload our examples and materials to GitHub the week of the conference for the benefit of anyone who can't attend.
Roger Lopez and David Critch will present Simplifying Day Two Operations with Ansible (A Hands-on Lab) at the OpenStack Summit, November 13-15 in Berlin.