In the preface to Part I of the High Availability post series, I mentioned that a truly highly available LBaaS Agent + HAProxy in namespace implementation should be able to recover automatically if the HAProxy process dies unexpectedly.

For that, we’ll need a watchdog to monitor and respawn the child processes. The watchdog in our case is the LBaaSv2 agent.

Overview

To achieve that, I integrated Neutron’s ProcessMonitor into the LBaaS HAProxy driver by enhancing the HaproxyNSDriver class to use the external_process module.
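To make the watchdog idea concrete, here is a minimal sketch of one monitoring pass. It is not Neutron’s ProcessMonitor code; the names SimpleProcessMonitor and MonitoredProcess are hypothetical, and the liveness check and respawn logic are passed in as plain callables so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MonitoredProcess:
    """Hypothetical stand-in for a managed child process such as haproxy."""
    is_alive: Callable[[], bool]   # returns True while the process is running
    respawn: Callable[[], None]    # starts the process again


class SimpleProcessMonitor:
    """Toy watchdog illustrating the idea; not Neutron's implementation."""

    def __init__(self, action: str = "respawn") -> None:
        self.action = action
        self._processes: Dict[str, MonitoredProcess] = {}

    def register(self, uuid: str, process: MonitoredProcess) -> None:
        self._processes[uuid] = process

    def unregister(self, uuid: str) -> None:
        self._processes.pop(uuid, None)

    def check_child_processes(self) -> List[str]:
        """Run one monitoring pass and return the ids found dead."""
        dead = [uuid for uuid, proc in self._processes.items()
                if not proc.is_alive()]
        for uuid in dead:
            if self.action == "respawn":
                self._processes[uuid].respawn()
        return dead
```

A real agent would call check_child_processes() periodically, once per configured interval, rather than on demand.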

Additionally, because haproxy handles reloads for configuration updates differently from most daemons (which typically reload on ‘HUP’), I added the option to supply custom reload callback methods. In a nutshell, upon a configuration update (new listener, pool member, etc.) the LBaaSv2 agent updates the haproxy configuration file and spawns a new haproxy process with the ‘-sf’ flag followed by the old haproxy process ID. The process ID(s) listed after the ‘-sf’ flag receive the “finish” signal (SIGUSR1), asking them to finish what they are doing and exit.
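The reload described above boils down to how the haproxy command line is assembled. The sketch below is only an illustration: build_haproxy_cmd is a hypothetical helper, not the driver’s actual code, and the paths in the usage note are made up. It relies only on haproxy’s ‘-f’ (configuration file), ‘-p’ (pid file) and ‘-sf’ flags.

```python
from typing import List, Sequence


def build_haproxy_cmd(conf_path: str, pid_path: str,
                      old_pids: Sequence[int] = ()) -> List[str]:
    """Build the haproxy argv for a fresh start or a soft reload.

    Hypothetical helper: with no old pids this is a plain start; with
    old pids, '-sf' asks those processes to finish and exit once the
    new process is up.
    """
    cmd = ["haproxy", "-f", conf_path, "-p", pid_path]
    if old_pids:
        cmd.append("-sf")
        cmd.extend(str(pid) for pid in old_pids)
    return cmd
```

For example, build_haproxy_cmd("/tmp/haproxy.conf", "/tmp/haproxy.pid", [1234]) yields a command ending in ‘-sf 1234’, while omitting the old pids yields the initial spawn command.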

Before the configuration update:

After the configuration update:

Configuration

Process monitoring is enabled by default starting with the Ocata release.

You may tweak a couple of related options in neutron.conf:

  • check_child_processes_interval: the interval between checks, 60 seconds by default. Set it to 0 to disable process monitoring altogether.
  • check_child_processes_action: the action taken when a child process dies, ‘respawn’ by default.
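As a rough illustration of how these two options behave, the sketch below parses a sample configuration fragment with Python’s configparser. It assumes the options sit in an [AGENT] section, as they do for other Neutron agents; verify that against your own deployment. load_monitor_options is a hypothetical helper, and the real agent reads its configuration through oslo.config rather than like this.

```python
import configparser

# Sample fragment; the [AGENT] section name is an assumption.
SAMPLE_CONF = """
[AGENT]
check_child_processes_interval = 60
check_child_processes_action = respawn
"""


def load_monitor_options(text: str) -> dict:
    """Parse the two options, falling back to the documented defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    interval = parser.getint(
        "AGENT", "check_child_processes_interval", fallback=60)
    action = parser.get(
        "AGENT", "check_child_processes_action", fallback="respawn")
    # An interval of 0 disables process monitoring altogether.
    return {"interval": interval, "action": action, "enabled": interval > 0}
```

Calling load_monitor_options(SAMPLE_CONF) reports monitoring as enabled with the default values, and changing the interval to 0 flips the enabled flag off.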

Demo

Setup

For this demo, I used a single devstack node running with the configuration found here.

The end result should look like this:

Now for the actual demo

  • Create a Loadbalancer

  • Create a listener so haproxy will get spawned

  • Kill the haproxy process

  • Inspect the LBaaSv2 agent log; you’ll notice that it detects the missing haproxy process and respawns it

Appendix

Patches involved: