OpenStack TripleO FFU Keystone Demo N to Q

This post will introduce a very rough demo of the new TripleO Fast-forward Upgrades (FFU) feature, warts and all, using an overcloud with only Keystone deployed. This should prove to be a useful starting point for anyone interested in this feature and could even be an approach used for future per-service FFU CI jobs.

Environment

I’m currently using the tripleo-quickstart project to deploy virtualised test environments. For this demo I’m using the following command line to create the demo environment:

$ bash quickstart.sh -w $WD -t all -R master-undercloud-newton-overcloud  \
   -c config/general_config/keystone-only.yml \
   -N config/nodes/1ctlr.yml $VIRTHOST

This is made possible by the following unmerged changes to tripleo-quickstart:

https://review.openstack.org/#/q/topic:keystone_only_overcloud

Once deployed you should find the 10.0.3 Newton version of Keystone deployed on overcloud-controller-0:

$ ssh -F $WD/ssh.config.ansible overcloud-controller-0
[..]
$ rpm -qi openstack-keystone
Name        : openstack-keystone
Epoch       : 1
Version     : 10.0.3
Release     : 0.20170726120406.bd49c3e.el7.centos
Architecture: noarch
Install Date: Fri 10 Nov 2017 04:24:46 AM UTC
Group       : Unspecified
Size        : 175014
License     : ASL 2.0
Signature   : (none)
Source RPM  : openstack-keystone-10.0.3-0.20170726120406.bd49c3e.el7.centos.src.rpm
Build Date  : Wed 26 Jul 2017 12:07:53 PM UTC
Build Host  : n30.pufty.ci.centos.org
Relocations : (not relocatable)
URL         : http://keystone.openstack.org/
Summary     : OpenStack Identity Service
Description :
Keystone is a Python implementation of the OpenStack
(http://www.openstack.org) identity service API.
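
To make the before and after comparison a little less manual, the package version can be mapped back to a release name. This is only a rough sketch: the keystone_release helper is my own invention for illustration, and it relies on Keystone's major version numbers (10 for Newton through to 13 for Queens).

```shell
# Map a Keystone version string to its OpenStack release name.
# keystone_release is a hypothetical helper, not part of any TripleO tooling.
keystone_release() {
    # Keep only the major version, e.g. "10.0.3" -> "10".
    case "${1%%.*}" in
        10) echo newton ;;
        11) echo ocata ;;
        12) echo pike ;;
        13) echo queens ;;
        *)  echo unknown; return 1 ;;
    esac
}
```

On the controller this could be fed with something like keystone_release "$(rpm -q --queryformat '%{VERSION}' openstack-keystone)".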

Before starting the upgrade I recommend taking snapshots of the undercloud and overcloud-controller-0 libvirt domains on the virthost:

$ ssh -F $WD/ssh.config.ansible virthost
$ for domain in $(virsh list | grep running | awk '{print $2 }'); do virsh snapshot-create-as ${domain} ${domain}_start ; done
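
If the upgrade goes badly, these snapshots can be used to roll the environment back. A minimal sketch, assuming the same <domain>_start naming used above. The snapshot_name and revert_domains helpers and the DRY_RUN variable are hypothetical names of my own, and the domain names to pass in are whatever virsh list reports on your virthost:

```shell
# Build the snapshot name used above for a given libvirt domain.
snapshot_name() {
    printf '%s_start\n' "$1"
}

# Revert each named domain to its <domain>_start snapshot.
# Set DRY_RUN=1 to only print the virsh commands instead of running them.
revert_domains() {
    local domain
    for domain in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "virsh snapshot-revert ${domain} $(snapshot_name "${domain}")"
        else
            virsh snapshot-revert "${domain}" "$(snapshot_name "${domain}")"
        fi
    done
}
```

Running DRY_RUN=1 revert_domains undercloud first is a cheap way to check the commands before anything is actually reverted.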

UC - docker_registry.yaml

As with a normal container based deployment on >=Pike we will need a Docker registry file mapping each service to a container image. The following command will create this file, pointing to the official RDO registry:

$ openstack overcloud container image prepare \
  --namespace trunk.registry.rdoproject.org/master \
  --tag tripleo-ci-testing \
  --output-env-file ~/docker_registry.yaml

Note that this will result in the container images being pulled from the remote RDO registry during the upgrade. We can pre-cache these images on the undercloud to speed the process up. However, as we are only using a single host and a minimal number of services in this demo, I have chosen to skip this for now.

UC - tripleo-heat-templates

FFU itself is controlled by an Ansible playbook using tasks that are contained within the tripleo-heat-templates (THT) project. The following Gerrit topic lists all of the current FFU changes up for review:

https://review.openstack.org/#/q/status:open+project:openstack/tripleo-heat-templates+branch:master+topic:bp/fast-forward-upgrades

For this demo we need to update the local copy of THT on the undercloud to include a subset of these changes:

$ cd /home/stack/tripleo-heat-templates
$ git fetch git://git.openstack.org/openstack/tripleo-heat-templates refs/changes/19/518719/2 && git checkout FETCH_HEAD

We also need the following noop-deploy-steps.yaml environment file that allows us to use openstack overcloud deploy to update the stack outputs of the overcloud without forcing an actual redeploy of any resources:

$ curl 'https://git.openstack.org/cgit/openstack/tripleo-heat-templates/plain/environments/noop-deploy-steps.yaml?h=refs/changes/97/520097/1' > environments/noop-deploy-steps.yaml

Finally, as we have deployed a custom set of services for the Controller role we now have to ensure that the Docker service is added to the role prior to our upgrade:

$ cat overcloud_services.yaml 
parameter_defaults:
  ControllerServices:
       - OS::TripleO::Services::Docker
       - OS::TripleO::Services::Kernel
       - OS::TripleO::Services::Keystone
       - OS::TripleO::Services::RabbitMQ
       - OS::TripleO::Services::MySQL
       - OS::TripleO::Services::HAproxy
       - OS::TripleO::Services::Keepalived
       - OS::TripleO::Services::Ntp
       - OS::TripleO::Services::Timezone
       - OS::TripleO::Services::TripleoPackages
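
As it is easy to forget this step, a quick check that the Docker service really is listed can save a failed run later. A minimal sketch, assuming the one-entry-per-line layout shown above. The has_service helper is a hypothetical name of my own:

```shell
# Return 0 if the given service is listed in the environment file.
# Assumes one "- OS::TripleO::Services::<Name>" entry per line, as above.
has_service() {
    local env_file=$1 service=$2
    grep -Eq "^[[:space:]]*-[[:space:]]*OS::TripleO::Services::${service}[[:space:]]*\$" "${env_file}"
}
```

For example, has_service overcloud_services.yaml Docker || echo "Docker service missing" would flag a file that still needs updating.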

OC - Ocata heat-agents

An older os-apply-config hiera hook and any legacy hiera data need to be removed from the overcloud prior to our upgrade. The following ML post has more details on this workaround:

http://lists.openstack.org/pipermail/openstack-dev/2017-January/110922.html

For the time being this isn’t part of the upgrade playbook and so we need to run the following commands that will update the heat-agents on the host to their Ocata versions and remove the legacy data:

$ sudo rm -f /usr/libexec/os-apply-config/templates/etc/puppet/hiera.yaml /usr/libexec/os-refresh-config/configure.d/40-hiera-datafiles /etc/puppet/hieradata/*.yaml
$ sudo yum install -y \
https://trunk.rdoproject.org/centos7-ocata/current-tripleo/openstack-heat-agents-1.0.1-0.20170412210405.769d0de.el7.centos.noarch.rpm \
https://trunk.rdoproject.org/centos7-ocata/current-tripleo/python-heat-agent-1.0.1-0.20170412210405.769d0de.el7.centos.noarch.rpm \
https://trunk.rdoproject.org/centos7-ocata/current-tripleo/python-heat-agent-ansible-1.0.1-0.20170412210405.769d0de.el7.centos.noarch.rpm \
https://trunk.rdoproject.org/centos7-ocata/current-tripleo/python-heat-agent-apply-config-1.0.1-0.20170412210405.769d0de.el7.centos.noarch.rpm \
https://trunk.rdoproject.org/centos7-ocata/current-tripleo/python-heat-agent-docker-cmd-1.0.1-0.20170412210405.769d0de.el7.centos.noarch.rpm \
https://trunk.rdoproject.org/centos7-ocata/current-tripleo/python-heat-agent-hiera-1.0.1-0.20170412210405.769d0de.el7.centos.noarch.rpm \
https://trunk.rdoproject.org/centos7-ocata/current-tripleo/python-heat-agent-json-file-1.0.1-0.20170412210405.769d0de.el7.centos.noarch.rpm \
https://trunk.rdoproject.org/centos7-ocata/current-tripleo/python-heat-agent-puppet-1.0.1-0.20170412210405.769d0de.el7.centos.noarch.rpm

OC - Remove ceilometer

At present there is a packaging issue when upgrading the openstack-ceilometer packages directly from Newton to Queens. As these packages are installed by default in the Newton overcloud-full image used to deploy the environment, but are not used in our demo, we can simply remove them for the time being:

$ sudo yum remove -y 'openstack-ceilometer*'

UC - Update stack outputs

We can now use the openstack overcloud deploy command to update the overcloud stack and generate the new stack outputs, including the FFU playbook. To do this we simply add the previously created docker_registry.yaml and environments/noop-deploy-steps.yaml environment files, along with environments/docker.yaml from THT, to the original command used to deploy the environment:

$ . stackrc
$ openstack overcloud deploy \
  --templates /home/stack/tripleo-heat-templates \
[..]
  -e /home/stack/docker_registry.yaml \
  -e /home/stack/tripleo-heat-templates/environments/docker.yaml \
  -e /home/stack/tripleo-heat-templates/environments/noop-deploy-steps.yaml

The original command is logged under ~/overcloud_deploy.log on the undercloud, for example:

$ grep openstack\ overcloud\ deploy overcloud_deploy.log 
2017-11-16 14:36:11 | + openstack overcloud deploy --templates /home/stack/tripleo-heat-templates --libvirt-type qemu --control-flavor oooq_control --compute-flavor oooq_compute --ceph-storage-flavor oooq_ceph --block-storage-flavor oooq_blockstorage --swift-storage-flavor oooq_objectstorage --timeout 90 -e /home/stack/cloud-names.yaml -e /home/stack/tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e /home/stack/network-environment.yaml -e /home/stack/tripleo-heat-templates/environments/low-memory-usage.yaml --validation-warnings-fatal -e /home/stack/overcloud_services.yaml --compute-scale 0 --ntp-server pool.ntp.org
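
Rather than copying and pasting by hand, the original command can be pulled out of the log and the extra environment files appended. This is a rough sketch only: it assumes the "<timestamp> | + " prefix seen in the log line above, and the original_deploy_command and ffu_deploy_command helper names are hypothetical, made up for this example.

```shell
# Extract the most recent "openstack overcloud deploy" invocation from the
# log, dropping the "<timestamp> | + " prefix shown above.
original_deploy_command() {
    sed -n 's/^.*| + \(openstack overcloud deploy .*\)$/\1/p' "$1" | tail -n 1
}

# Print the original command with the three extra environment files appended.
# Nothing is executed; the result is only written to stdout for review.
ffu_deploy_command() {
    local original
    original=$(original_deploy_command "$1") || return 1
    [ -n "${original}" ] || return 1
    printf '%s \\\n  -e %s \\\n  -e %s \\\n  -e %s\n' "${original}" \
        /home/stack/docker_registry.yaml \
        /home/stack/tripleo-heat-templates/environments/docker.yaml \
        /home/stack/tripleo-heat-templates/environments/noop-deploy-steps.yaml
}
```

Calling ffu_deploy_command ~/overcloud_deploy.log prints the full command so it can be checked before being run.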

UC - Download config

Now that the stack outputs have been updated we can download the overcloud config containing the FFU playbook onto the undercloud:

$ openstack overcloud config download

There is a known issue with the generated upgrade tasks at the moment where the ordering of conditionals causes Ansible to fail. To work around this, simply edit the following Ansible tasks within the Controller/upgrade_tasks.yaml file so that the step conditional is always checked first, for example:

- block:
  - name: Upgrade os-net-config
    yum: name=os-net-config state=latest
  - changed_when: os_net_config_upgrade.rc == 2
    command: os-net-config --no-activate -c /etc/os-net-config/config.json -v --detailed-exit-codes
    failed_when: os_net_config_upgrade.rc not in [0,2]
    name: take new os-net-config parameters into account now
    register: os_net_config_upgrade
  tags: step3
  when:
  - step|int == 3
  - not os_net_config_need_upgrade.stdout and os_net_config_has_config.rc == 0
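
As I understand it, the order matters because Ansible evaluates the entries of a when list in sequence and stops at the first one that is false, so checking the step first avoids touching registered variables that may not exist yet at other steps. To find any other tasks with the same problem, something like the following rough check could be used. It assumes the layout shown above (when: on its own line followed by one "- condition" entry per line), and check_step_first is a hypothetical helper name of my own:

```shell
# Print "<line number>: <line>" for every when: list whose first entry is
# not a step conditional, and return non-zero if any were found.
check_step_first() {
    awk '
        # The previous line was a bare "when:" and this entry is not the step check.
        prev_when && $0 !~ /^[ \t]*-[ \t]*step[|]int/ { print NR ": " $0; bad = 1 }
        # Remember whether this line opens a when: list.
        { prev_when = ($0 ~ /^[ \t]*when:[ \t]*$/) }
        END { exit bad + 0 }
    ' "$1"
}
```

For example, check_step_first Controller/upgrade_tasks.yaml lists the entries that still need reordering. Inline conditions written as when: <expression> on a single line are ignored by this check.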

UC - Run playbook

With the config present on the undercloud we can finally start the fast-forward upgrade using the following command line:

$ ansible-playbook -i /usr/bin/tripleo-ansible-inventory \
    /home/stack/tmp/fast_forward_upgrade_playbook.yaml

OC - Verification

Once the fast-forward upgrade is complete we can verify that Keystone is functional in the overcloud with a few simple commands:

$ ssh -F $WD/ssh.config.ansible undercloud
$ . overcloudrc
$ openstack endpoint list
+----------------------------------+-----------+--------------+--------------+---------+-----------+----------------------------+
| ID                               | Region    | Service Name | Service Type | Enabled | Interface | URL                        |
+----------------------------------+-----------+--------------+--------------+---------+-----------+----------------------------+
| 15fd404ff8c14971b4251b81624edab8 | regionOne | keystone     | identity     | True    | admin     | http://192.168.24.10:35357 |
| 2e513f5fdfc140ec916b081b47a2b8f7 | regionOne | keystone     | identity     | True    | internal  | http://172.16.2.12:5000    |
| 96980f0f9ac44c718c038ef54af814bc | regionOne | keystone     | identity     | True    | public    | http://10.0.0.8:5000       |
+----------------------------------+-----------+--------------+--------------+---------+-----------+----------------------------+
$ openstack service list
+----------------------------------+------------+----------+
| ID                               | Name       | Type     |
+----------------------------------+------------+----------+
| 3fc546421e9048f39b2b847b13fa8ea5 | keystone   | identity |
| 7f819190dc6f44d8b995021277b24d67 | ceilometer | metering |
+----------------------------------+------------+----------+

We can also log into the overcloud-controller-0 host and verify that the relevant containers are running:

$ ssh -F $WD/ssh.config.ansible overcloud-controller-0
$ sudo docker ps
CONTAINER ID        IMAGE                                                                COMMAND                  CREATED              STATUS                          PORTS               NAMES
4f40f0cf98aa        192.168.24.1:8787/master/centos-binary-keystone:tripleo-ci-testing   "/bin/bash -c '/usr/l"   About a minute ago   Up About a minute                                   keystone_cron
0b9d5cc17f5d        192.168.24.1:8787/master/centos-binary-keystone:tripleo-ci-testing   "kolla_start"            About a minute ago   Up About a minute (healthy)                         keystone
db967d899aaf        192.168.24.1:8787/master/centos-binary-mariadb:tripleo-ci-testing    "kolla_start"            About a minute ago   Up About a minute (unhealthy)                       mysql
1f0b9aa72ec7        192.168.24.1:8787/master/centos-binary-rabbitmq:tripleo-ci-testing   "kolla_start"            2 minutes ago        Restarting (1) 29 seconds ago                       rabbitmq
8e689f5bac22        192.168.24.1:8787/master/centos-binary-haproxy:tripleo-ci-testing    "kolla_start"            2 minutes ago        Up 2 minutes                                        haproxy
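
With more than a handful of containers it becomes tedious to read the STATUS column by eye. A minimal sketch of a filter that picks out containers needing attention. The unhealthy_containers helper is a hypothetical name of my own and it expects "<name>|<status>" lines on stdin:

```shell
# Read "<name>|<status>" lines on stdin and print the names of containers
# that are restarting or reporting an unhealthy status.
unhealthy_containers() {
    awk -F '|' '$2 ~ /unhealthy|Restarting/ { print $1 }'
}
```

It can be fed with sudo docker ps --format '{{.Names}}|{{.Status}}' | unhealthy_containers. Against the listing above it would flag mysql and rabbitmq, which is part of the warts and all nature of this demo.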

As I said at the start, this is a very rough demo that we can hopefully clean up and iterate on quickly over the coming weeks. The current goal is to have another working demo available by M2 that covers all of the services required to upgrade the computes, so we can also start verification of the data plane during the upgrade.
