At PayPal, we spent more than a year upgrading all our OpenStack deployments from Folsom/Grizzly to Havana. In this talk, we would like to share the lessons we learned the hard way from that upgrade, and how we plan to change the process starting with the Kilo upgrade. Our final goal is to bring PayPal's OpenStack cloud up to date with upstream and to integrate with upstream continuously.
The previous upgrade process was labor-intensive, painful, and risky. Here is why:
- PayPal has one of the largest OpenStack clouds in the enterprise serving production workloads. The upgrade must not interrupt any of the business applications running in the cloud.
- We had a mixed environment with Folsom and Grizzly deployments, and mixed networking platforms with nova-network and neutron. There was no standard way to upgrade from nova-network to neutron without interrupting running VMs.
- We had a lot of custom changes for PayPal’s specific use cases. We also backported some bug fixes from upstream. This made our code base diverge from upstream. For each new OpenStack release, we had to manually merge, modify, or drop our own changes one by one against the upstream code and run thorough tests against the merged code.
To meet the above requirements, we developed a special side project called ‘upgrade-test’, which contains all the custom migration code and testing code. The upgrade plan was carefully designed, and a few dry runs were executed before the real upgrade.
In order to catch up with upstream releases, we decided to skip Icehouse and Juno and upgrade directly from Havana to Kilo. To make sure future upgrades will be smooth, without code merging issues, we made the following changes:
- Refactor our custom changes into a separate project to avoid the dilemma between backporting and upgrading. For example, the nova project in PayPal’s GitHub is a fork of the upstream nova project. All PayPal-specific changes go into another project called “nova_pypl”. The build script combines the two projects and generates a single nova tarball.
- Make no ad-hoc changes to the upstream code. Any custom feature should be implemented through standard customization paths: Paste middleware, API extensions, scheduler filters, and class extension.
- Actively contribute to the community. Report and fix bugs in the community, and continuously sync the changes back to our forked repository.
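To illustrate the "standard customization paths" idea from the list above, here is a minimal sketch of a WSGI middleware exposed through a Paste Deploy style filter factory. It only shows the general pattern of adding behavior around an unmodified application; it is not the actual PayPal code. The names RequestTagMiddleware, X-Example-Tag, upstream_app, and call_app are hypothetical and chosen purely for illustration.

```python
# Hypothetical example: RequestTagMiddleware, the X-Example-Tag header,
# upstream_app and call_app are illustrative names, not from any real project.


class RequestTagMiddleware:
    """WSGI middleware that adds a response header without editing the wrapped app."""

    def __init__(self, app, tag="custom"):
        self.app = app
        self.tag = tag

    def __call__(self, environ, start_response):
        def tagged_start_response(status, headers, exc_info=None):
            # Append our header, then delegate to the server's start_response.
            headers = list(headers) + [("X-Example-Tag", self.tag)]
            return start_response(status, headers, exc_info)

        return self.app(environ, tagged_start_response)


def filter_factory(global_conf, **local_conf):
    """Paste Deploy style filter factory: returns a callable that wraps an app."""
    tag = local_conf.get("tag", "custom")

    def _filter(app):
        return RequestTagMiddleware(app, tag=tag)

    return _filter


def upstream_app(environ, start_response):
    """Stand-in for an unmodified upstream WSGI application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def call_app(app):
    """Invoke a WSGI app once and return (status, headers, body)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Because the custom behavior lives entirely in the wrapper, the wrapped application can be replaced by a newer upstream version without a code merge, which is the property the list above is aiming for.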
The Kilo upgrade is a milestone for PayPal's OpenStack cloud: we will no longer play catch-up, and will always stay on the latest OpenStack release.
By keeping up to date with the upstream code base, we can get more help from the community with bug fixes and new features.
We can also actively contribute to the community in the following ways:
1. Reporting bugs we find in a large enterprise deployment.
2. Fixing bugs upstream.
3. Getting actively involved in blueprints and design sessions for new features.