Yahoo is huge! For better or worse, Yahoo has always had legacy infrastructure in place to solve the problems that OpenStack now solves. When we move to OpenStack, we can’t just get rid of the existing infrastructure one fine day and tell people: “Hey, I know that you have been doing things this way for a long time. From now on, you just can’t do that anymore. You need to update all your automation to work with this new thing.”
That will definitely piss off people.
Also, most of these legacy requirements come from our huge scale, so we need to make sure that we continue supporting those use-cases when we move to OpenStack. To do so, we need to patch Neutron heavily, as it currently doesn’t support them. Most of these patches are useful for the community as well, since anyone operating at this scale is going to have similar requirements.
We are working with the community to upstream them. With this talk, we would like to share our experience, use-cases and hacks, and to find out whether anyone else in the community has the same requirements.
We need the following features to support our legacy use-cases:
- Static IP allocation: Today, when inventory is installed in racks, our site operations team picks an IP address for it and records that address in our inventory management database, which we treat as the source of truth. When this inventory is added to Ironic and booted, Neutron first looks at its original IP address, creates the networks/subnets on the fly if they are not already present, and then assigns the same IP to the instance. We have to do this because we can’t yet treat Neutron as the source of truth for bare-metal inventory. We are slowly moving towards dynamic IP allocation, where Neutron will be the source of truth, but there are certain blockers that we need to solve first.
- Single-process Dnsmasq support for the Neutron DHCP agent: At Yahoo we don’t have tenant networks or overlapping IP spaces, so we don’t need the DHCP agent to spawn a Dnsmasq process for every network. We have patched the agent and the driver to spawn a single process responsible for all the networks. We don’t plan on using this patch forever, only temporarily until support for the ISC-DHCPD driver is in.
- Support for an ISC-DHCPD driver for Neutron: Dnsmasq does not scale well, especially at the scale at which Yahoo operates, and ISC-DHCP is mostly what is used at this scale. Currently the Neutron DHCP agent doesn’t support any DHCP server other than Dnsmasq. In fact, the agent itself is so closely aligned with Dnsmasq that it doesn’t work as-is with ISC-DHCP. So we have modified the agent and added a driver for ISC-DHCPD. This work is being upstreamed.
- Support for multiple gateways for a subnet in Neutron: At Yahoo, the network architecture is such that multiple gateways are configured for each subnet. These gateways are typically spread across backplanes so that production traffic can be load-balanced between backplanes. Currently, subnets in Neutron support only one gateway. We are working with the community to add multiple-gateway support to Neutron subnets.
- Multi-IP support: For one of our use-cases, we need the ability to allocate multiple IPs for an Ironic instance, but not necessarily create a port for it in Neutron. We are currently evaluating different ways to do this.
- IPv6 support on an opt-in basis: We need Neutron to be able to allocate IPv6 addresses, but not always. We have some legacy stuff that will simply not work with IPv6, so we need to provide the ability for users to “opt in” to only IPv6, only IPv4, or both. This is also something that the community doesn’t support as of now. We would like to know if others have similar use-cases, so that we can work with them to upstream this.
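
To make the static IP allocation flow from the first bullet more concrete, here is a minimal sketch in Python. It is not our Neutron patch and does not call any Neutron API: the `InMemoryIpam` class, its method names, the node IDs and the default /24 prefix length are all hypothetical, chosen only to illustrate the idea of "look up the pre-assigned IP, create the subnet on the fly if it is missing, then hand out that same IP". It assumes IPv4 addresses.

```python
import ipaddress


class InMemoryIpam:
    """Toy stand-in for subnet/IP bookkeeping (hypothetical, not a Neutron API)."""

    def __init__(self, default_prefixlen=24):
        # Assumption: subnets created on the fly use a fixed prefix length.
        self.default_prefixlen = default_prefixlen
        # Maps each known subnet to its allocations: {ip_address: node_id}.
        self.subnets = {}

    def find_subnet(self, ip):
        """Return the known subnet that contains ip, or None."""
        for net in self.subnets:
            if ip in net:
                return net
        return None

    def allocate_static(self, node_id, ip_str):
        """Assign the pre-picked IP to node_id, creating the subnet if needed."""
        ip = ipaddress.ip_address(ip_str)
        net = self.find_subnet(ip)
        if net is None:
            # Subnet not present yet: derive it from the IP and create it.
            net = ipaddress.ip_network(
                f"{ip}/{self.default_prefixlen}", strict=False
            )
            self.subnets[net] = {}
        owner = self.subnets[net].get(ip)
        if owner is not None and owner != node_id:
            raise ValueError(f"{ip} is already allocated to {owner}")
        self.subnets[net][ip] = node_id
        return net


# The IP comes from the inventory database (the source of truth), not from IPAM.
ipam = InMemoryIpam()
subnet = ipam.allocate_static("node-1", "10.1.2.15")
print(subnet)
```

The point of the sketch is the direction of authority: the allocator never chooses an address itself, it only records the one the inventory database already holds. Moving to dynamic allocation would invert that, with the allocator picking a free address from the subnet.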