Will Software Defined Networks Need Routing Protocols?
How does an SDN interoperate with the rest of the network?
May 2, 2014
SDN Doesn't Operate Alone
Software Defined Networks don't operate alone. They will have to interoperate with other SDNs and with other parts of the network. This is particularly true in the early stages of SDN deployment, when most of an enterprise's network is based on distributed routing and switching technology.
Let's think about the likely place one might start implementing an SDN: the data center. How would it interface with the rest of the corporate network? What information needs to be shared?
Anatomy of an SDN
It is beneficial to start by examining how an SDN appears from the outside. An SDN effectively implements one or more forwarding domains. There may be multiple forwarding domains (overlays) running on top of a single, easily managed network (the underlay network). An underlay network could be based on a Layer 2 fabric, or it could be based on an easily managed Layer 3 topology.
The overlay SDN network simply provides the logical network connectivity that is desired, running on top of the underlay network. There may be multiple logical networks implemented by a single SDN control system. The overlay logical forwarding domain is a powerful abstraction. It decouples the forwarding domain from the technology of the physical implementation.
Inside the forwarding domain, the SDN controller makes the forwarding decisions. Each logical forwarding domain functions like a Layer 2-7 switch, with forwarding decisions based on Layer 2-7 information within each packet.
What Information Must Be Shared?
Each logical network forwarding domain looks like a big switch, and we need to consider how each of the domains interfaces with the rest of the network. One way to examine the problem is to answer the question "What information must be shared between the forwarding domain and the rest of the network?"
One answer is to share Layer 2 host reachability information. In a traditional switched network, that information is learned by MAC address flooding and learning over the loop-free topology that the Spanning Tree Protocol (STP) builds. But STP (and even RSTP) has its problems, ones that we would rather avoid. If adjacent parts of the network are running TRILL, FabricPath, or other L2 protocols, then our SDN will need to interoperate with them. This might be the case if the SDN domains are a small part of a much larger data center. The types of applications in use may also dictate whether L2 forwarding domains are required or if L3 domains are preferred.
Another answer is to share Layer 3 network reachability information. We then have a choice of using an IGP (RIPv2, OSPF, EIGRP) or BGP. [Note: I can't think of an instance off-hand when I'd want to select RIPv2. I include it here for completeness.]
When considering which protocol to use, we must look at how the SDN domains and the rest of the network are to be treated, relative to each other. Do the SDN domains see the rest of the network as a set of peer routers and switches (and vice-versa)? If that's the case, then an IGP might be the best selection. But if the network architecture requires clean separation of routing protocol functionality, then BGP would be a better choice.
Another way to think about it is in terms of the design choices we make when scaling IGPs for large networks. With OSPF, I would likely make each SDN forwarding domain a separate OSPF area. With EIGRP, I would limit query propagation at the edge of each SDN domain.
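To make the one-area-per-domain idea concrete, here is a minimal planning sketch in Python. It is an illustration under assumptions, not a configuration for any real router or controller: the domain names, the prefixes, and the area_summaries helper are all hypothetical, and advertising aggregated prefixes at the area edge is a common companion to area design that I am assuming here. The aggregation is exact (it never covers address space the domain does not own).

```python
import ipaddress


def area_summaries(domains):
    """Give each SDN domain its own non-backbone area and aggregate its prefixes.

    `domains` maps a (hypothetical) domain name to a list of prefix strings.
    Area 0 is left for the backbone, so numbering starts at 1.
    """
    plan = {}
    for area_id, (name, prefixes) in enumerate(sorted(domains.items()), start=1):
        networks = [ipaddress.ip_network(p) for p in prefixes]
        # collapse_addresses merges adjacent or overlapping networks exactly.
        aggregated = ipaddress.collapse_addresses(networks)
        plan[name] = {"area": area_id, "advertise": [str(n) for n in aggregated]}
    return plan


# Hypothetical domains and address plan, for illustration only.
domains = {
    "sdn-dc-east": ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"],
    "sdn-dc-west": ["10.2.0.0/24", "10.2.1.0/24"],
}
plan = area_summaries(domains)
```

With this address plan, each domain would advertise a single aggregate at its edge instead of every internal subnet, which is the kind of containment that areas are meant to provide.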
Where Does the Routing Protocol Run?
In any of the above cases, the routing or switching protocol would run on the SDN controller. Each SDN forwarding domain may need to run its own routing protocol for interfacing with the external network. Or the SDN controller may run one or two instances and have virtual interfaces into each forwarding domain. [For fault tolerance, it would be preferable to run an instance of the protocol on two separate hardware platforms, as well as using more than one interface to the rest of the network.]
While some papers describe the routing protocol running on the controller (see Demystifying Routing Service in Software-Defined Networking), it doesn't have to run there. It could be implemented as an external application that talks to the SDN controller. In this case, routing updates would need to be forwarded from the switches to the controller and then to the external routing process. The external routing process would use the application API to update the controller's Routing Information Base (RIB). However, I don't expect to see external routing processes. I expect the routing function to be more tightly coupled to the SDN controller than is possible with an API, although it might be moved out of the controller at some point in the future.
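The external-process arrangement described above can be sketched in a few lines of Python. Everything here is hypothetical: ControllerRib stands in for whatever application API a controller might expose, and the update dictionaries are an invented message format, not a real routing protocol encoding.

```python
import ipaddress


class ControllerRib:
    """Hypothetical stand-in for the RIB a controller exposes through its API."""

    def __init__(self):
        self.routes = {}  # ip_network -> next hop name

    def add_route(self, prefix, next_hop):
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw_route(self, prefix):
        self.routes.pop(ipaddress.ip_network(prefix), None)


class ExternalRoutingProcess:
    """Routing process running outside the controller (illustrative only)."""

    def __init__(self, rib):
        self.rib = rib

    def handle_update(self, update):
        # `update` is a routing update relayed switch -> controller -> here.
        for prefix in update.get("withdrawn", []):
            self.rib.withdraw_route(prefix)
        for prefix in update.get("announced", []):
            self.rib.add_route(prefix, update["next_hop"])


rib = ControllerRib()
process = ExternalRoutingProcess(rib)
process.handle_update(
    {"announced": ["192.0.2.0/24", "198.51.100.0/24"], "next_hop": "core-1"}
)
process.handle_update({"withdrawn": ["198.51.100.0/24"]})
```

Every update makes two hops before it reaches the RIB, which is one way to see why a tightly coupled implementation is attractive.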
The routing protocol could also be implemented in a hybrid SDN system where each switch in the SDN can be controlled by either the SDN controller or by traditional distributed routing protocols. A couple of core switches could be configured to run a routing protocol to exchange routing information with external systems. Internal to the SDN domains, the SDN controller would populate the Forwarding Information Base (FIB). Only the core switches would have routing information about the external destinations. All switches except the core switches would have a default FIB entry defined by the SDN controller. If no other FIB entry matches, forward the packet to one of the core switches. This is like the default route in traditional routing protocols.
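The default FIB entry behaves the same way a default route does under longest-prefix matching. The following Python sketch shows the lookup on a non-core switch. The Fib class, the port names, and the prefixes are assumptions made for illustration and do not correspond to any particular switch or controller.

```python
import ipaddress


class Fib:
    """Toy forwarding table using longest-prefix match."""

    def __init__(self):
        self._entries = {}  # ip_network -> next hop name

    def install(self, prefix, next_hop):
        self._entries[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, destination):
        addr = ipaddress.ip_address(destination)
        matches = [net for net in self._entries if addr in net]
        if not matches:
            return None  # no entry at all, not even a default
        # The most specific (longest) matching prefix wins.
        best = max(matches, key=lambda net: net.prefixlen)
        return self._entries[best]


# Hypothetical non-core switch: internal prefixes plus a default entry
# installed by the SDN controller that points at a core switch.
edge_fib = Fib()
edge_fib.install("10.1.1.0/24", "port-to-leaf-2")
edge_fib.install("10.1.2.0/24", "port-to-leaf-3")
edge_fib.install("0.0.0.0/0", "core-1")

internal_hop = edge_fib.lookup("10.1.2.7")    # matches an internal /24
external_hop = edge_fib.lookup("192.0.2.50")  # only the default matches
```

The non-core switch never needs to know about 192.0.2.0/24 or any other external prefix. It only needs to know which way the core switches are.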
I'm sure that some people will insist that there is no need for an SDN to run routing protocols. Simply configure static routes in the external network that point at the SDN's links, configure default routes within the SDN, and point traffic at the exit points. This configuration has its own problems, the primary one being that it is static, at least on the external side. The SDN controller could be smart enough to do load sharing over multiple egress links, but with static routes, each ingress path would be chosen by the external network's routing metric, so traffic would simply enter at the lowest-cost (nearest) link into the SDN. Of course, various types of static load balancing and policy routing could be attempted within the enterprise network in order to get some load sharing on multiple paths into the SDN.
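As a rough illustration of what egress load sharing by the controller could look like, here is a Python sketch that hashes each flow onto one of several egress links. Hashing the flow 5-tuple is one common approach that I am assuming here, not something any particular controller is known to do, and the link names and the pick_egress helper are hypothetical.

```python
import hashlib


def pick_egress(flow, egress_links):
    """Map a flow 5-tuple to one egress link.

    The same flow always maps to the same link, so packets within a flow
    are not reordered, while different flows spread across the links.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(egress_links)
    return egress_links[index]


# Hypothetical egress links and flows, for illustration only.
links = ["egress-a", "egress-b"]
flows = [
    ("10.1.1.5", "192.0.2.10", "tcp", 40000 + n, 443) for n in range(200)
]
chosen = [pick_egress(flow, links) for flow in flows]
```

Nothing comparable happens in the ingress direction with static routes, because the external routers have no view of the flows or of the load on the links into the SDN.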
SDN Sizing
What should an organization do when implementing SDN within multiple parts of the network, say in each of two data centers? Some network architects might argue that one SDN should be used because it would provide optimum routing between the data centers. Other architects might insist on limiting the size of the failure domain and keeping the two SDN domains separate.
Network failures are often nasty (think about routing loops or STP forwarding loops or black hole routes). We don't have enough experience with SDN to fully understand and appreciate the failure modes we'll see. My preference, based on the side effects of other types of network failures, is to limit the size of the failure domain. Most organizations build two data centers in order to have a backup data center. If the same SDN instance is controlling both data centers and we find a new failure mode, it will likely affect both data centers. I would rather settle for slightly less optimal network utilization in favor of smaller failure domains.
There's another way of looking at the partitioning problem. We've learned that each protocol has its scaling traits. I don't expect SDN to be any different.
We shouldn't throw out our networking common sense just because of a new paradigm. By starting with smaller SDNs and coupling them together with standard routing protocols, we build some walls between failure domains.
We can experiment with how large an SDN should grow, and learn as we go. Until then, be conservative and limit the size of an SDN domain.