OSPF Out-Of-Band LSDB Resynchronization using Zebra on Linux

EECS 845 : Implementation of High Performance Networks

K P Muthuvelan, kpm@ittc.ukans.edu
Roopesh Rajamani, roopesh@ku.edu
Department of Electrical Engineering & Computer Science
The University of Kansas
Lawrence, KS 66045



Contents
                1.  Motivation
                2.  Brief Introduction to OSPF
                3.  LSDB Resynchronisation in OSPF
                4.  Implementation
                5.  Testing
                6.  Conclusion & Future Work
                7.  References
                8.  Presentation Slides
                9.  Download Source Code


1. Motivation

In OSPF, two neighboring routers consider themselves adjacent only if their link-state databases (LSDBs) are synchronized. Routing information is exchanged only between adjacent routers, and adjacent routers advertise the adjacency in their router-LSAs. When the topology changes, the asynchronous flooding algorithm keeps the routers in sync with each other. However, if the LSDBs are to be resynchronized without any change in the topology, the neighbor state has to be put back into the ExStart state. The problem with this approach is that, while the resynchronization is in progress, the two routers no longer consider themselves adjacent to each other. There are situations in which routers need to resynchronize their LSDBs without losing the adjacency. One such case is the restart of an OSPF router. Hence, we implemented changes to the OSPF protocol to provide Out-of-Band resynchronization of the LSDB in OSPF routers.


2. Brief Introduction to OSPF
OSPF (Open Shortest Path First) is a link-state routing protocol used in IP networks. In OSPF the network is usually divided into areas. The routers within an area have the same topological view of the area. All information regarding the area is stored in a link-state database (LSDB). A router may be connected to more than one area. A separate copy of OSPF's basic routing algorithm runs in each area, so routers with interfaces to multiple areas run multiple copies of the algorithm.

Summary of the OSPF Algorithm

When a router starts, it first initializes the routing protocol data structures. The router then waits for indications from the lower-level protocols that its interfaces are functional. It then uses OSPF's Hello Protocol to acquire its neighbors: the router sends Hello packets to its neighbors and in turn receives their Hello packets. On broadcast and point-to-point networks, the router dynamically detects its neighboring routers by sending its Hello packets to the multicast address AllSPFRouters. On broadcast and non-broadcast networks the Hello Protocol also elects a Designated Router for the network.

The router will attempt to form adjacencies with some of its newly acquired neighbors. Link-state databases (LSDBs) are synchronized between pairs of adjacent routers. Adjacencies control the distribution of routing information: routing updates are sent and received only on adjacencies. A router periodically advertises its state, which is also called link state. Link state is also advertised when a router's state changes. A router's adjacencies are reflected in the contents of its LSAs. This relationship between adjacencies and link state allows the protocol to detect dead routers in a timely fashion.

LSAs are flooded throughout the area. The flooding algorithm is reliable, ensuring that all routers in an area have exactly the same link-state database. This database consists of the collection of LSAs originated by each router belonging to the area. From this database each router calculates a shortest-path tree, with itself as root. This shortest-path tree in turn yields a routing table for the protocol.
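The shortest-path calculation mentioned above can be illustrated with a small sketch. It assumes a heavily simplified LSDB, a dictionary mapping each router ID to its neighbors and link costs, and uses Dijkstra's algorithm; the names shortest_path_tree and lsdb are illustrative and are not taken from Zebra.

```python
import heapq

def shortest_path_tree(lsdb, root):
    """Return {router: (cost, parent)} for every router reachable from root.

    lsdb maps a router ID to {neighbor ID: link cost}; this is a simplified
    stand-in for the router-LSAs of a single area.
    """
    tree = {root: (0, None)}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > tree[node][0]:
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, link_cost in lsdb.get(node, {}).items():
            new_cost = cost + link_cost
            if neighbor not in tree or new_cost < tree[neighbor][0]:
                tree[neighbor] = (new_cost, node)
                heapq.heappush(heap, (new_cost, neighbor))
    return tree

# Hypothetical three-router area.
lsdb = {
    "R1": {"R2": 10, "R3": 1},
    "R2": {"R1": 10, "R3": 2},
    "R3": {"R1": 1, "R2": 2},
}
tree = shortest_path_tree(lsdb, "R1")
```

In this example R1 reaches R2 through R3, because the two cheap links together cost less than the direct link. The parent entries of the tree are what a router would turn into next hops in its routing table.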
Synchronisation of LSDB
In any link-state routing algorithm, it is very important for all routers' link-state databases to stay synchronized. OSPF simplifies this by requiring only adjacent routers to remain synchronized. The synchronization process begins as soon as the routers attempt to bring up the adjacency. Each router describes its LSDB by sending a sequence of Database Description packets to its neighbor. Each Database Description Packet describes a set of LSAs belonging to the router's database. When the neighbor sees an LSA that is more recent than its own database copy, it makes a note that this newer LSA should be requested. This sending and receiving of Database Description packets is called the "Database Exchange Process". During this process, the two routers form a master/slave relationship. Each Database Description Packet has a sequence number. Database Description Packets sent by the master (polls) are acknowledged by the slave through echoing of the sequence number. Both polls and their responses contain summaries of link state data. The master is the only one allowed to retransmit Database Description Packets. It does so only at fixed intervals, the length of which is the configured per-interface constant RxmtInterval. Each Database Description contains an indication that there are more packets to follow - the M (More) bit.
The Database Exchange Process is over when a router has received and sent Database Description Packets with the M-bit off. During and after the Database Exchange Process, each router has a list of those LSAs for which the neighbor has more up-to-date instances. These LSAs are requested in Link State Request Packets. Link State Request packets that are not satisfied are retransmitted at fixed intervals of RxmtInterval. When the Database Exchange Process has completed and all Link State Requests have been satisfied, the databases are deemed synchronized and the routers are marked fully adjacent. At this time the adjacency is fully functional and is advertised in the two routers' router-LSAs. The adjacency is used by the flooding procedure as soon as the Database Exchange Process begins.
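The way a router derives its list of LSAs to request can be sketched as follows. This is a minimal illustration, assuming each LSA is identified by a (type, link state ID, advertising router) key and that recency is decided by sequence number alone; real OSPF also compares checksum and age. The function name build_request_list is hypothetical.

```python
def build_request_list(own_lsdb, dd_summaries):
    """Return the keys of LSAs that should be requested from the neighbor.

    own_lsdb and dd_summaries both map an LSA key
    (ls_type, link_state_id, advertising_router) to a sequence number.
    An LSA is requested if it is missing locally or the neighbor's copy
    is newer (simplified: higher sequence number only).
    """
    requests = []
    for key, neighbor_seq in dd_summaries.items():
        own_seq = own_lsdb.get(key)
        if own_seq is None or neighbor_seq > own_seq:
            requests.append(key)
    return requests

# Hypothetical databases: the neighbor has a newer router-LSA for
# 10.0.0.2 and an LSA for 10.0.0.3 that is missing locally.
own = {
    (1, "10.0.0.1", "10.0.0.1"): 5,
    (1, "10.0.0.2", "10.0.0.2"): 3,
}
summaries = {
    (1, "10.0.0.1", "10.0.0.1"): 5,
    (1, "10.0.0.2", "10.0.0.2"): 4,
    (1, "10.0.0.3", "10.0.0.3"): 1,
}
request_list = build_request_list(own, summaries)
```

The databases are considered synchronized once the exchange has finished and this list has been emptied by the neighbor's Link State Update packets.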
Neighbor Data Structure
An OSPF router converses with its neighboring routers. Each separate conversation is described by a "neighbor data structure". Each conversation is bound to a particular OSPF router interface, and is identified either by the neighboring router's OSPF Router ID or by its Neighbor IP address. Thus if the OSPF router and another router have multiple attached networks in common, multiple conversations ensue, each described by a unique neighbor data structure. The neighbor data structure contains all information pertinent to the forming or formed adjacency between the two neighbors. (However, remember that not all neighbors become adjacent.) An adjacency can be viewed as a highly developed conversation between two routers.

The information contained in a neighbor data structure includes the neighbor state, inactivity timer, DD sequence number, last received Database Description packet, Neighbor ID, Neighbor Priority, Neighbor IP address, Neighbor Options, the neighbor's Designated Router, link state retransmission list, database summary list, and link state request list, among others.

Neighbor State Machine
The states a neighbor can be in are listed below.
1. Down – the initial state of a neighbor conversation.
2. Attempt – indicates that an attempt should be made to contact the neighbor.
3. Init – a Hello packet has been received from the neighbor.
4. 2-Way – communication between the two routers is bi-directional.
5. ExStart – the first step in creating an adjacency between the two neighboring routers.
6. Exchange – the router is sending Database Description packets to the neighbor.
7. Loading – Link State Request packets are sent to the neighbor.
8. Full – the neighboring routers are fully adjacent.
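The states listed above can be written down as an ordered enumeration. The sketch below is only an illustration: the numeric values follow the order of the list and are an assumption, not values taken from the Zebra sources, and the helper is_synchronizing is a hypothetical name.

```python
from enum import IntEnum

class NeighborState(IntEnum):
    # Values follow the order of the list above (an assumption).
    DOWN = 1
    ATTEMPT = 2
    INIT = 3
    TWO_WAY = 4
    EXSTART = 5
    EXCHANGE = 6
    LOADING = 7
    FULL = 8

def is_synchronizing(state):
    """True while the LSDB exchange with the neighbor is still in progress."""
    return NeighborState.EXSTART <= state <= NeighborState.LOADING
```

Because the states are ordered, a test such as "ExStart or later" becomes a simple comparison, which is convenient when checking whether a neighbor is somewhere in the database exchange.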
 

3. LSDB Resynchronisation in OSPF
In OSPFv2, after two routers have established an adjacency (the neighbor state has reached Full), the routers announce the adjacency in their router-LSAs. The asynchronous flooding algorithm ensures that the routers' LSDBs stay in sync in the presence of topology changes. If the routers need to resynchronize their LSDBs, they have to put the neighbor state back to ExStart. But this causes the adjacency to be removed from the router-LSAs, which may not be acceptable in some cases.

Hence, we implemented an Out-of-Band (OOB) resynchronization of the LSDB, which complements the regular resynchronization method in OSPFv2. The extensions to OSPF are discussed below.

Database Description Packet
The LSDB in two neighboring routers are synchronized by exchanging DBD packets. A new type of DBD packet is exchanged during the OOB resynchronization phase. OOB Database Description packets have an additional R-bit set in the header. The new type of header is as follows.
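As a rough illustration of how an extra R-bit can sit alongside the existing DBD flags, the sketch below packs and unpacks the flags octet. The I, M and MS values are the standard OSPFv2 ones; the R-bit value 0x08 is an assumption, chosen to match the position used in the later RFC 4811 specification of this mechanism, and the function names are hypothetical.

```python
DD_FLAG_MS = 0x01  # Master/Slave bit (OSPFv2)
DD_FLAG_M = 0x02   # More bit (OSPFv2)
DD_FLAG_I = 0x04   # Init bit (OSPFv2)
DD_FLAG_R = 0x08   # OOB resync bit; position is an assumption

def encode_dd_flags(init=False, more=False, master=False, resync=False):
    """Pack the four flags into the DBD flags octet."""
    octet = 0
    if master:
        octet |= DD_FLAG_MS
    if more:
        octet |= DD_FLAG_M
    if init:
        octet |= DD_FLAG_I
    if resync:
        octet |= DD_FLAG_R
    return octet

def decode_dd_flags(octet):
    """Unpack the DBD flags octet into a dictionary of booleans."""
    return {
        "init": bool(octet & DD_FLAG_I),
        "more": bool(octet & DD_FLAG_M),
        "master": bool(octet & DD_FLAG_MS),
        "resync": bool(octet & DD_FLAG_R),
    }

# First DBD of an OOB resynchronization: I, M, MS and R all set.
first_oob_dbd = encode_dd_flags(init=True, more=True, master=True, resync=True)
```

A router without the extension would see an unexpected bit in a field it otherwise understands, which is why the capability of the neighbor has to be known before such packets are sent.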
Neighbor Data Structure
Two additional fields are added to the neighbor data structure. One field indicates whether the neighbor has the Out-of-Band resynchronization capability, and the other (OOBResync) indicates whether Out-of-Band resynchronization is in progress. If OOBResync indicates that out-of-band resynchronization is in progress, DBD packets are sent to that neighbor with the R-bit set.

The OOBResync field is set only while the neighbor state is ExStart, Exchange or Loading. In any other state the OOBResync field is cleared.
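The two additional fields and the clearing rule can be sketched as follows. This is a minimal model, not the Zebra neighbor structure: the class and field names (Neighbor, oob_resync_capable, oob_resync) are hypothetical, and all the other neighbor fields are omitted.

```python
from dataclasses import dataclass

# States in which the OOBResync field is allowed to stay set.
RESYNC_STATES = {"ExStart", "Exchange", "Loading"}

@dataclass
class Neighbor:
    router_id: str
    state: str = "Down"
    oob_resync_capable: bool = False  # neighbor supports OOB resync
    oob_resync: bool = False          # OOB resync currently in progress

    def set_state(self, new_state):
        """Change the neighbor state and clear OOBResync where required."""
        self.state = new_state
        if new_state not in RESYNC_STATES:
            self.oob_resync = False

nbr = Neighbor("10.0.0.2", state="ExStart", oob_resync_capable=True, oob_resync=True)
nbr.set_state("Exchange")
still_set = nbr.oob_resync   # field survives the move to Exchange
nbr.set_state("Full")
after_full = nbr.oob_resync  # field is cleared once Full is reached
```

Tying the clearing rule to the state setter keeps the field from lingering after the resynchronization has finished or been abandoned.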
Processing of Database Description Packets
DBD packets received from the neighbor are processed as follows.

When a DBD with the R-bit set is received, the following is done:

  • If the I, M, and MS bits are set, the state of the neighbor is Full, and the OOBResync flag is not set, the packet is accepted, the OOBResync flag is set, and the neighbor state is put into ExStart.

  • Else, if the OOBResync flag is set and the state of the neighbor FSM is ExStart, Exchange, or Loading, the packet is processed just as in OSPFv2.

  • Else, if the neighbor state is Full and the receiving router was the slave in the LSDB exchange process, it must be ready to identify duplicate DBDs with the R-bit set from the master and retransmit the acknowledgement packet.

  • Else (the OOBResync flag is off, or the state is not Full, or the packet is not a duplicate), a SeqNumberMismatch event is generated for the neighbor, which causes the state to transition to ExStart.

When a DBD with the R-bit not set is received, the following is done:

  • If the OOBResync flag for the neighbor is set, the OOBResync flag is cleared and a SeqNumberMismatch event is generated for the neighbor.

  • Else, the DBD packet is processed as in OSPFv2.

It is also necessary to limit the time an adjacency can spend in the ExStart, Exchange and Loading states with the OOBResync field set. If the adjacency does not proceed to the Full state within a time period, the requesting router may decide to stop trying to resynchronize the LSDB over this adjacency. It may then try to resynchronize the LSDB using the ordinary method in OSPFv2.
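The decision rules above can be condensed into a single function that returns the action to take for a received DBD. This is a sketch of the rules as written in this section, not of the Zebra code; the function name classify_dbd, its parameters and the action strings are all hypothetical.

```python
SYNC_STATES = ("ExStart", "Exchange", "Loading")

def classify_dbd(r_bit, i_bit, m_bit, ms_bit, state, oob_resync,
                 was_slave=False, is_duplicate=False):
    """Return the action for a received DBD packet (illustrative only)."""
    if r_bit:
        if i_bit and m_bit and ms_bit and state == "Full" and not oob_resync:
            # Neighbor starts an OOB resync: set OOBResync, go to ExStart.
            return "start-oob-resync"
        if oob_resync and state in SYNC_STATES:
            return "process-as-ospfv2"
        if state == "Full" and was_slave and is_duplicate:
            # Master retransmitted its last DBD: repeat the acknowledgement.
            return "retransmit-ack"
        return "seq-number-mismatch"
    if oob_resync:
        # Neighbor fell back to an ordinary exchange during an OOB resync.
        return "clear-oob-resync-and-seq-number-mismatch"
    return "process-as-ospfv2"

# A fully adjacent neighbor sends the first DBD of an OOB resync.
action = classify_dbd(True, True, True, True, state="Full", oob_resync=False)
```

Keeping the rules in one pure function makes each branch easy to check in isolation; the time limit described above would be handled separately, for example by a timer started when the OOBResync field is set.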
Changes to the Neighbor State Machine
The following extensions are made to the neighbor state machine to accommodate the OOB resynchronization capability.

  • When the neighbor state changes from or to the Full state with the OOBResync field set, it should not cause origination of a new version of the router-LSA or network-LSA.

  • When the OOBResync field is set, checks for the Full state made for purposes other than LSDB synchronization and flooding should treat the ExStart, Exchange and Loading states as the Full state.

Initiating OOB LSDB Resynchronization
In order to initiate OOB LSDB resynchronization, the router must first check whether the neighbor has this capability. If the neighbor has the capability, the OOBResync field for the neighbor is set and the neighbor state is forced to ExStart.
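The state machine changes and the initiation step can be sketched together. The code below is a simplified model under stated assumptions: the neighbor is a plain dictionary, and the names effective_state, must_reoriginate_lsa and initiate_oob_resync are hypothetical, not functions from ospfd.

```python
SYNC_STATES = ("ExStart", "Exchange", "Loading")

def effective_state(state, oob_resync):
    """State to use for checks other than LSDB synchronization and flooding."""
    if oob_resync and state in SYNC_STATES:
        return "Full"
    return state

def must_reoriginate_lsa(old_state, new_state, oob_resync):
    """True if the transition should trigger a new router-LSA or network-LSA.

    oob_resync is the value of the field at the time of the transition.
    """
    crosses_full = (old_state == "Full") != (new_state == "Full")
    return crosses_full and not oob_resync

def initiate_oob_resync(neighbor):
    """Start an OOB resync if the neighbor advertises the capability."""
    if not neighbor["oob_resync_capable"]:
        return False
    neighbor["oob_resync"] = True
    neighbor["state"] = "ExStart"
    return True

nbr = {"state": "Full", "oob_resync_capable": True, "oob_resync": False}
started = initiate_oob_resync(nbr)
seen_as = effective_state(nbr["state"], nbr["oob_resync"])
```

After the call the neighbor is in ExStart internally, yet the rest of the protocol still sees it as Full and no new router-LSA is originated, which is the property that keeps the adjacency advertised during the resynchronization.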


4.  Implementation
The above extensions were implemented in Zebra. Zebra is free software (distributed under the GNU General Public License) that manages TCP/IP based routing protocols. This project uses the 0.91 beta version of Zebra. There are 4 daemons in this software. One (zebra) can be viewed as a general router process and the other three are instances of three routing protocols, BGP, OSPF and RIP respectively. This project deals with the zebra and ospfd daemons only. Communication between zebra and ospfd takes place via sockets, and communication between zebra and the kernel takes place via netlink sockets. The following are the changes that have been implemented as described above.

5.  Testing
The project was tested by running the zebra daemon and the ospfd daemon on two testbed machines at ITTC, which acted as two routers. The two routers were connected over Ethernet. Each router advertised its ATM interface to the other router. The eth0 interface in each router was configured as ip_ospf_network_point-to-point in ospfd. Each router showed the network (point-to-point ATM links) advertised by the other. The out-of-band resynchronization was performed by invoking the resync function that was defined:
Resync <IP address of the neighbor>
The setting of the R-bit in the DBD packets was verified using the DEBUG command in ospfd. During the resynchronization the following properties were observed

6. Conclusion & Future Work
The Out-of-Band resynchronization capability helps to provide OSPF routers with hitless restart capability. Hitless restart is a capability by which the routing and forwarding functions of a router are separated, so that the router continues to perform the forwarding function even if the router software is restarted. Ordinarily, when a router is restarted, the neighboring routers remove the router from their router-LSAs, which affects the forwarding function of the router. The OOB resynchronization method avoids this: by using it, the reloaded router can pick up its adjacencies after the reload.

In the project we assumed that both routers have the resynchronization capability. In future work, Link-Local Signaling (LLS) could be used to advertise it. An LLS block, which contains Extended Options TLV (EO-TLV) fields, can be appended to OSPF Hello and DBD packets. An EO-TLV field can be used to advertise that a router has the OOB resynchronization capability.
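How such an advertisement might be encoded can be sketched as below. This was not part of the project. The TLV layout (16-bit type, 16-bit length, 32-bit flags word), the type value 1 and the capability bit 0x00000001 are assumptions modeled on the later LLS and OOB resynchronization specifications (RFC 5613, RFC 4811); the function names are hypothetical, and the LLS block header with its checksum is omitted.

```python
import struct

EO_TLV_TYPE = 1               # Extended Options TLV type (assumed)
EO_TLV_LENGTH = 4             # length of the flags word in bytes
LR_BIT = 0x00000001           # "OOB resync capable" bit (assumed)

def encode_eo_tlv(flags):
    """Build an Extended Options TLV carrying the given 32-bit flags."""
    return struct.pack("!HHI", EO_TLV_TYPE, EO_TLV_LENGTH, flags)

def advertises_oob_resync(tlv):
    """Check whether an encoded EO-TLV announces the OOB resync capability."""
    if len(tlv) < 8:
        return False
    tlv_type, length, flags = struct.unpack("!HHI", tlv[:8])
    return (tlv_type == EO_TLV_TYPE
            and length == EO_TLV_LENGTH
            and bool(flags & LR_BIT))

capable_tlv = encode_eo_tlv(LR_BIT)
```

A router receiving this TLV in a neighbor's Hello packet could record the result in the capability field of the neighbor data structure, replacing the assumption made in this project that both routers support the extension.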


7.  References

8.  Presentation Slides
The presentation slides in PPT format can be downloaded here.


9.  Download Source Code
The source code can be downloaded here.


Page maintained by KP Muthuvelan, kpm@ittc.ukans.edu.