BGP module type declarations

type bgpBuffer

Contains a pointer to allocated memory and current processing state within the buffers. Used to store input data received from a particular peer. Part of bgpPeer and bgpProtoPeer structures. Management routines found in bgp_buffer*().

type bgpOutBuffer

The corresponding data structure for holding data to be sent to a peer. Part of the bgpPeer structure only. Management routines are found in bgp_outbuf*().

type bgp_conf

Structure to hold peer and group configuration information. Part of bgpPeer and bgpPeerGroup structures. These structures are instantiated in the parser depending on the values specified in the configuration information.

type bgp_metrics

A per-nexthop metrics structure containing one or more of a multi-exit discriminator, a localpref or a route tag. In addition, there is a byte field which indicates which of these values are present. This structure is associated with the routes advertised by a BGP instance (bgp_adv_entry) and with the entries of an outgoing route send list (bgp_rto_entry). Metrics structures are stored in a Patricia tree.

type bgp_metrics_node

An internal node of the metrics Patricia tree described in the metrics manipulation macros.

type bgp_adv_queue

Queue of bgp_adv_entrys, a list of routes advertised to each peer. Part of each bgpPeer structure, where it is used for external peers only. Also part of the bgpPeerGroup structure, where it is used for internal groups. Most of the manipulations on this structure are done by the BGP_ADV* set of macros.

type bgp_adv_entry

An entry in a bgp_adv_queue. This contains a pointer to the route entry and the set of metrics associated with the route. Manipulated by the BGP_ADV* set of macros.

type bgp_rti_entry

Structure containing incoming route advertisements from a peer. Part of a doubly-linked list of such advertisements. This linked list is maintained in the bgpPeer structure. The BGP_RTI* macros are used to manipulate this linked list. As far as I can tell, nothing is ever stored in a peer's incoming route list (all routes are immediately dumped into the GateD routing database after policy checking).

type bgp_rto_entry

Contains a list of routes to be sent in update messages to a peer (the bgp_adv_queue, in contrast, contains the Adj-RIB-Out for the peer). Each entry is doubly linked into a chain of bgp_rto_entry structures. The queue head of each such list is stored in the bgp_asp_list structure. Each bgpPeer struct, in turn, contains the head of a bgp_rt_queue struct. This queue is threaded through the bgp_asp_queue struct in the bgp_asp_list. That is how the bgp_rto_entry is associated with the bgpPeer struct. The BGP_RTO* macros are used to manipulate this linked list.

type bgpg_rto_entry

This structure is similar to the bgp_rto_entry, but is used in bgpPeerGroups instead. Unlike the bgp_rto_entry, this contains a linked list of bgp_rtinfo_entrys, which are used to store previously advertised metrics and the peers in the group to which the update should be sent. The BGP_GRTO* macros are used to manipulate this linked list. The BRT_INFO* macros are used to allocate and free these entries.

type bgp_rtinfo_entry

This structure is part of the bgpg_rto_entry and contains the last metric and AS path sent out for a given advertisement. In addition, it contains a bit vector used to determine which peers in non-external groups receive this announcement.

type bgp_rt_queue

This structure simply contains a pointer to an AS path struct. bgp_rt_queue structures form a doubly linked list that is used both in the bgpPeer structure (for storing routes that should be advertised to an external peer) and in the bgpPeerGroup structure (for storing routes to be advertised to internal peers as well). Both these lists actually thread through the bgp_rt_queue struct contained in the body of a bgp_asp_list struct, which contains a list of bgp(g)_rto_entrys. Conceptually, this structure also forms the head of a queue of AS paths (and routes that share that AS path). The head of the queue contains a pointer to an AS path hash table of the AS paths in the list, for easy retrieval.

type bgp_asp_list

This contains a single AS path (embedded in the bgpl_rt_queue field) and a pointer to a list of bgp(g)_rto_entrys, each of which contains a route to be advertised. This forms the superstructure of the list of routes to be advertised to peers in the bgpPeer and bgpPeerGroup structures.

type bgp_ifap_list

Found only in bgpPeerGroup, this is the list of local interfaces on which peers in this group are running.

type bgpPeer

Basic state block for maintaining state info about a BGP peer. Such state includes the associated task, configuration info (metrics, timers and suchlike), protocol processing state, buffers and incoming routes and outgoing routes. In addition, this contains a back pointer to the associated peer group.

type bgpPeerGroup

Basic state block for maintaining state info about gated peer groups. This includes peer list, different counters/bit maps, outgoing route queues for internal groups and so on.

type bgpProtoPeer

Temporary state block for a peer from which a connection has been received, but no OPEN message has been received. Contains the task associated with the peer and some buffer space. Replaced by the bgpPeer after successful open.

BGP module macros

macro BGPCONN_*

Defines and macros related to staggering connect() attempts to peers. Each 256-second cycle is split into 64 time slots, each 4 seconds wide. During a slot, a number of peer connection requests may be initiated. gated attempts to spread out the cost of connection over a number of slots. The number of connect()s to be attempted during any slot is stored in the bgp_connect_slots[] array. Used mostly in the bgp_*_connect_timer() functions.

macro BGP_ADV_*

Macros to initialize, test the presence of elements, get the next element, insert and delete elements from a bgp_adv_queue. Standard doubly-linked list operations.

macro BGP_*RT*_UNLINK

Removes an element from a bgp_[g]rt[oi]_entry list. Doubly-linked list removal.

macro BGP_ASPL_*

A set of macros to initialize, test for null queues, and insert and delete elements from both the bgp_[g]rt[oi]_entry lists and the bgp_rt_queue structures in the bgp_asp_list structure. Recall that this structure is doubly chained: through the AS paths and through the rto lists. Much of it is standard list operations.

macro BGPB_*

Each non-external group maintains a bit mask of some number of words. Each peer within this group is assigned a position in this bitmask. This position is indicated by the bgp_group_bit field in the bgpPeer structure. This set of macros helps in the manipulation of these bitmasks (e.g. computing which word and which bit within the word the peer's bit mask is in, checking if a peer's mask is set, check if two bitmaps are equal by stepping through each word constituting the bitmask etc.).
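
The word-and-bit arithmetic described above can be sketched roughly as follows. This is only an illustration: the names bits_word, BITS_WORD, BITS_MASK, bits_set, bits_clear, bits_test and bits_equal are hypothetical, as is the 32-bit word size, and the real BGPB_* macros may be organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and word size; GateD's real BGPB_* macros may differ. */
typedef uint32_t bits_word;
#define BITS_PER_WORD  32u
#define BITS_WORD(pos) ((pos) / BITS_PER_WORD)                    /* which word  */
#define BITS_MASK(pos) ((bits_word)1 << ((pos) % BITS_PER_WORD))  /* which bit   */

/* Mark the peer at bit position pos as present in the map. */
static void bits_set(bits_word *map, unsigned pos)
{
    map[BITS_WORD(pos)] |= BITS_MASK(pos);
}

/* Remove the peer at bit position pos from the map. */
static void bits_clear(bits_word *map, unsigned pos)
{
    map[BITS_WORD(pos)] &= ~BITS_MASK(pos);
}

/* Return nonzero if the peer's bit is set. */
static int bits_test(const bits_word *map, unsigned pos)
{
    return (map[BITS_WORD(pos)] & BITS_MASK(pos)) != 0;
}

/* Compare two maps by stepping through each word. */
static int bits_equal(const bits_word *a, const bits_word *b, size_t nwords)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < nwords; i++) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}
```

In this sketch the caller is responsible for sizing the map so that the highest assigned bit position fits; a peer's position would come from something like the bgp_group_bit field.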

macro BGP_RTI_*

List operations for the bgp_rti_entry queues.

macro BGP_GET_*

Macros to read different sized quantities (bytes, shorts etc.) from the TCP byte stream. Also macros to extract different fields from BGP messages.

macro BGP_PUT_*

The write analogs of the BGP_GET* macros.
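
As an illustration of what the GET and PUT macro families do, the following sketch reads and writes 16-bit and 32-bit quantities in network byte order through a moving byte pointer. The function names get_u16, get_u32, put_u16 and put_u32 are hypothetical, and the real macros may be written quite differently (for example, as macros that advance the pointer in place).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; not GateD's actual BGP_GET_* / BGP_PUT_* macros. */

/* Read a big-endian 16-bit value; return the advanced read pointer. */
static const uint8_t *get_u16(const uint8_t *cp, uint16_t *out)
{
    assert(cp != 0 && out != 0);
    *out = (uint16_t)((cp[0] << 8) | cp[1]);
    return cp + 2;
}

/* Read a big-endian 32-bit value; return the advanced read pointer. */
static const uint8_t *get_u32(const uint8_t *cp, uint32_t *out)
{
    assert(cp != 0 && out != 0);
    *out = ((uint32_t)cp[0] << 24) | ((uint32_t)cp[1] << 16) |
           ((uint32_t)cp[2] << 8)  |  (uint32_t)cp[3];
    return cp + 4;
}

/* Write a 16-bit value in network order; return the advanced write pointer. */
static uint8_t *put_u16(uint8_t *cp, uint16_t v)
{
    assert(cp != 0);
    cp[0] = (uint8_t)(v >> 8);
    cp[1] = (uint8_t)(v & 0xff);
    return cp + 2;
}

/* Write a 32-bit value in network order; return the advanced write pointer. */
static uint8_t *put_u32(uint8_t *cp, uint32_t v)
{
    assert(cp != 0);
    cp[0] = (uint8_t)(v >> 24);
    cp[1] = (uint8_t)((v >> 16) & 0xff);
    cp[2] = (uint8_t)((v >> 8) & 0xff);
    cp[3] = (uint8_t)(v & 0xff);
    return cp + 4;
}
```

Working byte by byte avoids any dependence on host byte order or alignment, which matters because fields in a TCP byte stream are not guaranteed to be aligned.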

BGP module function definitions

function bgp_event

Indicate a BGP FSM state transition caused by a "BGP event". This change of state, the event that caused it, and the previous state and event are stored in the bgpPeer structure. An event is one of BGPEVENT_* (typical events include BGP message receipts, timer expirations and TCP connection status changes). State is one of BGPSTATE_*.
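
The bookkeeping described above amounts to shifting the current state and event into "previous" slots before recording the new ones. A minimal sketch, assuming hypothetical names throughout (struct fsm_record, fsm_record_event, and the ST_* and EV_* enumerators stand in for the real bgpPeer fields and the BGPSTATE_* and BGPEVENT_* values):

```c
#include <assert.h>

/* Hypothetical stand-ins for BGPSTATE_* and BGPEVENT_*. */
enum fsm_state { ST_IDLE, ST_CONNECT, ST_ACTIVE, ST_OPENSENT,
                 ST_OPENCONFIRM, ST_ESTABLISHED };
enum fsm_event { EV_NONE, EV_START, EV_OPEN, EV_RECV_OPEN,
                 EV_RECV_KEEPALIVE, EV_CLOSED };

/* The four values the text says are kept per peer. */
struct fsm_record {
    enum fsm_state state;       /* current state                    */
    enum fsm_event event;       /* event that led to current state  */
    enum fsm_state last_state;  /* state before the last transition */
    enum fsm_event last_event;  /* event before the last one        */
};

/* Record a transition: remember the old pair, then store the new one. */
static void fsm_record_event(struct fsm_record *r,
                             enum fsm_event ev, enum fsm_state next)
{
    assert(r != 0);
    r->last_state = r->state;
    r->last_event = r->event;
    r->state = next;
    r->event = ev;
}
```

Keeping one level of history like this is enough to print "was X after event A, now Y after event B" in trace output.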

function bgp_send

Called when an open/notify/keepalive/update message is to be sent. We get a completely formed BGP packet here and all we need to do is to send it out of the socket. If the send fails because the write would block, we spool the unsent part of the message in the outbuffer associated with the bgpPeer structure and schedule the remainder of the packet to be sent later. Otherwise, we retry three times before giving up. If the message was only partially sent, we repeatedly try to send the rest of the message.
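
The send-or-spool behaviour described above can be sketched as below. Everything here is illustrative: send_or_spool, struct spool, SPOOL_MAX and the write_fn callback are hypothetical names, the writer is abstracted behind a function pointer so that the sketch does not need a real socket, and fake_write is only a stand-in for write(2). Resetting the retry counter after any progress is an assumption, not something the description above states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

#define SPOOL_MAX   4096 /* assumed spool capacity        */
#define MAX_RETRIES 3    /* "retry three times" per text  */

struct spool { unsigned char data[SPOOL_MAX]; size_t len; };

/* Hypothetical writer callback with write(2)-like semantics. */
typedef ssize_t (*write_fn)(void *ctx, const void *buf, size_t len);

/* Returns 0 if fully sent, 1 if the unsent tail was spooled, -1 on failure. */
static int send_or_spool(write_fn wr, void *ctx,
                         const unsigned char *msg, size_t len,
                         struct spool *out)
{
    size_t sent = 0;
    int retries = 0;

    assert(wr != NULL && msg != NULL && out != NULL);
    while (sent < len) {
        ssize_t n = wr(ctx, msg + sent, len - sent);

        if (n > 0) {                 /* partial or full write: keep going */
            sent += (size_t)n;
            retries = 0;             /* assumption: progress resets count */
            continue;
        }
        if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)) {
            size_t rest = len - sent;

            if (rest > SPOOL_MAX - out->len)
                return -1;           /* no room to spool */
            memcpy(out->data + out->len, msg + sent, rest);
            out->len += rest;
            return 1;                /* caller schedules a later flush */
        }
        if (++retries >= MAX_RETRIES)
            return -1;               /* other error: give up */
    }
    return 0;
}

/* Demo stand-in for write(2): accepts `room` bytes in total, then blocks. */
struct fake_sock { size_t room; };

static ssize_t fake_write(void *ctx, const void *buf, size_t len)
{
    struct fake_sock *fs = ctx;
    size_t n;

    (void)buf;
    if (fs->room == 0) {
        errno = EWOULDBLOCK;
        return -1;
    }
    n = len < fs->room ? len : fs->room;
    fs->room -= n;
    return (ssize_t)n;
}
```

A return value of 1 corresponds to the case where the remainder has been queued and the task's write routine must be armed so that the spooled data is flushed once the socket becomes writable.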

function bgp_send_open

Called either when a BGP module initiates an open or when an open message is received from a peer. In the latter case, the caller specifies the remote peer's version. Simply figure out what to put into the different header fields, form a complete packet and hand off to bgp_send().

function bgp_send_keepalive

Send a keepalive message to a peer. As above, form a complete BGP message and hand off to bgp_send().

function bgp_send_notify*

Different flavors of a notification send. Formulate a complete notification message and call bgp_send().

function bgp_pp_notify

Send an error message to a proto-peer. To recall, a protoPeer struct is a state block for those peers which have not yet reached OpenConfirm state (?). This function is called when there is an error in processing an incoming message on such a peer. Before we send this error message, we need to send an open (in case the peer hasn't yet instantiated any state about us). This function makes a best guess about the parameters (e.g. version) of such an open and then sends a notify message.

function bgp_pp_notify_*

Different flavors of the bgp_pp_notify() function for different sized notification data to be sent.

function bgp_recv

This routine reads a message (or part of it) directly from a socket into the peer's buffer. This is the lowest level function in the read chain, and is called from bgp_read_message() and other functions. It first tries to make space in the input buffer for the specified amount of data, and then reads directly from the socket. It complains if a hard error occurred while reading. Funnily enough, in all cases bgp_recv is called with maxread = 0, with the caller adjusting the buffers to make sure enough can be read.

function bgp_read_message

This function is the next higher level up in the read chain. In different states, a BGP peer expects to read different kinds of messages; functions like bgp_recv_open() and bgp_pp_recv() perform these tasks, and they all call bgp_read_message(). This function calls bgp_recv() to read the entire message from the socket and checks that the length of data read is at least that expected for the type of message. Also, a caller may specify up to two types of messages it expects to read: if so, this function checks the type read to see if there is a match.
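
The length and type checks described above might look roughly like this. The sketch assumes the BGP-4 wire format (a 19-byte header with a 16-byte marker, a 2-byte length at offset 16 and a 1-byte type at offset 18, and the BGP-4 minimum message lengths); since GateD also speaks older BGP versions, the real limits may differ. The names check_message, min_len and the MSG_* constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HDR_LEN     19    /* marker(16) + length(2) + type(1) */
#define MSG_MAX_LEN 4096  /* BGP-4 maximum message size       */

enum { MSG_OPEN = 1, MSG_UPDATE = 2, MSG_NOTIFY = 3, MSG_KEEPALIVE = 4 };

/* Minimum legal length per message type (BGP-4 values); 0 = unknown type. */
static size_t min_len(int type)
{
    switch (type) {
    case MSG_OPEN:      return 29;
    case MSG_UPDATE:    return 23;
    case MSG_NOTIFY:    return 21;
    case MSG_KEEPALIVE: return 19;
    default:            return 0;
    }
}

/*
 * Inspect `avail` buffered bytes. Returns the message type when a whole
 * message of an acceptable type is present, 0 when more data is needed,
 * and -1 on a malformed or unexpected message. want1 == 0 means "any
 * type"; want2 == 0 means "only want1".
 */
static int check_message(const uint8_t *buf, size_t avail, int want1, int want2)
{
    size_t len, need;
    int type;

    assert(buf != NULL);
    if (avail < HDR_LEN)
        return 0;
    len = ((size_t)buf[16] << 8) | buf[17];
    type = buf[18];
    need = min_len(type);
    if (need == 0 || len < need || len > MSG_MAX_LEN)
        return -1;
    if (avail < len)
        return 0;
    if (want1 != 0 && type != want1 && type != want2)
        return -1;
    return type;
}
```

A caller in the OpenConfirm state, for example, would pass the keepalive and notification types as the two acceptable values.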

function bgp_get_open

This function parses an open message. It expects the entire message to be in the peer's input buffer. This function is called from two places: bgp_recv_open() and bgp_pp_recv(). It merely checks the version, holdtime, identifier and authentication fields for correctness. It also initiates version negotiation with the peer by sending a notification with the highest version it knows, if the received version number is not known to us.

function bgp_recv_open

This function gets called directly from the task module when a BGP module is in the OpenSent state or the OpenConfirm state. In the former case, it is waiting to receive an Open message from a peer (see bgp_peer_connected(), or bgp_pp_recv()). In the latter, it is waiting to receive a keep alive or a notification (respectively an ack or nack from the peer to the Open message that it sent). Code for these two should be separated out, for cleanliness.

If the peer is in OpenSent state, we read the open message, do the initial sanity checking, then do a version negotiation for the case when we support his version number but we also support a higher version. We also check if his AS number matches the one we were configured with. Finally, we do an authentication check before doing a state transition to OpenConfirm. If the peer is in the OpenConfirm state, simply check that either a notify or a keepalive was received. In either case, check to make sure that there is a version match, else we have to re-initiate version negotiation.

function bgp_pp_recv

This function gets called directly from a task module when a BGP module has gotten a TCP connection from a peer and is waiting for an Open message to arrive. This function first parses the Open message and does the initial sanity checks. It then extracts the bgpPeerGroup structure corresponding to the address, AS number and authentication information in the Open message.

In GateD, a peer may be explicitly configured within a group through a peer clause or implicitly through the allow clause by using a network address/mask pair. In the latter case, we may have a bgpPeerGroup structure for the peer, but no corresponding peer structure. If so, we check if the group is in one of the allowed states and whether the peer's version is correct, initiating version renegotiation otherwise. Else, we establish a bgpPeer structure for the peer. Finally, we send an Open message to the peer and transition our state to OpenConfirm.

function bgp_make_*_name

Functions to generate character-string names for BGP peers and groups, using pre-defined generation rules.

function bgp_peer_alloc

GateD's BGP module keeps a free list of bgpPeer structures in bgp_free_list. This routine allocates an element from this free list; if none is available, it malloc's some appropriately sized memory.

function bgp_peer_free

Adds a bgpPeer struct to the free list.

function bgp_peer_free_all

Free()s all the bgpPeer structs from the free list.
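
The free-list scheme behind bgp_peer_alloc(), bgp_peer_free() and bgp_peer_free_all() can be sketched as follows. The names struct peer, peer_free_list, peer_alloc, peer_release and peer_release_all are hypothetical, and the real bgpPeer structure and task memory allocator are replaced here by a two-field struct and malloc()/free().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-in for bgpPeer. */
struct peer {
    struct peer *next;   /* free-list link */
    int id;
};

static struct peer *peer_free_list;   /* plays the role of bgp_free_list */

/* Take a block from the free list, or fall back to malloc(). */
static struct peer *peer_alloc(void)
{
    struct peer *p = peer_free_list;

    if (p != NULL)
        peer_free_list = p->next;
    else
        p = malloc(sizeof *p);
    if (p != NULL) {
        p->next = NULL;
        p->id = 0;
    }
    return p;
}

/* Return a block to the free list instead of freeing it. */
static void peer_release(struct peer *p)
{
    assert(p != NULL);
    p->next = peer_free_list;
    peer_free_list = p;
}

/* Really free everything on the free list; returns how many were freed. */
static size_t peer_release_all(void)
{
    size_t n = 0;

    while (peer_free_list != NULL) {
        struct peer *p = peer_free_list;

        peer_free_list = p->next;
        free(p);
        n++;
    }
    return n;
}
```

Recycling blocks this way keeps peers that come and go (for example, unconfigured peers admitted through an allow clause) from repeatedly hitting the general-purpose allocator.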

function bgp_peer_add

Add a bgpPeer structure to the list of peers in a bgpPeerGroup. This list contains "normal" peers, followed by "unconfigured" peers (e.g. those which are "allowed" but not explicitly configured), followed by peers which have been deleted but are awaiting delete processing. This function adds the peer to this list according to which of these groups the peer belongs. This function is called: (i) when a peer structure has to be added after receiving an Open from a proto-peer (ii) from the parser when an explicit peer clause is encountered and (iii) from bgp_init() when BGP is restarted after re-configuration.

function bgp_peer_remove

Walk down the peer list in a bgpPeerGroup and remove the entry from the list. Caller frees up memory. This is called either when we are reconfiguring a running BGP or when we are closing down a peer connection.
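
The ordered insertion performed by bgp_peer_add() and the walk-and-unlink performed by bgp_peer_remove() could be sketched like this. The peer classes, struct gpeer, group_peer_add and group_peer_remove are hypothetical names; in particular, the real code may determine a peer's class from flags rather than from a dedicated field, and whether a new peer goes at the end of its own segment is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ordering of the three segments of the group's peer list. */
enum peer_class { PEER_NORMAL = 0, PEER_UNCONFIGURED = 1, PEER_DELETED = 2 };

struct gpeer {
    struct gpeer *next;
    enum peer_class cls;
    int id;
};

/* Insert p at the end of its own class segment. */
static void group_peer_add(struct gpeer **head, struct gpeer *p)
{
    struct gpeer **pp = head;

    assert(head != NULL && p != NULL);
    /* Skip every peer whose class sorts at or before ours. */
    while (*pp != NULL && (*pp)->cls <= p->cls)
        pp = &(*pp)->next;
    p->next = *pp;
    *pp = p;
}

/* Unlink p; the caller frees it. Returns 1 if found, 0 otherwise. */
static int group_peer_remove(struct gpeer **head, struct gpeer *p)
{
    struct gpeer **pp;

    assert(head != NULL && p != NULL);
    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == p) {
            *pp = p->next;
            p->next = NULL;
            return 1;
        }
    }
    return 0;
}
```

Using a pointer-to-pointer cursor lets both routines handle the head of the list without a special case.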

function bgp_group_alloc

function bgp_group_free

function bgp_group_remove

Similar to the corresponding bgp_peer_*() functions, but for bgpPeerGroup structures.

function bgp_group_add

Add the specified bgpPeerGroup structure to the end of the global list of groups bgp_groups. This function is called from bgp_conf_group_add() to link a group declared in the configuration file.

function bgp_buffer_alloc

Allocate an input buffer and link it to a buffer descriptor block (that is, a bgpBuffer structure). Initialize the descriptor block.

function bgp_buffer_free

Free the memory associated with the data buffer. This does not free the descriptor block.

function bgp_buffer_copy

Copy only the buffer descriptor. Make sure that the "to" descriptor does not have an allocated buffer.

function bgp_outbuf_alloc

The bgpOutBuffer is allocated differently from the input buffer. The descriptor is contiguous with, and precedes, the data area. In addition, BGP maintains a single-element cache of freed output buffers. This function checks if there is something in the cache, else calls the task memory allocator. It then initializes the other fields of the descriptor structure.

function bgp_outbuf_free

This fills the cache. If a buffer already exists in the cache, the function zeroes the entire buffer and calls the memory de-allocator.

function bgp_outbuf_free_all

Frees the cache element.
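
The output-buffer layout and single-element cache described for the three bgp_outbuf_*() routines can be sketched as below. The names struct outbuf, outbuf_cache, outbuf_alloc, outbuf_free, outbuf_free_all and OUTBUF_DATA_SIZE are hypothetical, the data size is an assumed constant, and malloc()/free() stand in for the task memory allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define OUTBUF_DATA_SIZE 4096   /* assumed size of the data area */

/* Descriptor; the data area follows it in the same allocation. */
struct outbuf {
    unsigned char *data;   /* points just past this descriptor */
    size_t used;           /* bytes currently spooled          */
};

static struct outbuf *outbuf_cache;   /* at most one freed buffer */

/* Reuse the cached buffer if there is one, else allocate a new block. */
static struct outbuf *outbuf_alloc(void)
{
    struct outbuf *ob = outbuf_cache;

    if (ob != NULL) {
        outbuf_cache = NULL;
    } else {
        ob = malloc(sizeof *ob + OUTBUF_DATA_SIZE);
        if (ob == NULL)
            return NULL;
    }
    ob->data = (unsigned char *)(ob + 1);   /* data is contiguous */
    ob->used = 0;
    return ob;
}

/* Fill the cache; if it is already occupied, zero and really free. */
static void outbuf_free(struct outbuf *ob)
{
    assert(ob != NULL);
    if (outbuf_cache == NULL) {
        outbuf_cache = ob;
        return;
    }
    memset(ob, 0, sizeof *ob + OUTBUF_DATA_SIZE);
    free(ob);
}

/* Drop the cached element, if any. */
static void outbuf_free_all(void)
{
    free(outbuf_cache);
    outbuf_cache = NULL;
}
```

Placing the descriptor and data in one allocation means a single allocator call per buffer, and the one-slot cache covers the common case where a buffer is released and another is needed shortly afterwards.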

function bgp_pp_create

Called from bgp_listen_accept() when we receive a TCP connection request on our well-known socket from a potential peer. This function allocates and initializes a bgpProtoPeer structure, allocates an input buffer for the proto-peer and links it into the list of global proto-peers (bgp_protopeers).

function bgp_pp_delete

Called when any error happens in reading data from a protopeer or when any error is encountered in processing Open messages from the protopeer. The function undoes any pending timers, frees the input buffer associated with a proto-peer and unlinks the bgpProtoPeer data structure from the global list of proto-peers.

function bgp_send_reset

The task structure associated with a peer contains pointers to send/recv/accept etc. functions that should be used for that peer. This function is called to reset the corresponding send routine. Called from bgp_write_ready().

function bgp_set_flash

A protocol's flash routine initiates an immediate transference of routing updates to a specified peer. Its new_policy routine is called whenever BGP reconfigures itself or in the initialization phase. This function sets these two routines.

function bgp_reset_flash

The complement of the previous function.

function bgp_(re)set_reinit

Similar routines to the above, but for the associated reinit functions.

function bgp_recv_*

Set send/receive buffers, and socket options on the associated task.

function bgp_recv_setup

(Re)initialize the socket associated with the peer's task. Set the read routine appropriately. Initialize the buffers and other socket options using the bgp_recv_*() functions above.

function bgp_close_socket

Close the socket associated with the peer's task.

function bgp_iflist_add

Each bgpPeer structure contains the peer's interface. For some groups (e.g. Internal and Test), we need to add this interface to the interface list kept in the group structure. We do this either by creating a new bgp_ifap_list structure, or if one already exists, just upping the reference count. Called when a peer reaches established state.

function bgp_iflist_free

The opposite of the add function. Find the peer's interface in the group list and decrement the reference count. If the count has reached zero, unlink the bgp_ifap_list structure and free the corresponding storage. Called when closing a peer.
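
The reference-counted interface list maintained by the add and free functions above can be sketched as follows. The names struct ifap_entry, iflist_add and iflist_release are hypothetical, and an integer if_id stands in for whatever interface pointer the real bgp_ifap_list entries hold.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bgp_ifap_list element. */
struct ifap_entry {
    struct ifap_entry *next;
    int if_id;           /* stands in for the interface pointer */
    unsigned refcount;   /* peers in the group using it         */
};

/* Add a reference; returns the new count, or -1 if allocation fails. */
static int iflist_add(struct ifap_entry **head, int if_id)
{
    struct ifap_entry *e;

    assert(head != NULL);
    for (e = *head; e != NULL; e = e->next) {
        if (e->if_id == if_id)
            return (int)++e->refcount;   /* already listed: bump count */
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->if_id = if_id;
    e->refcount = 1;
    e->next = *head;
    *head = e;
    return 1;
}

/* Drop a reference; returns the remaining count, or -1 if not listed. */
static int iflist_release(struct ifap_entry **head, int if_id)
{
    struct ifap_entry **pp;

    assert(head != NULL);
    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->if_id == if_id) {
            struct ifap_entry *e = *pp;

            if (--e->refcount > 0)
                return (int)e->refcount;
            *pp = e->next;               /* last user gone: unlink, free */
            free(e);
            return 0;
        }
    }
    return -1;
}
```

In this sketch, iflist_add() would be called when a peer reaches the established state and iflist_release() when the peer is closed.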

function bgp_check_ifap_policy

Called when the system is initializing or if we have reconfigured gated. Simply check if the specified interface is up or down. If up, add the interface to the group list (policy permitting), otherwise delete the interface from the group's list (XXX).

function bgp_write_flush

bgp_send() is used by the protocol processing routines to send Open/Notify etc. messages. If for some reason (e.g. the write would block) these messages are not completely sent synchronously, they are spooled in the task's output buffer for a later send. bgp_write_flush(), called by bgp_write_ready(), does the task of flushing out the spooled data. Its structure is identical to bgp_send(): it attempts to write the entire spooled data, returns immediately on soft errors (e.g. EWOULDBLOCK), and fails on hard errors.

function bgp_force_write

Called when we want to send a notify just before closing a peer's connection. If the peer's outbuffer contains a full message, this function discards it. Else, it tries to bgp_write_flush() the message and discards the message if that fails as well. Can afford to be that cavalier since we will be closing the connection anyway.

function bgp_write_ready

This routine is called from the main task loop if something can be written to the socket. The routine first tries to flush out any existing data in the buffers, then informs the routing table module that it can now write routing updates to the socket (this is so that if some routes are queued up on a peer, they can be flushed). If at the end of that the output buffer is empty, we reset the write routine.

function bgp_write_message

This function is called from bgp_send() and is used to spool an unsent message (or part thereof) into a peer's output buffer. It allocates the buffer space, copies the unsent message and sets the task's write routine so that the write is completed later during execution of the task's main loop.

function bgp_set_write

While the previous routine sets the task write routine to bgp_write_ready() after queueing up data, this sets the task write routine without checking whether data is queued up or not. As far as I can see, in the couple of places it is used, it needn't be...

function bgp_*_connect_timer

To avoid initiating too many connection requests all at once, BGP maintains a 64-slot event wheel. Each slot is 4 seconds long. A connection request for a peer is pseudo-randomly (actually this is a function of the address of the peer structure) set to occur in some slot N on the wheel. BGP tries to schedule the request into the least crowded slot in the interval (N, N+5). These functions deal with setting, resetting or deleting connection timers.
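
The slot selection described above can be sketched as follows. The names connect_slots, pick_slot, release_slot and slot_delay are hypothetical, the seed stands in for the value derived from the peer structure's address, and treating the interval as the five slots N through N+4 (wrapping around the wheel) is an assumption about the exact bounds.

```c
#include <assert.h>

#define NSLOTS      64   /* slots on the wheel                      */
#define SLOT_SECS   4    /* seconds per slot (64 * 4 = 256 s cycle) */
#define SLOT_WINDOW 5    /* assumed: slots N .. N+4 are candidates  */

/* Pending connect()s per slot; plays the role of bgp_connect_slots[]. */
static unsigned connect_slots[NSLOTS];

/* Choose the least crowded slot in the window and reserve it. */
static unsigned pick_slot(unsigned seed)
{
    unsigned base = seed % NSLOTS;
    unsigned best = base;
    unsigned i;

    for (i = 1; i < SLOT_WINDOW; i++) {
        unsigned s = (base + i) % NSLOTS;

        if (connect_slots[s] < connect_slots[best])
            best = s;   /* ties keep the earlier slot */
    }
    connect_slots[best]++;
    return best;
}

/* Give a slot back when the connect timer is reset or deleted. */
static void release_slot(unsigned slot)
{
    assert(slot < NSLOTS && connect_slots[slot] > 0);
    connect_slots[slot]--;
}

/* Seconds from the start of now_slot until the start of slot. */
static unsigned slot_delay(unsigned now_slot, unsigned slot)
{
    assert(now_slot < NSLOTS && slot < NSLOTS);
    return ((slot + NSLOTS - now_slot) % NSLOTS) * SLOT_SECS;
}
```

Because ties keep the earliest candidate, a lightly loaded wheel leaves each peer in its own pseudo-random slot, and the window is only used to spill over when several peers land on the same slot.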

function bgp_peer_connected

This function is invoked after a connection request to a peer succeeds (see below for how connection requests are initiated). This function initializes the local address of the BGP connection, the input buffer and sets up the send and receive buffer sizes on the TCP connection. It then sets up a timer for sending an Open message.

function bgp_connect_complete

This function is called when a connect attempt (by bgp_connect_start()) is delayed (because the socket is non-blocking). In the task main loop, when we can write to the socket, we try to get the remote address. If this is available, then we know we have succeeded and we can proceed to call bgp_peer_connected(). Otherwise, we simply wait to re-connect to the peer.

function bgp_connect_start

This is called when a connect timer expires, or when a connect has failed and we need to retry the connection. This function first allocates a socket with the appropriate options (non-blocking). It then finds out what address to use to bind the socket (this address can be specified in the config file through the config clause or through the interface specified to talk to the particular peer in internal or test groups). If the bind succeeds, the function then attempts a connect, which may return immediately because of the non-blocking nature of the socket. If so, this function sets bgp_connect_complete() to be called at the completion of the connect.

function bgp_connect_timeout

This function is called when the connect timer expires. If a peer has not already connected to us in the interim, this function calls bgp_connect_start(), otherwise it simply resets the connection timer.

function bgp_*_traffic_timer

The traffic timer monitors the state of a connection: at every expiration, we check the state of the connection. The traffic timer is initially set to the holdtime if the connection isn't established; otherwise it is set to the lesser of our holdtime and a third of the peer's. These functions set, delete and initialize the traffic timer.

function bgp_traffic_timeout

This function is called when a traffic timer has expired. After processing, it reschedules itself. The function checks to see if a peer has been silent for more than his hold time. If so, it closes the connection. It also checks to see if we have been silent for longer than a third of our hold time, and sends a keepalive if so.
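
The two checks made on each traffic timer expiration can be sketched as a pure decision function. The names traffic_check and enum traffic_action are hypothetical, times are plain seconds, and treating a hold time of zero as "no hold timer, no keepalives" is an assumption borrowed from the BGP specification rather than something stated above.

```c
#include <assert.h>

enum traffic_action { TRAFFIC_NONE, TRAFFIC_SEND_KEEPALIVE, TRAFFIC_CLOSE };

/*
 * now, last_recv, last_sent: timestamps in seconds.
 * peer_hold: hold time applied to the peer's silence.
 * our_hold:  our hold time; we send a keepalive after a third of it.
 */
static enum traffic_action traffic_check(long now, long last_recv,
                                         long last_sent,
                                         long peer_hold, long our_hold)
{
    assert(now >= last_recv && now >= last_sent);

    /* Peer silent for more than his hold time: close the connection. */
    if (peer_hold > 0 && now - last_recv > peer_hold)
        return TRAFFIC_CLOSE;

    /* We have been silent for more than a third of our hold time. */
    if (our_hold > 0 && now - last_sent > our_hold / 3)
        return TRAFFIC_SEND_KEEPALIVE;

    return TRAFFIC_NONE;
}
```

The close check comes first, so a connection that has already timed out is torn down rather than being sent one more keepalive.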

function bgp_route_timer*

To dampen route fluctuations, BGP may be configured to hold down route advertisements to external peers for a period of time. This one-shot timer is used to send peer updates after the specified hold-down period. These functions set and delete the timer.

function bgp_group_rt_timer*

To dampen route fluctuations, BGP may be configured to hold down route advertisements to non-external peers for a period of time. This one-shot timer is used to send peer updates after the specified hold-down period. These functions set and delete the timer.

function bgp_listen_accept

Called when a connection request is made on the BGP well-known socket. The function performs the accept, creates a task structure for the remote end, initializes the local address and the task structure routines, and creates a protoPeer structure to serve until we get an open from the peer. Finally, it sets a timer on the protoPeer.

function bgp_listen_start

Set up the well-known BGP listen socket (bound to the BGP listen task) to listen for incoming connections. This is called at the expiration of bgp_listen_timer.

function bgp_listen_init

Allocate the bgp_listen_task and set the bgp_listen_timer to go off after a small timeout.

function bgp_listen_stop

Close the BGP listen task.

function bgp_set_peer_if

A peer belonging to a gated external BGP group or internal (i.e. one peering with members on an SMDS type network) BGP group requires an interface pointer. Any other peer belonging to a non-test group can use an interface pointer (in the absence of other information about which interface routes are carried by the IGP). This function, called when peers are initialized, initializes the interface pointers. We simply find the interface with the same address as the peer's (or the next hop towards the peer) and then set the peer's interface to that.

function bgp_find_group

Called when transitioning from a protoPeer to a bgpPeer structure. Given the addresses of the endpoints of a connection to a peer, the AS numbers of the endpoints, and some authentication information, find the group to which the peer should belong. Since peers can be specified either explicitly or through the "allow" clause, first check whether an explicit peer match is found for the specified address. Otherwise, return the group whose allow clause admits the specified peer.

function bgp_find_peer

Given a remote address, find the peer with a matching address. Used in conjunction with the bgp_find_group() function.

function bgp_find_group_by_addr

Needed when an error message is received on a protopeer and we don't know anything about the peer's AS and authentication information. Logic is similar to bgp_find_group().

function bgp_pp_timeout

Called when a proto-peer times out (that is, no open has been received for the timeout period since the peer connected). Delete the proto-peer structure.

function bgp_peer_create

Called either from the parser to create a configured peer, or when an open is received on a proto-peer for a peer that matches an allow clause. Allocate a peer structure, copy the group config information and initialize the gateway entry.

function bgp_new_peer

Called when an open is received on a proto-peer for a peer that matches an allow clause. Unlike configured peers, we need to be able to fill in the appropriate fields of the bgpPeer structure from different places. Create a peer structure, initialize its address and policy information, steal the socket from the protoPeer, delete the protoPeer structure and we are well on our way.

function bgp_use_protopeer

bgp_new_peer() handles a proto-peer whose peer is unconfigured (i.e. admitted through the allow clause); this function is called instead when the peer turns out to be configured. We delete the peer's connect timer, steal the socket from the proto-peer, check if our local interface address is OK, do a state transition and set a traffic timer to monitor the status of the connection.

function bgp_peer_start

In gated, a BGP connection can be configured to be passive (i.e. the module does not initiate a connection, but waits instead for the peer to do so). For non-passive peers, if no start interval is specified, we attempt to connect right away *and* set a timer for connecting later on. Otherwise, we set a connect timer.

function bgp_peer_close

Called when a fatal error is observed in a peer connection (e.g. timeouts occur or open didn't succeed and so on). Causes the session to transition to an Idle state and releases all associated resources (timers, buffers etc.). In some cases, the peer might need to be restarted (e.g. if it is a configured peer, which hasn't been explicitly deleted).

function bgp_peer_established

Called when a keepalive is received on a session in OpenConfirm state. Mainly sends out our initial routes to the session's peer.

function bgp_ifachange

Process an interface change detected during a change in configuration or during startup. The comments preceding this function are descriptive enough.

function bgp_terminate

Called when gated receives a SIGTERM. Clean up all the tasks associated with each peer in each group (be careful to scan the peers repeatedly from the beginning, since a peer's position in the group list may change when the peer is closed). Stop listening on the socket.

function bgp_group_init

Called from bgp_init(). Simply create a task associated with the group.

function bgp_peer_init

During configuration, gated first creates a skeletal peer structure for each peer. Here, called from bgp_init(), we create a task for the peer, set the peer interface, and initialize the peer into the Idle state.

function bgp_peer_delete

This is called when, after a reconfiguration, the peer is no longer configured and we want to completely remove the resources allocated to him (as opposed to just close()-ing him). Set the peer's delete flag and then call bgp_peer_close(). This ensures that the peer's task and associated resources are removed.

function bgp_group_delete

The group analog of the above. Delete each peer in the group, delete the task associated with the group and any other resources allocated.

function bgp_cleanup

Called before the configuration file is re-read. Run through all groups and all peers marking them for deletion. Free up interface lists, policy lists as well.

function bgp_conf_*_alloc

Allocate group and peer structure when the corresponding statements are encountered in the parser.

function bgp_conf_group_add

Given a group structure, do a number of sanity checks to see if the different kinds of groups have the right kinds of configured information. If the group doesn't already exist (e.g. because we are starting anew), then we add the group to the global list. Otherwise, we copy the configuration options from the new structure to the existing structure and delete the structure we are given.

function bgp_conf_peer_add

Similar to the above code in structure. The checks to be done if a deleted peer already exists are slightly different (e.g. we need to check if some data is already queued in the existing peer structure and so on).

function bgp_init

Called when gated is initializing after startup or after a reconfiguration. This is called after the configuration file has been parsed and the group and peer structures have already been created. We also have to be careful to see if in our current incarnation, we have been configured to turn BGP off. Clean up any deleted or unconfigured peers from a previous incarnation. Then start up tasks for each peer and each group that our current incarnation is configured with.