Nalanda

September 23, 2008

Fast Switched Backplane for a Gigabit Switched Router

Filed under: Networks — Tags: , , — Ashwin @ 9:04 am

Citation: N. McKeown, “Fast Switched Backplane for a Gigabit Switched Router,” Business Communications Review.

This paper examines the shifts in a range of technologies needed for faster routers, with a particular focus on using switched, rather than shared, backplanes.

It opens with a useful overview of the different functions in a router: datapath and control. As the author points out, datapath functions form the broad area to investigate for performance enhancement, as these are concerned with every packet that crosses the router. Datapath functions include the forwarding decision, the backplane, and the output link scheduler. Early routers maintained much of this functionality in software, run on a single CPU, with a shared bus. Since then, trends has been to push these functions into hardware, use multiple CPUs to enable parallelism, move function out of CPUs into hardware, and a move from a shared bus to a crossbar switch in the backplane, which allows multiple linecards to communicate simultaneously.

The use of a crossbar switch increases performance in two ways: point-to-point links for physical connections from linecards, allowing for very high speeds, and support for multiple simultaneous transactions, increasing the aggregate bandwidth.

The discussion of fixed v. variable length packets was interesting, and certainly helped clear up some of the questions I had while reading the last paper. Fixed length packets are preferable in the router, since scheduling becomes simple: assign fixed time slots for transferring each fixed length packet across the backplane. However, it still seems as though the output stage will have to maintain some kind of state to re-sequence and reassemble the fixed length packets; it was not clear from the readings how these decisions might affect performance.

In spite of all their advantages, crossbars are still susceptible to several types of blocking: head-of-line (HOL), input, and output blocking. HOL blocking occurs when using FIFO queueing, as the scheduler processes the queue by the packet at the head, which means that packets destined for other output queues will be kept waiting; this can be remedied through the use of VOQs, as detailed in our other reading for today. Input and output blocking are inevitable, as each input and output can process only one packet at a time, leading to unpredictable performance due to random packet delay. These are addressed using prioritization of packets, and speedup of the crossbar switch (although significant amounts of speedup are not practical).

Multicast becomes relatively easy to implement in a crossbar, as it just requires the selective closing of multiple crosspoints at the same time. The notion of fanout is introduced, which is the set of outputs to which an input packet needs to be transmitted. The iSLIP algorithm is presented for scheduling packets across the switch, and extended to the ESLIP algorithm to support both unicast and multicast traffic.

I would have liked to see more discussion of the iSLIP and ESLIP scheduling algorithms; however, I think these two papers worked well together, as I got a good sense of the construction of a router at theoretical and practical levels in different ways from each.

September 22, 2008

Scaling Internet Routers Using Optics

Filed under: Networks — Tags: , , — Ashwin @ 10:01 am

Citation: Keslassy, I., Chuang, S., Yu, K., Miller, D., Horowitz, M., Solgaard, O., and McKeown, N. 2003. Scaling internet routers using optics. In Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols For Computer Communications (Karlsruhe, Germany, August 25 - 29, 2003). SIGCOMM ‘03. ACM, New York, NY, 189-200. DOI= http://doi.acm.org/10.1145/863955.863978

The authors are concerned with the problems faced by the current generation of Internet routers, especially in the face of expected increases in Internet traffic. Network operators’ central offices were already reportedly full at the time this paper was written, requiring that new router designs either reduce their power density, and/or consume less power. Router configurations have shifted from single-rack to multi-rack to reduce power density. However, both these configurations have their own problems: multi-rack systems suffer from unpredictable performance and poor scalability, while single-rack systems use sub-optimal centralized schedulers in practice, which will not scale with increases in ports or line rate.

These concerns limit the capacity of a rack to about 2.5 Tb/s, constrained by power consumption. In this paper, the authors present a novel architecture using optics, which they estimate will be able to scale to 100 Tb/s with practically zero power requirements, and guaranteed throughput.

They do this by extending the Load-Balanced switch architecture described by C-S. Chang et al., to address the problems inherent in this architecture: requirement for a rapidly configuring switch fabric, possibilities of mis-sequenced packets, vulnerability to pathological traffic patterns, and inability to deal with failed linecards.

The Load-Balanced switch architecture breaks a router down into three parts: input and output switching stages, separated by a middle stage of buffers maintaining Virtual Output Queues, which smooth traffic from inputs to outputs. The input stage acts as a load balancer spreading traffic over the VOQs, while the output stage serves each VOQ at a fixed rate. This allows the architecture to function entirely based on local knowledge, with no need for a centralized scheduler, or knowledge of state of all queues, resulting in greater reliability, and a simpler implementation. I had a little difficulty understanding the justifications for splitting input packets into smaller fixed-length packets; while I can understand why this is necessary, I wonder how much it would add to processing requirements. Could it be that this could become a bottleneck?

The problem of packet mis-sequencing is addressed using the Full Ordered Frames First (FOFF) scheme. This scheme processes input queues holding at least N packets (followed by any non-empty queues) in a round-robin fashion, reading at most N packets at a time, and transferring each packet to a different intermediate queue in the middle stage. This approach bounds the amount of mis-sequencing of packets, and corrects what mis-sequencing occurs with a re-sequencing buffer in the output stage. FOFF is resilient to pathological traffic patterns due to the manner in which it spreads flows across the intermediate linecards.

The problem of failed linecards is addressed by splitting input and output linecards into G groups, with M intermediate GxG switches, which each maintain a fixed configuration, rotating the matching of inputs to outputs in sequence. This seems, in essence, to be a hardwired load-balancing scheme; if I understand right, extra capacity needs to be added to the intermediate switches proportional to the maximum number of linecards that need to switched out at any given time.

There were segments of this paper that I had trouble grasping completely, particularly the discussion of hybrid electro-optical switches and optical switches, since I lack almost any familiarity with the EE side of EECS. Regardless, it was really interesting as a way to think through how some of the scheduling algorithms we’ve discussed would be built as hardware implementations.

September 18, 2008

Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism

Filed under: Networks — Tags: , , — Ashwin @ 10:52 am

Citation: Clark, D. D., Shenker, S., and Zhang, L. 1992. Supporting real-time applications in an Integrated Services Packet Network: architecture and mechanism. In Conference Proceedings on Communications Architectures & Protocols (Baltimore, Maryland, United States, August 17 - 20, 1992). D. Oran, Ed. SIGCOMM ‘92. ACM, New York, NY, 14-26. DOI= http://doi.acm.org/10.1145/144179.144199

It was odd reading this paper alongside Shenker’s. Although it is from three years earlier, it seems much more modern, almost prescient in its analysis. Clark et al., propose a unified architecture that provides different kinds of service - predicted and guaranteed - to real-time applications, in addition to supporting traditional datagram applications. In particular, a class of real-time applications called play-back applications is considered, in which data is buffered at the receiver up to a playback point (which can be thought of as a delivery deadline), and then played on the client, to reduce jitter. Assuming that the majority of future real-time applications will be of this nature, the authors use the requirements of play-back applications to drive their architecture: sensitivity to delivery delay, information about estimated delay to set playback points, indifference to delivery so long as it is before the playback point, and limited tolerance to loss.

Analysing real-time applications along two axes - tolerance to brief interruptions and rigidity of delay requirements - the authors show that the dominant classes of such applications would be intolerant/rigid and tolerant/adaptive. Intolerant/rigid applications need guaranteed service with a priori limits on delay bounds, while tolerant/adaptive applications only need predicted service, as they will dynamically adjust their playback points to suit network conditions.The discussion of the packet scheduling algorithm for this architecture is motivated by these considerations, as it points to a need to separately consider the isolation of flows (to minimise effects of one flow on another), and sharing of bandwidth (to allow operation in a unified network).

A counter-intuitive result is that FIFO queueing works better than WFQ to satisfy the sharing requirement. Where WFQ penalizes sources with bursts of packets, FIFO passes them through as a unit, providing better control of delay bounds for real-time applications, but penalizing other flows as a result. Conversely, WFQ provides better isolation than FIFO, as it minimises the effects of bursty sources on other flows. The authors propose a new scheme, FIFO+, which allows sharing of jitter across multiple hops. A new header field maintaining the difference between the delay for a packet, and the average delay for its aggregate class, is maintained by switches, allowing insertion of a packet into the queue to minimize delay.

The unified algorithm that is presented is based on timestamp-based WFQ at the highest level, to serve flows that required guaranteed service, along with a pseudo-flow encapsulating predicted service flows and datagram flows, which is managed under a FIFO+ regime. This pseudo-flow maintains multiple classes of service, shifting jitter from higher classes of predicted service to lower, with datagram flows being at the lowest level, since they are most resistant to jitter. Admission control is used within each class to maintain delay bounds. This strikes a good balance, as guaranteed flows and the different classes of predicted service and datagram flows are isolated from one another, with jitter shared by progressively shifting it down to the lowest class.

The network fixes the clock rate for each guaranteed service flow, with no further checks on behaviour. Predicted service flows must declare the parameters for their service characterization (token bucket rate and size), and request a maximum delay and tolerable loss rate, to allow the network to assign the flow to an appropriate aggregate class, and also to make admission control decisions based on whether or not a source with these declared parameters can be carried. To maintain good shared behaviour, the network enforces conformance to the service characterization for predicted service flows.

I really enjoyed the discussion in this paper, especially the separation of concerns for sharing and isolation. Most of all, I liked the FIFO+ scheme, as it recalls XCP, but at the same time has no requirements of trust on endpoints; the assumption that switches trust one another seems far more reasonable.

Fundamental Design Issues for the Future Internet

Filed under: Networks — Tags: , , — Ashwin @ 9:16 am

Citation: S. Shenker, Fundamental design issues for the future Internet, IEEE J. Selected Areas Comm., 13 (1995)

I found this paper really useful for two reasons. First, Shenker provides a view of what the future Internet might look like from 1995, anticipating widespread use of real-time applications. Second, and more importantly perhaps, he gives us a set of abstractions by which to judge network architectures based on different the demands imposed by elastic and real-time applications.

The applications in common use on the Internet at the time were “elastic”, in the sense that they were tolerant of delays in transmission, with design decisions that were motivated by the packet-switched nature of the Internet. It is, however, much harder to meet the demands of real-time applications, such as voice and video streams, using a packet-switched network, as they are much less tolerant to delay and loss.

Shenker demonstrates that a unified network infrastructure would provide the highest utility. To allow different kinds of applications to coexist, he considers overprovisioning, and modifications of the basic Internet architecture to provide a variety of service models beyond just best-effort service, which may also depend on admission control. The general intuition at work here is that aligning services with application requirements results in better overall network utilization.

Two models for assigning service to applications are presented: an implicit model where the network attempts to guess an application’s requirements, and an explicit model, where an application can declare its requirements to the network. The implicit model can be introduced with no changes to current service models, but may not be able to service applications that it does not know about, or variations in an application. The explicit model, on the other hand, requires a new service model (and therefore changes to the Internet architecture), stable service offerings, and also raises interesting questions of incentives: how can applications be incentivized to seek the levels of service they actually require, rather than always seeking the highest available level? Shenker suggests that they only way to address this concern is through variable pricing for different classes of service, though raising the concern that such variable pricing might change the nature of the gift economy prevalent on the Internet.

The most interesting section of this paper, for me, was the discussion of utility curves for different classes of applications: elastic, hard real-time, delay-adaptive and rate-adaptive. The shifts between concave and convex shapes, and the inflection points at which these occur, provide a compelling framework for thinking about the design of different classes of service in a unified network.

September 15, 2008

Congestion Control for High Bandwidth-Delay Product Networks

Filed under: Networks — Tags: , , , — Ashwin @ 11:22 am

Citation: Katabi, D., Handley, M., and Rohrs, C. 2002. Congestion control for high bandwidth-delay product networks. SIGCOMM Comput. Commun. Rev. 32, 4 (Oct. 2002), 89-102. DOI= http://doi.acm.org/10.1145/964725.633035

Following a clean slate approach to network design, the authors propose the new eXplicit Transport Protocol, XCP. Where prior approaches attempt to build to TCP’s behaviour (e.g., signal congestion by dropping packets), XCP starts from scratch, addressing TCP’s shortcomings head-on. The principal issues with TCP that XCP attempts to remedy are its unstable and oscillatory behaviour as the delay-bandwidth product of a network increases, the inefficiency of its additive increase policy (a large number of RTTs may be required to achieve full utilization), and unfairness as flows with different RTTs contend at the same bottleneck.

XCP uses explicit notification of congestion, with an indication of the actual amount of congestion in the network, rather than just a single bit. The authors come to this approach from the conclusion that dropping a packet is a bad way to signal congestion: since the network’s objective is to get packets through, this should only be used as a last resort. Instead, a congestion header is maintained on every XCP packet, which contains the sender’s cwnd and estimated RTT, as well as a feedback field which is modified by routers as the packet flows through the network. XCP routers modify the feedback field based on their available bandwidth. XCP receivers copy these headers into their ACKs, allowing XCP senders to change their cwnd rapidly based on feedback, rather than just one packet per RTT. This approach allows efficient implementation in the routers, as no per-flow state needs to be maintained, just adjustments to feedback based on aggregate flows.

The other major innovation in XCP is the separation of utilization control from fairness control. The algorithms for both these estimate values over the average RTT, leading to responsive feedback. Utilization control determines by how much overall bandwidth should change, by computing feedback proportional to spare bandwidth, hence following a Multiplicative-Increase-Multiplicative-Decrease policy. The fairness control algorithm handles allocation of the spare bandwidth, which is done, like TCP, through an Additive-Increase-Multiplicative-Decrease policy: when feedback is positive, an equal increase is allocated to all flows, and when it is negative, the decrease is proportional to each flow’s throughput. In addition, XCP attempts to prevent stalling of convergence when efficiency is close to optimal through “bandwidth shuffling”, which forces AIMD on at least 10% of traffic; while this introduces some disturbance, it is a trade-off that helps more rapid convergence.

Most interestingly, the parameters required for XCP’s operation are tuned independent of network topology, and number of flows. With fixed parameters, it is shown that XCP has practically zero packet loss, fair allocation, good utilization, and reasonable queue size in comparison to several other protocols.

While XCP would certainly be of benefit on a controlled network, it is an open question as to how well it might function in the environment of the Internet where hosts (and even routers) cannot be trusted to conform to the protocol. From their discussion of implementing policing agents (which may themselves become bottlenecks, as they have to maintain per-flow state) at the edges of a network, it seems apparent that the authors are aware of this issue. I would have liked to see more discussion of this point, as it seems central to XCP’s viability. More generally, I wonder whether it is even possible to design a protocol that functions to XCP’s level of efficiency in the absence of trust.

Random Early Detection Gateways for Congestion Avoidance

Filed under: Networks — Tags: , , — Ashwin @ 9:58 am

Citation: Floyd, S. and Jacobson, V. 1993. Random early detection gateways for congestion avoidance. IEEE/ACM Trans. Netw. 1, 4 (Aug. 1993), 397-413. DOI= http://dx.doi.org/10.1109/90.251892

Floyd and Jacobson’s work on RED gateways is motivated by the observation that TCP is designed to detect congestion only after a packet is dropped at a gateway, which would typically result in large, mostly full, queues at gateways, contributing to increased average delay in the network. While RED’s congestion notification mechanisms (which are referred to as “marking”) are similar to those available in TCP (drop a packet) and explicit notification as proposed by Chiu and Jain (1989), RED detects congestion differently. Instead of reporting congestion when it occurs (i.e., queues approaching full), RED computes average queue size, and probabilistically “marks” packets when the the average queue size exceeds a a certain threshold value.

This approach should allow a RED gateway to maintain smaller queues on average, while also responding well to bursty connections, especially TCP’s Slow Start phase. Additionally, the probabilistic marking of packets ensures avoidance of global synchronization, where multiple TCP connections respond to congestion notification by concurrently reverting to Slow Start.

Two distinct algorithms are used to satisfy RED’s design goals, one for congestion detection, and another for deciding which connections to notify when congestion is detected. These function differently across three different ranges of operation, depending on the average queue size, computed as an exponential weighted moving average, with a low-pass filter. When average queue size is under a minimum threshold, all packets are passed through. When it is over a maximum threshold, all packets are marked. In between these two thresholds, packets are marked with a probability proportional to the share of bandwidth that a connection is consuming. I found the discussion of approaches to randomization in order to avoid clustering of marked packets (and hence global synchronization problems) particularly interesting.

RED makes two major assumptions: that FIFO scheduling will continue to be useful and prevalent, and that the host transport protocols are responsive to congestion notifications. From discussions in class, it would seem that the former assumption is no longer an issue with the current generation of routers. The latter assumption, that of cooperative transport protocols, is explicitly stated in the paper initially, but also refuted later, when the authors state that one of their design goals is to maintain proper function even in the presence of uncooperative transport protocols. In my reading, it seems to me an open question as to how well RED would function in the presence of uncooperative transport protocols.

An interesting characteristic of RED that I would have liked to see more discussion of is the fact that it does not attempt to be fair. Rather, it allows connections to take more than their fair share of bandwidth for brief periods, as long as average queue sizes are under the maximum threshold. While this is dependent on tuning the thresholds and weight, it seems a sensible approach - get as much through as possible - which is, after all, the function of the network.

September 11, 2008

Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks

Filed under: Networks — Tags: , , — Ashwin @ 7:00 am

Citation: Stoica, I., Shenker, S., and Zhang, H. 1998. Core-stateless fair queueing: achieving approximately fair bandwidth allocations in high speed networks. SIGCOMM Comput. Commun. Rev. 28, 4 (Oct. 1998), 118-130. DOI= http://doi.acm.org/10.1145/285243.285273

This paper presents Core-Stateless Fair Queueing (CSFQ) an approach to implementing fairness in queueing, but without the expensive per-flow state that algorithms such as Fair Queueing require. Rather, the authors take advantage of network topology, implementing a distributed mechanism which can assure a reasonable approximation of fairness within an island of routers. To do this, they distinguish between edge routers and core routers in the island, having each perform different roles. Edge routers maintain per-flow state and insert per-flow information into packet headers, while core routers use this information in combination with aggregate knowledge of overall traffic flows to make probabilistic decisions about whether or not to drop packets. This approach recalls Chiu and Jain’s paper, as it seems to be a relaxation of the distributedness criterion that they introduced, which allows a separation of concerns between edge and core routers, and but also therefore requires some knowledge of network topology.

An initial prototype of an algorithm for estimation of the rates of individual flows, and the fair share of bandwidth on the output link, is presented: first as a bit-wise fluid model, and then generalised to handle packets by incorporating packet lengths. This algorithm is provided as a starting point; as the authors say, the overall intent of the paper is to present the concept of this new architecture. Using this algorithm, packets entering the island are labelled with their flow rates by the edge routers before they are forwarded on to the core routers. Particularly interesting was the discussion of how any estimation algorithm can be exploited, and how this one is vulnerable in the short run to flows attempting to gain more than their fair share of bandwidth, but is robust over longer time scales.

Inherent to this architecture is an assumption of trust within the island of routers. The authors suggest that islands would typically be restricted to the bounds of an ISP’s network, but given a degree of trust between ISPs, could extend across the bounds of cooperating ISPs’ networks. This does beg the question, though, of managing the complexity under such conditions; as we saw in our discussion of BGP, both the potential for misconfiguration as well as market conditions may well militate against such cooperation.

CSFQ is compared against a variety of other queueing algorithms - including FIFO, RED, FRED and DRR - in simulations to evaluate its efficacy. Of these, FRED and DRR attempt to provide a degree of fairness, while FIFO and RED do not. As the authors point out, CSFQ strikes a balance between these algorithms: edge routers have a complexity comparable to FRED, while core routers are more comparable to RED. The results indicate that CSFQ is closer in fairness to DRR, while its performs similarly to, and even outperforms, FRED in some cases.

The paper concludes with a discussion of why the authors believe fair bandwidth allocations are of overriding importance. They point out that end-to-end congestion control, such as that used in TCP, while successful, is based on the assumptions that flows are cooperative, and that the congestion control algorithms implemented by flows exhibit homogeneous behaviour. The authors’ assumption in writing this paper is that mechanisms are required to address the case where flows do not honour these assumptions, which they term the “unfriendly flow problem”. In their evaluation of approaches to identifying unfriendly flows, they conclude that the allocation approach, which they adopt, is the only currently (c. 1998) viable one. It would be interesting to revisit their analysis of fire-hose flows in today’s context of real-time VOIP and live video streams.

I enjoyed reading this paper for its analysis of the tradeoffs between engineered and ideal solutions, i.e., how some knowledge of network topology can help build a more realizable solution. I especially liked it as part of the sequence that we’ve already read, as it helps provides a unified picture of a particular trajectory of research; an insight into the process of how old research problems are addressed, and new ones uncovered.

September 10, 2008

Analysis and Simulation of a Fair Queueing Algorithm

Filed under: Networks — Tags: , , — Ashwin @ 9:34 pm

Citation: Demers, A., Keshav, S., and Shenker, S. 1989. Analysis and simulation of a fair queueing algorithm. In Symposium Proceedings on Communications Architectures & Protocols (Austin, Texas, United States, September 25 - 27, 1989). SIGCOMM ‘89. ACM, New York, NY, 1-12. DOI= http://doi.acm.org/10.1145/75246.75248

This paper presents and evaluates a technique for managing congestion control at gateways, rather than at sources. The paper’s focus is on queueing algorithms, since it is these which determine how packets from different sources interact with one another, and therefore contribute directly towards the “collective behaviour” of the collection of sources and gateways in a network.

The paper describes the fair queueing (FQ) algorithm, comparing it to the first-come-first-served (FCFS) queueing algorithm. FCFS pushes out packets on a first-come-first-served basis regardless of packet length, or the rate of packet ingress, resulting in greedy sources being favoured. FQ provides an elegant remedy to the problem of packet length: estimate the time at which a packet will finish transmission based on its arrival time and packet length, and dispatch packets with earlier estimated finish times. This allows for ordering of packets by time (like FCFS), while simultaneously favouring small packets over large. Additionally, FQ includes a mechanism to provide more promptness to sources using less than their fair share of bandwidth.

Through a series of simulations, the authors illustrate the differences in behaviour between FCFS and FQ gateways. These simulations illustrate two principal problems when using FCFS: unfair allocation of bandwidth (greedy sources are favoured), and a penalty applied to sources implementing smarter flow control mechanisms (i.e., that described by Jacobson and Karels). FQ, on the other hand, maintains fair allocation even under congested conditions, penalizing bandwidth hogs, and favours smart flow control sources over generic sources.

There may be one condition under which FQ will fail. What if a source is sending a number of small packets at a very high rate? For instance, a DoS SYN flood. From my reading of this paper, it seems as though FQ may fail to identify this kind of source as being badly behaved.

I found this paper really useful, especially as it addressed some of the questions that I raised with respect to other papers we’ve been reading: how well would these congestion avoidance algorithms function in the presence of badly-behaved sources? The FQ algorithm addresses this problem head-on, penalizing badly-behaved sources at the gateway, and favouring well-behaved ones. I was particularly intrigued by their reference to Nagle’s game theoretic work on how gateways might be redesigned to encourage good source behaviour.

September 8, 2008

Congestion Avoidance and Control

Filed under: Networks — Tags: , , — Ashwin @ 8:42 pm

Citation: Jacobson, V. and Karels, M.J. 1988. Congestion avoidance and control. In Symposium Proceedings on Communications Architectures and Protocols (Stanford, California, United States, August 16 - 18, 1988). V. Cerf, Ed. SIGCOMM ‘88. ACM, New York, NY, 314-329. DOI= http://doi.acm.org/10.1145/52324.52356

This is a fascinating paper for its examination of the practicalities of protocol implementation. As the authors point out, there can be flaws in seemingly ‘obvious’ implementation choices, even while these implementations conform to protocol specifications. The paper starts from the observation that even while the principle of ‘conservation of packets’ would imply a robust system, the Internet of the time was not robust under congestion. The authors proceed to examine the three distinct problems that could cause packet conservation to fail.

The first problem - that of connections not reaching equilibrium - is addressed using a ’slow start’ mechanism, whereby the minimum of the receiver’s advertised window and a locally maintained congestion window is sent. For each ack, the congestion window is increased by one packet, which effectively results in a fairly rapid start, as the congestion window increases exponentially. This approach prevents an initial flood of packets, which may in practice overwhelm intermediate low bandwidth links.

While this approach would make sense for a network with limited routes, I’m not clear as to how well this approach would function when packets are routed along different paths. I may just be missing something here, but if, for instance, there were two distinct paths, each with with distinct load characteristics, it seems possible that the congestion window would never reach a stable state.

The second problem that the authors address is that of senders injecting new packets before old packets exit. As they point out, given a correct protocol implementation, this can only be due to a failure in the sender’s retransmit timer, which was all too common due to incorrect anticipation of variances in round trip time (RTT) under load, and fixed values for constants used in estimation of RTT. The authors provide an alternate method for estimating RTT variation, which provides much more stable performance under load.

Finally, the authors address their third problem: equilibrium not reachable due to resource limits along the path. They do this by putting the algorithms from the last paper into practice, but using a signalling mechanism that derives from network behaviour under congestion - timeout of packets - rather than by setting a new bit in the packet headers. I find this approach particularly compelling since it provides gateways with the means to penalize badly behaved clients - drop their packets - rather than just suggesting a backoff using the message header flag that Chiu and Jain proposed.

It was interesting reading these two papers together, with references from this paper to earlier papers by Jain and Chiu; I get a sense of these ideas as being developed as the result of a complex back-and-forth between theoretical and applied computer science.

Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

Filed under: Networks — Tags: , , — Ashwin @ 12:03 pm

Citation: Chiu, D. and Jain, R. 1989. Analysis of the increase and decrease algorithms for congestion avoidance in computer networks. Comput. Netw. ISDN Syst. 17, 1 (Jun. 1989), 1-14. DOI= http://dx.doi.org/10.1016/0169-7552(89)90019-6

In this paper, Chiu and Jain present a decentralised mechanism for congestion avoidance in networks, by signaling overload/underload status from network resources to clients using a binary scheme: overload=1, underload=0.The choice of a single-bit binary representation simultaneously simplifies the implementation in network resources, and allows easy embedding of this data into existing protocols.

An essential component of any decentralised mechanism of this sort is the algorithm chosen for clients to respond to overload/underload messages from network resources, to maintain each individual client’s load at levels that are optimal for the system as a whole. The authors analyse a variety of linear controls for this purpose, and later show how this analysis can be extended to non-linear controls. The criteria for evaluation are efficiency (how close a resource’s load is to its optimal load), fairness (how well resources are distributed across clients), distributedness (an assumption that no client has awareness of any data beyond the minimum communicated by resources), and convergence (how quickly the distributed system moves from initial state to the goal state). The system does not converge to a stable steady state, as the feedback is only binary; rather, it oscillates around an equilibrium, making responsiveness (the time to reach equilibrium) and smoothness (the size of oscillations) critical parameters for evaluating convergence.

A variety of linear algorithms are evaluated, choosing either additive or multiplicative algorithms for increase and decrease of client load. The authors show that an additive increase, and multiplicative decrease, of load by clients produces a system optimised for the chosen criteria. Intuitively speaking, an additive increase cautiously places more load on the system, while a multiplicative decrease rapidly pulls back when load is too high.

In a discussion of non-linear algorithms, it is shown how they offer far greater flexibility for directing the system towards fairness. However, they also make the choice of the right scaling factors more difficult, and make the system as a whole much more sensitive to these factors, thereby reducing the robustness of control.

Finally, the authors put forward a number of practical considerations, in a sense looking at human factors in the design of these algorithms. As they point out, the algorithm needs to be, as far as possible, independent of hardware and software scales or parameters, as in any real complex system, human administrators will have to configure these values. Following the discussion of non-linear algorithms, it seems apparent that linear controls are more realistic choices in practice.

While these algorithms would certainly play out well under the controlled environment of the ARPANET, I am curious to know how resilient they are when operating at the scale of the Internet. I also wonder how robust they are in the face of badly behaved clients; how greedy does a client need to be and/or what proportion of clients need to behave badly in order to destabilise networks at local and/or global levels?

This was a really interesting paper, both for the insights it offers into the process of protocol design, and also to understand the thinking of designer in integrating the practical considerations of complex deployed systems and human factors. It was also a fascinating examination of how a simple binary scheme at the level of individual messages can capture and manage a wide variety of system behaviours.

Older Posts »

Powered by WordPress