Nalanda

September 15, 2008

Congestion Control for High Bandwidth-Delay Product Networks

Filed under: Networks — Tags: , , , — Ashwin @ 11:22 am

Citation: Katabi, D., Handley, M., and Rohrs, C. 2002. Congestion control for high bandwidth-delay product networks. SIGCOMM Comput. Commun. Rev. 32, 4 (Oct. 2002), 89-102. DOI= http://doi.acm.org/10.1145/964725.633035

Following a clean slate approach to network design, the authors propose the new eXplicit Transport Protocol, XCP. Where prior approaches attempt to build to TCP’s behaviour (e.g., signal congestion by dropping packets), XCP starts from scratch, addressing TCP’s shortcomings head-on. The principal issues with TCP that XCP attempts to remedy are its unstable and oscillatory behaviour as the delay-bandwidth product of a network increases, the inefficiency of its additive increase policy (a large number of RTTs may be required to achieve full utilization), and unfairness as flows with different RTTs contend at the same bottleneck.

XCP uses explicit notification of congestion, with an indication of the actual amount of congestion in the network, rather than just a single bit. The authors come to this approach from the conclusion that dropping a packet is a bad way to signal congestion: since the network’s objective is to get packets through, this should only be used as a last resort. Instead, a congestion header is maintained on every XCP packet, which contains the sender’s cwnd and estimated RTT, as well as a feedback field which is modified by routers as the packet flows through the network. XCP routers modify the feedback field based on their available bandwidth. XCP receivers copy these headers into their ACKs, allowing XCP senders to change their cwnd rapidly based on feedback, rather than just one packet per RTT. This approach allows efficient implementation in the routers, as no per-flow state needs to be maintained, just adjustments to feedback based on aggregate flows.

The other major innovation in XCP is the separation of utilization control from fairness control. The algorithms for both these estimate values over the average RTT, leading to responsive feedback. Utilization control determines by how much overall bandwidth should change, by computing feedback proportional to spare bandwidth, hence following a Multiplicative-Increase-Multiplicative-Decrease policy. The fairness control algorithm handles allocation of the spare bandwidth, which is done, like TCP, through an Additive-Increase-Multiplicative-Decrease policy: when feedback is positive, an equal increase is allocated to all flows, and when it is negative, the decrease is proportional to each flow’s throughput. In addition, XCP attempts to prevent stalling of convergence when efficiency is close to optimal through “bandwidth shuffling”, which forces AIMD on at least 10% of traffic; while this introduces some disturbance, it is a trade-off that helps more rapid convergence.

Most interestingly, the parameters required for XCP’s operation are tuned independent of network topology, and number of flows. With fixed parameters, it is shown that XCP has practically zero packet loss, fair allocation, good utilization, and reasonable queue size in comparison to several other protocols.

While XCP would certainly be of benefit on a controlled network, it is an open question as to how well it might function in the environment of the Internet where hosts (and even routers) cannot be trusted to conform to the protocol. From their discussion of implementing policing agents (which may themselves become bottlenecks, as they have to maintain per-flow state) at the edges of a network, it seems apparent that the authors are aware of this issue. I would have liked to see more discussion of this point, as it seems central to XCP’s viability. More generally, I wonder whether it is even possible to design a protocol that functions to XCP’s level of efficiency in the absence of trust.

Random Early Detection Gateways for Congestion Avoidance

Filed under: Networks — Tags: , , — Ashwin @ 9:58 am

Citation: Floyd, S. and Jacobson, V. 1993. Random early detection gateways for congestion avoidance. IEEE/ACM Trans. Netw. 1, 4 (Aug. 1993), 397-413. DOI= http://dx.doi.org/10.1109/90.251892

Floyd and Jacobson’s work on RED gateways is motivated by the observation that TCP is designed to detect congestion only after a packet is dropped at a gateway, which would typically result in large, mostly full, queues at gateways, contributing to increased average delay in the network. While RED’s congestion notification mechanisms (which are referred to as “marking”) are similar to those available in TCP (drop a packet) and explicit notification as proposed by Chiu and Jain (1989), RED detects congestion differently. Instead of reporting congestion when it occurs (i.e., queues approaching full), RED computes average queue size, and probabilistically “marks” packets when the the average queue size exceeds a a certain threshold value.

This approach should allow a RED gateway to maintain smaller queues on average, while also responding well to bursty connections, especially TCP’s Slow Start phase. Additionally, the probabilistic marking of packets ensures avoidance of global synchronization, where multiple TCP connections respond to congestion notification by concurrently reverting to Slow Start.

Two distinct algorithms are used to satisfy RED’s design goals, one for congestion detection, and another for deciding which connections to notify when congestion is detected. These function differently across three different ranges of operation, depending on the average queue size, computed as an exponential weighted moving average, with a low-pass filter. When average queue size is under a minimum threshold, all packets are passed through. When it is over a maximum threshold, all packets are marked. In between these two thresholds, packets are marked with a probability proportional to the share of bandwidth that a connection is consuming. I found the discussion of approaches to randomization in order to avoid clustering of marked packets (and hence global synchronization problems) particularly interesting.

RED makes two major assumptions: that FIFO scheduling will continue to be useful and prevalent, and that the host transport protocols are responsive to congestion notifications. From discussions in class, it would seem that the former assumption is no longer an issue with the current generation of routers. The latter assumption, that of cooperative transport protocols, is explicitly stated in the paper initially, but also refuted later, when the authors state that one of their design goals is to maintain proper function even in the presence of uncooperative transport protocols. In my reading, it seems to me an open question as to how well RED would function in the presence of uncooperative transport protocols.

An interesting characteristic of RED that I would have liked to see more discussion of is the fact that it does not attempt to be fair. Rather, it allows connections to take more than their fair share of bandwidth for brief periods, as long as average queue sizes are under the maximum threshold. While this is dependent on tuning the thresholds and weight, it seems a sensible approach - get as much through as possible - which is, after all, the function of the network.

September 8, 2008

Congestion Avoidance and Control

Filed under: Networks — Tags: , , — Ashwin @ 8:42 pm

Citation: Jacobson, V. and Karels, M.J. 1988. Congestion avoidance and control. In Symposium Proceedings on Communications Architectures and Protocols (Stanford, California, United States, August 16 - 18, 1988). V. Cerf, Ed. SIGCOMM ‘88. ACM, New York, NY, 314-329. DOI= http://doi.acm.org/10.1145/52324.52356

This is a fascinating paper for its examination of the practicalities of protocol implementation. As the authors point out, there can be flaws in seemingly ‘obvious’ implementation choices, even while these implementations conform to protocol specifications. The paper starts from the observation that even while the principle of ‘conservation of packets’ would imply a robust system, the Internet of the time was not robust under congestion. The authors proceed to examine the three distinct problems that could cause packet conservation to fail.

The first problem - that of connections not reaching equilibrium - is addressed using a ’slow start’ mechanism, whereby the minimum of the receiver’s advertised window and a locally maintained congestion window is sent. For each ack, the congestion window is increased by one packet, which effectively results in a fairly rapid start, as the congestion window increases exponentially. This approach prevents an initial flood of packets, which may in practice overwhelm intermediate low bandwidth links.

While this approach would make sense for a network with limited routes, I’m not clear as to how well this approach would function when packets are routed along different paths. I may just be missing something here, but if, for instance, there were two distinct paths, each with with distinct load characteristics, it seems possible that the congestion window would never reach a stable state.

The second problem that the authors address is that of senders injecting new packets before old packets exit. As they point out, given a correct protocol implementation, this can only be due to a failure in the sender’s retransmit timer, which was all too common due to incorrect anticipation of variances in round trip time (RTT) under load, and fixed values for constants used in estimation of RTT. The authors provide an alternate method for estimating RTT variation, which provides much more stable performance under load.

Finally, the authors address their third problem: equilibrium not reachable due to resource limits along the path. They do this by putting the algorithms from the last paper into practice, but using a signalling mechanism that derives from network behaviour under congestion - timeout of packets - rather than by setting a new bit in the packet headers. I find this approach particularly compelling since it provides gateways with the means to penalize badly behaved clients - drop their packets - rather than just suggesting a backoff using the message header flag that Chiu and Jain proposed.

It was interesting reading these two papers together, with references from this paper to earlier papers by Jain and Chiu; I get a sense of these ideas as being developed as the result of a complex back-and-forth between theoretical and applied computer science.

Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

Filed under: Networks — Tags: , , — Ashwin @ 12:03 pm

Citation: Chiu, D. and Jain, R. 1989. Analysis of the increase and decrease algorithms for congestion avoidance in computer networks. Comput. Netw. ISDN Syst. 17, 1 (Jun. 1989), 1-14. DOI= http://dx.doi.org/10.1016/0169-7552(89)90019-6

In this paper, Chiu and Jain present a decentralised mechanism for congestion avoidance in networks, by signaling overload/underload status from network resources to clients using a binary scheme: overload=1, underload=0.The choice of a single-bit binary representation simultaneously simplifies the implementation in network resources, and allows easy embedding of this data into existing protocols.

An essential component of any decentralised mechanism of this sort is the algorithm chosen for clients to respond to overload/underload messages from network resources, to maintain each individual client’s load at levels that are optimal for the system as a whole. The authors analyse a variety of linear controls for this purpose, and later show how this analysis can be extended to non-linear controls. The criteria for evaluation are efficiency (how close a resource’s load is to its optimal load), fairness (how well resources are distributed across clients), distributedness (an assumption that no client has awareness of any data beyond the minimum communicated by resources), and convergence (how quickly the distributed system moves from initial state to the goal state). The system does not converge to a stable steady state, as the feedback is only binary; rather, it oscillates around an equilibrium, making responsiveness (the time to reach equilibrium) and smoothness (the size of oscillations) critical parameters for evaluating convergence.

A variety of linear algorithms are evaluated, choosing either additive or multiplicative algorithms for increase and decrease of client load. The authors show that an additive increase, and multiplicative decrease, of load by clients produces a system optimised for the chosen criteria. Intuitively speaking, an additive increase cautiously places more load on the system, while a multiplicative decrease rapidly pulls back when load is too high.

In a discussion of non-linear algorithms, it is shown how they offer far greater flexibility for directing the system towards fairness. However, they also make the choice of the right scaling factors more difficult, and make the system as a whole much more sensitive to these factors, thereby reducing the robustness of control.

Finally, the authors put forward a number of practical considerations, in a sense looking at human factors in the design of these algorithms. As they point out, the algorithm needs to be, as far as possible, independent of hardware and software scales or parameters, as in any real complex system, human administrators will have to configure these values. Following the discussion of non-linear algorithms, it seems apparent that linear controls are more realistic choices in practice.

While these algorithms would certainly play out well under the controlled environment of the ARPANET, I am curious to know how resilient they are when operating at the scale of the Internet. I also wonder how robust they are in the face of badly behaved clients; how greedy does a client need to be and/or what proportion of clients need to behave badly in order to destabilise networks at local and/or global levels?

This was a really interesting paper, both for the insights it offers into the process of protocol design, and also to understand the thinking of designer in integrating the practical considerations of complex deployed systems and human factors. It was also a fascinating examination of how a simple binary scheme at the level of individual messages can capture and manage a wide variety of system behaviours.

Powered by WordPress