Nalanda

September 15, 2008

Congestion Control for High Bandwidth-Delay Product Networks

Filed under: Networks — Tags: , , , — Ashwin @ 11:22 am

Citation: Katabi, D., Handley, M., and Rohrs, C. 2002. Congestion control for high bandwidth-delay product networks. SIGCOMM Comput. Commun. Rev. 32, 4 (Oct. 2002), 89-102. DOI= http://doi.acm.org/10.1145/964725.633035

Following a clean slate approach to network design, the authors propose the new eXplicit Transport Protocol, XCP. Where prior approaches attempt to build to TCP’s behaviour (e.g., signal congestion by dropping packets), XCP starts from scratch, addressing TCP’s shortcomings head-on. The principal issues with TCP that XCP attempts to remedy are its unstable and oscillatory behaviour as the delay-bandwidth product of a network increases, the inefficiency of its additive increase policy (a large number of RTTs may be required to achieve full utilization), and unfairness as flows with different RTTs contend at the same bottleneck.

XCP uses explicit notification of congestion, with an indication of the actual amount of congestion in the network, rather than just a single bit. The authors come to this approach from the conclusion that dropping a packet is a bad way to signal congestion: since the network’s objective is to get packets through, this should only be used as a last resort. Instead, a congestion header is maintained on every XCP packet, which contains the sender’s cwnd and estimated RTT, as well as a feedback field which is modified by routers as the packet flows through the network. XCP routers modify the feedback field based on their available bandwidth. XCP receivers copy these headers into their ACKs, allowing XCP senders to change their cwnd rapidly based on feedback, rather than just one packet per RTT. This approach allows efficient implementation in the routers, as no per-flow state needs to be maintained, just adjustments to feedback based on aggregate flows.

The other major innovation in XCP is the separation of utilization control from fairness control. The algorithms for both these estimate values over the average RTT, leading to responsive feedback. Utilization control determines by how much overall bandwidth should change, by computing feedback proportional to spare bandwidth, hence following a Multiplicative-Increase-Multiplicative-Decrease policy. The fairness control algorithm handles allocation of the spare bandwidth, which is done, like TCP, through an Additive-Increase-Multiplicative-Decrease policy: when feedback is positive, an equal increase is allocated to all flows, and when it is negative, the decrease is proportional to each flow’s throughput. In addition, XCP attempts to prevent stalling of convergence when efficiency is close to optimal through “bandwidth shuffling”, which forces AIMD on at least 10% of traffic; while this introduces some disturbance, it is a trade-off that helps more rapid convergence.

Most interestingly, the parameters required for XCP’s operation are tuned independent of network topology, and number of flows. With fixed parameters, it is shown that XCP has practically zero packet loss, fair allocation, good utilization, and reasonable queue size in comparison to several other protocols.

While XCP would certainly be of benefit on a controlled network, it is an open question as to how well it might function in the environment of the Internet where hosts (and even routers) cannot be trusted to conform to the protocol. From their discussion of implementing policing agents (which may themselves become bottlenecks, as they have to maintain per-flow state) at the edges of a network, it seems apparent that the authors are aware of this issue. I would have liked to see more discussion of this point, as it seems central to XCP’s viability. More generally, I wonder whether it is even possible to design a protocol that functions to XCP’s level of efficiency in the absence of trust.

September 11, 2008

Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks

Filed under: Networks — Tags: , , — Ashwin @ 7:00 am

Citation: Stoica, I., Shenker, S., and Zhang, H. 1998. Core-stateless fair queueing: achieving approximately fair bandwidth allocations in high speed networks. SIGCOMM Comput. Commun. Rev. 28, 4 (Oct. 1998), 118-130. DOI= http://doi.acm.org/10.1145/285243.285273

This paper presents Core-Stateless Fair Queueing (CSFQ) an approach to implementing fairness in queueing, but without the expensive per-flow state that algorithms such as Fair Queueing require. Rather, the authors take advantage of network topology, implementing a distributed mechanism which can assure a reasonable approximation of fairness within an island of routers. To do this, they distinguish between edge routers and core routers in the island, having each perform different roles. Edge routers maintain per-flow state and insert per-flow information into packet headers, while core routers use this information in combination with aggregate knowledge of overall traffic flows to make probabilistic decisions about whether or not to drop packets. This approach recalls Chiu and Jain’s paper, as it seems to be a relaxation of the distributedness criterion that they introduced, which allows a separation of concerns between edge and core routers, and but also therefore requires some knowledge of network topology.

An initial prototype of an algorithm for estimation of the rates of individual flows, and the fair share of bandwidth on the output link, is presented: first as a bit-wise fluid model, and then generalised to handle packets by incorporating packet lengths. This algorithm is provided as a starting point; as the authors say, the overall intent of the paper is to present the concept of this new architecture. Using this algorithm, packets entering the island are labelled with their flow rates by the edge routers before they are forwarded on to the core routers. Particularly interesting was the discussion of how any estimation algorithm can be exploited, and how this one is vulnerable in the short run to flows attempting to gain more than their fair share of bandwidth, but is robust over longer time scales.

Inherent to this architecture is an assumption of trust within the island of routers. The authors suggest that islands would typically be restricted to the bounds of an ISP’s network, but given a degree of trust between ISPs, could extend across the bounds of cooperating ISPs’ networks. This does beg the question, though, of managing the complexity under such conditions; as we saw in our discussion of BGP, both the potential for misconfiguration as well as market conditions may well militate against such cooperation.

CSFQ is compared against a variety of other queueing algorithms - including FIFO, RED, FRED and DRR - in simulations to evaluate its efficacy. Of these, FRED and DRR attempt to provide a degree of fairness, while FIFO and RED do not. As the authors point out, CSFQ strikes a balance between these algorithms: edge routers have a complexity comparable to FRED, while core routers are more comparable to RED. The results indicate that CSFQ is closer in fairness to DRR, while its performs similarly to, and even outperforms, FRED in some cases.

The paper concludes with a discussion of why the authors believe fair bandwidth allocations are of overriding importance. They point out that end-to-end congestion control, such as that used in TCP, while successful, is based on the assumptions that flows are cooperative, and that the congestion control algorithms implemented by flows exhibit homogeneous behaviour. The authors’ assumption in writing this paper is that mechanisms are required to address the case where flows do not honour these assumptions, which they term the “unfriendly flow problem”. In their evaluation of approaches to identifying unfriendly flows, they conclude that the allocation approach, which they adopt, is the only currently (c. 1998) viable one. It would be interesting to revisit their analysis of fire-hose flows in today’s context of real-time VOIP and live video streams.

I enjoyed reading this paper for its analysis of the tradeoffs between engineered and ideal solutions, i.e., how some knowledge of network topology can help build a more realizable solution. I especially liked it as part of the sequence that we’ve already read, as it helps provides a unified picture of a particular trajectory of research; an insight into the process of how old research problems are addressed, and new ones uncovered.

September 10, 2008

Analysis and Simulation of a Fair Queueing Algorithm

Filed under: Networks — Tags: , , — Ashwin @ 9:34 pm

Citation: Demers, A., Keshav, S., and Shenker, S. 1989. Analysis and simulation of a fair queueing algorithm. In Symposium Proceedings on Communications Architectures & Protocols (Austin, Texas, United States, September 25 - 27, 1989). SIGCOMM ‘89. ACM, New York, NY, 1-12. DOI= http://doi.acm.org/10.1145/75246.75248

This paper presents and evaluates a technique for managing congestion control at gateways, rather than at sources. The paper’s focus is on queueing algorithms, since it is these which determine how packets from different sources interact with one another, and therefore contribute directly towards the “collective behaviour” of the collection of sources and gateways in a network.

The paper describes the fair queueing (FQ) algorithm, comparing it to the first-come-first-served (FCFS) queueing algorithm. FCFS pushes out packets on a first-come-first-served basis regardless of packet length, or the rate of packet ingress, resulting in greedy sources being favoured. FQ provides an elegant remedy to the problem of packet length: estimate the time at which a packet will finish transmission based on its arrival time and packet length, and dispatch packets with earlier estimated finish times. This allows for ordering of packets by time (like FCFS), while simultaneously favouring small packets over large. Additionally, FQ includes a mechanism to provide more promptness to sources using less than their fair share of bandwidth.

Through a series of simulations, the authors illustrate the differences in behaviour between FCFS and FQ gateways. These simulations illustrate two principal problems when using FCFS: unfair allocation of bandwidth (greedy sources are favoured), and a penalty applied to sources implementing smarter flow control mechanisms (i.e., that described by Jacobson and Karels). FQ, on the other hand, maintains fair allocation even under congested conditions, penalizing bandwidth hogs, and favours smart flow control sources over generic sources.

There may be one condition under which FQ will fail. What if a source is sending a number of small packets at a very high rate? For instance, a DoS SYN flood. From my reading of this paper, it seems as though FQ may fail to identify this kind of source as being badly behaved.

I found this paper really useful, especially as it addressed some of the questions that I raised with respect to other papers we’ve been reading: how well would these congestion avoidance algorithms function in the presence of badly-behaved sources? The FQ algorithm addresses this problem head-on, penalizing badly-behaved sources at the gateway, and favouring well-behaved ones. I was particularly intrigued by their reference to Nagle’s game theoretic work on how gateways might be redesigned to encourage good source behaviour.

September 8, 2008

Congestion Avoidance and Control

Filed under: Networks — Tags: , , — Ashwin @ 8:42 pm

Citation: Jacobson, V. and Karels, M.J. 1988. Congestion avoidance and control. In Symposium Proceedings on Communications Architectures and Protocols (Stanford, California, United States, August 16 - 18, 1988). V. Cerf, Ed. SIGCOMM ‘88. ACM, New York, NY, 314-329. DOI= http://doi.acm.org/10.1145/52324.52356

This is a fascinating paper for its examination of the practicalities of protocol implementation. As the authors point out, there can be flaws in seemingly ‘obvious’ implementation choices, even while these implementations conform to protocol specifications. The paper starts from the observation that even while the principle of ‘conservation of packets’ would imply a robust system, the Internet of the time was not robust under congestion. The authors proceed to examine the three distinct problems that could cause packet conservation to fail.

The first problem - that of connections not reaching equilibrium - is addressed using a ’slow start’ mechanism, whereby the minimum of the receiver’s advertised window and a locally maintained congestion window is sent. For each ack, the congestion window is increased by one packet, which effectively results in a fairly rapid start, as the congestion window increases exponentially. This approach prevents an initial flood of packets, which may in practice overwhelm intermediate low bandwidth links.

While this approach would make sense for a network with limited routes, I’m not clear as to how well this approach would function when packets are routed along different paths. I may just be missing something here, but if, for instance, there were two distinct paths, each with with distinct load characteristics, it seems possible that the congestion window would never reach a stable state.

The second problem that the authors address is that of senders injecting new packets before old packets exit. As they point out, given a correct protocol implementation, this can only be due to a failure in the sender’s retransmit timer, which was all too common due to incorrect anticipation of variances in round trip time (RTT) under load, and fixed values for constants used in estimation of RTT. The authors provide an alternate method for estimating RTT variation, which provides much more stable performance under load.

Finally, the authors address their third problem: equilibrium not reachable due to resource limits along the path. They do this by putting the algorithms from the last paper into practice, but using a signalling mechanism that derives from network behaviour under congestion - timeout of packets - rather than by setting a new bit in the packet headers. I find this approach particularly compelling since it provides gateways with the means to penalize badly behaved clients - drop their packets - rather than just suggesting a backoff using the message header flag that Chiu and Jain proposed.

It was interesting reading these two papers together, with references from this paper to earlier papers by Jain and Chiu; I get a sense of these ideas as being developed as the result of a complex back-and-forth between theoretical and applied computer science.

Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

Filed under: Networks — Tags: , , — Ashwin @ 12:03 pm

Citation: Chiu, D. and Jain, R. 1989. Analysis of the increase and decrease algorithms for congestion avoidance in computer networks. Comput. Netw. ISDN Syst. 17, 1 (Jun. 1989), 1-14. DOI= http://dx.doi.org/10.1016/0169-7552(89)90019-6

In this paper, Chiu and Jain present a decentralised mechanism for congestion avoidance in networks, by signaling overload/underload status from network resources to clients using a binary scheme: overload=1, underload=0.The choice of a single-bit binary representation simultaneously simplifies the implementation in network resources, and allows easy embedding of this data into existing protocols.

An essential component of any decentralised mechanism of this sort is the algorithm chosen for clients to respond to overload/underload messages from network resources, to maintain each individual client’s load at levels that are optimal for the system as a whole. The authors analyse a variety of linear controls for this purpose, and later show how this analysis can be extended to non-linear controls. The criteria for evaluation are efficiency (how close a resource’s load is to its optimal load), fairness (how well resources are distributed across clients), distributedness (an assumption that no client has awareness of any data beyond the minimum communicated by resources), and convergence (how quickly the distributed system moves from initial state to the goal state). The system does not converge to a stable steady state, as the feedback is only binary; rather, it oscillates around an equilibrium, making responsiveness (the time to reach equilibrium) and smoothness (the size of oscillations) critical parameters for evaluating convergence.

A variety of linear algorithms are evaluated, choosing either additive or multiplicative algorithms for increase and decrease of client load. The authors show that an additive increase, and multiplicative decrease, of load by clients produces a system optimised for the chosen criteria. Intuitively speaking, an additive increase cautiously places more load on the system, while a multiplicative decrease rapidly pulls back when load is too high.

In a discussion of non-linear algorithms, it is shown how they offer far greater flexibility for directing the system towards fairness. However, they also make the choice of the right scaling factors more difficult, and make the system as a whole much more sensitive to these factors, thereby reducing the robustness of control.

Finally, the authors put forward a number of practical considerations, in a sense looking at human factors in the design of these algorithms. As they point out, the algorithm needs to be, as far as possible, independent of hardware and software scales or parameters, as in any real complex system, human administrators will have to configure these values. Following the discussion of non-linear algorithms, it seems apparent that linear controls are more realistic choices in practice.

While these algorithms would certainly play out well under the controlled environment of the ARPANET, I am curious to know how resilient they are when operating at the scale of the Internet. I also wonder how robust they are in the face of badly behaved clients; how greedy does a client need to be and/or what proportion of clients need to behave badly in order to destabilise networks at local and/or global levels?

This was a really interesting paper, both for the insights it offers into the process of protocol design, and also to understand the thinking of designer in integrating the practical considerations of complex deployed systems and human factors. It was also a fascinating examination of how a simple binary scheme at the level of individual messages can capture and manage a wide variety of system behaviours.

September 3, 2008

On Inferring Autonomous System Relationships in the Internet

Filed under: Networks — Tags: , , , — Ashwin @ 9:22 pm

Citation: Gao, L. 2001. On inferring autonomous system relationships in the internet. IEEE/ACM Trans. Netw. 9, 6 (Dec. 2001), 733-745. DOI= http://dx.doi.org/10.1109/90.974527

The relationships between autonomous systems on the Internet can be near impossible to deduce accurately, as they are, in essence, a series of bilateral commercial agreements. In this paper, Gao presents a technique to infer autonomous system relationships by looking for patterns in the ASN paths in BGP routing tables.

To do this, Gao starts with some general assumptions of rational behaviour for autonomous systems. From these assumptions, she derives the Selective Export Rule, which defines these behaviours as mechanisms by which autonomous systems selectively export routes to their peers, customers and siblings. She then derives patterns that we might expect to see in BGP routing tables, based on the Selective Export Rule: Valley-Free, Downhill Path, Uphill Path, Maximal Downhill Path and Maximal Uphill Path.

Gao combines these with the idea that if we consider the graph of all autonomous systems, those with higher degree are more likely to be larger (and hence more likely to peer) than those with a smaller degree. The result is a set of algorithms, which when applied to publicly available BGP routing tables, and compared to internal AT&T routing tables, provide surprisingly high rates of accuracy in their ability to predict different kinds of relationships. While the accuracy rates vary greatly - from as high as 100% for customer relationships to to 25% for sibling relationships - there is still immense value to be had from these figures alone.

Additionally, as a side-effect of this research, heuristics were developed whereby misconfigured routers could be identified, which would no doubt be of great benefit to network administrators tasked with managing large BGP deployments.

I like this paper, a lot, both for its content, and its presentation. Gao is clear about the assumptions she makes, and the manner in which she builds on them. She is also clear about the limitations she sees to the data that was available for analysis. I found the actual meat of the paper, the heuristic mechanism for identifying different kinds of relationships, incredibly useful, and exciting.

Interdomain Internet Routing

Filed under: Networks — Tags: , , , — Ashwin @ 11:53 am

Citation: Lecture notes from Hari Balakrishnan and Nick Feamster.

In these notes, the authors first present an idealised view of internetworking, and then go on to demonstrate how commercial realities actually shape network topologies on the Internet, through the application of network policies expressed and enforced using the mechanisms made available by the Border Gateway Protocol.

The authors discuss the economics of transit v. peering arrangements, demonstrating how economic incentives drive network operators to prioritise the import of routes: first from customers, then from peers, and finally from transit providers. This discussion is compelling, but I would have liked to see more about how this is actually playing out. For instance, I have read that bandwidth scarcity is principally a problem at the edges of the Internet; the backbones, on the other hand, have excess capacity, and continue to upgrade in anticipation of demand. This would seem to imply variances in the patterns of transit/peering arrangements at least across the core and edges of the network, and probably in more complex ways across Tier 1, Tier 2 and Tier 3 networks operating in the same geography.

BGP is described in some detail, with a discussion of techniques for managing eBGP and iBGP deployments. It continues to surprise and fascinate me how much of the TCP/IP suite is based on an assumption of good behaviour across the different administrations comprising an internet, especially in a protocol such as BGP that is clearly intended for the political end of allowing different administrations to co-exist, rather than any technical purpose.

For instance, a topic that is not discussed in these notes is that of transitive trust; I wonder this came to be built into BGP? I can understand why this was so in ARPANET and NSFNET, but it does seem strange that this strategy was adopted for a protocol intended to “allow cooperation under competitive circumstances”, and that this persisted even after NSFNET was privatised, especially since BGP-4 was standardised in 1995, just after this transition.

September 1, 2008

The Design Philosophy of the DARPA Internet Protocols

Filed under: Networks — Tags: , , — Ashwin @ 6:33 pm

Citation: Clark, D. 1988. The design philosophy of the DARPA internet protocols. In Symposium Proceedings on Communications Architectures and Protocols (Stanford, California, United States, August 16 - 18, 1988). V. Cerf, Ed. SIGCOMM ‘88. ACM, New York, NY, 106-114. DOI= http://doi.acm.org/10.1145/52324.52336

The intent of this paper appears to be to communicate the design philosophy of the Internet protocols to at least three distinct audiences: implementors of the Internet protocols, architects creating new protocols and/or extending the Internet protocols, and the ISO committee standardising the OSI protocol stack. Clark starts with an outline of the goals of the Internet architecture, and then goes to analyse how a particular ordering of importance of these goals gave rise to a particular design for the Internet architecture.

The fundamental goal of the Internet architecture was to provide the means to interconnect disparate networks to provide services spanning these utilities. An additional priority was the issue of managing the interconnection of networks under separate administrations. Store and forward packet switching was chosen as the solution to this goal, since it was well understood, and already in use in the ARPANET networks initially interconnected.

Several second level goal are listed, with survivability, support for multiple types of service, and support for a variety of networks being the most important. The least important goal listed is accountability, which, as Clark notes, would likely be the most important goal for a commercial network.

Clark introduces the mechanism of “fate-sharing” as the means for survivability, whereby hosts on a network are responsible for maintaining state information for ongoing conversations. He also relates how TCP was split into TCP and IP in order to provide IP as a more fundamental building block from which applications could build general purpose services. As he notes later, the IP datagram also functions as the “minimum network service assumption”, which has made it easier for wide variety of networks to interconnect using the Internet protocols. These decisions seem to reflect an end-to-end approach to design, showing how function can be decomposed and placed to create a more flexible architecture.

In specific advice to protocol implementors, Clark relates how much more difficult it can be provide guidelines for building adequate performance, against the relatively easier task of correct protocol implementation.

Finally, Clark discusses the choice of flow control based on byte number, rather than packet number, in TCP. He argues that this choice provides efficiency both for connecting to networks with small packet sizes, and for unifying messages for retransmission when necessary.

Clark suggests early in the paper that packet switching was chosen in large part as it was the technique employed on the original networks being interconnected. What if a circuit switched mechanism was chosen instead? What would the Internet architecture look like under that circumstance? Is packet switching an inevitable choice? For instance, Donald Davies came to the idea of a packet switched, rather than circuit switched, network in the early 1960s to enable the bursty communications that commercial time sharing services require, while Paul Baran used the same principles to design a network for survivability. On the other hand, one could imagine that modern commercial networks, with their relatively stable and reliable topologies, might prefer the use of virtual circuits.

As with the last reading, this paper is invaluable for the insights it offers into the thinking of the architects of the Internet, providing a historical perspective with which we can understand protocol design decisions.

« Newer Posts

Powered by WordPress