Congestion Control for High Bandwidth-Delay Product Networks
Citation: Katabi, D., Handley, M., and Rohrs, C. 2002. Congestion control for high bandwidth-delay product networks. SIGCOMM Comput. Commun. Rev. 32, 4 (Oct. 2002), 89-102. DOI= http://doi.acm.org/10.1145/964725.633035
Following a clean slate approach to network design, the authors propose the new eXplicit Transport Protocol, XCP. Where prior approaches attempt to build to TCP’s behaviour (e.g., signal congestion by dropping packets), XCP starts from scratch, addressing TCP’s shortcomings head-on. The principal issues with TCP that XCP attempts to remedy are its unstable and oscillatory behaviour as the delay-bandwidth product of a network increases, the inefficiency of its additive increase policy (a large number of RTTs may be required to achieve full utilization), and unfairness as flows with different RTTs contend at the same bottleneck.
XCP uses explicit notification of congestion, with an indication of the actual amount of congestion in the network, rather than just a single bit. The authors come to this approach from the conclusion that dropping a packet is a bad way to signal congestion: since the network’s objective is to get packets through, this should only be used as a last resort. Instead, a congestion header is maintained on every XCP packet, which contains the sender’s cwnd and estimated RTT, as well as a feedback field which is modified by routers as the packet flows through the network. XCP routers modify the feedback field based on their available bandwidth. XCP receivers copy these headers into their ACKs, allowing XCP senders to change their cwnd rapidly based on feedback, rather than just one packet per RTT. This approach allows efficient implementation in the routers, as no per-flow state needs to be maintained, just adjustments to feedback based on aggregate flows.
The other major innovation in XCP is the separation of utilization control from fairness control. The algorithms for both these estimate values over the average RTT, leading to responsive feedback. Utilization control determines by how much overall bandwidth should change, by computing feedback proportional to spare bandwidth, hence following a Multiplicative-Increase-Multiplicative-Decrease policy. The fairness control algorithm handles allocation of the spare bandwidth, which is done, like TCP, through an Additive-Increase-Multiplicative-Decrease policy: when feedback is positive, an equal increase is allocated to all flows, and when it is negative, the decrease is proportional to each flow’s throughput. In addition, XCP attempts to prevent stalling of convergence when efficiency is close to optimal through “bandwidth shuffling”, which forces AIMD on at least 10% of traffic; while this introduces some disturbance, it is a trade-off that helps more rapid convergence.
Most interestingly, the parameters required for XCP’s operation are tuned independent of network topology, and number of flows. With fixed parameters, it is shown that XCP has practically zero packet loss, fair allocation, good utilization, and reasonable queue size in comparison to several other protocols.
While XCP would certainly be of benefit on a controlled network, it is an open question as to how well it might function in the environment of the Internet where hosts (and even routers) cannot be trusted to conform to the protocol. From their discussion of implementing policing agents (which may themselves become bottlenecks, as they have to maintain per-flow state) at the edges of a network, it seems apparent that the authors are aware of this issue. I would have liked to see more discussion of this point, as it seems central to XCP’s viability. More generally, I wonder whether it is even possible to design a protocol that functions to XCP’s level of efficiency in the absence of trust.