Nalanda

November 18, 2008

Data-oriented networking and DTNs

Filed under: Networks — Tags: , , — Ashwin @ 1:27 pm

Tolia, N., Kaminsky, M., Andersen, D. G., and Patil, S. 2006. An architecture for internet data transfer. In Proceedings of the 3rd Conference on Networked Systems Design & Implementation - Volume 3 (San Jose, CA, May 08 - 10, 2006). USENIX Association, Berkeley, CA, 19-19.

This paper presents one very simple, and powerful idea: that data transfer is a common activity across many classes of protocols, and so may be offered as a common service. Protocols may perform content negotiation as necessary, and delegate data transfer to the DOT service. This allows for reuse and innovation at the data transfer layer, with no need for protocols to reinvent the wheel, as it were.

The measurements of performance for the different DOT plugins was interesting, but not compelling; especially that for the USB device. All that this really shows us is that DOT achieves its goal of supporting multiple transports. The most interesting metric, for me, was that a single graduate student was able to add DOT support to Postfix in less than a week, which speaks volumes for the simplicity of the protocol. I especially like the attention to adoption, that DOT functions as an optional protocol extension, rather than a requirement.

Fall, K. 2003. A delay-tolerant network architecture for challenged internets. In Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols For Computer Communications (Karlsruhe, Germany, August 25 - 29, 2003). SIGCOMM ‘03. ACM, New York, NY, 27-34. DOI=http://doi.acm.org/10.1145/863955.863960

The discussion of the assumptions that TCP/IP makes of the link layer (end-to-end paths, low RTT and drop rate) and of the characteristics of “challenged” networks, were alone worth the price of admission for this paper. To list a few of the latter: high latencies, predictable interruption, asymmetric data rates, non-existence of end-to-end routing paths, lack of continuous power or adequate memory.

I particularly liked the point that Fall makes in justifying this work, that there may exist conventional networks with Internet connectivity only available via a challenged network. The use of challenged networks for transit capabilities could provide significant benefits.

I thought the initial presentation was interesting, the use of an asynchronous transfer mechanism, but as the paper goes on, it seems like it’s trying to do too much. I can see that DTN could be useful to create guidelines for the design of protocol gateways. I got the most out of this paper in reading it as a thought piece on protocol design.

However, it sounds trying to get DTN functioning as a viable end-to-end overlay on the Internet sounds like it would be more trouble than it’s worth. As we’ve seen in our readings, it’s hard enough to get one set of protocols working, and harder yet to bridge disparate environments; I’m not convinced that the One True Protocol solution is workable.

November 13, 2008

Internet Measurement

Filed under: Networks — Tags: , , — Ashwin @ 2:12 pm

Paxson, V. 1997. End-to-end Internet packet dynamics. SIGCOMM Comput. Commun. Rev. 27, 4 (Oct. 1997), 139-152. DOI=http://doi.acm.org/10.1145/263109.263155

Paxson measures TCP transfers between 35 Internet sites to gather his data, with particular attention to dynamics of both sender and receiver, unlike many prior studies which only study sender behaviour. In doing so, he is able to study the end-to-end asymmetry in paths for individual transfers, and also deduce interesting characteristics of routers and links along these paths.

He also importantly draws our attention to several ways in which the effects of TCP may be disentangled from those of the network. One of the most interesting to me, for instance, was a reminder to consider that loss rate measurement for ACKs is very different from that for data packets; TCP adjusts data packet rates in response to loss, which illustrates how we could end up measuring an effect of TCP, rather than of the underlying network path.

Another interesting consideration was in the use of packet pairs for measurement. As the network evolves, packets may not traverse the same end-to-end link: they may be switched along different paths at a router. To deal with this problem, Paxson introduces the notion of “packet bunch modes”, where packets are sent in bunches of varying sizes to account for the different paths that they may traverse.

This was really deep and interesting paper that will take several more reads before I manage to absorb it completely.

R. Fonseca, G. Porter, R. H. Katz, S. Shenker, and I. Stoica. 2007. X-trace: A pervasive network tracing framework. In 4th USENIX Symposium on Networked Systems Design & Implementation, pp. 271-284. [Online]. Available: http://www.usenix.org/events/nsdi07/tech/fonseca.html

X-Trace provides the means to trace network requests across different services and layers of a network, by associating each request with a unique identifier. Protocols need to be modified, if required, to handle the extra X-Trace metadata.

Network elements need to support only two simple primitives - pushDown() and pushNext() - which are used to update request metadata to indicate it being pushed down a layer (e.g., HTTP to TCP), or across a layer (e.g., web browser to web server). Network elements also need to log reports based on this metadata to a persistent store.

Using the stored metadata, a tree may be reconstructed to show the sequence of requests, across, as well as up and down, network layers. I could imagine that this could potentially be extended to also capture logs within an application.

I like this approach for its simplicity, as it requires very little for instrumentation, and is easy to support in protocols that already allow for extension (e.g., HTTP headers). The challenge, of course, as the authors point out, would be to drive adoption to the point where it is sufficiently widespread that X-Trace reports could become useful across multiple administrative domains.

November 6, 2008

Middleboxes and Indirection

Filed under: Networks — Tags: , , — Ashwin @ 1:05 pm

Citations:

Walfish, M., Stribling, J., Krohn, M., Balakrishnan, H., Morris, R., and Shenker, S. 2004. Middleboxes no longer considered harmful. In Proceedings of the 6th Conference on Symposium on Opearting Systems Design & Implementation - Volume 6 (San Francisco, CA, December 06 - 08, 2004). Operating Systems Design and Implementation. USENIX Association, Berkeley, CA, 15-15

Stoica, I., Adkins, D., Zhuang, S., Shenker, S., and Surana, S. 2002. Internet indirection infrastructure. SIGCOMM Comput. Commun. Rev. 32, 4 (Oct. 2002), 73-86. DOI=http://doi.acm.org/10.1145/964725.63303

Both these papers build on our recent readings to explore new addressing and routing mechanisms that would allow easier deployment of existing network elements (such as middleboxes), and enable a wide variety of modern network services. The approaches outlined in these papers - Delegation Oriented Architecture (DOA) and Internet Indirection Infrastructure (i3) - have some features in common: a reliance on a new kind of host identifier to be embedded in packets, the use of DHTs as address lookup services, and implementation as an overlay network to provide the necessary function and at the same time allow them to coexist with the current Internet architecture.The principal difference between these approaches is perhaps in their use of identifiers: where DOA maps identifiers to individual hosts, i3 uses identifiers to map to potentially multiple hosts.

The authors of DOA make the point that although reviled, middleboxes provide essential network services; we should therefore ask how they might be better integrated into the Internet architecture without violating the fundamental tenets that every Internet entity has a unique network identifier and that network elements should not not process packets that are not addressed to them. DOA packets may specify a sequence of hosts to which the packet must be routed, allowing middleboxes to exist at locations other than the topological “middle”. With this abstraction in place, packets are source routed across the specified sequence of hosts before reaching the receiver. Users may adaptively select new services that they wish to process their packets, and outsource them accordingly.

i3 requires receivers to register “triggers” against common identifiers, indicating that packets to an identifiers should be forwarded to them. A mobile host may update its triggers to reflect its new network location, with no need for senders to be aware of this change. Multicast groups may be formed by multiple hosts registering triggers for the same identifier. The most interesting application of i3, to me, was for anycast, where the hosts in the anycast group maintain triggers with identical prefixes; senders form identifiers with a matching prefix, and the request is routed to the trigger with the longest matching prefix. Such services may provide load balancing by having senders generate random identifier suffixes, or location awareness by having receivers and senders encode their location into the identifier suffix. As with DOA, i3 provides a means by which packets may be source routed - an “identifier stack” - to enable service composition.

Each of these proposals is put forward to address fundamental mismatches between current network architecture, and the uses to which this architecture is being put. DOA deals with the problem of middleboxes, and how these might be better integrated into the network architecture, while i3 deals with the problem of network services which do not follow the point-to-point abstraction of Internet routing (multicast, anycast and mobility).

November 4, 2008

DNS Scalability

Filed under: Networks — Tags: , , — Ashwin @ 1:58 pm

Citations:

Mockapetris, P. and Dunlap, K. J. 1988. Development of the domain name system. SIGCOMM Comput. Commun. Rev. 18, 4 (Aug. 1988), 123-133. DOI=http://doi.acm.org/10.1145/52325.52338

Jung, J., Sit, E., Balakrishnan, H., and Morris, R. 2001. DNS performance and the effectiveness of caching. In Proceedings of the 1st ACM SIGCOMM Workshop on internet Measurement (San Francisco, California, USA, November 01 - 02, 2001). IMW ‘01. ACM, New York, NY, 153-167. DOI=http://doi.acm.org/10.1145/505202.505223

These two papers provide interesting views into the process of design of DNS, and of actual measurements of those design choices.

DNS was originally created to address problems with the HOSTS.TXT approach to maintaining host name to IP address mappings: the technical problem that the file was growing too large with increasing numbers of hosts, and the administrative problem that it was becoming difficult to maintain a single master file as the number of independent administrations on the network increased. To address these problems, a hierarchical system was created, with different servers accounting for different sections of the DNS tree. Administrations could maintain authoritative servers for their own domains, while higher level servers, leading up to the root servers, could provide pointers to tie the different administrations together.

It was interesting to see how many of the problems mentioned in the first paper were social, rather than technical: the difficulty of introducing new classes (which seems to me unrelated to DNS itself), the incentives for transitioning HOSTS.TXT-based to DNS, explicit requirements for redundancy in all DNS servers (which seems unachievable at Internet-scale), following sample settings word-for-word.

The first paper presents caching as one of the successes of DNS, in the context of the “unexpectedly bad performance” of the Internet of the time. Without caching, the DNS system may have been doomed to failure, making wider adoption more difficult. The second paper calls caching into question, specifically with reference to the predominant web traffic on the networks sampled.

As the second paper observes, most wide area traffic is web traffic, where a cluster of requests to a server are preceded by a single lookup. Since the distribution of names is Zipf-like, the usefulness of caching for A records is limited. The evaluation in this paper suggests that small groups of clients, of the order of 25 or so, provide the benefits of caching close to providing a common cache across all clients. However, since HTTP 1.1 is now fairly widely deployed, it seems likely that there the correspondence of TCP to DNS flows will be closer to 1:1, which begs the question whether caching of A records provides any utility today.

The authors also show that caching of NS records provides important performance benefits, as it reduces the number of referrals per lookup, and also reduces the load on the root and gTLD servers. This make intuitive sense; I would imagine the hierarchy of DNS servers to be relatively stable, which would lend itself well to caching with large TTL values.

October 28, 2008

Resilient Overlay Networks

Filed under: Networks — Tags: , , , — Ashwin @ 11:11 am

Citation: Andersen, D., Balakrishnan, H., Kaashoek, F., and Morris, R. 2001. Resilient overlay networks. SIGOPS Oper. Syst. Rev. 35, 5 (Dec. 2001), 131-145. DOI=http://doi.acm.org/10.1145/502059.502048

In order to operate at the scale of the Internet, the Internet’s current interdomain routing protocol, BGP-4, aggregates routes, but at the same time hides details of topology and traffic conditions, which leads to time intervals of the order of tens of minutes (or even hours) for BGP’s recovery mechanisms to converge on a consistent routing table. This paper presents the Resilient Overlay Network (RON) as a remedy to BGP’s problems.

A RON consists of a set of nodes residing in a variety of routing domains, which maintain a full mesh of routes between themselves using a link state routing protocol. Link metrics are collected using a combination of active periodic probing, and passive listening. This allows RON to route around failures in the order of seconds, a significant improvement over BGP. For a RON of 50 nodes, an approximate overhead of 30 kbps will be required with the implementation described in the paper; this does not seem excessive considering the capacity of current Internet links.

An interesting feature of RON is an explicit interaction with applications for the purposes of deciding metrics for route selection. By default, RON offers latency, loss and throughput metrics, though this set may be extended to include application-specific metrics such as jitter. This allows applications to choose trade-offs between different metrics for route selection, and in effect enables application-specific routing. In this context, the paper introduces the useful notion of a performance failure, where a link does not provide adequate performance to the application using it.

RON also puts in place rich routing policies, which may restrict the flow of certain kinds of traffic over certain routes: for instance, in the tests the authors conducted, policies were put in place to prevent commercial traffic from flowing over the non-commercial Internet2 backbone.

Two sets of tests were run, one of 12 academic networks (several of which are on the Internet2 backbone), and another which added 4 commercial networks. In the former case, RON was able to overcome all observed outages, and in the latter, 60%, with an average of 18 seconds to recover from a fault. In addition, thanks to its explicit attention to application-dependent metrics, RON is able to improve loss rate, latency and TCP throughput.

As the authors themselves note, individual RONs will not scale to the level of the Internet: RONs operate best with relatively small numbers of nodes. This smaller scale is also important administratively, from the perspective of attempting to deal with malicious nodes, as such problems are obviously easier to deal with at an administrative, rather than a technical level. In my reading, RON is an important and useful adjunct to BGP routing, both for its rapid recovery from fault, but especially for its ability to route according to application-specified metrics.

October 21, 2008

On the (in)validity of Internet topology models

Filed under: Networks — Tags: , , , , — Ashwin @ 12:04 pm

Citations:

Faloutsos, M., Faloutsos, P., and Faloutsos, C. 1999. On power-law relationships of the Internet topology. In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols For Computer Communication (Cambridge, Massachusetts, United States, August 30 - September 03, 1999). SIGCOMM ‘99. ACM, New York, NY, 251-262. DOI=http://doi.acm.org/10.1145/316188.316229

Li, L., Alderson, D., Willinger, W., and Doyle, J. 2004. A first-principles approach to understanding the internet’s router-level topology. In Proceedings of the 2004 Conference on Applications, Technologies, Architectures, and Protocols For Computer Communications (Portland, Oregon, USA, August 30 - September 03, 2004). SIGCOMM ‘04. ACM, New York, NY, 3-14. DOI=http://doi.acm.org/10.1145/1015467.1015470

These are an interesting pair of papers, the first from what seems to be the beginning of attempts to extract “laws” of Internet topology, and the second more recent, offering constructive criticisms, and what seems to me a more realistic approach to modeling.

The Faloutsos’ paper studies the topology of the Internet circa 1997-1998, with three datasets of interdomain topology from BGP routing tables gathered at 6 month intervals, and one dataset of the topology of routers on the Internet. They analyse the data using a number of metrics: frequency of outdegree, node rank in order of decreasing outdegree, pairs of nodes and average number of nodes in a neighbourhood as a function of hop distance, eigenvalues of the graph, and order of eigenvalues. Using these metrics, they come to the conclusion that the Internet is subject to three power laws, in spite of a 45% growth over the period of their interdomain topology datasets.

First, the outdegree of a node is proportional to the rank of the node by outdegree raised to a constant rank exponent. They make the claim that this is indicative of the financial and functional considerations in adding a new edge to a node, providing a view into the dynamics of the ways in which domains interconnect. Second, the frequency of a given outdegree is proportional to the outdegree raised to the power of a constant outdegree exponent. From this, they propose the outdegree exponent as a useful metric for estimating the distribution of outdegree amongst Internet nodes. Finally, the eigenvalues of the graphs are proportional to the order raised to a constant eigen exponent. Finding that the eigen exponent is practically the same for the interdomain graphs, but significantly different for the router graph, they conclude that the eigen exponent can be used to distinguish amongst families of graphs.

In addition, they propose an approximation whereby the number of pairs of nodes within a hop distance is proportional to the hop distance raised to a hop-plot exponent. From this, they define the effective diameter of a graph as a function of the numbers of nodes and edges in the graph, and the hop-plot exponent. If accurate, this could have material implications for selection of TTL for broadcast, as the effective diameter provides a high probability estimate of the distance between any pair of nodes. It seems odd to me that they forged ahead with this particular analysis when their datasets lacked sufficient data to prove or disprove this hypothesis.

The Faloutsos’ metrics are interesting and valuable, but I find the failing of this paper in the assumption that certain observations about properties of the Internet amount to laws governing the growth of the Internet. As the second paper mentions, there have been arguments put forward that power law type distributions may arise for purely mathematical and statistical reasons. There is no more reason to believe that there are natural laws governing the global spread of the Internet than that there are natural laws governing the “free” market conditions under which interconnection takes place. The dynamics of differing regulatory regimes, the range of commercial agreements from peering to transit, varying physical geography, and overall economic development are just some of the factors that contribute to the manner in which the Internet spreads.

The second paper seems an explicit attempt to debunk some of the network models that have evolved from the Faloutsos’ work. As the authors point out, for a given distribution of node degree, there are a number of possible networks of distinctly different forms. This paper makes two broad contributions: an understanding of router-level constraints on network performance, and a definition of characteristics that form a heuristically optimal network.

The maximum number of packets that a router can process per unit time constrains the number of links it can handle, and the connection speed per link, which form what the paper calls a feasible region, and corresponding efficient frontier of possible bandwidth-degree combinations. Different kinds of demands may be placed on a router depending on where it is located in the network: core routers may handle fewer links, but larger numbers of flows, while edge routers must deal with the converse case, many links, and relatively fewer flows, aggregating traffic to the core. From this, a notion of a heuristically optimal network is put forward, with a loose mesh of high speed low degree routers at the core, and a hierarchical tree-like structure of slower high degree routers at the edge.

Performance comparisons are made for heuristically optimal, sub-optimal and Abilene-inspired networks against networks formed by two random generation mechanisms: preferential attachment and a general model of random graphs. The paper shows that performance in terms of overall network throughput as well as end-user bandwidth and router utilization is significantly higher for the heuristically optimal network, followed by the Abilene-inspired and sub-optimal networks, and finally the randomly generated networks.

A very useful concept introduced here is that of a likelihood-performance plane, plotting the performance of networks against the likelihood of a network being produced through a random generation process. As the paper demonstrates, heuristically optimal and Abilene-inspired networks have both a far lower likelihood, and a far greater performance than randomly generated networks of the same node degree distribution. As with the paper we read on modeling wireless networks, this paper draws our attention to the fact that models of the Internet topology must take into account specific engineering considerations.

September 23, 2008

Fast Switched Backplane for a Gigabit Switched Router

Filed under: Networks — Tags: , , — Ashwin @ 9:04 am

Citation: N. McKeown, “Fast Switched Backplane for a Gigabit Switched Router,” Business Communications Review.

This paper examines the shifts in a range of technologies needed for faster routers, with a particular focus on using switched, rather than shared, backplanes.

It opens with a useful overview of the different functions in a router: datapath and control. As the author points out, datapath functions form the broad area to investigate for performance enhancement, as these are concerned with every packet that crosses the router. Datapath functions include the forwarding decision, the backplane, and the output link scheduler. Early routers maintained much of this functionality in software, run on a single CPU, with a shared bus. Since then, trends has been to push these functions into hardware, use multiple CPUs to enable parallelism, move function out of CPUs into hardware, and a move from a shared bus to a crossbar switch in the backplane, which allows multiple linecards to communicate simultaneously.

The use of a crossbar switch increases performance in two ways: point-to-point links for physical connections from linecards, allowing for very high speeds, and support for multiple simultaneous transactions, increasing the aggregate bandwidth.

The discussion of fixed v. variable length packets was interesting, and certainly helped clear up some of the questions I had while reading the last paper. Fixed length packets are preferable in the router, since scheduling becomes simple: assign fixed time slots for transferring each fixed length packet across the backplane. However, it still seems as though the output stage will have to maintain some kind of state to re-sequence and reassemble the fixed length packets; it was not clear from the readings how these decisions might affect performance.

In spite of all their advantages, crossbars are still susceptible to several types of blocking: head-of-line (HOL), input, and output blocking. HOL blocking occurs when using FIFO queueing, as the scheduler processes the queue by the packet at the head, which means that packets destined for other output queues will be kept waiting; this can be remedied through the use of VOQs, as detailed in our other reading for today. Input and output blocking are inevitable, as each input and output can process only one packet at a time, leading to unpredictable performance due to random packet delay. These are addressed using prioritization of packets, and speedup of the crossbar switch (although significant amounts of speedup are not practical).

Multicast becomes relatively easy to implement in a crossbar, as it just requires the selective closing of multiple crosspoints at the same time. The notion of fanout is introduced, which is the set of outputs to which an input packet needs to be transmitted. The iSLIP algorithm is presented for scheduling packets across the switch, and extended to the ESLIP algorithm to support both unicast and multicast traffic.

I would have liked to see more discussion of the iSLIP and ESLIP scheduling algorithms; however, I think these two papers worked well together, as I got a good sense of the construction of a router at theoretical and practical levels in different ways from each.

September 22, 2008

Scaling Internet Routers Using Optics

Filed under: Networks — Tags: , , — Ashwin @ 10:01 am

Citation: Keslassy, I., Chuang, S., Yu, K., Miller, D., Horowitz, M., Solgaard, O., and McKeown, N. 2003. Scaling internet routers using optics. In Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols For Computer Communications (Karlsruhe, Germany, August 25 - 29, 2003). SIGCOMM ‘03. ACM, New York, NY, 189-200. DOI= http://doi.acm.org/10.1145/863955.863978

The authors are concerned with the problems faced by the current generation of Internet routers, especially in the face of expected increases in Internet traffic. Network operators’ central offices were already reportedly full at the time this paper was written, requiring that new router designs either reduce their power density, and/or consume less power. Router configurations have shifted from single-rack to multi-rack to reduce power density. However, both these configurations have their own problems: multi-rack systems suffer from unpredictable performance and poor scalability, while single-rack systems use sub-optimal centralized schedulers in practice, which will not scale with increases in ports or line rate.

These concerns limit the capacity of a rack to about 2.5 Tb/s, constrained by power consumption. In this paper, the authors present a novel architecture using optics, which they estimate will be able to scale to 100 Tb/s with practically zero power requirements, and guaranteed throughput.

They do this by extending the Load-Balanced switch architecture described by C-S. Chang et al., to address the problems inherent in this architecture: requirement for a rapidly configuring switch fabric, possibilities of mis-sequenced packets, vulnerability to pathological traffic patterns, and inability to deal with failed linecards.

The Load-Balanced switch architecture breaks a router down into three parts: input and output switching stages, separated by a middle stage of buffers maintaining Virtual Output Queues, which smooth traffic from inputs to outputs. The input stage acts as a load balancer spreading traffic over the VOQs, while the output stage serves each VOQ at a fixed rate. This allows the architecture to function entirely based on local knowledge, with no need for a centralized scheduler, or knowledge of state of all queues, resulting in greater reliability, and a simpler implementation. I had a little difficulty understanding the justifications for splitting input packets into smaller fixed-length packets; while I can understand why this is necessary, I wonder how much it would add to processing requirements. Could it be that this could become a bottleneck?

The problem of packet mis-sequencing is addressed using the Full Ordered Frames First (FOFF) scheme. This scheme processes input queues holding at least N packets (followed by any non-empty queues) in a round-robin fashion, reading at most N packets at a time, and transferring each packet to a different intermediate queue in the middle stage. This approach bounds the amount of mis-sequencing of packets, and corrects what mis-sequencing occurs with a re-sequencing buffer in the output stage. FOFF is resilient to pathological traffic patterns due to the manner in which it spreads flows across the intermediate linecards.

The problem of failed linecards is addressed by splitting input and output linecards into G groups, with M intermediate GxG switches, which each maintain a fixed configuration, rotating the matching of inputs to outputs in sequence. This seems, in essence, to be a hardwired load-balancing scheme; if I understand right, extra capacity needs to be added to the intermediate switches proportional to the maximum number of linecards that need to switched out at any given time.

There were segments of this paper that I had trouble grasping completely, particularly the discussion of hybrid electro-optical switches and optical switches, since I lack almost any familiarity with the EE side of EECS. Regardless, it was really interesting as a way to think through how some of the scheduling algorithms we’ve discussed would be built as hardware implementations.

September 18, 2008

Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism

Filed under: Networks — Tags: , , — Ashwin @ 10:52 am

Citation: Clark, D. D., Shenker, S., and Zhang, L. 1992. Supporting real-time applications in an Integrated Services Packet Network: architecture and mechanism. In Conference Proceedings on Communications Architectures & Protocols (Baltimore, Maryland, United States, August 17 - 20, 1992). D. Oran, Ed. SIGCOMM ‘92. ACM, New York, NY, 14-26. DOI= http://doi.acm.org/10.1145/144179.144199

It was odd reading this paper alongside Shenker’s. Although it is from three years earlier, it seems much more modern, almost prescient in its analysis. Clark et al., propose a unified architecture that provides different kinds of service - predicted and guaranteed - to real-time applications, in addition to supporting traditional datagram applications. In particular, a class of real-time applications called play-back applications is considered, in which data is buffered at the receiver up to a playback point (which can be thought of as a delivery deadline), and then played on the client, to reduce jitter. Assuming that the majority of future real-time applications will be of this nature, the authors use the requirements of play-back applications to drive their architecture: sensitivity to delivery delay, information about estimated delay to set playback points, indifference to delivery so long as it is before the playback point, and limited tolerance to loss.

Analysing real-time applications along two axes - tolerance to brief interruptions and rigidity of delay requirements - the authors show that the dominant classes of such applications would be intolerant/rigid and tolerant/adaptive. Intolerant/rigid applications need guaranteed service with a priori limits on delay bounds, while tolerant/adaptive applications only need predicted service, as they will dynamically adjust their playback points to suit network conditions.The discussion of the packet scheduling algorithm for this architecture is motivated by these considerations, as it points to a need to separately consider the isolation of flows (to minimise effects of one flow on another), and sharing of bandwidth (to allow operation in a unified network).

A counter-intuitive result is that FIFO queueing works better than WFQ to satisfy the sharing requirement. Where WFQ penalizes sources with bursts of packets, FIFO passes them through as a unit, providing better control of delay bounds for real-time applications, but penalizing other flows as a result. Conversely, WFQ provides better isolation than FIFO, as it minimises the effects of bursty sources on other flows. The authors propose a new scheme, FIFO+, which allows sharing of jitter across multiple hops. A new header field maintaining the difference between the delay for a packet, and the average delay for its aggregate class, is maintained by switches, allowing insertion of a packet into the queue to minimize delay.

The unified algorithm that is presented is based on timestamp-based WFQ at the highest level, to serve flows that required guaranteed service, along with a pseudo-flow encapsulating predicted service flows and datagram flows, which is managed under a FIFO+ regime. This pseudo-flow maintains multiple classes of service, shifting jitter from higher classes of predicted service to lower, with datagram flows being at the lowest level, since they are most resistant to jitter. Admission control is used within each class to maintain delay bounds. This strikes a good balance, as guaranteed flows and the different classes of predicted service and datagram flows are isolated from one another, with jitter shared by progressively shifting it down to the lowest class.

The network fixes the clock rate for each guaranteed service flow, with no further checks on behaviour. Predicted service flows must declare the parameters for their service characterization (token bucket rate and size), and request a maximum delay and tolerable loss rate, to allow the network to assign the flow to an appropriate aggregate class, and also to make admission control decisions based on whether or not a source with these declared parameters can be carried. To maintain good shared behaviour, the network enforces conformance to the service characterization for predicted service flows.

I really enjoyed the discussion in this paper, especially the separation of concerns for sharing and isolation. Most of all, I liked the FIFO+ scheme, as it recalls XCP, but at the same time has no requirements of trust on endpoints; the assumption that switches trust one another seems far more reasonable.

Fundamental Design Issues for the Future Internet

Filed under: Networks — Tags: , , — Ashwin @ 9:16 am

Citation: S. Shenker, Fundamental design issues for the future Internet, IEEE J. Selected Areas Comm., 13 (1995)

I found this paper really useful for two reasons. First, Shenker provides a view of what the future Internet might look like from 1995, anticipating widespread use of real-time applications. Second, and more importantly perhaps, he gives us a set of abstractions by which to judge network architectures based on different the demands imposed by elastic and real-time applications.

The applications in common use on the Internet at the time were “elastic”, in the sense that they were tolerant of delays in transmission, with design decisions that were motivated by the packet-switched nature of the Internet. It is, however, much harder to meet the demands of real-time applications, such as voice and video streams, using a packet-switched network, as they are much less tolerant to delay and loss.

Shenker demonstrates that a unified network infrastructure would provide the highest utility. To allow different kinds of applications to coexist, he considers overprovisioning, and modifications of the basic Internet architecture to provide a variety of service models beyond just best-effort service, which may also depend on admission control. The general intuition at work here is that aligning services with application requirements results in better overall network utilization.

Two models for assigning service to applications are presented: an implicit model where the network attempts to guess an application’s requirements, and an explicit model, where an application can declare its requirements to the network. The implicit model can be introduced with no changes to current service models, but may not be able to service applications that it does not know about, or variations in an application. The explicit model, on the other hand, requires a new service model (and therefore changes to the Internet architecture), stable service offerings, and also raises interesting questions of incentives: how can applications be incentivized to seek the levels of service they actually require, rather than always seeking the highest available level? Shenker suggests that they only way to address this concern is through variable pricing for different classes of service, though raising the concern that such variable pricing might change the nature of the gift economy prevalent on the Internet.

The most interesting section of this paper, for me, was the discussion of utility curves for different classes of applications: elastic, hard real-time, delay-adaptive and rate-adaptive. The shifts between concave and convex shapes, and the inflection points at which these occur, provide a compelling framework for thinking about the design of different classes of service in a unified network.

Older Posts »

Powered by WordPress