Deprecated: Assigning the return value of new by reference is deprecated in /home/jsanmath/public_html/blogs/nalanda/wp-settings.php on line 472

Deprecated: Assigning the return value of new by reference is deprecated in /home/jsanmath/public_html/blogs/nalanda/wp-settings.php on line 487

Deprecated: Assigning the return value of new by reference is deprecated in /home/jsanmath/public_html/blogs/nalanda/wp-settings.php on line 494

Deprecated: Assigning the return value of new by reference is deprecated in /home/jsanmath/public_html/blogs/nalanda/wp-settings.php on line 530

Deprecated: Assigning the return value of new by reference is deprecated in /home/jsanmath/public_html/blogs/nalanda/wp-includes/cache.php on line 103

Deprecated: Assigning the return value of new by reference is deprecated in /home/jsanmath/public_html/blogs/nalanda/wp-includes/query.php on line 21

Deprecated: Assigning the return value of new by reference is deprecated in /home/jsanmath/public_html/blogs/nalanda/wp-includes/theme.php on line 623
Nalanda

Nalanda

November 25, 2008

A Policy Aware Switching Layer for Data Centers

Filed under: Networks — Tags: , — Ashwin @ 12:36 pm

Joseph, A.J., Tavakoli, A, Stoica, I. 2008. A Policy Aware Switching Layer for Data Centers. UC Berkeley Technical Report No. UCB/EECS-2008-82.

The authors deal with the problem of configuring middleboxes in datacenters. Current architectures call for middleboxes to be placed on the physical network path, which leads to a number of sticky configuration problems. These include removal of physical connectivity paths which do not cross the middlebox, manipulation of link costs and separation into VLANs. All these approaches carry penalties with them: loss of fault tolerance, difficulty of predicting behaviour, fatesharing of flows with middleboxes, and the loss of ability to run clustering and virtual server mechanisms which require layer 2 connectivity.

The authors propose a new approach, PLayer, consisting of policy aware switched, pswitches, which allow middleboxes to be taken off the physical network path, and allows for the explicit specification of middlebox routing policy, rather than the implicit mechanisms currently in use. Though conceptually simple, this is a difficult problem in practice, since a principal, though unstated, design goal is to not require any changes of the middleboxes themselves. Even for simple middleboxes, Ethernet frames have to be encapsulated for delivery. More complex middleboxes that require layer 3 and layer 4 data need assurance that streams are always directed to the same middlebox instance; this is achieved using consistent hashing to choose instances.

An interesting problem that the authors deal with is the dissemination of policy updates. Each pswitch maintains a copy of all policy rules for the datacenter, to allow for continued correct function in the event of any failures elsewhere on the network. When policies are updated, these must be adopted concurrently by all pswitches. The authors propose a mechanism where policies are pushed out pswitches, but not immediately adopted; a separate small control packet is dispatched to signal the switch to a new policy. As the packet is small, there is a greater likelihood that it will reach all switches synchronously.

Even with this mechanism in place, there are several scenarios under which flows will be processed by different policies, which call for very specific approaches to policy configuration in order to enable reliable and consistent dissemination. This did make me wonder about how such a mechanism might be deployed to a real datacenter. Could it be that, depending on topology, different portions of a network could be taken offline and updated independent of one another? For smaller, controlled functionality (e.g., load balancers and firewalls dedicated to a web server farm), it may be that this could provide a more reliable, albeit also more manual, update mechanism.

Improving MapReduce Performance in Heterogeneous Environments

Filed under: Networks — Tags: , , — Ashwin @ 12:11 pm

Zaharia, M., Konwinski, A., Joseph, A. D., Katz, R. and Stoica, I. 2008. Improving MapReduce Performance in Heterogeneous Environments.

The authors examine the performance of the MapReduce implementation in Hadoop, and find several flaws in Hadoop’s scheduler which can cause severe degradation in performance.The principal problems that are analysed are those of the identification of laggard tasks for speculative execution, and the identification of nodes to which these tasks should be assigned. These problems arise in Hadoop due to assumptions of homogeneity in tasks, and in the network itself.

To remedy these problems, the authors propose a new scheduling algorithm, LATE, which is sensitive both to the variance in tasks and also to variance in node performance. LATE chooses tasks for speculative execution based on estimated time to completion, rather than the simple score metric that Hadoop uses. Only nodes with performance above a specified threshold are chosen for the execution of speculative tasks. In addition, a cap is maintained on the total number of speculative tasks that may be run at a time. Testing on various configurations of EC2, and also on a testbed with virtual machines, demonstrates that LATE provides significant performance benefits.

The one question I have is with regards to the choice of virtual machines as testbeds. While I undertstand that these could be useful to simulate a heterogeneous environment, it also seems like they are a worst case scenario. MapReduce is itself a virtualization scheme that makes certain assumptions about the locality of data; layering this on top of another virtualization scheme seems like overkill. It would be interesting to know how much of an advantage LATE delivers in a carefully planned data center, even one with a degree of heterogeneity.

November 18, 2008

Data-oriented networking and DTNs

Filed under: Networks — Tags: , , — Ashwin @ 1:27 pm

Tolia, N., Kaminsky, M., Andersen, D. G., and Patil, S. 2006. An architecture for internet data transfer. In Proceedings of the 3rd Conference on Networked Systems Design & Implementation - Volume 3 (San Jose, CA, May 08 - 10, 2006). USENIX Association, Berkeley, CA, 19-19.

This paper presents one very simple, and powerful idea: that data transfer is a common activity across many classes of protocols, and so may be offered as a common service. Protocols may perform content negotiation as necessary, and delegate data transfer to the DOT service. This allows for reuse and innovation at the data transfer layer, with no need for protocols to reinvent the wheel, as it were.

The measurements of performance for the different DOT plugins was interesting, but not compelling; especially that for the USB device. All that this really shows us is that DOT achieves its goal of supporting multiple transports. The most interesting metric, for me, was that a single graduate student was able to add DOT support to Postfix in less than a week, which speaks volumes for the simplicity of the protocol. I especially like the attention to adoption, that DOT functions as an optional protocol extension, rather than a requirement.

Fall, K. 2003. A delay-tolerant network architecture for challenged internets. In Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols For Computer Communications (Karlsruhe, Germany, August 25 - 29, 2003). SIGCOMM ‘03. ACM, New York, NY, 27-34. DOI=http://doi.acm.org/10.1145/863955.863960

The discussion of the assumptions that TCP/IP makes of the link layer (end-to-end paths, low RTT and drop rate) and of the characteristics of “challenged” networks, were alone worth the price of admission for this paper. To list a few of the latter: high latencies, predictable interruption, asymmetric data rates, non-existence of end-to-end routing paths, lack of continuous power or adequate memory.

I particularly liked the point that Fall makes in justifying this work, that there may exist conventional networks with Internet connectivity only available via a challenged network. The use of challenged networks for transit capabilities could provide significant benefits.

I thought the initial presentation was interesting, the use of an asynchronous transfer mechanism, but as the paper goes on, it seems like it’s trying to do too much. I can see that DTN could be useful to create guidelines for the design of protocol gateways. I got the most out of this paper in reading it as a thought piece on protocol design.

However, it sounds trying to get DTN functioning as a viable end-to-end overlay on the Internet sounds like it would be more trouble than it’s worth. As we’ve seen in our readings, it’s hard enough to get one set of protocols working, and harder yet to bridge disparate environments; I’m not convinced that the One True Protocol solution is workable.

November 13, 2008

Internet Measurement

Filed under: Networks — Tags: , , — Ashwin @ 2:12 pm

Paxson, V. 1997. End-to-end Internet packet dynamics. SIGCOMM Comput. Commun. Rev. 27, 4 (Oct. 1997), 139-152. DOI=http://doi.acm.org/10.1145/263109.263155

Paxson measures TCP transfers between 35 Internet sites to gather his data, with particular attention to dynamics of both sender and receiver, unlike many prior studies which only study sender behaviour. In doing so, he is able to study the end-to-end asymmetry in paths for individual transfers, and also deduce interesting characteristics of routers and links along these paths.

He also importantly draws our attention to several ways in which the effects of TCP may be disentangled from those of the network. One of the most interesting to me, for instance, was a reminder to consider that loss rate measurement for ACKs is very different from that for data packets; TCP adjusts data packet rates in response to loss, which illustrates how we could end up measuring an effect of TCP, rather than of the underlying network path.

Another interesting consideration was in the use of packet pairs for measurement. As the network evolves, packets may not traverse the same end-to-end link: they may be switched along different paths at a router. To deal with this problem, Paxson introduces the notion of “packet bunch modes”, where packets are sent in bunches of varying sizes to account for the different paths that they may traverse.

This was really deep and interesting paper that will take several more reads before I manage to absorb it completely.

R. Fonseca, G. Porter, R. H. Katz, S. Shenker, and I. Stoica. 2007. X-trace: A pervasive network tracing framework. In 4th USENIX Symposium on Networked Systems Design & Implementation, pp. 271-284. [Online]. Available: http://www.usenix.org/events/nsdi07/tech/fonseca.html

X-Trace provides the means to trace network requests across different services and layers of a network, by associating each request with a unique identifier. Protocols need to be modified, if required, to handle the extra X-Trace metadata.

Network elements need to support only two simple primitives - pushDown() and pushNext() - which are used to update request metadata to indicate it being pushed down a layer (e.g., HTTP to TCP), or across a layer (e.g., web browser to web server). Network elements also need to log reports based on this metadata to a persistent store.

Using the stored metadata, a tree may be reconstructed to show the sequence of requests, across, as well as up and down, network layers. I could imagine that this could potentially be extended to also capture logs within an application.

I like this approach for its simplicity, as it requires very little for instrumentation, and is easy to support in protocols that already allow for extension (e.g., HTTP headers). The challenge, of course, as the authors point out, would be to drive adoption to the point where it is sufficiently widespread that X-Trace reports could become useful across multiple administrative domains.

November 6, 2008

Middleboxes and Indirection

Filed under: Networks — Tags: , , — Ashwin @ 1:05 pm

Citations:

Walfish, M., Stribling, J., Krohn, M., Balakrishnan, H., Morris, R., and Shenker, S. 2004. Middleboxes no longer considered harmful. In Proceedings of the 6th Conference on Symposium on Opearting Systems Design & Implementation - Volume 6 (San Francisco, CA, December 06 - 08, 2004). Operating Systems Design and Implementation. USENIX Association, Berkeley, CA, 15-15

Stoica, I., Adkins, D., Zhuang, S., Shenker, S., and Surana, S. 2002. Internet indirection infrastructure. SIGCOMM Comput. Commun. Rev. 32, 4 (Oct. 2002), 73-86. DOI=http://doi.acm.org/10.1145/964725.63303

Both these papers build on our recent readings to explore new addressing and routing mechanisms that would allow easier deployment of existing network elements (such as middleboxes), and enable a wide variety of modern network services. The approaches outlined in these papers - Delegation Oriented Architecture (DOA) and Internet Indirection Infrastructure (i3) - have some features in common: a reliance on a new kind of host identifier to be embedded in packets, the use of DHTs as address lookup services, and implementation as an overlay network to provide the necessary function and at the same time allow them to coexist with the current Internet architecture.The principal difference between these approaches is perhaps in their use of identifiers: where DOA maps identifiers to individual hosts, i3 uses identifiers to map to potentially multiple hosts.

The authors of DOA make the point that although reviled, middleboxes provide essential network services; we should therefore ask how they might be better integrated into the Internet architecture without violating the fundamental tenets that every Internet entity has a unique network identifier and that network elements should not not process packets that are not addressed to them. DOA packets may specify a sequence of hosts to which the packet must be routed, allowing middleboxes to exist at locations other than the topological “middle”. With this abstraction in place, packets are source routed across the specified sequence of hosts before reaching the receiver. Users may adaptively select new services that they wish to process their packets, and outsource them accordingly.

i3 requires receivers to register “triggers” against common identifiers, indicating that packets to an identifiers should be forwarded to them. A mobile host may update its triggers to reflect its new network location, with no need for senders to be aware of this change. Multicast groups may be formed by multiple hosts registering triggers for the same identifier. The most interesting application of i3, to me, was for anycast, where the hosts in the anycast group maintain triggers with identical prefixes; senders form identifiers with a matching prefix, and the request is routed to the trigger with the longest matching prefix. Such services may provide load balancing by having senders generate random identifier suffixes, or location awareness by having receivers and senders encode their location into the identifier suffix. As with DOA, i3 provides a means by which packets may be source routed - an “identifier stack” - to enable service composition.

Each of these proposals is put forward to address fundamental mismatches between current network architecture, and the uses to which this architecture is being put. DOA deals with the problem of middleboxes, and how these might be better integrated into the network architecture, while i3 deals with the problem of network services which do not follow the point-to-point abstraction of Internet routing (multicast, anycast and mobility).

November 4, 2008

DNS Scalability

Filed under: Networks — Tags: , , — Ashwin @ 1:58 pm

Citations:

Mockapetris, P. and Dunlap, K. J. 1988. Development of the domain name system. SIGCOMM Comput. Commun. Rev. 18, 4 (Aug. 1988), 123-133. DOI=http://doi.acm.org/10.1145/52325.52338

Jung, J., Sit, E., Balakrishnan, H., and Morris, R. 2001. DNS performance and the effectiveness of caching. In Proceedings of the 1st ACM SIGCOMM Workshop on internet Measurement (San Francisco, California, USA, November 01 - 02, 2001). IMW ‘01. ACM, New York, NY, 153-167. DOI=http://doi.acm.org/10.1145/505202.505223

These two papers provide interesting views into the process of design of DNS, and of actual measurements of those design choices.

DNS was originally created to address problems with the HOSTS.TXT approach to maintaining host name to IP address mappings: the technical problem that the file was growing too large with increasing numbers of hosts, and the administrative problem that it was becoming difficult to maintain a single master file as the number of independent administrations on the network increased. To address these problems, a hierarchical system was created, with different servers accounting for different sections of the DNS tree. Administrations could maintain authoritative servers for their own domains, while higher level servers, leading up to the root servers, could provide pointers to tie the different administrations together.

It was interesting to see how many of the problems mentioned in the first paper were social, rather than technical: the difficulty of introducing new classes (which seems to me unrelated to DNS itself), the incentives for transitioning HOSTS.TXT-based to DNS, explicit requirements for redundancy in all DNS servers (which seems unachievable at Internet-scale), following sample settings word-for-word.

The first paper presents caching as one of the successes of DNS, in the context of the “unexpectedly bad performance” of the Internet of the time. Without caching, the DNS system may have been doomed to failure, making wider adoption more difficult. The second paper calls caching into question, specifically with reference to the predominant web traffic on the networks sampled.

As the second paper observes, most wide area traffic is web traffic, where a cluster of requests to a server are preceded by a single lookup. Since the distribution of names is Zipf-like, the usefulness of caching for A records is limited. The evaluation in this paper suggests that small groups of clients, of the order of 25 or so, provide the benefits of caching close to providing a common cache across all clients. However, since HTTP 1.1 is now fairly widely deployed, it seems likely that there the correspondence of TCP to DNS flows will be closer to 1:1, which begs the question whether caching of A records provides any utility today.

The authors also show that caching of NS records provides important performance benefits, as it reduces the number of referrals per lookup, and also reduces the load on the root and gTLD servers. This make intuitive sense; I would imagine the hierarchy of DNS servers to be relatively stable, which would lend itself well to caching with large TTL values.

October 30, 2008

P2P systems and the Chord protocol

Filed under: Networks — Tags: , , , — Ashwin @ 1:19 pm

Citations:

Balakrishnan, H., Kaashoek, M. F., Karger, D., Morris, R., and Stoica, I. 2003. Looking up data in P2P systems. Commun. ACM 46, 2 (Feb. 2003), 43-48. DOI=http://doi.acm.org/10.1145/606272.606299

Stoica, I., Morris, R., Karger, D., Kaashoek, M. F., and Balakrishnan, H. 2001. Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols For Computer Communications (San Diego, California, United States). SIGCOMM ‘01. ACM, New York, NY, 149-160. DOI=http://doi.acm.org/10.1145/383059.383071

These two papers deal with the problem of storing and locating data in P2P systems, the first surveying a number of different systems, and the second describing the Chord system in detail.

The principal question that the first paper addresses is that of lookup: how is data to be located in a P2P system efficiently, lacking any centralised index or hierarchy? The authors take us through a number of different approaches to lookup, mainly structured and symmetric lookup.

In a structured lookup system, data may be stored in a central server, or a hierarchy may be imposed to provide scalability. These systems can provide guarantees for location of data, but are vulnerable to the failure of the central database, or of higher-level nodes. In addition, a disproportionately high percentage of load on the system may have to be borne by higher level nodes.

In a symmetric lookup system, no node is more important than any other, and each node participates in handling a fraction of the total load on the system. These vary from systems where lookup requests are broadcast to all neighbours (introducing significant overhead, and scaling problems), to others where “superpeers” are used to introduce some hierarchy to the system to scale up more efficiently. However, more significant superpeers become points of failure, and such systems provide no guarantees for lookup.

Modern P2P algorithms, including CAN, Chord, Kademlian, Pastry, Tapestry and Viceroy, all implement the Distributed Hash Table (DHT) abstraction, with implementations driven by considerations of scaling well to large numbers of nodes, low latency lookup, efficient handling of dynamic node arrival and departure and an even distribution of keys amongst nodes.

A DHT identifies both keys and nodes using a common hash function. Keys are then stored on nodes “closer” to them; “closeness” is evaluated using a function that compares the key and node identifiers. When a node receives a lookup request, it needs sufficient information to forward the request on to a node which is closer to the requested key. Routing table to maintain this information may be one dimensional (implemented a skip list in Chord, and as a tree in Pastry, Tapestry and Kademlia), or multi-dimensional (as in CAN). The multidimensional approach is interesting: I could imagine this being used for a data retrieval system, where queries may be run to locate items based on arbitrary combinations of dimensions.

The discussion of the unidirectionality and symmetry properties of the different algorithms was interesting, but also a little confusing: why, for instance, do symmetric protocols not require stabilization routines? I particularly liked the way in which malicious activity can be dealt with: since all these algorithms can easily perform cross-checks to verify correct routing and so identify malicious nodes, the worst that can happen is that a malicious node denies the existence of available data.

The second paper describes an implementation of Chord, which the authors suggest has advantages of simplicity, provable correctness, and provable performance over other P2P lookup protocols. The Chord implementation uses SHA-1 hashing to generate identifiers for keys and nodes, proceeding on the assumption that given a sufficient number of nodes, there will be a roughly equal distribution of keys over nodes. Nodes form their routing (”finger”) tables by maintaining information about successor nodes: each successor node is at increasingly exponential distances from the source node. This creates the interesting property that a node “knows” more of its closer neighbours than its distant neighbours. Keys are assigned to the first nodes with an identifier which is equal to, or follows the key. By maintaining a relatively small amount of information (about O(log N) successors), nodes can efficiently route lookup requests to the node holding the key.

When a node joins or leaves Chord, two properties must be preserved: each node’s successor must be correctly maintained, and keys must be distributed properly amongst nodes. Chord cannot transfer keys itself; it only notifies application software that this transfer is necessary, so that the application layer can take care of also transferring associated data. Chord periodically runs a stabilization protocol to update the finger tables for all nodes.

Since keys are uniquely associated with nodes, applications are also responsible for maintaining replicas of data, by associating them with different keys. I’m not sure if this is a reasonable expectation, since this means that the application layer would have to be aware of the key distribution across nodes, to ensure that replicas are actually stored on different nodes.

In simulations, it was found that mean path length increases logarithmically with the number of nodes, as expected. Under simultaneous node failure conditions, the lookup failure rate is proportional to the fraction of failed nodes, again as expected, since these would map to the fraction of lost keys. Lookup failures during stabilization depend on the number of failures, the average path length and number of nodes. In experimental testing on the RON testbed, lookup latency was found to grow slowly with the total number of nodes, confirming simulation results, and demonstrating Chord’s scalability.

October 28, 2008

Active Network Vision and Reality: Lessons from a Capsule-based System

Filed under: Networks — Tags: , , — Ashwin @ 12:57 pm

Citation: Wetherall, D. 1999. Active network vision and reality: lessons from a capsule-based system. In Proceedings of the Seventeenth ACM Symposium on Operating Systems Principles (Charleston, South Carolina, United States, December 12 - 15, 1999). SOSP ‘99. ACM, New York, NY, 64-79. DOI=http://doi.acm.org/10.1145/319151.319156

This paper presents an interesting idea: rather than have routing algorithms statically built into routers, allow applications to bundle their own algorithms along with their data packets, and have routers execute this application-determined code to make routing decisions. The combination of code and data is termed a capsule. Evaluation of the system is presented based on the ANTS toolkit.

Capsules declare a type in their headers, which is used to determine the code to be executed at the router. If this code is not already in the router’s cache, it is obtained from the neighbour which dispatched the capsule. This is an interesting choice of code distribution mechanism; as long as trust is not violated amongst the active routers, more explicit security checks on the code are needed only at the edge of the network. Indeed, security is a recurring theme in the paper, as it is a primary goal that untrusted users be capable of deploying code to the network without compromising the capacity of individual nodes, or of the network as a whole.

The idea presented here is certainly compelling as a thought experiment, but I found it difficult to place in context without examples of real usage. The performance numbers presented, for instance, were confusing, as they relate to no real application; I would imagine that performance would vary wildly depending on the capsule code.

Resilient Overlay Networks

Filed under: Networks — Tags: , , , — Ashwin @ 11:11 am

Citation: Andersen, D., Balakrishnan, H., Kaashoek, F., and Morris, R. 2001. Resilient overlay networks. SIGOPS Oper. Syst. Rev. 35, 5 (Dec. 2001), 131-145. DOI=http://doi.acm.org/10.1145/502059.502048

In order to operate at the scale of the Internet, the Internet’s current interdomain routing protocol, BGP-4, aggregates routes, but at the same time hides details of topology and traffic conditions, which leads to time intervals of the order of tens of minutes (or even hours) for BGP’s recovery mechanisms to converge on a consistent routing table. This paper presents the Resilient Overlay Network (RON) as a remedy to BGP’s problems.

A RON consists of a set of nodes residing in a variety of routing domains, which maintain a full mesh of routes between themselves using a link state routing protocol. Link metrics are collected using a combination of active periodic probing, and passive listening. This allows RON to route around failures in the order of seconds, a significant improvement over BGP. For a RON of 50 nodes, an approximate overhead of 30 kbps will be required with the implementation described in the paper; this does not seem excessive considering the capacity of current Internet links.

An interesting feature of RON is an explicit interaction with applications for the purposes of deciding metrics for route selection. By default, RON offers latency, loss and throughput metrics, though this set may be extended to include application-specific metrics such as jitter. This allows applications to choose trade-offs between different metrics for route selection, and in effect enables application-specific routing. In this context, the paper introduces the useful notion of a performance failure, where a link does not provide adequate performance to the application using it.

RON also puts in place rich routing policies, which may restrict the flow of certain kinds of traffic over certain routes: for instance, in the tests the authors conducted, policies were put in place to prevent commercial traffic from flowing over the non-commercial Internet2 backbone.

Two sets of tests were run, one of 12 academic networks (several of which are on the Internet2 backbone), and another which added 4 commercial networks. In the former case, RON was able to overcome all observed outages, and in the latter, 60%, with an average of 18 seconds to recover from a fault. In addition, thanks to its explicit attention to application-dependent metrics, RON is able to improve loss rate, latency and TCP throughput.

As the authors themselves note, individual RONs will not scale to the level of the Internet: RONs operate best with relatively small numbers of nodes. This smaller scale is also important administratively, from the perspective of attempting to deal with malicious nodes, as such problems are obviously easier to deal with at an administrative, rather than a technical level. In my reading, RON is an important and useful adjunct to BGP routing, both for its rapid recovery from fault, but especially for its ability to route according to application-specified metrics.

October 21, 2008

On the (in)validity of Internet topology models

Filed under: Networks — Tags: , , , , — Ashwin @ 12:04 pm

Citations:

Faloutsos, M., Faloutsos, P., and Faloutsos, C. 1999. On power-law relationships of the Internet topology. In Proceedings of the Conference on Applications, Technologies, Architectures, and Protocols For Computer Communication (Cambridge, Massachusetts, United States, August 30 - September 03, 1999). SIGCOMM ‘99. ACM, New York, NY, 251-262. DOI=http://doi.acm.org/10.1145/316188.316229

Li, L., Alderson, D., Willinger, W., and Doyle, J. 2004. A first-principles approach to understanding the internet’s router-level topology. In Proceedings of the 2004 Conference on Applications, Technologies, Architectures, and Protocols For Computer Communications (Portland, Oregon, USA, August 30 - September 03, 2004). SIGCOMM ‘04. ACM, New York, NY, 3-14. DOI=http://doi.acm.org/10.1145/1015467.1015470

These are an interesting pair of papers, the first from what seems to be the beginning of attempts to extract “laws” of Internet topology, and the second more recent, offering constructive criticisms, and what seems to me a more realistic approach to modeling.

The Faloutsos’ paper studies the topology of the Internet circa 1997-1998, with three datasets of interdomain topology from BGP routing tables gathered at 6 month intervals, and one dataset of the topology of routers on the Internet. They analyse the data using a number of metrics: frequency of outdegree, node rank in order of decreasing outdegree, pairs of nodes and average number of nodes in a neighbourhood as a function of hop distance, eigenvalues of the graph, and order of eigenvalues. Using these metrics, they come to the conclusion that the Internet is subject to three power laws, in spite of a 45% growth over the period of their interdomain topology datasets.

First, the outdegree of a node is proportional to the rank of the node by outdegree raised to a constant rank exponent. They make the claim that this is indicative of the financial and functional considerations in adding a new edge to a node, providing a view into the dynamics of the ways in which domains interconnect. Second, the frequency of a given outdegree is proportional to the outdegree raised to the power of a constant outdegree exponent. From this, they propose the outdegree exponent as a useful metric for estimating the distribution of outdegree amongst Internet nodes. Finally, the eigenvalues of the graphs are proportional to the order raised to a constant eigen exponent. Finding that the eigen exponent is practically the same for the interdomain graphs, but significantly different for the router graph, they conclude that the eigen exponent can be used to distinguish amongst families of graphs.

In addition, they propose an approximation whereby the number of pairs of nodes within a hop distance is proportional to the hop distance raised to a hop-plot exponent. From this, they define the effective diameter of a graph as a function of the numbers of nodes and edges in the graph, and the hop-plot exponent. If accurate, this could have material implications for selection of TTL for broadcast, as the effective diameter provides a high probability estimate of the distance between any pair of nodes. It seems odd to me that they forged ahead with this particular analysis when their datasets lacked sufficient data to prove or disprove this hypothesis.

The Faloutsos’ metrics are interesting and valuable, but I find the failing of this paper in the assumption that certain observations about properties of the Internet amount to laws governing the growth of the Internet. As the second paper mentions, there have been arguments put forward that power law type distributions may arise for purely mathematical and statistical reasons. There is no more reason to believe that there are natural laws governing the global spread of the Internet than that there are natural laws governing the “free” market conditions under which interconnection takes place. The dynamics of differing regulatory regimes, the range of commercial agreements from peering to transit, varying physical geography, and overall economic development are just some of the factors that contribute to the manner in which the Internet spreads.

The second paper seems an explicit attempt to debunk some of the network models that have evolved from the Faloutsos’ work. As the authors point out, for a given distribution of node degree, there are a number of possible networks of distinctly different forms. This paper makes two broad contributions: an understanding of router-level constraints on network performance, and a definition of characteristics that form a heuristically optimal network.

The maximum number of packets that a router can process per unit time constrains the number of links it can handle, and the connection speed per link, which form what the paper calls a feasible region, and corresponding efficient frontier of possible bandwidth-degree combinations. Different kinds of demands may be placed on a router depending on where it is located in the network: core routers may handle fewer links, but larger numbers of flows, while edge routers must deal with the converse case, many links, and relatively fewer flows, aggregating traffic to the core. From this, a notion of a heuristically optimal network is put forward, with a loose mesh of high speed low degree routers at the core, and a hierarchical tree-like structure of slower high degree routers at the edge.

Performance comparisons are made for heuristically optimal, sub-optimal and Abilene-inspired networks against networks formed by two random generation mechanisms: preferential attachment and a general model of random graphs. The paper shows that performance in terms of overall network throughput as well as end-user bandwidth and router utilization is significantly higher for the heuristically optimal network, followed by the Abilene-inspired and sub-optimal networks, and finally the randomly generated networks.

A very useful concept introduced here is that of a likelihood-performance plane, plotting the performance of networks against the likelihood of a network being produced through a random generation process. As the paper demonstrates, heuristically optimal and Abilene-inspired networks have both a far lower likelihood, and a far greater performance than randomly generated networks of the same node degree distribution. As with the paper we read on modeling wireless networks, this paper draws our attention to the fact that models of the Internet topology must take into account specific engineering considerations.

Older Posts »

Powered by WordPress