On Inferring Autonomous System Relationships in the Internet
Citation: Gao, L. 2001. On inferring autonomous system relationships in the internet. IEEE/ACM Trans. Netw. 9, 6 (Dec. 2001), 733-745. DOI= http://dx.doi.org/10.1109/90.974527
The relationships between autonomous systems on the Internet can be near impossible to deduce accurately, as they are, in essence, a series of bilateral commercial agreements. In this paper, Gao presents a technique to infer autonomous system relationships by looking for patterns in the ASN paths in BGP routing tables.
To do this, Gao starts with some general assumptions of rational behaviour for autonomous systems. From these assumptions, she derives the Selective Export Rule, which defines these behaviours as mechanisms by which autonomous systems selectively export routes to their peers, customers and siblings. She then derives patterns that we might expect to see in BGP routing tables, based on the Selective Export Rule: Valley-Free, Downhill Path, Uphill Path, Maximal Downhill Path and Maximal Uphill Path.
Gao combines these with the idea that if we consider the graph of all autonomous systems, those with higher degree are more likely to be larger (and hence more likely to peer) than those with a smaller degree. The result is a set of algorithms, which when applied to publicly available BGP routing tables, and compared to internal AT&T routing tables, provide surprisingly high rates of accuracy in their ability to predict different kinds of relationships. While the accuracy rates vary greatly - from as high as 100% for customer relationships to to 25% for sibling relationships - there is still immense value to be had from these figures alone.
Additionally, as a side-effect of this research, heuristics were developed whereby misconfigured routers could be identified, which would no doubt be of great benefit to network administrators tasked with managing large BGP deployments.
I like this paper, a lot, both for its content, and its presentation. Gao is clear about the assumptions she makes, and the manner in which she builds on them. She is also clear about the limitations she sees to the data that was available for analysis. I found the actual meat of the paper, the heuristic mechanism for identifying different kinds of relationships, incredibly useful, and exciting.