Thursday, July 2, 2009

Review: On Inferring AS relationships in the Internet

Contractual commercial agreements between administrative domains play a crucial role in shaping the structure of the Internet and the end-to-end performance characteristics. Since interdomain routing is controlled by Border Gateway Protocol (BGP) which is policy-based, connectivity does not imply reachability. A global picture of Autonomous System (AS) relationships is an important aspect of the Internet Architecture.

The paper, On inferring autonomous system relationships in the internet by L. Gao [1] proposes an augmented AS graph representation to capture AS relationships. They classify the relationship between a pair of interconnected ASs as customer-provider, peering and sibling relationship. Since there was no publicly available information about inter-AS relationships, they made use of heuristic algorithms to infer the augmented AS graph from BGP routing tables. Their algorithm is based on the fact that each AS follow the selective export rule which indicates that there should be a certain pattern that can be use to infer export policies from routing table entries. Their final algorithm first coarsely classifies AS pairs into having provider-customer or sibling relationships in phase 1, identify all the AS pairs that cannot peer with each other and finally in phase 3, identify the peering relationships from the rest by using the heuristic that two peering ASs' degree do not differ significantly. They created two algorithms for phase 1, the basic algorithm which assumes that all BGP speaking routers are configured correctly and the refined algorithm that uses counting to reduce the possibility of incorrect inference due to misconfigured routers. They performed an experimental study using routing tables retrieved from Route Views server in Oregon and among the connected AS pairs, 90.5% of the AS pairs have customer-provider relationships, less than 1.5% of the AS pairs have sibling relationships and less than 8% of the AS pairs have peering relationships. Then they were able to verify their results with AT&T and the results show that 100% of their inferred customers and inferred peers were confirmed. Meanwhile only 20.5% of their inferred siblings were confirmed and out of all their inference results, 98.9% were confirmed. They also verified their inferred sibling relationships with the information from WHOIS lookup service and more than half of the inferred sibling relationships were confirmed

However in a more recent study in [2], its author's claim in the Abstract that the approach presented here are local in nature and do not capture the global hierarchical structure. In their work they define a way to infer relationships from collected data adding consideration for the global hierarchy constrains. They define the Acyclic Type of Relationship AToR problem that captures the global hierarchy and present an efficient algorithm that allows determining if there is a hierarchical assignment without invalid paths. They said that their experiments and simulation results shows that their algorithm is able to classify the type of relationship between ASes better than previous algorithms but due to lack of access rights to the whole paper I'm not able to confirm this.

The paper, being the pioneer in this problem of inferring AS relationships in the Internet did a splendid job of explaining and providing the necessary details in introducing and attacking the problem. The Internet was just starting to be popular in the time of the study but has been in use for many years prior to it. The unavailability of public data might have discouraged researchers back then but it's great that the authors of this paper were able to find a way that is like reverse engineering.

The problem of inferring or deducing the AS relationships in the Internet is indeed an interesting topic that could prove to be useful if the results of algorithms like this reach a certain level of reliability. The fault in their algorithm was in their assumption that the degree of two ASes that has a peering relationship do not differ by more than a value R. They made this assumption without providing a clear way of how to set it. The value (60) that they have chosen in the paper may not provide the best result today because of the massive increase in the users and usage of the Internet. However, they do suggest making use of other information that may not be available in the time of their writing with the inference techniques to improve accuracy.

References:

[1] L. Gao, "On inferring autonomous system relationships in the internet," IEEE/ACM Transactions on Networking,vol.9, no.6, December 2001.

[2] R. Cohen and D. Raz, "Acyclic type of relationships between autonomous systems," INFOCOM 2007. 26th IEEE International Conference on Computer Communications, pp. 1334-1342, May 2007.

About Interdomain Internet Routing

In the lecture, Interdomain Internet Routing by H. Balakrishnan and N. Feamster it is explained how routing between different administrative domains work in the Internet. It is stated that the Internet service is provided by a large number of commercial enterprises called autonomous systems (ASes) in competition of each other but are being required to cooperate for global connectivity. These ASes implements some sets of policies in deciding how to route its packets to the rest of the Internet and how to export the routes of itself, its customers and learned routes to other ASes. In the first part of the lecture the Inter-AS business relationships, transit and peering were discussed. Transit is a provider-customer relationship wherein the provider charges its customers for Internet access. Meanwhile peering on the other hand is a mutual agreement between two providers to allow access to subset of each other's routing tables wherein a financial settlement may not be involve as long as the traffic ration is not highly symmetric. Although peering is beneficial to ASses, its main drawback would be it not being able to give revenue, transit relationships does. Internet Service Providers (ISPs) filter the routes to export and most of them end up providing selective transit. While in importing, routes are imported in the following order: customer, peer and provider.

In the second part of the lecture, the most important features of the Border Gateway Protocol (BGP) were discussed. The BGP was designed to meet three important needs scalability, policy and cooperation under competitive circumstances. This is so that the Internet routing infrastructure remained scalable as the number of connected networks increased, the ASses are capable of implementing various forms of routing policy and to gracefully handle the transfer the "backbone" Internet infrastructure from single to multiple administrative entities. BGP runs on TCP over port 179. The two messages sent by BGP routers are UPDATE messages which announce a change or removal of route and the KEEPALIVE messages to ensure that it is still functioning properly. BGP disseminates routes within between routers on the different ASses using eBGP sessions and iBGP sessions for routers on the same AS. BGP allows the policy expression of ASses by allowing network operators to configure routers to manipulate route attributes when disseminating routes. The most important attributes are LOCAL PREF, ASPATH and MED arranged in priority order. In the end, the authors concluded that although BGP is a simple protocol, its operation in practice is extremely complex because of its configuration flexibility.

This lecture is a great introduction on how routing in the Internet happens between its domains. The design of the internet prioritized the needs of the military, on the other hand interdomain routing prioritize the need of the many administrative systems to be truly independent of each other but still cooperate. BGP gives the ASses the ability to implement their own policies based on their best interest. As consequence of this, the consumers that rely on these providers may not be getting the best possible service.

Wednesday, June 24, 2009

Review: The Design Philosophy of the DARPA Internet Protocols

The paper, The design philosophy of the DARPA internet protocols by David D. Clark [1], presents some of the early reasoning and motivations that contributed to the how the Internet was designed. The top level goal for the DARPA Internet Architecture was to develop an effective technique for multiplexed utilization of existing interconnected networks. They established a list of detailed goals ranked by their importance to DARPA project to achieve the fundamental goal.

Because this was a military project, the first goal on the list was survivability, which means the Internet should continue to supply communication service even though networks and gateways are failing. In other words, the connection should never be lost unless there is no physical path. The architecture chose the "fate sharing" approach wherein the end host is made responsible of maintaining the state information about an ongoing connection, and this lead to the use of "datagram" networks which made the internet, "connectionless". The second and third goal was to support a variety of types of services and utilize a wide variety of network technologies. Because of this, TCP and IP became two layers and the datagram became the basic building block in which multiple types of services could be constructed from. The remaining goals were: The internet architecture must permit distributed management of its resources, must be cost effective, permit host attachment with a low level of effort and the resource used in the Internet architecture must be accountable. Because these goals were lower in importance, they were less effectively met or not completely engineered but nonetheless contributed to the success of the Internet. And it was mentioned by the author that if the order of the goal list were different, a different Internet architecture would have been created. And with this, he suggests that there may be a better design of the Internet if the designer's priorities match with those of the actual users.

The author did a great job of giving the details on how and why the Internet was designed. The designers of the Internet architecture were successful in meeting their ordered list of goals which suited the need of the DARPA project. However I do agree with the author that the priorities of the list of goals have changed since the migration from military to commercial setting. Some of the goals that were lower in importance when the Internet was created that have less been effectively met might led to the Internet's deficiencies today.

It was mentioned in the paper that an alternative to interconnecting existing networks was to design a unified system which incorporated a variety of different transmission media or a multi-media network. At the time of the design of the Internet, the support for multimedia traffic was not a priority because it was not one of their needs at that time and also multimedia traffic was a minimum. However, today's Internet is heavily burdened by large amount of multimedia traffic. Is the Internet still sufficient in providing good performance? Although the Internet is flexible enough to support the various services that have been created over the years, some of these services might be wearing this flexibility.

References:

[1] D. D. Clark, "The design philosophy of the DARPA Internet protocols," ACM SIGCOMM Computer Communication Review, vol. 18, issue 4, August 1988.

Review: End to End Arguments in System Design

The paper, End-to-end arguments in system design by J.H. Saltzer, D.P. Reed and D.D. Clark [1] presents a design principle called end-to-end argument that discourages the placement of functions in the lower level of the system unless they provide performance enhancements. It suggest instead that these functions be move upward in the layered system, closer to the application or the End where the requirements for these functions are more defined. The only justification of creating a lower level function is to provide performance optimization but only to certain degree. However, the end-to-end argument is not absolute but rather a guideline and one must use some care to identify the Ends to take the right approach.

As mentioned in [2], the end-to-end principle is one of the key architectural guidelines of the Internet. Because of the bias of the principle in the movement of function "up" from the lower level or the network and "out" to the edge node or applications, the Internet network have remained rather simple and general and because of this, it is able to support many different applications like e-mail, World Wide Web (WWW), web browsers, multi-player games etc. The application of the end to end arguments makes the network transparent, packets go in and they come out and that is all that happens in the network.

The biggest advantage of use of the end to end arguments as a design principle is the simplicity that it produces at the lower levels of a system. Building simple rather than complex lower layers increases the potential uses of these layers which may be unknown or unpredictable at design time. Because of this design, the Internet was able to grow. By making the host responsible for reliability, the network's role was made simple and it became possible to join the different networks that existed at that time.

However the principle's biggest advantage might well be its source of fault. Aside from the additional work that is left out for the higher layer in implementing functions, placing the functions in the slower upper layers has potential performance cost.


References:

[1] J. H. Saltzer, D.P. Reed and D. D. Clark, "End-to-end arguments in system design," ACM Transactions in Computer Systems, vol. 2, no. 4, pp. 277-288, November 1984.

[2] M. S. Blumenthal and D. D. Clark, "Rethinking the design of the Internet: The end to end arguments vs. the brave new world," ACM Transactions on Internet Technology, vol. 1, no. 1, pp. 70-109, August 2001.