J. Abley
ISC
March 1, 2003
Hierarchical Anycast for Global Service Distribution
isc-tn-2003-1
This document describes an approach which allows a particular service on the Internet to be distributed, such that the service can be implemented by geographically and topologically dispersed components. By distributing components in this manner a stable service may be provided to a wide audience even in the event of serious problems which cause individual components of the distributed service infrastructure to fail or otherwise become unavailable. In addition:
Service for the F root nameserver (f.root-servers.net) is currently provided using this technique.
Copyright (C) Internet Software Consortium, Inc. (2003). All Rights Reserved.
Sets of host and network components which are capable of autonomously providing the service to be distributed are deployed in physically and topologically diverse points of the Internet. Each set of components is referred to as a "node" in this document.
The services being distributed are associated with particular IP addresses. Supernet routes which cover those IP addresses ("service supernets" in this document) are injected into the global routing system by each node. Request datagrams from clients of the distributed services are routed to exactly one node, where they are processed. This general approach is in common use, and is generally known as "anycast".
This document refines the general anycast technique described above by imposing additional structure on the routing policy associated with different nodes. This allows very widespread service distribution (very many nodes), and allows nodes to be deployed within regions with marginal network infrastructure, without sacrificing global stability of the service.
Two classes of node are described in this document:
Global Nodes provide a baseline degree of proximity to the entire Internet. Multiple global nodes are deployed to ensure that the general availability of the service does not rely on the availability or reachability of a single global node.
Local Nodes provide contained regions of optimisation. Clients within the catchment area of a local node may have their queries serviced by a Local Node, rather than one of the Global Nodes.
The desired algorithm for node selection by remote networks is for a path to a Local Node to be chosen in preference to a path to a Global Node. A low-latency, uncongested path is preferred to a high-latency, congested path.
The natural routing policy employed by network operators (in combination with the route export policy from Local Nodes and Global Nodes) approximates this behaviour.
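The preference order described above can be sketched as a sort key. This is only an illustration of the desired outcome, not of how BGP itself selects routes: the `Path` record, its field names, the node names, and the use of latency as the sole measure of path quality are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    node: str          # hypothetical node name
    is_local: bool     # True for a Local Node, False for a Global Node
    latency_ms: float  # illustrative stand-in for path quality


def select_path(paths):
    """Prefer any Local Node path; break ties by lowest latency."""
    if not paths:
        return None
    # False sorts before True, so Local Node paths (not is_local == False) win.
    return min(paths, key=lambda p: (not p.is_local, p.latency_ms))


candidates = [
    Path("global-a", False, 40.0),
    Path("global-b", False, 120.0),
    Path("local-x", True, 85.0),
]
chosen = select_path(candidates)
```

With these candidates the Local Node path is chosen even though one Global Node has lower latency; if the Local Node path is removed, the lower-latency Global Node is chosen instead.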
Each deployed Global Node should be designed to handle a full, global load of client requests. The catchment area of each Global Node is large, and the stability of Global Nodes is correspondingly important. Global Nodes should be well-connected and internally resistant to failure.
Each deployed Local Node should be designed and scaled to handle a full load of client requests from its local catchment area. The performance, internal robustness and connectivity requirements of Local Nodes are less than those of Global Nodes; however, it is important for clients within the catchment of a Local Node that the availability of the service from that node is stable (i.e. it does not oscillate, causing undue instability in the local routing system).
Availability of a service is signalled to the routing system by a node using BGP; a supernet which covers the service address is advertised to other ASes. Care should be taken to ensure that:
Each node advertises a consistent set of service supernets, each with a consistent origin AS. The AS_PATH attribute of advertisements made to external networks by all nodes should be the same.
Global Nodes should be provisioned with a rich set of transit providers and local peers, in order to make them highly reachable to the entire Internet.
The desired routing policy for a Local Node is one which allows the service supernets to propagate within a local catchment area, but which does not allow them to propagate globally. This policy is important since Local Nodes will typically not have the network connectivity or server capability required for a global load of queries.
Local Nodes advertise service supernets to external networks as peers, and not for transit. An AS which receives a route from a peer should advertise it to their customers, but not to any other peer or transit provider. If all adjacent ASes to the Local Node accept the service supernet advertisements as they would a route from a peer, the desired route policy of the Local Node is achieved.
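The export rule that adjacent ASes are expected to apply can be sketched as a small predicate. The peer rule is the one stated above; the handling of routes learned from customers and from transit providers follows common operator practice and is an assumption here, as are the `Rel` type and the function name.

```python
from enum import Enum


class Rel(Enum):
    CUSTOMER = "customer"
    PEER = "peer"
    TRANSIT = "transit"


def should_export(learned_from: Rel, neighbour: Rel) -> bool:
    """Decide whether a route learned from one neighbour is sent to another."""
    if learned_from is Rel.CUSTOMER:
        # Assumption: customer routes are advertised to every neighbour.
        return True
    # Routes learned from a peer (or, by assumption, from transit)
    # are advertised to customers only.
    return neighbour is Rel.CUSTOMER
```

Under this rule a service supernet received from a Local Node as a peer route reaches the receiving AS's customers, but is not passed on to that AS's other peers or transit providers.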
This approach distributes the enforcement of the desired propagation policy amongst a set of peer networks, and an operational problem in any one of those networks might lead to the service supernets being leaked beyond their intended propagation radius. The probability of such failures increases with the number of peers of a Local Node. This is not a desirable scaling characteristic.
To reduce the possibility of the service supernet being leaked, Local Nodes advertise service supernets to peers with the well-known "no-export" BGP community attribute. The default behaviour of the peer's routers should be to suppress advertisements of the prefixes to any other AS; this should reduce the chance that service supernets are accidentally leaked, since the "no-export" community is handled automatically by most vendors.
In some cases, the default policy imposed by the "no-export" community attribute may limit the catchment area of a Local Node to an extent which is draconian and undesirable. Peers may choose to strip the "no-export" community and apply other appropriate local policy; however, it is highly desirable that this activity is coordinated with the node operator, so that the expected propagation of service supernets is well-known.
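The effect of the "no-export" community on a peer's export decision, and of a peer choosing to strip it, can be sketched as follows. The community value is the well-known NO_EXPORT value from RFC 1997; the function name and the `strip_no_export` flag are hypothetical and do not correspond to any particular router implementation.

```python
NO_EXPORT = 0xFFFFFF01  # well-known community 65535:65281 (RFC 1997)


def exportable_to_external_as(communities, strip_no_export=False):
    """Return True if a route carrying these communities may leave the AS."""
    remaining = set(communities)
    if strip_no_export:
        # Local policy override; ideally coordinated with the node operator.
        remaining.discard(NO_EXPORT)
    return NO_EXPORT not in remaining


local_node_route = [NO_EXPORT]
default_behaviour = exportable_to_external_as(local_node_route)
after_stripping = exportable_to_external_as(local_node_route, strip_no_export=True)
```

By default the route stays within the receiving AS; once the community is stripped, onward propagation depends entirely on whatever other local policy the peer applies.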
In the event of a failure of a Global Node, traffic which had been handled by that Global Node will shift to another Global Node, if one is available. Clients whose requests are serviced by other nodes will see zero impact. Clients of the failed Global Node may experience short-term service disruption as the routing system reconverges.
In the event of a failure of a Local Node, traffic which had been handled by that Local Node will shift to a Global Node, if one is available. Clients whose requests are serviced by other nodes will see zero impact. Clients of the failed Local Node may experience short-term service disruption as the routing system reconverges.
It is possible that the catchment areas of two or more Local Nodes may overlap. Requests from a client located in the intersection of these catchment areas will be serviced by exactly one Local Node; in the event of a failure of that node, the client's request traffic may well shift to one of the other Local Nodes available to the client. For clients that are not within overlapping catchment areas, request traffic will always shift to a Global Node in the event that the Local Node fails, and never to another Local Node.
Since a node will typically be constructed from a variety of different host and network components, there is a possibility that a partial node failure will occur; for example, hosts may fail due to some software problem, while routers and switches remain operational. The internal failure modes of the node should be considered carefully in order that advertisement of service supernets ceases as soon as a node is no longer able to respond to queries. A node which continues to advertise service supernets without the internal capability to service queries will deny service to clients within its catchment area.
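One way to tie advertisement of the service supernets to the node's actual ability to answer queries is sketched below. It is a minimal illustration under stated assumptions: the host names, the `min_healthy` threshold, and the action strings are hypothetical, and the health results are assumed to come from some external service check that is not shown.

```python
def node_can_serve(host_health, min_healthy=1):
    """host_health maps a host name to True if its latest service check passed."""
    healthy = sum(1 for ok in host_health.values() if ok)
    return healthy >= min_healthy


def next_action(currently_advertising, host_health, min_healthy=1):
    """Return the routing action implied by the current health results."""
    can_serve = node_can_serve(host_health, min_healthy)
    if currently_advertising and not can_serve:
        return "withdraw"   # stop attracting queries the node cannot answer
    if not currently_advertising and can_serve:
        return "announce"   # service has recovered
    return "no-change"


# Routers are up, but every host has failed its service check.
action = next_action(True, {"host-1": False, "host-2": False})
```

In practice some damping of the "announce" transition may be desirable, so that a flapping host does not cause the oscillation in the local routing system that Local Nodes are meant to avoid.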
If a peer of a Local Node leaks the service supernet to peers and transit providers, the catchment area of the Local Node may be extended beyond the desired radius. This has the potential to deny service to clients within the enlarged radius, as connectivity between that new service region and the Local Node may not be sufficient to handle the increased traffic load. Such leaks are not uncommon in the Internet, due mainly to operator error but also in some cases due to software faults in routers.
Certain precautions are taken to make it less likely that such leaks will occur (see Section 4.3).
The potential damage caused by such a leak can also be naturally mitigated by a rich deployment of Global Nodes and other Local Nodes; if clients within the enlarged propagation radius of the leaked service supernet are better served by another node, the Local Node leak will have no residual effect on the service as seen by those clients.
A leak of a service supernet originating from a particular Local Node may be eliminated by withdrawing the announcement of the service supernet from that node. It may be appropriate to institute manual or automated procedures to perform such an exercise in the event that a leak is detected (see Section 6.2).
Although each node is intended to be reached by a particular community of clients, there is also a requirement to be able to reach individual nodes in a predictable fashion for the purposes of systems administration, and so that service performance can be monitored. For this reason each node has a set of unique, unicast management addresses associated with it.
Where interior connectivity connects all nodes (e.g. through a private frame-relay management network) that internal network can be used to provide management access to each node. Where interior connectivity does not exist, each node may obtain transit from local ISPs such that appropriate reachability to management addresses is provided to each node through the Internet.
To detect routing policy failures due to leaked service supernets (see Section 5.4), the global routing system should be measured in appropriate places. It is recommended that the route propagation policy for the service supernet be chosen such that the origin node for a particular advertisement is identifiable using attributes found in the route, such as AS_PATH or community strings, in order to facilitate the construction of a robust anomaly detection algorithm.
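A minimal sketch of such an anomaly check follows, assuming that each node tags its advertisements with a distinct community and that the tag survives to the measurement point. The community strings, node names, AS numbers (taken from the range reserved for documentation), and the shape of the observation records are all hypothetical.

```python
def find_leaks(observations, node_of_community, expected_catchment):
    """Report routes from a node seen outside its expected catchment.

    observations: iterable of (vantage_as, communities) pairs.
    node_of_community: maps a community string to the originating node.
    expected_catchment: maps a Local Node to the set of ASes expected to
        see its route; nodes absent from the map are unrestricted.
    """
    leaks = []
    for vantage_as, communities in observations:
        for community in communities:
            node = node_of_community.get(community)
            if node is None:
                continue  # not one of our identifying tags
            allowed = expected_catchment.get(node)
            if allowed is not None and vantage_as not in allowed:
                leaks.append((vantage_as, node))
    return leaks


node_of_community = {"64496:101": "local-x", "64496:1": "global-a"}
expected_catchment = {"local-x": {64500, 64501}}
observations = [
    (64500, ["64496:101"]),  # inside the Local Node's catchment
    (64510, ["64496:101"]),  # outside it: a suspected leak
    (64511, ["64496:1"]),    # Global Node route, unrestricted
]
leaks = find_leaks(observations, node_of_community, expected_catchment)
```

A detected leak could then feed the manual or automated withdrawal procedure for the affected Local Node.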
There may be additional benefits in monitoring traffic at each node for a distributed service, in addition to the benefits of monitoring traffic directed at a non-distributed service (see Section 7.1).
Many Internet services attract malicious traffic. The distribution of denial-of-service attack traffic along with non-malicious traffic provides the following opportunities:
The possibility that foreign Internet routers might advertise a service supernet, constructing a rogue service node with which to divert client queries, exists for all services, distributed and non-distributed.
Distribution of services using anycast will tend to decrease the possible catchment of such a rogue node, since the rogue node must compete in the routing system with more paths to the service address than would be the case for a non-distributed service. Careful monitoring of the routing system will aid detection of rogue nodes (see Section 6.2).
There is an increased risk, however, that a rogue node will be more difficult to identify for a third party operator when a service is distributed. Legitimate nodes may be added and removed by the service operator quite regularly, and distinguishing between legitimate and illegitimate changes is more difficult than simply noticing a new path, or a new advertised prefix length.
It is recommended that the community of operators interested in the availability of particular services be kept informed of legitimate routing policy changes, in order that rogue nodes can be more easily identified.
Much of the material in this technical note is based on work done by Stephen Stuart and Paul Vixie in providing anycast service for the F root server.