by Andrew Johnstone
In: Linux
18 Jan 2010
I previously described how to configure HAProxy and ucarp to load balance traffic between servers and share common IP addresses in the event of failure; however, this still leaves holes in the availability of the application. That scenario only accounts for availability within a single data center and does not address how traffic and application services are managed between two or more geographically distinct data centers.
I’ve attached a simplified diagram of our current architecture to help highlight single points of failure (Bytemark’s load balancing is simplified in the diagram).
It becomes much more complicated to handle fail-over between multiple data centers. For example, if data center 1 fails entirely, we need to ensure that either the VIPs are routed to the correct data center or DNS is changed. This is a two-fold problem: by the time your DNS change has propagated, an unknown amount of time has passed before the site becomes available again, and if you do not own the IP addresses you cannot move them to a different data center.
There are a number of considerations you can take into account, each with varying implications on cost.
There are several things you can do at the DNS level to reduce the effect of any outage.
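The usual measures are along the lines of keeping record TTLs low (so a change of address propagates quickly), publishing more than one A record, and using a DNS provider that supports health-checked failover. As a minimal sketch, you can check what TTL and addresses are currently being served for a record; example.com is used here purely as an example:

# The second column of each answer line is the TTL being handed out to resolvers
dig +noall +answer example.com A

# With multiple A records (round-robin DNS), clients have more than one address to fall back on
dig +short example.com A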
All of the above still leaves margin for outages, and DNS alone cannot ensure high availability, despite helping to some degree. As mentioned above, it is possible to move network addresses from one data center to another. However, re-routing IP addresses becomes fairly tricky if you are working with small CIDR blocks, and the options depend on the type of IP network allocation used. In addition, it involves coordination with the hosting provider and upstream transit providers.
There are two types of IP Network Allocation.
- PA – Provider Aggregatable
- PI – Provider Independent
PA – Provider Aggregatable
- RIPE assigns a large block of networks to an LIR (Local Internet Registry)
- LIR assigns smaller networks to customers from the larger block
- PA addresses cannot be transferred between providers
- A request form must be filled in by each end customer, justifying the quantity required
PI – Provider Independent
- Not associated with any provider and cannot be aggregated
- Used by some dual-homed networks
- RIPE performs much stricter checking of applications than for PA
- Applicant must justify “why not PA?”
- Smallest allocation is a /24 (256 IP addresses)
- LIR (Local Internet Registry) submits form on customer’s behalf
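If you are unsure how an existing block is registered, the RIPE database records the allocation status of the corresponding inetnum object, which is one way to tell PA and PI space apart. A quick check (the address below is just an example, not one of ours):

# The "status:" attribute of the inetnum object shows e.g. ASSIGNED PA or ASSIGNED PI
whois -h whois.ripe.net 193.0.6.139 | grep -i '^status'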
In order to have high availability and re-route traffic you will need the following.
- A block of Provider Independent addresses
- Your own AS (Autonomous System) number
- Connectivity to two or more upstream transit providers, each willing to run a BGP session with you
Whilst it is possible to use Provider Aggregatable addresses and advertise the fragment to the other providers, “wherever the fragment is propagated in the Internet the incoming traffic will follow the path of the more specific fragment, rather than the path defined by the aggregate announcement”, and you will therefore need Provider Independent addresses.
In order to acquire PI addresses you must register them either through an LIR (Local Internet Registry) or by becoming an LIR yourself through a Regional Internet Registry (RIR).
It is relatively cheap to acquire an AS number and PI addresses; they can be obtained through Secura Hosting, which is an LIR.
Once you have your address block, AS number and multiple upstream connections, you announce your address block to each provider and receive their routing tables via eBGP sessions. You can then configure Quagga, a software routing suite, to do this.
For more information on BGP see:
BGP Blackmagic: Load Balancing in “The Cloud”
su
apt-get install quagga quagga-doc
touch /etc/quagga/{bgpd.conf,ospfd.conf,zebra.conf}
chown quagga:quaggavty /etc/quagga/*.conf
echo 'password YourPassHere' > /etc/quagga/bgpd.conf
echo 'password YourPassHere' > /etc/quagga/ospfd.conf
echo 'password YourPassHere' > /etc/quagga/zebra.conf
sed -i 's/bgpd=no/bgpd=yes/' /etc/quagga/daemons
# By default Linux blocks traffic that goes out from one interface and comes back in on another,
# so disable reverse-path filtering.
sed -i 's/.net.ipv4.conf.all.rp_filter=[0,1]/net.ipv4.conf.all.rp_filter=0/;s/.net.ipv4.conf.lo.rp_filter=[0,1]/net.ipv4.conf.lo.rp_filter=0/;s/.net.ipv4.conf.default.rp_filter=[0,1]/net.ipv4.conf.default.rp_filter=0/' /etc/sysctl.conf
sysctl -p
/etc/init.d/quagga start
andrew-home:~# telnet localhost bgpd
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Hello, this is Quagga (version 0.99.15).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

User Access Verification

Password:
andrew-home> enable
andrew-home# conf t
andrew-home(config)# hostname R1
R1(config)# router bgp 65000
R1(config-router)# network 10.1.2.0/24
R1(config-router)# ^Z
R1# wr
Configuration saved to /usr/local/quagga/bgpd.conf
R1# show ip bgp
BGP table version is 0, local router ID is 0.0.0.0
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.1.2.0/24      0.0.0.0                  0         32768 i

Total number of prefixes 1
R1# exit
Connection closed by foreign host.

# A few additional commands
show ip bgp summary
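Beyond announcing your own network, each upstream transit provider needs its own eBGP session. As a minimal sketch of what /etc/quagga/bgpd.conf might end up looking like for a dual-homed site (the neighbour addresses and remote AS numbers below are placeholders, not values from this setup):

! /etc/quagga/bgpd.conf -- minimal dual-homed sketch; neighbour addresses and AS numbers are placeholders
hostname R1
password YourPassHere
!
router bgp 65000
 bgp router-id 10.1.2.1
 ! the PI block being announced to both providers
 network 10.1.2.0/24
 ! one eBGP session per upstream transit provider
 neighbor 192.0.2.1 remote-as 64600
 neighbor 192.0.2.1 description transit-provider-1
 neighbor 198.51.100.1 remote-as 64700
 neighbor 198.51.100.1 description transit-provider-2
!
log file /var/log/quagga/bgpd.log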
I won’t go too much into configuring this; however, there are a few additional resources that can help.
All transit providers for the site accept a prefix advertisement from the multi-homed site and advertise this prefix globally in the inter-domain routing table. When connectivity between the local site and an individual transit provider is lost, normal operation of the routing protocol will ensure that the routing advertisement corresponding to this particular path is withdrawn from the routing system, and those remote domains which had selected this path as the best available will select another candidate path as the best path. Upon restoration of the path, the path is re-advertised in the inter-domain routing system. Remote domains will undertake a further selection of the best path based on this re-advertised reachability information. Neither the local nor the remote hosts need to have multiple addresses, nor to undertake any form of address selection.
Multi-Homing
This approach also has its problems, which cause some concern about multi-homing via Provider Independent addresses; Provider Aggregatable addresses, however, do not typically suffer from the first two points below.
The above points do not really make this approach feasible for small CIDR blocks, as it will be difficult to justify a large enough address range, as well as to provide the infrastructure needed for BGP.
We originally hosted a large number of servers with EC2, however we have since moved most of our services away from EC2 due to the excessive cost of large and extra-large instances. At present we have moved a large number of them to Bytemark. The following does put a dependency on a single provider…
ELB does not immediately release any IP address when traffic decreases. Instead, it removes the IP address from the associated DNS name. However, the IP address continues to be associated with the load balancer for some period of time. During this time, ELB monitors various metrics to check if this IP address is still receiving traffic, and releases it when it is appropriate to do so.
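Because ELB scales by swapping the A records behind the load balancer’s DNS name rather than by holding fixed IPs, clients and health checks should always resolve the DNS name rather than caching its addresses. A quick way to observe this (the DNS name below is a placeholder; the real one is reported by elb-describe-lbs):

# Show the load balancer and its DNS name
elb-describe-lbs test
# Resolve the ELB DNS name (placeholder); the returned A records change as ELB scales
dig +short test-1234567890.us-east-1.elb.amazonaws.com

The script below brings up a small test ELB with a couple of instances to experiment with this behaviour.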
#!/bin/bash
# WARNING: running './load.balancing.sh remove' terminates every instance associated with
# the keypair below, so be careful when changing it. Probably best to leave as is.
KEYPAIR=loadbalancing-test;
ELB_NAME='test';
NODES=2;
AMI=ami-b8446fcc;

function remove_elb {
    #elb-delete-lb $ELB_NAME;
    ec2-describe-instances | grep $KEYPAIR | grep -v terminated | awk '{print $2}' | while read instanceId; do
        ec2-terminate-instances $instanceId;
    done;
    ec2-delete-keypair $KEYPAIR && rm ~/.ssh/$KEYPAIR;
}

function create_elb {
    ec2-add-keypair $KEYPAIR > ~/.ssh/$KEYPAIR;
    chmod 600 ~/.ssh/$KEYPAIR;
    ssh-keygen -p -f ~/.ssh/$KEYPAIR;
    elb-create-lb $ELB_NAME --availability-zones $EC2_AVAILABILITY_ZONE --listener "protocol=http,lb-port=80,instance-port=80"

    # Create instances to attach to the ELB
    for i in `seq 1 $NODES`; do
        ec2-run-instances $AMI --user-data-file install-lamp -k $KEYPAIR
    done;

    # Authorize CIDR block 0.0.0.0/0 for SSH and HTTP
    ec2-authorize default -p 22
    ec2-authorize default -p 80

    ec2-describe-instances | grep $KEYPAIR | grep -v terminated | awk '{print $2}' | while read instanceId; do
        echo elb-register-instances-with-lb $ELB_NAME --instances $instanceId;
        elb-register-instances-with-lb $ELB_NAME --instances $instanceId;
        # # Allocate addresses for each node
        # # You may need to contact EC2 to increase the number of IP addresses you can allocate,
        # # however these will be hidden, so not important
        # while [ "`ec2-describe-addresses | grep -v 'i-' | awk '{print $2}' | wc -l`" -lt "$NODES" ]; do
        #     ec2-allocate-address;
        # done;
        #
        # # Associate non-associated addresses
        # ec2-describe-addresses | grep -v 'i-' | awk '{print $2}' | while read ip_address; do
        #     echo $ip_address;
        #     ec2-associate-address $ip_address -i $instanceId
        # done;
    done;

    elb-configure-healthcheck $ELB_NAME --headers --target "TCP:80" --interval 5 --timeout 2 --unhealthy-threshold 2 --healthy-threshold 2
    elb-enable-zones-for-lb $ELB_NAME --availability-zones $EC2_AVAILABILITY_ZONE
}

function test_elb {
    SERVER="`ec2-describe-instances | grep $KEYPAIR | grep -v terminated | head -n1 | awk '{print $4}'`"
    instanceId="`ec2-describe-instances | grep $KEYPAIR | grep -v terminated | head -n1 | awk '{print $2}'`"
    echo "Shutting down $SERVER";
    ssh -i ~/.ssh/$KEYPAIR root@$SERVER "/etc/init.d/apache2 stop"
    sleep 11;
    elb-describe-instance-health $ELB_NAME
    ssh -i ~/.ssh/$KEYPAIR root@$SERVER '/etc/init.d/apache2 start'
    sleep 2;
    elb-describe-instance-health $ELB_NAME
}

if [ $# -lt 1 ] || ( [ "$1" != "create" ] && [ "$1" != "remove" ] && [ "$1" != "test" ] && [ "$1" != "health" ] ); then
    echo "Usage: `basename $0` create|remove|test|health";
    exit 1;
fi;

case $1 in
    "remove") remove_elb; ;;
    "create") create_elb; ;;
    "test") test_elb; ;;
    "health") elb-describe-instance-health $ELB_NAME; ;;
esac;
Change the IP addresses 192.0.32.10:80 to the relevant servers; presently this points to example.com.
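For instance, the backend addresses can be swapped in the user-data script before the instances are launched; the 10.0.0.x addresses below are purely placeholders for your real web servers:

# Point each HAProxy backend at a real web server (placeholder addresses shown)
sed -i -e 's/web1 192.0.32.10:80/web1 10.0.0.11:80/' \
       -e 's/web2 192.0.32.10:80/web2 10.0.0.12:80/' \
       -e 's/web3 192.0.32.10:80/web3 10.0.0.13:80/' \
       -e 's/web4 192.0.32.10:80/web4 10.0.0.14:80/' install-lamp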
install-lamp
#!/bin/bash
export DEBIAN_FRONTEND=noninteractive;
apt-get update && apt-get upgrade -y && apt-get install -y apache2 haproxy subversion;
tasksel install lamp-server;
echo "Please remember to set the MySQL root password!";
IP_ADDRESS="`ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' | cut -d: -f2 | awk '{ print $1}'`";
echo $IP_ADDRESS > /var/www/index.html;
cat > /etc/haproxy/haproxy.cfg <<EOF
global
    log /dev/log local1
    maxconn 4096
    #chroot /usr/share/haproxy
    user haproxy
    group haproxy
    daemon
    #debug
    #quiet

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000
    option httpclose

listen gkweb $IP_ADDRESS:80
    mode http
    balance leastconn
    stats enable
    stats realm LB\ Statistics
    stats auth stats:password
    stats scope .
    stats uri /stats?stats
    #persist
    server web1 192.0.32.10:80 check inter 2000 fall 3
    server web2 192.0.32.10:80 check inter 2000 fall 3
    server web3 192.0.32.10:80 check inter 2000 fall 3
    server web4 192.0.32.10:80 check inter 2000 fall 3
EOF
sed -i -e 's/^ENABLED.*$/ENABLED=1/' /etc/default/haproxy;
/etc/init.d/apache2 stop;
/etc/init.d/haproxy restart;
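As a quick sanity check, HAProxy can validate the configuration the script has just written before it is restarted (an extra step beyond the user-data script above):

# Parse /etc/haproxy/haproxy.cfg and report errors without starting the daemon
haproxy -c -f /etc/haproxy/haproxy.cfg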
References:
Multihoming to one ISP
Multipath Load Sharing with two ISPs
Small site multihoming
EBGP load balancing with EBGP session between loopback interfaces
Failover and Global Server Load Balancing for Better Network Availability
The “Elastic” in “Elastic Load Balancing”: ELB Elasticity and How to Test it
EC2 Instance Belonging to Multiple ELBs, (Creating different tiers of service, Splitting traffic across multiple destination ports)
I have been a developer for roughly 10 years and have worked with an extensive range of technologies. Whilst working for relatively small companies, I have worked with all aspects of the development life cycle, which has given me broad and in-depth experience.
1 Response to High Availability Across Multiple Data Centers, Multihoming and EC2
shahzaib
November 19th, 2014 at 10:41 am
Nice post. I have a question regarding BGP. As you said, we’ll need to advertise our route from multiple geo locations to multiple ISPs in order to re-route incoming traffic towards the other DC if the primary goes down. What happens if the primary DC doesn’t go down but our servers crash due to a failure inside the DC, i.e. the PDU connected to our servers’ switch fails and takes down the whole website? As far as I know, BGP comes into play only if the WHOLE DC goes down.
Awaiting your reply!