EC2: New instances and firewalls
We hold much of our server configuration within the office, where access is restricted by iptables. When spawning new instances on EC2 we therefore need to open the office firewall to them, so that the nodes can connect back to the office and configure themselves.
The following script can be run from a crontab to add new nodes to your firewall automatically.
Alternatively, you could add a wrapper around the commands that create the instances, although this is not as nice as being able to use tools such as Elasticfox.
cron.ec2.firewall.sh
#!/bin/bash
IGNORE_REGION='us-west-1'; # For some reason this failed to connect/timeout
PORTS='22 80 3690 4949 8140';
iptables-save > /etc/iptables-config;
ec2-describe-regions | awk '{print $2}' | egrep -v "$IGNORE_REGION" | while read REGION; do
echo "$REGION";
ec2-describe-instances --region $REGION --connection-timeout 3 --request-timeout 3 |
grep INSTANCE |
while read DATA; do
EC2_HOST="`echo $DATA | awk '{print $4}'`";
EC2_PUBLIC_IP="`echo $DATA | awk '{print $15}'`";
for PORT in $PORTS; do
# Match the exact port, so that one port cannot match as the prefix of another
MATCH_RULES="--dport $PORT( |\$)"
if ! grep -F -- "$EC2_HOST" /etc/iptables-config | grep -Eq -- "$MATCH_RULES"; then
echo -e "\tiptables -A INPUT -s $EC2_PUBLIC_IP/32 -p tcp -m tcp --dport $PORT -m comment --comment \"EC2 - $EC2_HOST\" -j ACCEPT"
iptables -A INPUT -s $EC2_PUBLIC_IP/32 -p tcp -m tcp --dport $PORT -m comment --comment "EC2 - $EC2_HOST" -j ACCEPT
fi;
done;
done;
done;
echo "Saving config: /etc/iptables-config"
iptables-save > /etc/iptables-config
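The "is this rule already present?" test is the part of the script most worth trying in isolation. Below is a minimal sketch of that check run against a saved rules file. The `rule_exists` helper, the sample rule and the host name `ec2-host-a` are made up for illustration and are not part of the script above.

```shell
#!/bin/bash
# rule_exists FILE HOST PORT (hypothetical helper)
# Succeeds when FILE, in iptables-save format, already contains a rule that
# mentions HOST and whose --dport is exactly PORT.
rule_exists() {
    local file="$1" host="$2" port="$3"
    grep -F -- "$host" "$file" | grep -Eq -- "--dport ${port}( |\$)"
}

# Made-up sample standing in for /etc/iptables-config
SAMPLE="$(mktemp)"
printf '%s\n' '-A INPUT -s 203.0.113.10/32 -p tcp -m tcp --dport 8140 -m comment --comment "EC2 - ec2-host-a" -j ACCEPT' > "$SAMPLE"

rule_exists "$SAMPLE" ec2-host-a 8140 && echo "8140 already allowed"
rule_exists "$SAMPLE" ec2-host-a 81 || echo "81 still needs a rule"
rm -f "$SAMPLE"
```

Anchoring the port with `( |$)` means a short port number such as 81 is not mistaken for the start of 8140.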
High Availability Across Multiple Data Centers, Multihoming and EC2
I previously described how to configure HA Proxy and ucarp to load balance traffic between servers and share common IP addresses in the event of failure. However, this still leaves holes in the availability of the application: the scenario only accounts for availability within a single data center and does not address how traffic and application services are managed between two or more geographically distinct data centers.
I’ve attached a simplified diagram of our current architecture to help highlight single points of failure (Bytemark’s load balancing is simplified in the diagram).

Handling fail-over between multiple data centers is much more complicated. For example, if data center 1 fails entirely, we need to ensure that either the VIPs are routed to the other data center or DNS is changed. The problem is twofold: after a DNS change there is an unknown amount of time while it propagates before the site becomes available again, and if you do not own the IP addresses you cannot move them to a different data center.
There are a number of options to consider, each with different implications for cost.
DNS
There are several things you can do at the DNS level to reduce the effect of any outage.
- Use a very low time to live (TTL) on your DNS records
- If you have a complete replica of your environment, split the traffic between the data centers using round robin DNS (this roughly halves the outage, as only clients holding the cached address of the failed data center are affected)
- Be aware that browsers will cache DNS lookups for 30 minutes
- Be aware that ISPs and corporate networks will cache your DNS, regardless of TTL
- Maintain a minimal setup (1 or 2 nodes) to balance traffic between the data centers (using round robin as described above) and use HA Proxy weighted towards one data center. In the event of failure, more nodes can be set up automatically to recover.
Multihoming via routing
All of the above still leaves a margin for outages: DNS helps to some degree, but it cannot ensure high availability on its own. As mentioned above, it is possible to move the network addresses from one data center to another. However, re-routing IP addresses becomes fairly tricky if you are working with small
There are two types of IP Network Allocation.
- PA – Provider Aggregatable
- PI – Provider Independent
PA – Provider Aggregatable
- RIPE assigns a large block of networks to a LIR (Local Internet Registry)
- LIR assigns smaller networks to customers from the larger block
- PA addresses cannot be transferred between providers
- Request form to be filled by each end-customer justifying quantity
PI – Provider Independent
- Not associated with any provider – cannot be aggregated
- Used by some dual-homed networks
- RIPE performs much stricter checking of application than for PA
- Applicant must justify “why not PA?”
- Smallest allocation /24 (256 IP addresses)
- LIR (Local Internet Registry) submits form on customer’s behalf
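As a quick check on allocation sizes, the number of addresses in an IPv4 block is 2^(32 - prefix length), so a /24 holds 256 addresses (254 usable hosts once the network and broadcast addresses are excluded). A small sketch of the arithmetic follows; `addresses_in_prefix` is a hypothetical helper name.

```shell
#!/bin/bash
# addresses_in_prefix LEN (hypothetical helper)
# Prints the total number of IPv4 addresses in a /LEN block: 2^(32 - LEN).
addresses_in_prefix() {
    echo $(( 1 << (32 - $1) ))
}

addresses_in_prefix 24   # prints 256
addresses_in_prefix 26   # prints 64
```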
In order to have high availability and re-route traffic you will need the following.
- Your own address block (or a block of ‘provider independent’ space)
- Your own Autonomous System number, and
- Multiple upstream network connections.
Whilst it is possible to use Provider Aggregatable addresses and advertise the fragment to the other providers, this has a drawback: “Wherever the fragment is propagated in the Internet the incoming traffic will follow the path of the more specific fragment, rather than the path defined by the aggregate announcement”. This approach to multi-homing will therefore require Provider Independent addresses.
In order to acquire PI addresses you must register them through either an LIR (Local Internet Registry) or by becoming an LIR through a Regional Internet Registry (RIR).
AS numbers and PI addresses are relatively cheap to acquire, for example through Secura Hosting, which is an LIR.
Once you have your address block, AS number and multiple upstream connections, you announce your address block to each provider and receive their routing table via an eBGP session. You can configure Quagga, a software routing suite, to do this.
Setting up Quagga
su
apt-get install quagga quagga-doc
touch /etc/quagga/{bgpd.conf,ospfd.conf,zebra.conf}
sed -i 's/bgpd=no/bgpd=yes/' /etc/quagga/daemons
chown quagga:quaggavty /etc/quagga/*.conf
echo 'password YourPassHere' > /etc/quagga/bgpd.conf
echo 'password YourPassHere' > /etc/quagga/ospfd.conf
echo 'password YourPassHere' > /etc/quagga/zebra.conf
# By default Linux blocks traffic that goes out via one interface and comes back via another (rp_filter).
sed -i 's/.net.ipv4.conf.all.rp_filter=[0,1]/net.ipv4.conf.all.rp_filter=0/;s/.net.ipv4.conf.lo.rp_filter=[0,1]/net.ipv4.conf.lo.rp_filter=0/;s/.net.ipv4.conf.default.rp_filter=[0,1]/net.ipv4.conf.default.rp_filter=0/' /etc/sysctl.conf
sysctl -p
/etc/init.d/quagga start
Configuring Quagga for BGP
andrew-home:~# telnet localhost bgpd
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Hello, this is Quagga (version 0.99.15).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
User Access Verification
Password:
andrew-home>
andrew-home> enable
andrew-home# conf t
andrew-home(config)# hostname R1
R1(config)# router bgp 65000
R1(config-router)# network 10.1.2.0/24
R1(config-router)# ^Z
R1# wr
Configuration saved to /usr/local/quagga/bgpd.conf
R1# show ip bgp
BGP table version is 0, local router ID is 0.0.0.0
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 10.1.2.0/24 0.0.0.0 0 32768 i
Total number of prefixes 1
R1# exit
Connection closed by foreign host.
# A few additional commands
show ip bgp summary
I won't go too much into configuring this; however, there are a few additional resources that can help.
All transit providers for the site accept a prefix advertisement from the multi-homed site, and advertise this prefix globally in the inter-domain routing table. When connectivity between the local site and an individual transit provider is lost, normal operation of the routing protocol will ensure that the routing advertisement corresponding to this particular path will be withdrawn from the routing system, and those remote domains who had selected this path as the best available will select another candidate path as the best path. Upon restoration of the path, the path is re-advertised in the inter-domain routing system. Remote domains will undertake a further selection of the best path based on this re-advertised reachability information. Neither the local nor the remote host need to have multiple addresses, nor undertake any form of address selection.
Multi-Homing
Problems
This approach does have its problems, which give cause for concern about multi-homing via Provider Independent addresses. Provider Aggregatable addresses do not typically suffer from the first two points below.
- RIPE warns: “assignment of address space does NOT imply that this address space will be ROUTABLE ON ANY PART OF THE INTERNET”, see PA vs. PI Address Space
- Possible auto-summarization in global routing tables. Routing protocols summarize multiple routes into a single one to cut down the size of the routing tables, so assuming your network is A.B.C.D/26, one or more ISPs may have a summary route for A.B.0.0/16 pointing to a completely different network than the one your IP is physically in. See Understanding Route Aggregation in BGP
- Small subnets such as a /24 are heavily penalized by BGP dampening after a few flaps.
- “When starting out with BGP, opening sessions with other providers, if you have got tweaking to do and it does start flapping, some providers will simply disconnect you, whilst others will dampen for days. Basically closing any throughput, watching and waiting to see if you stabilize. Others will require contracts to be drawn up to prove you have competency and proven stability.”, Andrew Gearing
- Many ISPs filter the IP address routing announcements made by their customers to prevent malicious activities such as prefix hijacking. As a result, it can take time for ISPs to reconfigure these filters which are often manually maintained. This can slow down the process of moving IP addresses from the primary datacenter to the backup datacenter.
- Smaller routes won’t be allowed on the global backbone routing BGP tables, because they don’t want a huge table of small allocations – for performance and stability the backbone prefers a small number of relatively large address space chunks.
The above points do not really make this approach feasible for small
High Availability through EC2
We originally hosted a large number of servers with EC2, but have since moved most of our services away from EC2 due to the excessive cost of large and extra large instances; at present a large number of them have moved to Bytemark. The following approach does put a dependency on a single provider…
Implementation
- EC2 Elastic Load balancing
- Two small EC2 compute nodes in different data centers/regions, solely for proxying content
- HA Proxy
Benefits:
- Using Elastic Load Balancing, you can distribute incoming traffic across your Amazon EC2 instances in a single Availability Zone or multiple Availability Zones.
- Elastic Load Balancing can detect the health of Amazon EC2 instances. When it detects unhealthy load-balanced Amazon EC2 instances, it no longer routes traffic to those Amazon EC2 instances instead spreading the load across the remaining healthy Amazon EC2 instances.
- Trivial to monitor and spawn instances on failure and re-assign IP addresses
- No need to configure BGP and acquire Provider Independent addresses
- Cost: bandwidth and the elastic load balancer are relatively cheap. “transferring 100 GB of data over a 30 day period, the monthly charge would amount to $18 (or $0.025 per hour x 24 hours per day x 30 days x 1 Elastic Load Balancer) for the Elastic Load Balancer hours and $0.80 (or $0.008 per GB x 100 GB) for the data transferred through the Elastic Load Balancer, for a total monthly charge of $18.80.” Small EC2 instances are also fairly cheap, and they are solely forwarding requests.
- EC2 Auto Scaling can be used to spawn more nodes and configure automatically.
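The pricing example quoted in the benefits above is easy to re-run for your own traffic figures. The sketch below only reproduces that arithmetic. The rates are the ones quoted in this post and will not match current pricing, and `elb_monthly_cost` is a hypothetical helper name.

```shell
#!/bin/bash
# elb_monthly_cost HOURLY_RATE DAYS PER_GB_RATE GB (hypothetical helper)
# Load balancer hours plus data transferred, printed to two decimal places.
elb_monthly_cost() {
    awk -v rate="$1" -v days="$2" -v gbrate="$3" -v gb="$4" \
        'BEGIN { printf "%.2f\n", rate * 24 * days + gbrate * gb }'
}

# Figures from the quote: $0.025/hour, 30 days, $0.008/GB, 100 GB
elb_monthly_cost 0.025 30 0.008 100   # prints 18.80
```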
Disadvantages:
- “The ELB system strives to scale ahead of demand but wild spikes in traffic demand (such as 0 load to full load swings during initial load testing) can temporarily run past it’s provisioned capacity which will result in 503 Service Unavailable until the system can get ahead of demand. Under extreme overload, timeouts can occur. Customer traffic flows are typically gradual swings that occur at rates measured in minutes or hours, not fractions of a second.”
- This is a problem for load testing, as the ELB will drop connections; however, as described, ELB will increase capacity gradually to keep up with demand. We will need to run some load tests to check whether the service can adequately handle our traffic needs.
- See Elastic Load Balancer: An Elasticity Gotcha (Caching DNS and ELB releasing IPs) and Elastic Load Balancer is elastic – how the elasticity is achieved: “ELB does not immediately release any IP address when traffic decreases. Instead, it removes the IP address from the associated DNS name. However, the IP address continues to be associated with the load balancer for some period of time. During this time, ELB monitors various metrics to check if this IP address is still receiving traffic, and release it when it is appropriate to do so.”
- Cannot load balance via ELB between regions.
- The minimum ELB health check interval is 5 seconds (it is easy to monitor availability between nodes and remove them from the ELB).
- Additional latency, although in initial tests it seems negligible.
# WARNING: changing the keypair to one associated with existing instances and then running
# ./load.balancing.sh remove will delete all instances associated with that keypair. Probably best to leave as is.
KEYPAIR=loadbalancing-test;
ELB_NAME='test';
NODES=2;
AMI=ami-b8446fcc;
function remove_elb
{
#elb-delete-lb $ELB_NAME;
ec2-describe-instances | grep $KEYPAIR | grep -v terminated | awk '{print $2}' | while read instanceId; do ec2-terminate-instances $instanceId; done;
ec2-delete-keypair $KEYPAIR && rm ~/.ssh/$KEYPAIR;
}
function create_elb
{
ec2-add-keypair $KEYPAIR > ~/.ssh/$KEYPAIR;
chmod 0600 ~/.ssh/$KEYPAIR;
ssh-keygen -p -f ~/.ssh/$KEYPAIR;
elb-create-lb $ELB_NAME --availability-zones $EC2_AVAILABILITY_ZONE --listener "protocol=http,lb-port=80,instance-port=80"
# Create instances to attach to ELB
for i in `seq 1 $NODES`; do
ec2-run-instances $AMI --user-data-file install-lamp -k $KEYPAIR
done;
# Authorize CIDR Block 0.0.0.0/0
ec2-authorize default -p 22
ec2-authorize default -p 80
addresses_allocated=0;
ec2-describe-instances | grep $KEYPAIR | grep -v terminated | awk '{print $2}' | while read instanceId; do
echo elb-register-instances-with-lb $ELB_NAME --instances $instanceId;
elb-register-instances-with-lb $ELB_NAME --instances $instanceId;
# # Allocate addresses for each node
# # You may need to contact EC2 to increase the amount of IP addresses you can allocate, however these will be hidden, so not important
# while [ "`ec2-describe-addresses | grep -v 'i-' | awk '{print $2}' | wc -l`" -lt "$NODES" ]; do
# ec2-allocate-address;
# done;
#
# # Allocate non associated addresses
# ec2-describe-addresses | grep -v 'i-' | awk '{print $2}' | while read ip_address; do
# echo $ip_address;
# ec2-associate-address $ip_address -i $instanceId
# addresses_allocated=$(($addresses_allocated+1));
# done;
done;
elb-configure-healthcheck $ELB_NAME --headers --target "TCP:80" --interval 5 --timeout 2 --unhealthy-threshold 2 --healthy-threshold 2
elb-enable-zones-for-lb $ELB_NAME --availability-zones $EC2_AVAILABILITY_ZONE
}
function test_elb
{
SERVER="`ec2-describe-instances | grep $KEYPAIR | grep -v terminated | head -n1 | awk '{print $4}'`"
instanceId="`ec2-describe-instances | grep $KEYPAIR | grep -v terminated | head -n1 | awk '{print $2}'`"
echo "Shutting down $SERVER";
ssh -i ~/.ssh/$KEYPAIR root@$SERVER "/etc/init.d/apache2 stop"
sleep 11;
elb-describe-instance-health $ELB_NAME
ssh -i ~/.ssh/$KEYPAIR root@$SERVER '/etc/init.d/apache2 start'
sleep 2;
elb-describe-instance-health $ELB_NAME
}
if ( [ $# -gt 0 ] && ( [ $1 = "create" ] || [ $1 = "remove" ] || [ $1 = "test" ] || [ $1 = "health" ] ) ); then
command="$1;"
else
echo "Usage: `basename $0` create|remove|test|health";
exit 1;
fi;
case $1 in
"remove")
remove_elb;
;;
"create")
create_elb;
;;
"test")
test_elb;
;;
"health")
elb-describe-instance-health $ELB_NAME;
;;
esac;
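The `sleep 11` in test_elb relates to the health check settings in create_elb. Assuming the ELB marks an instance unhealthy after `--unhealthy-threshold` consecutive failed checks, spaced `--interval` seconds apart, a rough estimate of the detection time is the product of the two. This is a sketch of that estimate only, not a measured figure, and `detection_seconds` is a hypothetical helper name.

```shell
#!/bin/bash
# detection_seconds INTERVAL UNHEALTHY_THRESHOLD (hypothetical helper)
# Rough time, in seconds, before a failed instance is marked unhealthy.
detection_seconds() {
    echo $(( $1 * $2 ))
}

# Values used by elb-configure-healthcheck in the script above
detection_seconds 5 2   # prints 10
```

Sleeping for slightly longer than this estimate before calling elb-describe-instance-health gives the ELB time to notice that Apache has stopped.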
Node configuration:
Change the IP address 192.0.32.10:80 in the server lines to your own servers; at present it points to example.com.
install-lamp
#!/bin/bash
export DEBIAN_FRONTEND=noninteractive;
apt-get update && apt-get upgrade -y && apt-get install -y apache2 haproxy subversion;
tasksel install lamp-server;
echo "Please remember to set the MySQL root password!";
IP_ADDRESS="`ifconfig | grep 'inet addr:'| grep -v '127.0.0.1' | cut -d: -f2 | awk '{ print $1}'`";
echo $IP_ADDRESS >/var/www/index.html;
cat > /etc/haproxy/haproxy.cfg <<EOF
global
log /dev/log local1
maxconn 4096
#chroot /usr/share/haproxy
user haproxy
group haproxy
daemon
#debug
#quiet
defaults
log global
mode http
option httplog
option dontlognull
retries 3
option redispatch
maxconn 2000
contimeout 5000
clitimeout 50000
srvtimeout 50000
option httpclose
listen gkweb $IP_ADDRESS:80
mode http
balance leastconn
stats enable
stats realm LB Statistics
stats auth stats:password
stats scope .
stats uri /stats?stats
#persist
server web1 192.0.32.10:80 check inter 2000 fall 3
server web2 192.0.32.10:80 check inter 2000 fall 3
server web3 192.0.32.10:80 check inter 2000 fall 3
server web4 192.0.32.10:80 check inter 2000 fall 3
EOF
sed -i -e 's/^ENABLED.*$/ENABLED=1/' /etc/default/haproxy;
/etc/init.d/apache2 stop;
/etc/init.d/haproxy restart;
References:
Multihoming to one ISP
Multipath Load Sharing with two ISPs
Small site multihoming
EBGP load balancing with EBGP session between loopback interfaces
Failover and Global Server Load Balancing for Better Network Availability
The “Elastic” in “Elastic Load Balancing”: ELB Elasticity and How to Test it
EC2 Instance Belonging to Multiple ELBs, (Creating different tiers of service, Splitting traffic across multiple destination ports)
EC2 Tools Installation: AMI, API, Elastic Load Balancing (ELB), Auto Scaling and Cloud Watch
Updated bash script 23/01/2010: The script originally assumed it was run from EC2 itself; I have since modified it so that it is applicable to local environments and made it a little more robust.
There are quite a number of new tools from EC2, each requiring some form of setup on the server. As a result I have created a bash script to install them automatically.
- AMI Tools
- API Tools
- Elastic Load Balancing Tools
- Cloud Watch Tools
- Auto Scaling Tools
The only prerequisite is to install the private key and certificate as ~/.ec2/pk.pem and ~/.ec2/cert.pem respectively, and to install Java if you are on Fedora.
I have tested this on Debian and Fedora; it uses either apt or yum to install some dependencies, and it assumes that you are running as root.
You can download it from here ec2.sh
#!/bin/bash
DEBUG=1;
if [ "$(id -u)" != "0" ]; then
echo "This script must be run as root" 1>&2
exit 1
fi
GREP="`which grep`";
if which apt-get >/dev/null; then
PACKAGE_MANAGEMENT="`which apt-get` "
else
PACKAGE_MANAGEMENT="`which yum`"
fi
if which dpkg >/dev/null; then
PACKAGE_TEST="dpkg --get-selections | $GREP -q "
else
PACKAGE_TEST="rpm -qa | $GREP -q "
fi
CURL_OPTS=" --silent --retry 1 --retry-delay 1 --retry-max-time 1 "
function log
{
if [ "$DEBUG" -eq 1 ]; then
echo $1;
fi;
}
function bail
{
echo -e $1;
exit 1;
}
function check_env
{
if [ -f ~/.bashrc ]; then
. ~/.bashrc;
fi
# Tools already exist
if [ -z "`which ec2-describe-instances`" ] || [ -z "`which ec2-upload-bundle`" ] || [ -z "`which elb-create-lb`" ] || [ -z "`which mon-get-stats`" ] || [ -z "`which as-create-auto-scaling-group`" ]; then
log "Amazon EC2 toolkit missing!"
install_ec2;
fi
# EC2_HOME set
if [ -z "$EC2_HOME" ]; then
log "Amazon EC2 is not set-up correctly! EC2_HOME not set"
if ! grep EC2_HOME ~/.bashrc; then
echo "export EC2_HOME=/usr/local/ec2-api-tools/" >> ~/.bashrc
fi;
export EC2_HOME=/usr/local/ec2-api-tools/
source ~/.bashrc
fi
# Java
if [ -z "$JAVA_HOME" ]; then
if echo "$PACKAGE_MANAGEMENT" | grep -qi yum; then
bail "\nPlease install java manually (do not use yum install java, it is incompatible)\nsee JRE http://java.sun.com/javase/downloads/index.jsp\nDownload, run the bin file, place in /opt/ and update ~/.bashrc. Once complete run 'source ~/.bashrc;'";
fi;
$PACKAGE_MANAGEMENT install -y sun-java6-jdk
JAVA_PATH=/usr/lib/jvm/java-6-sun/jre/;
echo "export JAVA_HOME=$JAVA_PATH" >> ~/.bashrc
export JAVA_HOME=$JAVA_PATH
source ~/.bashrc
fi
# Keys
EC2_HOME_DIR='.ec2';
EC2_PRIVATE_KEY_FILE="$HOME/$EC2_HOME_DIR/pk.pem";
EC2_CERT_FILE="$HOME/$EC2_HOME_DIR/cert.pem";
if [ ! -d "$HOME/$EC2_HOME_DIR" ]; then
mkdir -pv "$HOME/$EC2_HOME_DIR";
fi
install_ec2_env EC2_PRIVATE_KEY "$EC2_PRIVATE_KEY_FILE";
install_ec2_env EC2_CERT "$EC2_CERT_FILE";
install_ec2_keys_files "$EC2_PRIVATE_KEY_FILE" "Private key";
install_ec2_keys_files "$EC2_CERT_FILE" "Certificate";
install_ec2_env AWS_AUTO_SCALING_HOME "/usr/local/ec2-as-tools/"
install_ec2_env AWS_ELB_HOME "/usr/local/ec2-elb-tools/"
install_ec2_env AWS_CLOUDWATCH_HOME "/usr/local/ec2-cw-tools/"
get_region
get_availability_zone
}
function install_ec2_env
{
# Variable Variable for $1
EC2_VARIABLE=${!1};
EC2_VARIABLE_NAME=$1;
EC2_FILE=$2;
#log "VARIABLE: $EC2_VARIABLE_NAME=$EC2_VARIABLE";
# Variable Variable
if [ -z "$EC2_VARIABLE" ]; then
log "Amazon $EC2_VARIABLE_NAME is not set-up correctly!";
if ! grep -q "$EC2_VARIABLE_NAME" ~/.bashrc > /dev/null; then
echo "export $EC2_VARIABLE_NAME=$EC2_FILE" >> ~/.bashrc;
fi;
export $EC2_VARIABLE_NAME=$EC2_FILE;
source ~/.bashrc
else
if ! grep -q "$EC2_VARIABLE_NAME" ~/.bashrc > /dev/null; then
echo "export $EC2_VARIABLE_NAME=$EC2_FILE" >> ~/.bashrc;
else
log "Amazon $EC2_VARIABLE_NAME var installed";
fi;
fi
}
function install_ec2_keys_files
{
EC2_FILE=$1;
EC2_DESCRIPTION=$2;
EC2_CONTENTS='';
if [ ! -f "$EC2_FILE" ]; then
bail "Amazon $EC2_FILE does not exist, please copy your $EC2_DESCRIPTION to $EC2_FILE and re-run this script";
else
log "Amazon $EC2_FILE file installed";
fi
}
function install_ec2
{
for PACKAGE in curl wget tar bzip2 unzip zip symlinks unzip ruby; do
if ! which "$PACKAGE" >/dev/null; then
$PACKAGE_MANAGEMENT install -y $PACKAGE;
fi
done;
# AMI Tools
if [ -z "`which ec2-upload-bundle`" ]; then
curl -o /tmp/ec2-ami-tools.zip $CURL_OPTS --max-time 30 http://s3.amazonaws.com/ec2-downloads/ec2-ami-tools.zip
rm -rf /usr/local/ec2-ami-tools;
cd /usr/local && unzip /tmp/ec2-ami-tools.zip
ln -svf `find . -type d -name ec2-ami-tools*` ec2-ami-tools
chmod -R go-rwsx ec2* && rm -rvf /tmp/ec2*
fi
# API Tools
if [ -z "`which ec2-describe-instances`" ]; then
log "Amazon EC2 API toolkit is not installed!"
curl -o /tmp/ec2-api-tools.zip $CURL_OPTS --max-time 30 http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
rm -rf /usr/local/ec2-api-tools;
cd /usr/local && unzip /tmp/ec2-api-tools.zip
ln -svf `find . -type d -name ec2-api-tools*` ec2-api-tools
chmod -R go-rwsx ec2* && rm -rvf /tmp/ec2*
fi
# ELB Tools
if [ -z "`which elb-create-lb`" ]; then
curl -o /tmp/ec2-elb-tools.zip $CURL_OPTS --max-time 30 http://ec2-downloads.s3.amazonaws.com/ElasticLoadBalancing-2009-05-15.zip
rm -rf /usr/local/ec2-elb-tools;
cd /usr/local && unzip /tmp/ec2-elb-tools.zip
mv ElasticLoadBalancing-1.0.3.4 ec2-elb-tools-1.0.3.4;
ln -svf `find . -type d -name ec2-elb-tools*` ec2-elb-tools
chmod -R go-rwsx ec2* && rm -rvf /tmp/ec2*
fi
# Cloud Watch Tools
if [ -z "`which mon-get-stats`" ]; then
curl -o /tmp/ec2-cw-tools.zip $CURL_OPTS --max-time 30 http://ec2-downloads.s3.amazonaws.com/CloudWatch-2009-05-15.zip
rm -rf /usr/local/ec2-cw-tools;
cd /usr/local && unzip /tmp/ec2-cw-tools.zip
mv -v CloudWatch-1.0.2.3 ec2-cw-tools-1.0.2.3
ln -svf `find . -type d -name ec2-cw-tools*` ec2-cw-tools
chmod -R go-rwsx ec2* && rm -rvf /tmp/ec2*
fi
if [ -z "`which as-create-auto-scaling-group`" ]; then
curl -o /tmp/ec2-as-tools.zip $CURL_OPTS --max-time 30 http://ec2-downloads.s3.amazonaws.com/AutoScaling-2009-05-15.zip
rm -rf /usr/local/ec2-as-tools;
cd /usr/local && unzip /tmp/ec2-as-tools.zip
mv -v AutoScaling-1.0.9.0 ec2-as-tools-1.0.9.0
ln -svf `find . -type d -name ec2-as-tools*` ec2-as-tools
chmod -R go-rwsx ec2* && rm -rvf /tmp/ec2*
fi
ln -sf /usr/local/ec2-api-tools/bin/* /usr/bin/;
ln -sf /usr/local/ec2-ami-tools/bin/* /usr/bin/;
ln -sf /usr/local/ec2-elb-tools/bin/* /usr/bin/;
ln -sf /usr/local/ec2-cw-tools/bin/* /usr/bin/;
ln -sf /usr/local/ec2-as-tools/bin/* /usr/bin/;
rm -f /usr/bin/ec2-*.cmd;
}
function get_availability_zone
{
# Not reliable between availability zones using meta-data
# export EC2_AVAILABILITY_ZONE="`curl $CURL_OPTS --max-time 2 http://169.254.169.254/2009-04-04/meta-data/placement/availability-zone`"
get_instance_id;
if [ ! -z "$EC2_INSTANCE_ID" ]; then
EC2_AVAILABILITY_ZONE="`ec2-describe-instances | grep $EC2_INSTANCE_ID | awk '{print $11}'`"
if [ -n "$EC2_AVAILABILITY_ZONE" ]; then
export EC2_AVAILABILITY_ZONE=$EC2_AVAILABILITY_ZONE;
install_ec2_env EC2_AVAILABILITY_ZONE $EC2_AVAILABILITY_ZONE;
fi;
fi;
}
function get_region
{
get_instance_id;
if [ ! -z "$EC2_INSTANCE_ID" ]; then
# Field 11 is expected to hold the availability zone (e.g. us-east-1a); strip the trailing zone letter to get the region
EC2_REGION="`ec2-describe-instances | grep $EC2_INSTANCE_ID | awk '{print $11}' | sed 's/[a-z]$//'`"
if [ -n "$EC2_REGION" ]; then
export EC2_REGION;
install_ec2_env EC2_REGION $EC2_REGION;
install_ec2_env EC2_URL "https://ec2.$EC2_REGION.amazonaws.com"
fi;
fi;
}
function get_instance_id
{
instanceId="`curl $CURL_OPTS --max-time 2 http://169.254.169.254/1.0/meta-data/instance-id`"
if [ ! -z "$instanceId" ]; then
export EC2_INSTANCE_ID="$instanceId";
fi;
}
check_env
Lock Files in PHP & Bash
I just read “How to use locks in PHP cron jobs to avoid cron overlaps” and I thought I would elaborate on this and provide some more examples. In order for a lock to work correctly it must handle atomicity (race conditions) and signaling.
I use the following bash script to create locks for crontabs and ensure single execution of scripts.
“The clever bit is to get a lock file test and creation (if needed) to be atomic, that is done without interruption. The set -C stops a redirection from over writing a file. The : > touches a file. In combination, the effect is, when the lock file exists, the redirection fails and exits with an error. If it does not exist, the redirection creates the lock file and exits without an error. The final part is to make sure that the lock file is cleaned up. To make sure it is removed even if the script is terminated with a ctrl-c, a trap is used. Simply, when the script exits, the trap is run and the lock file is deleted.”, The Lab Book Pages
In addition it also checks the process list and tests whether the pid within the lock file is active.
#!/bin/bash
LOCK_FILE=/tmp/my.lock
CRON_CMD="php /var/www/..../fork.php -t17"
function check_lock {
(set -C; : > $LOCK_FILE) 2> /dev/null
if [ $? != "0" ]; then
RUNNING_PID=$(cat $LOCK_FILE 2> /dev/null || echo "0");
if [ "$RUNNING_PID" -gt 0 ]; then
if [ `ps -p $RUNNING_PID -o comm= | wc -l` -eq 0 ]; then
echo "`date +'%Y-%m-%d %H:%M:%S'` WARN [Cron wrapper] Lock File exists but no process running $RUNNING_PID, continuing";
else
echo "`date +'%Y-%m-%d %H:%M:%S'` INFO [Cron wrapper] Lock File exists and process running $RUNNING_PID - exiting";
exit 1;
fi
else
echo "`date +'%Y-%m-%d %H:%M:%S'` CRIT [Cron wrapper] Lock File exists with no PID, wtf?";
exit 1;
fi
fi
trap "rm $LOCK_FILE;" EXIT
}
check_lock;
echo "`date +'%Y-%m-%d %H:%M:%S'` INFO [Cron wrapper] Starting process";
$CRON_CMD &
CURRENT_PID=$!;
echo "$CURRENT_PID" > $LOCK_FILE;
trap "rm -f $LOCK_FILE 2> /dev/null ; kill -9 $CURRENT_PID 2> /dev/null;" EXIT;
echo "`date +'%Y-%m-%d %H:%M:%S'` INFO [Cron wrapper] Started ($CURRENT_PID)";
wait;
# remove the trap kill so it won't try to kill process which took place of the php one in mean time (paranoid)
trap "rm -f $LOCK_FILE 2> /dev/null" EXIT;
rm -f $LOCK_FILE 2> /dev/null;
echo "`date +'%Y-%m-%d %H:%M:%S'` INFO [Cron wrapper] Finished process";
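To see the atomic test-and-create on its own, here is a minimal sketch that isolates the `set -C; : > file` idiom from the wrapper above. The `acquire_lock` helper and the demo lock path are made up for illustration.

```shell
#!/bin/bash
# acquire_lock FILE (hypothetical helper)
# Atomically creates FILE; fails if it already exists, because noclobber
# (set -C) makes the redirection refuse to overwrite an existing file.
acquire_lock() {
    ( set -C; : > "$1" ) 2> /dev/null
}

DEMO_LOCK="${TMPDIR:-/tmp}/noclobber-demo.$$.lock"   # made-up path
acquire_lock "$DEMO_LOCK" && echo "first attempt: acquired"
acquire_lock "$DEMO_LOCK" || echo "second attempt: refused"
rm -f "$DEMO_LOCK"
```

Running `set -C` inside a subshell keeps noclobber from leaking into the rest of the script.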
With the implementation described in the post at abhinavsingh.com, the lock fails if you put the script into the background, as the example below shows.
andrew@andrew-home:~/tmp.lock$ php x.php
==16169== Lock acquired, processing the job...
^C
andrew@andrew-home:~/tmp.lock$ php x.php
==16169== Previous job died abruptly...
==16170== Lock acquired, processing the job...
^C
andrew@andrew-home:~/tmp.lock$ php x.php
==16170== Previous job died abruptly...
==16187== Lock acquired, processing the job...
^Z
[1]+ Stopped php x.php
andrew@andrew-home:~/tmp.lock$ ps aux | grep php
andrew 16187 0.5 0.5 50148 10912 pts/2 T 09:53 0:00 php x.php
andrew 16192 0.0 0.0 3108 764 pts/2 R+ 09:53 0:00 grep --color=auto php
andrew@andrew-home:~/tmp.lock$ php x.php
==16187== Already in progress...
You can use pcntl_signal to trap interruptions to the application and handle cleanup of the process. Here is a slightly modified implementation that handles cleanup. Note that register_shutdown_function will not help to clean up after a signal/interruption.
<?php
class lockHelper {
protected static $_pid;
protected static $_lockDir = '/tmp/';
protected static $_signals = array(
// SIGKILL,
SIGINT,
SIGPIPE,
SIGTSTP,
SIGTERM,
SIGHUP,
SIGQUIT,
);
protected static $_signalHandlerSet = FALSE;
const LOCK_SUFFIX = '.lock';
protected static function isRunning() {
$pids = explode(PHP_EOL, `ps -e | awk '{print $1}'`);
return in_array(self::$_pid, $pids);
}
public static function lock() {
self::setHandler();
$lock_file = self::$_lockDir . $_SERVER['argv'][0] . self::LOCK_SUFFIX;
if(file_exists($lock_file)) {
self::$_pid = file_get_contents($lock_file);
if(self::isrunning()) {
error_log("==".self::$_pid."== Already in progress...");
return FALSE;
}
else {
error_log("==".self::$_pid."== Previous job died abruptly...");
}
}
self::$_pid = getmypid();
file_put_contents($lock_file, self::$_pid);
error_log("==".self::$_pid."== Lock acquired, processing the job...");
return self::$_pid;
}
public static function unlock() {
$lock_file = self::$_lockDir . $_SERVER['argv'][0] . self::LOCK_SUFFIX;
if(file_exists($lock_file)) {
error_log("==".self::$_pid."== Releasing lock...");
unlink($lock_file);
}
return TRUE;
}
protected static function setHandler() {
if (!self::$_signalHandlerSet) {
declare(ticks = 1);
foreach(self::$_signals AS $signal) {
if (!pcntl_signal($signal, array('lockHelper',"signal"))) {
error_log("==".self::$_pid."== Failed assigning signal - '{$signal}'");
}
}
self::$_signalHandlerSet = TRUE;
}
return TRUE;
}
protected static function signal($signo) {
if (in_array($signo, self::$_signals)) {
if(!self::isrunning()) {
self::unlock();
}
}
return FALSE;
}
}
As an example:
andrew@andrew-home:~/tmp.lock$ php t.php
==16268== Lock acquired, processing the job...
^Z==16268== Releasing lock...
Whilst the implementation above simply uses files, it could be implemented with shared memory (SHM/APC), distributed caching (memcached), or a database. If locking over a network, factors such as packet loss and latency can cause race conditions and should be taken into account. Depending on the application it may be better to implement this as a daemon. If you're looking to distribute tasks amongst servers, take a look at Gearman.
Google Maps: Large KML and Tiles
Last year I wrote an application to highlight media outlets and their reach (coverage of media outlets), selecting regions within the UK and highlighting aspects of a map. It ran into many issues, both performance problems with rendering in browsers and limitations in converting KML to tiles via Google. These limitations included:
- Timeouts from Google on large KML files.
- Responsiveness of servers to deliver KML files to Google.
- Max KML size (even when gzipped)
- 500 errors from Google
- Transparency within IE
- ….
Some of these limits have since been increased by Google and are documented:
- Maximum fetched file size (raw KML, raw GeoRSS, or compressed KMZ): 3MB
- Maximum uncompressed KML file size: 10MB
- Maximum number of Network Links: 10
- Maximum number of total document-wide features: 1,000
In order to alleviate these issues I ended up with the following:
- Caching KML files to avoid the latency of expensive database lookups/responses.
- Chunking the response into 250 records and writing these to individual static KML files (otherwise files would become very large and Google would time out retrieving the data sets).
- Proxying Google's tiles after they had been converted from KML to images, caching them locally on our servers, and then applying the merged overlays from our servers.
So depending on the depth (zoom) of the map, the area selected and the volume of data, it would use either tiles or Google's KML directly (increased functionality).
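The chunking step can be sketched roughly as follows. The page size of 250 comes from the list above; the function names and the file naming scheme (a base name suffixed with the page number) are assumptions for illustration only:

```javascript
const PAGE_SIZE = 250; // records per static KML file, as described above

// Split the full record set into pages of at most pageSize records.
function chunkRecords(records, pageSize = PAGE_SIZE) {
  const pages = [];
  for (let start = 0; start < records.length; start += pageSize) {
    pages.push(records.slice(start, start + pageSize));
  }
  return pages;
}

// Hypothetical naming scheme: each file is the base name plus its page number.
function pageFileNames(baseName, pageCount) {
  return Array.from({ length: pageCount }, (_, page) => `${baseName}.${page}`);
}

const records = Array.from({ length: 600 }, (_, i) => ({ id: i }));
const pages = chunkRecords(records);
console.log(pages.map((page) => page.length)); // [ 250, 250, 100 ]
console.log(pageFileNames('.1.6', pages.length));
```

Each page would then be written out as its own KML file, so no single file grows large enough to hit the fetch limits.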
In order to have greater control over the spatial data within our database we split this into areas, regions and sub_regions, which held lookups to postcodes, towns and the spatial data itself (there are a lot of discrepancies over the outlines of maps).
Left hand menu:
<ul style="display: block;" class="ulTree jsTree">
<li id="East"><a href="#" onclick="loadTilesFromGeoXML('|1|'); return false;">East</a>
<ul style="display: none;" class="ulTree jsTree">
<li><a href="#" onclick="loadTilesFromGeoXML('|1|6'); return false;">Bedfordshire</a></li>
<li><a href="#" onclick="loadTilesFromGeoXML('|1|18'); return false;">Cambridgeshire</a></li>
...
</ul>
</li>
</ul>
Javascript to locate tiles
function loadTilesFromGeoXML(entity_id) {
// Matches database record ids that are mapped to spatial data within MySQL
mapTownsId = entity_id.toString().split('|')[0];
mapRegionsId = entity_id.toString().split('|')[1];
mapSubRegionsId = entity_id.toString().split('|')[2];
locationUrl ='map_towns_id='+mapTownsId+'&map_regions_id='+mapRegionsId+'&map_sub_regions_id='+mapSubRegionsId;
var cc = map.fromLatLngToDivPixel(map.getCenter());
map.setZoom(1);
// Request URL for the cached tile links
geoXMLUrl = '/ajax/mapping/get/overlays/region?'+locationUrl;
geoXMLUrl+='&format=JSON&method=getLinks&x='+cc.x+'&y='+cc.y+'&zoom='+map.getZoom();
// tileUrlTemplate: 'http://domain.com/maps/proxy/regions/?url=http%3A%2F%2Fdomain.com/ajax/mapping/get/cache/?filename=.1.6.0&x={X}&y={Y}&zoom={Z}',
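$.getJSON(geoXMLUrl, function(data) {
// Start from an empty list on each call so links from a previous region are not carried over
var kmlLinks = '';
$.each(data, function(i,link) {
kmlLinks+=encodeURIComponent(link)+',';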
});
// Builds the location for tiles to be mapped
tileUrlTemplate = '/maps/proxy/regions/?url='+kmlLinks+'&x={X}&y={Y}&zoom={Z}';
var tileLayerOverlay = new GTileLayerOverlay(
new GTileLayer(null, null, null, {
tileUrlTemplate: tileUrlTemplate,
isPng:true,
opacity:1.0
})
);
if (debug) GLog.writeUrl('/maps/proxy/regions/?url='+kmlLinks+'&x={X}&y={Y}&zoom={Z}');
map.addOverlay(tileLayerOverlay);
});
}
Response whilst retrieving links (if cached)
The code behind this simply returns the KML file from cache if it exists, otherwise it attempts to create it. It then globs for any files with a similar pattern (all files are suffixed with their page number) and outputs a JSON response listing the files in sequence.
["/ajax/mapping/get/cache/?filename=.1..0&x=250&y=225&zoom=5","/ajax/mapping/get/cache/?filename=.1..1&x=250&y=225&zoom=5"]
Proxying Google's tiles and merging the layer ids
$kmlUrls = urlencode($_GET['url']);
$cachePath = dirname(__FILE__).'/cache.maps/tiles/';
$cachedFiles = array_filter(explode(',',rawurldecode($kmlUrls)));
$hash = sha1(rawurldecode($kmlUrls).".w{$_GET['w']}.h{$_GET['h']}.x{$_GET['x']}.y{$_GET['y']}.{$_GET['zoom']}");
$cachePath.="{$_GET['x']}.{$_GET['y']}/{$_GET['zoom']}/";
if (!is_dir($cachePath)) {
@mkdir($cachePath, 0777, true);
}
// Returns image if cached already and aggregated.
if (file_exists($path = $cachePath.$hash)) {
header('Content-Type: image/png');
$fp = fopen($path, 'rb');
fpassthru($fp);
fclose($fp);
exit; // Stop here so the layers are not requested from Google again
}
// Extract layer id's from KML files that are to be merged.
$layerIds = array();
foreach( $cachedFiles AS $kmlFile) {
$kmlFile="http://{$_SERVER['HTTP_HOST']}{$kmlFile}";
$url = "http://maps.google.com/maps/gx?q={$kmlFile}&callback=_xdc_._1fsue7g2w";
@$c = file_get_contents($url);
if (!$c)
throw new Exception("Failed to request {$url} - {$c}");
preg_match_all('/layer_id:"kml:(.*)"/i', $c, $matches);
if (count($matches)>0 && isset($matches[1][0])) {
$layerIds[] = "kml:{$matches[1][0]}";
}
}
// Cache locally.
if (count($layerIds)>0) {
header('Content-Type: image/png');
// Aggregate layers into a single image
$link = "http://mlt0.google.com/mapslt?lyrs=" . implode(',',$layerIds);
$link.="&x={$_GET['x']}&y={$_GET['y']}&z={$_GET['zoom']}&w={$_GET['w']}&h={$_GET['h']}&source=maps_api";
echo $c = file_get_contents($link);
@file_put_contents($path, $c);
} else {
// Output 1x1 png
header('Content-Type: image/png');
echo base64_decode('iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAC0lEQVQIHWNgAAIAAAUAAY27m/MAAAAASUVORK5CYII=');
}
Paging GeoXML loading
function loadGeoXMLPaged(geoXMLUrl) {
var cc = map.fromLatLngToDivPixel(map.getCenter());
geoXMLUrl+='&format=JSON&method=getLinks&x='+cc.x+'&y='+cc.y+'&zoom='+map.getZoom();
if (debug) GLog.writeUrl(geoXMLUrl);
$.getJSON(geoXMLUrl, function(data) {
geoXmlPager = data;
loadGeoXmlPage();
});
}
var timeoutPID = null;
function loadGeoXmlPage(){
if (data = geoXmlPager.pop()){
if (debug)
GLog.writeUrl(BASE_URL+data);
geoXmlStack.push(new GGeoXml(BASE_URL+data));
map.addOverlay(geoXmlStack[geoXmlStack.length - 1]);
GEvent.addListener(geoXmlStack[geoXmlStack.length - 1],"load",function() {
timeoutPID = setTimeout(loadGeoXmlPage, 500);
});
}else{
clearTimeout(timeoutPID);
map.setZoom(map.getBoundsZoomLevel(bounds));
map.setCenter(bounds.getCenter());
try {
geoXmlStack[geoXmlStack.length - 1].gotoDefaultViewport(map);
} catch(e) {}
}
}
All the code above has been modified slightly to make it applicable to others; however, it is simply an example, so don't accept raw input the way it does.
Australian Timezones and Daylight Savings Time – Redhat and php date broken?
I recently came across a peculiar issue where dates and times were causing problems with a product we had developed within Australia. On “Red Hat Enterprise Linux Server release 5 (Tikanga)” the date within PHP was being read as EST instead of AEST/AEDT, whereas running “date” from the terminal or “SELECT NOW()” from MySQL displayed the correct time.
[user@server ~]$ date
Wed Oct 14 22:24:20 EST 2009
[user@server ~]$ php -r'var_dump(date("r"));'
string(51) "Wed, 14 Oct 2009 21:25:07 +1000 Australia/Melbourne"
[user@server ~]$ php -r'var_dump(date("r e"));var_dump(getenv("TZ"));var_dump(ini_get("date.timezone"));var_dump(date_default_timezone_get());';
string(51) "Wed, 14 Oct 2009 21:25:07 +1000 Australia/Melbourne"
bool(false)
string(0) ""
string(19) "Australia/Melbourne"
[user@server ~]$ mysql -uuser -ppassword -e 'SELECT NOW();'
+---------------------+
| NOW() |
+---------------------+
| 2009-10-14 22:26:12 |
+---------------------+
As you can see, PHP gets the time wrong, being an hour off. Running the above on Debian worked perfectly fine, and the zoneinfo on the server matched my local machine when compared.
[user@server ~]$ md5sum /etc/localtime && md5sum /usr/share/zoneinfo/Australia/Sydney && md5sum /usr/share/zoneinfo/Australia/Melbourne
85285c5495cd5b8834ab62446d9110a9 /etc/localtime
85285c5495cd5b8834ab62446d9110a9 /usr/share/zoneinfo/Australia/Sydney
8a7f0f78d5a146db4bf865ca91cc1c42 /usr/share/zoneinfo/Australia/Melbourne
After a fair amount of digging I ended up coming across the following ticket @478566. Amazingly the ticket is marked as “CLOSED WONTFIX”.
There were a few interesting points from some of the conversations I read.
“Alphabetic time zone abbreviations should not be used as unique identifiers for UTC offsets as they are ambiguous in practice. For example, “EST” denotes 5 hours behind UTC in English-speaking North America, but it denotes 10 or 11 hours ahead of UTC in Australia; and French-speaking North Americans prefer “HNE” to “EST”.” (twinsun)
Different locations in Australia have various interpretations of summer time, with differing start/end dates and clock shifts, and the operating system does not have zoneinfo data for DEST, AEDT etc. (unless you create these yourself). This means you cannot rely on getting the correct time from PHP on Red Hat.
So far I have resorted to the following (in the POSIX-style Etc/ names the sign is inverted, so Etc/GMT-11 is UTC+11):
[user@server ~]$ php -r 'date_default_timezone_set("Etc/GMT-11"); var_dump(date("r"));';
string(31) "Wed, 14 Oct 2009 22:24:29 +1100"
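One caveat with a fixed offset is that it will not follow daylight saving changes, so it is only right for part of the year. The sketch below illustrates that difference in JavaScript (Node.js) rather than PHP, comparing a named zone against a fixed +11 offset for a date in October and one in July. The helper names are made up for the example, and it assumes the runtime ships with time zone data:

```javascript
// Wall-clock time (HH:MM) for a date in a named zone, using the runtime's tz data.
function wallClock(date, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hour: '2-digit',
    minute: '2-digit',
    hourCycle: 'h23',
  }).formatToParts(date);
  const get = (type) => parts.find((part) => part.type === type).value;
  return `${get('hour')}:${get('minute')}`;
}

// Wall-clock time (HH:MM) for a date at a fixed number of hours ahead of UTC.
function fixedOffsetClock(date, offsetHours) {
  const shifted = new Date(date.getTime() + offsetHours * 3600 * 1000);
  return shifted.toISOString().slice(11, 16);
}

const october = new Date('2009-10-14T11:24:29Z'); // daylight saving in effect
const july = new Date('2009-07-14T11:24:29Z'); // standard time

for (const [label, date] of [['October', october], ['July', july]]) {
  console.log(
    label,
    'named zone:', wallClock(date, 'Australia/Melbourne'),
    'fixed +11:', fixedOffsetClock(date, 11)
  );
}
```

The two agree while daylight saving is in effect and drift an hour apart otherwise, so the fixed offset needs switching by hand twice a year.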
Migrating Websites & Services Checklist
I have been migrating a large number of websites and consolidating servers to reduce costs.
It is important to ensure that services are migrated smoothly and planned effectively,
so I had a think about aspects to consider prior to migrating services.
Planning
- Make a preliminary checklist of services actively in use by each active domain, i.e. FTP, HTTP, SMTP, IMAP, POP3, MySQL etc.
- What maintenance periods do you have available, if at all?
- What volume of traffic do you have, and when are your quietest periods?
- Do you have dedicated infrastructure, sharded, split by service/role?
- Can parts of the infrastructure be migrated as individual components?
- List core functionality from the domain for testing purposes
  - Ideally this should be wrapped in unit tests as both functional
    - Examples are email, upload (permissions), adding/editing/removing users
- How many servers are you migrating?
  - Large quantities should be automated.
- How critical is the site/service?
  - Does it stop 80 staff working?
Specific
- Services
  - Ensure services are initially installed on the new server(s).
  - List all configuration files for a particular service (tree).
  - Ensure configuration between each service is identical, or that compromises are made.
  - List data directories for each service, i.e. /var/lib/mysql
  - Can data be transferred automically.
  - Can services be replicated and brought into sync?
  - Can data be back filled?
  - I.e. are large log tables required to make the site functional? What is the minimal effort required to make the site functional?
- SSL
  - Ensure a valid certificate exists for any CDN, sub-domain, domain.
- Email
  - Are there any special firewall or configuration requirements?
- DNS
  - Lower the TTL for a domain you're preparing to transfer (if possible)
    - You cannot rely on low TTLs; these are cached amongst large corporates, ISPs etc.
    - Ensure the domain is bound to a unique VIP on the new servers. If DNS resolution fails, you can put a header('Location: http://10.10.10.10/'); in the old site to ensure the domain will resolve correctly.
    - Test this prior to transfer for both HTTP & HTTPS if applicable
- Permissions
  - Do you upload content to the servers, does your code write to the filesystem?
    - Is this writable?
    - Under which user/group is this written?
- Cache
  - Does your site make use of distributed or local cache?
    - Could there be collisions between different sites, i.e. do you prefix cache key names based on site?
- Networking
  - Can specific services be migrated prematurely?
    - Repoint via iptables, and keep an eye on bytes passing through the interface until it is redundant
- Security
  - Were there any firewall restrictions that need to be replicated, either hardware, iptables etc.?
  - Chrooted, users copied, ssh keys copied.
- Optimizations
  - Were there any special optimizations, i.e. DnsMasq, sysctl changes?
- Load balancing
  - Ensure each domain has its own VIP – HTTP_HOST fails in HTTP 1.0 clients
  - Ensure wild cards are not specified within virtual hosts – see above
  - Ensure sites with load balancing and over SSL use TCP requests correctly, in addition see first point.
  - ifdown each VIP in the webserver pool, does it failover with the correct site on all nodes?
- Monitoring
  - If you previously had monitoring on the servers (you should), has this been replicated to the new servers?
- Database (will vary depending on setup)
  - Is the database replicated?
    - Take LVM snapshots of the raw data on the slave and rsync to the new servers.
      - Be sure to change configuration such as server ids, permissions on the master and the firewall, then start the service and start replication. The new server will then be ready to start replicating with the correct binlog positions etc.
- Other general changes
  - Are there customizations to /etc/hosts to get sites working?
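To make the cache point above concrete, here is a minimal sketch of prefixing keys per site so that two sites sharing one cache cannot overwrite each other's entries. The class name, the site names and the use of a Map as a stand-in for the shared cache are all assumptions for illustration:

```javascript
// Hypothetical wrapper: every key is namespaced with the site it belongs to.
class PrefixedCache {
  constructor(store, site) {
    this.store = store; // shared backend; a Map stands in for memcached here
    this.prefix = `${site}:`;
  }

  get(key) {
    return this.store.get(this.prefix + key);
  }

  set(key, value) {
    this.store.set(this.prefix + key, value);
  }
}

const shared = new Map();
const shop = new PrefixedCache(shared, 'shop.example.com');
const blog = new PrefixedCache(shared, 'blog.example.com');

// Both sites use the same logical key without colliding.
shop.set('homepage', '<shop html>');
blog.set('homepage', '<blog html>');

console.log(shop.get('homepage'));
console.log(blog.get('homepage'));
```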
Let me know if there is anything you think I have missed.
Load balancing with ucarp & haproxy
Recently we had an issue with one of our hosting provider's load balancing (LVS), which resulted in some very small outages. As a result we decided to set up our own load balancing that we had full control over and could manage ourselves, as well as choosing a better suited weighting algorithm.
Each webserver is set up using ucarp, an implementation of the Common Address Redundancy Protocol (CARP), allowing failover of a single Virtual IP (VIP) for high availability. We bound multiple VIPs for each host as we noticed some HTTP 1.0 clients incorrectly sending the host address to the server.
There are many ways you can then proxy the webservers and load balance; we decided to use haproxy. This can also be achieved with pound, apache mod_proxy, mod_backhand etc.
In order to set up ucarp & haproxy:
apt-get install -y haproxy ucarp
Modify /etc/network/interfaces giving each interface a unique ucarp-vid and adjust ucarp-advskew for weighting on each server (increment by one for each server) and set ucarp-master to yes if it is to be the master. Modify the configuration below appropriately.
# The primary network interface
auto eth0
iface eth0 inet static
# IP address of server
address 10.10.10.2
netmask 255.255.255.255
broadcast 10.10.10.10
gateway 10.10.10.1
ucarp-vid 3
# VIP to listen to
ucarp-vip 10.10.10.20
ucarp-password password
ucarp-advskew 10
ucarp-advbase 1
ucarp-facility local1
ucarp-master yes
iface eth0:ucarp inet static
# VIP to listen to
address 10.10.10.20
netmask 255.255.255.255
To bring the interface up, simply run the following:
ifdown eth0; ifup eth0
ifdown eth0:ucarp; ifup eth0:ucarp
In order to configure haproxy:
sed -i -e 's/^ENABLED.*$/ENABLED=1/' /etc/default/haproxy
Reconfigure apache to listen only on local interfaces (/etc/apache2/ports.conf).
So replace “Listen 80” with
Listen 10.10.10.20:80
Listen 10.10.10.2:80
Edit /etc/haproxy/haproxy.cfg:
listen web 10.10.10.20:80
mode http
balance leastconn
stats enable
stats realm Statistics
stats auth stats:password
stats scope .
stats uri /stats?stats
#persist
server web1 10.10.10.2:80 check inter 2000 fall 3
server web2 10.10.10.3:80 check inter 2000 fall 3
server web3 10.10.10.4:80 check inter 2000 fall 3
server web4 10.10.10.5:80 check inter 2000 fall 3
server web5 10.10.10.6:80 check inter 2000 fall 3
Then restart haproxy with /etc/init.d/haproxy restart
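For anyone wondering what “balance leastconn” in the configuration above is doing, the idea is that each new request goes to the backend that currently has the fewest active connections. A simplified sketch of that selection follows; it ignores server weights and how ties are shared out, and the data structure is an assumption for illustration rather than anything haproxy exposes:

```javascript
// Pick the healthy server with the fewest active connections.
// Returns null when no server is up.
function pickLeastConn(servers) {
  const healthy = servers.filter((server) => server.up);
  if (healthy.length === 0) return null;
  return healthy.reduce((best, server) =>
    server.active < best.active ? server : best
  );
}

// Hypothetical snapshot of the pool from the configuration above.
const pool = [
  { name: 'web1', active: 12, up: true },
  { name: 'web2', active: 3, up: true },
  { name: 'web3', active: 1, up: false }, // failed its health checks
  { name: 'web4', active: 7, up: true },
];

const chosen = pickLeastConn(pool);
console.log(chosen ? chosen.name : 'no servers available');
```

This suits requests whose duration varies a lot, since a server stuck on slow requests naturally receives fewer new ones.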

After changing your DNS to point to 10.10.10.20 you will be able to see the traffic balanced between the servers by going to http://10.10.10.20/stats?stats with the credentials assigned above, which lists the bytes passed to each server.
Some other alternatives are:
Soap, XmlRpc and Rest with the Zend Framework
The Project
I was recently working on a project to expose our trading systems via XmlRpc, Rest and SOAP. It was quite an interesting project, which took two of us three weeks to develop (amongst other things).
This involved creating a testbed that would automatically generate the payload and response for each protocol. The parameters are introspected for each class method, capturing each parameter's data type and allowing for user input via standard HTML forms. This is probably best described with a picture or two.
Most of the documentation was generated via reflection and comments within the docblocks; parameters and notes were also generated, making it quick and simple to update. In addition, the start and end line of each method were parsed for any applicable error codes/faults that may be returned.
Zend Framework
Using the Zend Framework for the first time in a commercial product was not exactly hassle free, and it still has quite a few issues with its webservices implementation. Currently there seems to be quite a bit of confusion regarding its Rest implementation and whether it is to be merged; it would be great if someone could clarify this.
The main issue I found with the Zend Framework's implementation of XmlRpc and Rest is that it assumes that the payload it receives is valid. During my development I tended to mix the payloads from SOAP, XmlRpc and Rest, yet it would assume that SimpleXML could parse the input.
For example, $this->_sxml is assumed to be a valid object; if it is not, you will get either an invalid method call or an undefined index, which doesn't render well for an XmlRpc server.
/**
* Constructor
*
* @param string $data XML Result
* @return void
*/
public function __construct($data)
{
$this->_sxml = simplexml_load_string($data);
}
/**
* toString overload
*
* Be sure to only call this when the result is a single value!
*
* @return string
*/
public function __toString()
{
if (!$this->getStatus()) {
$message = $this->_sxml->xpath('//message');
return (string) $message[0];
} else {
$result = $this->_sxml->xpath('//response');
if (sizeof($result) > 1) {
return (string) "An error occured.";
} else {
return (string) $result[0];
}
}
}
One of the main issues with Rest was that it needed a ksort when using the Rest client, as the arguments were not necessarily passed in order. A request can be “rest.php?method=x&arg1=1&arg0=0” and it would interpret each arg in the order it received them. This should be fixed in the next release of the ZF.
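The fix amounts to ordering the argN parameters by their index before passing them to the method. A rough sketch of that idea in JavaScript (not the ZF code itself; the function name is made up for the example). It sorts on the numeric part of the key, so arg10 lands after arg2, which a plain string sort would get wrong:

```javascript
// Return the values of arg0, arg1, ... in index order, ignoring other keys.
function orderedArgs(query) {
  return Object.keys(query)
    .filter((key) => /^arg\d+$/.test(key))
    .sort((a, b) => Number(a.slice(3)) - Number(b.slice(3)))
    .map((key) => query[key]);
}

// Parsed form of "rest.php?method=x&arg1=1&arg0=0"
const query = { method: 'x', arg1: '1', arg0: '0' };
console.log(orderedArgs(query)); // [ '0', '1' ]
```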
The webservices we are exposing need to perform well given the number of transactions they will be handling, and Zend Server Reflection performs a lot of reflection (which I only noticed once I started profiling). I wanted to remove any overhead, which got me looking at Zend_XmlRpc_Server_Cache. The first thing I did was profile Zend_XmlRpc_Server_Cache, which itself added a considerable amount of overhead. Looking at its implementation, it uses serialize, which is a relatively slow process and should be avoided unless there is a large overhead in initializing objects. So most likely Zend_XmlRpc_Server_Cache will not add any benefit. Also, var_dump'ing the reflection in XmlRpc spews out a shocking amount of information for some fairly large classes.
if (!Zend_XmlRpc_Server_Cache::get($cacheFile, $server)) {
}
Generating WSDL
I tried a number of WSDL generators including the implementation in incubator for ZF, which I found to be the best, yet I still had to write a large chunk of the WSDL by hand and adapt it.
The best way to debug is to run the soap client with verbose mode on, and it will typically tell you the issue straight away.
- Zend_Soap_AutoDiscover: Duplicates an operation in WSDL for methods with parameters that are optional. (ZF-2642)
- Zend_Soap_AutoDiscover: If missing the @return in your docblock the message response in the WSDL is not generated. (ZF-2643)
- AutoDiscover duplicates response if using set class multiple times. (ZF-2641 )
- One of my colleagues typically writes docblocks as “@return int, comment.”, where the comma caused return types to be dropped by AutoDiscover; this is more of an issue with Zend Server Reflection.
Other odd issues
Raw input bug
Another obscurity I found was in capturing the raw request data. In our local development environment, reading the raw request input and then reading it once again within the Zend Framework appears to work fine. However, in our pre-production environment the second attempt to read the raw request fails (PHP 5.2.2).
if (!isset($HTTP_RAW_POST_DATA)){
$HTTP_RAW_POST_DATA = file_get_contents('php://input');
}
It does seem a little odd that the XmlRpc server does not check whether $HTTP_RAW_POST_DATA is set before attempting to re-read the raw input.
Internal error: Wrong return type
Whilst running PHPUnit I noticed a very weird quirk in our local dev environment, which essentially did the following. You would expect this to output the contents of an array, right? Well, between method x returning the result and method y receiving it, the value becomes NULL. This is very obscure and I've never seen anything like it, especially considering it is explicitly set. I had a number of colleagues check this, which had us all scratching our heads. Has anyone else seen anything similar to this?
class test {
public function x() {
$ret = array();
for(...) {
$ret[] = $row;
}
return $ret;
}
public function y() {
$response = $this->x();
var_dump($response);
}
}
$t = new test();
$t->y();
Conclusion
Overall the project went pretty well, and I'm confident it is now stable, especially with the number of tests we ran against it. It is adaptable to other projects that we may need to expose via an API; in total there are about 6000 lines of code just testing the 3 different protocols it supports. I would rather have avoided the Rest implementation in ZF as it still needs a lot of work; however, XmlRpc is a lot more stable and I would quite happily use it again. As there is a lot of overhead with reflection it is not the fastest implementation, and for some simple functionality it was comparable to some of the heavier web pages we have. It would be ideal to replace the reflection with something lighter, such as an array with the corresponding methods, parameters and types; however, I would only look into that if performance did become a major issue.
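As a rough idea of what the lighter alternative could look like, here is a sketch of a static table of methods, parameters and types that is checked on each call instead of reflecting over classes. It is written in JavaScript purely for illustration; the method names, parameters and handlers are invented and are not from the project:

```javascript
// Hypothetical method table: everything reflection would discover is declared up front.
const METHODS = {
  'account.getBalance': {
    params: [{ name: 'accountId', type: 'number' }],
    handler: (accountId) => ({ accountId, balance: 0 }),
  },
  'echo.upper': {
    params: [{ name: 'text', type: 'string' }],
    handler: (text) => text.toUpperCase(),
  },
};

// Look the method up, validate argument count and types, then call it.
function dispatch(methodName, args) {
  if (!Object.prototype.hasOwnProperty.call(METHODS, methodName)) {
    throw new Error(`Unknown method: ${methodName}`);
  }
  const method = METHODS[methodName];
  if (args.length !== method.params.length) {
    throw new Error(`${methodName} expects ${method.params.length} argument(s)`);
  }
  method.params.forEach((param, index) => {
    if (typeof args[index] !== param.type) {
      throw new TypeError(`${methodName}: ${param.name} must be a ${param.type}`);
    }
  });
  return method.handler(...args);
}

console.log(dispatch('echo.upper', ['hello'])); // HELLO
```

The trade-off is that the table has to be kept in step with the code by hand (or generated at build time), which is exactly the work reflection was saving.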
PS. Just to note, I used PHP's built-in SOAP server.
Zend Studio for Eclipse: Neon
I’ve been using Zend Studio for Eclipse (beta) for several weeks in a rewrite of a framework and numerous sites at work, and overall I really like the IDE. It's got some great features, and being based on the Eclipse project makes it really extensible and customizable. Debugging, profiling, code completion, code formatting and more can all help with productivity.
A complete list of features can be found at Zend.
Zend Studio for Eclipse consumes quite a lot of memory and the recommended amount of RAM for eclipse based applications is 2GB, however you can control the amount of memory that eclipse will use by editing the zendStudio.ini file.
Whilst I do like the IDE I have found a number of issues with Zend Studio:
Bugs
- 1. There have been a number of issues revolving around the SVN implementation in Zend Studio for Eclipse which cause the application to hang (SVN support is provided to Eclipse by a 3rd party plugin from a company called Polarion):
  - a) When committing files it locks entire directories and often hangs, making Zend Studio for Eclipse unusable. If you have unsaved files and attempt to save them, the save is queued as a pending user task, and because the commit has stalled you cannot save the file.
    To resolve this I have to kill the process for Zend Studio, shell into the server and clean up the project's src, and sometimes have to re-checkout the directories in a project.
- 2. When developing via a samba share, it prompts with an incorrect error and does not attempt to re-authenticate when the samba share needs to re-connect, and/or does not recognise that it is talking via a remote device.
- 3. Auto format adds extra braces to statements, causing syntax errors, and strips all comments out of files!
- 4. Importing an auto format does not seem to work correctly.
- 5. When working with multiple open files it can overwrite the contents of one with another. I believe this is the case with files of a similar name (I’ve only had this occur once, however a colleague experiences this quite frequently).
- 6. Modified file names are prefixed with “>”, and when searching for files by pressing a character it will not go to that file in PHP Explorer.
- 7. Templates do not always get replaced, e.g. if the system is slow or you type fnc really quickly it doesn’t replace it with the template for a function.
- 8.
- 9. Introducing a syntax error and then removing it doesn’t clear the error until you save the document.
Resolved
- 1. There was a bug in automatically updating eclipse, which never seemed to work however in the latest release (beta 2) this has now been resolved.



