Chapter 4. Hacking Elastic Beanstalk

The purpose of this chapter is to explore where Elastic Beanstalk ends, and where we can begin to adjust the system ourselves. We’ll start by delving into the way the Elastic Beanstalk instances integrate with the Elastic Beanstalk Service. If we understand this, we can slowly start to customize the image that runs our application (and the instances that are launched from it); we’ll change the logging, replace the OpenJDK with the Sun JDK, and replace Apache with Nginx. An interesting way to change an infrastructure is to take things out, which is exactly what we’ll do in the end: we’ll make the Elastic Load Balancer bypass Apache or Nginx altogether.

So, we understand how Elastic Beanstalk works and have more or less mastered the fundamentals. It is time to go a little bit further. Why not get our hands dirty and change those fundamentals? Chapter 2 introduced the concepts underlying the AWS services Elastic Beanstalk uses. There is no time to get into the details of working with every one of those services. If you want more details on how to create an AMI, for example, we suggest you read Programming Amazon EC2.

Building on top of Elastic Beanstalk, we can do all sorts of interesting things. Perhaps you want to use Nginx instead of Apache. Or you are contemplating bypassing Apache for the Tomcat traffic. You might rely on features of the Sun JDK that are not (yet) implemented in the OpenJDK.

Well, the good news is, we can do all these things. And, even better, the hostmanager (the part of the image that AWS added) is available under the Amazon Software License. You have no control over the actual communication between the hostmanager and Beanstalk, however, so doing these things comes at a cost: you will have to keep an eye on changes to this service, and maintain your custom images yourself.

The Instance

Elastic Beanstalk comes with default AMIs, for 32-bit and 64-bit instances. These images launch into instances that basically contain two things:

  1. Everything related to running Tomcat (6 or 7)

  2. The hostmanager, which is used to communicate with the Beanstalk environment

The hostmanager is a Ruby application. It takes care of starting and stopping necessary applications, handling deploys, and other Beanstalk-related tasks, like restarting the application servers. The Tomcat environment is a standard install, on top of the OpenJDK.

If you know your way around Linux (CentOS/RH in particular), you can inspect the instances. If you launch a separate instance, you can make your changes and create a custom AMI. The difficulty is that you can’t easily deploy your WAR and test if your changes work.

We launched an environment with one instance, and made changes there so we could test them immediately. Sometimes, when we broke the instance, Beanstalk automatically replaced it, so you have to work quickly. Once we were happy with the changes, we replayed them on the separate instance and created our AMI. Changing the environment configuration to use the new AMI will show you whether you were successful.

Note

It is a bit difficult to work with Elastic Beanstalk Instances like this. Another way to prevent accidental termination by an eager Elastic Beanstalk is to use Termination Protection. You can enable Termination Protection in the Console.

If you use Termination Protection, Elastic Beanstalk will still launch a replacement if your instance does not show up as healthy. But it can’t terminate the original, so you don’t lose changes while working on it.

Logging

The Tomcat logging is verbose in the default images. Changing this requires a custom image, because the settings live in /opt/tomcat7/conf/logging.properties on the instance. We changed this file to this:

# ElasticBeanstalk Tomcat Logging
handlers = 1monitor.java.util.logging.FileHandler, 2tail.java.util.logging.FileHandler

# catalina.log for logrotate
1monitor.java.util.logging.FileHandler.level = WARNING
1monitor.java.util.logging.FileHandler.count = 1
1monitor.java.util.logging.FileHandler.pattern = ${catalina.base}/logs/monitor_catalina.log
1monitor.java.util.logging.FileHandler.append = true
1monitor.java.util.logging.FileHandler.formatter=java.util.logging.XMLFormatter

2tail.java.util.logging.FileHandler.level = WARNING
2tail.java.util.logging.FileHandler.count = 1
2tail.java.util.logging.FileHandler.pattern = ${catalina.base}/logs/tail_catalina.log
2tail.java.util.logging.FileHandler.append = true
2tail.java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter

Now, create the image and change the environment configuration.
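
Before baking the image, it can help to check that every handler named on the handlers line also has a pattern entry, since a typo in one of the long handler names is easy to miss. The following is a minimal sketch of such a check. The helper name check_handlers is hypothetical, and the demo writes a shortened copy of the file to a temporary directory; on an instance you would point it at /opt/tomcat7/conf/logging.properties instead.

```shell
#!/bin/sh
# check_handlers (hypothetical helper): succeed only if every handler listed
# on the "handlers = ..." line has a matching "<handler>.pattern" entry.
check_handlers() {
    conf="$1"
    missing=0
    # Take the right-hand side of the handlers line and split it on commas.
    handlers=$(sed -n 's/^handlers *= *//p' "$conf" | tr ',' '\n' | tr -d ' ')
    for h in $handlers; do
        # -F: treat the name as a fixed string, because it contains dots
        if ! grep -qF "$h.pattern" "$conf"; then
            echo "missing pattern for $h"
            missing=1
        fi
    done
    return $missing
}

# Demo on a throwaway, shortened copy of the configuration shown above.
tmp=$(mktemp -d)
cat > "$tmp/logging.properties" <<'EOF'
handlers = 1monitor.java.util.logging.FileHandler, 2tail.java.util.logging.FileHandler
1monitor.java.util.logging.FileHandler.pattern = ${catalina.base}/logs/monitor_catalina.log
2tail.java.util.logging.FileHandler.pattern = ${catalina.base}/logs/tail_catalina.log
EOF
check_handlers "$tmp/logging.properties" && echo "logging.properties looks consistent"
```

The check is deliberately narrow: it only compares names, and does not tell you whether Tomcat accepts the file.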

Note

While we are in /opt/tomcat7/conf anyway, note that there are many other files there. If you are familiar with Tomcat, you will find your way around easily.

One thing we noticed is that there are only two AMIs, one 32-bit and one 64-bit. If you want to use other instance types for your particular app, you will definitely want to have a look at server.xml, to change the number of threads, for example.
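
As an illustration of this kind of tuning, the sketch below rewrites the maxThreads attribute in a server.xml and keeps a backup. It assumes the Connector element carries an explicit maxThreads attribute, which may not be the case in the Beanstalk image (without it, Tomcat falls back to its default of 200). The helper name set_max_threads and the value 400 are hypothetical, and the demo runs against a sample file rather than the real /opt/tomcat7/conf/server.xml.

```shell
#!/bin/sh
# set_max_threads (hypothetical helper): rewrite every maxThreads="N"
# attribute in the given file and keep the original as <file>.orig.
set_max_threads() {
    file="$1"
    threads="$2"
    cp "$file" "$file.orig"
    sed "s/maxThreads=\"[0-9]*\"/maxThreads=\"$threads\"/g" "$file.orig" > "$file"
}

# Demo on a sample connector definition (illustrative, not the Beanstalk file).
tmp=$(mktemp -d)
cat > "$tmp/server.xml" <<'EOF'
<Connector port="8080" protocol="HTTP/1.1" maxThreads="150" connectionTimeout="20000" />
EOF
set_max_threads "$tmp/server.xml" 400
cat "$tmp/server.xml"
```

A sensible value depends on the instance type and on your application, so treat the number as a starting point for load testing.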

Sun JDK

There are things we can’t see from outside the instances. We can’t see what the memory usage is, for example. By logging in, we can see the instances themselves are fine. The memory appears to be OK as well. But we have a very limited test suite, so it doesn’t tell us very much.

There is an interesting tool we often use in other Tomcat environments, and that is VisualVM. In theory, it would be pretty straightforward to enable this type of profiling information in Tomcat, but it is not that easy.

The first obstacle is that this kind of profiling support is available in the Sun JDK, but only experimental in the OpenJDK that powers Beanstalk. But we still wanted to give this a try. Launching a separate medium instance from the Beanstalk AMI gives us something to work on. And after some time we were able to replace the OpenJDK with the Sun JDK by doing the following:

$ rpm -e --nodeps java-1.6.0-openjdk.i686

$ wget -O jdk-6u25-linux-i586-rpm.bin \
    http://download.oracle.com/otn-pub/java/jdk/6u25-b06/jdk-6u25-linux-i586-rpm.bin

$ sh jdk-6u25-linux-i586-rpm.bin

$ sed -i 's/\/usr\/lib\/jvm\(\/jre\)*/\/usr\/java\/jdk1.6.0_25/g' \
    /etc/java/java.conf \
    /etc/profile.d/aws-apitools-common.sh
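
The sed expression in the last command is hard to read because of the escaped slashes. The sketch below applies the same substitution with | as the delimiter, to two sample lines, so you can see what it does before running it against the real files. The sample lines and the helper name rewrite_java_home are illustrative assumptions, not copies of the actual file contents.

```shell
#!/bin/sh
# rewrite_java_home (hypothetical helper): the same substitution as above,
# reading stdin. It replaces /usr/lib/jvm, optionally followed by /jre,
# with the Sun JDK location.
rewrite_java_home() {
    sed 's|/usr/lib/jvm\(/jre\)*|/usr/java/jdk1.6.0_25|g'
}

# Illustrative input lines only.
printf '%s\n' 'JAVA_HOME=/usr/lib/jvm/jre' 'JVM_ROOT=/usr/lib/jvm' | rewrite_java_home
# prints:
#   JAVA_HOME=/usr/java/jdk1.6.0_25
#   JVM_ROOT=/usr/java/jdk1.6.0_25
```

Because the pattern matches any path that starts with /usr/lib/jvm, it is worth reviewing the two files afterward for paths you did not intend to change.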

Note

We posted this solution in the AWS forum, and one of our colleagues (dhavala, going by the name of Kris) came up with an alternative way of installing the Sun JDK:

Re: installing Sun JDK
Posted by: Kris
Posted on: May 30, 2011 3:45 AM
 in response to: truthtrap
When you remove OpenJDK using rpm -e --nodeps, you end up removing 
some symbolic links that are not created upon installing Sun JDK bin.

Here are the commands for Tomcat, 64bit (similar to truthtrap's)

cd ~
wget -O jdk-6u25-linux-x64-rpm.bin http://download.oracle.com/otn-pub/java/jdk/6u25-b06/jdk-6u25-linux-x64-rpm.bin
sudo chmod +x jdk-6u25-linux-x64-rpm.bin
sudo ./jdk-6u25-linux-x64-rpm.bin
sudo alternatives --install /usr/bin/java java /usr/java/default/bin/java 20000
sudo update-alternatives --config java
sudo ln -s /usr/java/default/jre /usr/lib/jvm/jre
sudo ln -s /usr/share/java /usr/lib/jvm-exports/jre

(Optional) While you are at it, install PSI-Probe in the Dev 
environments to monitor your JVMs.

Just copy probe.war to /usr/share/tomcat6/webapps, start Tomcat using:
/etc/init.d/tomcat6 start

We created the image, and it works perfectly. Supposedly, adding the following JVM command-line options should make Tomcat ready for VisualVM-style scrutiny, but we did not get this to work. This is not a case of “we leave this as an exercise for the reader”; we simply have a book to finish. If you know how to make this work, let us know:

-Dcom.sun.management.jmxremote=true \
-Dcom.sun.management.jmxremote.port=8086 \
-Dcom.sun.management.jmxremote.ssl=false \
-Dcom.sun.management.jmxremote.authenticate=false

Nginx

Apache is quite a beast. It does everything, basically, but at a cost. For heavy lifting (like our heystaq API calls), this is fine, but for other calls, the overhead of Apache is not always necessary. So, why not replace Apache with Nginx?

Replacing Apache with Nginx is a little bit more difficult than replacing the JDK. To do this we not only have to change the OS installation, but we also have to change the hostmanager.

We compiled most of the changes you have to make into this script:

#!/bin/sh

# install Nginx
yum -y install nginx
sed -i 's/ 1;/ 4;/g' /etc/nginx/nginx.conf
echo 'MAKE SURE TO REMOVE THE server ENTRY FROM /etc/nginx/nginx.conf'

# add Nginx to the infamous Beanstalk hostmanager
cd /opt/elasticbeanstalk/srv/hostmanager/lib/elasticbeanstalk/hostmanager
cp utils/apacheutil.rb utils/nginxutil.rb
sed -i 's/Apache/Nginx/g' utils/nginxutil.rb
sed -i 's/apache/nginx/g' utils/nginxutil.rb
sed -i 's/httpd/nginx/g' utils/nginxutil.rb
cp init-tomcat.rb init-tomcat.rb.orig
sed -i 's/Apache/Nginx/g' init-tomcat.rb
sed -i 's/apache/nginx/g' init-tomcat.rb

# create the right proxies (Beanstalk and hostmanager)
echo 'proxy_redirect            off;
proxy_set_header          Host            $host;
proxy_set_header          X-Real-IP       $remote_addr;
proxy_set_header          X-Forwarded-For $proxy_add_x_forwarded_for;
client_max_body_size      10m;
client_body_buffer_size   128k;
client_header_buffer_size 64k;
proxy_connect_timeout     90;
proxy_send_timeout        90;
proxy_read_timeout        90;
proxy_buffer_size         16k;
proxy_buffers             32              16k;
proxy_busy_buffers_size   64k;' > /etc/nginx/conf.d/proxy.conf

echo 'server {
 listen 80;
 server_name _;
 access_log /var/log/httpd/elasticbeanstalk-access_log;
 error_log /var/log/httpd/elasticbeanstalk-error_log;

 #set the default location
 location / {
  proxy_pass         http://127.0.0.1:8080/;
 }

 # make sure the hostmanager works
 location /_hostmanager/ {
  proxy_pass         http://127.0.0.1:8999/;
 }
}' > /etc/nginx/conf.d/beanstalk.conf

Make sure to remove the server entry in /etc/nginx/nginx.conf; otherwise it won’t work.
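
Since a leftover server entry is easy to overlook, a small check can confirm that it is gone. The following is a minimal sketch with a hypothetical helper, has_server_block, which looks for an uncommented line that opens a server block. The demo uses a sample file; on the instance you would pass /etc/nginx/nginx.conf.

```shell
#!/bin/sh
# has_server_block (hypothetical helper): succeed if the file still contains
# an uncommented line that opens a "server {" block.
has_server_block() {
    grep -Eq '^[[:space:]]*server[[:space:]]*[{]' "$1"
}

# Demo on a sample nginx.conf that still has its server entry.
tmp=$(mktemp -d)
cat > "$tmp/nginx.conf" <<'EOF'
http {
    include /etc/nginx/conf.d/*.conf;
    server {
        listen 80;
    }
}
EOF
if has_server_block "$tmp/nginx.conf"; then
    echo "server block still present, remove it before starting Nginx"
fi
```

The check only matches a server keyword and its opening brace on the same line. On the instance itself, nginx -t additionally tests the syntax of the complete configuration.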

The Infrastructure

Not everything happens on the instances, of course. We can also hack the infrastructure. In the previous section we replaced Apache with Nginx; we can just as easily bypass Apache altogether, and tell the load balancer to connect its port 80 directly to the Tomcat port on the instances.

There is one thing we need to do for that, and that is to make sure the Tomcat instances accept incoming connections on that port. Until a few weeks ago, we had to open up the security group to the world, but Amazon released a feature that allows us to open it up to a special security group that each Elastic Load Balancer has. You can find this security group in the Console, and it has a form like amazon-elb/amazon-elb-sg, where amazon-elb is the owner alias.
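
With the EC2 API tools, opening the Tomcat port to the load balancer's security group could look like the sketch below. It only prints the ec2-authorize command so that you can review it before running it yourself. The instance security group name (elasticbeanstalk-default) and the helper name authorize_cmd are assumptions; check the Console for the group your instances actually use.

```shell
#!/bin/sh
# Assumed values; look up the real ones in the Console.
INSTANCE_GROUP="elasticbeanstalk-default"   # assumed group of the instances
ELB_OWNER="amazon-elb"                      # owner alias of the ELB group
ELB_GROUP="amazon-elb-sg"                   # the ELB source security group

# authorize_cmd (hypothetical helper): print, rather than run, the command
# that allows the ELB security group to reach the given TCP port.
authorize_cmd() {
    echo "ec2-authorize $INSTANCE_GROUP -P tcp -p $1 -u $ELB_OWNER -o $ELB_GROUP"
}

authorize_cmd 8080
# prints: ec2-authorize elasticbeanstalk-default -P tcp -p 8080 -u amazon-elb -o amazon-elb-sg
```

Printing the command first is a deliberate choice here: security group changes take effect immediately, so it is worth reading the line before executing it.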

And now, you can change the ELB from the command line to connect port 80 to 8080 (for easy testing you can make two distinct connections, 80 to 80 and 8080 to 8080):

# first remove the old listener
$ elb-delete-lb-listeners awseb-staging --lb-ports 80

# and then bypass Apache by pointing port 80 directly to 8080
$ elb-create-lb-listeners awseb-staging \
    --listener "lb-port=80,instance-port=8080,protocol=http"

Note

In general it is best to hide as much of your instances as possible, but this particular feature just wasn’t there yet. We expect the Elastic Beanstalk product team to implement these features in updates.

We could have removed Apache altogether, but that would have broken the hostmanager. The hostmanager runs on port 8999, but Beanstalk talks to /_hostmanager, and Apache proxies that traffic to port 8999. We chose to bypass Apache for the application traffic, and leave it be.

Conclusion

This chapter shows that, with a few recipes, you can get inside Elastic Beanstalk and customize—or hack—the provided images to your needs. It’s not so straightforward, but it’s possible. The next thing would be to create your own AMIs for Beanstalk, but that goes beyond being a user to being a contributor to Beanstalk.

We hope that by reading this book you learned how to use Elastic Beanstalk in standard and more advanced ways. If you understand what is happening underneath the surface of your application, the potential of what you can do with Elastic Beanstalk and AWS is big!
