Saturday, August 14, 2010

Just FYI

This is to inform possible readers of my blog that my article "Do we really need Servlet Containers always?" was published in Also got linked my post "How JBoss Netty helped me build a highly scalable HTTP Communication Server over GPRS" on the Netty community site.

I will be writing more posts on this blog shortly.

Saturday, July 17, 2010

Do we really need Servlet Containers always - concluding part

Building an HTTP server engine using JBoss Netty and using Apache Web Server for load-balancing with sticky-sessions

JBoss Netty has been my first choice for building a scalable HTTP server engine sans Servlet API and Containers, ever since I first discovered it around 1.5 years back. Its USP is its simplicity combined with extreme performance. It is primarily an NIO framework which provides ready-to-use HTTP Protocol decoder and encoder along with a basic API for HTTP request and response, with which we can very simply and quickly build a scalable HTTP server. The performance test results and testimonials are proof of its excellence, for various protocols, including HTTP, vis-à-vis other NIO frameworks as well as HTTP Servlet containers like Jetty and Tomcat.

The documentation is very comprehensive and intuitive with a number of examples.

Here, I shall show a simple example of creating HTTP server with JBoss Netty which creates and provides a session identity to the user. When the user connects by providing the session identity as a URL parameter and without the name, it welcomes the user with its name. We will use this simple server as an example to do reverse proxy and then load balancing with sticky-sessions with Apache Web Server’s mod_proxy and mod_proxy_balancer modules.

First, we need to create a handler, much like a Servlet that would receive HTTP requests. HTTP requests can be chunked and would have to be combined at the server side before processing. This chunking is not typical of high-bandwidth Internet requests, but is very much possible with GPRS medium. So first, we have an abstract template handler for receiving HTTP requests.

package apachetest;

import org.jboss.netty.buffer.ChannelBuffer;
import org.jboss.netty.buffer.ChannelBuffers;
import org.jboss.netty.handler.codec.http.HttpChunk;
import org.jboss.netty.handler.codec.http.HttpRequest;

public abstract class AbstractHttpHandler
extends SimpleChannelUpstreamHandler {

protected HttpRequest currentRequest;

public static ChannelGroup allChannels = new DefaultChannelGroup

public void messageReceived(ChannelHandlerContext ctx,
MessageEvent e)
throws Exception {
Object message = e.getMessage();
Channel userChannel = e.getChannel();
boolean processMessage = false;
if (message instanceof HttpChunk) {
HttpChunk httpChunk = (HttpChunk) message;
if (currentRequest==null)
throw new
IllegalStateException("No chunk start");
ChannelBuffer channelBuffer = currentRequest
if (channelBuffer==null)
throw new
IllegalStateException("No chunk start");
ChannelBuffer compositeBuffer =
processMessage = httpChunk.isLast();
} else if (message instanceof HttpRequest){
currentRequest = (HttpRequest) message;
processMessage = !currentRequest.isChunked();
if (processMessage) {
handleRequest(currentRequest, userChannel);

protected abstract void handleRequest(HttpRequest request,
Channel userChannel) throws Exception;

public void channelOpen(ChannelHandlerContext ctx,
ChannelStateEvent e) throws Exception {
public void exceptionCaught(ChannelHandlerContext ctx,
ExceptionEvent e) throws Exception {

The above AbstractHttpHandler extends JBoss Netty’s SimpleChannelUpstreamHandler and extends its messageReceived method to assemble HTTP chunks, if required to a complete HTTP Request by simply appending the content body. Once a complete HTTP Request is available, it calls the abstract handleRequest method, whose implementation is left to base classes, with the HttpRequest and the channel from which the request was obtained. The ChannelPipelineCoverage is kept as one so that for every channel connection, a new handler will be assigned. This helps us to ensure that we are dealing with a single channel in every handler and so we can safely assemble their chunked requests.

When a new channel (socket) is opened, it is added to an allChannels group. This group enables to close all client and server sockets gracefully during shutdown with a single method call. We will see this in more detail shortly in another code snippet.

Now, we will create an implementation of this AbstractHttpHandler which assigns unique sessions to users and later identifies the users from their sessions.

package apachetest;
import java.util.List;
import java.util.Map;

import org.jboss.netty.buffer.ChannelBuffers;
import org.jboss.netty.handler.codec.http.DefaultHttpResponse;
import org.jboss.netty.handler.codec.http.HttpHeaders;
import org.jboss.netty.handler.codec.http.HttpRequest;
import org.jboss.netty.handler.codec.http.HttpResponse;
import org.jboss.netty.handler.codec.http.HttpResponseStatus;
import org.jboss.netty.handler.codec.http.HttpVersion;
import org.jboss.netty.handler.codec.http.QueryStringDecoder;

public class SimpleHttpHandler extends AbstractHttpHandler {

private String serverInstanceId;

private String sessionInstanceSeparator;

private static ThreadLocal sessionIdBufLocal
= new ThreadLocal(){

public StringBuffer get() {
StringBuffer strBuf=super.get();
return strBuf;

protected StringBuffer initialValue() {
return new StringBuffer(50);


public SimpleHttpHandler(String serverInstanceId,
String sessionInstanceSeparator) {
this.serverInstanceId = serverInstanceId;
this.sessionInstanceSeparator = sessionInstanceSeparator;

protected void handleRequest(HttpRequest request,
Channel userChannel) throws Exception {
String requestUri = request.getUri();
QueryStringDecoder queryStringDecoder = new
Map> params =
String sessionId = null;
int indexOf = requestUri.indexOf("sessionTest");
if (indexOf==-1) {
sendResponse(request, userChannel,
"Valid URI supported is /sessionTest");

List sessionList = params.get("jsessionid");
if (sessionList!=null && !sessionList.isEmpty()) {
String sessionInList = sessionList.get(0);
if (sessionInList!=null &&
!"".equals(sessionInList.trim())) {
sessionId = sessionInList;
List userList = params.get("user");
String user = null;
if (userList!=null && !userList.isEmpty()) {
user = userList.get(0);
if (sessionId==null) {
if (user==null || "".equals(user)) {
sendResponse(request, userChannel,
"No valid session, please login with user");
UserSessionDetails userSessionDetails =
if (userSessionDetails==null) {
userSessionDetails = new UserSessionDetails();
userSessionDetails.userChannel = userChannel;
userSessionDetails.userId = user;
.putIfAbsent(user, userSessionDetails);
userSessionDetails =
synchronized(userSessionDetails) {
if (userSessionDetails.sessionId!=null) {
sessionId = userSessionDetails.sessionId =
userSessionDetails.sessionCreationTime =
userSessionDetails.lastAccessTime =

StringBuffer sessionIdToReturn =
if (serverInstanceId!=null &&
!"".equals(serverInstanceId)) {
"Session Id is "+sessionIdToReturn);

} else {
//strip off the identity of the server
int serverIdIndex =
if (serverIdIndex>=0)
sessionId = sessionId.substring(0,
UserSessionDetails userSessionDetails =
if (userSessionDetails==null) {
sendResponse(request, userChannel,
"No valid session, please login with user");
synchronized(userSessionDetails) {
userSessionDetails.userChannel = userChannel;
userSessionDetails.lastAccessTime =
UserSessionDetails userDetails =
if (userDetails==null ||
userDetails!=userSessionDetails) {
"Welcome user "+userSessionDetails.userId
+" to server "+serverInstanceId);

private void sendResponse(HttpRequest request,
                Channel userChannel,
HttpResponseStatus responseStatus, String responseBody)
throws Exception {
boolean close =
.getHeader(HttpHeaders.Names.CONNECTION)) ||
.equals(HttpVersion.HTTP_1_0) &&
HttpResponse response = new DefaultHttpResponse
(HttpVersion.HTTP_1_1, responseStatus);
response.setHeader(HttpHeaders.Names.PRAGMA, "no-cache");
byte[] array = responseBody.getBytes("UTF8");
if (responseBody!=null)
ChannelFuture future = userChannel.write(response);
if (close) {

The SimpleHttpHandler is created with a server instance id and a session instance separator. Each server is provided a unique identity in a cluster of servers. Sessions created for users would be given a session id with the following format - . Thus, if the server instance name is ‘netty1’ and session instance separator is ‘.’ and the unique id created is ‘2C855D9FB0710111’, then the session id given to the user would be the concatenated value ‘2C855D9FB0710111.netty1’. This would help Apache Web Server to maintain sticky sessions, as it would be able to parse the session id and find that it has been created by netty1 instance and all the next requests with this session id must be passed to this instance.

In our handleRequest implementation, we simply generate a new session id for every new user using our own IdGenerator. Notice how we can use the request.getUri() to check that the requests have been made to the expected URI which is ‘/sessionTest’. The QueryStringDecoder from the Netty API helps in extracting request parameters as name-value(s) map. If the request provides the username, we simply create a new session id and store the user session details in our local memory map in SessionIdHolder. We send the response to the user – ‘Session ID is <…..>’. In the next request, the user can copy this session id from the response and send it as a request parameter. Then the server extracts the user session details mapped to the session id, after stripping off the server instance id, and finds out the corresponding user name. It sends the response ‘Welcome user <….> to server <…>’. If it does not find the session in its map, it returns the response ‘No valid session, please login with user’.

The sessions can be shared across servers using replicated caching or a shared database. However, to keep our example simple, and to illustrate load balancing with session stickiness, we have not replicated the sessions.

The sendResponse method sends a response to the user channel from which the request was received. It sets the desired response status – 200 for OK, 403 for UNAUTHORIZED. The HttpResponseStatus from Netty API provides all the HTTP response constants. We also set cache-control and pragma headers to ensure that responses are not cached by the client browser. We also set the content-length header. If the request had a CLOSE header or was using HTTP1.0 protocol with no KEEP_ALIVE header, we close the client socket channel after providing the response. If not, the same socket channel will be used for multiple HTTP request-response cycles between the client and server.

Now that we have the handlers ready, we write the main class to start the HTTP server.

package apachetest;

import java.util.concurrent.Executors;

import org.jboss.netty.bootstrap.ServerBootstrap;
import org.jboss.netty.handler.codec.http.HttpRequestDecoder;
import org.jboss.netty.handler.codec.http.HttpResponseEncoder;

public class SessionTestServer {

public static void main(String[] args) {
int port = 9080;
String[] arguments = new String[] {"","."};
if (args.length>0 && !"".equals(args[0])
                    && args[0]!=null) {
arguments[0] = args[0];
if (args.length>1 && !"".equals(args[1])
                    && args[1]!=null) {
arguments[1] = args[1];
if (args.length>2 && !"".equals(args[2])
                    && args[2]!=null) {
try {
int portGiven = Integer.parseInt(args[2]);
if (portGiven<=1024 || portGiven>9999);
port = portGiven;
} catch (Exception e) { }

String serverInstanceId = arguments[0];

String sessionInstanceSeparator = arguments[1];
NioServerSocketChannelFactory factory = new
ServerBootstrap bootstrap = new ServerBootstrap(factory);
Channel serverChannel =
bootstrap.bind(new InetSocketAddress(port));
if (serverChannel==null) {
System.out.println("Unable to open
                            server at port "+port);
System.out.println("Started server at port "+port
+" Press any key to stop server");
try {;
} catch (IOException e) {
} finally {
System.out.println("Shutting down server");
ChannelGroupFuture future =

private static class HttpServerPipelineFactory implements
ChannelPipelineFactory {

private String serverInstanceId;
private String sessionInstanceSeparator;

public HttpServerPipelineFactory(String serverInstanceId,
                String sessionInstanceSeparator) {
this.serverInstanceId = serverInstanceId;
this.sessionInstanceSeparator =
public ChannelPipeline getPipeline() throws Exception {
ChannelPipeline pipeline = Channels.pipeline();
                             new HttpRequestDecoder());
                             new HttpResponseEncoder());
pipeline.addLast("handler", new
return pipeline;

The program accepts the server instance id, the session separator and the server port number to bind for accepting client requests in the command-line arguments. These arguments are optional, the default value for session separator is ‘.’ and the server port number is 9080.

Like any other server-side implementation using Netty, we create a NioSocketChannelFactory and bootstrap the server with our ChannelPipelineFactory implementation and bind it to the acceptor port. We also add the server channel to the allChannels group and wait for user key stroke to stop the server. While stopping the server, we need to simply close the allChannels group and release external resources of our factory, which would stop underlying thread pools.

Now let us look at the ChannelPipelineFactory implementation. For every new client channel that connects to the server, a new pipeline (which is based on interceptor-chain design pattern) is created. The ready-made Netty HttpRequestDecoder and HttpResponseDecoder, and finally our very own SimpleHttpHandler are added to the pipeline in that order.

This is all that is needed to create a simple HTTP server using Netty API.

Now let us start the server using the command line

java apachetest.SessionTestServer netty1 . 9080

This starts a server listening for HTTP requests at port 9080 and with server instance id ‘netty1’ and session id separator as ‘.’ We get the command-line print-

Started server at port 9080 Press any key to stop server

We leave the server running and open a new browser window. We give the URL address as http://localhost:9080/sessionTest?user=Archanaa and press enter.

We get back a response on the browser window – ‘Session id is ….’

Now, we copy the session id and change the URL address at the browser to http://localhost:9080/sessionTest?jsessionid=0140C3892B4F5348.netty1

- where the ‘jsessionid’ request parameter value would be the session id which was assigned to us, in the last response.

We now get the response – ‘Welcome user Archanaa to server netty1’

At the console where netty1 process has been started, press any key (like Enter). The server process will stop with the message – ‘Shutting down server’

Now, let us use Apache Web Server for load-balancing two such HTTP servers with session stickiness.

After installing the Apache Web Server, which would listen at port 80 on your machine, go to conf/ directory and make the following changes in httpd.conf configuration file.

  • Uncomment the modules mod_proxy, mod_proxy_balancer, mod_proxy_http and mod_rewrite.
  • Add the following VirtualHost configuration to the end of the httpd.conf file.

DocumentRoot "C:/apps/Apache Software Foundation/Apache2.2/htdocs"
ProxyRequests Off
ErrorLog logs/
CustomLog logs/ common

Order deny,allow
Allow from all

ProxyPreserveHost On
ProxyPass /sessionTest/ balancer://mycluster/ stickysession=jsessionid
ProxyPass /sessionTest balancer://mycluster stickysession=jsessionid

Order deny,allow
Allow from all
BalancerMember http://localhost:9080/sessionTest route=netty1
BalancerMember http://localhost:9081/sessionTest route=netty2

SetHandler balancer-manager
Order deny,allow
Allow from all

We intend to start two servers with names netty1 and netty2 at ports 9080 and 9081, respectively. We have specified these URLs and their server instances in the BalancerMember configuration under Proxy tag. In the ProxyPass configuration, we have specified the stickysession as ‘jsessionid’, so that Apache would look for it in the HTTP request parameters or the cookies. We will pass the sessionid in the HTTP request parameter, as we had done earlier. We will choose the session id separator as ‘.’ Because that is the default separator recognized by Apache Web Server.

Now let us start the two server instances like so-

java apachetest.SessionTestServer netty1 . 9080
java apachetest.SessionTestServer netty2 . 9081

After that, let us start the Apache httpd

With this, the Apache Web Server will now act as a reverse-proxy as well as load balancer with sticky-sessions for the two server instances.

To test that, let us connect with the browser URL http://localhost/sessionTest?user=Archanaa
We are provided with a user session id –

Similarly, in a different browser window or tab, let us connect with a different user http://localhost/sessionTest?user=Ankur and we would get a new session id.

Now let us change the request to give the jsessionid parameter in the request instead of the user and see the results.

No matter how many times we connect with the same session id and in whatever order, we always get the response from the correct server showing that session stickiness is working properly.

Now let us stop one particular server instance, say netty1 and try to connect its user. Now if we fire a request from our browser with netty1’s session, it would be failed over to the next instance, that is netty2. Since we have not replicated our session, we get the response –

Meanwhile, the other user is able to connect as before successfully with netty2’s session.

Thus we see that our simple experiment has worked well.

There is no limit to how we can use JBoss Netty to build our own HTTP server. In the above example, we have kept the handler synchronous. But there is no restriction for saving the HttpRequest along with a unique asynchronous completion token in a map and doing asynchronous request processing in a different thread. Once the request processing is done, the request can be retrieved from the map and we can send the response as we have done in this simple example above. To add robustness, we can keep a time-to-live for requests and on timeout, remove the corresponding request and send a REQUEST_TIMEOUT response to the client.

With that I conclude this series and I hope I have been able to showcase a fine alternative to using Servlet API and Containers for building simple HTTP servers.

Sunday, July 11, 2010

Do we really need Servlet Containers always? Part 4

Other Features

Session Management, Replication and load-balancing with sticky sessions

The HttpSession provided by Servlet specification is, for most cases, an ideal context for storing user state and data in memory. However, this context can only be obtained from the reference of the request object.

Suppose you want to push frequent updates to the user from the server-side, this context would not be available without access to the request object. If such data is stored in the database, for every user request, the database would have to be accessed to check for and retrieve updates, leading to severe loss of performance and scalability. The ideal place to store this data is in memory. In the absence of access to the user’s HTTP request, you would look for some other means to store this data – possibly a distributable and replicable cache such as JBossCache or Terracotta. Frameworks such as Hibernate can be configured to use such caches for second level caching. This somewhat limits the usefulness of sessions to store user related data in memory. Of course, they still have their usefulness with respect to transparent access to cookies.

But the point of this discussion is anyway not restricted to the usual browser-based thin HTML GUI web applications, but rather with respect to other special cases that we have enumerated earlier in the premise. In fact, web-services exposed for client applications using either SOAP or REST APIs are designed to be as idempotent and stateless as possible. So, in such applications, one would not need HttpSessions anyway.

If you need session identifiers just for tracking that the connected device or client has been authenticated before, this can also be achieved simply by using custom logic for creating and storing sessions in memory and even replicating them across the cluster. It is also fairly trivial for the client application to use URL-rewriting or URL parameter or embed the session identifier in the application protocol itself to use this session identifier. Or, one may use JBossCache and Terracotta for this purpose as well. For a simple use like this, a custom solution may offer more freedom and developer control than using HttpSession.

This brings us to the discussion of load-balancing with a web-server reverse proxy using sticky sessions. Most application servers and containers provide plug-ins for major web-servers like Apache HTTPD. With these plug-ins the web-servers can do a reverse-proxy for application servers or containers and in case of fronting an entire cluster, they can identify the correct backend server to stick the user to. However, by using Apache HTTPD’s mod_proxy and mod_proxy_balancer, one may easily achieve these objectives for any custom HTTP server or engine. I have successfully done so myself and would soon be posting an article for the same.

JSP Support

For the scenarios under discussion, JSP support is clearly not required.

Declarative security (by adhering to JAAS Security Standards)

Servlet containers provide adherence to Java Authentication and Authorization Service (JAAS) Security standards. This feature is undeniably extremely useful for web-site based applications. However, as a user of JAAS myself, I have found it very difficult to understand and extend, especially when I did not have the option of using ready-made Servlet container JAAS plug-ins. For example, I once had to authenticate and authorize users from an Oracle Internet Directory credential repository, for which I had to make a custom JAAS module. Configuring custom JAAS modules is not standardized across containers and can be quite non-trivial.

JAAS is not the only option. For example, one may use Spring security (formerly known as Acegi). Especially for the case when you do not have to extend the authentication and authorization for container resources like EJB or use it in JSP scripting with Servlet request API like isUserInRole(..) or getUserPrincipal(..), it is better to implement a simple and custom security module rather than overly complicating the whole thing just to adhere to standards. Besides, as discussed in the case of Communications, Multi-threading and Request-Response lifecycle, you cannot be absolutely aware of how the Servlet Container of your choice actually operates beneath the layers and for all you know, it could be sub-optimal for your scenario.

Deployment contract

By definition, a container is one that would run the whole show- managing the life-cycle of your application and resources. It is the container that starts up and passes control to your application, not the other way round. While this helps in having a standard deployment and packaging contract for your application, it can be sometimes an unnecessary pain, especially for the scenarios enlisted in the premise. Containers may have elaborate steps to deploy and update applications and may take time to startup and shutdown due to the vast array of services that they have to gracefully start and stop. It is rather a welcome feature that some containers like Jetty or Tomcat can run as embedded engines, that is they can be started up by the application by using appropriate API.

There is one other aspect that becomes trivialized by the deployment contract of Application Servers and Containers. Typically Application Servers can deploy any number of web and other JavaEE applications within a single instance and all these applications can share the server resources. This is not necessarily a good thing because, after all, each instance is a single JVM or process and if it crashes, it can bring down all the applications hosted in it. Also, the applications need not always share the same application server resources and the sharing would in fact be detrimental, considering that, unless properly configured, all applications may be sharing the same thread-pool or JDBC connection pool and competing with each other for resources. These points are often overlooked and application developers forget that they have to plan how to deploy their components in different application server instances.

Standard HTTP request-response API

The Servlet Specification provides a standard API for HTTP server side applications, albeit with certain constraints and loss of freedom. If however, you choose to use a Server or Engine, they too would provide facilities to access HTTP headers, URL parameters, request body and so on. The lock-in to specific API has not been so much of a concern to users of Apache Commons HTTP Client, Spring, Hibernate, as long as the API suits the purpose to a T.

Saturday, July 10, 2010

Do we really need Servlet Containers always? Part 3

Request-Response Lifecycle and Multi-threading

Whenever a Servlet’s service()/doGet()/doPost() method is invoked, the container provides it an HTTP request and response object. What happens, when the Servlet method returns? As per the Servlet specification up to 2.5, the request and response objects can be recycled by the container and hence –

Each request/response object is valid only within the scope of a Servlet’s service method, or within the scope of a filter’s doFilter method. Containers commonly recycle request/response objects in order to avoid the performance overhead of request object creation. The developer must be aware that maintaining references to request/response objects outside the scope described above may lead to non-deterministic behavior.

The pooling of objects used to be a best practice to mitigate the performance overhead of early JVMs. But it is rendered less useful with modern JVMs and this approach now presents a serious constraint. The constraint is that you cannot save the reference to the request or response object and pass it on to a different thread context. As soon as the Servlet method returns, for all you know, the request and response objects would be recycled and reused for a completely different connection. The non-deterministic behavior, if you try to asynchronously handle the request and response, can be for example that the response you give may be delivered to a completely different client or else that the request you are trying to read may be suddenly over-written by some other client’s request. So you are then forced to do all the processing in the context of the container thread.

Is that a problem? Well, most of us have developed applications with a database backend. This can be a real bottleneck. Suppose, there is a database query which, even with maximum optimization, can only give results in 2 seconds. For those two seconds, the Servlet Container thread is blocked and cannot be used for handling another incoming request. Similarly other threads may also be similarly blocked waiting for their results and some other threads may be blocked waiting for JDBC Connections to be available to the pool. This, therefore, brings down the throughput (number of transactions per unit time that can be handled). So, what’s the solution? Increasing the thread pool size of the container to a 100? But can you also similarly increase the maximum number of connections you can have in the JDBC connection pool? So, there is no use increasing the thread pool of the Servlet Container when you have limited backend resources.

Moreover, you cannot use asynchronous techniques like passing pending requests to a JMS or simple Java queue, since the Servlet specification constraint implies that the moment the method call to the Servlet returns, it has lost the response object and therefore it cannot render the response to the user’s request in a different thread. You could possibly use a round-about technique of handing an Asynchronous completion token for the initial request and then use Ajax polling with the token to check if the corresponding response is ready. But this is rather like arm-twisting the application just to fit the Servlet Container constraints and would also result in lots of unnecessary pings from the client-side which would anyway bring down the actual throughput.

So, even if you have divided the application into presentation, business logic and data access layer, they become tightly coupled with respect to their dynamic execution and the slowness of one layer adversely impacts the throughput and scalability of another layer.

If you look at the CPU utilization of any particular server, it is bound to be under-utilized. If you increase the number of threads, the throughput does not increase but due to heavy context-switching, the CPU utilization would shoot up. What is the reason? Whenever there is any thread waiting for I/O to occur or receiving data from the database over a network call, it would be swapped for another thread which can do CPU based work. In the above case, nearly all threads that are working for the application would be blocked for more time than actually doing CPU related processing. This is not the best way of using the CPU resources.

To manage a high load with such architecture, you would have to replicate the application across several machines and do horizontal scaling. That would be good news for commercial application server and hardware vendors – you would have to multiply your hardware and licensing costs. It is bad news for application providers, unless they are already used to raising a large bill of material.

But for this constraint of the Servlet specification, these applications could have been designed in an asynchronous way. In contrast, look at the alternative of keeping the layers asynchronously decoupled as follows-

With the ability to store references to request and response references in memory, now one is free to loosely decouple the layers and queue up requests for processing and response data to be rendered.

The benefits can be profound. I have recently used the above approach of keeping the layers loosely coupled and found that I could attain a huge throughput from one machine itself whereas in the earlier approach I would have required multiple horizontally clustered machines and application servers. The reason, I believe, is that here we are decoupling the layers such that each layer has its own thread or thread pool concentrating on the specific responsibilities of that layer; rather than the earlier approach of making threads span through all three layers. The front layer concentrates on I/O and very little CPU processing; the business processing layer would typically have more of CPU-bound activities; whereas the Data Access/EAI layer would again have more of I/O related activities. This division enables the CPU to more optimally switch its attention to different layers. Moreover, one layer does not affect the lifecycle and processing of another layer. Thus the front layer is free to accept more and more HTTP requests and work on them.

To design such an application, though, one must be careful to appropriately time-out and clean up request and response objects, not to mention a more sophisticated programming and design with multiple threads to keep the layers loosely coupled.

In fact, possibly realizing this, Servlet 3.0 specification has sought to remedy this by introducing Asynchronous request processing. With Servlet 3.0, The HttpServletRequest has a new method startAsync() which returns an AsyncContext object that caches the request/response object pair. This AsyncContext object can be stored in a memory-repository and the servlet method may return immediately so that more requests can be handled. Additionally, the HttpServlet should be annotated with the asyncSupported attribute set to true so that the Servlet container does not commit the response when the Servlet method returns.

@WebServlet(name=”frontServlet”, urlPatterns={“/*.do”}, asyncSupported=true)
public class FrontServlet extends HttpServlet {
public void doGet(HttpServletRequest request, HttpServletResponse response) {
AsyncContext aCtx = request.startAsync(request, response);
Queue bizLayerQueue = BusinessLayer.getRequestQueue();
public class FrontResponseThread extends Thread {
Queue bizLayerResponseQueue = BusinessLayer.getResponseQueue();
public void run() {
while (true) {
AsyncContext asyncContext = bizLayerResponseQueue.take();
ServletRequest request = aCtx.getRequest();
//get parameters and render the response

Servlet 3.0 would help standardize the asynchronous request processing API for the JavaEE developer community. But it would take some time for Application Servers to be JavaEE 6 compliant with stable versions of their product. The good news is that most of the application servers already support Asynchronous Request Processing with non-standard APIs and at the risk of sacrificing portability, one may use these APIs till then.

  • Tomcat 6 provides a CometProcessor interface which can be implemented by Servlets.
  • Jetty 6 provides Continuations
  • WebLogic 9.2 provides AbstractAsynchServlet and FutureResponseServlet
  • Websphere version 7 provides Asynchronous Request Dispatcher API

But the fact remains that most of the frameworks built around this Servlet model and based on the Core J2EE Patterns, be they presentation frameworks like Struts, JSF, Spring MVC or bridge frameworks like Seam, is synchronous and they have not yet advocated any samples using Asynchronous Request Processing paradigm. And most of the JavaEE developer community is heavily influenced by reference architectures and blueprints like the Java Pet Store application and does not follow out-of-the-box approaches.

In my opinion, this has been the result of the mentality that Servlet Containers are the be-all and end-all solution for HTTP server side applications and restricting oneself within the confines of the Servlet framework.

Do we really need Servlet Containers always? Part 2

Communications and Multithreading Support

Application servers/Servlet containers do all the hard work of opening up server sockets to accept client connections and decoding and encoding HTTP requests and responses. But the internal multi-threading and communication strategy used and thereby the performance and scalability of servers may widely vary in this respect. What’s more, you are left to the mercy of the container and if one doesn’t know the internal details or read the fine print, one may choose a sub-optimal container for specific scenarios.
  • For example, earlier versions of some containers used a thread pool and allocated threads-per-client socket connection and recycled used threads once connections closed. Of course, this means that there is a cap to the number of clients that can be supported with this approach since after a few hundred threads, there can be severe context switching. This model has been abandoned for good reason in newer versions of the same containers. Most application servers and containers now provide the thread-per-request model where each HTTP request may be read and processed by a thread from a thread-pool and after processing of the request, the thread is recycled back. But this goes to show that not all application servers / containers are made equal.
  • The mechanism used by the server to read the request as part of the socket communication may also vary. Servers can adopt Native IO or Java NIO mechanisms to implement either the reactor pattern or the proactor pattern. The proactor pattern is more sophisticated, scalable and efficient than its reactor counterpart. True proactor implementations require extensive OS support and Native IO. Using Java NIO’s non-blocking read, a near-Proactor implementation can be achieved.
With respect to HTTP, a reactor implementation of a container/server is one where the following happens for reading requests-

In this case, whenever a client socket channel is opened at the server, a read-interest is registered at an event de-multiplexer. The event de-multiplexer thus similarly monitors read-interests for all the client socket channels. Whenever a read event occurs in any channel, it temporarily de-registers the read interest on the corresponding channel and sends a read event notification with the reference of that channel to a thread pool.

This thread-pool invokes the Servlet’s methods and it is in the context of these threads that the Servlet reads an HTTP request from the socket. This read is typically blocking but has a maximum read timeout. After the read and request processing is done, the read-interest is once again registered with the event de-multiplexer for the channel. The thread gets recycled for handling more such requests.
So in effect, it becomes the thread-pool’s responsibility to read from the socket.

Whereas, a proactor implementation of a container for HTTP may be as follows-

In this case, the event de-multiplexer part has more work to do. It maintains a buffer for each socket channel. Whenever a read-event occurs, it does a non-blocking read of the data from the OS and fills up whatever data it can from the socket channel into the buffer. As long as the complete request is not assembled, it continues the process. Once it identifies that a complete HTTP request has been assembled in memory, it sends the collected data from the buffer to be processed to the thread-pool. Meanwhile, it continues to read further data from the socket channel and filling up the buffer again. For reading up buffers, it can also delegate to an internal pool of worker threads.

As in the reactor approach, the thread-pool invokes the Servlet’s threads but the read is actually being done from an in-memory buffer rather than the client socket channel.

Let us compare the two approaches.

In the reactor variation, since the application thread pool is reading from the actual socket, one saves on the memory consumed. There is no buffering up of data except for the actual request data that will be used by the application. When one is concerned about the possible memory consumption due to unknown request sizes, this would be an optimal approach.

On the down-side, consider that the clients are sending data to the server over a fragile medium with low bandwidth like GPRS. I happen to have some experience developing server-side communicators for GPRS. GPRS actually uses unused time division multiple access (TDMA) slots of a GSM system to transfer data, for example, when there are gaps and pauses in speech. This means that the transfer of data can happen very slowly depending on the voice traffic and moreover, it would arrive in bursts rather than a complete packet. The time gap between bursts of data is arbitrary, depending on the availability of time slots. So, if the reading is done by the Servlet in the context of the thread-pool, a precious container thread gets blocked till more data can arrive on the socket. This can bring down the scalability of the application. So, for low bandwidth transportation mediums, such a server could prove sub-optimal.

In comparison, the proactor variation would have much better throughput and scalability for low-bandwidth transport mediums since the event de-multiplexer part would take care to buffer up data in memory and the thread-pool would read and process completed requests from in-memory buffers. On the down-side, this may be unnecessary when you are assured of excellent bandwidth on your LAN or WAN and would be sub-optimal if you have memory constraints. Of course, this would require very careful implementation of the container or server with stringent memory handling and timeouts and users of such containers may have to read the documentation very carefully to configure optimally. If not, it could even lead to memory leaks.

Most Application Servers/Servlet Containers actually use the reactor pattern, because most web-site based applications can assume good bandwidth from clients and keeping in view the hazards of inefficient memory management. Personally, I was hit by this behavior when I used Weblogic8.1 Application Server and its Servlet Container for handling HTTP POST data from GPRS based clients.

In general, for designing server-side applications which only use HTTP as a transport medium, I would recommend using a proactor pattern-based approach. To handle memory constraints, a proper timeout mechanism should be implemented to detect idling of clients and cleanup buffers.

Sunday, July 4, 2010

Do we really need Servlet Containers always? - Part 1

The Premise

Since the inception of the Servlet and JSP specifications, using Servlet Containers and the Servlet API has become the de-facto method for Java(EE) community to host dynamic web-site content or web-based services or generally for any HTTP server-side processing. These containers may be provided by full-blown JavaEE application servers like Weblogic, WebSphere, JBossAS, etc or by lighter-weight containers like Tomcat, Jetty and so on.
But is that really the best method always? This is a question I have lately asked in some forums or discussed with colleagues. In response, people are invariably taken aback - a large percentage of the community is not aware that there can be another alternative and even if such an alternative exists, they question the need for it. There is so much FUD (fear, uncertainty, doubt) associated with letting go of these Servlet containers. The root cause I believe is due to the following reasons-
  • People believe that Application Servers and their Servlet containers provide some great sophisticated and almost magical services that they can never do without for any serious production based application.
  • A very large community has generally built web-site based applications or has some web-site component for exposing their business to human consumers. If they are using and investing in Application Servers and/or Containers anyway, why not just use the Servlet container.

These are in fact, quite often valid reasons. But one must pause to think - is there no flip-side at all? Have we not lost out on anything at all due to this approach? And do the above reasons hold true for all HTTP server applications?

Not long ago, there was a time when people thought that EJB = J2EE and that every J2EE application must have an EJB container. In fact, I was one of them. Then one fine day in 2006 I came across a book - "Expert One-on-One J2EE development without EJB" (2004 Edition) by Rod Johnson and Juergen Hoeller, the creators of Spring Framework. That book completely changed the way I thought of JavaEE applications and I consider it a turning point in my professional career. There was an entire chapter - "EJB, Five Years On", which effectively dispelled all the myths I had ever entertained about J2EE and EJB. One by one, the chapter showed me that whatever services Stateless Session Beans (often the only EJBs used) provided- whether Declarative Transaction Management, Remoting, Clustering, Thread Management, Instance Pooling, Security and so on- there were alternatives and in fact, EJB was not the best answer to these concerns for all scenarios. 6 years after the publication of this path-breaking (in my eyes) book, the impact of Spring is for all to see. It revolutionized JavaEE development, the JavaEE6 specification is greatly inspired by it, a large part of the developer community has not looked at EJB since the arrival of Spring and it is understood that Spring shops are unlikely to move away from their favorite framework so that the community may now be split into Spring shops and JavaEE shops. (Read Adam Bien's weblog JAVA EE 6 IS NICE, BUT IT IS TOO LATE...).

I think the reason Spring framework and the book I mentioned above were so effective was that the creators of Spring tried to understand and discuss the philosophy behind the EJB2.1 and J2EE specification and then see whether there were any alternatives or better approaches. Not only that, there was a respect for the capability of the developer and architect community urging that middleware developers are not semi-competent to be unable to make decisions for their application requirements and they should not be deprived of control or ability to use alternatives.

In the same lines I would like to discuss about the need to use Servlet Containers for all HTTP server side processing. Let us keep our minds open and ask some real questions as to when and why we need Servlet containers and when can we do without them. An important pre-requisite is to have confidence and respect for our own competence as developers and architects to control the programming language and platform we have painstakingly learnt and developed our expertise in over several years.

An excellent book “Head First Servlets and JSP” explains the value of containers for server-side HTTP applications. It first challenges a Java developer to think of her responsibilities “if you had Java but no Servlets or Containers”. Then it expounds “What does the Container give you?” and lists the following –

  • Communications support (i.e. listening on server socket and HTTP protocol encoding and decoding as well as HTTPs support)
  • Lifecycle management (of the Servlet and the entire web application)
  • Multithreading support
  • Declarative security (by adhering to JAAS Security Standards)
  • JSP support

One may add the following as well to the above list -

  • standardized deployment contract in the form of war files and declarative deployment descriptor in the form of web.xml.
  • A standard API to represent HTTP requests and responses to extract data from HTTP requests (headers, URL parameters, request/response body.
  • Session Management
  • Pluggability with major web servers (like Apache HTTPD) for reverse proxy, load-balancing with session-stickiness and failover.

One must be thankful to the containers and the Servlet specification for the standardization and the above fundamental features offered which help developers concentrate on the business problem at hand. Without this support, the huge amount of ground-work to be done would be daunting to the best of us. Some of these features are just indispensable for web applications - no questions asked. I say this also because there are a whole lot of GUI MVC frameworks like Struts, JSF, Spring MVC which have been built around the Servlet specifications. To use these invaluable MVC frameworks for churning out your web-sites, you would of course require a Servlet container.

Then what exactly are we attempting to discuss about? First, as architects or designers, one must find the key quality attributes required for the business application and decide which of the Servlet Container facilities we absolutely need, which ones we can do without, which attributes are we prepared to trade-off for more important ones and whether there are alternatives. As a young JavaEE developer/designer/architect, I was also among those who thought that Servlet Containers are THE solution for developing Web/HTTP Server-side Applications. By not considering the special quality attributes and trying to twist Servlet Containers and other JavaEE platform offerings, I had blundered and what was worse, for a long time I didn’t even know that I could have done better by adopting a different paradigm. I have since become wiser.

Before proceeding, let me make it clear that if you need to make a web site for thin browsers with HTML for human users using JavaEE platform, I am of the opinion that Servlet Containers would most likely be required at least to a certain extent. Though of course, now with the emergence of RIA technologies like Flex, JavaFX, etc, there is a whole new paradigm for web-site development.

However, consider the following interesting scenarios where there is scope for exploring alternatives to Servlet Containers-

  • Say you need HTTP as a transport protocol. You have your own application level protocol in the form of some XML format or some binary protocol (custom or Google protocol buffers and so on). This application level protocol or format is just using HTTP as a means of transport instead of raw TCP/IP sockets. Typically this is done because HTTP is seen as firewall friendly.
  • Or else you are using the RESTful facilities provided by HTTP - GET, POST, PUT and DELETE to provide a simple web based service. Once again HTTP is providing a convenient protocol especially in the form of name-value URL parameters to specify your query string and formulate your request. But essentially it is being used as a protocol and a format, not for displaying an HTML GUI page.
  • Even for GUI applications, there has been a re-emergence of thick RIA clients enabled by Flex and JavaFX and Silverlight which take care of the GUI rendering entirely at the client side and seek business data from HTTP based server-side applications. Once again, we have the same premise - they don't need the server side to do any rendering for them, just to provide HTTP-based service and provide the data in some mutually agreed application level format embedded in HTTP request/response format.
  • One can also include the AJAX/Comet requests which are frequently sent by clients to servers and have to be responded to in minimal time with maximum throughput and also scalability for handling numerous such requests. Often the client side script is capable enough of rendering or just un-hiding certain portions of the page after applying the correct values.
  • Sometimes the medium between the client and server can be fragile like wireless (GPRS) and the transport may behave in strange ways - like sending chunks of data in packets and with very low bandwidth.

In my next posts, I will attempt to go under the hood and read between the lines to evaluate the Servlet Container features, for such scenarios.

Saturday, July 3, 2010

How JBoss Netty helped me build a highly scalable HTTP communication server for GPRS

It has been nearly 1 year and 3 months since I discovered JBoss Netty. We were building an end-to-end bus transportation solution and the server-side required a highly scalable
communication layer which could support concurrent transportation devices to the tune of 10000s and with a throughput of above 500 requests/sec posting transportation related data to the server over GPRS using HTTP.

In the first version of this solution wherein I was the architect of the communication server, I had already burnt my fingers using a commercial application server's servlet container and naively building upon a synchronous architecture which could not scale upto more than 500 devices at nearly 25 requests/sec! Of course, a lot more was wrong with it than just using a servlet container - the entire architecture was synchronous and there were other humungous mistakes, but that's another story.

This time we were making a radically different architecture under the guidance of a seasoned architect. As I had built the communication server for the first generation of the solution and had already learnt a lot from our experiences of making the solution live, I was determined to get it right this time. In keeping with the overall architecture of the solution and for supporting this high scalability and throughput, I decided to use an asynchronous HTTP engine based on NIO.
Why NIO? Well my fascination for NIO started in early 2006 when I came across the following articles published in O'Reilly

However, my project manager-cum-architect at the time (early 2006) was perturbed with the idea of trying out a 'new' and 'untested' architectural approach and as I had not much architecture experience at the time to counter the arguments, I gave in and used a regular Core-J2EE-patterns inspired 3-tier architecture utilizing a commercial application server's servlet container.

When the first generation of the complete solution was built, deployed and made live, I became more and more convinced that my first inclination to use NIO was justified.

Here were the reasons-

  • I was building a communication server which used HTTP as a means of transporting the actual application data. I was not putting up a web site for human users with an HTML GUI. As such I did not need lots of the facilities of the servlet containers and was in fact hampered by a few restrictions as I will elaborate below.
  • Pre-Servlet3.0 specifications only supports synchronous architecture and a thread-per-connection or thread-per-request model. Servlet3.0 specification has only recently been released and the majority of application servers are yet to comply with this specification. For more information, read the article Asynchronous processing support in Servlet 3.0. Our solution was built and deployed much before that. Of course, application servers also have non-servlet-standard APIs for asynchronous HTTP in their servlet engines (read article Asynchronous HTTP and Comet architectures). But I had no awareness of it at the time.
  • The transportation medium was GPRS for which bandwidth is quite low. Moreover, in developing countries, the quality of GPRS is not as consistent as Europe or United States. Furthermore, our devices were mounted on moving vehicles which could move at very high speeds changing cells and GPRS connection could be quite fragile. Under the hood, GPRS is a packet oriented service offered on 2G networks and as such manages to send only a few packets so long as communication channels are not used up by voice communications. That means the entire HTTP request would arrive in packets or chunks or bursts over GPRS. Often we would also get HTTP requests that were incomplete and would eventually time out. For example, if the GPRS device was trying to send 512 bytes our server would get only 78 bytes and then the request would time out. In the thread-per-request model, this would mean that one precious thread of the container would be blocked just trying to read the complete request and would eventually timeout. Depending on the read timeout value set in the application server, this could mean as much as 30 secs which is a huge amount of time. What we required was an engine which could do a non-blocking read and assemble the entire HTTP POST request in memory and only then call a servlet or a handler to do the rest of the processing. In other words, we needed the proactor pattern instead of the reactor pattern. And while lots of application servers use NIO or native IO, they internally use the reactor pattern!

The case for NIO was thus clear. But I still was having a hangover of the Core J2EE patterns and the numerous MVC frameworks built around Servlets for presentation tier. So I would probably have used NIO frameworks in the typical synchronous style. Then along came our seasoned architect who reviewed our solution and proposed a radically different - distributed and asynchronous architecture. I realized that if I used a synchronous approach with the reliability that data sent by the clients was getting successfully processed by the backend, I would have to block my thread till the backend processing of the data sent by the vehicular devices was done.

The other possibility would have been to use reliable JMS, but with the store-and-forward technique of reliable JMS, we would have probably sacrificed the desired high scalability and throughput for the reliability. Besides that was rather like arm-twisting the architecture to comply with servlet container constraints.

What I needed was an HTTP engine which would not impose the following restriction of pre-Servlet-3.0-

Each request object is valid only within the scope of a servlet’s service method, or within the scope of a filter’s doFilter method. Containers commonly recycle request objects in order to avoid the performance overhead of request object creation. The developer must be aware that maintaining references to request objects outside the scope described above may lead to non-deterministic behavior.

An engine that would allow me to do this-

That way, there would be no scalability bottleneck in accepting more requests from larger number of clients and reliability would have been sufficiently achieved as well.

Moreover I needed this in late 2008-early 2009 when the final Servlet 3.0 specification was not yet released and there was no reference architecture either.
Relying on my NIO experience, I proceeded to evaluate the following three options of NIO frameworks-

I had had experience building NIO TCP/IP servers with Apache MINA but the MINA releases were taking a long time and HTTP protocol was not directly supported. Yes, there had been an asyncweb project from safehaus built on MINA0.8, but the project had been migrated to Apache and was not developed further to be compatible with latest versions of MINA. It was a big risk using Apache MINA+Asyncweb.

Grizzly, I found, had too many options - plain NIO framework, Embeddable HTTP Web Server, Embeddable Comet WebServer, Embeddable Servlet container. I was simply baffled. It seemed a great learning curve and I had very little time. Moreover, this post from Grizzly's creator alerted me that Grizzly would by default, use the reactor pattern rather than proactor, which I was not comfortable with for my scenario.

In contrast, JBoss Netty's HTTP support was short and sweet - with no-frills. All I had to do was to use the ready-made HTTP Protocol encoder and decoder and attach my own HTTP request handler. After all, HTTP for my scenario was just a protocol over TCP/IP sockets carrying the application data in its body.

Within no time, I had my prototype ready and I was confident that I was on the right track. The performance test reports were extremely heartening. But could Netty fit in my scenario?

Turns out that it far exceeded expectations. That simple HTTP communication server I built using Netty could sustain the concurrent load of 10000 clients with the desired throughput of 500 requests/sec on a single desktop PC with 1 GB RAM and 1 CPU in our lab. And I am not kidding! It was celebration time for all of us. The memory utilization was well within the 812MB heap size we had granted to the JVM and the average CPU utilization was 78%. In other words, we were utilizing the full resources that the machine offered.

My reasoning for building a scalable HTTP communication for GPRS devices with NIO, asynchronous HTTP was thus vindicated by using JBoss Netty Framework. Thanks to Trustin Lee for making this splendid framework.