Saturday, July 10, 2010

Do we really need Servlet Containers always? Part 3

Request-Response Lifecycle and Multi-threading

Whenever a Servlet’s service()/doGet()/doPost() method is invoked, the container passes it an HTTP request object and a response object. What happens when the Servlet method returns? Per the Servlet specification up to version 2.5, the container may recycle the request and response objects. In the words of the specification:

Each request/response object is valid only within the scope of a Servlet’s service method, or within the scope of a filter’s doFilter method. Containers commonly recycle request/response objects in order to avoid the performance overhead of request object creation. The developer must be aware that maintaining references to request/response objects outside the scope described above may lead to non-deterministic behavior.

Pooling objects used to be a best practice for mitigating the performance overhead of early JVMs. It is far less useful with modern JVMs, and the approach now imposes a serious constraint: you cannot save a reference to the request or response object and pass it on to a different thread. As soon as the Servlet method returns, for all you know, the request and response objects may be recycled and reused for a completely different connection. If you try to handle the request and response asynchronously, the non-deterministic behavior could mean, for example, that your response is delivered to a completely different client, or that the request you are reading is suddenly overwritten by some other client’s request. So you are forced to do all the processing on the container thread.
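The hazard can be reproduced without a container. What follows is only a simulation under stated assumptions, not real container code: PooledRequest and RecyclingDemo are hypothetical names, and the "container" is simply a loop that reuses one object for two clients while each handler defers its work past the point where the Servlet method would have returned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for a request object that a container recycles. */
class PooledRequest {
    private String clientId;
    void recycleFor(String newClientId) { this.clientId = newClientId; }
    String getClientId() { return clientId; }
}

class RecyclingDemo {
    /** Serves two clients with ONE recycled object; each handler defers its work. */
    static List<String> serveTwoClientsWithDeferredWork() {
        PooledRequest pooled = new PooledRequest();
        List<Supplier<String>> deferred = new ArrayList<>();

        for (String client : new String[] {"client-A", "client-B"}) {
            pooled.recycleFor(client);            // container reuses the same object
            final PooledRequest saved = pooled;   // handler keeps a reference to it
            deferred.add(() -> "reply intended for " + client
                    + " sees request of " + saved.getClientId());
            // the servlet method "returns" here and the object goes back to the pool
        }

        List<String> results = new ArrayList<>();
        for (Supplier<String> work : deferred) {
            results.add(work.get());              // deferred work runs after recycling
        }
        return results;
    }

    public static void main(String[] args) {
        serveTwoClientsWithDeferredWork().forEach(System.out::println);
    }
}
```

The first line printed is "reply intended for client-A sees request of client-B": the work deferred on behalf of client A reads data that by then belongs to client B.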

Is that a problem? Well, most of us have developed applications with a database backend, and the database can be a real bottleneck. Suppose there is a query which, even after maximum optimization, takes 2 seconds to return results. For those two seconds, the Servlet Container thread is blocked and cannot handle another incoming request. Other threads may be blocked in the same way waiting for their results, and still others may be blocked waiting for a JDBC connection to become available in the pool. This brings down the throughput (the number of transactions handled per unit time). So what’s the solution? Increasing the thread pool size of the container to 100? But can you similarly increase the maximum number of connections in the JDBC connection pool? There is no use increasing the thread pool of the Servlet Container when backend resources are limited.
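A back-of-the-envelope calculation makes the ceiling concrete. If every request holds one container thread and one JDBC connection for the whole duration of the query, the number of queries in flight is at most the smaller of the two pool sizes, and throughput is that number divided by the query time. The sketch below assumes a pool of 20 JDBC connections purely for illustration; the class and method names are hypothetical.

```java
class BlockingThroughput {
    /**
     * Upper bound on requests per second when each request blocks for
     * queryMillis while holding one container thread and one JDBC connection.
     */
    static double maxRequestsPerSecond(int containerThreads, int jdbcConnections, long queryMillis) {
        int queriesInFlight = Math.min(containerThreads, jdbcConnections);
        return queriesInFlight * 1000.0 / queryMillis;
    }

    public static void main(String[] args) {
        // 2-second query, 20 connections (assumed), growing thread pool
        System.out.println(maxRequestsPerSecond(10, 20, 2000));   // 5.0
        System.out.println(maxRequestsPerSecond(25, 20, 2000));   // 10.0
        System.out.println(maxRequestsPerSecond(100, 20, 2000));  // 10.0
    }
}
```

Going from 25 to 100 threads changes nothing: the bound stays at 10 requests per second because the connection pool, not the thread pool, is the limit.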

Moreover, you cannot use asynchronous techniques such as passing pending requests to a JMS queue or a simple Java queue. The Servlet specification constraint implies that the moment the Servlet method returns, the application has lost the response object and so cannot render the response to the user’s request from a different thread. You could use a roundabout technique: hand back an asynchronous completion token for the initial request, then use Ajax polling with the token to check whether the corresponding response is ready. But this is rather like arm-twisting the application just to fit the Servlet Container constraints, and it results in lots of unnecessary pings from the client side, which in turn bring down the actual throughput.
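For reference, the token workaround could look roughly like the following. This is a sketch under assumptions, not a recommended design: PollingTokenRegistry is a hypothetical name, and a CompletableFuture stands in for whatever background job produces the result.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class PollingTokenRegistry {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Initial request: register the background work and return a token at once. */
    String submit(CompletableFuture<String> work) {
        String token = UUID.randomUUID().toString();
        pending.put(token, work);
        return token;
    }

    /** Each Ajax poll: empty while the work is unfinished or the token is unknown. */
    Optional<String> poll(String token) {
        CompletableFuture<String> work = pending.get(token);
        if (work == null || !work.isDone()) {
            return Optional.empty();
        }
        pending.remove(token, work);        // forget the token once it is answered
        return Optional.of(work.join());    // join() rethrows if the work failed
    }

    public static void main(String[] args) {
        PollingTokenRegistry registry = new PollingTokenRegistry();
        CompletableFuture<String> slowQuery = new CompletableFuture<>();
        String token = registry.submit(slowQuery);
        System.out.println(registry.poll(token).isPresent());  // false: a wasted round trip
        slowQuery.complete("42 rows");
        System.out.println(registry.poll(token).get());        // 42 rows
    }
}
```

Every poll that arrives before the result is ready is a full HTTP round trip that does no useful work, which is the overhead described above.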

So, even if you have divided the application into presentation, business logic and data access layer, they become tightly coupled with respect to their dynamic execution and the slowness of one layer adversely impacts the throughput and scalability of another layer.

If you look at the CPU utilization of a server running such an application, it is likely to be low. If you increase the number of threads, the throughput does not increase, but heavy context switching makes the CPU utilization shoot up. What is the reason? Whenever a thread is waiting for I/O or receiving data from the database over the network, it is swapped out for another thread that can do CPU-based work. In the above case, nearly all the application’s threads spend more time blocked than doing actual CPU processing. This is not the best way of using CPU resources.

To manage a high load with such an architecture, you would have to replicate the application across several machines and scale horizontally. That would be good news for commercial application server and hardware vendors, since you would have to multiply your hardware and licensing costs. It is bad news for application providers, unless they are already used to raising a large bill of materials.

But for this constraint of the Servlet specification, these applications could have been designed in an asynchronous way. In contrast, look at the alternative of keeping the layers asynchronously decoupled:

With the ability to keep references to request and response objects in memory, one is now free to decouple the layers, queuing up requests for processing and response data for rendering.

The benefits can be profound. I recently used the above approach of keeping the layers loosely coupled and found that I could attain a huge throughput from a single machine, whereas with the earlier approach I would have needed multiple horizontally clustered machines and application servers. The reason, I believe, is that each layer now has its own thread or thread pool concentrating on that layer’s specific responsibilities, instead of threads that span all three layers. The front layer concentrates on I/O and does very little CPU processing; the business processing layer typically has more CPU-bound activity; the Data Access/EAI layer again has more I/O-related activity. This division lets the CPU switch its attention between layers more efficiently. Moreover, one layer does not affect the lifecycle and processing of another. The front layer is thus free to keep accepting HTTP requests and working on them.
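The shape of such a design can be sketched with the JDK alone. This is a minimal illustration under assumptions, not the application described above: LayeredPipeline and its pool sizes are hypothetical, each executor’s internal work queue stands in for an explicit queue between layers, and the business and data access steps are stubs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LayeredPipeline {
    // One pool per layer; each executor's queue decouples it from its neighbours.
    private final ExecutorService businessPool = Executors.newFixedThreadPool(2);
    private final ExecutorService dataAccessPool = Executors.newFixedThreadPool(4);
    private final ExecutorService responsePool = Executors.newSingleThreadExecutor();

    /** Called by the front layer; returns at once with a handle to the eventual response. */
    CompletableFuture<String> accept(String requestParam) {
        return CompletableFuture
                .supplyAsync(() -> requestParam.trim(), businessPool)              // CPU-bound step (stub)
                .thenApplyAsync(key -> "row-for-" + key, dataAccessPool)           // I/O-bound step (stub)
                .thenApplyAsync(row -> "<html>" + row + "</html>", responsePool);  // rendering
    }

    void shutdown() {
        businessPool.shutdown();
        dataAccessPool.shutdown();
        responsePool.shutdown();
    }

    public static void main(String[] args) {
        LayeredPipeline pipeline = new LayeredPipeline();
        CompletableFuture<String> pending = pipeline.accept(" order-7 ");
        // The caller is free to accept more requests here; join() is only for the demo.
        System.out.println(pending.join());   // <html>row-for-order-7</html>
        pipeline.shutdown();
    }
}
```

No thread in this sketch follows a request through all three layers. A slow data access step occupies only a thread of the data access pool, while the thread that called accept() has already moved on.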

To design such an application, though, one must be careful to appropriately time-out and clean up request and response objects, not to mention a more sophisticated programming and design with multiple threads to keep the layers loosely coupled.
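One way to handle the time-out and clean-up is a registry of parked exchanges with deadlines, swept periodically. The sketch below is an assumption about how this could be done, not a description of any container’s behavior: PendingExchanges is a hypothetical class, the exchange type T would be whatever object pairs the request with its response, and the clock is passed in explicitly so the logic is easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

class PendingExchanges<T> {
    private static final class Parked<T> {
        final T exchange;
        final long deadlineMillis;
        Parked(T exchange, long deadlineMillis) {
            this.exchange = exchange;
            this.deadlineMillis = deadlineMillis;
        }
    }

    private final Map<String, Parked<T>> pending = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    PendingExchanges(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    /** Front layer: remember the exchange until a result arrives or it times out. */
    void park(String id, T exchange, long nowMillis) {
        pending.put(id, new Parked<>(exchange, nowMillis + timeoutMillis));
    }

    /** Response layer: take ownership of the exchange, or null if it is already gone. */
    T claim(String id) {
        Parked<T> parked = pending.remove(id);
        return parked == null ? null : parked.exchange;
    }

    /** Sweeper: remove overdue exchanges and hand each one to onTimeout exactly once. */
    int expire(long nowMillis, Consumer<T> onTimeout) {
        int expired = 0;
        for (Map.Entry<String, Parked<T>> e : pending.entrySet()) {
            Parked<T> parked = e.getValue();
            // remove(key, value) succeeds for only one caller, so claim() and expire() cannot both win
            if (parked.deadlineMillis <= nowMillis && pending.remove(e.getKey(), parked)) {
                onTimeout.accept(parked.exchange);   // e.g. send an error response
                expired++;
            }
        }
        return expired;
    }

    int size() { return pending.size(); }
}
```

In a running application, expire(System.currentTimeMillis(), ...) could be called from a ScheduledExecutorService at a fixed rate. A late result whose claim() returns null is simply dropped, because the client has already been answered with a time-out.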

In fact, possibly realizing this, the Servlet 3.0 specification has sought to remedy the problem by introducing asynchronous request processing. With Servlet 3.0, HttpServletRequest has a new method, startAsync(), which returns an AsyncContext object that holds the request/response object pair. Calling startAsync() tells the container not to commit the response when the Servlet method returns, so the AsyncContext can be stored in an in-memory repository and the Servlet method can return immediately, freeing the thread to handle more requests. The Servlet must also be declared with asyncSupported set to true in its @WebServlet annotation; otherwise startAsync() throws an IllegalStateException. Once another thread has written the response, it calls complete() on the AsyncContext.

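import java.util.concurrent.BlockingQueue;
import javax.servlet.*;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.*;
@WebServlet(name = "frontServlet", urlPatterns = {"*.do"}, asyncSupported = true)
public class FrontServlet extends HttpServlet {
    public void doGet(HttpServletRequest request, HttpServletResponse response) {
        AsyncContext aCtx = request.startAsync(request, response);
        BusinessLayer.getRequestQueue().add(aCtx); // queue it for the business layer and return at once
    }
}
class FrontResponseThread extends Thread {
    // BusinessLayer is the application's own class (not shown); its queues are assumed to be BlockingQueue<AsyncContext>
    BlockingQueue<AsyncContext> bizLayerResponseQueue = BusinessLayer.getResponseQueue();
    public void run() {
        try {
            while (true) {
                AsyncContext asyncContext = bizLayerResponseQueue.take(); // blocks until a result is ready
                ServletRequest request = asyncContext.getRequest();
                // get parameters and render the response
                asyncContext.complete(); // tell the container the response is finished
            }
        } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}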

Servlet 3.0 would help standardize the asynchronous request processing API for the JavaEE developer community. But it would take some time for Application Servers to be JavaEE 6 compliant with stable versions of their product. The good news is that most of the application servers already support Asynchronous Request Processing with non-standard APIs and at the risk of sacrificing portability, one may use these APIs till then.

  • Tomcat 6 provides a CometProcessor interface which can be implemented by Servlets.
  • Jetty 6 provides Continuations.
  • WebLogic 9.2 provides AbstractAsyncServlet and FutureResponseServlet.
  • WebSphere version 7 provides the Asynchronous Request Dispatcher API.

But the fact remains that most of the frameworks built around this Servlet model and based on the Core J2EE Patterns, be they presentation frameworks like Struts, JSF and Spring MVC or bridge frameworks like Seam, are synchronous, and they have not yet published any samples that use the asynchronous request processing paradigm. And most of the JavaEE developer community is heavily influenced by reference architectures and blueprints like the Java Pet Store application and does not venture into out-of-the-box approaches.

In my opinion, this is the result of treating Servlet Containers as the be-all and end-all of HTTP server-side applications and of restricting oneself to the confines of the Servlet framework.

1 comment:

  1. Wow. Great post!
    Please continue this topic - go deeper with presentation frameworks you've mentioned like JSF. It's interesting if modern web servers, servlet containers implement proactor pattern - compilation would be great!