App Server Architecture

10 PM February 11, 2004

I’m going to build a web application—a replacement for my Movable Type blog. I have a rough set of requirements and some vague plans for using Twisted. The next step is to choose an architecture.

The reliability and availability requirements for my blog are very light. There is no need for fail-over or even warm-backup. Likewise, a single machine ought to be able to handle the load from one obscure blog, so a rack of load-balanced quad-SPARCs is not required.

The software side is more interesting. Three basic approaches present themselves:

1. 100% Dynamic

For sites with highly dynamic content, every page served must be dynamically generated, each time it is to be served. Most Servlet/JSP/J2EE applications work in this manner, as do Wikis.

While this approach offers the ultimate in flexibility and ensures that each page is served from the very latest data, it is costly in terms of computing resources. Serving a high-volume application this way usually requires distributing the application across multiple machines.

A common variation on this architecture is to place the application server ‘behind’ a web server. In this configuration, the web server handles requests for static content such as images and stylesheets, leaving the application server to serve HTML.
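As a rough illustration of the 100% dynamic approach and its common variation, here is a minimal Python sketch. The names `POSTS`, `STATIC_PREFIXES`, `served_by_web_server` and `render_post` are made up for this example, and the in-memory dictionary merely stands in for whatever data store the application really uses.

```python
from html import escape

# Hypothetical data store; a real application would query a database here.
POSTS = {"hello": "First post"}

# Assumed URL prefixes for static content served by the front web server.
STATIC_PREFIXES = ("/images/", "/styles/")


def served_by_web_server(path):
    """True if the front web server should answer without the application."""
    return path.startswith(STATIC_PREFIXES)


def render_post(slug):
    """Build the page from current data on every call; nothing is kept."""
    body = POSTS.get(slug)
    if body is None:
        return 404, "<h1>Not found</h1>"
    return 200, "<h1>{}</h1><p>{}</p>".format(escape(slug), escape(body))
```

Because `render_post` reads the data store on every call, an edit to a post shows up on the very next request, at the cost of doing the rendering work each time.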

2. Regenerated Static Pages

Another approach is to have an essentially static site, served by a web server, with pages regenerated as required. Movable Type is a good example of this approach.

The web server can handle the majority of requests directly from the filesystem (big red arrow). When the web server receives a request to update content (smaller, purple arrow), it invokes the application code. The application code updates its private data store, then regenerates the pages that have been modified.

This approach generally results in lighter use of server resources than the 100% dynamic approach, though the page regeneration process can be costly.

To make the page regeneration process as efficient as possible, the application regenerates only those pages affected by a change. If the application cannot determine precisely which pages are affected, it must update every page. The mechanism to determine exactly which pages are affected by a change is potentially complex to implement.
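The selective regeneration described above might look something like the following Python sketch. It assumes a deliberately simple site in which a post appears only on its own page and on the index; the class `StaticSite` and its methods are hypothetical, and the `pages` dictionary stands in for files written to disk. This is not how Movable Type does it.

```python
from html import escape


class StaticSite:
    """Toy regenerated-static site: only affected pages are rebuilt."""

    def __init__(self):
        self.posts = {}     # post id -> body text (the private data store)
        self.pages = {}     # path -> HTML; stands in for files on disk
        self.rendered = []  # log of every path rebuilt, in order

    def affected_pages(self, post_id):
        # Assumption: a post appears on the index and on its own page only.
        return ["/index.html", "/posts/{}.html".format(post_id)]

    def render(self, path):
        if path == "/index.html":
            items = "".join(
                "<li>{}</li>".format(escape(pid)) for pid in sorted(self.posts)
            )
            return "<ul>{}</ul>".format(items)
        post_id = path[len("/posts/"):-len(".html")]
        return "<p>{}</p>".format(escape(self.posts[post_id]))

    def update_post(self, post_id, body):
        """Update the data store, then rebuild just the affected pages."""
        self.posts[post_id] = body
        for path in self.affected_pages(post_id):
            self.pages[path] = self.render(path)
            self.rendered.append(path)
```

The hard part in a real application is `affected_pages`: once posts also appear in category pages, archives and feeds, working out the exact set becomes much more involved, and the fallback is to rebuild everything.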

3. Cache

A third approach is to serve the application dynamically, but cache the most frequently requested content. The cache can be invalidated either by the application – when the application determines that the content for that page has changed – or on a timer.

The aim is that most requests (big red arrow) can be served from cache. Requests for resources that are not, or cannot be, cached are passed through to the application server.

Compared with a 100% dynamic configuration, this approach is more complicated but uses fewer resources, meaning that more users can be supported on a single machine. Compared with Regenerated Static Pages, caching is more flexible in terms of the kinds of application it can support.
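To make the two invalidation mechanisms concrete, here is a small Python sketch of a page cache sitting in front of a render function. The class `PageCache`, its parameters and the 60-second default are assumptions chosen for illustration, not part of any particular server.

```python
import time


class PageCache:
    """Serve pages from cache; fall through to the application on a miss."""

    def __init__(self, render, ttl_seconds=60.0, clock=time.monotonic):
        self.render = render      # callable: path -> HTML (the application)
        self.ttl = ttl_seconds    # timer-based invalidation
        self.clock = clock        # injectable so the timer can be tested
        self.entries = {}         # path -> (expiry time, html)

    def get(self, path):
        now = self.clock()
        entry = self.entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]       # cache hit: the application is not called
        html = self.render(path)  # miss or expired: ask the application
        self.entries[path] = (now + self.ttl, html)
        return html

    def invalidate(self, path):
        """Called by the application when it knows a page has changed."""
        self.entries.pop(path, None)
```

The application would call `invalidate` whenever it changes the content behind a page, while the timer bounds how stale any page can get if an invalidation is missed.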


For my blog replacement, I decided to use the 100% dynamic approach, because it gives me the most flexibility, and load is not a big issue.

Further reading: Roy Fielding’s Architectural Styles and the Design of Network-based Software Architectures.

By alang | # | Comments (2)
(Posted to Software Development and javablogs)

Comments

At 04:33, 12 Feb 2004 Ian Bicking wrote:

Static publishing ("Regenerated Static Pages") can also be made simpler by using something like server-side includes. Then page dependencies can be represented directly on-disk, and a much smaller number of pages have to be regenerated. With a well-defined set of reusable content, identifying needed updates shouldn't be too hard.

Personally I worry less about load than reliability, though maybe neither is really an issue for you at this point. There's also several flavors of server under the dynamic approach, like Twisted (async), Webware (threaded) and SkunkWeb (multi-process), which isn't as big an architectural choice, but are worth some thought. (Threaded and multi-process aren't that different from each other, but async is)

At 17:09, 13 Feb 2004 Charles Miller wrote:

One thing to note is that if you pick a good URL scheme, you can dance blithely between the three options.


© 2003-2006 Alan Green