Archive for the 'Terracotta' Category

06th Mar 2008

InfoQ writes about my use of Terracotta Server as a message bus

Check out this article on InfoQ about using Terracotta Server as a message bus!

Posted by Posted by Mark Turansky under Filed under Architecture, Terracotta Comments No Comments »

21st Feb 2008

Some wheels need reinventing

Reinventing a square wheel is a common anti-pattern. The idea is a) we don’t need to reinvent the wheel because b) we’re likely to recreate it poorly compared to what is already available. But if we never reinvent any wheels, then we never progress beyond what we have. The real question, then, is when does it make sense to recreate a wheel? Some wheels need to be recreated.

I recently reinvented a wheel. A big one. The wheel is “Enterprise Messaging,” which much be complex because it has “enterprise” right in the name! I’d be a fool to reinvent that wheel, right? Maybe. Maybe not. We fit our “enterprise messaging system” into 92kb:

enterprise_messaging_in_92kb.jpg

Some won’t consider 92kb to be “enterprisey” enough, but that’s ok with me. I know we were able to put 1.3 million real-world messages through our bus over a weekend. That’s enterprisey.

Jonas Bonér wrote an article about building a POJO datagrid using Terracotta Server, and I replied on his blog saying we did something similar by using Terracotta Server as a message bus. Another reader asked why I did this instead of using JMS.

I think there are several benefits to this reinvented wheel:

TINY!

92kb contains the entire server framework. We have another jar containing common interfaces we share with client applications that weighs in at 18kb.

It works!

A single “consumer” in our framework is bootstrapped into an isolated classloader, which enables our framework to load applications (the various apps we need to integrate) into their own isolated classloaders. One consumer can process a message for any integrated application.

This is utility computing without expensive VMWare license fees.

We’re consolidating servers instead of giving each application dedicated hardware. The servers were mostly idle, anyway, which is why enterprises are looking to utility computing and virtualization to make more efficient use those spare CPU cycles. In our framework, hardware becomes generic processing power without the need for virtualizing servers. Scaling out the server farm benefits all applications equally, whereas the prior deployments required separate capital expenditures for each new server.

Pure POJO

Our framework runs inside an IDE without any server infrastructure at all. No ActiveMQ, no MySQL, and no Terracotta Server. Developers can stand up their own message bus in their IDE, post messages to it, and debug their message code right in the framework itself.

We introduce Terracotta Server as a configuration detail in a test environment. Configuration Management handles this part, leaving developers to focus on the business logic.

So, I might not be writing my own servlet container anytime soon (not when Tomcat and Jetty are open source and high quality), but I think it made a lot of sense to reinvent the “enterprise messaging” wheel. Terracotta Server allows me, in effect, to treat my network as one big JVM. My simple POJO model went wide as a configuration detail. That makes my bus (and TC) remarkably transparent.

Posted by Posted by Mark Turansky under Filed under Architecture, Engineering, Technology, Terracotta Comments 9 Comments »

13th Feb 2008

Code complete doesn’t mean you’re done

Joe Coder runs through his feature in the UI. It works. The doodad renders beautifully on the screen, and when he clicks the button, all the right things happen on the server. He checks his code in, writes a quick comment in Jira, changes the issue status to “Completed & Checked-in”, and goes to his next task. Lo and behold, his To Do list is empty! Joe’s done coding!

Or is he?

Configuration Management cuts a branch off the trunk code. Joe’s code goes through QA.

Some QA departments try to break developers’ code, others just test the “go path” to insure their test scripts work. In Joe’s case, the QA department has a test script that is a step-by-step use case for how the feature should work. QA signs off on his feature. The doodad worked exactly as specified, which is to say, the select/combo box was correctly filled with test data from the database.

Joe’s code goes to production … and takes down an entire server.

How can this be? It went through QA! Testers verified that his code did what it was supposed to do!

Most software organizations use the term robust wrongly. I generally hear it used in a context that implies the software has more features. Robustness has nothing to do with features or capabilities! Software products are robust because they are healthy, strong, and don’t fall over when the table that populates a select/combo box has a million rows in it.

Joe Coder never tested his feature with a large result set. Michael Nygard — in his excellent book Release It! from the Pragmatic Press — calls this problem an unbounded result set. It’s a common problem and an easy trap to fall into.

Writing code for a small result set enables rapid feedback for a developer. This is important, because the developer’s first task is to write his program correctly. It seems, though, that this first step is oftentimes the only step in the process! Without testing the program against a large result set, the new feature is a performance problem waiting to happen. The most common consequences are memory errors (have you ever tried filling a combo/select box with a million rows from the database?) or unscalable algorithms (see Schlemiel the Painter’s Algorithm).

In our case, testing with big numbers revealed concurrency issues that we did not and could not find when developing with simple, smaller tests. Our multi-threaded, multi-node messaging system would routinely deadlock whenever we slammed it with lots and lots of messages. It didn’t do this when we posted simple messages to the endpoint during development. Similarly, we noticed that we held on to objects for too long during big batch runs. They all got garbage collected when the batch completed, so it wasn’t exactly a memory leak, but there was no need to hold on to the references. After we fixed that, we noticed that our servers would stay within a tight and predictable memory band, as opposed to GC’ing all at once at the end of a batch. Terracotta server expertly pages large heaps to disk, so we were never at risk of running out of memory. Still, it’s nice to see our JVMs running lean and mean.

We’re still stress testing our system today. This past weekend, we pumped over one million real-world messages through our message bus. Our concurrency issues are gone, memory usage is predictable, and we stopped our test only to let our QA department have at our servers. There are zero technical reasons why our test couldn’t have processed another million messages. Terracotta Server held up strong throughout.

But we’re still not done testing yet. We still have to see what happens if we pull the plug (literally) on our message bus. We still have to throw poorly written messages/jobs at our bus to see how they’re handled. We need to validate that we’ve got enough business visibility into our messaging system so that operations folks have enough info at runtime to do their jobs. We need canaries in the coal mine to inform us when things are going wrong, and for that we need better metrics and reporting built into our system that shows performance over time.

We’re code complete, but we’re not done yet.

Posted by Posted by Mark Turansky under Filed under Engineering, Terracotta Comments 2 Comments »