Archive for the 'Architecture' Category
25th Jul 2008
HOW TO: Use mini-batching to improve grid performance
We achieved a 3.5X increase in throughput by implementing “mini-batching” in our grid-enabled jobs.
We have a parent BatchService that creates child Services where each individual Service is a unit of work. A Service implementation might perform some calculation for a single employee of a large employer group. When the individual Services are very fast and the cost of bussing them around the network is greater than the cost of processing the Service, then adding more consumers makes the BatchService run slower! It is slower because these fine grained units of work require more queue locks, more network traffic, and more handling calls when the child Service is returned back to the parent BatchService for accumulation.
The secret, then, is to give each consumer enough work to make the overhead of bussing negligible. That is, give each consumer a “mini-batch” of Services to run instead of sending just one Service to a consumer.
Here’s a graph of some of our benchmarks:
Some of the data surprised us. For example, we expected 3 big batches to run fairly slowly across 11 consumers because there would be 8 consumers sitting idle, but we were not expecting 11 batches to run more slowly than 43 batches. We thought dividing the work equally across consumers in the exact number of batches would be the lowest point on the graph. We were wrong. We expected the U-shape, but we thought the trough would be at a different batch size.
Our test system can only support up to 11 consumers, so we haven’t yet tested batch sizes with more than 11, but the graph implies that we’ll have a deeper trough when we add consumers and tweak the batch size. There should be, in theory, a point where we can’t process jobs any faster due without killing the database. I’ve warned our DBAs that we’re looking to hit that point.
If you’re doing any kind of grid computing (by way of Terracotta’s Master-Worker project, GridGain, or rolling your own), check out the effects mini-batching can have on your throughput. You might be surprised by your benchmarking metrics!
We achieved a 3.5X increase in throughput by implementing “mini-batching” in our grid-enabled jobs.
We have a parent BatchService that creates child Services where each individual Service is a unit of work. A Service implementation might perform some calculation for a single employee of a large employer group. When the individual Services are very fast and the cost of bussing them around the network is greater than the cost of processing the Service, then adding more consumers makes the BatchService run slower! It is slower because these fine grained units of work require more queue locks, more network traffic, and more handling calls when the child Service is returned back to the parent BatchService for accumulation.
The secret, then, is to give each consumer enough work to make the overhead of bussing negligible. That is, give each consumer a “mini-batch” of Services to run instead of sending just one Service to a consumer.
Here’s a graph of some of our benchmarks:
Some of the data surprised us. For example, we expected 3 big batches to run fairly slowly across 11 consumers because there would be 8 consumers sitting idle, but we were not expecting 11 batches to run more slowly than 43 batches. We thought dividing the work equally across consumers in the exact number of batches would be the lowest point on the graph. We were wrong. We expected the U-shape, but we thought the trough would be at a different batch size.
Our test system can only support up to 11 consumers, so we haven’t yet tested batch sizes with more than 11, but the graph implies that we’ll have a deeper trough when we add consumers and tweak the batch size. There should be, in theory, a point where we can’t process jobs any faster due without killing the database. I’ve warned our DBAs that we’re looking to hit that point.
If you’re doing any kind of grid computing (by way of Terracotta’s Master-Worker project, GridGain, or rolling your own), check out the effects mini-batching can have on your throughput. You might be surprised by your benchmarking metrics!
Posted by Mark Turansky under
Architecture, Code Hints, Engineering
2 Comments »
06th May 2008
More proof that you can’t keep a good idea down?
In this blog article, Michael Nygard discusses a talk he attended where a technical architect discussed an SOA framework at FIDUCIA IT AG, a company in the financial services industry. Nygard describes an architecture that echoes many of the features I implicitly spoke of in my first blog article about my big integration project / message bus.
You may be asking yourself right now, why does he keep talking about this particular project? Briefly: it’s been a very fun project, it’s ongoing, it consumes most of my daily brain cycles, we’re still growing it (it’s a brand new infrastructure for us), and it encompasses a whole lot of ideas that I thought were good and that are now being validated by other projects I read about online.
So, what other unsung features did we build in that I’ll now sing about?
Asynchronous Messaging
You’ll notice the Spooler component in the original broad sketch of our architecture. The high-level description I gave the Spooler touched on callbacks. Asynchronous messaging was left unsaid, but it is implied by having a mechanism for callbacks.
The description also labeled my Spooler an endpoint, which may be a web service endpoint. We use web services only because the Enterprise Service Bus (ESB) orchestrating work on our bus is .NET-based while our project is all Java. That said, we post Plain Ol’ XML (POX) over HTTP, which is deserialized quickly to a Java POJO. Our entire messaging system works on POJOs, not XML.
The outside world may use SOAP (or XML-RPC or flat files or whatever) when communicating with my company, but internally our ESB talks POX with the bus. Mediation and transformation (from SOAP –> POX) is part of the functionality of an ESB. Consumers, internally to our bus, would directly access queues instead of using web services.
Pure POJOs, but distributed
It’s extremely productive and useful to work with a pure POJO model, and it’s even more productive and useful when the state of those POJOs is automagically kept in sync across the cluster regardless of what node is working on it. This is where Terracotta Server shines.
We pass POJOs around through all the queues. Consumers — which can exist anywhere on the network — process the Service/Job/Message (all interchangeable terms, as far as I am concerned — they are all units of work). Our messages are stateful, meaning they enter our bus empty except for parameters in instance variables, get routed around to various and sundry consumers across the network, and get posted back (the callback) full of data to the ESB.
Why do we need distributed POJOs? Well, we found it to be highly useful. For example, we offer a REST API to abort a pending message (such as http://ourendpoint/message/abort/abcdefg-the-guid-wxyz). The easiest way we found to tell the entire bus to disregard this message was to flip the bit on the message itself. The endpoint is running under Terracotta Server, all of the queues live in TC, and our consumers are likewise plugged in. If you stick all your messages in a Map (or series of maps if you’re worried about hashing, locking, and high volumes) where the GUID is the key and the value is the message, then the endpoint or any consumer can quickly obtain the reference to the message itself and alter its state. We can also write programs that hook into TC temporarily to inspect or modify the state of the system. Persistent memory is cool like that. It exists outside the runtime duration of the ephemeral program.
The endpoint likewise has REST APIs for returning the state of the bus, queues sizes, current activity, and other metrics. All of this data is collected from the POJOs themselves, because the endpoint has access to the very object instances that are running all over the network. It just so happens this architecture works wonderfully inside a single JVM, too, without TC, for easier development and debugging.
Load balancing and routers
Straight from Michael Nygard’s article:
Third, they’ve build a multi-layered middle tier. Incoming requests first hit a pair of “Central Process Servers” which inspect the request. Requests are dispatched to individual “portals” based on their customer ID.
In other words, they have endpoints behind load balancers (we use Pound) and “dispatched” is another word for “routed.” We have content based routers (a common and useful Enterprise integration Pattern for messaging systems) that route messages/services/jobs of specific types to certain queues. Our consumers are not homogenous. We’ve configured different applications (the integration aspects of our project) to listen on different queues. This saved us from having to port applications off the servers where they were previously deployed. These apps are several years old. Porting would have taken time and money. Allowing messages to flow to them where they already exist was a big win for us.
More to come
I’ve got the outline for my white paper complete, where I bulleted the features above as well as those in my previous blog article. There are other features I haven’t covered yet. Overall, I think it will be an interesting paper to read.
Still, I’m a little jealous, though, that FIDUCIA IT AG has scaled out to 1,000 nodes in their system. I can’t say how many nodes we’re up to, but I can say I’m looking forward to the massive scalability that our new architecture will give us.
In this blog article, Michael Nygard discusses a talk he attended where a technical architect discussed an SOA framework at FIDUCIA IT AG, a company in the financial services industry. Nygard describes an architecture that echoes many of the features I implicitly spoke of in my first blog article about my big integration project / message bus.
You may be asking yourself right now, why does he keep talking about this particular project? Briefly: it’s been a very fun project, it’s ongoing, it consumes most of my daily brain cycles, we’re still growing it (it’s a brand new infrastructure for us), and it encompasses a whole lot of ideas that I thought were good and that are now being validated by other projects I read about online.
So, what other unsung features did we build in that I’ll now sing about?
Asynchronous Messaging
You’ll notice the Spooler component in the original broad sketch of our architecture. The high-level description I gave the Spooler touched on callbacks. Asynchronous messaging was left unsaid, but it is implied by having a mechanism for callbacks.
The description also labeled my Spooler an endpoint, which may be a web service endpoint. We use web services only because the Enterprise Service Bus (ESB) orchestrating work on our bus is .NET-based while our project is all Java. That said, we post Plain Ol’ XML (POX) over HTTP, which is deserialized quickly to a Java POJO. Our entire messaging system works on POJOs, not XML.
The outside world may use SOAP (or XML-RPC or flat files or whatever) when communicating with my company, but internally our ESB talks POX with the bus. Mediation and transformation (from SOAP –> POX) is part of the functionality of an ESB. Consumers, internally to our bus, would directly access queues instead of using web services.
Pure POJOs, but distributed
It’s extremely productive and useful to work with a pure POJO model, and it’s even more productive and useful when the state of those POJOs is automagically kept in sync across the cluster regardless of what node is working on it. This is where Terracotta Server shines.
We pass POJOs around through all the queues. Consumers — which can exist anywhere on the network — process the Service/Job/Message (all interchangeable terms, as far as I am concerned — they are all units of work). Our messages are stateful, meaning they enter our bus empty except for parameters in instance variables, get routed around to various and sundry consumers across the network, and get posted back (the callback) full of data to the ESB.
Why do we need distributed POJOs? Well, we found it to be highly useful. For example, we offer a REST API to abort a pending message (such as http://ourendpoint/message/abort/abcdefg-the-guid-wxyz). The easiest way we found to tell the entire bus to disregard this message was to flip the bit on the message itself. The endpoint is running under Terracotta Server, all of the queues live in TC, and our consumers are likewise plugged in. If you stick all your messages in a Map (or series of maps if you’re worried about hashing, locking, and high volumes) where the GUID is the key and the value is the message, then the endpoint or any consumer can quickly obtain the reference to the message itself and alter its state. We can also write programs that hook into TC temporarily to inspect or modify the state of the system. Persistent memory is cool like that. It exists outside the runtime duration of the ephemeral program.
The endpoint likewise has REST APIs for returning the state of the bus, queues sizes, current activity, and other metrics. All of this data is collected from the POJOs themselves, because the endpoint has access to the very object instances that are running all over the network. It just so happens this architecture works wonderfully inside a single JVM, too, without TC, for easier development and debugging.
Load balancing and routers
Straight from Michael Nygard’s article:
Third, they’ve build a multi-layered middle tier. Incoming requests first hit a pair of “Central Process Servers” which inspect the request. Requests are dispatched to individual “portals” based on their customer ID.
In other words, they have endpoints behind load balancers (we use Pound) and “dispatched” is another word for “routed.” We have content based routers (a common and useful Enterprise integration Pattern for messaging systems) that route messages/services/jobs of specific types to certain queues. Our consumers are not homogenous. We’ve configured different applications (the integration aspects of our project) to listen on different queues. This saved us from having to port applications off the servers where they were previously deployed. These apps are several years old. Porting would have taken time and money. Allowing messages to flow to them where they already exist was a big win for us.
More to come
I’ve got the outline for my white paper complete, where I bulleted the features above as well as those in my previous blog article. There are other features I haven’t covered yet. Overall, I think it will be an interesting paper to read.
Still, I’m a little jealous, though, that FIDUCIA IT AG has scaled out to 1,000 nodes in their system. I can’t say how many nodes we’re up to, but I can say I’m looking forward to the massive scalability that our new architecture will give us.
Posted by Mark Turansky under
Architecture, Engineering, Technology, Terracotta
1 Comment »
01st May 2008
You can’t keep a good idea down
Our message bus project was more than just replacing JMS with a POJO messaging system. It’s a whole piece of infrastructure designed to make it easy for different folks to do their jobs.
How did we do this and why do the next couple of paragraphs sound like I’m bragging? Because many of the features we implemented were recently announced in a new open source project (more on that later). Bear with me as I go through some of the features we implemented, knowing that I’ll tie them to the features of a recently announced (and exciting) open source middleware product.
Configuration Management and Network Deployments …
We deploy applications to our bus over the network by the way of a simple little bootstrap loader. You’ll note the Java class I used in my blog article uses a URLClassLoader. My example used a file URL (”file://”) but there’s nothing stopping those URLs from beginning with “http://…”
This lets our Config Mgt. team deploy applications to a single place on the network. As nodes on the network come up, they’ll download the code they need to run.
… via Bootstrapping
While we’re on the subject of bootstrapping, there’s nothing stopping a smart developer from bootstrapping different applications into different classloaders. Again using the Java class from my blog article, you’ll notice the code finds the “main” class relative to the classloader. Who says you need just one classloader? Who says you can only run “main” methods? Stick all your classloaders in a Map and use the appropriate classloader to resolve an interface like, say, Service (as in Service Oriented Architecture) with an “execute” method. Suddenly, you can have applications that are invoked by a Service. We used this very technique to integrate legacy, stand-alone applications into an SOA.
Take bootstrapping and isolated classloading one step further and you’ll soon realize you can load multiple versions of the same application side-by-side in the same container. One could be your release branch version, the other could be your trunk code. Same infrastructure and container. We did that, too.
Lastly, what happens if you dump a classloader containing an application and replace it with a new one containing a new version of the application? Well, you just hot swapped your app. You updated the application without restarting your container.
Focus on Developer Productivity
Developers? We got them covered, too. We went for a simple pure Java POJO model. No app servers, databases, or anything else required. A developer can run the entire message bus — the entire infrastructure — in a single JVM, which means inside your IDE. Unit tests are a snap because all Services are POJO classes. Did I mention it takes one whole second for someone to start a local instance of our message bus in a single JVM? I’m a developer first, architect second. I like to think about how my decisions affect other developers. If I make it easy, they’ll love me. If I don’t make it easy, well, … I don’t think I’ve done my job well.
Utility Computing w/o Virtualization
It’s a lot easier to get efficient use of hardware once you can load all your applications into a single container/framework. If you bake in queuing and routing (a message bus), then you can implement the Competing Consumers pattern for parallelism in processing. Also, if all message consumers are running all applications (thanks, classloading!), then your consumers can listen on all queues to process all available work. This is utility computing without a VM. Our project lets us use all available CPU cycles as long as there is work in the queues.
There’s also the Master/Worker pattern to process batch jobs across a grid of consumers. Grid computing is one aspect of our project, but a minor one. I’m more interested in the gains we achieve through utility computing and the integration of several legacy applications to form our SOA.
The Open Source Version
Here are some features from the open source project, tell me if they sound familiar:
- You can install, uninstall, start, and stop different modules of your application dynamically without restarting the container.
- Your application can have more than one version of a particular module running at the same time.
- OSGi provides very good infrastructure for developing service-oriented applications
http://www.javaworld.com/javaworld/jw-03-2008/jw-03-osgi1.html
SpringSource recently announced a new “application platform” with some of the following features and benefits:
- Real time application and server updates
- Better resource utilization
- Side by side resource versioning
- Faster iterative development
- Small server footprint
- More manageable applications
http://www.springsource.com/web/guest/products/suite/applicationplatform
http://www.theserverside.com/news/thread.tss?thread_id=49243
Now, if only the SpringSource Application Platform could add queuing and routing to their project, we might consider porting to it. In the meantime, I’m happy to see other projects validating the ideas we pitched here at our company.
I’m excited, too, to announce that I’ve received the blessing of Management and PR to write a white paper about our project. It will cover all the above features as well as a slew of others, such as service orchestration and monitoring, asynchronous callbacks, a few other key Enterprise Integration Patterns, and it will explain how we used Terracotta Server to tie it all together. Stay tuned! I’m going to write blog articles to coincide with the sections of the paper.
Our message bus project was more than just replacing JMS with a POJO messaging system. It’s a whole piece of infrastructure designed to make it easy for different folks to do their jobs.
How did we do this and why do the next couple of paragraphs sound like I’m bragging? Because many of the features we implemented were recently announced in a new open source project (more on that later). Bear with me as I go through some of the features we implemented, knowing that I’ll tie them to the features of a recently announced (and exciting) open source middleware product.
Configuration Management and Network Deployments …
We deploy applications to our bus over the network by the way of a simple little bootstrap loader. You’ll note the Java class I used in my blog article uses a URLClassLoader. My example used a file URL (”file://”) but there’s nothing stopping those URLs from beginning with “http://…”
This lets our Config Mgt. team deploy applications to a single place on the network. As nodes on the network come up, they’ll download the code they need to run.
… via Bootstrapping
While we’re on the subject of bootstrapping, there’s nothing stopping a smart developer from bootstrapping different applications into different classloaders. Again using the Java class from my blog article, you’ll notice the code finds the “main” class relative to the classloader. Who says you need just one classloader? Who says you can only run “main” methods? Stick all your classloaders in a Map and use the appropriate classloader to resolve an interface like, say, Service (as in Service Oriented Architecture) with an “execute” method. Suddenly, you can have applications that are invoked by a Service. We used this very technique to integrate legacy, stand-alone applications into an SOA.
Take bootstrapping and isolated classloading one step further and you’ll soon realize you can load multiple versions of the same application side-by-side in the same container. One could be your release branch version, the other could be your trunk code. Same infrastructure and container. We did that, too.
Lastly, what happens if you dump a classloader containing an application and replace it with a new one containing a new version of the application? Well, you just hot swapped your app. You updated the application without restarting your container.
Focus on Developer Productivity
Developers? We got them covered, too. We went for a simple pure Java POJO model. No app servers, databases, or anything else required. A developer can run the entire message bus — the entire infrastructure — in a single JVM, which means inside your IDE. Unit tests are a snap because all Services are POJO classes. Did I mention it takes one whole second for someone to start a local instance of our message bus in a single JVM? I’m a developer first, architect second. I like to think about how my decisions affect other developers. If I make it easy, they’ll love me. If I don’t make it easy, well, … I don’t think I’ve done my job well.
Utility Computing w/o Virtualization
It’s a lot easier to get efficient use of hardware once you can load all your applications into a single container/framework. If you bake in queuing and routing (a message bus), then you can implement the Competing Consumers pattern for parallelism in processing. Also, if all message consumers are running all applications (thanks, classloading!), then your consumers can listen on all queues to process all available work. This is utility computing without a VM. Our project lets us use all available CPU cycles as long as there is work in the queues.
There’s also the Master/Worker pattern to process batch jobs across a grid of consumers. Grid computing is one aspect of our project, but a minor one. I’m more interested in the gains we achieve through utility computing and the integration of several legacy applications to form our SOA.
The Open Source Version
Here are some features from the open source project, tell me if they sound familiar:
- You can install, uninstall, start, and stop different modules of your application dynamically without restarting the container.
- Your application can have more than one version of a particular module running at the same time.
- OSGi provides very good infrastructure for developing service-oriented applications
http://www.javaworld.com/javaworld/jw-03-2008/jw-03-osgi1.html
SpringSource recently announced a new “application platform” with some of the following features and benefits:
- Real time application and server updates
- Better resource utilization
- Side by side resource versioning
- Faster iterative development
- Small server footprint
- More manageable applications
http://www.springsource.com/web/guest/products/suite/applicationplatform
http://www.theserverside.com/news/thread.tss?thread_id=49243
Now, if only the SpringSource Application Platform could add queuing and routing to their project, we might consider porting to it. In the meantime, I’m happy to see other projects validating the ideas we pitched here at our company.
I’m excited, too, to announce that I’ve received the blessing of Management and PR to write a white paper about our project. It will cover all the above features as well as a slew of others, such as service orchestration and monitoring, asynchronous callbacks, a few other key Enterprise Integration Patterns, and it will explain how we used Terracotta Server to tie it all together. Stay tuned! I’m going to write blog articles to coincide with the sections of the paper.
Posted by Mark Turansky under
Architecture, Technology, Terracotta
6 Comments »



