Archive for the 'Engineering' Category
03rd Jun 2008
HOW TO: Use JDBC Batching for 7-8X throughput gains
Using the batched statement capability of your JDBC driver can give you 7-8X throughput gains. Not only is batching significantly faster, it’ll save database CPU cycles and be easier on the network, too.
The graph below shows elapsed time (in milliseconds) by batch size. For each data point, 1K rows were inserted into a simple table in MySQL. The benchmarking code I used can be found here.

Why is batching so much faster?
First, depending on how much PreparedStatement caching your driver is doing, your database may be spending a lot of time parsing and compiling statements. After the statement is parsed and compiled, bind variables are applied. In our example, the database will parse and compile the statement once as opposed to 1,000 times. This reduces the work your database performs and saves CPU.
Second, all bind variables are passed to the database in a single network call instead of 1,000 separate out-of-process, across-the-network calls. This helps reduce network traffic.
Third, depending on the internal architecture of your code, single statements may return the connection to a pool after every use. Multiply that by 1,000 and run a profiler and you’ll see yourself calling take/put methods a lot. Many pools also verify the connection on check-in and check-out. “select 1 from dual” is a common check for a pool to use. Your 1,000 uses of a connection may also be incurring the cost of 2,000 “select 1 from dual” statements!
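To make the three points above concrete, here is a minimal sketch of a batching loop built on the standard JDBC calls PreparedStatement.addBatch() and executeBatch(). The class name, the method name, the single bound column, and the returned flush count are all illustrative assumptions; this is not the benchmarking code linked above. Whether a batch really travels in a single network call depends on your driver. With MySQL Connector/J, for example, you will likely need rewriteBatchedStatements=true on the JDBC URL.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper for illustration only.
class BatchInsertSketch {

    /**
     * Binds each name to parameter 1, queues it with addBatch(), and sends
     * the queued rows with executeBatch() every batchSize rows.
     * Returns how many times executeBatch() was called.
     */
    static int insertInBatches(PreparedStatement ps, List<String> names, int batchSize)
            throws SQLException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        int flushes = 0;
        int pending = 0;
        for (String name : names) {
            ps.setString(1, name);
            ps.addBatch();              // queue the row on the client side
            pending++;
            if (pending == batchSize) {
                ps.executeBatch();      // send the whole batch to the database
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {              // send the final partial batch
            ps.executeBatch();
            flushes++;
        }
        return flushes;
    }
}
```

To use it, check a connection out of your pool once, prepare something like INSERT INTO people (name) VALUES (?) (a hypothetical table), pass the statement in, and commit afterward. The pool then sees one check-out and one check-in for the whole import instead of one pair per row.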
When should you use batching?
Batching is particularly useful in importing scenarios where you need to get lots of data into your application quickly, but it can be used even when executing a few similar statements. Check out the example source code provided to see if batching is right for you. Fiddle with the numbers to see the gains for batching just 10 similar statements. It may not be 8X big, but trumpeting 25% gains to management is still a win for you and your team.
Use JDBC Batching!
JDBC batching can give you dramatic throughput gains while simultaneously being less abusive to your hardware. Overall, if you have the opportunity to use batch inserts and updates, you should seize that opportunity. Look at your application’s internal architecture to see if batching is right for you.
Posted by Mark Turansky under
Code Hints, Engineering, HOW TO
06th May 2008
More proof that you can’t keep a good idea down?
In this blog article, Michael Nygard discusses a talk he attended where a technical architect discussed an SOA framework at FIDUCIA IT AG, a company in the financial services industry. Nygard describes an architecture that echoes many of the features I implicitly spoke of in my first blog article about my big integration project / message bus.
You may be asking yourself right now, why does he keep talking about this particular project? Briefly: it’s been a very fun project, it’s ongoing, it consumes most of my daily brain cycles, we’re still growing it (it’s a brand new infrastructure for us), and it encompasses a whole lot of ideas that I thought were good and that are now being validated by other projects I read about online.
So, what other unsung features did we build in that I’ll now sing about?
Asynchronous Messaging
You’ll notice the Spooler component in the original broad sketch of our architecture. The high-level description I gave the Spooler touched on callbacks. Asynchronous messaging was left unsaid, but it is implied by having a mechanism for callbacks.
The description also labeled my Spooler an endpoint, which may be a web service endpoint. We use web services only because the Enterprise Service Bus (ESB) orchestrating work on our bus is .NET-based while our project is all Java. That said, we post Plain Ol’ XML (POX) over HTTP, which is deserialized quickly to a Java POJO. Our entire messaging system works on POJOs, not XML.
The outside world may use SOAP (or XML-RPC or flat files or whatever) when communicating with my company, but internally our ESB talks POX with the bus. Mediation and transformation (from SOAP –> POX) are part of the functionality of an ESB. Consumers internal to our bus access the queues directly instead of using web services.
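As a rough illustration of the POX-to-POJO step, the sketch below parses a small XML document into a plain Java object using the DOM parser that ships with the JDK. The element names (job, guid, type), the Job class, and the parse method are assumptions made up for this example; our real message format and deserializer are not shown in this post.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical example: the XML layout and class names are assumptions.
class PoxToPojoSketch {

    // The plain Java object the rest of the bus would work with.
    static final class Job {
        final String guid;
        final String type;

        Job(String guid, String type) {
            this.guid = guid;
            this.type = type;
        }
    }

    // Turns XML such as <job><guid>...</guid><type>...</type></job> into a Job.
    static Job parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations so posted XML cannot pull in external entities.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            return new Job(text(root, "guid"), text(root, "type"));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read job XML", e);
        }
    }

    // Returns the trimmed text of the first descendant element with the given tag.
    private static String text(Element parent, String tag) {
        Node node = parent.getElementsByTagName(tag).item(0);
        if (node == null) {
            throw new IllegalArgumentException("Missing element: " + tag);
        }
        return node.getTextContent().trim();
    }
}
```

Once the XML has been turned into an object at the endpoint, everything downstream can deal with the Job alone and never touch XML again.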
Pure POJOs, but distributed
It’s extremely productive and useful to work with a pure POJO model, and it’s even more productive and useful when the state of those POJOs is automagically kept in sync across the cluster regardless of what node is working on it. This is where Terracotta Server shines.
We pass POJOs around through all the queues. Consumers — which can exist anywhere on the network — process the Service/Job/Message (all interchangeable terms, as far as I am concerned — they are all units of work). Our messages are stateful, meaning they enter our bus empty except for parameters in instance variables, get routed around to various and sundry consumers across the network, and get posted back (the callback) full of data to the ESB.
Why do we need distributed POJOs? Well, we found it to be highly useful. For example, we offer a REST API to abort a pending message (such as http://ourendpoint/message/abort/abcdefg-the-guid-wxyz). The easiest way we found to tell the entire bus to disregard this message was to flip the bit on the message itself. The endpoint is running under Terracotta Server, all of the queues live in TC, and our consumers are likewise plugged in. If you stick all your messages in a Map (or series of maps if you’re worried about hashing, locking, and high volumes) where the GUID is the key and the value is the message, then the endpoint or any consumer can quickly obtain the reference to the message itself and alter its state. We can also write programs that hook into TC temporarily to inspect or modify the state of the system. Persistent memory is cool like that. It exists outside the runtime duration of the ephemeral program.
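The map-of-messages idea can be sketched in a single JVM like this. The class and method names are hypothetical, and nothing here is Terracotta-specific: the sketch only shows the GUID-keyed lookup and the "flip the bit" abort, with the assumption that a clustered heap would make the same map and the same message instances visible to every node.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical single-JVM sketch; names are assumptions, not our production classes.
class MessageRegistrySketch {

    // A unit of work that carries its own abort flag.
    static final class Message {
        private final String guid;
        private volatile boolean aborted;   // volatile so every thread sees the flip

        Message(String guid) {
            this.guid = guid;
        }

        String guid() {
            return guid;
        }

        boolean isAborted() {
            return aborted;
        }

        void abort() {
            aborted = true;
        }
    }

    // GUID is the key, the message itself is the value.
    private final Map<String, Message> byGuid = new ConcurrentHashMap<>();

    void register(Message message) {
        byGuid.put(message.guid(), message);
    }

    // What the abort endpoint would call: look the message up and flip its bit.
    // Returns false when no message with that GUID is known.
    boolean abort(String guid) {
        Message message = byGuid.get(guid);
        if (message == null) {
            return false;
        }
        message.abort();
        return true;
    }
}
```

A consumer that pulls a message off a queue would check isAborted() before doing any work and simply drop the message if the flag is set.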
The endpoint likewise has REST APIs for returning the state of the bus, queue sizes, current activity, and other metrics. All of this data is collected from the POJOs themselves, because the endpoint has access to the very object instances that are running all over the network. It just so happens this architecture works wonderfully inside a single JVM, too, without TC, for easier development and debugging.
Load balancing and routers
Straight from Michael Nygard’s article:
Third, they’ve build a multi-layered middle tier. Incoming requests first hit a pair of “Central Process Servers” which inspect the request. Requests are dispatched to individual “portals” based on their customer ID.
In other words, they have endpoints behind load balancers (we use Pound), and “dispatched” is another word for “routed.” We have content-based routers (a common and useful Enterprise Integration Pattern for messaging systems) that route messages/services/jobs of specific types to certain queues. Our consumers are not homogeneous. We’ve configured different applications (the integration aspects of our project) to listen on different queues. This saved us from having to port applications off the servers where they were previously deployed. These apps are several years old. Porting would have taken time and money. Allowing messages to flow to them where they already exist was a big win for us.
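Here is a minimal sketch of a content-based router of the kind described above, assuming each job carries a type string and each consuming application listens on one queue per type. The class names, the type strings, and the catch-all queue for unrecognized types are assumptions for illustration; our actual routing rules are not shown in this post.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a content-based router; names are assumptions.
class ContentBasedRouterSketch {

    // A unit of work whose type decides where it goes.
    static final class Job {
        final String type;
        final String payload;

        Job(String type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    private final Map<String, BlockingQueue<Job>> queuesByType = new ConcurrentHashMap<>();
    private final BlockingQueue<Job> unrouted = new LinkedBlockingQueue<>();

    // Called by a consuming application to get the queue it should listen on.
    BlockingQueue<Job> listenOn(String type) {
        return queuesByType.computeIfAbsent(type, t -> new LinkedBlockingQueue<>());
    }

    // Inspects the job and places it on the queue registered for its type.
    // Jobs with no registered listener land on the unrouted queue.
    void route(Job job) {
        BlockingQueue<Job> target = queuesByType.get(job.type);
        if (target == null) {
            target = unrouted;
        }
        target.add(job);
    }

    BlockingQueue<Job> unrouted() {
        return unrouted;
    }
}
```

Because the router only looks at the job's type, an older application can keep running on the server where it already lives and simply take work from its own queue.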
More to come
I’ve got the outline for my white paper complete, where I bulleted the features above as well as those in my previous blog article. There are other features I haven’t covered yet. Overall, I think it will be an interesting paper to read.
Still, I’m a little jealous that FIDUCIA IT AG has scaled out to 1,000 nodes in their system. I can’t say how many nodes we’re up to, but I can say I’m looking forward to the massive scalability that our new architecture will give us.
Posted by Mark Turansky under
Architecture, Engineering, Technology, Terracotta
21st Apr 2008
When you absolutely, positively have to write software that does not fail
I’ve been fascinated by the software they run on the space shuttle ever since I read this article years ago: They Write the Right Stuff
Today, I ran across this article about Self-Modifying Code written by someone who used to work at Lockheed on the shuttle. He describes using it for fault tolerance down near the hardware.
I imagine the computers running the Federal Reserve have similarly robust features baked in. Interesting stuff.
Posted by Mark Turansky under
Engineering


