Archive for July, 2008

HOW TO: Use mini-batching to improve grid performance


We achieved a 3.5X increase in throughput by implementing “mini-batching” in our grid-enabled jobs.

We have a parent BatchService that creates child Services where each individual Service is a unit of work.  A Service implementation might perform some calculation for a single employee of a large employer group.  When the individual Services are very fast and the cost of bussing them around the network is greater than the cost of processing the Service, then adding more consumers makes the BatchService run slower!  It is slower because these fine-grained units of work require more queue locks, more network traffic, and more handling calls when the child Service is returned to the parent BatchService for accumulation.

The secret, then, is to give each consumer enough work to make the overhead of bussing negligible.  That is, give each consumer a “mini-batch” of Services to run instead of sending just one Service to a consumer.
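As a rough illustration of the idea, here is a minimal sketch that groups fine-grained work units into mini-batches before they are queued. The MiniBatcher class and its partition method are hypothetical names for this example, not part of our BatchService or of any grid framework; the employee IDs are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: groups fine-grained work units into mini-batches. */
final class MiniBatcher {

    private MiniBatcher() {}

    /**
     * Splits units into consecutive batches of at most batchSize items.
     * The last batch may be smaller than the others.
     */
    static <T> List<List<T>> partition(List<T> units, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < units.size(); start += batchSize) {
            int end = Math.min(start + batchSize, units.size());
            // Copy the slice so each batch is an independent list that can be shipped on its own.
            batches.add(new ArrayList<>(units.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Made-up work units: one ID per employee calculation.
        List<String> employeeIds = Arrays.asList("e1", "e2", "e3", "e4", "e5", "e6", "e7");
        List<List<String>> batches = partition(employeeIds, 3);
        System.out.println(batches); // [[e1, e2, e3], [e4, e5, e6], [e7]]
    }
}
```

Each inner list would then travel to a consumer as a single message, so the queue lock, network hop, and accumulation call are paid once per batch rather than once per Service.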

Here’s a graph of some of our benchmarks:

[Graph: throughputbybatchsize.png]

Some of the data surprised us.  For example, we expected 3 big batches to run fairly slowly across 11 consumers because there would be 8 consumers sitting idle, but we were not expecting 11 batches to run more slowly than 43 batches.  We thought dividing the work equally across consumers in the exact number of batches would be the lowest point on the graph.  We were wrong.  We expected the U-shape, but we thought the trough would be at a different batch size.
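The expectation described above can be written down as a naive model. This sketch assumes every batch costs the same and that bussing overhead is zero, which are simplifications for illustration only; the NaiveBatchModel class and its method names are hypothetical.

```java
/** Naive scheduling model: equal-cost batches, no bussing overhead (both are assumptions). */
final class NaiveBatchModel {

    private NaiveBatchModel() {}

    /** Number of rounds needed when each consumer runs one batch per round. */
    static int rounds(int batches, int consumers) {
        return (batches + consumers - 1) / consumers; // integer ceiling
    }

    /** Consumers with nothing to do during the final round. */
    static int idleInLastRound(int batches, int consumers) {
        int remainder = batches % consumers;
        return remainder == 0 ? 0 : consumers - remainder;
    }

    public static void main(String[] args) {
        int consumers = 11;
        for (int batches : new int[] {3, 11, 43}) {
            System.out.println(batches + " batches: " + rounds(batches, consumers)
                    + " round(s), " + idleInLastRound(batches, consumers)
                    + " idle in the last round");
        }
    }
}
```

In this model, 3 batches leave 8 consumers idle and 11 batches keep every consumer busy for a single round, which is why 11 looked like the obvious optimum. The benchmarks did not agree, so the model is clearly missing something about how the real system behaves.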

Our test system can only support up to 11 consumers, so we haven’t yet tested with more than 11, but the graph implies that we’ll have a deeper trough when we add consumers and tweak the batch size.  There should be, in theory, a point where we can’t process jobs any faster without killing the database.  I’ve warned our DBAs that we’re looking to hit that point.

If you’re doing any kind of grid computing (by way of Terracotta’s Master-Worker project, GridGain, or rolling your own), check out the effects mini-batching can have on your throughput.  You might be surprised by your benchmarking metrics!

I’m published, and I struck a nerve.

The JavaLobby (now java.dzone.com) asked to republish my article on human “resources.”  I was happy to oblige!

http://java.dzone.com/articles/were-not-resources

I think the theme of the article touched on a strong undercurrent in the developer community.  My blog post received more than 6k hits over the weekend, has the highest number of comments of all my articles, was republished on JavaLobby, Reddit, and others, and each of the publishers has received a bunch of comments on their repost.

There’s clearly something to the idea that we’re more than just “resources.”  But this is not a new theme or idea.

Forrester Research published a similar article not long ago:  http://blogs.forrester.com/appdev/2008/04/what-is-more-im.html  Similarly, there are several links in the comments of my blog article echoing the same sentiment.

The times they are a-changin’.

This is such an easy concept to grok and an easier one to change.  I suspect that more organizations will begin to rename their “Human Resources Department” to “Human Talent Department.”   It’s definitely more PC and it’s a sign that organizations value the talent their employees provide more than they value the warm body in a cold seat.  That is, unless you’re a government contractor, in which case you really do just want warm bodies.

We’re not “resources”

Resources. It’s a dehumanizing term that is also flat-out wrong for nearly every profession I can think of.

Project planning requires estimates and scheduling. I’ve got no problem with that except when it treats people as interchangeable cogs. In a manufacturing process, skilled workers might be interchangeable. There are only so many ways to stamp out a piece of machinery or otherwise work the assembly line. The process can be perfected to the exact number of steps involved in making a thing. Read The Toyota Way to get a better feeling for how world-class manufacturers achieve this.

THESE AREN’T RESOURCES

But there are many, many professions that do not and cannot achieve the kind of worker interchangeability where swapping out one “resource” for another is feasible or sensible.

Does George Steinbrenner schedule a “shortstop resource” or does he get Derek Jeter? Do the Yankees want home-run-hitting A-Rod or a mere “3rd baseman resource”?

Did the Chicago Bulls staff a “shooting guard resource” or did they need Michael Jordan?

Did Apple do well when it had a CEO “resource” or did they achieve the incredible after Steve Jobs came back to lead the company?

Do you want a 1st year medical intern (your “doctor resource”) performing your brain surgery or do you want the foremost expert in the field?

Do you want an “acting resource” or does Brad Pitt have more marquee power?

When was the last time you looked for a “contractor resource” instead of hiring the very best contractor you could find to renovate your home?

Thoughtworkers and creative types are no different. Software engineers are simultaneously creative and logical, and there is an order of magnitude difference between the best and worst programmers (go read Peopleware if you don’t believe this). Because of this difference, estimates have to change based on the “resource,” which means we’re not interchangeable cogs after all.

IT’S THE TEAM, STUPID

You can schedule me to be the Yankees’ 3rd base resource (thereby saving cost in the Cost-Schedule-Quality tradeoff), but I’m certain the quality of the product would suffer despite the fact that I played Little League baseball for years as a kid. Similarly, you can cast me in your movie, but I’m not sure I’d sell any tickets. I wouldn’t do any better running Apple than John Sculley, and you definitely don’t want me performing brain surgery.

Talent matters.

Winning organizations build winning teams; they don’t schedule resources and they don’t break up a winning team. They pay top dollar for top talent knowing that it’s entirely talent that makes a winning team.

Steve McConnell’s widely acclaimed Rapid Development ranks “Weak Personnel” as the 2nd classic mistake an organization can make when trying to build software. In discussing teamicide in Peopleware, DeMarco and Lister write “Most forms of teamicide do their damage by effectively demeaning the work, or demeaning the people who do it.”

Talent matters. Treating highly intelligent software developers as “resources” is demeaning, dehumanizing, and ultimately counterproductive to an organization that needs to build and field a winning team.

WSDL first development? Are they crazy?

From the CXF user guide: “For new development the preferred path is to design your services in WSDL and then generate the code to implement them.”

Are they insane?

Which would you rather write by hand….

a)

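import javax.jws.WebMethod;
import javax.jws.WebParam;
import javax.jws.WebService;

// Service endpoint interface. The endpointInterface attribute is only valid on the
// implementation class, so it is omitted here.
@WebService(name = "PersonFacade")
public interface PersonFacade {

    @WebMethod
    Person getPerson(@WebParam(name = "ssn") String ssn);
}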

b)

<?xml version='1.0' encoding='UTF-8'?>
<wsdl:definitions name="PersonFacadeImplService" targetNamespace="http://southwind.com/"
    xmlns:ns1="http://schemas.xmlsoap.org/soap/http"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://southwind.com/"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <wsdl:types>
    <xs:schema attributeFormDefault="unqualified" elementFormDefault="unqualified" targetNamespace="http://southwind.com/" xmlns:tns="http://southwind.com/" xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="findPerson" type="tns:findPerson" />
      <xs:element name="findPersonResponse" type="tns:findPersonResponse" />
      <xs:element name="getPerson" type="tns:getPerson" />
      <xs:element name="getPersonResponse" type="tns:getPersonResponse" />
      <xs:complexType name="getPerson">
        <xs:sequence>
          <xs:element minOccurs="0" name="ssn" type="xs:string" />
        </xs:sequence>
      </xs:complexType>
      <xs:complexType name="getPersonResponse">
        <xs:sequence>
          <xs:element minOccurs="0" name="return" type="tns:person" />
        </xs:sequence>
      </xs:complexType>
      <xs:complexType name="person">
        <xs:sequence>
          <xs:element minOccurs="0" name="birthday" type="xs:dateTime" />
          <xs:element maxOccurs="unbounded" minOccurs="0" name="enrollments" nillable="true" type="tns:enrollment" />
          <xs:element minOccurs="0" name="firstName" type="xs:string" />
          <xs:element minOccurs="0" name="lastName" type="xs:string" />
          <xs:element minOccurs="0" name="ssn" type="xs:string" />
        </xs:sequence>
      </xs:complexType>
      <xs:complexType name="enrollment">
        <xs:sequence>
          <xs:element minOccurs="0" name="planName" type="xs:string" />
          <xs:element name="planRate" type="xs:double" />
          <xs:element minOccurs="0" name="type" type="tns:type" />
        </xs:sequence>
      </xs:complexType>
      <xs:complexType name="findPerson">
        <xs:sequence>
          <xs:element minOccurs="0" name="id" type="xs:string" />
        </xs:sequence>
      </xs:complexType>
      <xs:complexType name="findPersonResponse">
        <xs:sequence>
          <xs:element minOccurs="0" name="return" type="tns:person" />
        </xs:sequence>
      </xs:complexType>
      <xs:simpleType name="type">
        <xs:restriction base="xs:string">
          <xs:enumeration value="MEDICAL" />
          <xs:enumeration value="DENTAL" />
          <xs:enumeration value="VISION" />
          <xs:enumeration value="PHARM" />
        </xs:restriction>
      </xs:simpleType>
    </xs:schema>
  </wsdl:types>
  <wsdl:message name="findPerson">
    <wsdl:part element="tns:findPerson" name="parameters">
    </wsdl:part>
  </wsdl:message>
  <wsdl:message name="findPersonResponse">
    <wsdl:part element="tns:findPersonResponse" name="parameters">
    </wsdl:part>
  </wsdl:message>
  <wsdl:message name="getPersonResponse">
    <wsdl:part element="tns:getPersonResponse" name="parameters">
    </wsdl:part>
  </wsdl:message>
  <wsdl:message name="getPerson">
    <wsdl:part element="tns:getPerson" name="parameters">
    </wsdl:part>
  </wsdl:message>
  <wsdl:portType name="PersonFacade">
    <wsdl:operation name="getPerson">
      <wsdl:input message="tns:getPerson" name="getPerson">
      </wsdl:input>
      <wsdl:output message="tns:getPersonResponse" name="getPersonResponse">
      </wsdl:output>
    </wsdl:operation>
    <wsdl:operation name="findPerson">
      <wsdl:input message="tns:findPerson" name="findPerson">
      </wsdl:input>
      <wsdl:output message="tns:findPersonResponse" name="findPersonResponse">
      </wsdl:output>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:binding name="PersonFacadeImplServiceSoapBinding" type="tns:PersonFacade">
    <soap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http" />
    <wsdl:operation name="getPerson">
      <soap:operation soapAction="" style="document" />
      <wsdl:input name="getPerson">
        <soap:body use="literal" />
      </wsdl:input>
      <wsdl:output name="getPersonResponse">
        <soap:body use="literal" />
      </wsdl:output>
    </wsdl:operation>
    <wsdl:operation name="findPerson">
      <soap:operation soapAction="" style="document" />
      <wsdl:input name="findPerson">
        <soap:body use="literal" />
      </wsdl:input>
      <wsdl:output name="findPersonResponse">
        <soap:body use="literal" />
      </wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:service name="PersonFacadeImplService">
    <wsdl:port binding="tns:PersonFacadeImplServiceSoapBinding" name="PersonFacadeImplPort">
      <soap:address location="http://mturanskylptp2:9000/personFacade" />
    </wsdl:port>
  </wsdl:service>
</wsdl:definitions>
