My recent post “Linux is killing Solaris” turned out to be surprisingly controversial, judging by the comments on Reddit and the split vote on DZone.

I think the critics and naysayers are missing a big point. One tool (htop on Linux) is ridiculously easy to read at a glance, while the other (prstat on Solaris) requires parsing and mental math.

You be the judge. Can you choose the right horse to run this course?

Imagine you’ve got a dozen nodes on the network, each one hosting components of your enterprise message bus. You and your operations folks need visibility into the entire system so that you can tell at a glance whether something is going wrong or not.

One tool is plain text, with row upon row of process data, CPU utilization by process, etc. Want to know total CPU usage of the box? Add your processes together! “Well, process A is using 38%, process B is using 13%, and process C is using 4%. Let’s see, 38 + 13 + 4 is 55%. On to the next box!” Picture twelve console windows, one for each of the twelve nodes, all looking like this:

htop_vs_prstat21.png

Can you tell how much swap space your system is using by looking at this first tool? Nope!

Alternatively, you can look at another tool that’s also plain text in a console, but has simple yet effective color-coded bars that display CPU and memory usage. It looks something like this:

htop_vs_prstat1.png
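To make the difference concrete, here is a minimal Python sketch of the arithmetic a reader has to do in their head with the first tool, plus the kind of bar that makes it unnecessary. It is only an illustration: the function names `total_cpu` and `usage_bar` are made up for this post, the numbers are the three processes from the example above, and this is not how htop or prstat actually work internally.

```python
def total_cpu(per_process_percent):
    """Sum per-process CPU percentages into one figure for the box."""
    return sum(per_process_percent)


def usage_bar(percent, width=20):
    """Render a percentage as a fixed-width text bar.

    The filled portion is clamped to 0..100 so the bar never overflows
    its brackets; the label still shows the value that was passed in.
    """
    clamped = max(0.0, min(100.0, percent))
    filled = round(clamped / 100 * width)
    return "[" + "|" * filled + " " * (width - filled) + f"] {percent:.0f}%"


# Hypothetical sample: processes A, B, and C from the example above.
processes = {"A": 38, "B": 13, "C": 4}

total = total_cpu(processes.values())
print(usage_bar(total))
```

The addition is trivial for a computer and a distraction for a person. Once the total is drawn as a bar, nobody glancing at the screen has to add anything.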

What’s important here is people. I don’t care if prstat runs twice as fast as htop. I don’t care if htop uses more memory or is less efficient. The fact is, they both run fast enough. What I want is for people to be fast when reading the console.

Our Config Management team is going to have a single monitor with SSH shells open to all the nodes on the network. The monitor will sit in our Ops center and plenty of people will glance at it to see how our system is doing. If even one of them is doing math in their head to get CPU usage on a box, then I’ve failed the usability test. I picked the wrong horse for the course.

So, htop runs on Linux and there’s no mention anywhere on the internet about it being unavailable for Solaris (except here on this blog). Get me used to all the niceties of Linux and I’m not a happy camper when I have to deploy my message bus to Solaris because my quick visual status bars are gone.

This might seem like a trivial thing to get worked up about, but my responsibility as a leading architect at my company is to make things that are simple to run, easy to test, and quick to report status, and to stay mindful of the Ops and business folks who use our system to make money for our company. There’s still a ton of work to do after you’re code complete. Monitoring is one of those non-functional but critical requirements that has to be built in if you want a robust system.

Solaris on x86 came too late. I could never afford Sun hardware at home or for development, so it all went Linux. And it’s not just me; it’s a lot of people. That’s why I wrote “Linux is killing Solaris.” I still think it’s an obvious point missed by no one, yet somehow it managed to stir up plenty of emotion on Reddit and DZone.