Early Message Queue Experiences
Earlier this month I rolled out changes across our small fleet of services that made ActiveMQ an essential element backing much of our software. Hopefully I can note the things that went well and the things that didn't go according to plan, at least with respect to the message queue software and our use of it.
Firstly, actually installing ActiveMQ is really not very difficult (we run Ubuntu GNU/Linux). However, there are some configuration tweaks needed to ensure a restart occurs in a timely manner. Also, since security of the data within is essential, all traffic between AMQ instances goes via SSL. This latter aspect proves a bit of a pain, given you need to set up mutual key store trusts for each link between brokers. However, it does work.
An area that we could not easily predict was memory use, and it's one that really could use some clarification in the AMQ documentation. There are three limits to be set - an in-RAM limit, a temporary storage limit (overflow for in-RAM), and a persistent storage limit. These seem to be on top of any Producer Flow Control - which occurs per thread. It may look pretty easy to configure this stuff; however, there are enough questions about behaviour to make me wonder what I've missed. One of our servers actually began throwing Java out-of-memory exceptions - something that is documented on AMQ's site and which we managed to resolve with some adjustment.
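To make the three limits a little more concrete, here is a minimal Python sketch that reads them out of a broker configuration fragment. The fragment follows the layout of the systemUsage section in a stock activemq.xml, but the limit values are made up for illustration, and the helper names (to_bytes, read_limits, fits_in_heap) are hypothetical rather than anything AMQ ships.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative fragment only; the limit values are invented for this sketch.
SAMPLE = """
<broker xmlns="http://activemq.apache.org/schema/core">
  <systemUsage>
    <systemUsage>
      <memoryUsage><memoryUsage limit="64 mb"/></memoryUsage>
      <storeUsage><storeUsage limit="10 gb"/></storeUsage>
      <tempUsage><tempUsage limit="1 gb"/></tempUsage>
    </systemUsage>
  </systemUsage>
</broker>
"""

NS = {"amq": "http://activemq.apache.org/schema/core"}
UNITS = {"kb": 1024, "mb": 1024**2, "gb": 1024**3}


def to_bytes(limit):
    """Convert a limit string such as '64 mb' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(kb|mb|gb)?\s*", limit.lower())
    if match is None:
        raise ValueError(f"unrecognised limit: {limit!r}")
    number, unit = match.groups()
    return int(number) * UNITS.get(unit, 1)


def read_limits(xml_text):
    """Return the in-RAM, persistent store and temp store limits in bytes."""
    root = ET.fromstring(xml_text)
    limits = {}
    for name in ("memoryUsage", "storeUsage", "tempUsage"):
        # Each limit sits on an element nested in a wrapper of the same name.
        node = root.find(f".//amq:{name}/amq:{name}", NS)
        if node is not None:
            limits[name] = to_bytes(node.get("limit"))
    return limits


def fits_in_heap(limits, heap_bytes):
    """Check that the in-RAM limit is below the JVM heap (hypothetical check)."""
    return limits["memoryUsage"] < heap_bytes
```

Calling read_limits(SAMPLE) gives a dictionary of the three limits in bytes. The fits_in_heap check reflects the one relationship I'm reasonably sure of: the in-RAM limit lives inside the JVM heap, so it wants to be comfortably smaller than whatever -Xmx is set to.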
Another thing - when using the PHP PECL Stomp library to connect to localhost, for some reason it can fail to connect. The solution appears to be to specify 127.0.0.1 instead. The library's author has noted the same thing as something to be investigated.
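I never got to the bottom of the cause, so treat the following as a guess rather than a diagnosis: one plausible explanation is that localhost resolves to more than one address (for example the IPv6 loopback ::1 ahead of 127.0.0.1) while the broker only listens on one of them. A minimal Python sketch for checking what the resolver hands back is below; the resolve helper is hypothetical, and 61613 is assumed to be the STOMP port in use.

```python
import socket


def resolve(host, port=61613):
    """Return the distinct addresses the resolver offers for host, in order."""
    seen = []
    for _family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        address = sockaddr[0]  # first item is the address for IPv4 and IPv6
        if address not in seen:
            seen.append(address)
    return seen
```

Comparing resolve("localhost") with resolve("127.0.0.1") on the affected machine shows whether the name maps to anything other than the IPv4 loopback address.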
So launch evening came along and, having made the necessary adjustments, I published our internal set of customer accounts to one of our servers. Publication itself only took a few seconds; however, consumption speed was dire. Given I'd tested on a small subset beforehand without even noticing a delay, I was rather surprised. Either way, I was not about to subject our staff or customers to potential performance problems elsewhere, so I backed out. Subsequent testing proved the problem had nothing to do with AMQ at all but with MySQL and ext4 performing way too many fsyncs. We adjusted MySQL and the consumption of accounts became orders of magnitude faster.
A second roll-forward was scheduled, and this one went more smoothly, although verification that all systems were operating normally took far too long - again, absolutely nothing to do with AMQ, more a matter of ensuring staff were aware of any new ways of operating our systems and the repercussions for our customers.
So now that the dust has settled, how have things been? AMQ has - touch wood - not been a source of trouble. We're not exactly pushing tens of thousands of messages per second through this stuff, but the enqueue/dequeue counters are ticking over fast enough to show it is coping nicely.
If I were to do things differently: well, we got hit with at least two major architectural changes that needed to be pushed out simultaneously. I can't tell you how painful that is to deal with.
What about AMQ use? Give the broker plenty of CPU and RAM. It will fly. And don't write a line of code before you read Enterprise Integration Patterns.





