I am very pleased to announce the release of two new tools that I hope will prove to be useful additions to your operational arsenal:
- Senedsa is a small utility and library that wraps around the Nagios send_nsca utility.
- Elesai is a wrapper around LSI’s MegaCli utility.
Both of them are distributed as Ruby gems.
Senedsa is something I have been meaning to write for quite some time. A fair number of the tools I have written over the last few years (most notably Zettabee and Theia) have had monitoring capabilities built in, mostly in the form of a Nagios passive monitor. I have moved to developing in Ruby essentially full-time (I still do some Python and some shell scripting, but that’s mostly in maintenance mode), and after I went down the path of writing Elesai, I simply had to write Senedsa, since the ridiculous cut-and-paste maneuver was getting old (and frankly, embarrassing).
I pondered using Kevin Bedell’s send_nsca gem, but in the end I decided to implement the Senedsa wrapper, primarily because we have other non-Ruby code (shell scripts) that uses send_nsca in its native form, and I suspected that would be the case for most other shops. While that implies a fork whenever it is used, it is not something we are doing at high frequency.
Senedsa can be used both as a CLI utility (which is handy to test your send_nsca installation, or perhaps to be called from shell scripts, though that implies forking both Senedsa and send_nsca) or as a library from within your Ruby scripts.
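For readers who have not used send_nsca in its native form, here is a rough sketch of what any wrapper around it has to do: format a passive check result as one tab-delimited line and pipe it to the binary. This is not Senedsa’s code (Senedsa is Ruby); the function names, the config path and the assumption that send_nsca is on the PATH are mine, for illustration only.

```python
import subprocess

# Standard Nagios plugin return codes.
STATUS = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def build_payload(host, service, status, message):
    """Format one passive service check result as send_nsca reads it on stdin:
    host, service description, return code and plugin output, tab-delimited."""
    return "\t".join([host, service, str(STATUS[status]), message]) + "\n"


def build_command(nsca_host, port=5667, config="/etc/nagios/send_nsca.cfg"):
    """Assemble the send_nsca command line.
    The config path is an assumption; adjust it to your installation."""
    return ["send_nsca", "-H", nsca_host, "-p", str(port), "-c", config]


def send(nsca_host, host, service, status, message):
    """Fork send_nsca once and feed it the payload; raises if it exits non-zero."""
    subprocess.run(
        build_command(nsca_host),
        input=build_payload(host, service, status, message),
        text=True,
        check=True,
    )
```

Calling `send("nagios.example.com", "web01", "elesai", "WARNING", "1 disk degraded")` would fork send_nsca once per result, which is the cost mentioned above.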
Elesai is a wrapper around LSI’s MegaCli utility that provides access to common types of information about LSI RAID controllers (currently physical and virtual disks) without the need to “speak martian” (run MegaCli -h to see what I mean). It is a line-oriented tool so that it can be combined with other Unix command-line tools to process and manipulate the data (e.g., awk and friends). It also provides a check action (currently as a Nagios plugin in both active and passive modes) which monitors the health of the array and its components and reports it accordingly (this is not yet configurable).
The exercise of developing Elesai has been useful in a number of ways. Perhaps the most significant one has been the realization of something I have actually known for quite some time but had not fully solidified in my mind: each and every tool needs monitoring if it is doing any regular work in production. Full stop. Clearly this is something I had been doing already (Zettabee being the primary example), but it has now been elevated to a full requirement.
The second aspect was the use of state machines to parse pseudo-structured output like that of MegaCli, where identifying the current element being processed becomes trickier. The usual approach (or perhaps the one I have seen implemented most often) is a set of nested case statements matching the appropriate regular expressions, with state variables sprinkled around. This normally works, up until the point where the parser needs changes or additions six months after the code was written. Elesai itself will need several additions in the not-too-distant future, as it currently only shows information about physical and virtual disks (and does not, for instance, take spans into account): information about the adapters themselves and about batteries is high on the list of new features.
So there had to be a better way. I first looked into the excellent Parslet library, but it really wasn’t the right tool for this job. I had had state machines lingering in my mind for quite some time, and that turned out to be an incredibly good fit, especially when using the wonderful workflow state machine implementation.
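To give a flavor of the idea, here is a minimal sketch of a line-oriented parser driven by an explicit table of states and allowed transitions. It is plain Python rather than Ruby with the workflow gem, it is not Elesai’s code, and the sample input is invented for the example (it is not real MegaCli output). All names are hypothetical.

```python
import re

# Header lines open a new element and move the machine to the matching state.
HEADERS = {
    "adapter": re.compile(r"^Adapter #(\d+)$"),
    "virtual_disk": re.compile(r"^Virtual Disk: (\d+)$"),
    "physical_disk": re.compile(r"^Physical Disk: (\d+)$"),
}
# Any other "Key: Value" line is an attribute of the current element.
ATTRIBUTE = re.compile(r"^(\w[\w ]*?):\s*(.+)$")

# Which states may follow which; anything else is a parse error.
ALLOWED = {
    "start": {"adapter"},
    "adapter": {"adapter", "virtual_disk", "physical_disk"},
    "virtual_disk": {"adapter", "virtual_disk", "physical_disk"},
    "physical_disk": {"adapter", "virtual_disk", "physical_disk"},
}

# Invented sample, loosely shaped like sectioned CLI output.
SAMPLE = """
Adapter #0
Virtual Disk: 0
State: Optimal
Physical Disk: 4
State: Online
Physical Disk: 5
State: Failed
"""


def parse(text):
    state, current, elements = "start", None, []
    for line in filter(None, (raw.strip() for raw in text.splitlines())):
        for next_state, pattern in HEADERS.items():
            match = pattern.match(line)
            if match:
                if next_state not in ALLOWED[state]:
                    raise ValueError(f"unexpected {next_state!r} while in {state!r}: {line}")
                state = next_state
                current = {"type": state, "id": int(match.group(1))}
                elements.append(current)
                break
        else:
            # No header matched: attach the attribute to the current element.
            match = ATTRIBUTE.match(line)
            if match and current is not None:
                current[match.group(1).lower()] = match.group(2)
    return elements
```

The appeal is that adding a new kind of element (batteries, say) means adding a header pattern and a row in the transition table, instead of threading another state variable through nested conditionals.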
If you need to deal with LSI RAID controllers, I hope you find Elesai a worthy tool to add to your toolbox. It’s easy to install: gem install elesai. Do please report problems and feature enhancements in the issue tracker, or better yet, and if you’re up to it, fork it and contribute. Ditto Senedsa.
It’s hard to believe it has been almost a year since we started the process of open sourcing tools, but it has indeed been that long, and it picked up steam a few weeks ago, when we pushed out nddtune, which is admittedly a very simple tool. Today we’re continuing that effort with a couple of more significant tools: Zettabee and Theia.
A Little History
About four years ago, we had a very real need to have fairly detailed performance metrics for NetApp filers. At the time, the available solutions relied on SNMP (NetApp’s SNMP support has historically been weak) or were NetApp’s own, which, aside from being expensive, were hard to integrate with the rest of our monitoring infrastructure (which consists of Nagios and Zenoss). As such, we set out to write a tool that would both perform detailed filer monitoring (for faults and performance) and be able to interface with those systems. Theia was born.
In more recent times, as we were looking at beefing up our DR strategy, we found ourselves needing a good ZFS-based replication tool, and set out to write Zettabee, which gave us an opportunity to dive deeper into ZFS capabilities.
Let the Games Begin
Today we’re very excited to be releasing those two tools into the open. Theia has been in production for the last four years, dutifully keeping an eye on our filers, while Zettabee has been pushing bits long-distance for well over nine months. We are working on putting together a roadmap for future work, but are happy to have them out in the open for further collaboration. Tim has written a good post on some of the work he has done to make this happen, and I am grateful for his help on this endeavor.
After a couple of days of running into dead ends, I am finally able to drive JIRA via its SOAP interface sanely from something other than Java, in an effort to automate small, repetitive tasks that are best left to tools. Without going into the details of what it is that I needed to get accomplished (which is not the key point of this post), I wanted to share a bit of the experience before I close shop for the day.
First, check out Igor Sereda’s presentation on JIRA Client, which offers many insights on general client-side JIRA programming. Second, have the JiraSoapService javadoc handy. Given the usual needs I deal with, I use Python quite a bit, which has served me very well for nearly the last 10 years, and it’s the workhorse of my tool development. But in this case, I ran into problems at almost every turn: SOAPpy cannot deal with dates, and ZSI ran into some issues as well. So I went to Ruby and jira4r (navigator, source). Amazingly elegant, it hides all the SOAP stuff from view, producing ridiculously compact code, and so far, working flawlessly.
Martin always knew I would end up diving into Ruby :-)
Fact: I’m not a developer. That’s magic I leave to people far (far) more talented than myself. Fact: I do write code relatively often. It is not a constant activity for a variety of reasons: the weather, fires at work, my camera, Saturday nights out and about, whatever. The net result of this stop-and-go is that I don’t always keep up with new language features, and Python’s logging module is a good example of this sometimes embarrassing reality. This module has been available in the standard Python distribution since version 2.3, but it was only recently that I started using it. It rocks.
Ninety percent of the code I write is sysadmin-related code: task automation, log grokking and processing, monitoring widgets, reporting tools, management gizmos and the like. A lot of these are long-running processes that deal with a fair number of interconnected (and possibly remote) moving parts. Given a 24×7 shop, these tools have to interact with the real world, which implies strapping them to monitoring systems, the kind that will page the humans in those (hopefully) rare instances when things do go sideways and someone has to intervene to make it all better. The pace is relentless, so streamlining development is critical to keep up and to be able to maintain the code as things evolve.
Before discovering the logging module, I wrote a fair amount of code involving admittedly crude custom loggers and heavy usage of the syslog module, which is fine, actually, but far from optimal. The logging module does away with all of that, simplifying the code and allowing for a fair amount of control over how and where messages are sent. In addition to log files and syslog, the logging module offers a good variety of other handlers (SMTP, socket and datagram, to name a few). Yum. It forces a bit of thinking about the organization of messages (mapping conditions to severity levels and deciding what to do for each severity), but this is actually quite helpful. Yum yum.
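As a small illustration of that severity-to-destination mapping, here is a sketch of one logger with two handlers: everything goes to a log file, while only warnings and worse reach a second stream. The function name, logger name and format string are my own choices for the example, not anything prescribed by the module.

```python
import logging
import sys


def build_logger(name, logfile, stream=sys.stderr):
    """Everything goes to the log file; only WARNING and above reach the stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep messages away from the root logger's handlers

    formatter = logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")

    to_file = logging.FileHandler(logfile)
    to_file.setLevel(logging.DEBUG)
    to_file.setFormatter(formatter)

    to_stream = logging.StreamHandler(stream)
    to_stream.setLevel(logging.WARNING)
    to_stream.setFormatter(formatter)

    logger.addHandler(to_file)
    logger.addHandler(to_stream)
    return logger
```

With this in place, `log.info("snapshot sent")` lands only in the file, while `log.error("remote host unreachable")` lands in both. Swapping the stream handler for `logging.handlers.SysLogHandler` or `logging.handlers.SMTPHandler` changes where the urgent messages go without touching the calling code.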