It’s hard to believe it has been almost a year since we started the process of open sourcing our tools, but it has indeed been that long. The effort picked up steam a few weeks ago, when we pushed out nddtune, which is admittedly a very simple tool. Today we’re continuing that effort with a couple of more significant tools: Zettabee and Theia.
A Little History
About four years ago, we had a very real need for fairly detailed performance metrics for our NetApp filers. At the time, the available solutions relied on SNMP (NetApp’s SNMP support has historically been weak) or were NetApp’s own, which, aside from being expensive, were hard to integrate with the rest of our monitoring infrastructure (which consists of Nagios and Zenoss). As such, we set out to write a tool that would both perform detailed filer monitoring (for faults and performance) and interface with those systems. Theia was born.
More recently, as we were looking at beefing up our DR strategy, we found ourselves needing a good ZFS-based replication tool. We set out to write Zettabee, which gave us an opportunity to dive deeper into ZFS’s capabilities.
Let the Games Begin
Today we’re very excited to be releasing those two tools into the open. Theia has been in production for the last four years, dutifully keeping an eye on our filers, while Zettabee has been pushing bits long-distance for well over nine months. We are working on putting together a roadmap for future work, but are happy to have them out in the open for further collaboration. Tim has written a good post on some of the work he has done to make this happen, and I am grateful for his help on this endeavor.
Fact: I’m not a developer. That’s magic I leave to people far (far) more talented than myself. Fact: I do write code relatively often. It is not a constant activity for a variety of reasons: the weather, fires at work, my camera, Saturday nights out and about, whatever. The net result of this stop-and-go is that I don’t always keep up with new language features, and Python’s logging module is a good example of this sometimes embarrassing reality. This module has been available in the standard Python distribution since version 2.3, but it was only recently that I started using it. It rocks.
Ninety percent of the code I write is sysadmin-related: task automation, log grokking and processing, monitoring widgets, reporting tools, management gizmos and the like. A lot of these are long-running processes that deal with a fair number of interconnected (and possibly remote) moving parts. In a 24×7 shop, these tools have to interact with the real world, which implies strapping them to monitoring systems, the kind that will page the humans in those (hopefully) rare instances when things do go sideways and someone has to intervene to make it all better. The pace is relentless, so streamlining development is critical to keeping up and being able to maintain the code as things evolve.
Before discovering the logging module, I wrote a fair amount of code involving admittedly crude custom loggers and heavy usage of the syslog module, which is fine, actually, but far from optimal. The logging module does away with all of that, simplifying the code and allowing for a fair amount of control over how and where messages are sent. In addition to log files and syslog, the logging module offers a good variety of other handlers (SMTP, socket and datagram, to name a few). Yum. It forces a bit of thinking about the organization of messages (mapping conditions to severity levels and deciding what to do for each severity), but this is actually quite helpful. Yum yum.
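To make that concrete, here is a minimal sketch of the pattern: one logger, multiple handlers, each with its own severity threshold. The logger name, file paths, and messages are all illustrative; in a real deployment the second handler would more likely be a SysLogHandler or SMTPHandler, but a FileHandler stands in here so the example is self-contained.

```python
import logging

# One logger for the whole tool; let everything through at the logger
# level and filter per-handler instead.
log = logging.getLogger("mytool")  # hypothetical tool name
log.setLevel(logging.DEBUG)
fmt = logging.Formatter("%(levelname)s %(name)s: %(message)s")

# Everything from DEBUG up goes to a verbose local log.
debug_handler = logging.FileHandler("debug.log", mode="w")
debug_handler.setLevel(logging.DEBUG)
debug_handler.setFormatter(fmt)
log.addHandler(debug_handler)

# Only WARNING and above reach the channel that pages a human
# (swap in logging.handlers.SysLogHandler or SMTPHandler here).
alert_handler = logging.FileHandler("alerts.log", mode="w")
alert_handler.setLevel(logging.WARNING)
alert_handler.setFormatter(fmt)
log.addHandler(alert_handler)

log.debug("poll cycle complete")       # lands in debug.log only
log.warning("filer latency is high")   # lands in both files
```

The per-handler `setLevel` calls are where the "mapping conditions to severity levels" thinking pays off: once every message is tagged honestly, routing it to a file, syslog, or a pager is just configuration.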