

So about a month ago I wrote about Zabbix. Well, it has been a month, and after getting the corporate firewall team in Hightstown, NJ to open up specific ports between the two data centers we got Zabbix rolled out to our production environments. So far so good: if I stop the Apache process for even 30 seconds I get paged on my phone. We have severity levels set up, so for example if a partition is 80% full it’ll email us as opposed to sending SMS messages to our phones. However, if a ping fails 3 times in a row then it goes nuts and hits everyone with an SMS.
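To make the severity idea concrete, here is a minimal sketch of that routing rule in Python. It is not how Zabbix is configured (that happens through triggers and actions in the web interface); the function name and constants are made up for illustration, and only the 80% and 3-failures thresholds come from the setup described above.

```python
# Hypothetical thresholds, taken from the setup described in the post.
DISK_WARN_PERCENT = 80   # a partition this full (or fuller) triggers an email
PING_FAIL_LIMIT = 3      # this many failed pings in a row triggers an SMS


def route_alert(disk_used_percent: float, consecutive_ping_failures: int) -> str:
    """Return "sms", "email", or "none" for the current readings.

    The more severe condition (host unreachable) wins over the milder one.
    """
    if consecutive_ping_failures >= PING_FAIL_LIMIT:
        return "sms"
    if disk_used_percent >= DISK_WARN_PERCENT:
        return "email"
    return "none"


if __name__ == "__main__":
    print(route_alert(85.0, 0))  # disk is filling up, host still answers
    print(route_alert(40.0, 3))  # host stopped answering pings
```

The point of checking the ping condition first is that a dead host should page people even if its last known disk reading was also in the warning range.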
The “Gooeyness” is useful to show management our uptimes and loads, plus it just looks purty. I love the CLI but I don’t mind a pretty GUI.

Note: This graph helped us actually discover a configuration problem as we were hitting our max connection limit (1500 per machine) frequently.
Edited: more screenshots.
Over the last few weeks I’ve been looking for a decent monitoring system that would monitor the 40-odd servers at work. Now anyone with even a small foothold in the open source world has heard of Nagios. Nagios is a very open-ended system that is very flexible but at the same time can be a behemoth to configure. Not to mention that it primarily depends on the SNMP protocol (yes, it supports others as well), and there are a ton of plugins to choose from but hardly any clear documentation, as everyone has *their* way of running Nagios.
So enter Zabbix. It is another open source product, and the company behind it also offers a support contract. It is easy to get support via the forums located on its website (Nagios does *not* have official support forums, but links to unofficial support forums) and the main product manager, Alexei, is very easy to contact. Not to mention that the product is stable, and after you get the interface down (which is both a blessing and a pain, considering how entirely segregated the “configuration” section is from the “monitoring” section) it’s very easy to configure and set up. On the client side I can either do A. SNMP or B. an agentd that runs on the client machine. I have chosen B for all our Linux machines (Fedora and Debian). It is easy to set up and it only requires two ports to be opened on the firewall (which makes it easy to get through the corporate red tape).
It does everything I need: monitor logs, check processes (like httpd/apache, (x)inetd, mysqld, oracle, jboss), monitor critical files for any alterations (via checksum), monitor network bandwidth (out and in), process load (1m, 5m, 15m), memory usage (swap & physical) and uptime. The agent hardly produces any additional load on the machine nor is it a memory hog.
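For anyone curious what checksum-based file monitoring boils down to, here is a rough Python sketch. It is only an illustration of the idea, not the Zabbix agent's implementation; the function names are hypothetical, and SHA-256 is an arbitrary choice of hash for the example.

```python
import hashlib


def file_checksum(path: str) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(baseline: dict) -> list:
    """Return the paths whose current checksum differs from the baseline.

    `baseline` maps each path to the checksum recorded earlier.
    A file that is missing or unreadable is reported as changed.
    """
    changed = []
    for path, recorded in baseline.items():
        try:
            current = file_checksum(path)
        except OSError:
            current = None  # missing or unreadable counts as altered
        if current != recorded:
            changed.append(path)
    return changed
```

A monitoring loop would record a baseline once, call `changed_files(baseline)` on a schedule, and raise an alert whenever the returned list is not empty.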
Currently it’s monitoring our internal machines (3 development, 1 proxy and itself) and it does a mighty fine job. Oh, and the load on the Zabbix server machine? Well, monitoring 5 machines is producing a 0.05 – 0.10 load. This is after upgrading the Debian box from the 2.4.27 kernel to 2.6.17… when it was running the 2.4.27 kernel its load was 0.8 – 1.0. The thing that takes up the most load is mysqld (MySQL 5.0.24), which isn’t too surprising since it does approximately 30,000 queries every hour for the 5 machines currently being monitored.
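To put that query count in perspective, here is a quick back-of-the-envelope calculation in Python. The 30,000 queries per hour and the 5 hosts come from the numbers above; the projection to 40 hosts assumes the query load grows linearly with the number of monitored machines, which is an assumption and not something that was measured.

```python
def query_rates(queries_per_hour: float, hosts: int) -> tuple:
    """Return (queries per second overall, queries per minute per host)."""
    per_second = queries_per_hour / 3600
    per_host_per_minute = queries_per_hour / hosts / 60
    return per_second, per_host_per_minute


if __name__ == "__main__":
    per_second, per_host_per_minute = query_rates(30_000, 5)
    print(f"{per_second:.1f} queries/sec overall")             # 8.3
    print(f"{per_host_per_minute:.0f} queries/min per host")   # 100

    # Assumption: load scales linearly with host count (not measured).
    projected_per_hour = 30_000 / 5 * 40
    print(f"{projected_per_hour / 3600:.1f} queries/sec at 40 hosts")  # 66.7
```

So the current setup works out to roughly 8 queries per second, or about 100 queries per minute for each monitored machine.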

This month looks to be a great month.
1. My former employer has finally coughed up my final pay check. Took ‘em a while but L&I forced their hand.
2. New car somewhere in Week 2 or 3 of September (95% likelihood of this being a 2007 Scion tC)

3. Going out to Blue Lake tomorrow
4. Cammy starting a new job at Whole Foods
5. and probably more
Upgraded from WordPress 2.0.1 to 2.0.3, nothing broke!!
Also I finally completed my external bootable hard drive. I now have
OS X 10.4.6 non-server bootable installation
OS X 10.4.0 server bootable installation
OS X 10.4.6 non-server bootable desktop
Misc / Utilities (like data rescue, and various system utilities)
All partitioned on an external FireWire drive. So in case of failure I can carry the drive, power cord and FireWire cable to a dying Mac and attempt to salvage what’s left [of it].
In the last week (from last Sunday the 21st) to this Sunday I worked a total of 89.5 hours. That isn’t including commute or anything. In part so I could help cover the holiday weekend and be a good member for the “team”. While the extra OT is nice, the thanks I got was basically a bunch of attitude today and an attempted, but failed, smack down from Shawn.
So let’s see if I get this right. I work 89.5 hours, including Sunday & Memorial Day, only because Shawn is absolutely horrible at scheduling, and the thanks I get in return is some effort to belittle me in front of everyone.
Wow… what a guy.
It isn’t every day that I come across a posting where I think to myself, wow, that is nicely written. Recently Alexander Kjerulf of positivesharing.com wrote about how not to lead geeks. I have seen articles like this before, but what sets this one apart is that it is written clearly and it’s not a ‘rant’; rather, it is an open letter to managers of geeks worldwide. Personally I’d like my “fearless leader” to work on #8, #6, & #4 (that’s in proper order, as #8 is truly bad at times with #6 trailing in a very close 2nd).
Here is the read up….
http://positivesharing.com/2006/03/how-not-to-lead-geeks/
This is the right way to do it. I have tried it now on 4 different boxes, 5 different versions (ranging from 4.9 to 6.1-prerelease).
After you cvsup the kernel…
cd /usr/src
make buildworld
make buildkernel KERNCONF=(namehere)
make installkernel KERNCONF=(namehere)
mergemaster -p
make installworld
mergemaster
reboot
You can leave out KERNCONF if you just want to use the generic kernel (which is a good idea if you’re upgrading to a new major version number, say 5.4 to 6.0).
I guess I finally woke up and feel like most people in the world. Here I thought I had found something good, something I could hold on to. Nope… I was wrong. As of now… I have significantly less motivation in doing my job. Believe me, I had a lot of it… but basically it feels like someone attempted to rip off my nads and then stomp on them… for something I didn’t even know I was doing wrong.
I really hope I feel better by Saturday.
Yeah so this is where I spend at least 40 hours a week…
Well, Christmas is almost here. Only days away. Today my 6 1/2 day vacation (half of it is paid as well) started. I went to bed at 9am-ish without any problems, but when I woke up at 3pm I woke up with a damn cold. Since then it has worsened as well. First it was just dizziness and an unwillingness to do anything; now I am dripping and sneezing like crazy. I know for sure I got this from Cam, and her cold only lasted 2 days, so hopefully with plenty of sleep and enough relaxation (which I was planning to do anyway) this curse, I mean cold, will be gone from my system by Christmas.
Only three more gifts need to arrive. Today the mailbox was completely empty so I am hoping that these gifts will arrive before the 25th. Thankfully one has a tracking number and is scheduled for delivery tomorrow, but the other gifts are shipped by the unreliable USPS media mail (where it can take from 3 – 30 days and it doesn’t even depend on location, trust me on that one).
Last thing. Last week a colleague (Matt) came to work with a lappie that had Ubuntu installed on it. Well, last weekend I converted over (having tried Fedora, Mandrake and SUSE) and I found myself hooked. Matt is right, this is comfortable for anyone to use. Ubuntu (http://www.ubuntulinux.org) is based on the conservative but very stable Debian distribution. I can honestly say that it is very stable. The only thing I had to do was update the default ipw2200 driver & firmware to a higher version. The defaults just wouldn’t talk to my Linksys WRT54G router. The only thing I hope they will add in the next release is WPA support. There is a way to get it in there but it’s messy. Right now I have had to reduce my wireless router settings from WPA to WEP. I’m not very happy about that, but I don’t think I am in any significant danger, as I highly doubt that anyone around here hacks wifi networks for a living, and if they do, my laptop can pick up 3 other wifi networks (using default shipped names & default router passwords).













