Select Page

The unix pgrep utility is great, it lets you grep through your process list and find interesting things. I wanted to do something similar but for my entire server group so built something quick ontop of MCollective.

I am using the Ruby sys-proctable gem to do the hard work, it returns a massive amount of information about each process and have written a simple agent on top of this.

The agent supports grepping the process tree but also supports kill and pgre+kill though I have not yet implemented more than the basic grep on the command line. Frankly the grep+kill combination scares me and I might remove it. A simple grep slipup and you will kill all processes on all your machine 🙂 Sometimes too much power is too much and should just be avoided.

At the moment mc-pgrep outputs a set format but I intend to make that configurable on the command line, here’s a sample:

% mc-pgrep -C /dev_server/ ruby
 
 * [ ============================================================> ] 4 / 4
 
dev1.my.com
       root   9833  ruby /usr/sbin/mcollectived --pid=/var/run/mcollectived.pid 
       root  21608  /usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_pass
 
dev2.my.com
       root  14568  /usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_pass
       root  31595  ruby /usr/sbin/mcollectived --pid=/var/run/mcollectived.pid 
 
dev3.my.com
       root   1620  /usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_pass
       root  14093  ruby /usr/sbin/mcollectived --pid=/var/run/mcollectived.pid 
 
dev4.my.com
       root   3231  /usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_pass
       root  20557  ruby /usr/sbin/mcollectived --pid=/var/run/mcollectived.pid 
 
   ---- process list stats ----
        Matched hosts: 4
    Matched processes: 8
        Resident Size: 37.264KB
         Virtual Size: 629.578MB

You can also limit it to only find zombies with the -z option.

This has been quite interesting for me, if I limit the pgrep to “.” (the pattern is regex) every machine will send back a Sys::ProcTable hash for all its processes. This is a 50 to 70 KByte payload per server. I’ve so far seen no problem getting his much traffic through ActiveMQ + MCollective and processing it all in a very short time:

% time mc-pgrep -F "country=/uk|us/" .
 
   ---- process list stats ----
        Matched hosts: 20
    Matched processes: 1958
        Resident Size: 1.777MB
         Virtual Size: 60.072GB
 
mc-pgrep -F "country=/uk|us/" .  0.19s user 0.06s system 7% cpu 3.420 total

That 3.4 seconds is with a 2 second discovery overhead client machine in Germany and the filter matching UK and US machines – all the way to the West Coast – my biggest delay here is network and not MC or ActiveMQ.

The code can be found at my GitHub account and still a bit of a work in progress, wiki pages will follow once I am happy with it.

And as an aside, I am slowly migrating at least my code to GitHub if not wiki and ticketing. So far my Plugins have moved, MC will move soon too.