[olug] Data::Diff rulez

Jay Hannah jay at jays.net
Fri Jan 14 20:13:20 UTC 2005


Wow.

Hey programmers: How many times have you had to lock-step between two 
different variable/object structures to find the differences (or 
similarities)? How many weird complex nested data structures have you 
tried to compare against each other using gobs of your own buggy code?

...  what? I'm the only one? Oh well, I'll continue anyway.

Hey sys admins: How many times have you looked at "ps -ef" output, 
waited a few seconds, then looked at "ps -ef" output again to see what 
changed? Ever run "diff" on two files?

Ever want to see what changed in your "ps -ef" output every few seconds 
w/o having to work for it?

I wrote the script below this week. If you run it with no parameters it 
runs "ps -ef" every 60 seconds (change that to however many seconds you 
want) and kicks out the differences from the last "ps -ef" run. So let 
the program run, watch your screen, and every 60 seconds it reports 
every program that just started and every program that just finished. 
It ignores the other 800 processes running on the server that haven't 
changed.

Neat, huh? ... what? I'm the only one that thinks so? Oh well, I'll 
continue anyway.

Or, perhaps you want to see the differences between what "ps -ef" 
reported at 2pm today vs. 8am this morning? Set up cron to dump "ps 
-ef" to a file every minute or hour or whatever, then just hand my 
script the two filenames as arguments and it'll tell you what changed.

Neat.

The BEAUTIFUL thing is all the code I didn't have to write to make the 
thing work. Data::Diff takes any two nested complex data structures and 
tells you all the differences (wow, I didn't have to write my own 
lock-step crap?!? thank heavens!!). As you probably know you can go 
totally nuts w/ complex nested data structures in Perl. No matter how 
ugly, if you've got two of 'em, Data::Diff can tell you the 
differences. Sampling has never been lazier.

    http://search.cpan.org/~gcampbell/Data-Diff-0.01/Diff.pm
    (Thanks George!!)
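
To make the idea concrete for the flat case (one hash of command => count per "ps -ef" run), here is a minimal hand-rolled sketch. It is NOT Data::Diff's code and it doesn't handle nesting; flat_diff is a made-up name, and the three buckets (diff, uniq_a, uniq_b) are simply modeled on what the script below reads out of the Data::Diff object.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Diff): compare two flat hashes of
# counts and sort the keys into three buckets: diff, uniq_a, uniq_b.
sub flat_diff {
    my ($a_ref, $b_ref) = @_;
    my (%diff, %uniq_a, %uniq_b);

    foreach my $key (keys %$a_ref) {
        if (!exists $b_ref->{$key}) {
            $uniq_a{$key} = $a_ref->{$key};      # only in the first hash
        }
        elsif ($a_ref->{$key} != $b_ref->{$key}) {
            $diff{$key} = {                      # in both, but the count changed
                diff_a => $a_ref->{$key},
                diff_b => $b_ref->{$key},
            };
        }
    }
    foreach my $key (keys %$b_ref) {
        next if exists $a_ref->{$key};
        $uniq_b{$key} = $b_ref->{$key};          # only in the second hash
    }
    return { diff => \%diff, uniq_a => \%uniq_a, uniq_b => \%uniq_b };
}

# Two made-up snapshots, shaped like what memorize() returns.
my $old = { '-bash' => 16, '/usr/sbin/syslogd' => 1, '[prcavai.]' => 1 };
my $new = { '-bash' => 15, '/usr/sbin/syslogd' => 1,
            '/usr/bin/perl /oexec/__obin/bkpgmex.pl' => 1 };
my $out = flat_diff($old, $new);

foreach my $key (sort keys %{ $out->{diff} }) {
    printf "%4s -> %4s   %s\n",
        $out->{diff}{$key}{diff_a}, $out->{diff}{$key}{diff_b}, $key;
}
foreach my $key (sort keys %{ $out->{uniq_a} }) {
    printf "%4s ->    0   %s\n", $out->{uniq_a}{$key}, $key;
}
foreach my $key (sort keys %{ $out->{uniq_b} }) {
    printf "   0 -> %4s   %s\n", $out->{uniq_b}{$key}, $key;
}
```

That's the lock-step code for the easy case only. Once the values can themselves be hashes or arrays, the bookkeeping grows fast, which is the part a module saves you from writing.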

j
(my script could be a lot prettier -- it's just a hack)
Yes, I realize there are probably other ps diff'ing utils out there, but I 
couldn't find 'em w/ 5 mins on Google...


Begin forwarded message:
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> use Data::Diff;
> use Date::Calc qw( Today_and_Now );
> use FileHandle;
> STDOUT->autoflush();
>
> if ($ARGV[0] and $ARGV[1]) {
>    my $first  = memorize($ARGV[0]);
>    my $second = memorize($ARGV[1]);
>    my $diff = Data::Diff->new($first, $second);
>    print_diff($diff);
>    exit;
> }
>
> print "Ok. Running forever, reporting in every 60 sec.\n";
>
> my $old = memorize();
> while (sleep 60) {
>    printf("%04d-%02d-%02d %02d:%02d:%02d\n", Today_and_Now());
>    my $new = memorize();
>    my $diff = Data::Diff->new($old, $new);
>    print_diff($diff);
>    $old = $new;
> }
>
>
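> sub memorize {
>    # Memorize a list of processes: read a saved "ps -ef" file if one
>    # was given, otherwise run "ps -ef" right now.
>    my ($file) = @_;
>
>    my $in;
>    if ($file) {
>       open($in, '<', $file) or die "Can't read $file: $!";
>    } else {
>       open($in, '-|', 'ps', '-ef') or die "Can't run ps -ef: $!";
>    }
>
>    my %ret;
>    while (<$in>) {
>       chomp;
>       s/.*\d+:\d\d //;   # drop everything up through the TIME column
>       s/ +$//;           # drop trailing spaces
>       $ret{$_}++;        # count identical command lines
>    }
>    close($in);
>    return \%ret;
> }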
>
> sub print_diff {
>    my ($diff_obj) = @_;
>
>    my (%diff, %uniq_a, %uniq_b);
>    if ($diff_obj->{out}->{diff}) {
>       %diff   = %{$diff_obj->{out}->{diff}};
>    }
>    if ($diff_obj->{out}->{uniq_a}) {
>       %uniq_a = %{$diff_obj->{out}->{uniq_a}};
>    }
>    if ($diff_obj->{out}->{uniq_b}) {
>       %uniq_b = %{$diff_obj->{out}->{uniq_b}};
>    }
>
>    foreach (keys %diff) {
>       printf(
>          "%4s -> %4s   %s\n",
>          $diff{$_}->{diff_a},
>          $diff{$_}->{diff_b},
>          $_
>       );
>    }
>    foreach (keys %uniq_a) {
>       printf(
>          "%4s ->    0   %s\n",
>          $uniq_a{$_},
>          $_
>       );
>    }
>    foreach (keys %uniq_b) {
>       printf(
>          "   0 -> %4s   %s\n",
>          $uniq_b{$_},
>          $_
>       );
>    }
>
> }
>
>
> exit;
>
>
> __END__
>
> ps -ef looks like this on my system:
>     root   5486   9848   0 02:30:59      -  0:00 /usr/sbin/syslogd
>     root   5688      1   0 02:29:25      -  0:00 /usr/lib/methods/ssa_daemon -l ssa0
>     root   5992   9848   0 02:31:03      -  0:01 sendmail: accepting connections
>
> the guts of Data::Diff objects look like this:
>    'out' => HASH(0x304a781c)
>       'diff' => HASH(0x304ae960)
>          '-bash' => HASH(0x304ae918)
>             'diff_a' => 16
>             'diff_b' => 15
>             'type' => ''
>
>       'uniq_a' => HASH(0x304ac480)
>          '       - 106798      -   -                     - <exiting>' => 1
>          '/oexec/__obin/allres.4ge dreeves' => 1
>          '/oexec/__obin/rcvlbmn.4ge CHIDTN' => 1
>          '/oexec/__obin/rcvlbmn.4ge NYCBER' => 1
>          '[prcavai.]' => 1
>          'sshd: dthacker@pts/29' => 1
>       'uniq_b' => HASH(0x304c0264)
>          '/oexec/__obin/allres.4ge ccasarru' => 1
>          '/oexec/__obin/rcvlbmn.4ge CRPBFT' => 1
>          '/usr/bin/perl /home/jhannah/src/jhannah/rrd/omni-res.pl' => 1
>          '/usr/bin/perl /oexec/__obin/bkpgmex.pl' => 1
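
For anyone squinting at the two substitutions in memorize(), here is a small standalone sketch of what they do to the sample "ps -ef" lines above. The sub name command_of is made up for illustration; the two regexes are the ones from the script.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce one "ps -ef" line to just its command text,
# using the same two substitutions as memorize().
sub command_of {
    my ($line) = @_;
    chomp $line;
    $line =~ s/.*\d+:\d\d //;   # greedy: drop everything through the last "N:NN "
    $line =~ s/ +$//;           # drop trailing spaces
    return $line;
}

my @sample = (
    '    root   5486   9848   0 02:30:59      -  0:00 /usr/sbin/syslogd',
    '    root   5688      1   0 02:29:25      -  0:00 /usr/lib/methods/ssa_daemon -l ssa0',
    '    root   5992   9848   0 02:31:03      -  0:01 sendmail: accepting connections',
);
print command_of($_), "\n" foreach @sample;
```

Because the ".*" is greedy, the match runs through the last thing that looks like "N:NN " on the line, which is normally the TIME column. The flip side: a command whose own arguments contain something like "12:34 " would get trimmed too far, so treat this as a quick hack rather than a real ps parser.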