HA script for ColdFusion MX servers
Ed R Zahurak <ezahurak(at)atlanticbb.net>
2006-06-24 00:00:17
Hi folks,

If anyone's interested, here's a high-availability daemon for 
ColdFusionMX servers.  Run this perl script as a service on your CFMX 
servers, and specify the HA port in your backend directives.  The script
will treat a CFMX service with a high number of queued requests as 
"downed" until the server catches up.

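For readers wiring this up, a backend entry that points Pound at the HA port might look roughly like the fragment below. This is a sketch assuming Pound 2.x configuration syntax; the addresses and the backend port are made-up values, and only the HA port (10080) matches the script's default.

```
ListenHTTP
    Address 0.0.0.0
    Port    80

    Service
        BackEnd
            # hypothetical CFMX server address and port
            Address 192.168.0.11
            Port    8500
            # port the hams.pl listener answers on (script default)
            HAport  10080
        End
    End
End
```

With this in place, Pound should treat the backend as dead whenever a connection to the HA port fails, which is what the script arranges by dropping its listener.
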
It's bad perl, I know.  Not pretty, but functional.  Apologies if the 
word wrap horks the script as well.

Ed Z.

-----------
#
# hams.pl
#
# CFMX High Availability Monitor Server
# v 1.0
# ezahurak(at)atlanticbb.net
#
# used in conjunction with "pound" load balancer to
# monitor CF usage and "down" server (as far as pound
# is concerned) when CF exceeds particular running and
# queued request thresholds.
#

use IO::Socket::INET;
use POSIX qw(strftime);
use strict;

my $logfile = "c:\\hams.log";
my $cfstat = "c:\\CFusionMX\\runtime\\jre\\bin\\java -cp c:\\CFusionMX\\lib\\cfusion.jar coldfusion.tools.CfstatMain";
my $true = ( 1 == 1 );
my $false = !($true);

# KILL and STOP cannot be trapped, so no handlers are installed for them.
$SIG{INT} =  sub { &SIGexit('INT') };
$SIG{TERM} = sub { &SIGexit('TERM') };
$SIG{ABRT} = sub { &SIGexit('ABRT') };
$SIG{QUIT} = sub { &SIGexit('QUIT') };

# defaults, change these if not supplied on command line.
my $default_port = 10080;
my $default_running = 6;
my $default_queued = 6;
my $default_downtime = 3;
my $default_livepoll = 3;

# running variables
my $port = $default_port;
my $running = $default_running;
my $queued = $default_queued;
my $livepoll = $default_livepoll;
my $downtime = $default_downtime;
my $laststate = $true;
my $currentstate = $true;

&LogEntry("HAMS Started on port $port [$running/$queued]", $true);

my $listener = IO::Socket::INET->new('LocalPort' => $port,
				   'Proto' => 'tcp',
				   'Listen' => SOMAXCONN)
     or die "Can't create socket ($!)\n";

while ( 1 == 1)
{
         if (can_accept())
         {
             vec (my $r, fileno ($listener), 1) = 1;
             close(my $client = $listener->accept)
                 if select ($r, undef, undef, $livepoll) > 0;
         }
         else
         {
             sleep $downtime;
         }
}


sub can_accept
{
         # keep (or re-create) the listener while CF is healthy, drop it otherwise
         &CFIsOkay ? (defined $listener
                      or $listener = IO::Socket::INET->new('LocalPort' => $port,
                                                           'Proto' => 'tcp',
                                                           'Listen' => SOMAXCONN))
                   : (undef $listener);
}

sub CFIsOkay()
{
	open CFSTAT, "$cfstat|";
	my @statresult = <CFSTAT>;
	my $state = $false;
	my $check_running;
	my $check_queue;
	my $rqueue = "-";
	my $rrunning = "-";
#	print $#statresult + 1, "lines read\n";
#	print join( "", @statresult );
	if ( $#statresult == 3 )
	{
#		print "statline:", $statresult[3];
		my @stats = split( /\s+/, $statresult[3] );
#		print "stats:", join( ":", @stats ), "\n";
		$rqueue = $stats[6] + 0;
		$rrunning = $stats[7] + 0;
	
		if ( $currentstate )
		{
			$check_running = $running;
			$check_queue = $queued
		}
		else
		{	$check_running = 1;
			$check_queue = 1;
		}
	
#	print "cfmx: running = $rrunning($running)[$check_running] queued = $rqueue($queued)[$check_queue]\n";
	if ( ( $currentstate && $rrunning >= $check_running && $rqueue > $check_queue ) ||
	     ( !($currentstate) && $rrunning > $check_running ) )
	{
		&LogEntry("CFMX Down: Load [$rrunning/$rqueue]", $true);
		undef $listener;
		$state = $false;
	}
	else
	{
		&LogEntry("CFMX UP: Load [$rrunning/$rqueue]", $true) if !( $currentstate );
		$state = $true;
	}
	}
	else
	{
		print "CF IS DOWN: SERVICE\n";
	        &LogEntry("CFMX Down: Service [-/-]", $true);
		undef $listener;
		$state = $false;
	}
	
	if ( $state != $currentstate )
	{
		$laststate = $currentstate;
		$currentstate = $state;
		&LogEntry("State Transition : Load [$rrunning/$rqueue]");
		
	}
	
	return $state;
}

sub SIGexit()
{
	my $reason = shift;
	&LogEntry("HAMS Terminated by $reason signal.", $true);
	exit;
}

sub LogEntry()
{
	my $msg = shift;
	my $screen = shift || $false;
	my $now_string = strftime "%a %b %e %H:%M:%S %Y", localtime(time);
	open LOGFILE, ">> $logfile";
	print LOGFILE "[", $now_string, "] ", $msg, "\n";
	print "[", $now_string, "] ", $msg, "\n" if $screen;
	close LOGFILE;
}
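
The up/down decision inside CFIsOkay is a form of hysteresis: a server that is up is taken down only when both limits are hit, and a server that is down comes back only once it has nearly drained. A minimal sketch of just that rule, pulled out as a standalone function, is shown below. The name next_state and the sample load figures are made up for illustration; the comparisons mirror the ones in hams.pl.

```perl
use strict;
use warnings;

# Decide the next up/down state from the current state and the load figures.
# $limit_running and $limit_queued play the role of $running and $queued
# in hams.pl; the function name is hypothetical.
sub next_state {
    my ($is_up, $n_running, $n_queued, $limit_running, $limit_queued) = @_;

    if ($is_up) {
        # An "up" server goes down only when both limits are hit.
        return ($n_running >= $limit_running && $n_queued > $limit_queued) ? 0 : 1;
    }

    # A "down" server stays down until at most one request is running.
    return ($n_running > 1) ? 0 : 1;
}

# Walk a made-up load trace through the rule, using the script's defaults (6/6).
my $up = 1;
for my $sample ([2, 0], [6, 7], [4, 0], [1, 0]) {
    my ($r, $q) = @$sample;
    $up = next_state($up, $r, $q, 6, 6);
    printf "running=%d queued=%d -> %s\n", $r, $q, $up ? "up" : "down";
}
```

In this trace the third sample (4 running, 0 queued) is below both limits, yet the server stays down because it has not yet drained to a single running request.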

Re: [Pound Mailing List] HA script for ColdFusion MX servers
Ted Dunning <tdunning(at)veoh.com>
2006-06-24 02:11:24
This raises the question of whether having a server marked as "down" by 
an HAPort non-response will lose all sticky sessions.  In many cases, 
what would be nicest would be to be able to mark a server as "busy" in 
which no new sessions would be assigned to it, but all existing sessions 
would be kept on the box.  At a higher load, the box could be marked as 
"down" so that load would be shed to other servers.

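The three-level idea described above can be sketched as a small classifier. This is only an illustration of the proposal: Pound itself has no "busy" state, and the function names and thresholds here are hypothetical.

```perl
use strict;
use warnings;

# Map a queue depth to one of three states. Both thresholds are made-up
# values for illustration; "busy" is not something Pound acts on.
sub classify_load {
    my ($n_queued, $busy_at, $down_at) = @_;
    return 'down' if $n_queued >= $down_at;
    return 'busy' if $n_queued >= $busy_at;
    return 'ok';
}

# Under the proposal, a "busy" server keeps its existing sessions
# but is not handed new ones; a "down" server gets nothing.
sub accepts {
    my ($state, $has_session) = @_;
    return 0 if $state eq 'down';
    return 1 if $state eq 'ok';
    return $has_session ? 1 : 0;
}

for my $q (0, 6, 20) {
    my $state = classify_load($q, 6, 12);
    printf "queued=%2d state=%-4s new=%d existing=%d\n",
        $q, $state, accepts($state, 0), accepts($state, 1);
}
```
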

Re: [Pound Mailing List] HA script for ColdFusion MX servers
Ed R Zahurak <ezahurak(at)atlanticbb.net>
2006-06-24 04:56:14
Ted Dunning wrote:

I'll leave that to you to figure out. ;)

CF offers the ability to store session variables in cookies or in a database,
which lets you handle the same session across multiple machines, even if
one dies completely, is shut down, rebooted, thrown out a window and so on.

My architecture here was specifically designed to allow requests to be served
round-robin from any server in the pool.  I tend to just not use session
variables -- minimal state information (typically just auth info) is kept in a
cookie, and that's about it.  It comes in handy for those times when I've had
to throw a server out a window. :)

[The reason I've written this script, mainly, is because pound every once in a
while decides to "randomly" send anywhere from 20 to 100+ requests in a row all
to the same server, even when the others are up and running just fine.  I
suspect it's just pound's inner monkey trying to write Macbeth or something. 
This script is *very* effective at minimizing the situation when it occurs --
thankfully, the monkey still knows to check the HA status now and then.]

The trouble with what you propose is that, while the server is marked "busy,"
nothing happens.  Your user just sits there -- if you're lucky, he's content
watching the spinny browser "loading" icon and not clicking the link or the
submit button in rapid succession, eating up pound's resources waiting for
something that might or might not happen.  Then pound's going to need extra
complexity to decide "busy" or "dead", and to determine when to write the
server off completely if it's been "busy" for three days, etc...  I'm a much
bigger fan of keeping things simple, and there are certainly other places in
your architecture where session state can be better handled, or at the very
least there *should* be.

Ed Z.

Re: [Pound Mailing List] HA script for ColdFusion MX servers
Ted Dunning <tdunning(at)veoh.com>
2006-06-25 00:00:44
Hmmm... I think you misunderstand what I suggest.

In particular, I find that abrupt and large changes in loading can cause 
oscillation in server loads.  Making more moderate changes can avoid 
these problems.  This is the motive for soft-starting a server.
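
One simple way to picture soft-starting is a linear ramp: a server that has just come back gets a growing fraction of its normal share of traffic rather than all of it at once. The sketch below assumes a linear ramp and a hypothetical function name; nothing in the thread says how such a ramp would be implemented in Pound.

```perl
use strict;
use warnings;

# Fraction of its normal share a server should receive $elapsed seconds
# after coming back up, ramping linearly over $ramp seconds.
# The linear shape and the function name are assumptions for illustration.
sub soft_start_share {
    my ($elapsed, $ramp) = @_;
    return 1 if $ramp <= 0 || $elapsed >= $ramp;
    return 0 if $elapsed <= 0;
    return $elapsed / $ramp;
}

printf "t=%2ds share=%.2f\n", $_, soft_start_share($_, 30) for (0, 10, 20, 30, 45);
```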

