[olug] Question: package / system for administrating multiple very similar systems ?

Sat Apr 21 02:25:58 UTC 2007

On 4/20/07, Sean Kelly <smkelly at zombie.org> wrote:
>
> On Fri, Apr 20, 2007 at 07:59:11PM -0500, Will Langford wrote:
> ...
> > Is there a pre-built 'update' system that will fetch new updates from a
> > central source and apply them as they're discovered (at some set interval
> > from cron or something)?  The updates will all be custom built by us.
>
> Oh boy. You've stepped into something you can never escape from now. Your
> question will now hurl you into the complicated world of systems
> infrastructure.

Yeah, I was fearing something like this.  Our systems are pretty close to
identical and the updates we need currently are fairly peripheral in
nature.... upgrades to our own toolsets and stuff.  Nothing as drastic as
needing to keep tabs on the /etc tree remotely or anything like that.

> First up, I'd recommend checking out http://www.infrastructures.org
> That site will give a good overview of the things you need to be thinking
> of when you start to manage many servers and want to have a standard way of
> managing them. This is a project I began to undertake a little over two
> years ago, but set aside in order to bring up a VMware Infrastructure
> environment. Shortly I'll be returning to this sort of thing.

I'll do so over the weekend probably.  I've not had much luck with finding a
condensed site relating to the subject.... much like searching for a web
CMS.... hard to find a decent place to start.

> > Our remote systems are running RH 9.0 with many modifications to suit our
> > needs... so an rpm based solution would be fine.  Yes, RH9 - they've been in
> > service for that long... and I believe one of our systems has a 400+ day
> > uptime currently :).
>
> Uptimes like that aren't always a good thing. It means you've got an
> outdated kernel that could use some patching.

Actually, we rarely add new functionality to our base system.  The systems
are fairly static, run on hardware that (usually) follows a set of
guidelines, and generally live in the realm of 'if it ain't broke, don't fix
it'.  The servers are in a closed environment and service closed clients that
aren't really user based (embedded world).  Hell, nearly all of our systems
run postgresql v8.0-rc3 because it provided some functionality we needed when
designing the system, and we have never had a reason to upgrade to a newer
postgres.  Still along the lines of 'ain't broke, don't fix'.  For regular
desktops or systems that have clients of unknown/unpredictable origin, I'd
fully agree with keeping a system absolutely up to date... but it's fairly
unnecessary for us.

> There really aren't that many simple tools. There are a few tools I can
> point you at to start looking at:

<...snip list of tools...>

I'll give these a read over as well.  A lot of the tools I looked at for
management were too aggressive/ambitious and provided far more control
than we desire.

> I've also seen other tools that I can't really comment much on. I've seen
> some of how companies like FedEx and Yahoo! do it internally. Their models
> are much simpler and more similar to what you describe above. Yahoo!'s
> tool, called yinst, is a very cool tool that sits on top of FreeBSD's
> package management and allows them to deploy a system in minutes. I wish
> they'd release it, or even talk about it. All I have is its manpage...

That does sound ideal.  I'm very much a fan of 'keep it simple stupid'.
Which is where something along the lines of 'up2date' comes in: if I can
tell it to 'install all new packages found' from a custom source.... it'd
be... interesting.
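
To make the 'install all new packages found' idea concrete, here is a minimal
sketch in Python. It is not taken from up2date, yinst, or any tool mentioned
in this thread. The names new_packages and apply_updates, the package file
names, and the idea of tracking already-applied packages in a plain set are
all assumptions made for illustration; how the list of available packages is
fetched from the central source is deliberately left out.

```python
from typing import Callable, Iterable, List, Set


def new_packages(available: Iterable[str], applied: Set[str]) -> List[str]:
    """Return offered package file names not yet applied, in sorted order."""
    return sorted(set(available) - applied)


def apply_updates(
    available: Iterable[str],
    applied: Set[str],
    install: Callable[[str], bool],
) -> List[str]:
    """Install every new package and record each success in `applied`.

    `install` is a caller-supplied (hypothetical) function that returns
    True on success. Processing stops at the first failure so that later
    packages are not applied on top of a half-updated system.
    """
    done: List[str] = []
    for name in new_packages(available, applied):
        if not install(name):
            break
        applied.add(name)
        done.append(name)
    return done


if __name__ == "__main__":
    # Hypothetical file names; a real run would read them from the
    # central source and persist `applied` between cron invocations.
    offered = ["tools-1.0-2.rpm", "tools-1.0-1.rpm", "agent-0.3-1.rpm"]
    already = {"tools-1.0-1.rpm"}
    installed = apply_updates(offered, already, install=lambda name: True)
    print(installed)
```

In a real deployment the install callback might shell out to something like
`rpm -Uvh` and the applied set would live in a file on each box, but those
details depend on the environment and are not specified anywhere above.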

> > I've been a command line dweller for about 10 years, so I should have
> > adequate experience to set anything up, but if it requires more than 2 hours
> > of my time to get a skeleton up, it's out of the question.
>
> Any good solution will require far more than two hours of time. You have to
> look at it in the long term. You'll be designing a new way to manage your
> machines that allows you to scale from your few servers up to hundreds. Any
> solution that can be deployed in less than two hours will eventually break
> and leave you sitting there trying to figure out how to glue the two pieces
> back together.

A year ago it looked like we were going to have a split in our server
base... with the possible requirement of running a newer distro on some
hardware that wasn't compatible with our existing RH9 distro.  After a lot of
playing with PXE boot-rescue-attempts and newer kernels and such, I
actually managed to get our old ghost-image running on the newer hardware
with a modified kernel/initrd/modules. Rest of the system remains the same,
just different kernel.  Very handy.

Recently we've added a SCSI RAID with BBWC (battery-backed write cache) to
some servers with particularly high load... once again new kernel goodness
but the rest is the same.  Seems to be a very workable system.

In general, we have to be VERY careful with what we innovate/experiment with
on these live systems.  5 minutes of downtime a year would be too much.  So
we tend to stick with a static setup of known working situations.  Although
I'd rather not say it, changes need to be 'bug compatible' :).

> > A few of the automatic systems I only gave a passing thought to would be
> > up2date and yast... both of which seem like they might be a bear to
> > configure... or well beyond our needs.  I could be wrong though ?
>
> So, you mean like design your own RPMs and deploy them with up2date or
> yast? That is one way to do it, and is very similar to what FedEx did last
> I saw it. I can see some problems with it, but it seemed to work for them.
> They didn't even use RPMs, though.

If the package doesn't understand the system it's being run on, it could
lead to fubarness, or if the package makes a mistake... etc etc etc...

The reason this style of pre-built solution was my initial approach is that
I had originally debated coming up with a system that did the same with some
.tgz's (my roots are in slackware ... hard to shake hehe).

> I would recommend you try hitting up the config-mgmt and infrastructures
> mailing lists. There are some infrastructure veterans there that could help
> point you in a good direction based on your environment:
>

I really do appreciate the wealth of information you've provided.  I'll
return the favor by posting my findings / opinions / decision when I've
picked a solution.  I'll also relate my experiences with the choice as we
begin to use it.

I've tried to help out a few times before, but my particular Linux
experience doesn't fit in with corporate-linux-usage (no offense, but
spreadsheet jockey/exchange/etc), nor with toy-ish stuff like mythtv, etc.
About the only thing I can think of that I'd have experience with that might
be interesting to someone would be some PXE fun... or dabbling in foreign
hardware (ie: an Alpha I've been meaning to sell).

-Will