Cave Documentation (was re: LRUD in Survex)

Olly Betts
Thu, 2 Aug 2001 12:36:26 +0100

This is veering sharply off-topic, but is perhaps a useful illustration of
the potential pitfalls of managing a long survey project...

On Wed, Aug 01, 2001 at 12:41:34PM +0100, Andy Waddington on Survey stuff wrote:
> Olly said
> > Note the dates "1976-1999". Merging a year's updates in by hand is a major
> > job and nobody's had time this year.
> Owing to the site being shipped onto the survex server and the structure
> being changed (or at least a lot of talk about the structure being changed)
> I no longer had a usable site image.

We moved the site to the survex server because chaos had erratic name
service (so it was frequently unreachable in practice) and it took forever to
get any sysadmin work done (e.g. updating the web server configuration).  As
far as I recall, nobody disagreed with this move.

We also put the web site in a version control system (CVS).  You (Andy)
complained you couldn't access this from RISC OS (despite the fact you have
at least one Linux machine and 2 flavours of MS Windows at home as well).  I
pointed you to a RISC OS CVS client - if it didn't work, you didn't tell

> The people who had "taken over" the site
> maintenance apparently didn't even have the time to explain the changes they
> had made so that I could restart maintenance, which begs the question as to
> how they imagined they would have the time to merge in this year's new work.

I believe you were copied in on the thread discussing restructuring, so you
should be aware what was being suggested.  And if not, ask!

As I understand it, the restructuring so far is fairly light.  Mostly just
putting it in a subdirectory and fixing up absolute links, and also fixing
up other non-working links.  Fetching the tree from CVS and looking at the
CVS log entries will tell you what has changed since it went into CVS (which
was the original site, possibly with obvious problems corrected).

> Wookey had a lot of stuff from me such as the log book typed up, the list of
> members updated and a few other changes, but these have presumably only made
> it as far as the chaos server, where the site was originally hosted.

This is the major reason why there hasn't been other updating - suddenly the
dead server was more up-to-date in some regards than the live server, but
the live server more current in other ways.

Re-merging two divergent branches such as these is slow and error prone, and
ideally needs to be done in one session.  So this stalled work on other
updates as we were reluctant to make the versions more divergent.

Mark has now merged the changes in (they were actually fairly self-contained
it turned out, which wasn't what we were expecting at all).

> Many
> other changes which I could have made didn't happen not only because the
> site had been taken over, but also because nobody took the fairly minimal
> amount of time needed to supply me with copies of the survey book, cave
> descriptions and the like from which to work.

Remember that the expedition is largely run by the students in the club
(though Wookey usually helps a lot, and I've been sucked in in a major way
this year), and they've usually only been around for a year or two.  So they
need prompting to do the non-obvious jobs (like sending photocopies of the
records to someone they've never met, and maybe never heard of).

> I rather resent the implications of that "nobody's had time this year".

You apologised for being otherwise occupied for much of last year - for
example, I sent you a stack of scanned entrance locations 18 months or so
ago, which never made it onto the site even before it moved.

It's understandable that you have other things to do, but without some sort
of version control, one person really needs to apply changes (otherwise you
get divergent versions) and so there's a bottleneck.  If people spend time
scanning photos, or typing in text and don't see it appearing on the site,
they'll be much less inclined to send more.  This is why we moved to CVS,
where people can update mostly independently, and any cock-ups can be backed
out or resolved as the history is always there.

Anyway, we'd assumed the lack of activity on your part was due to you still
being otherwise occupied.

> > A lot of this work could be automated
> My impression was that the site was taken over for the purpose of doing
> just that - but the result seems to have been that no automation has in
> fact occurred, and the manual task has been prevented. All the things like
> log book and cave descriptions need to get from handscrawled paper to typed
> data and that is not going to be automated until OCR gets a lot better !!
> If you don't have time, then please ensure that someone who does, gets the
> chance to do the work for you, rather than just bleating...

The site wasn't "taken over" - it moved location, and was put into CVS so
that multiple people could work on it, without relying on one person to
apply changes.  The intention was that you'd still be able to update it
as well.  As far as I know you can...

> On a more general note, this just illustrates the problem of having a major
> resource like the entire documentation of your 20+ years of expedition
> maintained by a group on diverse machines in widely separated locations.

And the problems of lack of communication...

> The
> update has worked fine for many years when essentially a collaboration
> between two people doing the work (and typically exchanging a couple of
> megabytes of email a week during the active month or three), getting data
> and answers from others who didn't get the opportunity to do parallel
> updates.

And it stopped working when one of those people suddenly didn't have the
time to devote to it one year.  Which is why some change was needed.

> As soon as you widen the collaboration, it breaks down, unless some
> sort of CVS system is introduced. And even then, it makes it much harder if
> all those updates have to go back and forth over the net over a phone line
> constantly.

CVS sends only the changed files, in a highly compressed way (only changed
lines are sent).  That means it will result in much less traffic down your
phoneline than sending the files which have changed each time, and also
means you don't need to go through finding which files have changed.
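
To give a rough feel for the difference, here is a minimal sketch in Python.
It is not what CVS does internally (CVS has its own diff and compression
machinery); it just uses the standard difflib module to compare the size of a
whole file with the size of a patch holding only the changed lines.  The page
contents and the helper name changed_lines_patch are made up for illustration.

```python
import difflib


def changed_lines_patch(old_text: str, new_text: str) -> str:
    """Return a unified diff (no context lines) turning old_text into new_text."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines, "old", "new", n=0)
    )


# Hypothetical 200-line page in which a single line has been edited.
old_page = "".join(f"Line {i} of the logbook page\n" for i in range(200))
new_page = old_page.replace(
    "Line 57 of the logbook page\n", "Line 57, corrected after the trip\n"
)

patch = changed_lines_patch(old_page, new_page)
print(len(new_page), "bytes to send the whole file")
print(len(patch), "bytes to send only the changed lines")
```

The patch carries just the old and new versions of the edited line plus a
short header, so it stays small however long the page is.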

You don't need to be constantly connected - you connect when you want to
synchronise with the main server, and get any changes since you last
updated.  You will need to be connected while the changes are being sent,
but that's a quick operation.

> The moral: if you want to have expedition or cave exploration documentation
> kept up to date, ensure that someone who has the time to do the work gets
> all the paper needed to do so. If you plan an automated update system, pilot
> it on a small project or part of the main project before simply moving the
> documentation to a place where noone has the time or motivation to maintain
> it !!

Andy - as far as we knew (or at least I knew) you were happy with the move
to the new server, and the move to using CVS, and your lack of activity was
because you were otherwise occupied, rather than not being able to get at
the web site any longer.

If you've really been sitting twiddling your thumbs, frustrated at not having
copies of the log books, etc. and not being able to update the site, I'm very
sorry, but you really should have told someone.

On a related note, the dataset also moved into CVS at about the same time,
and has been slightly restructured and gradually updated with input from
Thilo and others.  So it can work...