Backsights and default accuracy estimates
Olly Betts
olly@survex.com
Mon, 13 May 2002 16:23:43 +0100
On Fri, May 10, 2002 at 12:49:19PM +0100, M.J. Green wrote:
> Thinking about it, the way a surveying program should ideally run is to
> consider the probability distribution for each measurement taken, and do
> calculations based on this. Accurately calculating the probability
> distributions and doing calculations based on these is very hard and time
> consuming. Therefore, as the real probability distributions are unknown
> and are too hard to manipulate, they are at some point approximated to
> Gaussian.
Another reason to approximate to a Gaussian is that the distribution of a
combination of independent measurements will very quickly tend to a Gaussian
distribution, whatever the initial distributions (even a uniform
distribution, which looks very different to a bell curve). The Central Limit
Theorem tells us this. I wrote a short article in Compass Points about this:
http://www.chaos.org.uk/survex/cp/CP14/CPoint14.htm
Search the page for "uniform". There's a summary of an email discussion
on a similar topic to this one too.
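To see how quickly this happens, here's a quick Python sketch (purely
illustrative, not Survex code):

    import random
    import statistics

    # Sum n_terms independent uniform(-0.5, 0.5) "readings" and see how the
    # result compares with a Gaussian: about 68.3% of a Gaussian lies within
    # 1 sd of the mean, whereas a single uniform reading only has about
    # 57.7% within 1 sd.
    def summed_readings(n_terms, n_samples=100000):
        return [sum(random.uniform(-0.5, 0.5) for _ in range(n_terms))
                for _ in range(n_samples)]

    for n in (1, 2, 4, 8):
        samples = summed_readings(n)
        sd = statistics.pstdev(samples)
        within_1sd = sum(abs(x) <= sd for x in samples) / len(samples)
        print(n, round(within_1sd, 3))   # tends towards ~0.683 as n grows

Even with only a handful of terms the fraction is already close to the
Gaussian value.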
(Wadders: any chance of adding some `<a name="...">' tags next time you fettle
the on-line Compass Points? It would be nice to be able to link to an
individual article...)
> This leaves the problem of choosing what sigma (1 sd) is for the Gaussian.
> I believe the correct way to do this is to make a suitable approximation
> to the probability distribution, then calculate the standard deviation
> of that.
That sounds reasonable.
> This way the probability distributions have the meaning that
> was intended, rather than putting in a scale factor when choosing
> how large one sd is, then putting the inverse of it in when deciding when
> to complain.
That's a poor way to think about it - it makes it sound like a total bodge,
which it isn't. It's perhaps not the best possible model, but it's not
as arbitrary as you suggest.
Better to think of it as picking a confidence limit (e.g. 3 standard
deviations) above which you are dubious about the reading, and using it
consistently in the two places. It just happens that whatever value you use
cancels out.
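A toy sketch of that cancellation for a single reading (ignoring the fact
that sigma also weights the network adjustment):

    # Read "accurate to +/- 1 degree" as "1 degree = c standard deviations",
    # and also complain about any residual bigger than c standard deviations.
    # The complaint threshold in degrees is then c * (1/c) = 1 degree,
    # whatever confidence limit c you pick.
    for c in (2.0, 2.2, 3.0):
        sigma = 1.0 / c           # sd implied by the chosen confidence limit
        threshold_in_degrees = c * sigma
        print(c, threshold_in_degrees)   # always 1.0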
> For BCRA grade 5, I suggest this means that the surveyor was trying to
> read to the nearest degree. If he could do this perfectly, then this would
> mean a top hat function between -0.5 and 0.5 degrees.
That has a certain appeal, though BCRA grade 5 is "accurate to +/- 1 degree"
not "accurate to +/- 0.5 degrees", which this rather ignores.
> However there is
> also instrument calibration, and pointing in slightly the wrong direction,
> and reading error when the reading falls about half way between two marks.
So in other words, the reading is a combination of several variables, each with
its own distribution, and probably all largely independent. So by the
Central Limit Theorem, the actual distribution is approximately a Gaussian!
> Due to this, and in
> the spirit of the absolute bounds suggested by the BCRA, I suggest that
> a reasonable probability distribution would be linear in the region
> between 1 and 0.5 degrees, and then flat between 0.5 and 0 degrees. This
> is of course just a guess, but it should give a better guess as to the
> magnitude of 1 sd than either my previous top hat functions, or assuming
> readings are Gaussian and picking a percentage cut-off for how many lie
> within 1 degree. This gives 1 sd equals 0.456 degrees, or 1 degree = 2.2 sds.
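For what it's worth, that 0.456 figure does check out - here's a quick
numerical check of the suggested distribution (a sketch only, not Survex code):

    # Density: flat on [-0.5, 0.5] degrees, falling linearly to zero
    # between 0.5 and 1 degree of error, as suggested above.
    def density(x):
        x = abs(x)
        if x <= 0.5:
            return 1.0
        if x <= 1.0:
            return 2.0 * (1.0 - x)   # linear ramp down to zero at 1 degree
        return 0.0

    n = 200000
    dx = 2.0 / n
    xs = [-1.0 + (i + 0.5) * dx for i in range(n)]
    norm = sum(density(x) for x in xs) * dx
    variance = sum(x * x * density(x) for x in xs) * dx / norm
    print(variance ** 0.5)   # ~0.456, i.e. 1 degree is about 2.2 sd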
Without experimental evidence for the actual distribution, or a plausible
theoretical model for it, my inclination is to avoid trying to be too elaborate
in inventing one. It seems very arbitrary to put in a transition at 0.5.
And note that the Gaussian distribution has a plausible theoretical model
behind it.
The only non-Gaussian candidate that seems to have much of a theoretical
basis is a uniform ("top hat") distribution on [-1, 1] - arguing that this is
purely a quantisation effect.
Or you could probably argue for a triangular distribution on (-1,0,1) as the
sum of two uniform distributions on (-0.5, 0.5), by waving your hands and
muttering that there's a reading and a calibration or something.
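For comparison, quick values for the sds those two candidates imply
(standard results, nothing Survex-specific):

    import math

    # Uniform on [-1, 1]: sd = 2/sqrt(12) = 1/sqrt(3)
    print(2 / math.sqrt(12))        # ~0.577 degrees
    # Triangular on (-1, 0, 1), i.e. the sum of two uniforms on (-0.5, 0.5):
    # variances add, so sd = sqrt(2 * (1/12)) = 1/sqrt(6)
    print(math.sqrt(2.0 / 12.0))    # ~0.408 degrees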
> When an error is thrown, the probability of getting the readings that far
> out or worse, given everything was measured correctly, can be approximately
> calculated. This approximation becomes better the larger the loop closure. I
> believe that Olly may have already implemented this last bit.
Yes, at least experimentally. The problem is that at present it's run too
late to "break" bad legs and exclude them from the solution, so the code
to try to identify the actual incorrect reading is working on the forcibly
closed data, which includes distortions from the bad reading. I should get
on to working on this in the not too distant future - I'm currently polishing
things for a 1.0.8 release which completes most of the tasks I listed as
to be done for 1.0.X.
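For a single closed loop, the probability described above boils down to
something like this (again just an illustrative one-dimensional sketch, not
the code that's actually in Survex):

    import math

    # The misclosure of a loop is approximately Gaussian with variance equal
    # to the sum of the leg variances, so the chance of seeing a misclosure
    # at least this big when nothing was misread is a two-sided normal tail.
    def misclosure_probability(misclosure, leg_sds):
        sigma = math.sqrt(sum(sd * sd for sd in leg_sds))
        z = abs(misclosure) / sigma
        return math.erfc(z / math.sqrt(2))

    # e.g. a 1.2 m misclosure around a loop of ten legs each with sd 0.1 m:
    print(misclosure_probability(1.2, [0.1] * 10))   # ~0.00015, so suspicious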
Cheers,
Olly