CDFT: what goes in it/ source data

Julian Todd julian@ncgraphics.co.uk
Thu, 11 Jan 2001 19:49:49 -0000


Hi there, 

I'll just throw this idea in before anyone gets around to it.  


Our cave surveying procedure involves coming out of 
the cave with a damp waterproof notepad covered in mud on 
which we have scribbled our measurements and a few 
diagrams.  

What we tend to do is copy the notes from the paper 
neatly into a book, dry off the paper, stick it in an 
envelope and staple the envelope to the same page of the book 
(so we can verify suspicions of bad handwriting and so forth).  

Survex has the advantage that you can copy the text 
of the book into an ASCII file exactly as it is and 
it will parse it.  Survex has several commands which 
set the mode so that it can read the data in whatever 
ridiculous out-of-order way you wrote it down in the cave 
(eg compass and clino the wrong way round, things spread over 
two lines, or previous survey station assumed to be the next 
survey station).  

This shows that the conversion from your ASCII notes 
convention to any XML format can be done automatically, so 
there is no point in converting them into XML by hand or -- 
more nuttily -- doing your original survey notes in the 
cave in XML.  

What I would propose is that the text in the survey notebook 
is copied out character for character into a text window.  
Then, using whatever parser you have for the job (eg the 
general purpose Survex one set to the correct mode), 
you convert this into an XML file containing all the weird 
<shot></shot> commands and so forth, but you still store the 
original source data in the file too inside one of the 
following XML constructs (which works like a <pre></pre> 
in html): 

<source_survey   form="SurvexStandard1999">
<![CDATA[   
    ; survey text copied from the paper.  
   *units metres
   A B  10 10 9
]]>
</source_survey>

--- followed by, perhaps, 

<stations>
    <station name="A"></station>
    <station name="B"></station>
</stations>
<shots>
    <shot from="A" to="B" tape="10" compass="10" clino="9"></shot>
</shots>
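
As a rough sketch of that conversion step, here is how it might look in 
Python using the standard xml.dom.minidom module rather than the real 
Survex parser.  The function name notes_to_xml, the <survey> wrapper 
element and the fixed "from to tape compass clino" line layout are all 
assumptions made for illustration; a real reader would honour the Survex 
mode commands instead of skipping them.  

```python
from xml.dom import minidom


def notes_to_xml(notes, form="SurvexStandard1999"):
    """Keep the raw notes in a CDATA section and add the parsed shots.

    Hypothetical helper: assumes every data line reads
    'from to tape compass clino' and ignores '*' commands.
    """
    doc = minidom.Document()
    root = doc.createElement("survey")  # wrapper element is an assumption
    doc.appendChild(root)

    # Original notes, character for character (must not contain "]]>").
    source = doc.createElement("source_survey")
    source.setAttribute("form", form)
    source.appendChild(doc.createCDATASection(notes))
    root.appendChild(source)

    stations = doc.createElement("stations")
    shots = doc.createElement("shots")
    seen = []
    for line in notes.splitlines():
        line = line.split(";", 1)[0].strip()  # drop ';' comments
        if not line or line.startswith("*"):
            continue  # blank line or mode command, not handled here
        fields = line.split()
        if len(fields) != 5:
            continue  # not a plain five-field shot line
        shot = doc.createElement("shot")
        for key, value in zip(("from", "to", "tape", "compass", "clino"), fields):
            shot.setAttribute(key, value)
        shots.appendChild(shot)
        # Record each station the first time it appears.
        for name in fields[:2]:
            if name not in seen:
                seen.append(name)
                station = doc.createElement("station")
                station.setAttribute("name", name)
                stations.appendChild(station)

    root.appendChild(stations)
    root.appendChild(shots)
    return doc.toxml()


if __name__ == "__main__":
    print(notes_to_xml("; survey text copied from the paper.\n"
                       "*units metres\nA B  10 10 9\n"))
```

The point of the sketch is only that the source text and the derived 
shots come out of one automatic step and end up in the same file.  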


This text (in the CDATA section), being a faithful representation of your 
notes from the cave, will not change unless a transcription 
error or other blunder is found.  After you have run your Survex parser 
and extracted the data into XML notation you could delete the source text 
and carry on without it.  However, aside from trying to comply with 
the dogma that one must Never Represent the Same Data Twice 
Because it Might Clash, I would claim that nothing is really 
gained by throwing it away.  It is in fact serving the purpose of those 
little envelopes of dried out notes you staple into your neat survey 
book.  In that form it is easy to compare and check for transcription 
errors.  And everyone can keep to their own quirky notational 
habits without losing anything.  
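
If the worry about the two copies clashing is taken seriously, the clash 
can be detected mechanically.  The following is a minimal sketch in 
Python, again assuming a <survey> wrapper element and the simple 
five-field line layout; parse_shots and find_clashes are hypothetical 
names, not part of Survex.  

```python
from xml.dom import minidom

FIELDS = ("from", "to", "tape", "compass", "clino")


def parse_shots(notes):
    """Minimal stand-in for a real notes parser (assumed line layout)."""
    shots = []
    for line in notes.splitlines():
        line = line.split(";", 1)[0].strip()  # drop ';' comments
        if not line or line.startswith("*"):
            continue  # blank line or mode command
        fields = line.split()
        if len(fields) == 5:
            shots.append(tuple(fields))
    return shots


def find_clashes(xml_text):
    """Return (missing, extra): shots in the notes but not in the XML,
    and shots in the XML but not in the notes."""
    doc = minidom.parseString(xml_text)
    source = doc.getElementsByTagName("source_survey")[0]
    wanted = (minidom.Node.TEXT_NODE, minidom.Node.CDATA_SECTION_NODE)
    notes = "".join(n.data for n in source.childNodes if n.nodeType in wanted)
    expected = parse_shots(notes)
    stored = [tuple(s.getAttribute(k) for k in FIELDS)
              for s in doc.getElementsByTagName("shot")]
    missing = [s for s in expected if s not in stored]
    extra = [s for s in stored if s not in expected]
    return missing, extra
```

Two empty lists mean the stored shots still agree with the source text; 
anything else points at the lines to check against the paper notes.  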

Julian Todd.