[1329] in cryptography@c2.net mail archive
Re: text formatting in literary works (tiny change from coderpunks posting)
daemon@ATHENA.MIT.EDU (Derek Atkins)
Wed Aug 13 17:29:08 1997
To: Antonomasia <ant@notatla.demon.co.uk>
Cc: cryptography@c2.net
From: Derek Atkins <warlord@mit.edu>
Date: 13 Aug 1997 17:17:04 -0400
In-Reply-To: Antonomasia's message of Wed, 13 Aug 1997 03:18:52 +0100
Antonomasia <ant@notatla.demon.co.uk> writes:
> 1) Whitespace - WYSINWYG
[snip]
> All unquoted whitespace (not counting newlines) are single spaces.
>
> There are no trailing spaces. All blank lines are empty.
>
> Any necessary exceptions to this (assembler portions or whatever)
> are labeled as such in nearby comments and the local convention
> is defined.
Interesting -- I never really thought about this problem. There
are places (such as in Makefiles) where the type of whitespace is
important; there are other places where it isn't as important.
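To make the quoted rules concrete, here is a rough Python sketch of a
per-line checker. It takes the rules literally (so indentation made of
several spaces would also be flagged), and its quote handling is a
simplification: only double-quoted strings are recognised. The function
name and the rule labels are my own inventions, not anything from the
scanning project.

```python
def whitespace_violations(line):
    """Return the rules broken by one source line (given without its newline)."""
    problems = []
    # A trailing space or tab also catches blank lines that are not empty.
    if line.endswith((" ", "\t")):
        problems.append("trailing whitespace")
    in_quotes = False
    prev = ""
    for ch in line:
        if ch == '"' and prev != "\\":
            # Simplification: toggle on every unescaped double quote.
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "\t" and "tab outside quotes" not in problems:
                problems.append("tab outside quotes")
            if (ch == " " and prev == " "
                    and "repeated space outside quotes" not in problems):
                problems.append("repeated space outside quotes")
        prev = ch
    return problems
```

A Makefile or an assembler portion would be one of the labeled
exceptions and would simply not be run through a check like this.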
> 2) No Comments
>
> The processing of full comments amounted to perhaps 30% of the proofreading
> effort. Exclusion of comments is probably a good idea. Extravagant comments
> such as tables and diagrams are right out.
The problem is that even though a scanner is reading the code, a human
might later need to understand what's going on. For example,
someone who downloads the PGP 5.0i distribution and is trying to help
debug it would be completely confused without many of the comments.
> Usually source code has two audiences (compilers & humans). In this work
> there was a third (the scanner). One option to try to suit all these is to
> publish commented and uncommented versions. A Fortran standard I have read
As I said, an uncommented version only helps a scanner and a compiler,
but it fails a human reader. Unfortunately, that means the creators of
the original code will be the subject of mass mailings for help and
guidance. The comments in the code prevent that, mostly. All you
need to do, as a proofreader, is ignore the checksums on comments.
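As a sketch of what "ignore the checksums on comments" could look like
in a checking tool, the Python below skips mismatches on lines that
hold no code. Everything here is an assumption made for illustration:
the use of CRC-32 from zlib as the per-line checksum, the function
names, and the C-style comment syntax. It also ignores comment markers
that appear inside string literals.

```python
import zlib

def is_comment_only(line, in_block):
    """Return (no_code_on_line, still_inside_block_comment) for one line."""
    i, code_seen = 0, False
    while i < len(line):
        if in_block:
            end = line.find("*/", i)
            if end == -1:
                return (not code_seen, True)
            i, in_block = end + 2, False
        elif line.startswith("/*", i):
            i, in_block = i + 2, True
        elif line.startswith("//", i):
            break  # the rest of the line is a comment
        else:
            if not line[i].isspace():
                code_seen = True
            i += 1
    return (not code_seen, in_block)

def lines_needing_review(scanned_lines, expected_crcs):
    """Return 1-based numbers of lines with code whose checksum is wrong."""
    flagged = []
    in_block = False
    numbered = enumerate(zip(scanned_lines, expected_crcs), start=1)
    for number, (line, expected) in numbered:
        comment_only, in_block = is_comment_only(line, in_block)
        actual = zlib.crc32(line.encode("utf-8"))
        if actual != expected and not comment_only:
            flagged.append(number)
    return flagged
```

One weakness of this approach: if the scan damages a comment delimiter
itself, the tool loses track of where the comment ends, so those lines
still deserve a quick look.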
> 3) Properties of OCR
>
> It appears that OCR software applies rules on the likely content of natural
> language text. Some familar errors are:
>
> "*bn" -> "*ten", "cfb" -> "cib", "cfb" -> "cEb"
>
> Names could be chosen so as to be more easily recognised by this software.
The problem is that some of these constructs just don't work well, and
many of these "names" are common and completely describe what's going
on. I mean, 'bn' clearly stands for Big Number, 'cfb' clearly means
Cipher FeedBack. Would you prefer names in French? That would
probably be just as bad.
> 5) Letter Case
>
> Many names in the code were structured like pgpWordsHere or
> PgpWordsHere. The initial 'p' was sometimes wrongly changed
> into upper case. If there was a convention here I failed to
> see it. Maybe one case-style will do. Maybe different styles
> could be used to mark variables and functions ....
You clearly missed the convention. Data types begin with a capital
letter, function calls begin with a lower-case letter. So, you have
PgpCipherContext, a data type, and pgpCipherByName, a function.
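A proofreading aid could use this convention directly. The Python
sketch below is only an illustration: both function names are
hypothetical, and the second one assumes you already hold a list of
names known to be correct.

```python
def kind_of(name):
    """Classify a name by the case of its first letter."""
    if not name or not name[0].isalpha():
        return "unknown"
    return "data type" if name[0].isupper() else "function"

def first_letter_case_slip(scanned, known_names):
    """Return the known name that `scanned` matches once the case of its
    first letter is flipped, or None if there is no such name."""
    if not scanned or scanned in known_names:
        return None
    flipped = scanned[0].swapcase() + scanned[1:]
    return flipped if flipped in known_names else None
```

Given the known names PgpCipherContext and pgpCipherByName, a scanned
"PgpCipherByName" would be reported as a likely slip for
pgpCipherByName. The check cannot help when both spellings of a name
legitimately exist.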
> 6) A Data Dictionary
>
> The addition of a scanable data dictionary would help. Beside setting
> out the formatting conventions in use, it would help in determining
> changes in code that looked valid. I'm unsure how much detail would
> help, but some variable names would probably be appropriate.
[snip]
> This dictionary would be scanned and distributed electronically to
> proof-readers. They would then be equipped to tackle much of the work
> without visible good source.
But wouldn't the Data Dictionary have as many problems being scanned
as the rest of the document? Or are you proposing that the data
dictionary, necessarily smaller than the rest of the code, would be
hand-verified from the printed sources and then electronically
transferred after manual correction?
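Assuming a hand-verified dictionary of names did exist, a proofreader's
tool might use it roughly like the Python sketch below. The function
name, the identifier pattern, and the 0.6 similarity cutoff are all
assumptions; difflib is the fuzzy matcher from the Python standard
library, not something the scanning project used.

```python
import difflib
import re

# Rough pattern for C-style identifiers.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def suspicious_identifiers(scanned_line, dictionary):
    """Return (scanned_word, closest_known_name) pairs for words that are
    not in the dictionary but closely resemble an entry in it."""
    known = sorted(dictionary)
    findings = []
    for word in IDENTIFIER.findall(scanned_line):
        if word in dictionary:
            continue
        close = difflib.get_close_matches(word, known, n=1, cutoff=0.6)
        if close:
            findings.append((word, close[0]))
    return findings
```

With "bn", "cfb" and "pgpCipherByName" in the dictionary, a scanned
"cib" is reported as a probable "cfb". An error like "bn" becoming
"ten" is too far from any entry to be caught this way, so the
dictionary helps with only some of the OCR mistakes.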
All in all, congratulations on a job well done!
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL N1NWH
warlord@MIT.EDU PGP key available