[148934] in cryptography@c2.net mail archive

Re: [Cryptography] defaults, black boxes, APIs,

daemon@ATHENA.MIT.EDU (Phillip Hallam-Baker)
Sun Jan 5 22:45:54 2014

X-Original-To: cryptography@metzdowd.com
In-Reply-To: <1017145E-A960-4A7B-9F25-2F02F9C77A83@lrw.com>
Date: Sun, 5 Jan 2014 18:52:43 -0500
From: Phillip Hallam-Baker <hallam@gmail.com>
To: Jerry Leichter <leichter@lrw.com>
Cc: Cryptography Mailing List <cryptography@metzdowd.com>,
	Jonathan Thornburg <jthorn@astro.indiana.edu>
Errors-To: cryptography-bounces+crypto.discuss=bloom-picayune.mit.edu@metzdowd.com

--===============4069591533115043370==
Content-Type: multipart/alternative; boundary=001a1134989cdd4fc304ef41d69d

--001a1134989cdd4fc304ef41d69d
Content-Type: text/plain; charset=ISO-8859-1

On Sun, Jan 5, 2014 at 5:59 PM, Jerry Leichter <leichter@lrw.com> wrote:

>
> What I find most disturbing is that many of these bugs are trivial to
> avoid using techniques that have been known forever.
>
> Many, many years ago, I took over an internal hack project that someone at
> DEC had written.  (For those who remember, it was a LAT server that ran on
> VMS:  It played the role of a LAT terminal server, connected to other
> systems using LAT rather than CTERM.  On a LAN, LAT did much better than
> CTERM.)  The guy who wrote the original piece of code wrote it in a style
> that is all too familiar to anyone who's looked at most commercial code
> today:  Just assume everything you receive is valid because checking takes
> too long in programmer time and is "too inefficient".  The result was that
> the program, when it received Ethernet frames that didn't quite match what
> it was expecting, would crash in various horrible ways.  For reasons no one
> could ever really explain, such frames were surprisingly common.
>
> I set out to eliminate the crashes.  It turned out that almost all of them
> came down to one root cause:  The LAT protocol was defined in recursive
> components and subcomponents, each of them encoded in TLV (Type Length
> Value) format.  They were parsed and handled in what you might, in a
> language parser, call recursive-descent style:  You used the T field of a
> subcomponent to select the right function; it pulled off the leading L
> field and dealt with the subcomponent and returned a pointer to where the
> next T field should start.  Unfortunately, damaged components often had
> garbage length fields.  (Well, the other fields might be garbage, too, but
> they didn't cause as much *immediate* havoc.)  Given that this was C, with
> no array bounds or other memory object checking, the result was that a
> subcomponent parser would happily walk off the end of the buffer it was
> handed based on the bogus length field.  The simple fix:  Walk the path of
> the data, from where it was pulled off the wire, through every subcomponent
> parser, making sure that each function received a pointer to the end of the
> *containing* component, and have it check that the subcomponent didn't go
> beyond the containing component.  A boring couple of days of code cleanup -
> but, miraculously, the crashes ... stopped happening.  Who would have
> thought....  :-)
>
> This would have been, oh, 1986, give or take.  But somehow C programmers
> at large never learned - or were even taught - the lesson.  The only thing
> that's gotten us away from the never-ending stream of bad C code that
> scribbles memory is the fading of C from most commercial products:  C++ can
> be somewhat resistant (if you use the built-in types, *carefully*), and the
> newer languages all check array and string bounds.
>
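
For anyone who wants to see the shape of the fix described above, here is
a minimal sketch in C. It is not the original DEC code. The wire layout (one
type byte, one length byte, then the value), the TLV_CONTAINER type code, the
nesting cap, and the name parse_tlvs are all assumptions made up for
illustration. The point is only that every parser receives a pointer to the
end of the *containing* component and refuses a length that runs past it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format, for illustration only (not the real LAT layout):
 * one type byte, one length byte, then `length` bytes of value.
 * A value of type TLV_CONTAINER is itself a sequence of TLVs. */
enum { TLV_CONTAINER = 0x01, TLV_MAX_DEPTH = 16 /* assumed nesting cap */ };

/* Parse the TLVs in [p, end), where `end` is the end of the containing
 * component. Returns the number of leaf components seen, or -1 if any
 * component is truncated or claims to extend past its container. */
static int parse_tlvs(const uint8_t *p, const uint8_t *end, int depth)
{
    int leaves = 0;

    if (depth > TLV_MAX_DEPTH)
        return -1;

    while (p < end) {
        uint8_t type;
        size_t len;

        if ((size_t)(end - p) < 2)
            return -1;                  /* truncated T/L header */
        type = p[0];
        len = p[1];
        p += 2;

        if (len > (size_t)(end - p))
            return -1;                  /* L field runs past the container */

        if (type == TLV_CONTAINER) {
            /* The child may not look beyond its own value: p + len. */
            int n = parse_tlvs(p, p + len, depth + 1);
            if (n < 0)
                return -1;
            leaves += n;
        } else {
            leaves++;                   /* a real parser would dispatch on type */
        }
        p += len;                       /* start of the next T field */
    }
    return leaves;
}
```

A caller would pass the frame as received, e.g. parse_tlvs(buf, buf + n, 0),
and drop the frame on -1. Note that a nested component is checked against
its parent's length, not against the end of the whole frame, so a garbage
inner length is caught even when the outer buffer has plenty of bytes left.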

You mentioned Tony Hoare earlier. It wasn't on a whim that he used his
Turing Award lecture to point out that the lack of array bounds checking
was going to bite. He knew that it was going to be a disaster.

The CERNLIB code for the Web was actually pretty robust, as all the string
handling was performed by macros with built-in bounds checking.
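
To give a rough idea of what a bounds-checked string macro can look like,
here is a small sketch in C. It is not the CERN code, which is not
reproduced here. The names STR_COPY and bounded_copy are hypothetical, and
truncating on overflow is just one possible policy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the CERN code: copy `src` into a
 * buffer of `cap` bytes, truncating if needed and always NUL-terminating.
 * Returns the number of characters stored, excluding the terminator. */
static size_t bounded_copy(char *dst, size_t cap, const char *src)
{
    size_t n;

    if (cap == 0)
        return 0;               /* no room even for the terminator */
    n = strlen(src);
    if (n > cap - 1)
        n = cap - 1;            /* truncate rather than overrun */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* `dst` must be a real array, not a pointer, so that sizeof yields the
 * capacity of the destination instead of the size of a pointer. */
#define STR_COPY(dst, src) bounded_copy((dst), sizeof(dst), (src))
```

With char buf[8], STR_COPY(buf, "hello") stores the whole string, while a
longer source is cut to seven characters plus the terminator. The capacity
comes from the destination itself, so the caller cannot get it wrong.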


-- 
Website: http://hallambaker.com/

--001a1134989cdd4fc304ef41d69d--

--===============4069591533115043370==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
The cryptography mailing list
cryptography@metzdowd.com
http://www.metzdowd.com/mailman/listinfo/cryptography
--===============4069591533115043370==--
