[78932] in cryptography@c2.net mail archive

Re: Entropy of other languages

daemon@ATHENA.MIT.EDU (Steven M. Bellovin)
Mon Feb 5 17:15:56 2007

Date: Mon, 5 Feb 2007 16:48:16 -0500
From: "Steven M. Bellovin" <smb@cs.columbia.edu>
To: Allen <netsecurity@sound-by-design.com>
Cc: cryptography@metzdowd.com
In-Reply-To: <45C67061.7050405@sound-by-design.com>

On Sun, 04 Feb 2007 15:46:41 -0800
Allen <netsecurity@sound-by-design.com> wrote:

> Hi gang,
> 
> An idle question. English has a relatively low entropy as a language.
> Don't recall the exact figure, but if you look at words that start
> with "q" it is very low indeed.
> 
> What about other languages? Does anyone know the relative entropy of
> other alphabetic languages? What about the entropy of ideographic
> languages? Pictographic? Hieroglyphic?
> 
It should be pretty easy to do at least some experiments today --
there's a lot of online text in many different languages.  Have a look
at http://www.gutenberg.org/catalog/ for freely-available books that
one could mine for statistics.
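
As a rough sketch of such an experiment (not a definitive method), the
Python below estimates bits per letter from single-letter frequencies,
and a first-order conditional estimate from letter pairs. The function
names are made up for illustration, and it assumes the book has already
been read into a string. Both figures are only upper bounds on a
language's true entropy, since they ignore longer-range structure.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy, in bits, of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def letters_of(text):
    """Lower-cased alphabetic characters only; isalpha() also accepts
    non-ASCII letters, so this works for other alphabets too."""
    return [c for c in text.lower() if c.isalpha()]


def char_entropy(text):
    """Bits per letter, estimated from single-letter frequencies."""
    return entropy(Counter(letters_of(text)))


def conditional_entropy(text):
    """Bits per letter given the previous letter: H(pair) - H(first)."""
    letters = letters_of(text)
    pairs = Counter(zip(letters, letters[1:]))
    firsts = Counter(letters[:-1])
    return entropy(pairs) - entropy(firsts)
```

To try it on a downloaded book, something like
open("book.txt", encoding="utf-8").read() would supply the text; the
filename is a placeholder. Comparing the two numbers across books in
different languages gives a first, crude answer to the question.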


		--Steve Bellovin, http://www.cs.columbia.edu/~smb

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo@metzdowd.com