[79046] in cryptography@c2.net mail archive


FW: Entropy of other languages

daemon@ATHENA.MIT.EDU (Trei, Peter)
Wed Feb 7 13:37:14 2007

Date: Tue, 6 Feb 2007 09:30:11 -0500
From: "Trei, Peter" <ptrei@rsasecurity.com>
To: <Cryptography@metzdowd.com>



Steven M. Bellovin wrote:

>
> On Sun, 04 Feb 2007 15:46:41 -0800
> Allen <netsecurity@sound-by-design.com> wrote:
>
> > Hi gang,
> >
> > An idle question. English has a relatively low entropy as a language.
> > Don't recall the exact figure, but if you look at words that start
> > with "q" it is very low indeed.
> >
> > What about other languages? Does anyone know the relative entropy of
> > other alphabetic languages? What about the entropy of ideographic
> > languages? Pictographic? Hieroglyphic?
>
> It should be pretty easy to do at least some experiments today --
> there's a lot of online text in many different languages.  Have a look
> at http://www.gutenberg.org/catalog/ for freely-available books that
> one could mine for statistics.
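
One way such an experiment might start is sketched below. It computes
only the single-letter (unigram) Shannon entropy of a text, which ignores
context such as "q" almost always being followed by "u", so it
overstates the true per-character entropy of a language. The function
name char_entropy and the choice to count only alphabetic characters,
case-folded, are assumptions made for this illustration.

```python
import math
from collections import Counter


def char_entropy(text: str) -> float:
    """Unigram Shannon entropy of the letters in `text`, in bits per letter.

    Hypothetical helper for illustration: non-alphabetic characters are
    dropped and case is folded before counting.
    """
    letters = [c.lower() for c in text if c.isalpha()]
    if not letters:
        return 0.0
    counts = Counter(letters)
    total = len(letters)
    # H = -sum(p * log2(p)) over the observed letter frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

To compare languages, one could read the plain text of a downloaded book
in each language and call char_entropy on the contents. Results from
short samples would be noisy.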

As a very rough proxy, look at the length of the same text in different
translations.

My father was in advertising in Europe. When they laid out a print ad,
they always did so using the German text. If the German fit, any other
language they were interested in would do so as well.

Now that I work (among other things) on cellphone applications, I'm
running into similar issues in internationalizing text on tiny screens.
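
The length proxy described above can be sketched roughly as follows. For
each translation of the same text, the sketch reports the character
count, the UTF-8 byte count, and the zlib-compressed byte count. The
compressed size is only a crude stand-in for information content, and
the function name length_report and the language labels are assumptions
made for this illustration.

```python
import zlib


def length_report(translations):
    """Map each language label to (characters, UTF-8 bytes, compressed bytes).

    `translations` is assumed to be a dict of label -> text, where every
    text is a translation of the same source passage.
    """
    report = {}
    for lang, text in translations.items():
        raw = text.encode("utf-8")
        # Level 9 asks zlib for its strongest (slowest) compression.
        report[lang] = (len(text), len(raw), len(zlib.compress(raw, 9)))
    return report
```

On very short strings the fixed overhead of the zlib format dominates,
so the compressed column is only meaningful for passages of at least a
few kilobytes.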

Peter Trei

Disclaimer: This is a personal opinion. It may or may not jibe with my
employer's opinion.


---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo@metzdowd.com
