[2323] in cryptography@c2.net mail archive

home help back first fref pref prev next nref lref last post

Re: Encryption without encryption - from Ron Rivest

daemon@ATHENA.MIT.EDU (Bill Stewart)
Mon Mar 23 15:33:34 1998

Date: Sun, 22 Mar 1998 14:56:46 -0800
To: cryptography@c2.net, cypherpunks@cyberpass.net
From: Bill Stewart <bill.stewart@pobox.com>
In-Reply-To: <v03102800b13af11ba6e9@[17.219.146.26]>

Rivest's Chaffing paper opens up a whole range of activities
	<http://theory.lcs.mit.edu/~rivest/chaffing.txt>
The basic approach is for Alice to send Bob a string of packets like
	(sequence_number, data, checksum)
where the data is very short, the checksum uses a secret session key,
and there are multiple streams of real and bogus data, which 
Bob can distinguish by the checksum but which an eavesdropper can't.
It's only using authentication, not encryption, but the data can 
be short enough, e.g. one bit or one byte, that individual packets
don't tell the eavesdropper anything meaningful.
A pleasant consequence is that mixing multiple datastreams increases
security, so Alice's traffic to Bob and Charlie's traffic to Dave
act as chaff for the other's messages, assuming the sequence numbers
are in similar ranges.

=== Environments where this might be interesting ===
IRC.  DC-Nets.  alt.anonymous.messages, but it's ugly.
Telnet with IPSEC/IPv6 authentication - if you've got one byte
of "typed" or chaff material per packet and identical port numbers,
you can let IP discard the chaff with the wrong checksums,
though you don't get the benefit of other people's traffic as chaff.

=== Length ===
How much does this expand the data?  Lots, maybe 5:1 - 100:1.
You get one bit of data if you're paranoid, a byte if you're not.
Since the checksum is really being used only to tell wheat 
packets from chaff packets, rather than for authentication,
it only needs to be long enough to eliminate most collisions.
32 bits should be plenty, and you might use 16 bits if you're
doing enough fancy stuff with your sequence numbers.
If you want real authentication as well, you can stream that
inside the message body and use long enough MACs.

The sequence numbers can be in order starting from 0,
or you can use some pseudo-random pattern driven by the session key
to reduce traffic analysis.  If you're doing a sequence for the whole
message, you probably need 32 bits, but it should work just fine to
do sequences for a packet at a time, and 16 bits would do; 8 may be too short.
Alternatively, if you're using TCP or ATM and can guarantee delivery
in the original order, you don't really need sequence numbers.

So the minimum length is maybe 5-7 bytes, with 1 byte of data,
and the maximum is ~24+ bytes (32 bits + 1 + 160).
8 Bytes has a nice ring, since lots of crypto algorithms might
do useful things with it on the ends.  
ATM cells hold 48 bytes of payload, which makes them an interesting
medium for mixing traffic from lots of sources together.

=== Scaling ===
The usual computer science question is "so how does this scale?",
and in this case the answer is "pretty badly" :-)  
Only the sender and recipient can tell whose traffic is whose.
To get cover traffic, you need lots of traffic streams,
all mashed together, but the receiver can't tell which traffic
belongs to which recipient, so it has to send all the traffic
to everybody and let them sort it out, effectively a broadcast.

To get beyond broadcast, either 
- tag the messages with some quickly-recognized string, 
	e.g. adding a one-byte tag which the recipient can look for
	with much less effort than a full cryptographic calculation,
- use mixes that nest the traffic in full-scale chaff
	but keep track of individual streams,
- use mixes that do wimpy chaffing, e.g. split the packet in half
	and add a checksum to each half, or
- decide that traffic analysis isn't part of your threat model
	and stick to smaller quantities of traffic.

Short packet tags can let you build connections through mixes
similar to ATM virtual circuits, where the tags are only meaningful
on a link-by-link basis, and each mix replaces the incoming tag
with an outgoing tag (and there's a connection setup required.)
Unlike ATM, collisions are fine here, since the recipient can tell
which additional packets are real and ignore the rest.

=== Cryptographic issues ===

How secure is the system with 1-bit data?  Probably as strong
as the checksum algorithm.  What about 1-byte data,
assuming that it's compressed first?  How much correlation
between packets can you discover in the presence of X% cover traffic?

Checksums can be short, but still need a cryptographically strong
algorithm, and probably should be keyed per-session.
There's a lot of HMAC literature, most of which I haven't read yet.

If you're using sequence numbers, should they all start at 0,
or should they be cryptographically-strong pseudo-random?
If you start at 0, how long can your sequences be before
they've only got your chaff traffic and nobody else's?
I assume that if you do sequences starting somewhere random
rather than at 0, it's too easy to analyze, unlike TCP
sequence numbers for which this is a good strategy.

If you use mixes, there's always the problem of trusting them,
but there's still nesting, and the threats are different for
single-link eavesdrops, multiple-link eavesdrops, and compromised mixes.
Unfortunately, mixes are an target for key recovery / compromise.

Nesting using full-expansion chaffing should be secure,
though nesting using tags or wimpy chaffing isn't very,
since eavesdroppers who can see all the traffic in and out of a mix
can keep track of which traffic is on which stream.
There is a lot of known plaintext in this environment.

=========

				Thanks! 
					Bill
Bill Stewart, bill.stewart@pobox.com
PGP Fingerprint D454 E202 CBC8 40BF  3C85 B884 0ABE 4639

home help back first fref pref prev next nref lref last post