[1389] in cryptography@c2.net mail archive
Re: How to build anonymous storage
daemon@ATHENA.MIT.EDU (Peter Gutmann)
Sat Aug 30 16:50:54 1997
From: pgut001@cs.auckland.ac.nz (Peter Gutmann)
To: kelsey@plnet.net
Cc: coderpunks@toad.com, cryptography@c2.net
Reply-To: pgut001@cs.auckland.ac.nz
X-Charge-To: pgut001
Date: Sun, 31 Aug 1997 06:12:28 (NZST)
>A while back, I got into a conversation with my mother in law, who is a
>psychologist, about patient confidentiality rules. Apparently, unlike a
>priest or lawyer, your discussions with your therapist can be subpoenaed for
>various reasons. She said that several of her colleagues had stopped keeping
>written notes, for fear of having them seized during a messy divorce or
>lawsuit.
I don't think you really need anonymous storage for this, all you need is
subpoena-proof storage. I came up with a means of doing this for the
transparent disk encryption program I wrote several years ago which is
completely transparent to the user, and which doesn't involve hiding any
information. It relies on the fact that although someone can prove you have
(potentially) vast amounts of encrypted data, they can't prove you have the
key to decrypt it, and conversely, you can provide a reasonably convincing
"proof" that you don't have the key to decrypt it.
This never really got beyond the design stage, included below is the post I
made to sci.crypt about it several years ago. The explanations aren't very
good, I was trying to come up with an easy way to explain threshold schemes to
the masses but I don't think it works too well.
Peter.
From: pgut1@cs.aukuni.ac.nz (Peter Gutmann)
Newsgroups: sci.crypt
Subject: Key safeguarding/anti-duress measures in cryptosystems (Long)
Date: 26 Jul 1994 13:54:55 GMT
What follows are some thoughts on anti-duress measures for cryptosystems that
I've been working on whenever I have time (which means: not very often). It's
actually part of the the apocrypha for the Secure FileSystem documentation,
but it needs more work before I enable the code for it in SFS. As such a part
of it is somewhat SFS-specific, but I believe most of it is of general appeal.
The most amazing thing about key safeguarding measures like the ones described
in this posting is that although much has been written about the theory of
threshold schemes in the past, noone has ever published any practical details
of the kind presented below (I know of one person who dropped the subject of
threshold schemes for his PhD because nothing much beyond the theory behind
the schemes was available to him). This may be because it's a somewhat
politically incorrecform, any debate about "can you be forced to disclose your keys" basically
goes out the window, because there's no way anyone can get at your encryption
keys unless you want them to. This may not sit too well with some people.
As it turns out, properly implementing anti-duress measures based on a
threshold scheme is a nontrivial exercise. The threshold scheme itself isn't
hard to implement, it's the management and bookkeeping which needs a lot of
thought. Since I've implemented about as much as I can before I need to make
some tough decisions on share management, I thought I'd ask for feedback on
the following. The first part is an attempt to explain threshold schemes in
nontechnical terms. Does anyone have any better way of doing this? I don't
like it much, but its the best I could come up with.
[General intro to duress attacks snipped]
The idea of needing multiple keys held by different people to open a safe can
be used for encryption as well - simply take a single key, break it into a
number of pieces, and give one piece to each member of a group of people. To
access the data, the group combines their partial keys into one whole key, and
the data is decrypted[3]. This method is also useful to allow access in case
the standard keyholder is unavailable. For example, if a company president
who holds the keys to a safe or data dies suddenly, the vice-presidents can
combine their partial keys to allow access.
Unfortunately, this method has a number of shortcomings. One of these is that
it requires that all the members of the group meet to recreate the original
key. If one member is killed in an accident, or loses their partial key, or
refuses to give it up, recreation of the original key becomes impossible. In
addition, such a simple scheme offers little protection against a duress
attack on the key holders. Although the central site which manages the
encryption might be very careful about keeping their encrypted data secure,
the key holders may give out their partial keys to anyone, or may be persuaded
by an attacker to reveal their keys without much effort. Therefore we require
an anti-duress measure to have a number of properties:
1. It should be damage-resistant. If one or more partial keys are lost,
reconstruction of the original key should still be possible.
2. It should not endanger the site which stores the encrypted data. The
central site should be able to convince an attacker that they are
incapable of decrypting the data even if they want to[4].
3. It should not endanger the people holding the partial keys. If the
holders of partial keys feel threatened simply by the fact that they are
keyholders, they may refuse to become keyholders in the first place.
These properties are only useful for long-term bulk storage of data, since,
obviously, a recipient must be able to decrypt messages sent to them, which
means that property 2 no longer holds. However if the messages, upon
decryption, are written straight to a bulk-encrypted storage area such as an
SFS volume, the anti-duress measures are extended to cover stored message
traffic as well.
Footnote [3]: The US government's Escrowed Encryption Standard works like this,
with partial keys being held by two seperate organisations.
Footnote [4]: This is why stores which have time-locked safes often display
notices to the effect that the staff cannot open the safe - if
it's obvious that the staff can't open the safe, thieves won't
try to force them to do so.
An Anti-Duress Scheme: The First Attempt
A good analogy to an anti-duress scheme using multiple keys can be created
using multi-component epoxy glue. This is the type of glue in which two
colourless liquids (let's call them A and B) are mixed together to form a
third colourless liquid which acts as a very powerful glue. By themselves,
the two components are useless. If one of the colourless liquids is mixed
with some different type of colourless liquid, the end result is useless.
Only by mixing components A and B can we obtain the desired end result.
Let's assume we want to control the creation of the epoxy glue. By giving
liquid A to one person and liquid B to another, we create a condition in which
the two people have to meet and combine their liquids to create the glue.
However this is little better than the previously discussed scheme involving
two keys to a safe.
Now let's bring 23 more colourless liquids, C through Y, into the game. We
now have 25 colourless liquids A through Y, each of which look, taste, smell,
and feel identical. We give these to 25 different people. Only 2 of them can
be used to create the epoxy. The other 23 create nothing useful. Assuming an
attacker can obtain liquids A through Y from the people holding them, they
would have to try, on average, 150 different combinations before they hit the
right one. Or they could simply mix them all together, and hope there is
enough A and B present in the result to create the epoxy.
We therefore need to address two problems: The fact that an attacker can mix
everything together in one mess and still obtain a valid result, and the fact
that they can simply try each combination until they find a match. The first
problem is solved by making the liquids C through Y anti-epoxy liquids which
act to neutralise A and B. Therefore even if the correct liquids A and B are
mixed, any one of C through Y will neutralise their effect. This forces an
attacker to try all possible combinations in order to obtain the result they
desire.
The second problem is solved by making the number of possible combinations
very large. If we require 3 of 25 liquids to be correctly mixed to create the
epoxy instead of 2 of 25, the average number of attempts jumps from 150 to
1150. If we require 10 of 25 liquids to be correctly mixed, the average
number of attempts jumps to 1,634,380. The greatest number of combinations is
when we require 25/2 of 25 liquids to be mixed, with 2,600,150 attempts being
necessary in order to guess the correct combination. If we used 100 liquids
and required a certain 50 of them to be mixed, an attacker would require
approximately 50,445,672,270,000,000,000,000,000,000 attempts in order to find
the correct combination of liquids.
Unfortunately, distributing and collecting hundreds of containers of
colourless liquid is somewhat impractical. Although this approach serves to
explain the basic principles, we need to use a more sophisticated model to
protect our data.
An Anti-Duress Scheme: The Second Attempt
Instead of looking at the scheme as requiring the combining of many seperate
components to create the whole, we can look at it as requiring the casting of
many votes in order for a final result to be approved. People can vote yes or
no, and unless significantly more yes than no votes are received, the final
result won't be approved.
To use this idea to protect encrypted data, the site with the encrypted data
takes the secret which has to be protected (in this case the encryption key)
and breaks it into a number of *shares* M, of which N shares are needed for a
yes vote to be valid. Each share is sent to a *shareholder*. The central
site is the *share issuer*. The shareholder can specify how many shares, or
yes votes, are needed in order to recreate the original secret. For example,
they may specify that out of a total of 30 shares, 15 will be needed to
recreate the original secret. Therefore up to half of the shares can be lost
before recreation of the secret becomes impossible. This is known as an (M,N)
threshold scheme, because the number of shares must cross the threshold level
N before a valid result can be returned. In this case M = 30, the total
number of shares, and N = 15, the number of shares which must be combined to
produce a valid result.
Since each share is unique, but bears no distinguishing marks, there is no way
to tell a junk or bogus share apart from a genuine one. The only way to
determine if all the shares are valid is to use them to reconstruct the
original secret, and even then a failure to reconstruct the secret will only
reveal that one (or more) shares are invalid, but not which ones are invalid.
This means that a shareholder can vote yes by returning a valid share, or vote
no by returning a junk or bogus share, and an opponent will have no way of
telling who voted yes or no.
Let's look at whether this fulfils the requirements we stated before:
1. It should be damage-resistant.
The share issuer controls the damage resistance of the scheme by
specifying how many shares of the total share issue are needed to
reconstruct the original secret.
2. It should not endanger the site which stores the encrypted data.
The central site can claim that all the shares still in their posession
are junk and bogus shares which are incapable of being used to
reconstruct the original secret. There is no way to prove whether any of
the shares really are valid or not.
3. It should not endanger the people holding the partial keys.
The shareholder can return a valid share, or a junk or bogus share if
they feel the need to. Since there is no way to tell the different types
of shares apart, there is no way an attacker can compel them to return a
valid share, as they won't know a valid share when they see it.
Practical Implementation
There are two points of attack on a system using anti-duress measures of this
sort:
1. An attack on the share issuer. In this case the issuer must be able to
convince an attacker that they do not posess enough information to
decrypt the secured data (even if they do).
2. An attack on the shareholders. In this case the shareholder must be
able to convince an attacker that they do not posess any information
which would be useful in reconstructing the encryption key (even if they
do).
We need to cover both of these in order to create a secure anti-duress scheme.
The shares are implemented using an (M,N) threshold scheme which is used to
protect the encryption key for an SFS volume. The reconstruction of each
encryption key is controlled by a *share database*, which can contain three
types of shares:
Real share records
These decrypt to give a valid share.
Duress share records
These decrypt to a valid-seeming share if a duress password is used. A
duress password is a password other than the real password which is used to
decrypt bogus shares instead of the real ones. An opponent can then be
told that one or more contributors handed in incorrect shares.
Junk share records
These decrypt to garbage with any password. These records exist to stop an
opponent simply forcing someone to decrypt all the shares in a share
database.
A share database consists of a free mixture of real, bogus, and junk shares.
For greatest security, a larger number of bogus and junk shares than real
shares should be present, preferably with multiple duress passwords being used.
There are two forms of attack which we need to take into account:
1. An attack against the share issuer, referred to as scenario 1. If the
share issuer is attacked, we assume they haven't had time to destroy their
share database, but need to give the impression that they have.
2. An attack against the share holder, referred to as scenario 2. If the
share holders are attacked, we assume that the share issuers database has
been destroyed, and that the shareholder needs to prove to an attacker that
their share or shares aren't of any use.
[!!!! See below for the rest of this section !!!!]
Share Distribution
The way the shares are distributed makes a major contribution to the overall
security of the scheme. Handing the shares out to your neighbours makes
recovery by an opponent rather easy. A much better solution, for those with
international network access, is to send them to overseas shareholders. This
will make key recovery virtually impossible for anyone but a very determined
opponent willing to reach out to other countries to get at the shareholders.
Certainly the paging overhead for nonresident citizens will be rather high.
Obtaining shares by threatening your neighbours, and obtaining shares by
tracing someone to Upper Volta are an entirely different ball game.
Another useful idea for share distribution is the use of *ghosts*. In its
simplest form, this involves generating a number of shares, putting each one
on a disk in encrypted form, sealing the disks in envelopes, and giving a few
each to some trusted friends. Each of these people keeps one and gives the
rest to some trusted friends. This is repeated as required. Now even the
share issuer has no idea where most of the shares are (thus the term ghosts
for the shareholders). With many shares (and long chains of ghosts to follow
up), recovering the shares can become quite challenging. This method
presupposes a sizeable supply of ghosts, but a simple request to keep an
envelope in a safe place isn't much to ask of someone. The only negative
aspect is the difficulty in eventually legitimately recovering the shares
(especially if the friend-of-a-friend-of-a-friend chains are very long), and
the need to tune the system for a high level of damage-resistance to account
for lost shares.
Dangers of Anti-Duress Measures
The use of these measures is not without its risks. If the share distribution
scheme is used properly, the holder of the encrypted data genuinely can't
decrypt it, no matter how hard they try. However convincing the average
attacker whose only goal is to get at the data by any means available of this
fact can be very difficult, particularly if they barely understand encryption,
let alone complex key-protection schemes of this nature. Their response may
be to simply apply more and more duress in an attempt to force the share
issuer to comply. Care should be taken to ensure that the encrypted
information really is valuable enough to make it worth protecting in this
manner.
---------------- End of clearly-laid-out bit --------------
Key safeguarding (scenario 0):
The database maintenance program is used for the following types of operations:
Create and add M genuine shares, of which N are needed to reconstruct the
original secret.
Extract a share for distribution.
Recombine a share with the database.
This uses the database simply for key management, and is meant for key
safeguarding rather than defending against any form of attack. This scheme is
currently implemented in SFS.
Share issuer attack (scenario 1):
The database maintenance program is used for the following types of operations:
Create and add M genuine shares, of which N are needed to reconstruct the
original secret.
Add a random number of duress shares at random locations in the database
Add a random number of junk shares at random locations in the database
Every time the database is updated, all access information (such as the file
access/update time) is reset to a standard value to obscure the time at which
changes were last made.
That's all. In most cases this will probably be what happens. The sort of
people who need this form of protection won't be given any advance warning when
an enemy breaks into their home/business/computer system at 3am, so they won't
have time to delete the share database. All they need to do is convince the
enemy that what's there isn't of any use to them.
Shareholder attack (scenario 2):
The database maintenance program is used for the following types of operations:
Create and add M genuine shares, of which N are needed to reconstruct the
original secret.
Create a duress share for distribution.
Create a junk share for distribution.
Extract a share for distribution and delete the original from the database
Garble the database (add random shares, change values of existing shares).
The garbling of the database is better than straight deletion, since an
attacker has no idea whether what's left is really a valid database from
someone anticipating scenario 1, or a garbage database from someone
anticipating scenario 2.
Every time the database is updated, all access information (such as the file
access/update time) is reset to some standard value to obscure the time at
which changes were last made.
Variant 1:
An extracted share is simply a share record encrypted with a password chosen by
the share issuer. This ensures that the shareholders cannot conspire behind
the share issuers back to reconstruct the original secret. The encrypted share
record is as follows:
Offset Size Type Description
0 4 LONG Share number
4 ?? BYTE[] Share data
A share is thus "E( share number, share data )".
For the LaGrange interpolating polynomial scheme used in SFS, the share number
is the X coordinate and the share is the Y coordinates. The validity of a
decryption is checked by checking the share number, which should be in the
range 1 ... M. As each share number is a small value, the chances of an
incorrect share being accepted as genuine are quite small.
>> Problem: Is it small enough?
Variant 2a
Another suggested layout for the share record is to encrypt the random Y
coordinate, and make the X coordinate part of the check value. An encrypted
share record is thus:
Offset Size Type Description
0 ?? BYTE[] Share data
and a share is "E( share data ), check value 1,..., check value n".
Since multiple check values are present, a bogus share can be generated simply
by decrypting with the wrong password. One of the check values corresponds to
the valid password, the rest correspond to duress passwords and junk passwords.
This method will be used in the (less frequent) situation where the person
under attack has a chance to destroy the share database before an attacker can
get to it.
>> Problem: With the nCr explosion thing for bigger numbers, does this mean we
should simply give everyone 10 shares each just to make the numbers
bigger (so we get 50 of 100 rather than 5 of 10)? Is there any
other way to artificially inflate the totals?
Variant 2b
Another option is to store X clear:
The data would be:
X clear
IV, E( Y ), check
IV, E( Y ), check
IV, E( Y ), check
The way to determine which E( Y ) is correct for a given password is whether
the key check fails. To avoid collisions, you check every check value whenever
you added a new E( Y ). This isn't that bad, just do your E( Y ), make sure
the check value doesn't collide with an existing one, if it does increment the
IV and try again.
In fact we may not need to store X at all - just make X part of the encryption
process somehow and interate until we find a value which results in the key
check being passed (ie try X=0, then X=1, then X=2, until we pass the key
check). However this makes the above collision-avoidance impossible, and isn't
terribly fast.
A comparison of variant 2a and 2b:
2a 2b
X clear X clear
IV, E( Y ) IV, E( Y ), check
check IV, E( Y ), check
check IV, E( Y ), check
check
Which is better? Size isn't much of a consideration, 100 shares are only about
12K with the 2b variant. We can change 2b to create a 2c share record which is:
X, IV, E( Y ), check
and scatter them at random throughout the share file, but is this useful? This
variant is conceptually nicer since each share is now a nicely-packaged
seperate entity.
Duress/junk share generation:
To get a valid duress/junk IV/key combination which won't mistakenly decrypt
with a genuine key:
for each duress/junk share
encrypt with duress/junk key
decrypt with real key
if( shareNo is valid )
increment IV, try again;
>> Problem: You need to know the genuine encryption key to generate duress/junk
shares.
>> Problem: If two groups of duress shares clash, an attacker knows that both
groups are just duress/junk shares, therefore we need to have *no*
clashes at all. Unfortunately adding any new shares then requires
that the keys for all previous shares be known to ensure that there
are no inadvertant decrypts of one share group with the key for
another share group.
A very serious problem here is how to generate duress/junk shares which don't
clash with real shares quickly and efficiently.