[1389] in cryptography@c2.net mail archive

home help back first fref pref prev next nref lref last post

Re: How to build anonymous storage

daemon@ATHENA.MIT.EDU (Peter Gutmann)
Sat Aug 30 16:50:54 1997

From: pgut001@cs.auckland.ac.nz (Peter Gutmann)
To: kelsey@plnet.net
Cc: coderpunks@toad.com, cryptography@c2.net
Reply-To: pgut001@cs.auckland.ac.nz
X-Charge-To: pgut001
Date: Sun, 31 Aug 1997 06:12:28 (NZST)

>A while back, I got into a conversation with my mother in law, who is a 
>psychologist, about patient confidentiality rules. Apparently, unlike a 
>priest or lawyer, your discussions with your therapist can be subpoenaed for 
>various reasons.  She said that several of her colleagues had stopped keeping 
>written notes, for fear of having them seized during a messy divorce or 
>lawsuit. 
 
I don't think you really need anonymous storage for this, all you need is 
subpoena-proof storage.  I came up with a means of doing this for the 
transparent disk encryption program I wrote several years ago which is 
completely transparent to the user, and which doesn't involve hiding any 
information.  It relies on the fact that although someone can prove you have 
(potentially) vast amounts of encrypted data, they can't prove you have the 
key to decrypt it, and conversely, you can provide a reasonably convincing 
"proof" that you don't have the key to decrypt it.
 
This never really got beyond the design stage, included below is the post I 
made to sci.crypt about it several years ago.  The explanations aren't very 
good, I was trying to come up with an easy way to explain threshold schemes to 
the masses but I don't think it works too well.
 
Peter.
 
From: pgut1@cs.aukuni.ac.nz (Peter Gutmann)
Newsgroups: sci.crypt
Subject: Key safeguarding/anti-duress measures in cryptosystems (Long)
Date: 26 Jul 1994 13:54:55 GMT
 
What follows are some thoughts on anti-duress measures for cryptosystems that 
I've been working on whenever I have time (which means: not very often).  It's 
actually part of the the apocrypha for the Secure FileSystem documentation, 
but it needs more work before I enable the code for it in SFS.  As such a part 
of it is somewhat SFS-specific, but I believe most of it is of general appeal.
 
The most amazing thing about key safeguarding measures like the ones described 
in this posting is that although much has been written about the theory of 
threshold schemes in the past, noone has ever published any practical details 
of the kind presented below (I know of one person who dropped the subject of 
threshold schemes for his PhD because nothing much beyond the theory behind 
the schemes was available to him).  This may be because it's a somewhat 
politically incorrecform, any debate about "can you be forced to disclose your keys" basically 
goes out the window, because there's no way anyone can get at your encryption 
keys unless you want them to.  This may not sit too well with some people.
 
As it turns out, properly implementing anti-duress measures based on a 
threshold scheme is a nontrivial exercise.  The threshold scheme itself isn't 
hard to implement, it's the management and bookkeeping which needs a lot of 
thought.  Since I've implemented about as much as I can before I need to make 
some tough decisions on share management, I thought I'd ask for feedback on 
the following.  The first part is an attempt to explain threshold schemes in 
nontechnical terms.  Does anyone have any better way of doing this?  I don't 
like it much, but its the best I could come up with.
 
[General intro to duress attacks snipped]
 
The idea of needing multiple keys held by different people to open a safe can 
be used for encryption as well - simply take a single key, break it into a 
number of pieces, and give one piece to each member of a group of people.  To 
access the data, the group combines their partial keys into one whole key, and 
the data is decrypted[3].  This method is also useful to allow access in case 
the standard keyholder is unavailable.  For example, if a company president 
who holds the keys to a safe or data dies suddenly, the vice-presidents can 
combine their partial keys to allow access.
 
Unfortunately, this method has a number of shortcomings.  One of these is that 
it requires that all the members of the group meet to recreate the original 
key.  If one member is killed in an accident, or loses their partial key, or 
refuses to give it up, recreation of the original key becomes impossible.  In 
addition, such a simple scheme offers little protection against a duress 
attack on the key holders.  Although the central site which manages the 
encryption might be very careful about keeping their encrypted data secure, 
the key holders may give out their partial keys to anyone, or may be persuaded 
by an attacker to reveal their keys without much effort.  Therefore we require 
an anti-duress measure to have a number of properties:
 
  1.  It should be damage-resistant.  If one or more partial keys are lost,
      reconstruction of the original key should still be possible.
 
  2.  It should not endanger the site which stores the encrypted data.  The
      central site should be able to convince an attacker that they are
      incapable of decrypting the data even if they want to[4].
 
  3.  It should not endanger the people holding the partial keys.  If the
      holders of partial keys feel threatened simply by the fact that they are
      keyholders, they may refuse to become keyholders in the first place.
 
These properties are only useful for long-term bulk storage of data, since, 
obviously, a recipient must be able to decrypt messages sent to them, which 
means that property 2 no longer holds.  However if the messages, upon 
decryption, are written straight to a bulk-encrypted storage area such as an 
SFS volume, the anti-duress measures are extended to cover stored message 
traffic as well.
 
Footnote [3]: The US government's Escrowed Encryption Standard works like this,
              with partial keys being held by two seperate organisations.
 
Footnote [4]: This is why stores which have time-locked safes often display
              notices to the effect that the staff cannot open the safe - if
              it's obvious that the staff can't open the safe, thieves won't
              try to force them to do so.
 
 
An Anti-Duress Scheme: The First Attempt
 
A good analogy to an anti-duress scheme using multiple keys can be created 
using multi-component epoxy glue.  This is the type of glue in which two 
colourless liquids (let's call them A and B) are mixed together to form a 
third colourless liquid which acts as a very powerful glue.  By themselves, 
the two components are useless.  If one of the colourless liquids is mixed 
with some different type of colourless liquid, the end result is useless.  
Only by mixing components A and B can we obtain the desired end result.
 
Let's assume we want to control the creation of the epoxy glue.  By giving 
liquid A to one person and liquid B to another, we create a condition in which 
the two people have to meet and combine their liquids to create the glue. 
However this is little better than the previously discussed scheme involving 
two keys to a safe.
 
Now let's bring 23 more colourless liquids, C through Y, into the game.  We 
now have 25 colourless liquids A through Y, each of which look, taste, smell, 
and feel identical.  We give these to 25 different people.  Only 2 of them can 
be used to create the epoxy.  The other 23 create nothing useful.  Assuming an 
attacker can obtain liquids A through Y from the people holding them, they 
would have to try, on average, 150 different combinations before they hit the 
right one.  Or they could simply mix them all together, and hope there is 
enough A and B present in the result to create the epoxy.
 
We therefore need to address two problems: The fact that an attacker can mix 
everything together in one mess and still obtain a valid result, and the fact 
that they can simply try each combination until they find a match.  The first 
problem is solved by making the liquids C through Y anti-epoxy liquids which 
act to neutralise A and B.  Therefore even if the correct liquids A and B are 
mixed, any one of C through Y will neutralise their effect.  This forces an 
attacker to try all possible combinations in order to obtain the result they 
desire.
 
The second problem is solved by making the number of possible combinations 
very large.  If we require 3 of 25 liquids to be correctly mixed to create the 
epoxy instead of 2 of 25, the average number of attempts jumps from 150 to 
1150.  If we require 10 of 25 liquids to be correctly mixed, the average 
number of attempts jumps to 1,634,380.  The greatest number of combinations is 
when we require 25/2 of 25 liquids to be mixed, with 2,600,150 attempts being 
necessary in order to guess the correct combination.  If we used 100 liquids 
and required a certain 50 of them to be mixed, an attacker would require 
approximately 50,445,672,270,000,000,000,000,000,000 attempts in order to find 
the correct combination of liquids.
 
Unfortunately, distributing and collecting hundreds of containers of 
colourless liquid is somewhat impractical.  Although this approach serves to 
explain the basic principles, we need to use a more sophisticated model to 
protect our data.
 
 
An Anti-Duress Scheme: The Second Attempt
 
Instead of looking at the scheme as requiring the combining of many seperate 
components to create the whole, we can look at it as requiring the casting of 
many votes in order for a final result to be approved.  People can vote yes or 
no, and unless significantly more yes than no votes are received, the final 
result won't be approved.
 
To use this idea to protect encrypted data, the site with the encrypted data 
takes the secret which has to be protected (in this case the encryption key) 
and breaks it into a number of *shares* M, of which N shares are needed for a 
yes vote to be valid.  Each share is sent to a *shareholder*.  The central 
site is the *share issuer*.  The shareholder can specify how many shares, or 
yes votes, are needed in order to recreate the original secret.  For example, 
they may specify that out of a total of 30 shares, 15 will be needed to 
recreate the original secret.  Therefore up to half of the shares can be lost 
before recreation of the secret becomes impossible.  This is known as an (M,N) 
threshold scheme, because the number of shares must cross the threshold level 
N before a valid result can be returned.  In this case M = 30, the total 
number of shares, and N = 15, the number of shares which must be combined to 
produce a valid result.
 
Since each share is unique, but bears no distinguishing marks, there is no way 
to tell a junk or bogus share apart from a genuine one.  The only way to 
determine if all the shares are valid is to use them to reconstruct the 
original secret, and even then a failure to reconstruct the secret will only 
reveal that one (or more) shares are invalid, but not which ones are invalid. 
This means that a shareholder can vote yes by returning a valid share, or vote 
no by returning a junk or bogus share, and an opponent will have no way of 
telling who voted yes or no.
 
Let's look at whether this fulfils the requirements we stated before:
 
  1.  It should be damage-resistant.
 
      The share issuer controls the damage resistance of the scheme by
      specifying how many shares of the total share issue are needed to
      reconstruct the original secret.
 
  2.  It should not endanger the site which stores the encrypted data.
 
      The central site can claim that all the shares still in their posession
      are junk and bogus shares which are incapable of being used to
      reconstruct the original secret.  There is no way to prove whether any of
      the shares really are valid or not.
 
  3.  It should not endanger the people holding the partial keys.
 
      The shareholder can return a valid share, or a junk or bogus share if
      they feel the need to.  Since there is no way to tell the different types
      of shares apart, there is no way an attacker can compel them to return a
      valid share, as they won't know a valid share when they see it.
 
 
Practical Implementation
 
There are two points of attack on a system using anti-duress measures of this 
sort:
 
    1. An attack on the share issuer.  In this case the issuer must be able to
       convince an attacker that they do not posess enough information to
       decrypt the secured data (even if they do).
 
    2. An attack on the shareholders.  In this case the shareholder must be
       able to convince an attacker that they do not posess any information
       which would be useful in reconstructing the encryption key (even if they
       do).
 
We need to cover both of these in order to create a secure anti-duress scheme.
 
The shares are implemented using an (M,N) threshold scheme which is used to 
protect the encryption key for an SFS volume.  The reconstruction of each 
encryption key is controlled by a *share database*, which can contain three 
types of shares:
 
  Real share records
 
    These decrypt to give a valid share.
 
  Duress share records
 
    These decrypt to a valid-seeming share if a duress password is used.  A
    duress password is a password other than the real password which is used to
    decrypt bogus shares instead of the real ones.  An opponent can then be
    told that one or more contributors handed in incorrect shares.
 
  Junk share records
 
    These decrypt to garbage with any password.  These records exist to stop an
    opponent simply forcing someone to decrypt all the shares in a share
    database.
 
A share database consists of a free mixture of real, bogus, and junk shares. 
For greatest security, a larger number of bogus and junk shares than real 
shares should be present, preferably with multiple duress passwords being used.
 
There are two forms of attack which we need to take into account:
 
 1. An attack against the share issuer, referred to as scenario 1.  If the
    share issuer is attacked, we assume they haven't had time to destroy their
    share database, but need to give the impression that they have.
 
 2. An attack against the share holder, referred to as scenario 2.  If the
    share holders are attacked, we assume that the share issuers database has
    been destroyed, and that the shareholder needs to prove to an attacker that
    their share or shares aren't of any use.
 
[!!!! See below for the rest of this section !!!!]
 
 
Share Distribution
 
The way the shares are distributed makes a major contribution to the overall 
security of the scheme.  Handing the shares out to your neighbours makes 
recovery by an opponent rather easy.  A much better solution, for those with 
international network access, is to send them to overseas shareholders.  This 
will make key recovery virtually impossible for anyone but a very determined 
opponent willing to reach out to other countries to get at the shareholders. 
Certainly the paging overhead for nonresident citizens will be rather high. 
Obtaining shares by threatening your neighbours, and obtaining shares by 
tracing someone to Upper Volta are an entirely different ball game.
 
Another useful idea for share distribution is the use of *ghosts*.  In its 
simplest form, this involves generating a number of shares, putting each one 
on a disk in encrypted form, sealing the disks in envelopes, and giving a few 
each to some trusted friends.  Each of these people keeps one and gives the 
rest to some trusted friends.  This is repeated as required.  Now even the 
share issuer has no idea where most of the shares are (thus the term ghosts 
for the shareholders).  With many shares (and long chains of ghosts to follow 
up), recovering the shares can become quite challenging.  This method 
presupposes a sizeable supply of ghosts, but a simple request to keep an 
envelope in a safe place isn't much to ask of someone.  The only negative 
aspect is the difficulty in eventually legitimately recovering the shares 
(especially if the friend-of-a-friend-of-a-friend chains are very long), and 
the need to tune the system for a high level of damage-resistance to account 
for lost shares.
 
 
Dangers of Anti-Duress Measures
 
The use of these measures is not without its risks.  If the share distribution 
scheme is used properly, the holder of the encrypted data genuinely can't 
decrypt it, no matter how hard they try.  However convincing the average 
attacker whose only goal is to get at the data by any means available of this 
fact can be very difficult, particularly if they barely understand encryption, 
let alone complex key-protection schemes of this nature.  Their response may 
be to simply apply more and more duress in an attempt to force the share 
issuer to comply.  Care should be taken to ensure that the encrypted 
information really is valuable enough to make it worth protecting in this 
manner.
 
       ---------------- End of clearly-laid-out bit --------------
 
Key safeguarding (scenario 0):
 
The database maintenance program is used for the following types of operations:
 
    Create and add M genuine shares, of which N are needed to reconstruct the
    original secret.
 
    Extract a share for distribution.
 
    Recombine a share with the database.
 
This uses the database simply for key management, and is meant for key
safeguarding rather than defending against any form of attack.  This scheme is
currently implemented in SFS.
 
 
Share issuer attack (scenario 1):
 
The database maintenance program is used for the following types of operations:
 
    Create and add M genuine shares, of which N are needed to reconstruct the
    original secret.
 
    Add a random number of duress shares at random locations in the database
 
    Add a random number of junk shares at random locations in the database
 
Every time the database is updated, all access information (such as the file
access/update time) is reset to a standard value to obscure the time at which
changes were last made.
 
That's all.  In most cases this will probably be what happens.  The sort of
people who need this form of protection won't be given any advance warning when
an enemy breaks into their home/business/computer system at 3am, so they won't
have time to delete the share database.  All they need to do is convince the
enemy that what's there isn't of any use to them.
 
 
Shareholder attack (scenario 2):
 
The database maintenance program is used for the following types of operations:
 
    Create and add M genuine shares, of which N are needed to reconstruct the
    original secret.
 
    Create a duress share for distribution.
 
    Create a junk share for distribution.
 
    Extract a share for distribution and delete the original from the database
 
    Garble the database (add random shares, change values of existing shares).
 
The garbling of the database is better than straight deletion, since an
attacker has no idea whether what's left is really a valid database from
someone anticipating scenario 1, or a garbage database from someone
anticipating scenario 2.
 
Every time the database is updated, all access information (such as the file
access/update time) is reset to some standard value to obscure the time at
which changes were last made.
 
Variant 1:
 
An extracted share is simply a share record encrypted with a password chosen by
the share issuer.  This ensures that the shareholders cannot conspire behind
the share issuers back to reconstruct the original secret.  The encrypted share
record is as follows:
 
    Offset  Size    Type        Description
 
       0      4     LONG        Share number
       4     ??     BYTE[]      Share data
 
A share is thus "E( share number, share data )".
 
For the LaGrange interpolating polynomial scheme used in SFS, the share number
is the X coordinate and the share is the Y coordinates.  The validity of a
decryption is checked by checking the share number, which should be in the
range 1 ... M.  As each share number is a small value, the chances of an
incorrect share being accepted as genuine are quite small.
 
>> Problem: Is it small enough?
 
Variant 2a
 
Another suggested layout for the share record is to encrypt the random Y
coordinate, and make the X coordinate part of the check value.  An encrypted
share record is thus:
 
    Offset  Size    Type        Description
 
       0     ??     BYTE[]      Share data
 
and a share is "E( share data ), check value 1,..., check value n".
 
Since multiple check values are present, a bogus share can be generated simply
by decrypting with the wrong password.  One of the check values corresponds to
the valid password, the rest correspond to duress passwords and junk passwords.
 
This method will be used in the (less frequent) situation where the person
under attack has a chance to destroy the share database before an attacker can
get to it.
 
>> Problem: With the nCr explosion thing for bigger numbers, does this mean we
            should simply give everyone 10 shares each just to make the numbers
            bigger (so we get 50 of 100 rather than 5 of 10)?  Is there any
            other way to artificially inflate the totals?
 
Variant 2b
 
Another option is to store X clear:
 
The data would be:
 
    X clear
        IV, E( Y ), check
        IV, E( Y ), check
        IV, E( Y ), check
 
The way to determine which E( Y ) is correct for a given password is whether
the key check fails.  To avoid collisions, you check every check value whenever
you added a new E( Y ).  This isn't that bad, just do your E( Y ), make sure
the check value doesn't collide with an existing one, if it does increment the
IV and try again.
 
In fact we may not need to store X at all - just make X part of the encryption
process somehow and interate until we find a value which results in the key
check being passed (ie try X=0, then X=1, then X=2, until we pass the key
check).  However this makes the above collision-avoidance impossible, and isn't
terribly fast.
 
A comparison of variant 2a and 2b:
 
      2a                      2b
 
    X clear                 X clear
    IV, E( Y )                  IV, E( Y ), check
        check                   IV, E( Y ), check
        check                   IV, E( Y ), check
        check
 
Which is better?  Size isn't much of a consideration, 100 shares are only about
12K with the 2b variant.  We can change 2b to create a 2c share record which is:
 
    X, IV, E( Y ), check
 
and scatter them at random throughout the share file, but is this useful?  This
variant is conceptually nicer since each share is now a nicely-packaged
seperate entity.
 
 
Duress/junk share generation:
 
To get a valid duress/junk IV/key combination which won't mistakenly decrypt
with a genuine key:
 
    for each duress/junk share
        encrypt with duress/junk key
        decrypt with real key
        if( shareNo is valid )
            increment IV, try again;
 
>> Problem: You need to know the genuine encryption key to generate duress/junk
            shares.
 
>> Problem: If two groups of duress shares clash, an attacker knows that both
            groups are just duress/junk shares, therefore we need to have *no*
            clashes at all.  Unfortunately adding any new shares then requires
            that the keys for all previous shares be known to ensure that there
            are no inadvertant decrypts of one share group with the key for
            another share group.
 
A very serious problem here is how to generate duress/junk shares which don't
clash with real shares quickly and efficiently.


home help back first fref pref prev next nref lref last post