[38900] in Kerberos

Re: Concurrency issues with FILE ccache

daemon@ATHENA.MIT.EDU (Greg Hudson)
Fri Apr 9 14:26:39 2021

To: "Osipov, Michael (LDA IT PLM)" <michael.osipov@siemens.com>,
        <kerberos@mit.edu>
From: Greg Hudson <ghudson@mit.edu>
Message-ID: <a9f9a215-0409-5b5f-dc10-6f283bd86a1f@mit.edu>
Date: Fri, 9 Apr 2021 14:24:04 -0400
In-Reply-To: <283ec56f-84fc-bd5c-43c6-773202505e38@siemens.com>

On 4/9/21 11:35 AM, Osipov, Michael (LDA IT PLM) wrote:
> I am quite sure that this is a race condition where stat() is performed,
> file does not exist, open() with write is performed, in parallel it is
> already created and the later call returns in EEXIST.

I agree, except I think it's just unlink() and open(O_CREAT|O_EXCL)
calls with no stat().  I had erroneously assumed that the unexpected
error was happening inside fcc_store() because of "Failed to store
credentials" in the message, but that string turns out to be from
get_in_tkt.c in a block of code that also calls krb5_cc_initialize().
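
A minimal sketch of that interleaving, replayed in a single process so the
outcome is deterministic.  The helper names (remove_old, create_new,
replay_self_race) and the file path are made up for illustration; this is not
the fcc_initialize() source, only the unlink()-then-open(O_CREAT|O_EXCL)
pattern described above.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Step 1 of a hypothetical initializer: remove any old cache file.
 * A missing file (ENOENT) is not treated as an error here. */
static void remove_old(const char *path)
{
    (void)unlink(path);
}

/* Step 2: create the file exclusively.  Returns an fd, or -1 with errno set. */
static int create_new(const char *path)
{
    return open(path, O_CREAT | O_EXCL | O_RDWR, 0600);
}

/* Replay the losing interleaving of two initializers "A" and "B" that both
 * run step 1 before either runs step 2.  Returns the errno seen by A's
 * open(), or 0 if A's open() unexpectedly succeeded. */
static int replay_self_race(const char *path)
{
    int fd_a, fd_b, err_a = 0;

    remove_old(path);            /* A: unlink */
    remove_old(path);            /* B: unlink */
    fd_b = create_new(path);     /* B: open succeeds and creates the file */
    fd_a = create_new(path);     /* A: open fails because the file now exists */
    if (fd_a < 0)
        err_a = errno;           /* capture before close() can change errno */

    if (fd_a >= 0)
        close(fd_a);
    if (fd_b >= 0)
        close(fd_b);
    (void)unlink(path);          /* clean up the demo file */
    return err_a;
}
```

Calling replay_self_race() on a path in a writable directory should report
EEXIST for "A", even though neither caller did anything wrong on its own.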

The fcc_initialize() EEXIST self-race has existed since krb5 1.0.  I'd
speculate that the original developers' assumption was that lots of
processes might be competing to use a file ccache, but that creating
ccaches would be a rare and one-at-a-time affair (happening at login or
when a user runs "kinit").  With client keytab support, that is no
longer the case; it's easy to have multiple threads or processes
competing to create or refresh a cache as part of gss_acquire_cred() or
gss_init_sec_context().

Just fixing the fcc_initialize() race wouldn't really solve the problem;
there would still be a window between krb5_cc_initialize() and
krb5_cc_store_cred() where other threads (or processes) would see an
initialized cache with no TGT in it, and would fail the
gss_init_sec_context() call.  This ticket describes that problem and
some possible solutions:

  https://krbdev.mit.edu/rt/Ticket/Display.html?id=7707
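
To make the window concrete, here is a toy model of it.  The file layout
(a fixed header followed by appended records) and every toy_* name are
assumptions for illustration only; they are not the real FILE ccache format
or the libkrb5 API.  The point is just that a reader looking between the two
steps sees a valid but empty cache.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define TOY_HEADER "TOYCC1\n"   /* hypothetical header, not the real format */

/* Stand-in for krb5_cc_initialize(): recreate the file with only a header. */
static int toy_initialize(const char *path)
{
    int fd;
    ssize_t n;

    (void)unlink(path);
    fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    n = write(fd, TOY_HEADER, strlen(TOY_HEADER));
    close(fd);
    return n == (ssize_t)strlen(TOY_HEADER) ? 0 : -1;
}

/* Stand-in for krb5_cc_store_cred(): append one credential record. */
static int toy_store(const char *path, const char *cred)
{
    int fd = open(path, O_WRONLY | O_APPEND);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = write(fd, cred, strlen(cred));
    close(fd);
    return n == (ssize_t)strlen(cred) ? 0 : -1;
}

/* Reader: 1 if anything follows the header, 0 if header only, -1 on error. */
static int toy_has_creds(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return st.st_size > (off_t)strlen(TOY_HEADER);
}

/* Replay the window: a second party checks the cache after initialize but
 * before store (*during), and again after store (*after). */
static int replay_window(const char *path, int *during, int *after)
{
    if (toy_initialize(path) != 0)
        return -1;
    *during = toy_has_creds(path);   /* initialized, but no TGT yet */
    if (toy_store(path, "tgt-record\n") != 0)
        return -1;
    *after = toy_has_creds(path);    /* now the credential is visible */
    (void)unlink(path);              /* clean up the demo file */
    return 0;
}
```

In this model the "during" check finds no credentials and the "after" check
finds one, which is the state another thread or process could observe in the
real case.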

Heimdal has implemented option 5.  I'm not wild about it and it won't
work with other ccache types, but it's a working stopgap and it can
always be backed out in favor of a different solution later.