[38894] in Kerberos


Concurrency issues with FILE ccache

daemon@ATHENA.MIT.EDU (Osipov, Michael (LDA IT PLM))
Tue Apr 6 11:51:52 2021

To: <kerberos@mit.edu>
From: "Osipov, Michael (LDA IT PLM)" <michael.osipov@siemens.com>
Message-ID: <9887c49f-b7ed-d83e-a190-b7398d5290d8@siemens.com>
Date: Tue, 6 Apr 2021 17:48:37 +0200
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: kerberos-bounces@mit.edu

Hi,

We are seeing some odd concurrency issues with FILE:-based credential
caches.
One Python application uses tens of concurrent threads (mostly 16 to 24)
to access resources via py-requests and py-requests-gssapi. It runs on
Debian 10 with MIT Kerberos 1.17 (GitLab Runner) and on FreeBSD 12-STABLE
with MIT Kerberos 1.19.1 (my dev box). The GSS context is maintained per
thread/request rather than through Requests' Session object.
The initiator credential comes from a service keytab set via
KRB5_CLIENT_KTNAME. For testing purposes I use kinit with my personal AD
account, but the outcome is the same. On both platforms klist shows
output like this:
> Ticket cache: FILE:/tmp/krb5cc_722
> Default principal: osipovmi@EXAMPLE.COM
> 
> Valid starting       Expires              Service principal
> 06.04.2021 17:18:38  07.04.2021 03:18:38  krbtgt/EXAMPLE.COM@EXAMPLE.COM
>         renew until 07.04.2021 17:18:35
> 06.04.2021 17:18:42  07.04.2021 03:18:38  HTTP/hostname.EXAMPLE.COM@EXAMPLE.COM
> 06.04.2021 17:18:42  07.04.2021 03:18:38  HTTP/hostname.EXAMPLE.COM@EXAMPLE.COM
> 06.04.2021 17:18:42  07.04.2021 03:18:38  HTTP/hostname.EXAMPLE.COM@EXAMPLE.COM
> 06.04.2021 17:18:42  07.04.2021 03:18:38  HTTP/hostname.EXAMPLE.COM@EXAMPLE.COM
> 06.04.2021 17:18:42  07.04.2021 03:18:38  HTTP/hostname.EXAMPLE.COM@EXAMPLE.COM

On Debian I even get errors like:

> gssapi.raw.misc.GSSError: Major (851968): Unspecified GSS failure.  Minor code may provide more information, Minor (100001): Failed to store credentials: Internal credentials cache error (filename: /tmp/krb5cc_1000)

This leads me to believe that TGT and service ticket acquisition, or the
reading and writing of the FILE cache, is not protected by a read/write
lock.
For testing purposes I modified py-requests-gssapi so that a
threading.Lock() is held around the first SecurityContext step call,
because that is where the cache gets set up. With that change the klist
output shows only *one* service ticket and everything else runs smoothly.
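
As an illustration, the locking workaround described above might look
roughly like this. It is a minimal sketch, not the actual patch to
py-requests-gssapi: `locked_first_step` and `_first_step_lock` are
hypothetical names, and `ctx` is only assumed to behave like
gssapi.SecurityContext, i.e. to provide a step(token) method.

```python
import threading

# Hypothetical module-level lock shared by all worker threads.
_first_step_lock = threading.Lock()


def locked_first_step(ctx, in_token=None):
    """Run the first step() of a security context while holding the lock.

    `ctx` is assumed to behave like gssapi.SecurityContext: step(token)
    returns the next output token, or None when there is nothing to send.
    """
    with _first_step_lock:
        # Only one thread at a time may trigger ticket acquisition and
        # the resulting write to the shared credential cache.
        return ctx.step(in_token)
```

Only the first step is serialized here; subsequent steps of the same
context would call ctx.step(token) directly, since by then the tickets
are already in the cache.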
I have also considered using
> gssapi.raw.acquire_cred_from(store={b"ccache":b"FILE:/tmp/<app>_<threadname>", b"client_keytab":b"/path/to/service.keytab"}, usage='initiate')
but I'd like to avoid adding even more code to fix a symptom rather than
the cause.
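
The per-thread cache alternative could be sketched as follows. The helper
names `per_thread_store` and `acquire_initiator_creds`, the `directory`
parameter, and the sanitizing of the thread name are assumptions made for
illustration; only the acquire_cred_from call with a `ccache` and
`client_keytab` store comes from the message above. It requires
python-gssapi built with credential store support.

```python
import re
import threading


def per_thread_store(app, keytab, directory="/tmp"):
    """Build a credential store dict with one FILE ccache per thread."""
    # Thread names may contain spaces or parentheses; keep the file name tame.
    thread_name = re.sub(r"[^A-Za-z0-9_.-]", "_",
                         threading.current_thread().name)
    ccache = f"FILE:{directory}/{app}_{thread_name}"
    return {
        b"ccache": ccache.encode(),
        b"client_keytab": keytab.encode(),
    }


def acquire_initiator_creds(app, keytab):
    """Acquire initiator credentials backed by this thread's own ccache."""
    import gssapi.raw  # imported lazily; needs python-gssapi installed

    store = per_thread_store(app, keytab)
    # acquire_cred_from returns a result tuple; .creds holds the credentials.
    return gssapi.raw.acquire_cred_from(store=store, usage="initiate").creds
```

With this layout no two threads write to the same cache file, at the cost
of every thread fetching its own TGT and service ticket.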

I have also compared the KRB5_TRACE output of the original and the
patched version of py-requests-gssapi. In the former one can clearly see
that, due to race conditions, every thread tries to retrieve a TGT and a
service ticket, while the latter (patched) version does so only once.
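
For anyone wanting to reproduce such a comparison, one way to capture a
trace per run is to set KRB5_TRACE in the environment of the process under
test. This is a small sketch; `trace_env` and the file paths in the usage
note are hypothetical.

```python
import os


def trace_env(trace_path, base_env=None):
    """Return a copy of the environment with KRB5_TRACE set to trace_path."""
    env = dict(os.environ if base_env is None else base_env)
    # libkrb5 writes its trace log to the file named by KRB5_TRACE.
    env["KRB5_TRACE"] = trace_path
    return env
```

It could then be passed to a child process, for example
subprocess.run([sys.executable, "app.py"], env=trace_env("/tmp/trace_original.log")),
where app.py stands in for the application being tested, and the two
resulting logs compared side by side.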

What is the general advice here? Is any of the cache types thread-safe?
None of them is documented either way.

Regards,

Michael
