﻿Secure Remote Authentication Using
Biometric Data
Xavier Boyen 1 , Yevgeniy Dodis 2⋆ , Jonathan Katz 3⋆⋆ , Rafail Ostrovsky 4⋆ ⋆ ⋆ ,
and Adam Smith 5
1 Voltage, Inc., xb@boyen.org
2 New York University, dodis@cs.nyu.edu
3 University of Maryland, jkatz@cs.umd.edu
4 UCLA, rafail@cs.ucla.edu
5 Weizmann Institute, asmith@csail.mit.edu
Abstract. Biometric data offer a potential source of high-entropy, secret
information that can be used in cryptographic protocols provided two
issues are addressed: (1) biometric data are not uniformly distributed;
and (2) they are not exactly reproducible. Recent work, most notably
that of Dodis, Reyzin, and Smith, has shown how these obstacles may
be overcome by allowing some auxiliary public information to be reliably
sent from a server to the human user. Subsequent work of Boyen has
shown how to extend these techniques, in the random oracle model, to
enable unidirectional authentication from the user to the server without
the assumption of a reliable communication channel.
We show two efficient techniques enabling the use of biometric data to
achieve mutual authentication or authenticated key exchange over a completely
insecure (i.e., adversarially controlled) channel. In addition to
achieving stronger security guarantees than the work of Boyen, we improve
upon his solution in a number of other respects: we tolerate a
broader class of errors and, in one case, improve upon the parameters of
his solution and give a proof of security in the standard model.
1 Using Biometric Data for Secure Authentication
Biometric data, as a potential source of high-entropy, secret information, have
been suggested as a way to enable strong, cryptographically-secure authentication
of human users without requiring them to remember or store traditional
cryptographic keys. Before such data can be used in existing cryptographic protocols,
however, two issues must be addressed: first, biometric data are not uniformly
distributed and hence do not offer provable security guarantees if used
⋆
Supported by NSF CAREER award 0133806 and Trusted Computing grant 0311095.
⋆⋆
Supported by NSF CAREER award 0447075 and Trusted Computing grants 0310751
and 0310499.
⋆ ⋆ ⋆
Supported in part by a gift from Teradata, an Intel equipment grant, an OKAWA
research award, and an NSF Cybertrust grant
as is, say, as a key for a pseudorandom function. While the problem of nonuniformity
can be addressed using a hash function, viewed either as a random
oracle [2] or a strong extractor [20], a second and more difficult problem is that
biometric data are not exactly reproducible, as two biometric scans of the same
feature are rarely identical. Thus, traditional protocols will not even guarantee
correctness when the parties use a shared secret derived from biometric data.
Much work has focused on addressing these problems in efforts to develop secure
techniques for biometric authentication [8, 15, 19, 14, 22, 21]. Most recently,
Dodis, Reyzin, and Smith [9] showed how to use biometric data to securely derive
cryptographic keys which could then be used, in particular, for the purposes
of authentication. Roughly speaking (see Section 2 for formal definitions), they
introduce two primitives: a secure sketch which allows recovery of a shared secret
given a close approximation thereof, and a fuzzy extractor which extracts a uniformly
distributed string s from this shared secret in an error-tolerant manner.
Both primitives work by constructing a “public” string pub which is stored by
the server and transmitted to the user; loosely speaking, pub encodes the redundancy
needed for error-tolerant reconstruction. The primitives are designed so
as to be “secure” even when an adversary learns the value of this public string.
Unfortunately, although these primitives suffice to obtain security in the presence
of an eavesdropping adversary who learns pub as it is sent to the user, the
work of Dodis et al. does not address the issue of malicious modification of pub.
As a consequence, their work does not provide a method for secure authentication
in the presence of an active adversary who may modify the messages sent
between the server and the user. Indeed, depending on the specific sketch or
fuzzy extractor being utilized, an adversary who maliciously alters the public
string sent to a user may be able to learn that user’s biometric data in its entirety.
A “solution” is for the user to store pub himself rather than obtain it from
the server (or to authenticate pub using a certificate chain), but this defeats the
purpose of using biometric data in the first place: namely, to avoid the need
for the user to store any additional cryptographic information — even if that
information need not be kept secret.
Boyen [5], inter alia, partially addresses potential adversarial modification
of pub (although his work focuses primarily on the orthogonal issue of re-using
biometric data with multiple servers, which we do not explicitly address here).
The main drawback of his technique in our context is that it provides only unidirectional
authentication from the user to the server. Indeed, Boyen’s approach
cannot be used to achieve authentication of the server to the user since his definition
of “insider security” (cf. [5, Section 5.2]) does not preclude an adversary
from knowing the (incorrect) value s ′ of the shared secret recovered by the user
when the adversary forwards a specially crafted pub ′ to this user; if the adversary
knows s ′ , then from the viewpoint of the user the adversary can do anything the
server could do, and hence authentication of the server to the user is impossible.
The lack of mutual authentication implies that — when communicating over
an insecure network — the user and server cannot securely establish a shared
session key with which to encrypt and authenticate future messages: the user
may unwittingly share a key with an adversary who can then decrypt any data
sent by that user as well as authenticate arbitrary data.
1.1 Our Contributions
In this paper, we provide the first full solution to the problem of secure remote
authentication using biometric data 1 : in particular, we show how to achieve
mutual authentication and/or authenticated key exchange over a completely
insecure channel. We offer two constructions. The first one is a generic solution
which protects against modification of the public value pub in any context in
which secure sketches or fuzzy extractors are used; thus, this solution serves as
a drop-in replacement that “compiles” any protocol which is secure when pub is
assumed to be transmitted reliably into one which is secure even when pub might
be tampered with (we do not formalize this notion of “compilation”, but rather
view it as an intuitive way to understand our results). Our second construction
is specific to the settings of remote authentication and key exchange, where it
offers some improvements to the generic solution.
Compared with the work of Boyen [5], which was mostly concerned with the
re-usability of biometrics, our constructions enjoy the following key advantages:
– Both of our solutions tolerate a stronger class of errors. In particular, Boyen’s
work only allows for data-independent errors, whereas our analysis handles
arbitrary (but bounded) errors. We remark that small yet data-dependent
errors seem natural in the context of biometric data.
– Our second solution is proven secure in the standard model.
– Our second solution achieves improved bounds on the entropy loss, on the
order of 128 bits of entropy for practical choices of the parameters. This point
is particularly important since the entropy of certain biometric features is
roughly this order of magnitude (e.g., 173–250 bits for an iris scan [8, 13]).
Organization. We review some basic definitions as well as the sketches/fuzzy
extractors of Dodis et al. [9] in Section 2. In Section 3 we introduce the notion of
robust sketches/fuzzy extractors which are resilient to modification of the public
value, and can be used as a generic replacement for sketches/fuzzy extractors in
any application. Our second solution, which is specific to the problem of using
biometric data for authentication and offers some advantages with respect to
our generic construction, is described in Section 4.
2 Definitions
Unless explicitly stated otherwise, all logarithms are base 2. We let Uℓ denote
the uniform distribution over ℓ-bit strings. A metric space (M, d) is a finite
set M equipped with a symmetric distance function d : M × M → Z + ∪ {0}
1 Of course, our techniques are applicable to any scenario which relies on secret data
that, like biometric data, are non-uniform and/or not exactly reproducible.
satisfying the triangle inequality and such that d(x, y) = 0 ⇔ x = y. (All metric
spaces considered in this work are discrete, and the distances integer-valued.)
For our application, we assume that the format of the biometric data is such
that it forms a metric space under some appropriate distance function. We will
not need to specify any particular metric space in our work, as our results build
in a generic way on earlier sketch and fuzzy extractor constructions over any
such space (e.g., those constructed in [9] for a variety of metrics).
A probability space (Ω, P ) is a finite set Ω and a function P : Ω → [0, 1] such
that �
ω∈Ω
P (ω) = 1. A random variable W defined over the probability space
(Ω, P ) and taking values in a set M is a function W : Ω → M. If (Ω, P ) is a
probability space over which two random variables W and W ′ taking values in a
metric space (M, d) are defined, then we say that d(W, W ′ ) ≤ t if for all ω ∈ Ω
it holds that d(W (ω), W ′ (ω)) ≤ t.
Given a metric space (M, d) and a point x ∈ M we define
Vol M t (x) def
= |{x ′ ∈ M | d(x, x ′ ) ≤ t}| , Vol M t
def
= max
x∈M {VolM t (x)}.
The latter is the maximum number of points in any “ball” of radius t in (M, d).
Following [9], for a pair of random variables A and B, we define the minentropy
H∞(A) of A, and the average min-entropy of A given B, as
H∞(A) = − log(max
a Pr[A = a]), H∞(A|B) ¯ def
= − log(Expb←B[2 −H∞(A|B=b) ]).
The statistical difference between random variables A and B taking values in
the same set D is defined as SD(A, B) def
= 1
2
2.1 Secure Sketches and Fuzzy Extractors
�
d∈D |Pr[A = d] − Pr[B = d]|.
We review the definitions from [9] using slightly different terminology. Recall
from the introduction that a secure sketch provides a way to recover a shared
secret w from any value w ′ which is a “close” approximation of w. More formally:
Definition 1. An (m, m ′ , t)-secure sketch over a metric space (M, d) comprises
a sketching procedure SS : M → {0, 1} ∗ and a recovery procedure Rec, where:
(Security) For all random variables W over M such that H∞(W ) ≥ m, we
have ¯ H∞(W | SS(W )) ≥ m ′ .
(Error tolerance) For all pairs of points w, w ′ ∈ M with d(w, w ′ ) ≤ t, it holds
that Rec(w ′ , SS(w)) = w. ♦
While secure sketches address the issue of error correction, they do not address
the issue of the possible non-uniformity of W . Fuzzy extractors, defined next,
correct for this.
Definition 2. An (m, ℓ, t, δ)-fuzzy extractor over a metric space (M, d) comprises
a (randomized) extraction algorithm Ext : M → {0, 1} ℓ × {0, 1} ∗ and a
recovery procedure Rec such that:
(Security) For all random variables W over M that satisfy H∞(W ) ≥ m, if
〈R, pub〉 ← Ext(W ) then SD(〈R, pub〉, 〈Uℓ, pub〉) ≤ δ.
(Error tolerance) For all pairs of points w, w ′ ∈ M with d(w, w ′ ) ≤ t, if
〈R, pub〉 ← Ext(w) then it is the case that Rec(w ′ , pub) = R. ♦
As shown in [9, Lemma 3.1], it is easy to construct a fuzzy extractor over a metric
space (M, d) given any secure sketch defined over the same space, by applying
a strong extractor [20] using a random “key” which is then included as part of
pub. Starting with an (m, m ′ , t)-secure sketch and with an appropriate choice of
extractor, this transformation yields an (m, m ′ − 2 log( 1
δ ), t, δ)-fuzzy extractor.
2.2 Modeling Error in Biometric Applications
As error correction is a key motivation for our work, it is necessary to develop
some formal model of the types of errors that may occur. In prior work by
Boyen [5], the error in various biometric readings was assumed to be under
adversarial control, with the restriction that the adversary could only specify
data-independent errors (e.g., constant shifts, permutations, etc.). It is not clear
that this is a realistic model in practice, as one certainly expects, say, portions of
the biometric data where “features” are present to be more susceptible to error.
Here, we consider a much more general error model where the errors may be
data-dependent and hence correlated not only with each other but also with the
biometric secret itself. Furthermore, as we are ultimately interested in modeling
“nature” — as manifested in the physical processes that cause fluctuations in the
biometric measurements — we do not even require that the errors be efficiently
computable (although we will impose this requirement in Section 4). The only
restriction we make is that the errors be “small” and, in particular, less than
the desired error-correction bound; since the error-correction bound in any realworld
application should be selected to ensure correctness with high probability,
this restriction seems reasonable. Formally:
Definition 3. A t-bounded distortion ensemble W = {Wi}i=0,... is a sequence
of random variables Wi : Ω → M such that for all i we have d(W0, Wi) ≤ t. ♦
For our application, W0 represents the biometric reading obtained when a
user initially registers with a server, and Wi represents the biometric reading
on the i th authentication attempt by this user. Note that, regardless of the
protocol used, an adversary can always impersonate the server if the adversary
can guess Wi for some i > 0. The following lemmas give bound the probability
of this occurrence. First, we show that the min-entropy of each Wi is, at worst,
log(Vol M t ) bits less than that of W0. Moreover, we show that Wi is no easier to
guess than W0 when SS(W0) is available.
Lemma 1. Let W0, W1 be random variables over M satisfying d(W0, W1) ≤ t,
and let B be an arbitrary random variable. Then
¯H∞(W1 | B) ≥ ¯ H∞(W0 | B) − log Vol M t .
Proof. Fix x ∈ M and any outcome B = b. Since d(W0, W1) ≤ t, we have
Pr[W1 = x | B = b] ≤ �
x ′ |d(x,x ′ )≤t Pr[W0 = x ′ | B = b] ≤ Vol M t · 2 −H∞(W0|B=b) ,
which means that H∞(W1 | B = b) ≥ H∞(W0 | B = b) − log Vol M t . Since this
relation holds for every b, the lemma follows. ⊓⊔
Secure sketches imply the following, stronger form of Lemma 1 which essentially
states that points close to W0 cannot be easier to guess than W0 if the value of
the sketch SS(W0) is known.
Lemma 2. Let W0, W1 be random variables over M satisfying d(W0, W1) ≤ t,
and let B be an arbitrary random variable. Let (SS, Rec) be a (⋆, ⋆, t)-secure
sketch. Then
¯H∞(W1 | SS(W0), B) ≥ ¯ H∞(W0 | SS(W0), B).
Proof. Notice that since d(W0, W1) ≤ t, we have Rec(W1, SS(W0)) = W0, which
means that if for some x, b, pub we have Pr(W1 = x | SS(W0) = pub, B = b) ≥ α,
then Pr(W0 = Rec(x, pub) | SS(W0) = pub, B = b) ≥ α as well. Since this holds
for all x, b and pub, the lemma follows. ⊓⊔
The analogue of Lemma 2 for fuzzy extractors holds as well (with SS(W0) replaced
by pub).
3 Robust Sketches and Fuzzy Extractors
Recall that a secure sketch, informally speaking, takes a secret w and returns
some value pub which allows the recovery of w given any “close” approximation
w ′ of w. When pub is transmitted to a user over an insecure network, however,
an adversary might modify pub in transit. In this section, we define the notion
of a robust sketch which protects against this sort of attack in a very strong way:
with high probability, the user will detect that pub has been modified and can
thus immediately abort in this case. A robust fuzzy extractor is defined similarly.
We then show: (1) a construction of a robust sketch in the random oracle model,
starting from any secure sketch; and (2) a conversion from any robust sketch
to a robust fuzzy extractor; this conversion does not require random oracles.
We conclude this section by showing the immediate application of robust fuzzy
extractors to the problem of mutual authentication.
We first define a slightly stronger notion of a secure sketch:
Definition 4. An (m, m ′ , t)-secure sketch (SS, Rec) is said to be well-formed
if it satisfies the conditions of Definition 1 except for the following modifications:
(1) Rec may now return either an element in M or the distinguished
symbol ⊥; and (2) for all w ′ ∈ M and arbitrary pub ′ , if Rec(w ′ , pub ′ ) �=⊥ then
d(w ′ , Rec(w ′ , pub ′ )) ≤ t. ♦
It is straightforward to transform any secure sketch (SS, Rec) into a well-formed
secure sketch (SS, Rec ′ ): Rec ′ runs Rec and then verifies that its output w is
within distance t of the input w ′ . If yes, it outputs w; otherwise, it outputs ⊥.
We now define the notion of a robust sketch:
Definition 5. Given algorithms (SS, Rec) and random variables W = {W0, W1,
. . ., Wn} over metric space (M, d), consider the following game between an adversary
A and a challenger: Let w0 (resp., wi) be the value assumed by W0
(resp., Wi). The challenger computes pub ← SS(w0) and gives pub to A. Next,
for i = 1, . . . , n, the adversary A outputs a “challenge” pub i �= pub and is given
Rec(wi, pub i) in return. If there exists an i such that Rec(wi, pub i) �=⊥ we say
that the adversary succeeds and this event is denoted by Succ.
We say that (SS, Rec) is an (m, m ′′ , n, ε, t)-robust sketch over (M, d) if it is a
well-formed (m, ⋆, t)-secure sketch and: (1) for all t-bounded distortion ensembles
W with H∞(W0) ≥ m and all adversaries A we have Pr[Succ] ≤ ε; and (2) the
average min-entropy of W0, conditioned on the entire view of A throughout the
above game, is at least m ′′ . 2 ♦
A simpler definition would be to consider only random variables {W0, W1} and
to have A only output a single value pub1 �= pub. A standard hybrid argument
would then imply the above definition with ε increased by a multiplicative
factor of n. We have chosen to work with the more general definition above
as it potentially allows for a tighter concrete security analysis. Also, although
the above definition allows all-powerful adversaries, we will consider adversaries
whose queries to a random oracle are bounded (but which are otherwise computationally
unbounded). We remark that for a truly unbounded adversary (i.e.,
where even the oracle queries — if any — are unbounded), it is necessarily the
case that m ′′ ≥ log 1
ε since at the last step of the game the adversary can guess
Wn with probability 2−m′′ and thus succeed with probability ε ≥ 2−m′′ .
3.1 Constructing a Generic Robust Sketch
Let H : {0, 1} ∗ → {0, 1} k be a hash function. We construct a robust sketch
(SS, Rec) from any well-formed secure sketch (SS ∗ , Rec ∗ ) as follows:
SS(w)
pub ∗ ← SS ∗ (w)
h = H(w, pub ∗ )
return pub = 〈pub ∗ , h〉
Rec(w, pub = 〈pub ∗ , h〉)
w ′ = Rec ∗ (w, pub ∗ )
if w ′ =⊥ output ⊥
if H(w ′ , pub ∗ ) �= h output ⊥
otherwise, output w ′
Theorem 1. If (SS ∗ , Rec ∗ ) is a well-formed (m, m ′ , t)-secure sketch over metric
space (M, d) and H : {0, 1} ∗ → {0, 1} k is a random oracle, then (SS, Rec) is an
(m, m ′′ , n, ε, t)-robust sketch over (M, d) for any adversary making at most qH
queries to H, where
ε = (q 2 H + n) · 2 −k + (3qH + 2n · Vol M t ) · 2 −m′
,
m ′′ = − log ε.
When k ≥ m ′ + log qH (which can be enforced in practice), the above simplifies
to ε ≤ (4qH + 2n · Vol M t ) · 2 −m′
and m ′′ ≥ m ′ − log(4qH + 2n · Vol M t ).
2 In particular, this implies that (SS, Rec) is an (m, m ′′ , t)-secure sketch.
Proof (Sketch) It is easy to see that (SS, Rec) is an (m, ⋆, t)-secure sketch
and thus we only need to prove the latter two conditions of Definition 5. In
order to provide intuition, the following proof is somewhat informal; however,
the arguments given here can easily be formalized. Let pub = 〈pub ∗ , h〉 denote
the value output by SS in an execution of the game described in Definition 5.
Note that if A ever outputs pubi = 〈pub ∗ i , hi〉 with pub ∗ i = pub ∗ then the response
is always ⊥, since then we must have hi �= h and so Rec will output ⊥. Thus, we
simply assume that pub ∗ i �= pub ∗ .
Fix a t-bounded distortion ensemble {W0, W1, . . . , Wn} with H∞(W0) ≥ m.
For any output pubi = 〈pub ∗ i , hi〉 of A, define the random variable W ′ def
i =
Rec ∗ (Wi, pub ∗ i ). In order not to complicate notation, we define
H∞(W ′ i ) def
�
= − log max
x∈M Pr[W ′ �
i = x] ;
i.e., we ignore the probability that W ′ i =⊥ since A does not succeed in this case.
¯H∞(W ′ i | X) for a random variable X is defined similarly. Let w0, wi, and w ′ i
denote the values taken by the random variables W0, Wi, and W ′ i , respectively.
We classify the random oracle queries of A into two types: type 1 queries are
those of the form H(·, pub ∗ ), and type 2 queries are all the others. Informally,
type 1 queries represent attempts by A to learn the value of w0; in particular,
if A finds w such that H(w, pub ∗ ) = h then it is “likely” that w0 = w. Type 2
queries represent attempts by A to determine an appropriate value for some hi;
i.e., if A “guesses” that w ′ i = w for a particular choice of pub∗i then a “winning”
strategy is for A to obtain hi = H(w, pub ∗ i ) and output pubi = 〈pub ∗ i , hi〉.
Without loss of generality, we assume that A makes all its type 1 queries first,
then makes all its type 2 queries, and finishes by making its n queries to the
challenger (cf. Definition 5) in parallel (i.e., non-adaptively). The validity of the
assumption on the ordering of the type 1 and type 2 queries follows essentially
from the analysis that follows. The assumption that all queries to the random
oracle are made before any queries to the challenger is justified by the observation
that if Rec(Wi, pubi) �=⊥ then the adversary has already succeeded, in which case
we can end the game, whereas the only remaining response Rec(Wi, pubi) =⊥
can be simulated by the adversary itself. This also justifies why we may assume
that the adversary’s challenges are made in parallel.
Let Q1 (resp., Q2) be a random variable denoting the sequence of type 1
(resp., type 2) queries made by A and the corresponding responses, and let q1
(resp., q2) denote the value assumed by Q1 (resp., Q2). For some fixed value of
def
pub, define γpub = H∞(W0|pub). Notice, since (SS ∗ , Rec ∗ ) is an (m, m ′ , t)-secure
sketch, we have Exppub[2−γpub −m ] ≤ 2 ′
. Now, define γ ′ def
= H∞(W0 | pub, q1),
pub,q1
and let us call the value q1 “bad” if γ ′ pub,q1 ≤ γpub − 1. We consider two cases: If
2γpub ≤ 2qH we will not have any guarantees, but using Markov’s inequality we
have Pr[2γpub ≤ 2qH] = Pr[2−γpub −m ≥ 2 ′
· (2m′ /2qH)] ≤ 2qH · 2−m′ . Otherwise, if
2γpub > 2qH, we observe that the type 1 queries of A may be viewed as guesses
of w0. In fact, it is easy to see that we only improve the success probability
of A if in response to a type 1 query of the form H(w, pub ∗ ) we simply tell A
whether w0 = w or not. 3 It is immediate that A learns the correct value of w0
with probability at most qH ·2 −γpub . Moreover, when this does not happen, A has
eliminated at most qH ≤ 2 γpub /2 (out of at least 2 γpub ) possibilities for w0, which
means that γ ′ pub,q1 ≥ γpub − 1, or in other words that q1 is “good”. Therefore,
the probability that q1 is “bad” in this second case is at most qH · 2 −γpub .
Combining the above two arguments, we see that
Exp pub[Pr[q1 bad]] ≤ Prpub[2 γpub ≤ 2qH] + Exp pub[qH · 2 −γpub ]
Next, define γ ′′
pub,q1
≤ 2qH · 2 −m′
+ qH · 2 −m′
= 3qH · 2 −m′
. (1)
def
= mini(H∞(W ′ i | pub, q1)). Recall that {W0, W1, . . .} is a
t-bounded distortion ensemble which means d(W0, Wi) ≤ t. Furthermore, since
(SS ∗ , Rec ∗ ) is well-formed, {Wi, W ′ i } is also a t-bounded distortion ensemble4
regardless of pub ∗ i , which means d(Wi, W ′ i ) ≤ t. Applying Lemma 2 on {W0, Wi}
(noticing that pub contains pub ∗ ), followed by Lemma 1 on {Wi, W ′ i }, we have
γ ′′
pub,q1 ≥ min
i (H∞(Wi | pub, q1)) − log Vol M t ≥ γ ′ pub,q1 − log VolMt . (2)
We now consider the type 2 queries made by A. Clearly, the answers to these
queries do not affect the conditional min-entropies of W ′ i (since these queries do
not include pub ∗ ), so the best probability for the attacker to predict any of the
W ′ i
pub,q is still given by 2−γ′′ 1 , for fixed pub and q1. Assume for a moment that
there are no collisions in the outputs of any of the adversary’s random oracle
queries, and consider the adversary’s i th query 〈pub ∗ i , hi〉 to the challenger. The
probability that this query is “successful” is at most the probability that A asked
a type 2 query of the form H(w ′ i , ·) for the correct w′ i plus the probability that
such a query was not asked, yet A nevertheless managed to predict the value
H(w ′ i , pub∗i ). Clearly, the second case happens with probability at most 2−k . As
for the first case, for any hi there is at most one w for which H(w, ·) = hi, since,
by assumption, there are no collisions in these type 2 queries. Thus, the adversary
succeeds on its ith query if this w is equal to the correct value w ′ i . By what we just
argued, the probability that this occurs is at most 2 −γ′′
pub,q 1 , irrespective of pub ∗ i .
Therefore, assuming no collisions in type 2 queries, the success probability of A
in any one of its n parallel queries is at most n · (2 −γ′′
pub,q 1 + 2 −k ). Furthermore,
by the birthday bound the probability of a collision is at most q2 H /2k . Therefore,
, we find
conditioned on pub and q1 and for the corresponding value of γ ′′
pub,q1
that Pr[Succ | pub, q1] ≤ n · 2 −γ′′ pub,q1 + (q2 H + n) · 2−k .
To conclude, the adversary’s overall probability of success is thus bounded
by the expectation, over pub and q1, of this previous quantity; that is:
Pr[Succ] = Exp pub,q1 [Pr[Succ | pub, q1]]
3 This has no effect when H(w, pub ∗ ) �= h as then A learns anyway that w �= w0. The
modification has a small (but positive) effect on the success probability of A when
H(w, pub ∗ ) = h since this fact by itself does not definitively guarantee that w = w0.
4 This ignores the case when W ′ i =⊥; see the definition of H∞(W ′ i ) given earlier.
≤ (q 2 H + n) · 2 −k
⎡
+ Exp pub
⎣ Pr [q1 bad | pub] +
q1←Q1
�
n · 2
q1 good
−γ′′
pub,q 1 · Pr[Q1 = q1 | pub]
Using Equation 2, we see that 2 −γ′′
pub,q 1 ≤ Vol M t · 2 −γ′
pub,q 1 . Moreover, for good q1
we have γ ′ pub,q1 ≥ γpub − 1, which means that 2 −γ′′
pub,q 1 ≤ 2 · Vol M t · 2 −γpub . Finally,
using Equation 1, we have Exppub[Pr[q1 bad | pub]] ≤ 3qH · 2−m′ . Combining all
these, we successively derive:
Pr[Succ] ≤ (q 2 H + n) · 2 −k + 3qH · 2 −m′
�
+ Exppub 2n · Vol M t · 2 −γpub
�
· Pr [q1 good]
q1←Q1
≤ (q 2 H + n) · 2 −k + 3qH · 2 −m′
+ 2n · Vol M t · Exppub ≤ (q 2 H + n) · 2 −k + (3qH + 2n · Vol M t ) · 2 −m′
= ε.
� �
−γpub 2
As for the claimed value of m ′′ , we omit the details since they follow almost
the same argument. As above, assuming that q1 is good, that no collisions occur
in type 2 queries, and that the adversary does not manage to guess any of the
values H(w ′ i , pub∗i ), the conditional min-entropy of W0 is at least γ ′ − log(n ·
pub,q1
Vol M t ) ≥ γpub −1−log(n·Vol M t ). On the other hand, all these bad events leading
to a possibly smaller min-entropy of W0 happen with (expected) probability (over
pub) at most (q2 H + n) · 2−k + 3qH · 2−m′ . From this, it is easy to see that if View
represents the adversary’s view in the experiment, then
�
¯H∞(W0|View) ≥ − log (q 2 H + n) · 2 −k + 3qH · 2 −m′
(3)
+ Exp pub
≥ − log
�
2n · Vol M t · 2 −γpub
��
�
(q 2 H + n) · 2 −k + (3qH + 2n · Vol M t ) · 2 −m′�
⎤
⎦ .
= m ′′ .
We remark that the above proof uses only a non-programmable random oracle.
The bound on ε that we�derive in the above proof has an intuitive interpretation.
The sub-expression qH + n · Vol M �
t · 2−m′ that appears (up to a small
constant factor due to the analysis) can be viewed as the probability that the
adversary “gets information” about the point w0: The contribution qH · 2−m′ is due to the type 1 oracle queries where, for each of at most qH queries, the
adversary “hits” the correct value of w0 with probability 2−m′ . Then, each of
the adversary’s n challenges cover no more than Vol M t candidates for w0, since
each such query eliminates at most one value for w ′ i (unless collisions in type 2
queries occur), which in turn eliminates up to Vol M t candidates for wi, each of
which can only eliminate one candidate Rec(wi, pub ∗ ) for w0. Besides the above,
the other contributions to ε are due to the probability of collisions in the random
⊓⊔
oracle, plus a small term to account for the possibility that the adversary can
guess the output of the random oracle at an unqueried point.
In practice, one can set k large enough so that max(qH, n · Vol M t ) is the
dominant factor determining the amount of the additional “loss” incurred as
compared to regular “non-robust” sketches.
3.2 From Robust Sketches to Robust Fuzzy Extractors
In a manner exactly analogous to the above, we may define the notion of a robust
fuzzy extractor. We include the definition here since we refer to it in the next
subsection:
Definition 6. Given algorithms (Ext, Rec) and random variables W = {W0,
W1, . . ., Wn} over a metric space (M, d), consider the following game between
an adversary A and a challenger: Let w0 (resp., wi) be the value assumed by
W0 (resp., Wi). The challenger computes (R, pub) ← Ext(w0) and gives pub to
A. Next, for i = 1, . . . , n, the adversary A outputs pub i �= pub and is given
Rec(wi, pub i) in return. If there exists an i such that Rec(wi, pub i) �=⊥, we say
the adversary succeeds and this event is denoted by Succ.
We say (Ext, Rec) is an (m, ℓ, n, ε, t, δ)-robust fuzzy extractor over (M, d) if
the following hold for all t-bounded distortion ensembles W with H∞(W0) ≥ m:
(Robustness) For all adversaries A, it holds that Pr[Succ] ≤ ε.
(Security) Let View denote the entire view of A at the conclusion of the above
game. Then, SD(〈R, View〉, 〈Uℓ, View〉) ≤ δ.
(Error-tolerance) For all w ′ with d(w0, w ′ ) ≤ t, we have Rec(w ′ , pub) = R. ♦
By applying a pairwise-independent hash function (i.e., a strong extractor) almost
exactly as in [9, Lemma 3.1], we can convert any robust sketch to a robust
fuzzy extractor in the standard model (losing, as there, 2 log δ−1 bits of entropy).
A subtlety is that we need to “bind” the hash function key to the sketch itself.
We do this using a primitive we call a labeled robust sketch which, in a nutshell,
defines a family of robust sketches indexed by a label. (For example, in
the specific construction of robust sketches from the previous section labels can
be incorporated by including the label as the input to the hash function which
is modeled as a random oracle.) By using the key to the hash function as a
label, we can bind the key to the sketch as needed. Details will appear in the
full version.
We remark that if we are content to assume a random oracle G (as we anyway
only know how to construct robust sketches in the random oracle model), we can
trivially “extract” from a random variable w by computing G(w). This has the
advantage of not losing 2 log δ−1 bits of entropy when extracting. In this case,
we achieve δ ≤ qG · 2−m′′ , where qG is the number of queries to G and m ′′ is as
in Theorem 1.
3.3 Application to Secure Authentication
The application of a robust fuzzy extractor to achieve mutual authentication or
authenticated key exchange over an insecure channel is immediate. Given any
secure protocol Π (say, for authenticated key exchange) based on a uniformlydistributed
shared key of length ℓ, any (m, ℓ, n, ε, t, δ)-robust fuzzy extractor
(Ext, Rec), and any source W0 with H∞(W0) ≥ m, consider the protocol Π ′
constructed as follows:
Initialization. The user samples w0 according to W0 (i.e., takes a scan of
his biometric data) and computes (R, pub) ← Ext(w0). The user registers
(R, pub) at the server.
Protocol execution. The i th time the user wants to run the protocol, the user
will sample wi according to some distribution Wi (i.e., the user re-scans
his biometric data). The server sends pub to the user, who then computes
ˆR = Ext(wi, pub). If ˆ R =⊥, the user immediately aborts. Otherwise, the
server and user execute protocol Π, with the server and the user respectively
using the keys R and ˆ R.
Assume that W = {W0, W1, . . .} is a t-bounded distortion ensemble. Correctness
of the above protocol is easily seen to hold: if the user obtains the correct value
of pub from the server then, because d(w0, wi) ≤ t, the user will recover ˆ R = R
and thus both user and server will end up using the same key R in the underlying
protocol Π. The security of Π ′ with respect to the definitions of [3, 1], which
consider an active adversary who may control all messages sent between the user
and the server, follows from the following observations:
– If the adversary forwards pub ′ �= pub to at most n different user-instances,
these instances will all abort immediately (without running Π) except with
probability at most ε. Thus, except with this probability, the adversary is
limited to forwarding the correct value of pub.
– When the adversary forwards pub unchanged, the user and server run an
execution of Π using a key R which is within statistical difference δ from a
uniformly distributed ℓ-bit key. Note that this is true even when conditioned
on the view of the adversary in sessions when it does not forward pub unchanged
(cf. Definition 6). Thus, assuming Π is secure, the adversary will
not succeed in “breaking” Π ′ in this case either.
In terms of concrete security, if the security of Π against an adversary who
executes at most n sessions with the user and the server is εΠ, then the security
of Π ′ is ε+δ+εΠ. A formal proof following the above intuition is straightforward,
and will appear in the full version of this work.
4 Improved Solution Tailored for Mutual Authentication
As discussed in the introduction, the robust sketches and fuzzy extractors described
in the previous section provide a general mechanism for dealing with
adversarial modification of the public value pub. In particular, taking any protocol
based on the secure sketches or fuzzy extractors of [9] which is secure when
the public value is assumed not to be tampered with, and plugging in a robust
sketch or fuzzy extractor, yields a protocol secure against an adversary who may
either modify the contents of the server — as in the case where the server itself
is malicious — or else modify the value of pub when it is sent to the user.
For specific problems of interest, however, it remains important to explore solutions
which might improve upon the general-purpose solution described above.
In this section, we show that for the case of mutual authentication and/or authenticated
key exchange an improved solution is indeed possible. As compared
to the generic solution based on robust fuzzy extractors (cf. Section 3.3), the
solution described here has the advantages that: (1) it is provably secure in
the standard model; and (2) it can achieve improved bounds on the “effective
entropy loss”. We provide an overview of our solution now.
Given the proof of Theorem 1, the intuition behind our current solution is
actually quite straightforward. As in that proof, let W = {W0, . . .} be a sequence
of random variables where W0 represents the initial recorded value of the user’s
biometric data and Wi denotes the ith scanned value of the biometric data. Given
a well-formed secure sketch (SS ∗ , Rec ∗ ) and a value pub ∗ i �= pub ∗ = SS ∗ (W0)
def
chosen by the adversary, let W ′ i = Rec(Wi, pub ∗ i ) and define the min-entropy
of W ′ i as in the proof of Theorem 1. At a high level, Theorem 1 follows from
the observations that: (1) the average min-entropy of W ′ i is “high” for any value
pub ∗ i ; and (2) since the adversary succeeds only if it can also output a value
hi = H(W ′ i , pub∗i ), where H is a random oracle, the adversary is essentially
′
−H∞(W unable to succeed with probability better than 2 i ) in the ith iteration.
Crucial to the proof also is the fact that, except with “small” probability, the
value h = H(W0, pub ∗ ) does not reduce the entropy of W0 “very much” (again
using the fact that H is a random oracle).
The above suggests that another way to ensure that the adversary does not
′
−H∞(W succeed with probability better than 2 i ) in any given iteration would be
to have the user run an “equality test” using its recovered value W ′ i . If this
equality test is “secure” (in some appropriate sense we have not yet defined)
then the adversary will effectively be reduced to simply guessing the value of
W ′ i , and hence its success probability in that iteration will be as claimed. Since
we have already noted that the average min-entropy of W ′ i is “high” when any
well-formed secure sketch is used (regardless of the value pub ∗ i chosen by the
adversary), this will be sufficient to ensure security of the protocol overall.
Thinking about what notion of security this “equality test” should satisfy,
one realizes that it must be secure for arbitrary distributions on the user’s secret
value, and not just uniform ones. Also, the protocol must ensure that each interaction
by the adversary corresponds to a guess of (at most) one possible value
for W ′ i . Finally, since the protocol is meant to be run over an insecure network,
it must be “non-malleable” in some sense so that the adversary cannot execute a
man-in-the-middle attack when the user and server are both executing the protocol.
Finally, the adversary should not gain any information about the user’s
true secret W0 (at least in a computational sense) after passively eavesdropping
on multiple executions of the protocol. With the problem laid out in this way, it
becomes clear that one possibility is to use a password-only authenticated key
exchange (PAK) protocol [4, 1, 6] as the underlying “equality test”.
Although the above intuition is appealing, we remark that a number of subtleties
arise when trying to apply this idea to obtain a provably secure solution.
In particular, we will require the PAK protocol to satisfy a slightly stronger
definition of security than that usually considered for PAK (cf. [1, 6, 12]); informally,
the PAK protocol should remain “secure” even when: (1) the adversary
can dynamically add clients to the system, with (unique) identities chosen by the
adversary; (2) the adversary can specify non-uniform and dependent password
distributions for these clients; and (3) the adversary can specify such distributions
adaptively at the time the client is added to the system. Luckily, it is not
difficult to verify that at least some existing protocols (e.g., [1, 17, 18, 11, 16])
satisfy a definition of this sort. 5 (Interestingly, the recent definition of [7] seems
to imply the above properties.) Due to lack of space, the formal definition of
security required for our application is deferred to the full version.
4.1 A Direct Construction
With the above in mind, we now describe our construction. Let Π be a PAK
protocol and let (SS, Rec) be a well-formed secure sketch. Construct a modified
protocol Π ′ as follows:
Initialization. User U samples w0 according to W0 (i.e., takes a scan of his
biometric data) and computes pub ← SS(w0). The user registers (w0, pub)
at the server S.
Protocol execution (server). The server sends pub to the user. It then executes
protocol Π using the following parameters: it sets its own “identity”
(within Π) to be S�pub, its “partner identity” to be pid = U�pub, and the
“password” to be w0.
Protocol execution (user). The i th time the user executes the protocol, the
user first samples wi according to distribution Wi (i.e., the user re-scans his
biometric data). The user also obtains a value pub ′ in the initial message it
receives, and computes w ′ = Rec(wi, pub ′ ). If w ′ =⊥ then the user simply
aborts. Otherwise, the user executes protocol Π, setting its own “identity”
to U�pub ′ , its “partner identity” to S�pub ′ , and using the “password” w ′ .
It is easy to see that correctness holds, since if the user and the server interact
without any interference from the adversary then: (1) the identity used by the
server is equal to the partner ID of the user; (2) the identity of the user is the
same as the partner ID of the server; and (3) the passwords w0 and w ′ are
identical. Before discussing the security of this protocol, we need to introduce a
5 In fact, it is already stated explicitly in [17, 11] that the given protocols remain secure
even under conditions 1 and 2, and it is not hard to see that they remain secure
under condition 3 as well.
slight restriction of the notion of a t-bounded distortion ensemble in which the
various random variables in the ensemble are (efficiently) computable:
Definition 7. Let (M, d) be a metric space. An explicitly computable t-bounded
distortion ensemble is a sequence of boolean circuits W = {W0, . . .} and a
parameter ℓ such that, for all i, the circuit Wi computes a function from {0, 1} ℓ
to M and, furthermore, for all r ∈ {0, 1} ℓ we have d(W0(r), Wi(r)) ≤ t. ♦
In our application, W will be output by a ppt adversary, ensuring both that
the ensemble contains only a polynomial number of circuits and that each such
circuit is of polynomial size (and hence may be evaluated efficiently). We remark
that it is not necessary for our proof that it be possible to efficiently
verify whether a given W satisfies the “t-bounded” property or whether the
min-entropy of W0 is as claimed, although the security guarantee stated below
only holds if W does indeed satisfy these properties. 6 With the above in mind,
we now state the security achieved by our protocol:
Theorem 2. Let Π be a secure PAK protocol (with respect to the definition
sketched earlier) and let A be a ppt adversary. If (SS, Rec) is a well-formed
(m, m ′ , t)-secure sketch over a metric space (M, d), and W = {W0, . . .} is
an explicitly-computable t-bounded distortion ensemble (output adaptively by A)
with H∞(W0) ≥ m, then the success probability of A in attacking protocol Π ′ is at
most qs ·2−m′′ +negl(κ), where qs represents the number of sessions in which the
adversary attempts to impersonate one of the parties, and m ′′ = m ′ − log Vol M t .
Due to space limitations, the proof is deferred to the full version.
Specific instantiations. As noted earlier, a number of PAK protocols satisfying
the required definition of security are known. If one is content to work in the
random oracle model then the protocol of [1] may be used (note that this still
represents an improvement over the solution based on robust fuzzy extractors
since the “effective key size” will be larger, as we discuss in the next paragraph).
To obtain a solution in the standard model which is only slightly less efficient, the
PAK protocols of [17, 11, 16] could be used. 7 Note that although these protocols
were designed for use with “short” passwords, they can be easily modified to
handle “large” passwords without much loss of efficiency; we discuss this further
in the full version.
6 As to whether the adversary can be “trusted” to output a W satisfying these properties,
recall that W anyway is meant to model naturally-occurring errors. Clearly, if
a real-world adversary has the ability to, e.g., introduce arbitrarily-large errors then
only weaker security guarantees can be expected to hold.
7 Although these protocols require public parameters, such parameters can be “hard
coded” into the implementation of the protocol and are fixed for all users of the
system; thus, users are not required to remember or store these values. The difference
is akin to the difference between PAK protocols in a “hybrid” PKI model (where
clients store their server’s public key) and PAK protocols (including [17, 11, 16]) in
which clients need only remember a short password.
4.2 Comparing Our Two Solutions
It is somewhat difficult to compare the security offered by our two solutions
(i.e., the one based on robust fuzzy extractors and the one described in this
section) since an exact comparison depends on a number of assumptions and
design decisions. As we already observed, the main advantage of the solution
described in this section is that it does not rely on random oracles. On the other
hand, the solution based on robust fuzzy extractors is simpler and more efficient.
The solution presented in this section does not require any randomness extraction,
and it therefore “saves” 2 log δ −1 bits of entropy as compared with
solutions that apply standard randomness extractors to the recovered biometric
data. Since a likely value in practice is δ ≤ 2 −64 , this results in a potential savings
of at least 128 bits of entropy. When the entropy of the original biometric
data is “large”, however, we notice that (1) as mentioned already in the previous
section, we may use a random oracle as our randomness extractor and thereby
avoid the loss of 2 log δ −1 bits of entropy; and (2) our two approaches can be
combined, and one can use a PAK protocol with any robust sketch. If this is
done then additional extraction is not required, and so we again avoid losing
2 log δ −1 bits of entropy.
On the other hand, the solution of the present section offers a clear advantage
when the entropy of the original biometric data is “small”. Although in this case
the adversary can succeed by an exhaustive, on-line “dictionary” attack, the
security of our second solution implies that this is the best an adversary can do.
In contrast, our solution based on robust sketches would not be appropriate in
this case since the adversary could determine the user’s secret biometric data
using off-line queries to the random oracle (cf. the factor proportional to qH·2 −m′
in Theorem 1).
References
1. M. Bellare, D. Pointcheval, and P. Rogaway. Authenticated Key Exchange Secure
Against Dictionary Attacks. Adv. in Cryptology — Eurocrypt 2000, LNCS vol.
1807, Springer-Verlag, pp. 139–155, 2000.
2. M. Bellare and P. Rogaway. Random Oracles are Practical: A Paradigm for Designing
Efficient Protocols. ACM CCS 1993, ACM Press, 1993.
3. M. Bellare and P. Rogaway. Entity Authentication and Key Distribution. Adv. in
Cryptology — Crypto 1993, LNCS vol. 773, Springer-Verlag, pp. 232–249, 1993.
4. S. Bellovin and M. Merritt. Encrypted Key Exchange: Password-Based Protocols
Secure Against Dictionary Attacks. IEEE Symposium on Research in Security and
Privacy, IEEE, pp. 72–84, 1992.
5. X. Boyen. Reusable Cryptographic Fuzzy Extractors. ACM CCS 2004, ACM Press,
pp. 82–91, 2004.
6. V. Boyko, P. MacKenzie, and S. Patel. Provably-Secure Password-Authenticated
Key Exchange Using Diffie-Hellman. Adv. in Cryptology — Eurocrypt 2000, LNCS
vol. 1807, Springer-Verlag, pp. 156–171, 2000.
7. R. Canetti, S. Halevi, J. Katz, Y. Lindell, and P. MacKenzie. Universally Composable
Password-Based Key Exchange. Eurocrypt 2005 (these proceedings).
8. G. Davida, Y. Frankel, and B. Matt. On Enabling Secure Applications Through
Off-Line Biometric Identification. IEEE Security and Privacy ’98.
9. Y. Dodis, L. Reyzin, and A. Smith. Fuzzy Extractors: How to Generate Strong
Keys from Biometrics and Other Noisy Data. Adv. in Cryptology — Eurocrypt
2004, LNCS vol. 3027, Springer-Verlag, pp. 523–540, 2004.
10. N. Frykholm and A. Juels. Error-Tolerant Password Recovery. ACM CCS 2001.
11. R. Gennaro and Y. Lindell. A Framework for Password-Based Authenticated Key
Exchange. Adv. in Cryptology — Eurocrypt 2003, LNCS vol. 2656, Springer-Verlag,
pp. 524–543, 2003.
12. O. Goldreich and Y. Lindell. Session-Key Generation Using Human Passwords
Only. Adv. in Cryptology — Crypto 2001, LNCS vol. 2139, Springer-Verlag, pp.
408–432, 2001.
13. A. Juels. Fuzzy Commitment. Slides from a presentation at DIMACS, 2004. Available
at http://dimacs.rutgers.edu/Workshops/Practice/slides/juels.ppt
14. A. Juels and M. Sudan. A Fuzzy Vault Scheme. IEEE Intl. Symp. on Info. Theory,
2002.
15. A. Juels and M. Wattenberg. A Fuzzy Commitment Scheme. ACM CCS 1999,
ACM Press, 1999.
16. J. Katz, P. MacKenzie, G. Taban, and V. Gligor. Two-Server Password-Only Authenticated
Key Exchange. Manuscript, Jan. 2005.
17. J. Katz, R. Ostrovsky, and M. Yung. Efficient Password-Authenticated Key Exchange
Using Human-Memorable Passwords. Adv. in Cryptology — Eurocrypt
2001, LNCS vol. 2045, Springer-Verlag, pp. 475–494, 2001.
18. J. Katz, R. Ostrovsky, and M. Yung. Forward Secrecy in Password-Only Key-
Exchange Protocols. Security in Communication Networks: SCN 2002, LNCS vol.
2576, Springer-Verlag, pp. 29–44, 2002.
19. F. Monrose, M. Reiter, and S. Wetzel. Password Hardening Based on Keystroke
Dynamics. ACM CCS 1999, ACM Press, 1999.
20. N. Nisan and A. Ta-Shma. Extracting Randomness: A Survey and New Constructions.
J. Computer and System Sciences 58(1): 148–173, 1999.
21. P. Tuyls and J. Goseling. Capacity and Examples of Template-Protecting Biometric
Authentication Systems. Biometric Authentication Workshop, 2004.
22. E. Verbitskiy, P. Tuyls, D. Denteneer, and J.-P. Linnartz. Reliable Biometric Authentication
with Privacy Protection. 24th Benelux Symp. on Info. Theory, 2003.
