MMH-Badger MAC

Badger is a message authentication code (MAC) based on the idea of universal hashing and was developed by Boesgaard, Scavenius, Pedersen, Christensen, and Zenner.^[1] It is constructed by strengthening the ∆-universal hash family MMH using an ϵ-almost strongly universal (ASU) hash function family after the application of ENH (see below), where the value of ϵ is $1 / (2^{32} - 5)$.^[2] Because Badger is a MAC based on the universal hash function approach, the conditions needed for its security are the same as those for other universal-hashing MACs such as UMAC.

Introduction

The Badger MAC processes a message of length up to $2^{64} - 1$ bits and returns an authentication tag of length $u \cdot 32$ bits, where $1 \leq u \leq 5$. Depending on their security needs, users can choose the value of $u$, which is the number of parallel hash trees in Badger. Larger values of $u$ can be chosen, but they do not further improve the security of the MAC. The algorithm uses a 128-bit key, and the message length that can be processed under this key is limited to $2^{64}$.^[3]

The key setup has to be run only once per key in order to run the Badger algorithm under a given key, since the resulting internal state of the MAC can be saved to be used with any other message that will be processed later.

ENH

Hash families can be combined in order to obtain new hash families. For the ϵ-AU, ϵ-A∆U, and ϵ-ASU families, the latter are contained in the former: an A∆U family is also an AU family, an ASU family is also an A∆U family, and so forth. Conversely, a stronger family can be reduced to a weaker one if doing so yields a performance gain. A method for reducing a ∆-universal hash function family to a universal one is described below.

Theorem 2^[1]

Let $H^{△}$ be an ϵ-AΔU hash family from a set A to a set B. Consider a message $(m, m_{b}) \in A \times B$ . Then the family H consisting of the functions $h (m, m_{b}) = H^{△} (m) + m_{b}$ is ϵ-AU.

If $m \neq m'$, then $h (m, m_{b}) = h (m', m'_{b})$ holds exactly when $H^{△} (m) - H^{△} (m') = m'_{b} - m_{b}$, which happens with probability at most ϵ, since $H^{△}$ is an ϵ-A∆U family. If $m = m'$ but $m_{b} \neq m'_{b}$, then the probability is trivially 0. The proof of Theorem 2 is given in ^[1].

The ENH-family is constructed based on the universal hash family NH (which is also used in UMAC):

\mathrm{NH}_{K} (M) = \sum_{i = 1}^{ℓ / 2} (k_{2 i - 1} +_{w} m_{2 i - 1}) \times (k_{2 i} +_{w} m_{2 i}) \bmod 2^{2 w}

where ' $+_{w}$ ' means 'addition modulo $2^{w}$ ', and $m_{i}, k_{i} \in \{0, \dots, 2^{w} - 1\}$. It is a $2^{- w}$-A∆U hash family.
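The NH formula above can be made concrete with a short sketch. It is only an illustration under stated assumptions: the function name `nh`, the list-of-words representation of key and message, and the argument checks are choices made here, not part of any reference implementation.

```python
def nh(key, msg, w=32):
    """Illustrative NH: sum of (k_{2i-1} +_w m_{2i-1}) * (k_{2i} +_w m_{2i}) mod 2^(2w).

    key and msg are lists of w-bit integers (hypothetical representation).
    """
    if len(msg) % 2 != 0:
        raise ValueError("NH needs an even number of message words")
    if len(key) < len(msg):
        raise ValueError("NH needs at least one key word per message word")
    word_mask = (1 << w) - 1          # reduction modulo 2^w
    out_mask = (1 << (2 * w)) - 1     # reduction modulo 2^(2w)
    total = 0
    for i in range(0, len(msg), 2):
        a = (key[i] + msg[i]) & word_mask
        b = (key[i + 1] + msg[i + 1]) & word_mask
        total = (total + a * b) & out_mask
    return total
```

For example, `nh([1, 2, 3, 4], [1, 1, 1, 1])` evaluates (1+1)(2+1) + (3+1)(4+1) = 26.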

Lemma 1^[1]

The following version of NH is $2^{- w}$ -A∆U:

\mathrm{NH}_{K} (M) = (k_{1} +_{w} m_{1}) \times (k_{2} +_{w} m_{2}) \bmod 2^{2 w}

Choosing w=32 and applying Theorem 2, one can obtain the $2^{- 32}$-AU function family ENH, which will be the basic building block of the Badger MAC:

\mathrm{ENH}_{k_{1}, k_{2}} (m_{1}, m_{2}, m_{3}, m_{4}) = (m_{1} +_{32} k_{1}) (m_{2} +_{32} k_{2}) +_{64} m_{3} +_{64} 2^{32} m_{4}

where all arguments are 32 bits long and the output has 64 bits.
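A minimal sketch of the ENH formula may help make the modular arithmetic concrete. The function name `enh` and the use of Python integers for 32-bit words are assumptions of this example.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def enh(k1, k2, m1, m2, m3, m4):
    """Illustrative ENH: (m1 +32 k1)(m2 +32 k2) +64 m3 +64 2^32 * m4.

    All six arguments are taken to be 32-bit integers; the result has 64 bits.
    """
    a = (m1 + k1) & MASK32            # addition modulo 2^32
    b = (m2 + k2) & MASK32
    return (a * b + m3 + (m4 << 32)) & MASK64   # additions modulo 2^64
```

The pair (m3, m4) plays the role of the extra block $m_{b}$ in Theorem 2: for fixed keys and fixed (m1, m2), two inputs that differ only in (m3, m4) can never collide, because distinct 64-bit values are added to the same product.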

Construction

Badger is constructed using a strongly universal hash family and can be described as^[1]

ℋ = H^{*} \times F,

where an $ϵ_{H^{*}}$-AU function family H* is used to hash messages of any size onto a fixed size and an $ϵ_{F}$-ASU function family F is used to guarantee the strong universality of the overall construction. NH and ENH are used to construct H*. The maximum input size of the function family H* is $2^{64} - 1$ bits and the output size is 128 bits, split into 64 bits each for the message and the hash. The collision probability for the H*-function ranges from $2^{- 32}$ to $2^{- 26.14}$. To construct the strongly universal function family F, the ∆-universal hash family MMH* is transformed into a strongly universal hash family by adding another key.

Two steps on Badger

Two steps have to be executed for every message: a processing phase and a finalization phase.^[3]

Processing phase

In this phase, the data is hashed to a 64-bit string. A core function $h$ : $\{0, 1\}^{64} \times \{0, 1\}^{128} \to \{0, 1\}^{64}$ is used in this processing phase. It hashes a 128-bit string $m_{2} ∥ m_{1}$ to a 64-bit string $h (k, m_{2}, m_{1})$ as follows:

h (k, m_{2}, m_{1}) = (L (m_{1}) +_{32} L (k)) \cdot (U (m_{1}) +_{32} U (k)) +_{64} m_{2}

Here, for any n, $+_{n}$ means addition modulo $2^{n}$. Given a $2n$-bit string x, $L (x)$ denotes its least significant n bits and $U (x)$ its most significant n bits.
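The core function can be sketched directly from the formula, reading L and U as the lower and upper 32-bit halves of a 64-bit value. The names `lower32`, `upper32`, and the integer representation of the 64-bit strings are assumptions of this example.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def lower32(x):
    """L(x): least significant 32 bits of a 64-bit value."""
    return x & MASK32


def upper32(x):
    """U(x): most significant 32 bits of a 64-bit value."""
    return (x >> 32) & MASK32


def h(k, m2, m1):
    """Illustrative core function: (L(m1) +32 L(k)) * (U(m1) +32 U(k)) +64 m2."""
    a = (lower32(m1) + lower32(k)) & MASK32
    b = (upper32(m1) + upper32(k)) & MASK32
    return (a * b + m2) & MASK64
```

Note that only $m_{1}$ is mixed with the key; $m_{2}$ is added afterwards, which mirrors the ENH construction.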

A message can be processed by using this function. Denote the key used at level $j$ of hash tree $i$ by $k_{j}^{i}$.

Pseudo-code of the processing phase is as follows.

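L = |M|
if L = 0:
    $M^{1} = \dots = M^{u} = 0$
    go to finalization
r = L mod 64
if r ≠ 0:
    $M = 0^{64 - r} ∥ M$
for i = 1 to u:
    $M^{i} = M$
    $v' = \max \{1, ⌈ \log_{2} L ⌉ - 6\}$
    for j = 1 to v′:
        divide $M^{i}$ into 64-bit blocks, $M^{i} = m_{t}^{i} ∥ \dots ∥ m_{1}^{i}$
        if t is even:
            $M^{i} = h (k_{j}^{i}, m_{t}^{i}, m_{t - 1}^{i}) ∥ \dots ∥ h (k_{j}^{i}, m_{2}^{i}, m_{1}^{i})$
        else:
            $M^{i} = m_{t}^{i} ∥ h (k_{j}^{i}, m_{t - 1}^{i}, m_{t - 2}^{i}) ∥ \dots ∥ h (k_{j}^{i}, m_{2}^{i}, m_{1}^{i})$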
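The processing phase for a single hash tree can be sketched as follows. This is a hedged illustration, not a reference implementation: it assumes the usual pairwise tree reduction, in which each level replaces adjacent 64-bit blocks $(m_{2s}, m_{2s - 1})$ by $h (k, m_{2s}, m_{2s - 1})$ and carries an odd leftover block up unchanged. The function name `process_tree`, the integer encoding of the message, and the `level_keys` list are hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def h(k, m2, m1):
    """Core function: (L(m1) +32 L(k)) * (U(m1) +32 U(k)) +64 m2."""
    a = ((m1 & MASK32) + (k & MASK32)) & MASK32
    b = ((m1 >> 32) + (k >> 32)) & MASK32
    return (a * b + m2) & MASK64


def process_tree(message, bit_length, level_keys):
    """Reduce one hash tree to a 64-bit value (assumed pairwise reduction).

    message is the bit string M read as an integer, bit_length is L = |M|,
    and level_keys holds the 64-bit keys k_1, ..., k_v' of this tree.
    """
    if bit_length == 0:
        return 0
    if message >> bit_length:
        raise ValueError("message does not fit in bit_length bits")
    # Left-padding with zeros leaves the integer value unchanged; only the
    # number of 64-bit blocks t matters.
    t = (bit_length + 63) // 64
    blocks = [(message >> (64 * s)) & MASK64 for s in range(t)]  # blocks[0] = m_1
    # (L - 1).bit_length() equals ceil(log2 L) for L >= 1, without floats.
    levels = max(1, (bit_length - 1).bit_length() - 6)
    if len(level_keys) < levels:
        raise ValueError("not enough level keys for this message length")
    for j in range(levels):
        k = level_keys[j]
        nxt = [h(k, blocks[s + 1], blocks[s]) for s in range(0, len(blocks) - 1, 2)]
        if len(blocks) % 2 == 1:
            nxt.append(blocks[-1])  # odd leftover block m_t moves up unchanged
        blocks = nxt
    return blocks[0]
```

Running the same function with u different key lists gives the u parallel tree outputs $M^{1}, \dots, M^{u}$ that the finalization phase consumes.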

Finalize phase

In this phase, the 64-bit string resulting from the processing phase is transformed into the desired MAC tag. This finalization phase uses the Rabbit stream cipher with both its key setup and its IV setup, and denotes the finalization key by $K_{j}^{i}$.

Pseudo-code of the finalization phase

RabbitKeySetup(K)
RabbitIVSetup(N)
for i = 1 to u:
     $Q^{i} = 0^{7} ∥ L ∥ M^{i}$ 
    divide $Q^{i}$ into 27-bit blocks, $Q^{i} = q_{5}^{i} ∥ \dots ∥ q_{1}^{i}$
    $S^{i} = (\sum_{j = 1}^{5} (q_{j}^{i} K_{j}^{i})) + K_{6}^{i} \bmod p$
 $S = S^{u} ∥ \dots ∥ S^{1}$ 
S = S ⨁ RabbitNextbit(u∙32)
return S
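The arithmetic of the finalization phase can be sketched as below. Several things here are assumptions and should be read as such: the prime is taken to be $p = 2^{32} - 5$ (suggested by the value of ϵ stated in the introduction), the reduction modulo p is applied to the whole sum including $K_{6}^{i}$, and the Rabbit keystream is not implemented but passed in as the integer `pad`, standing in for RabbitNextbit(u∙32). The function and parameter names are hypothetical.

```python
P = 2**32 - 5              # assumed prime, inferred from epsilon = 1/(2^32 - 5)
MASK27 = (1 << 27) - 1


def finalize(length_bits, tree_outputs, final_keys, pad):
    """Illustrative finalization for u trees.

    length_bits  -- L, the 64-bit message length
    tree_outputs -- [M^1, ..., M^u], one 64-bit value per hash tree
    final_keys   -- [[K_1, ..., K_6], ...], six key words per tree
    pad          -- u*32 keystream bits as an integer (placeholder for Rabbit)
    """
    if len(tree_outputs) != len(final_keys):
        raise ValueError("one key list per tree is required")
    tag = 0
    for i, (m_i, keys) in enumerate(zip(tree_outputs, final_keys)):
        if len(keys) != 6:
            raise ValueError("each tree needs six finalization key words")
        q = (length_bits << 64) | m_i                          # 0^7 || L || M^i, 135 bits
        blocks = [(q >> (27 * j)) & MASK27 for j in range(5)]  # q_1, ..., q_5
        s_i = (sum(b * k for b, k in zip(blocks, keys[:5])) + keys[5]) % P
        tag |= s_i << (32 * i)                                 # S = S^u || ... || S^1
    return tag ^ pad
```

Because each $S^{i}$ is reduced modulo a prime below $2^{32}$, it fits in 32 bits, which matches the tag length of $u \cdot 32$ bits.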

Notation

In the pseudocode above, K denotes the 128-bit key passed to RabbitKeySetup(K), which initializes Rabbit. M denotes the message to be hashed and |M| denotes the length of the message in bits. $q_{j}^{i}$ denotes the $j$-th block of $Q^{i}$ after it is divided into blocks. For a given $2n$-bit string x, $L (x)$ and $U (x)$ denote its least significant n bits and its most significant n bits, respectively.

Performance

Boesgaard, Christensen and Zenner report the performance of Badger measured on a 1.0 GHz Pentium III and on a 1.7 GHz Pentium 4 processor.^[1] The speed-optimized versions were programmed in assembly language inlined in C and compiled using the Intel C++ 7.1 compiler.

The following table presents Badger's properties for various restricted message lengths. "Memory req." denotes the amount of memory required to store the internal state, including key material and the inner state of the Rabbit stream cipher. "Setup" denotes the key setup, and "Fin." denotes finalization with IV setup.

Max. message size	Forgery bound	Memory req.	Setup (Pentium III)	Fin. (Pentium III)	Setup (Pentium 4)	Fin. (Pentium 4)
$2^{11}$ bytes (e.g. IPsec)	$2^{- 57.7}$	400 bytes	1133 cycles	409 cycles	1774 cycles	776 cycles
$2^{15}$ bytes (e.g. TLS)	$2^{- 56.6}$	528 bytes	1370 cycles	421 cycles	2100 cycles	778 cycles
$2^{32}$ bytes	$2^{- 54.2}$	1072 bytes	2376 cycles	421 cycles	3488 cycles	778 cycles
$2^{61} - 1$ bytes	$2^{- 52.2}$	2000 bytes	4093 cycles	433 cycles	5854 cycles	800 cycles

MMH (Multilinear Modular Hashing)

The name MMH stands for Multilinear Modular Hashing. One application in multimedia is, for example, verifying the integrity of an online multimedia title. The performance of MMH relies on the improved support for integer scalar products in modern microprocessors.

MMH uses single precision scalar products as its most basic operation. It consists of a (modified) inner product between the message and a key modulo a prime $p$ . The construction of MMH works in the finite field $F_{p}$ for some prime integer $p$ .

MMH*

MMH* involves a construction of a family of hash functions consisting of multilinear functions on $F_{p}^{k}$ for some positive integer $k$. The family MMH* of functions from $F_{p}^{k}$ to $F_{p}$ is defined as follows.

\mathrm{MMH}^{*} = \{ g_{x} : F_{p}^{k} \to F_{p} \mid x \in F_{p}^{k} \}

where x, m are vectors, and the functions $g_{x}$ are defined as follows.

g_{x} (m) = m \cdot x \bmod p = \sum_{i = 1}^{k} m_{i} x_{i} \bmod p

In the case of a MAC, $m$ is a message and $x$ is a key, where $m = (m_{1}, \dots, m_{k})$ and $x = (x_{1}, \dots, x_{k})$ with $x_{i}, m_{i} \in F_{p}$.
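As a small illustration of the definition, the inner product modulo p can be written in a few lines. The function name `g` and the list representation of the vectors are assumptions of this sketch.

```python
def g(x, m, p):
    """Illustrative MMH* function: g_x(m) = sum(m_i * x_i) mod p."""
    if len(x) != len(m):
        raise ValueError("key and message must have the same length k")
    return sum(mi * xi for mi, xi in zip(m, x)) % p


def difference(m, m_prime, p):
    """Component-wise difference m - m' in F_p^k."""
    return [(a - b) % p for a, b in zip(m, m_prime)]
```

Because $g_{x}$ is linear in the message, $g_{x} (m) - g_{x} (m') \equiv g_{x} (m - m') \bmod p$ for every key, and this is the property the ∆-universality argument relies on.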

MMH* should satisfy the security requirements of a MAC, enabling, say, Ana and Bob to communicate in an authenticated way. They share a secret key $x$. Suppose Charles listens to the conversation between Ana and Bob and wants to replace the message with his own message to Bob, which should pass as a message from Ana. His message $m'$ and Ana's message $m$ will then differ in at least one component (e.g. $m_{1} \neq m'_{1}$).

Assume that Charles knows that the function is of the form $g_{x} (m)$ and knows Ana's message $m$, but does not know the key $x$. The probability that Charles can change the message or send his own message is then bounded by the following theorem.

Theorem 1^[4]: The family MMH* is ∆-universal.

Proof:

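Take $a \in F_{p}$, and let $m, m'$ be two different messages. Assume without loss of generality that $m_{1} \neq m'_{1}$. Then for any choice of $x_{2}, x_{3}, \dots, x_{k}$, the following holds, where the last step uses that $m_{1} - m'_{1}$ is nonzero and hence invertible in $F_{p}$, so that exactly one value of $x_{1}$ satisfies the congruence: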

\begin{aligned}
\Pr_{x_{1}} [g_{x} (m) - g_{x} (m') \equiv a \bmod p] &= \Pr_{x_{1}} [(m_{1} x_{1} + m_{2} x_{2} + \dots + m_{k} x_{k}) - (m'_{1} x_{1} + m'_{2} x_{2} + \dots + m'_{k} x_{k}) \equiv a \bmod p] \\
&= \Pr_{x_{1}} [(m_{1} - m'_{1}) x_{1} + (m_{2} - m'_{2}) x_{2} + \dots + (m_{k} - m'_{k}) x_{k} \equiv a \bmod p] \\
&= \Pr_{x_{1}} [(m_{1} - m'_{1}) x_{1} + \sum_{i = 2}^{k} (m_{i} - m'_{i}) x_{i} \equiv a \bmod p] \\
&= \Pr_{x_{1}} [(m_{1} - m'_{1}) x_{1} \equiv a - \sum_{i = 2}^{k} (m_{i} - m'_{i}) x_{i} \bmod p] \\
&= \frac{1}{p}
\end{aligned}

To explain the theorem above, take the field $F_{p}$ for a prime $p$, represented as $F_{p} = \underbrace{\{0, 1, \dots, p - 1\}}_{p}$. If one takes an element of $F_{p}$, say $0 \in F_{p}$, then the probability that $x_{1} = 0$ is

\Pr_{x_{1} \in F_{p}} (x_{1} = 0) = \frac{1}{p}

So, what one actually needs to compute is

\Pr_{(x_{1}, \dots, x_{k}) \in F_{p}^{k}} (g_{x} (m) \equiv g_{x} (m^{'}) mod p)

But,

\begin{aligned}
\Pr_{(x_{1}, \dots, x_{k}) \in F_{p}^{k}} (g_{x} (m) \equiv g_{x} (m') \bmod p) &= \sum_{(x_{2}, \dots, x_{k}) \in F_{p}^{k - 1}} \Pr_{(x'_{2}, \dots, x'_{k}) \in F_{p}^{k - 1}} (x_{2} = x'_{2}, \dots, x_{k} = x'_{k}) \cdot \Pr_{x_{1} \in F_{p}} (g_{x} (m) \equiv g_{x} (m') \bmod p) \\
&= \sum_{(x_{2}, \dots, x_{k}) \in F_{p}^{k - 1}} \frac{1}{p^{k - 1}} \cdot \frac{1}{p} \\
&= p^{k - 1} \cdot \frac{1}{p^{k - 1}} \cdot \frac{1}{p} \\
&= \frac{1}{p}
\end{aligned}
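The ∆-universality claim can be checked exhaustively for a toy field. The sketch below enumerates every key in $F_{p}^{k}$ for a small prime and counts how often each difference value occurs. The function name `delta_counts` and the toy parameters are assumptions chosen only to keep the enumeration small.

```python
from itertools import product


def delta_counts(m, m_prime, p):
    """For each a in F_p, count keys x in F_p^k with g_x(m) - g_x(m') = a (mod p)."""
    k = len(m)
    counts = [0] * p
    for x in product(range(p), repeat=k):      # every key in F_p^k
        diff = sum((mi - mj) * xi for mi, mj, xi in zip(m, m_prime, x)) % p
        counts[diff] += 1
    return counts
```

For two distinct messages, every value a is hit by exactly $p^{k - 1}$ of the $p^{k}$ keys, which is the probability $1 / p$ derived above. With p = 5 and k = 3, `delta_counts((1, 2, 3), (4, 2, 0), 5)` returns `[25, 25, 25, 25, 25]`.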

From the proof above, $\frac{1}{p}$ is the collision probability of the attacker in one round, so on average $p$ verification queries will suffice to get one message accepted. To reduce the collision probability, it is necessary to choose a large $p$ or to concatenate $n$ such MACs using $n$ independent keys, so that the collision probability becomes $\frac{1}{p^{n}}$. In this case the number of keys is increased by a factor of $n$, and the output length is also increased by a factor of $n$.

MMH*32

Halevi and Krawczyk^[4] construct a variant called $\mathrm{MMH}_{32}^{*}$. The construction works with 32-bit integers and with the prime $p = 2^{32} + 15$. In fact, $p$ can be any prime satisfying $2^{32} < p < 2^{32} + 2^{16}$. This idea is adopted from the suggestion by Carter and Wegman to use the primes $2^{16} + 1$ or $2^{31} - 1$.

$\mathrm{MMH}_{32}^{*}$ is defined as follows:

\mathrm{MMH}_{32}^{*} = \{ g_{x} : (\{0, 1\}^{32})^{k} \to F_{p} \},

where $\{0, 1\}^{32}$ means $\{0, 1, \dots, 2^{32} - 1\}$ (i.e., binary representation).

The functions $g_{x}$ are defined as follows.

\begin{aligned}
g_{x} (m) &\overset{\mathrm{def}}{=} m \cdot x \bmod (2^{32} + 15) \\
&= \sum_{i = 1}^{k} m_{i} \cdot x_{i} \bmod (2^{32} + 15)
\end{aligned}

where

x = (x_{1}, \dots, x_{k}), m = (m_{1}, \dots, m_{k})

By Theorem 1, the collision probability is about $ϵ = 2^{- 32}$, so the family $\mathrm{MMH}_{32}^{*}$ is ϵ-almost ∆-universal with $ϵ = 2^{- 32}$.
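The 32-bit variant differs from the generic family only in the fixed prime and the restriction of inputs to 32-bit words. A minimal sketch follows; the function name `mmh32_star` and the range check are assumptions of this example, and it makes no attempt at the word-level optimizations a fast implementation would use.

```python
P32 = 2**32 + 15   # the prime used by the 32-bit variant


def mmh32_star(x, m):
    """Illustrative 32-bit MMH*: sum(m_i * x_i) mod (2^32 + 15)."""
    if len(x) != len(m):
        raise ValueError("key and message must have the same length k")
    if any(not 0 <= v < 2**32 for v in list(x) + list(m)):
        raise ValueError("all words must be 32-bit integers")
    return sum(mi * xi for mi, xi in zip(m, x)) % P32
```

Since $2^{32} \equiv -15 \bmod p$, the largest word $2^{32} - 1$ is congruent to $-16$, so `mmh32_star([2**32 - 1], [2**32 - 1])` evaluates to 256.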

The value of k

The value of k that describes the length of the message and key vectors has several effects:

The costly modular reduction is performed once per $k$ multiply-and-add operations, so $k$ determines how its cost is spread over the message words and therefore affects the speed.
Since the key x consists of k 32-bit integers, increasing k results in a longer key.
The probability of breaking the system is $1 / p$ with $p \approx 2^{32}$, which by Theorem 1 holds for any $k$, so increasing $k$ does not by itself make the system harder to break.

Performance

Below are the timing results for various implementations of MMH^[4] in 1997, designed by Halevi and Krawczyk.

A 150 MHz PowerPC 604 RISC machine running AIX

150 MHz PowerPC 604	Message in Memory	Message in Cache
64-bit output	390 Mbit/second	417 Mbit/second
32-bit output	597 Mbit/second	820 Mbit/second

A 150 MHz Pentium-Pro machine running Windows NT

150 MHz Pentium-Pro	Message in Memory	Message in Cache
64-bit output	296 Mbit/second	356 Mbit/second
32-bit output	556 Mbit/second	813 Mbit/second

A 200 MHz Pentium-Pro machine running Linux

200 MHz Pentium-Pro	Message in Memory	Message in Cache
64-bit output	380 Mbit/second	500 Mbit/second
32-bit output	645 Mbit/second	1080 Mbit/second

References

[1] [BS05] (web citation)
[2] [SV05] (web citation)
[3] [B] (web citation)
[4] [HK97] (book citation)
