\documentclass[10pt]{article}
\usepackage{amsfonts,amsthm,amsmath,amssymb}
\usepackage{array}
\usepackage{epsfig}
\usepackage{fullpage}
\usepackage[table]{xcolor}
\usepackage{adjustbox}
\usepackage{dashbox}
\usepackage{cjhebrew}
\usepackage{xspace}
\usepackage{url}
\usepackage{hyperref}
% packages for the graphs:
\usepackage{tikz}
\usepackage{tikz-qtree}
\begin{document}
\input{preamble.tex}
\lecture{5: {\large Pseudorandom Functions}}{November 23, 2017}{Nir Bitansky}{Yoav Kaempfer, Lior Solodkin}
%%%% body goes in here %%%%
\section{Previously on Foundations of Crypto}
We've seen that any one-way function (OWF) has a (randomized) hardcore bit (HCB), and saw how to use such a HCB to construct a pseudorandom generator (PRG) out of one-way permutations. As already mentioned last lecture, the more general theorem that OWFs imply PRGs is also true, but requires additional techniques that are not within the scope of this course.
So far we've seen that starting from OWFs, we can obtain PRGs and encryption for messages of arbitrary polynomial length. We now aim to achieve additional cryptographic goals based on these tools, and see how far they can take us.
\begin{figure}[h]
\centering
\begin{tikzpicture}
[grow=right, edge from parent/.style={thick,draw=black!70,-latex},
every node/.style={shape=rectangle, draw, align=center},]
\tikzset{level distance=60pt,sibling distance=18pt} \tikzset{execute at begin node=\strut}
\Tree [.OWF [.PRG ? Enc ? ]]
\end{tikzpicture}
\end{figure}
\section{Pseudorandom Functions}
Our first step will be to construct a much stronger form of PRG, which we will call a {\em pseudorandom function} (PRF). Let us start by motivating this object, within the familiar context of encryption.
\paragraph{A Motivating Question: Encrypting Multiple Messages.} We know that assuming OWFs, we can encrypt messages of arbitrary polynomial length $\ell(n)=n^{O(1)}$ (using a key of fixed size $n$). Our security definitions, however, only dealt with attackers that see a single (perhaps long) ciphertext.
\begin{center}
{\em Can we securely encrypt multiple messages?}
\end{center}
%
There are several ways of defining what security for multiple messages actually means. For now, we will consider the following natural definition known as security against known plaintext attacks (KPA).\footnote{In a nutshell, in a known plaintext attack, the encrypted messages are known ahead of time. Later, we shall define the notion of chosen plaintext attacks, where the attacker may adaptively request to see different encrypted messages, depending on all encryptions it has seen so far.} Also, for simplicity we will restrict attention to single-bit messages. This is w.l.o.g.\ in the sense that to encrypt longer messages, we can encrypt their bits separately (although this may cost us in efficiency).
\begin{definition}[Multi-Message Encryption (against Known Plaintext Attacks)]
A bit encryption scheme $(E,D)$ is multi-message secure if for any polynomial $\ell(\cdot)$,
$$
\set{E_{sk}(x_1),\dots, E_{sk}(x_\ell)}_{\begin{subarray}{l}n\in\Nat\\x \in\zo^{\ell(n)}\\y \in\zo^{\ell(n)}\end{subarray}}
\approx_c
\set{E_{sk}(y_1),\dots, E_{sk}(y_\ell)}_{\begin{subarray}{l}n\in\Nat\\x \in\zo^{\ell(n)}\\y \in\zo^{\ell(n)}\end{subarray}}\enspace,
$$
where in the above $sk \gets \zo^n$.
\end{definition}
A priori, we may hope that any encryption scheme that is secure for a single message would also be secure if we reuse it for two messages. However, we've already seen that this is not true in general (for instance, for the one-time pad). Still, we may hope to construct a designated scheme that will be secure for multiple messages. In fact, in some sense, we already have.
\paragraph{Stateful Encryption.} Our construction of encryption for long messages from one-extra-bit encryption actually already supports many messages to some extent. Specifically, it had a chain structure where we could always encrypt (and decrypt) an additional bit $m_{i+1}$ provided that the encryptor (and decryptor) has the current secret key $sk_i$. Such {\em stateful} encryption schemes are commonly known as {\em stream ciphers} or {\em state ciphers}. The obvious caveat of such encryption schemes is that they require the parties to remain synchronized. In particular, if some messages are dropped, then we can no longer guarantee consistent decryption. There are solutions that can {\em resynchronize} after a few failures, but in general this is a problem.
\paragraph{Publicly Synchronizing.} One na\"ive way to solve the synchronization problem is for the encryptor to send his state online. This of course cannot work in general --- the state may contain secret randomness (e.g., the next secret key) that will compromise security. However, using PRGs it seems that we can actually construct an encryption scheme that only requires keeping a public state, and in fact, a pretty short one.
The construction is simple: The encryptor would keep a counter of the number of messages it encrypted so far. When it wants to encrypt the $i$th bit, it will first expand its $n$-bit random secret key $s$ using a PRG $G$ and use the $i$th output bit $G(s)_i$ as a one-time pad. Specifically, it will send $i, G(s)_i \oplus m_{i}$ as the encryption. (Recall that we can indeed extend the output of $G$ as much as we want to any polynomial number of bits.)
In this scheme, there is no issue of synchronization, but the scheme still has some undesired properties. First, the encryptor has to maintain a state (the counter $i$). Furthermore, the complexity of encryption/decryption potentially grows with the number of encrypted messages --- this is certainly the case if we use the chain-style PRG extension we've previously seen.
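To make the counter-based scheme concrete, here is a minimal Python sketch. The PRG is modeled heuristically with SHA-256 purely for illustration (it is not a proven PRG), and all names here are ours, not part of the lecture:

```python
import hashlib

def prg_bit(seed: bytes, i: int) -> int:
    """Heuristic stand-in for the i-th output bit G(s)_i of a PRG
    (modeled with SHA-256; illustrative only, not a proven PRG)."""
    digest = hashlib.sha256(seed + i.to_bytes(8, "big")).digest()
    return digest[0] & 1

class StatefulEncryptor:
    """Counter-based stream encryption: the i-th bit m_i is sent as
    (i, G(s)_i xor m_i); the counter i is the only state kept."""

    def __init__(self, seed: bytes):
        self.seed = seed
        self.i = 0  # public counter -- safe to reveal, unlike the seed

    def encrypt(self, m: int):
        c = (self.i, prg_bit(self.seed, self.i) ^ m)
        self.i += 1
        return c

def decrypt(seed: bytes, c):
    """Stateless decryption: the counter travels with the ciphertext."""
    i, v = c
    return v ^ prg_bit(seed, i)
```

Since the counter is sent in the clear, the decryptor needs no state of its own and dropped messages do not break synchronization.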
\paragraph{The Solution: PRFs.} Both of the above problems can be solved using PRFs, which intuitively can be viewed as a PRG with an {\em exponential} number of output bits where the $i$th output bit $G(s)_i$ can be computed in fixed polynomial time (independent of $i$). There are several conceivable ways to formalize this. We will consider a relatively strong definition which requires that the function $i \longmapsto G(s)_i$ is indistinguishable from a truly random function.
\begin{definition}[Pseudorandom Function (PRF)]
A pseudorandom function is a polynomial-time function $F$ that takes as input a seed $s\in\zo^n$ and an index $i\in\zo^n \cong [2^n]$ and outputs a bit, such that for any n.u. PPT $A=\set{A_n}_{n\in\Nat}$ there exists a negligible $\mu$ such that for all $n\in\Nat$:
$$
\abs{\Pr\[A_n^{F_{s}(\cdot)}=1\pST s\gets \zo^n\]-\Pr\[A_n^{R(\cdot)}=1\pST R \gets \zo^{\zo^n}\]}\leq \mu(n)\enspace.
$$
\end{definition}
In this definition, $A_n$ is an oracle-aided algorithm (it does not view the function at its entirety, but can make a polynomial number of arbitrary queries). The notation $\zo^{\zo^n}$ stands for the set of all functions mapping $\zo^n$ to $\zo$. In other words, $R$ is simply a random function, that for each input string returns a random independent bit.\footnote{Note that PRFs indeed strengthen the notion of PRGs. In particular, any PRF $F$ can be used to construct a PRG $G$ (of arbitrary polynomial length), where the $i$th output bit of $G(s)$ is simply $F_s(i)$.}
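The footnote's PRF-to-PRG direction is immediate to implement; here is a minimal Python sketch, with a toy SHA-256-based stand-in for $F$ (our own heuristic choice, not a proven PRF):

```python
import hashlib

def F(s: bytes, i: int) -> int:
    """Toy one-bit PRF stand-in (SHA-256-based heuristic, not a proven PRF)."""
    return hashlib.sha256(s + i.to_bytes(8, "big")).digest()[0] & 1

def G(s: bytes, ell: int) -> list:
    """PRG of arbitrary polynomial length: the i-th output bit of G(s) is F_s(i)."""
    return [F(s, i) for i in range(ell)]
```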
\paragraph{Encrypting Multiple Messages Using PRFs.} Given PRFs, we can now follow the same rationale as before, but completely dispense with the state, and in particular, encrypt in fixed time (independently of the number of messages encrypted so far). The scheme is given by:
\begin{itemize}
\item $E_{sk}(m)$:
\begin{itemize}
\item Interpret $sk$ as a seed for a PRF.
\item Sample a random $r\gets \zo^n$ and output $r,F_{sk}(r)\oplus m$.
\end{itemize}
\item $D_{sk}(r,v)$:
\begin{itemize}
\item Output $v\oplus F_{sk}(r)$.
\end{itemize}
\end{itemize}
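A direct Python transcription of the scheme, with a heuristic one-bit PRF stand-in (SHA-256-based, our illustrative choice rather than a proven PRF):

```python
import hashlib
import os

N = 16  # security parameter in bytes (illustrative choice)

def F(sk: bytes, r: bytes) -> int:
    """One-bit PRF stand-in F_sk(r), modeled heuristically with SHA-256."""
    return hashlib.sha256(sk + r).digest()[0] & 1

def E(sk: bytes, m: int):
    """Encrypt a bit: sample fresh r and mask m with F_sk(r). No state kept."""
    r = os.urandom(N)
    return (r, F(sk, r) ^ m)

def D(sk: bytes, ct):
    """Decrypt: v xor F_sk(r) recovers m."""
    r, v = ct
    return v ^ F(sk, r)
```

Note that encryption is stateless and randomized: each call draws a fresh $r$, so two encryptions of the same bit differ with overwhelming probability.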
\def \cE {{\cal E}}
It's easy to verify the correctness of the scheme. Let's sketch the proof of security.
\begin{proof}[Multi-Message Security, Sketch]
Notice that we can describe our encryption algorithm $E_{sk}$ as an oracle-aided algorithm $\cE^{F_{sk}}$ that uses $F_{sk}$ as a black box. Now, we can consider an alternative (mental) experiment, where instead of using $F_{sk}$ we use a truly random oracle $R$.
\begin{claim}
For any polynomial $\ell$,
$$
\set{\cE^{F_{sk}}(x_1),\dots, \cE^{F_{sk}}(x_\ell)\pST sk\gets \zo^n}_{\begin{subarray}{l}n\in\Nat\\x \in\zo^{\ell(n)}\end{subarray}}
\approx_c
\set{\cE^{R}(x_1),\dots, \cE^{R}(x_\ell)\pST R\gets \zo^{\zo^n}}_{\begin{subarray}{l}n\in\Nat\\x \in\zo^{\ell(n)}\end{subarray}}\enspace.
$$
\end{claim}
This claim follows directly from the security of the PRF. Indeed, given a non-uniform PPT $A$ that distinguishes between the above two ensembles (for infinitely many $n$ and $x\in\zo^{\ell(n)}$), we can construct a distinguisher $B^{(\cdot)}$ for the pseudorandom function. The distinguisher $B^{(\cdot)}$ simply emulates $\cE^{(\cdot)}$ to encrypt the bits of $x$, and answers the oracle calls that $\cE^{(\cdot)}$ makes using its own oracle. $B$ distinguishes a PRF from a random function with exactly the same advantage that $A$ distinguishes encryptions relative to $F_{sk}$ from encryptions relative to $R$.
This means that to finish the proof, we just need to show that the scheme is secure when instantiated with a random function. This is actually true in a statistical (rather than computational) sense.
\begin{claim}
$$
\set{\cE^R(x_1),\dots, \cE^R(x_\ell)}_{\begin{subarray}{l}n\in\Nat\\x \in\zo^{\ell(n)}\\y \in\zo^{\ell(n)}\end{subarray}}
\approx_s
\set{\cE^R(y_1),\dots, \cE^R(y_\ell)}_{\begin{subarray}{l}n\in\Nat\\x \in\zo^{\ell(n)}\\y \in\zo^{\ell(n)}\end{subarray}}\enspace.
$$
\end{claim}
To see that this is true, notice that as long as the random values $r_1,\dots,r_\ell$ chosen by $\cE$ are distinct, then the produced encryptions $(r_1,R(r_1)\oplus x_1),\dots,(r_\ell,R(r_\ell)\oplus x_\ell)$ are uniformly random, independently of the underlying message $x$. Thus, it suffices to bound the probability that the random strings $r_i$ are not distinct. There are at most ${\ell \choose 2}$ such potential collisions, each occurring with probability $2^{-n}$. Overall, by a union bound, the probability that these strings are not distinct is at most $2^{-n}\ell^2$, which is negligible.
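As a numeric sanity check (our own illustration), the exact birthday probability indeed sits below the union bound $\binom{\ell}{2}2^{-n}\leq \ell^2 2^{-n}$:

```python
from math import comb

def union_bound(ell: int, n: int) -> float:
    """Union bound on Pr[some collision among ell uniform n-bit strings]."""
    return comb(ell, 2) * 2.0 ** (-n)

def exact_collision_prob(ell: int, n: int) -> float:
    """Exact Pr[not all distinct] = 1 - prod_{j<ell} (1 - j/2^n)."""
    p_distinct = 1.0
    for j in range(ell):
        p_distinct *= 1.0 - j / 2.0 ** n
    return 1.0 - p_distinct
```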
\end{proof}
\paragraph{Note on Randomized Encryption.} Note that the constructed encryption scheme is randomized. In fact, any stateless encryption scheme for more than a single message must be randomized (think why).
\section{The GGM Construction}
In '86 Goldreich, Goldwasser, and Micali gave a beautiful construction of a PRF from any PRG.
\begin{theorem}[\cite{GGM}]
If there exist PRGs, then there exist PRFs.
\end{theorem}
\begin{corollary}
Assuming OWFs, there exist PRFs.
\end{corollary}
\begin{proof}
The construction is based on a recurring theme in cryptographic constructions --- replacing a {\em chain} by a {\em tree}, which has the advantage that the number of nodes we can reach is exponential in the depth (instead of linear). Specifically, let $G:\zo^n\rightarrow \zo^{2n}$ be a length-doubling PRG. For a seed $s$, we'll denote by $G_0(s)$ and $G_1(s)$ the two $n$-bit parts of the output $G(s)$.
\begin{figure}[h]
\centering
\begin{tikzpicture}
[edge from parent/.style={thick,draw=black!70,-latex},]
\tikzset{level distance=40pt,sibling distance=-5pt} \tikzset{execute at begin node=\strut}
\tikzset{font={\fontsize{7.5pt}{9}\selectfont}}
\Tree [.s [.G\textsubscript{0}(s)
[.G\textsubscript{0}(G\textsubscript{0}(s))
[.G\textsubscript{0}(G\textsubscript{0}(G\textsubscript{0}(s))) ] [.G\textsubscript{1}(G\textsubscript{0}(G\textsubscript{0}(s))) ]]
[.G\textsubscript{1}(G\textsubscript{0}(s))
[.G\textsubscript{0}(G\textsubscript{1}(G\textsubscript{0}(s))) ] [.G\textsubscript{1}(G\textsubscript{1}(G\textsubscript{0}(s))) ]]
]
[.G\textsubscript{1}(s)
[.G\textsubscript{0}(G\textsubscript{1}(s))
[.G\textsubscript{0}(G\textsubscript{0}(G\textsubscript{1}(s))) ] [.G\textsubscript{1}(G\textsubscript{0}(G\textsubscript{1}(s))) ]]
[.G\textsubscript{1}(G\textsubscript{1}(s))
[.G\textsubscript{0}(G\textsubscript{1}(G\textsubscript{1}(s))) ] [.G\textsubscript{1}(G\textsubscript{1}(G\textsubscript{1}(s))) ]]
]]
]
\end{tikzpicture}
\caption{A GGM Tree for inputs of length $n=3$. To compute $F_s(x_1x_2x_3)$ follow the corresponding path and output the resulting leaf.}
\end{figure}
We define:
$$
F_s(x) = G_{x_n}(\cdots G_{x_2}(G_{x_1}(s))\cdots)\enspace.
$$
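The construction is easy to implement given any length-doubling PRG; below is a Python sketch where, purely for illustration, $G_0$ and $G_1$ are modeled heuristically with SHA-256 (not a proven PRG), and all names are ours:

```python
import hashlib

def G0(s: bytes) -> bytes:
    """Left half G_0(s) of a (heuristic) length-doubling PRG."""
    return hashlib.sha256(s + b"\x00").digest()

def G1(s: bytes) -> bytes:
    """Right half G_1(s)."""
    return hashlib.sha256(s + b"\x01").digest()

def ggm(s: bytes, x: str) -> bytes:
    """GGM: F_s(x) = G_{x_n}(... G_{x_1}(s) ...), walking the tree along x.
    x is a bit string such as "0110"; the leaf label is returned
    (a single output bit can be extracted from it if desired)."""
    label = s
    for bit in x:
        label = G1(label) if bit == "1" else G0(label)
    return label
```

Evaluating $F_s$ on any input takes exactly $n$ PRG invocations, independent of which or how many inputs were previously evaluated.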
%
We will now analyze the security of the construction. We will rely on the following claim.
\begin{claim}\label{clm:fold}
For any polynomial $q=q(n)$,
$$
\set{U_{2n}^{(1)},\dots,U_{2n}^{(q)}}_{n\in\Nat} \approx_c \set{G(U_{n}^{(1)}),\dots,G(U_{n}^{(q)})}_{n\in\Nat}\enspace.
$$
\end{claim}
%
The claim can be proved by a hybrid argument (you proved a more general claim in the homework).
\medskip\noindent
We turn to prove the security of our constructed function $F$. Let $A$ be an adversary that distinguishes the function $F_s$ (for a random $s$) from a random function $R$, with advantage $\varepsilon=\varepsilon(n)$, and assume it makes at most $q=q(n)$ queries to its oracle. We will construct a corresponding distinguisher $B$ for the two distribution ensembles in Claim \ref{clm:fold}.
We consider the following labeling of the nodes $\set{x_1\dots x_i\pST \begin{array}{c}0\leq i\leq n\\x_1\dots x_i \in \zo^i \end{array}}$ in the full binary tree. The root of the tree, corresponding to the empty string $\lambda$ (at level zero), is labeled by the random seed $L_\lambda := s$. An internal node $x_1\dots x_i$ (at level $i$) is recursively labeled by $L_{x_1\dots x_i} := G_{x_i}(L_{x_1\dots x_{i-1}})$. Note that each leaf $x_1\dots x_n$ will be labeled by $L_{x_1\dots x_n}=F_s(x_1\dots x_n)$.
We now consider $n+1$ hybrid experiments $H_0,\dots,H_n$ where we gradually change the labels as follows. In $H_i$, all the labels $L_{x_1\dots x_i}$ for nodes at level $i$ are sampled independently at random. Then, the rest of the labels down the tree are chosen according to the same recursive rule as before (the labels at levels $j< i$ are ignored). In each of these hybrid experiments, the adversary's queries are answered according to the leaf labels.
Note that $H_0$ corresponds exactly to the case that the labels are given by the PRF $L_{x_1\dots x_n} =F_s(x_1\dots x_n)$, whereas $H_n$ corresponds to the case that the labels are chosen by a truly random function $L_{x_1\dots x_n} =R(x_1\dots x_n)$. Then, there exists an $i\in [n]$ such that $A$ distinguishes $H_{i-1}$ from $H_i$ with advantage $\varepsilon/n$.
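The existence of such an $i$ is the standard hybrid (triangle-inequality) calculation: denoting $p_i := \Pr\[A_n^{H_i}=1\]$,
$$
\varepsilon = \abs{p_0-p_n} \leq \sum_{i=1}^{n}\abs{p_{i-1}-p_i}\enspace,
$$
so at least one of the $n$ summands must be at least $\varepsilon/n$.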
We note the following difference between the two hybrids:
\begin{itemize}
\item
In $H_{i-1}$, all the labels in layer $i$ are chosen pseudorandomly (by applying $G$ to their parent labels).
\item
In $H_{i}$, all the labels in layer $i$ are chosen completely at random.
\end{itemize}
We now construct the distinguisher $B$. Recall that $B$ obtains $q$ samples $Y^{(1)},\dots,Y^{(q)}$ and has to efficiently decide whether they were sampled at random or pseudorandomly. Intuitively, we'd like to use these samples as the labels of the $i$th level in the tree. However, the size of the $i$th layer could very well be exponential. So we may not have enough samples to do so, and in any case certainly cannot sample all labels efficiently.
The point is that the adversary $A$ doesn't really look at all the leaves, but at most $q$ of them, so most of the labels are irrelevant. Specifically, we will compute the labels in a lazy fashion.
\begin{figure}[h]
\centering
\begin{tikzpicture}
[edge from parent/.style={thick,draw=black!70,-latex},]
\tikzset{level distance=40pt,sibling distance=-5pt} \tikzset{execute at begin node=\strut}
\tikzset{font={\fontsize{7.5pt}{9}\selectfont}}
\Tree [.* [.{\color{red} s\textsubscript{0}} [.G\textsubscript{0}(s\textsubscript{0})
[.G\textsubscript{0}(G\textsubscript{0}(s\textsubscript{0})) ]
[.G\textsubscript{1}(G\textsubscript{0}(s\textsubscript{0})) ]
]
[.G\textsubscript{1}(s\textsubscript{0})
[.G\textsubscript{0}(G\textsubscript{1}(s\textsubscript{0})) ]
[.G\textsubscript{1}(G\textsubscript{1}(s\textsubscript{0})) ]
]]
[.{\color{red} s\textsubscript{1}} [.G\textsubscript{0}(s\textsubscript{1})
[.G\textsubscript{0}(G\textsubscript{0}(s\textsubscript{1})) ]
[.G\textsubscript{1}(G\textsubscript{0}(s\textsubscript{1})) ]
]
[.G\textsubscript{1}(s\textsubscript{1})
[.G\textsubscript{0}(G\textsubscript{1}(s\textsubscript{1})) ]
[.G\textsubscript{1}(G\textsubscript{1}(s\textsubscript{1})) ]
]]]
\end{tikzpicture}
\caption{$H_i$ illustrated for $i=1$}
\end{figure}
\begin{figure}[h]
\centering
\begin{tikzpicture}
[edge from parent/.style={thick,draw=black!70,-latex},]
\tikzset{level distance=40pt,sibling distance=-5pt} \tikzset{execute at begin node=\strut}
\tikzset{font={\fontsize{7.5pt}{9}\selectfont}}
\Tree [.* [.Y\textsubscript{0}\textsuperscript{(1)}
[.G\textsubscript{0}(Y\textsubscript{0}\textsuperscript{(1)}) ]
[.G\textsubscript{1}(Y\textsubscript{0}\textsuperscript{(1)})
[.G\textsubscript{0}(G\textsubscript{1}(Y\textsubscript{0}\textsuperscript{(1)})) ]
[.G\textsubscript{1}(G\textsubscript{1}(Y\textsubscript{0}\textsuperscript{(1)})) ]]]
[.Y\textsubscript{1}\textsuperscript{(1)} ]]
\end{tikzpicture}
\caption{Lazy labeling using the list $Y^{(1)},\dots,Y^{(q)}$ for level $i$. Paths in the tree are only labeled per $A$'s queries.}
\end{figure}
\paragraph{Algorithm $B(Y^{(1)},\dots,Y^{(q)})$:}
\begin{itemize}
\item
Initialize $\cL=\emptyset$ (the set of $i$-level labels encountered so far).
\item
Initialize a counter $c=0$ (the number of samples used so far from the list $Y^{(1)},\dots,Y^{(q)}$).
\item
Emulate $A$, and when it makes a query $x$ act as follows:
\begin{enumerate}
\item If $\cL$ contains a pair $(x_1\dots x_i, L_{x_1\dots x_i})$, compute the labels down the path from $x_1\dots x_i$ to $x_1\dots x_n$ according to the recursive rule, and return the corresponding leaf.
\item If $\cL$ has no such element:
\begin{itemize}
\item Use the next sample $Y^{(c+1)}=Y^{(c+1)}_0 Y^{(c+1)}_1$\\ to add such an element and a sibling $(x_1\dots x_{i-1}0,Y^{(c+1)}_0),(x_1\dots x_{i-1}1,Y^{(c+1)}_1)$ to the list $\cL$.
\item Increase the counter $c$ by one.
\item Return to step (1) (now we can compute the corresponding leaf).
\end{itemize}
\end{enumerate}
\item
At the end output whatever $A$ outputs.
\end{itemize}
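A Python sketch of $B$'s lazy labeling (illustrative; $G_0,G_1$ are again modeled heuristically with SHA-256, and the class name is ours). A sample is consumed only when $A$'s query reaches a level-$i$ prefix not seen before; consistency across queries is maintained by caching:

```python
import hashlib

def G0(s: bytes) -> bytes:
    """Left half of a (heuristic) length-doubling PRG, SHA-256-based."""
    return hashlib.sha256(s + b"\x00").digest()

def G1(s: bytes) -> bytes:
    """Right half."""
    return hashlib.sha256(s + b"\x01").digest()

class LazyHybrid:
    """Answers A's queries as in hybrid H_i (for i >= 1): level-i labels
    come lazily from the sample list Y^(1), ..., Y^(q); lower levels are
    derived by applying G down the path."""

    def __init__(self, i: int, samples):
        self.i = i
        self.samples = iter(samples)  # each sample is 2n bits: Y = Y_0 || Y_1
        self.labels = {}              # level-i prefix -> cached label

    def query(self, x: str) -> bytes:
        prefix = x[: self.i]
        if prefix not in self.labels:
            y = next(self.samples)                # use the next sample Y^(c+1)
            y0, y1 = y[: len(y) // 2], y[len(y) // 2:]
            self.labels[prefix[:-1] + "0"] = y0   # label the node ...
            self.labels[prefix[:-1] + "1"] = y1   # ... and its sibling
        label = self.labels[prefix]
        for bit in x[self.i:]:                    # recursive rule down to the leaf
            label = G1(label) if bit == "1" else G0(label)
        return label
```

If the samples are outputs of $G$ on random seeds, the answers are distributed as in $H_{i-1}$; if they are uniform, as in $H_i$.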
It is left to verify that if the samples $Y^{(1)},\dots,Y^{(q)}$ are pseudorandom, then $A$'s emulated view is exactly as in $H_{i-1}$, whereas if they are random, then it is exactly as in $H_i$. Thus, $B$ distinguishes the two cases with advantage $\varepsilon/n$. By Claim \ref{clm:fold}, this advantage must be negligible, and hence so is $\varepsilon(n)$; that is, $\varepsilon(n) = n^{-\omega(1)}$.
\end{proof}
\bibliographystyle{alpha}
\bibliography{bib5}
\end{document}