\documentclass[10pt]{article}
\usepackage{amsfonts,amsthm,amsmath,amssymb}
\usepackage{array}
\usepackage{epsfig}
\usepackage{fullpage}
\usepackage[table]{xcolor}
\usepackage{adjustbox}
\usepackage{dashbox}
\usepackage{cjhebrew}
\usepackage{xspace}
\usepackage{url}
\usepackage{hyperref}
% packages for the graphs:
%\usepackage{tikz}
%\usepackage{tikz-qtree}
\begin{document}
\input{preamble.tex}
\lecture{11: {\large Fully-Homomorphic Encryption}}{January 4, 2018}{Nir Bitansky}{Yotam Nitzan, Evyatar Parker}
\section{Previously on Foundations of Crypto}
In the previous two lectures we learned about multi-party computation. We've seen that already in the 80's cryptographers have managed to prove a very strong completeness theorem --- {\em assuming oblivious transfer any function can be securely computed}. That may seem as a good place to stop the course, what else could we want?
As we've already mentioned in previous lectures this is hardly the case because the latter completeness theorems are not sensitive to resources --- they require lot's of interaction and computation, which are sometimes not available.
In the next lectures we'll encounter several such scenarios and introduce new type of tools that would solve them. The problems that we'll look at are mostly motivated by today's reality where most of our data (including private data) is stored and processed on various clouds.
\section{Fully-Homomorphic Encryption}
Say you want to store your private data $x$ on the web, so that on one hand the server doesn't learn the contents of your data, but on the other, you can still ask the server to perform any computation $f$ on the data, so that you would eventually get $f(x)$. Naturally, you want the communication and computation you have to invest to be proportional to the input-output size, and the description of $f$, but not the time it takes to perform the computation.
Fully-homomorphic encryption (FHE) \cite{DBLP:journals/cacm/RivestSA78} achieves exactly these objectives. It basically provides a way {\em to compute under the encryption}.
\begin{figure}[h!]
\centering
\includegraphics [width=15cm, clip=true]{scribe_fig1.png}
\caption{FHE allow to delegate computations to be performed over encrypted input. The total communication and computation by the client depend on $\abs{f} + \abs{f\(x\)}$ but not on $time(f,x)$ }
\end{figure}
\scribes{add Fig 1}
\begin{definition}[$F$-Homomorphic Encryption]
Let $F=\set{F_n}$ be a collection of $n^{O(1)}$-size circuits with Boolean output. A secret-key, CPA-secure, bit encryption scheme $(E,D)$ is $F$-Homomorphic if there exists an algorithm $Eval$ satisfying:
\begin{itemize}
\item
{\bf Homomorphism:} for any $sk\in \zo^n$, $f\in F_n$,
$$D_{sk}(Eval_f(E_{sk}(x_1),\dots,E_{sk}(x_\ell)))=f(x_1,\dots,x_\ell)\enspace.$$
\item
{\bf Compactness:} the output of $Eval_f$ is of length $n$ (doesn't grow with the complexity of $f$).
\end{itemize}
The scheme is {\bf fully homomorphic} if it is homomorphic for any collection $F$ of polynomial-size circuits.
\end{definition}
\paragraph{Remark:} The restriction to the secret key setting is actually w.l.o.g. for fully-homomorphic encryption. Indeed, there's a generic transformation from secret to public key \cite{DBLP:conf/tcc/Rothblum11}.
\medskip\noindent
Schemes that are homomorphic for a specific operation, e.g. addition, have been known for quite some time. In fact, most public-key schemes, such as RSA or Diffie-Hellman/Elgamal, are algebraic enough to admit some homomorphism. Constructing fully-homomorphic encryption was an open problem for three decades, and was in fact believed by many to be implausible. This changed in 2009 with a breakthrough result by Craig Gentry \cite{DBLP:conf/stoc/Gentry09} who provided the first candidate for homomorphic encryption. His initial construction already contained ingenious ideas, but was based on quite non-standard hardness assumptions. In the following years, fully-homomorphic encryption was based on standard assumptions \cite{DBLP:conf/focs/BrakerskiV11}, made much more efficient, and considerably simplified. Today, we will see one such construction by Gentry, Sahai, and Waters \cite{DBLP:conf/crypto/GentrySW13}, with the main objective being simplicity (and not necessarily efficiency or security).
\paragraph{Gentry's Blueprint.} We will follow the blueprint suggested in Gentry's construction, and utilized also in its followups. This includes two main steps:
\begin{enumerate}
\item
Construct a scheme for shallow circuits (where we will address how shallow later on).
\item
Show how to convert any such scheme to a scheme for general circuits.
\end{enumerate}
%
\section{Homomorphic Encryption for Shallow Circuits from LWE}
The construction we'll see is based on the hardness of a specific algebraic assumption called {\em Learning with Errors} (LWE), which is a cousin of the Learning Parity (LPN) with Noise problem that we've already met.
\def \ZZ {\mathbb{Z}}
\paragraph{The Learning with Errors Assumption \cite{DBLP:conf/stoc/Regev05}:}
Let $q=q(n)$ be a prime, and let $B=B(n)\ll q(n)$. The (decisional) $LWE_{q,B}$ assumption says that there is an efficiently samplable distribution $\chi=\chi_n$ on integers in $[-B,B]$ such that the following two oracles are computationally indistinguishable:
\begin{enumerate}
\item The oracle $NLE_{\chi,s}$ that is parameterized by a random $s\gets \ZZ_q^n$, and returns noisy linear equations in $s$ of the form:
$$
a, \langle a,s \rangle +e \text{ where } a\gets \ZZ_q^n \text{ and } e\gets \chi\enspace.
$$
\item The random oracle $R$ that returns random samples
$$
a, b \text{ where } a\gets \ZZ_q^n,b\gets \ZZ_q \enspace.
$$
\end{enumerate}
Like the LPN problem, this problem is related to the problem of decoding random linear codes (a collection of $m\gg n$ samples can be viewed as noisy version of the code word $As$, for a random linear code whose generator matrix is $A$). Intuitively, the problem becomes harder as the ratio $B/q$ decreases (in the extreme cases, when the ratio is zero, the problem is solved by gaussian elimination, and when the ratio is $1/2$, the problem is information theoretically impossible to solve).
This problem, however, has so far turned out to be much more versatile than LPN. Indeed, the transition to a large moduli $q$ and different error distributions allows for geometric interpretations of the problem and leads to strong connections to problems on lattices. Remarkably there are also reductions from the (average-case) hardness of the problem to the worst-case hardness of classical approximation problems on lattices such as (the gap version of) the shortest vector problem (where as expected the approximation factors are related to the noise to moduli ratio).
\paragraph{A Useful Abstraction.} To construct homomorphic encryption based on LWE, it will be convenient to think about the following abstraction.
\begin{definition}[Pseudorandom Approximate Hyperplane (PAH)]
A $(q,B)$-PAH is given by PPT algorithms $(G,H)$ where $G(1^n)$ generates a secret-key vector $s\in \ZZ_q^n$ and $H_s$, given additional randomness, samples vectors $h\in \ZZ_q^n$ so that
\begin{itemize}
\item
for any $h$ sampled from $H_s$, $\langle h, s \rangle \in [-B,B]$.
\item
The oracle $H_s$ is computationally indistinguishable from an oracle that samples random vectors in $\ZZ_q^n$.
\item
$s_1 = \lceil q/2 \rceil$.\footnote{This requirement is not essential, but will simplify a bit our constructions.}
\end{itemize}
\end{definition}
\begin{claim}
$LWE_{q,B}$ implies a $(q,B)$-PAH.
\end{claim}
\begin{proof}[Proof Sketch]
$G(1^n)$ outputs $s= (\lceil q/2 \rceil,s')$ for an LWE secret $s'\in \ZZ_q^{n-1}$. $H_s$ outputs $(\(\lceil q/2 \rceil\)^{-1}(as'+e),-a)$.
\end{proof}
We will prove the following:
\begin{theorem}[\cite{DBLP:conf/crypto/GentrySW13}]
Let $q\approx 2^{\sqrt{n}}$ and $B\approx \sqrt{q}$. Then a $(q,B)-PAH$ implies a homomorphic scheme for circuits of depth $n^{.49}$.
\end{theorem}
\paragraph{Warmup: Homomorphism for Additions (mod 2).} As a warmup, let us use our PAH to construct a scheme that supports an arbitrary polynomial number of additions mod 2:
\begin{itemize}
\item The secret key will be a secret PAH vector $s\in \ZZ_q^n$.
\item To encrypt $b\in \zo$ sample $h\gets H_s$ and output $c=h+(b,0,\dots,0)$.
\item To decrypt $c$, output $0$ if $\langle c,s\rangle\in [-q/4,q/4]$ or $1$ otherwise.
\item To homomorphically add $c$ and $c'$ output $c+c'$.
\end{itemize}
CPA security of this scheme easily follows by the fact that it has pseudorandom ciphertexts.
To see correctness, it is useful to first imagine that $H_s$ generates vectors $h$ that are truly orthogonal to $s$. In this case,
$$
\langle c,s\rangle = \langle h,s\rangle +b s_1= 0 + b \lceil q/2 \rceil \mod q\enspace,
$$
which is in $[-q/4,q/4]$ iff $b=0$.
Furthermore, for any $c_1,\dots,c_\ell$ encrypting $b_1,\dots,b_\ell$,
$$
\langle \sum c_i,s\rangle = \langle \sum h_i ,s\rangle +s_1\sum b_i= \lceil q/2 \rceil \sum b_i \mod q \approx_\ell \lceil q/2 \rceil \bigoplus_i b_i \enspace,
$$
where $\approx_\ell$ means up to an additive error of $\ell = n^{O(1)}\ll q/4$, meaning that this will decrypt to $\bigoplus_i b_i$.
Going back to reality, we cannot really choose the vectors $h$ to be orthogonal, as then they won't be pseudorandom (they all come from an $(n-1)$-dimensional space, and this will be detected after $\approx n$ samples). However, the same idea works when $h$ is only close to being orthogonal --- the above equations simply become approximate. In particular, $\langle c_i ,s\rangle \approx_{B} 0$ and $\langle\sum c_i ,s\rangle \approx_{\ell B} 0$. Since $\ell B \ll q/4$ decryption remains correct.
\paragraph{Supporting Multiplication via The Eigen-Vector Method.} To support general circuits, we need to support a universal set of operations. For this, we will want a scheme that supports both $+,\times \mod 2$. Thinking about our warmup additive scheme, there's no obvious way to homomorphically multiply two ciphertexts.\footnote{Two natural rather natural ways of multiplying vectors is via inner product or outer product. The first loses the geometric relation with the secret key $s$. The second actually makes sense, but results in a ciphertext vector that is quadratically larger. There are ways to {\em relinearize} it \cite{DBLP:conf/focs/BrakerskiV11}, but today we'll take a different approach.}
The high-level idea would be to generalize the scheme so that any ciphertext encrypting $b$ is a matrix $C$ instead of vector $c$. We'll replace the invariant $c^T s \approx b s_1$ with $Cs \approx b s$. Namely, the secret key $s$ will be an approximate eigen vector of $C$ with the encrypted message $b$ as an eigen value.
As before let's imagine for a second that the secret key is truly an eigen vector (rather than approximate). Then, we have
$$
(C+C')s = (b+b')s \text{ \hspace{1cm} and \hspace{1cm}} (CC')s = bb's\enspace.
$$
meaning we can make any (polynomial) number of these operations.
Again, we can not hope for security with exact eigen vectors, as these are easy to find using gaussian elimination. We will want to show that this idea can still be applied with approximate eigen-vectors --- we would like to construct our matrices $C$ (with approximate eigen-vector $s$) to be pseudorandom.
\paragraph{1st Attempt.} It seems we've already developed the needed tools to achieve the above goal. We will construct the scheme as follows:
\begin{itemize}
\item
The secret key will be a vector $s\in \ZZ_q^n$ for a PAH.
\item
An encryption of $b$ will be generated by sampling $H= (h_1,\dots,h_{n})^T$ where $h_i \gets H_s$ and outputting $C=H+bI$.
\item To decrypt $C$, compute $v= Cs$. Output $0$ if $v_1 \in [-q/4,q/4]$ or $1$ otherwise.
\item To homomorphically add $C$ and $C'$ output $C+C'$.
\item To homomorphically multiply $C$ and $C'$ output $CC'$.
\end{itemize}
CPA security of this scheme follows as before by the fact that $H$ and thus also $C$ is pseudorandom.
We now analyze correctness. We define the {\em error} $e(C)$ in a ciphertext $C$ encrypting $b$ with $s$ and its magnitude $\|e(C)\|$ as
$$
e(C) := Cs-bs \hspace{2cm} \|e(C)\| := \min\set{\beta \pST e(C) \in [-\beta,\beta]^n}\enspace.\footnote{Formally, $e(C)= e_{s,b}(C)$ is defined with respect to $s,b$.}
$$
For an arithmetic function $f$, we will want to understand the error in a ciphertext $C$ encrypting $f(x)$ that is the result of homomorphically evaluating $f$ on fresh encryptions of $x_1,\dots,x_\ell$, aiming that $\|e(C)\|\ll q/4$. Indeed, if that's, then for $v= Cs=e(C)+f(x)s$ it holds that $v_1 \approx_{q/4} f(x)\lceil q/2 \rceil$, and decryption will be correct.
We examine the error:
\begin{itemize}
\item In any fresh (i.e., generated by the encryption algorithm) ciphertext $C$:
$$
\|e(C)\| = \|Hs\| \leq B\enspace.
$$
\item Given two (possibly not fresh) ciphertexts $C$ and $C'$ encrypting $b$ and $b'$:
\begin{align*}
&\|e(C+C')\| = \|(C+C')s-(b+b')s\| = \|e(C)+e(C')\| \leq \|e(C)\|+\|e(C')\| \enspace.\\\\
&\|e(CC')\| = \|CC's-bb's\| = \|C(b's+e(C'))-bb's\|=\|b'e(C)+Ce(C')\|\enspace.
\end{align*}
\end{itemize}
As we can see, while adding two ciphertexts with small error results in a ciphertext with small error, this is not the case for multiplication. Indeed, if the matrix $C$ is pseudorandom, then it is likely to be the case that $\|Ce(C')\| \notin [-q/4,q/4]$.
\paragraph{Solution: Binary Encoding.} There is a surprisingly na\"ive solution to the last problem. Specifically, it would have been great if we could only encode the matrix $C$, containing large integers in $\bbZ_q$, as a new matrix that only has small entries. Hmmm... do we have such an encoding? We do, binary representation!
For every $c\in \ZZ_q$, let $bin(c)\in \zo^{1\times \log q}$ be its binary representation.\footnote{Formally, we should consider $\lceil \log q \rceil$.} Consistently, for a matrix $C \in \ZZ_q^{m\times n}$, with entries $c_{ij}$, let $bin(C)\in \zo^{m\times n\log q}$ be the matrix binary representing each entry $bin(c_{ij})$.
Roughly speaking, instead of multiplying $C$ by $C'$, we would like to multiply $bin(C)$ with $C'$. However, we first need to slightly augment the scheme as $bin(C)$ may lose the geometric relation between $C$ and $s$, in fact, the dimensions don't even work out. We will consider the inverse operation to $bin$, which takes the binary representation and turns it into back into representation in $\ZZ_q$:
{\small $$
Q = \left(
\begin{array}{ccccc}
\begin{array}{c} 1 \\ 2 \\ \vdots \\ 2^{\log q} \end{array} & & & & \\
& \begin{array}{c} 1 \\ 2 \\ \vdots \\ 2^{\log q} \end{array} & & & \\
& & & & \\
& & \begin{array}{c} \ddots \\ \ddots \\ \vdots \\ \ddots \end{array} & \\
& & & & \begin{array}{c} 1 \\ 2 \\ \vdots \\ 2^{\log q} \end{array} \\
\end{array}
\right)_{n\log q \times n}
$$
}
Note that for any $C\in \ZZ_q^{n\log q \times n}$,
$$
bin(C) Q = C\enspace.
$$
We will now tweak our encryption as follows:
\begin{itemize}
\item
The secret key will still be a vector $s\in \ZZ_q^n$ for a PAH.
\item
An encryption of $b$ will be generated by sampling $H= (h_1,\dots,h_{n\log q})^T$ where $h_i \gets H_s$ and outputting $C=H+bQ$.
\item To decrypt $C$, compute $v= Cs$. Output $0$ if $v_1 \in [-q/4,q/4]$ or $1$ otherwise.
\item To homomorphically add $C$ and $C'$ output $C+C'$.
\item To homomorphically multiply $C$ and $C'$ output $bin(C)C'$.
\end{itemize}
\paragraph{Analysis.} The security of the augmented schemes follows just as the security of the previous scheme.
We now analyze correctness. We redefine the {\em error} $e(C)$ in a ciphertext $C$ encrypting $b$ as
$$
e(C) := Cs-bQs \hspace{2cm}\enspace.
$$
We reexamine the error:
\begin{itemize}
\item As before, in any fresh ciphertext $C$:
$$
\|e(C)\| = \|Hs\| \leq B\enspace.
$$
\item Given two (possibly not fresh) ciphertexts $C$ and $C'$ encrypting $b$ and $b'$:
\begin{align*}
\|e(C+C')\| = &\|(C+C')s-(b+b')Qs\| = \|e(C)+e(C')\| \leq \|e(C)\|+\|e(C')\| \enspace.\\\\
\|e(bin(C) C')\| = &\|bin(C)C's-bb'Qs\| = \|bin(C)(b'Qs+e(C'))-bb'Qs\|=\\
&\|b'Cs +bin(C)e(C')-bb'Qs\|=\|b'e(C)+bin(C)e(C')\| \leq \|e(C)\|+n\log q\|e(C')\|\enspace.
\end{align*}
\end{itemize}
Eventually, we would like to homomorphically evaluate an arbitrary arithmetic function $f$. We can think w.l.o.g on the circuit describing the function as consisting of interchanging all-addition layers and all-multiplication layer, the error in the resulting ciphertext $C$ will depend on the number of multiplication layers, which is bounded by the depth $d$. Specifically it is dominated by
$$
\|e(C)\| \leq B\(n\log q\)^{O(d)}\enspace,
$$
and thus for $d \leq n^{0.49}$,
$$\|e(C)\| \leq 2^{o(\sqrt{n})} \ll q/4\enspace,$$
implying that we will obtain the correct result when decrypting.
\begin{figure}[h!]
\centering
\includegraphics [width=15cm, clip=true]{scribe_fig2.png}
\caption{A layered arithmatic circuit of depth $d$ }
\end{figure}
\scribes{Add Fig 2}
\section{Bootstrapping: From Shallow Circuits to Arbitrary Circuits}
The latter scheme actually allows to construct what is known as a leveled homomorphic encryption --- if we know a bound on the depth $d$ of the circuits to be evaluated ahead of time, we can choose our parameters $n,q$ such that $d \leq \sqrt{n}$.
Going back to Gentry's blueprint, we would now like to show how to turn the scheme we got into a fully homomorphic encryption, where an apriori bound on the depth is not known.
Here Gentry had a simple, but ingenious observation:
{\em
to correctly evaluate arbitrary functions $f$, it is enough to be able to correctly evaluate {\em the scheme's own decryption circuit plus one universal operation}.
}
Formalizing this requires an extra {\em circular security} assumption:
\begin{theorem}[\cite{DBLP:conf/stoc/Gentry09}]
Consider the function
$$
f^D_{C,C'}(s) = D_s(C) \hspace{0.3cm}NAND \hspace{0.3cm} D_s(C')\enspace.
$$
Then any homomorphic scheme $(E,D)$ for $f$ that is secure in the presence of an encryption $E_s(s)$ can be made fully homomorphic.
\end{theorem}
The idea behind this theorem is to publish $E_s(s)$ as an evaluation key. Then every time we evaluate $f^D_{C,C'}$ homomorphically on $E_s(s)$, with ``decryptable'' $C,C'$, we get a decryptable encryption of $D_s(C) \hspace{0.3cm}NAND \hspace{0.3cm} D_s(C')$ and can move on this way through the entire circuit.
\begin{center}
{\em Did our scheme support its own decryption circuit?}
\end{center}
Recall, that our scheme supported circuits of depth $n^{0.49}$, and our decryption process was:
\begin{itemize}
\item
To decrypt $C$, compute $v= Cs$. Output $0$ if $v_1 \in [-q/4,q/4]$ or $1$ otherwise.
\end{itemize}
This could be simplified to checking if $\langle c_1,s \rangle \in [-q/4,q/4]$. Computing an inner product of vectors of dimension $n$ over $\ZZ_q$ can be done in depth $\log n$ if we allow arithmetic operations $\mod q$. However, allowing only arithmetics over $\ZZ_2$, leads to an overhead of $\polylog(q)$, and will thus exceed our allow bound $d \ll \log q$. What saves us is the fact that we only care about a few most significant bits of $\langle c_1,s \rangle$, which are the sum of $n$ numbers. In particular we only care about $O(\log n)$ most significant bits in each of the inputs, and can thus reduce the arithmetic overhead to $\polylog n$.
The security of the scheme clearly holds so long that the original scheme is secure given $E_s(s)$. Getting FHE from standard assumptions and without making a circular-security assumption is still an open problem.
\bibliographystyle{alpha}
\bibliography{bib11}
\end{document}