An Intensive Introduction to Cryptography — Boaz Barak

Index

\[ \newcommand{\undefined}{} \newcommand{\hfill}{} \newcommand{\qedhere}{\square} \newcommand{\qed}{\square} \newcommand{\ensuremath}[1]{#1} \newcommand{\bbA}{\mathbb A} \newcommand{\bbB}{\mathbb B} \newcommand{\bbC}{\mathbb C} \newcommand{\bbD}{\mathbb D} \newcommand{\bbE}{\mathbb E} \newcommand{\bbF}{\mathbb F} \newcommand{\bbG}{\mathbb G} \newcommand{\bbH}{\mathbb H} \newcommand{\bbI}{\mathbb I} \newcommand{\bbJ}{\mathbb J} \newcommand{\bbK}{\mathbb K} \newcommand{\bbL}{\mathbb L} \newcommand{\bbM}{\mathbb M} \newcommand{\bbN}{\mathbb N} \newcommand{\bbO}{\mathbb O} \newcommand{\bbP}{\mathbb P} \newcommand{\bbQ}{\mathbb Q} \newcommand{\bbR}{\mathbb R} \newcommand{\bbS}{\mathbb S} \newcommand{\bbT}{\mathbb T} \newcommand{\bbU}{\mathbb U} \newcommand{\bbV}{\mathbb V} \newcommand{\bbW}{\mathbb W} \newcommand{\bbX}{\mathbb X} \newcommand{\bbY}{\mathbb Y} \newcommand{\bbZ}{\mathbb Z} \newcommand{\sA}{\mathscr A} \newcommand{\sB}{\mathscr B} \newcommand{\sC}{\mathscr C} \newcommand{\sD}{\mathscr D} \newcommand{\sE}{\mathscr E} \newcommand{\sF}{\mathscr F} \newcommand{\sG}{\mathscr G} \newcommand{\sH}{\mathscr H} \newcommand{\sI}{\mathscr I} \newcommand{\sJ}{\mathscr J} \newcommand{\sK}{\mathscr K} \newcommand{\sL}{\mathscr L} \newcommand{\sM}{\mathscr M} \newcommand{\sN}{\mathscr N} \newcommand{\sO}{\mathscr O} \newcommand{\sP}{\mathscr P} \newcommand{\sQ}{\mathscr Q} \newcommand{\sR}{\mathscr R} \newcommand{\sS}{\mathscr S} \newcommand{\sT}{\mathscr T} \newcommand{\sU}{\mathscr U} \newcommand{\sV}{\mathscr V} \newcommand{\sW}{\mathscr W} \newcommand{\sX}{\mathscr X} \newcommand{\sY}{\mathscr Y} \newcommand{\sZ}{\mathscr Z} \newcommand{\sfA}{\mathsf A} \newcommand{\sfB}{\mathsf B} \newcommand{\sfC}{\mathsf C} \newcommand{\sfD}{\mathsf D} \newcommand{\sfE}{\mathsf E} \newcommand{\sfF}{\mathsf F} \newcommand{\sfG}{\mathsf G} \newcommand{\sfH}{\mathsf H} \newcommand{\sfI}{\mathsf I} \newcommand{\sfJ}{\mathsf J} \newcommand{\sfK}{\mathsf K} \newcommand{\sfL}{\mathsf L} \newcommand{\sfM}{\mathsf M} \newcommand{\sfN}{\mathsf N} \newcommand{\sfO}{\mathsf O} \newcommand{\sfP}{\mathsf P} \newcommand{\sfQ}{\mathsf Q} \newcommand{\sfR}{\mathsf R} \newcommand{\sfS}{\mathsf S} \newcommand{\sfT}{\mathsf T} \newcommand{\sfU}{\mathsf U} \newcommand{\sfV}{\mathsf V} \newcommand{\sfW}{\mathsf W} \newcommand{\sfX}{\mathsf X} \newcommand{\sfY}{\mathsf Y} \newcommand{\sfZ}{\mathsf Z} \newcommand{\cA}{\mathcal A} \newcommand{\cB}{\mathcal B} \newcommand{\cC}{\mathcal C} \newcommand{\cD}{\mathcal D} \newcommand{\cE}{\mathcal E} \newcommand{\cF}{\mathcal F} \newcommand{\cG}{\mathcal G} \newcommand{\cH}{\mathcal H} \newcommand{\cI}{\mathcal I} \newcommand{\cJ}{\mathcal J} \newcommand{\cK}{\mathcal K} \newcommand{\cL}{\mathcal L} \newcommand{\cM}{\mathcal M} \newcommand{\cN}{\mathcal N} \newcommand{\cO}{\mathcal O} \newcommand{\cP}{\mathcal P} \newcommand{\cQ}{\mathcal Q} \newcommand{\cR}{\mathcal R} \newcommand{\cS}{\mathcal S} \newcommand{\cT}{\mathcal T} \newcommand{\cU}{\mathcal U} \newcommand{\cV}{\mathcal V} \newcommand{\cW}{\mathcal W} \newcommand{\cX}{\mathcal X} \newcommand{\cY}{\mathcal Y} \newcommand{\cZ}{\mathcal Z} \newcommand{\bfA}{\mathbf A} \newcommand{\bfB}{\mathbf B} \newcommand{\bfC}{\mathbf C} \newcommand{\bfD}{\mathbf D} \newcommand{\bfE}{\mathbf E} \newcommand{\bfF}{\mathbf F} \newcommand{\bfG}{\mathbf G} \newcommand{\bfH}{\mathbf H} \newcommand{\bfI}{\mathbf I} \newcommand{\bfJ}{\mathbf J} \newcommand{\bfK}{\mathbf K} \newcommand{\bfL}{\mathbf L} \newcommand{\bfM}{\mathbf M} \newcommand{\bfN}{\mathbf N} \newcommand{\bfO}{\mathbf O} \newcommand{\bfP}{\mathbf P} \newcommand{\bfQ}{\mathbf Q} \newcommand{\bfR}{\mathbf R} \newcommand{\bfS}{\mathbf S} \newcommand{\bfT}{\mathbf T} \newcommand{\bfU}{\mathbf U} \newcommand{\bfV}{\mathbf V} \newcommand{\bfW}{\mathbf W} \newcommand{\bfX}{\mathbf X} \newcommand{\bfY}{\mathbf Y} \newcommand{\bfZ}{\mathbf Z} \newcommand{\rmA}{\mathrm A} \newcommand{\rmB}{\mathrm B} \newcommand{\rmC}{\mathrm C} \newcommand{\rmD}{\mathrm D} \newcommand{\rmE}{\mathrm E} \newcommand{\rmF}{\mathrm F} \newcommand{\rmG}{\mathrm G} \newcommand{\rmH}{\mathrm H} \newcommand{\rmI}{\mathrm I} \newcommand{\rmJ}{\mathrm J} \newcommand{\rmK}{\mathrm K} \newcommand{\rmL}{\mathrm L} \newcommand{\rmM}{\mathrm M} \newcommand{\rmN}{\mathrm N} \newcommand{\rmO}{\mathrm O} \newcommand{\rmP}{\mathrm P} \newcommand{\rmQ}{\mathrm Q} \newcommand{\rmR}{\mathrm R} \newcommand{\rmS}{\mathrm S} \newcommand{\rmT}{\mathrm T} \newcommand{\rmU}{\mathrm U} \newcommand{\rmV}{\mathrm V} \newcommand{\rmW}{\mathrm W} \newcommand{\rmX}{\mathrm X} \newcommand{\rmY}{\mathrm Y} \newcommand{\rmZ}{\mathrm Z} \newcommand{\paren}[1]{( #1 )} \newcommand{\Paren}[1]{\left( #1 \right)} \newcommand{\bigparen}[1]{\bigl( #1 \bigr)} \newcommand{\Bigparen}[1]{\Bigl( #1 \Bigr)} \newcommand{\biggparen}[1]{\biggl( #1 \biggr)} \newcommand{\Biggparen}[1]{\Biggl( #1 \Biggr)} \newcommand{\abs}[1]{\lvert #1 \rvert} \newcommand{\Abs}[1]{\left\lvert #1 \right\rvert} \newcommand{\bigabs}[1]{\bigl\lvert #1 \bigr\rvert} \newcommand{\Bigabs}[1]{\Bigl\lvert #1 \Bigr\rvert} \newcommand{\biggabs}[1]{\biggl\lvert #1 \biggr\rvert} \newcommand{\Biggabs}[1]{\Biggl\lvert #1 \Biggr\rvert} \newcommand{\card}[1]{\lvert #1 \rvert} \newcommand{\Card}[1]{\left\lvert #1 \right\rvert} \newcommand{\bigcard}[1]{\bigl\lvert #1 \bigr\rvert} \newcommand{\Bigcard}[1]{\Bigl\lvert #1 \Bigr\rvert} \newcommand{\biggcard}[1]{\biggl\lvert #1 \biggr\rvert} \newcommand{\Biggcard}[1]{\Biggl\lvert #1 \Biggr\rvert} \newcommand{\norm}[1]{\lVert #1 \rVert} \newcommand{\Norm}[1]{\left\lVert #1 \right\rVert} \newcommand{\bignorm}[1]{\bigl\lVert #1 \bigr\rVert} \newcommand{\Bignorm}[1]{\Bigl\lVert #1 \Bigr\rVert} \newcommand{\biggnorm}[1]{\biggl\lVert #1 \biggr\rVert} \newcommand{\Biggnorm}[1]{\Biggl\lVert #1 \Biggr\rVert} \newcommand{\iprod}[1]{\langle #1 \rangle} \newcommand{\Iprod}[1]{\left\langle #1 \right\rangle} \newcommand{\bigiprod}[1]{\bigl\langle #1 \bigr\rangle} \newcommand{\Bigiprod}[1]{\Bigl\langle #1 \Bigr\rangle} \newcommand{\biggiprod}[1]{\biggl\langle #1 \biggr\rangle} \newcommand{\Biggiprod}[1]{\Biggl\langle #1 \Biggr\rangle} \newcommand{\set}[1]{\lbrace #1 \rbrace} \newcommand{\Set}[1]{\left\lbrace #1 \right\rbrace} \newcommand{\bigset}[1]{\bigl\lbrace #1 \bigr\rbrace} \newcommand{\Bigset}[1]{\Bigl\lbrace #1 \Bigr\rbrace} \newcommand{\biggset}[1]{\biggl\lbrace #1 \biggr\rbrace} \newcommand{\Biggset}[1]{\Biggl\lbrace #1 \Biggr\rbrace} \newcommand{\bracket}[1]{\lbrack #1 \rbrack} \newcommand{\Bracket}[1]{\left\lbrack #1 \right\rbrack} \newcommand{\bigbracket}[1]{\bigl\lbrack #1 \bigr\rbrack} \newcommand{\Bigbracket}[1]{\Bigl\lbrack #1 \Bigr\rbrack} \newcommand{\biggbracket}[1]{\biggl\lbrack #1 \biggr\rbrack} \newcommand{\Biggbracket}[1]{\Biggl\lbrack #1 \Biggr\rbrack} \newcommand{\ucorner}[1]{\ulcorner #1 \urcorner} \newcommand{\Ucorner}[1]{\left\ulcorner #1 \right\urcorner} \newcommand{\bigucorner}[1]{\bigl\ulcorner #1 \bigr\urcorner} \newcommand{\Bigucorner}[1]{\Bigl\ulcorner #1 \Bigr\urcorner} \newcommand{\biggucorner}[1]{\biggl\ulcorner #1 \biggr\urcorner} \newcommand{\Biggucorner}[1]{\Biggl\ulcorner #1 \Biggr\urcorner} \newcommand{\ceil}[1]{\lceil #1 \rceil} \newcommand{\Ceil}[1]{\left\lceil #1 \right\rceil} \newcommand{\bigceil}[1]{\bigl\lceil #1 \bigr\rceil} \newcommand{\Bigceil}[1]{\Bigl\lceil #1 \Bigr\rceil} \newcommand{\biggceil}[1]{\biggl\lceil #1 \biggr\rceil} \newcommand{\Biggceil}[1]{\Biggl\lceil #1 \Biggr\rceil} \newcommand{\floor}[1]{\lfloor #1 \rfloor} \newcommand{\Floor}[1]{\left\lfloor #1 \right\rfloor} \newcommand{\bigfloor}[1]{\bigl\lfloor #1 \bigr\rfloor} \newcommand{\Bigfloor}[1]{\Bigl\lfloor #1 \Bigr\rfloor} \newcommand{\biggfloor}[1]{\biggl\lfloor #1 \biggr\rfloor} \newcommand{\Biggfloor}[1]{\Biggl\lfloor #1 \Biggr\rfloor} \newcommand{\lcorner}[1]{\llcorner #1 \lrcorner} \newcommand{\Lcorner}[1]{\left\llcorner #1 \right\lrcorner} \newcommand{\biglcorner}[1]{\bigl\llcorner #1 \bigr\lrcorner} \newcommand{\Biglcorner}[1]{\Bigl\llcorner #1 \Bigr\lrcorner} \newcommand{\bigglcorner}[1]{\biggl\llcorner #1 \biggr\lrcorner} \newcommand{\Bigglcorner}[1]{\Biggl\llcorner #1 \Biggr\lrcorner} \newcommand{\expr}[1]{\langle #1 \rangle} \newcommand{\Expr}[1]{\left\langle #1 \right\rangle} \newcommand{\bigexpr}[1]{\bigl\langle #1 \bigr\rangle} \newcommand{\Bigexpr}[1]{\Bigl\langle #1 \Bigr\rangle} \newcommand{\biggexpr}[1]{\biggl\langle #1 \biggr\rangle} \newcommand{\Biggexpr}[1]{\Biggl\langle #1 \Biggr\rangle} \newcommand{\e}{\varepsilon} \newcommand{\eps}{\varepsilon} \newcommand{\from}{\colon} \newcommand{\super}[2]{#1^{(#2)}} \newcommand{\varsuper}[2]{#1^{\scriptscriptstyle (#2)}} \newcommand{\tensor}{\otimes} \newcommand{\eset}{\emptyset} \newcommand{\sse}{\subseteq} \newcommand{\sst}{\substack} \newcommand{\ot}{\otimes} \newcommand{\Esst}[1]{\bbE_{\substack{#1}}} \newcommand{\vbig}{\vphantom{\bigoplus}} \newcommand{\seteq}{\mathrel{\mathop:}=} \newcommand{\defeq}{\stackrel{\mathrm{def}}=} \newcommand{\Mid}{\mathrel{}\middle|\mathrel{}} \newcommand{\Ind}{\mathbf 1} \newcommand{\bits}{\{0,1\}} \newcommand{\sbits}{\{\pm 1\}} \newcommand{\R}{\mathbb R} \newcommand{\Rnn}{\R_{\ge 0}} \newcommand{\N}{\mathbb N} \newcommand{\Z}{\mathbb Z} \newcommand{\Q}{\mathbb Q} \newcommand{\mper}{\,.} \newcommand{\mcom}{\,,} \DeclareMathOperator{\Id}{Id} \DeclareMathOperator{\cone}{cone} \DeclareMathOperator{\vol}{vol} \DeclareMathOperator{\val}{val} \DeclareMathOperator{\opt}{opt} \DeclareMathOperator{\Opt}{Opt} \DeclareMathOperator{\Val}{Val} \DeclareMathOperator{\LP}{LP} \DeclareMathOperator{\SDP}{SDP} \DeclareMathOperator{\Tr}{Tr} \DeclareMathOperator{\Inf}{Inf} \DeclareMathOperator{\poly}{poly} \DeclareMathOperator{\polylog}{polylog} \DeclareMathOperator{\argmax}{arg\,max} \DeclareMathOperator{\argmin}{arg\,min} \DeclareMathOperator{\qpoly}{qpoly} \DeclareMathOperator{\qqpoly}{qqpoly} \DeclareMathOperator{\conv}{conv} \DeclareMathOperator{\Conv}{Conv} \DeclareMathOperator{\supp}{supp} \DeclareMathOperator{\sign}{sign} \DeclareMathOperator{\mspan}{span} \DeclareMathOperator{\mrank}{rank} \DeclareMathOperator{\E}{\mathbb E} \DeclareMathOperator{\pE}{\tilde{\mathbb E}} \DeclareMathOperator{\Pr}{\mathbb P} \DeclareMathOperator{\Span}{Span} \DeclareMathOperator{\Cone}{Cone} \DeclareMathOperator{\junta}{junta} \DeclareMathOperator{\NSS}{NSS} \DeclareMathOperator{\SA}{SA} \DeclareMathOperator{\SOS}{SOS} \newcommand{\iprod}[1]{\langle #1 \rangle} \newcommand{\R}{\mathbb{R}} \newcommand{\cE}{\mathcal{E}} \newcommand{\E}{\mathbb{E}} \newcommand{\pE}{\tilde{\mathbb{E}}} \newcommand{\N}{\mathbb{N}} \renewcommand{\P}{\mathcal{P}} \notag \]
\[ \newcommand{\sleq}{\ensuremath{\preceq}} \newcommand{\sgeq}{\ensuremath{\succeq}} \newcommand{\diag}{\ensuremath{\mathrm{diag}}} \newcommand{\support}{\ensuremath{\mathrm{support}}} \newcommand{\zo}{\ensuremath{\{0,1\}}} \newcommand{\pmo}{\ensuremath{\{\pm 1\}}} \newcommand{\uppersos}{\ensuremath{\overline{\mathrm{sos}}}} \newcommand{\lambdamax}{\ensuremath{\lambda_{\mathrm{max}}}} \newcommand{\rank}{\ensuremath{\mathrm{rank}}} \newcommand{\Mslow}{\ensuremath{M_{\mathrm{slow}}}} \newcommand{\Mfast}{\ensuremath{M_{\mathrm{fast}}}} \newcommand{\Mdiag}{\ensuremath{M_{\mathrm{diag}}}} \newcommand{\Mcross}{\ensuremath{M_{\mathrm{cross}}}} \newcommand{\eqdef}{\ensuremath{ =^{def}}} \newcommand{\threshold}{\ensuremath{\mathrm{threshold}}} \newcommand{\vbls}{\ensuremath{\mathrm{vbls}}} \newcommand{\cons}{\ensuremath{\mathrm{cons}}} \newcommand{\edges}{\ensuremath{\mathrm{edges}}} \newcommand{\cl}{\ensuremath{\mathrm{cl}}} \newcommand{\xor}{\ensuremath{\oplus}} \newcommand{\1}{\ensuremath{\mathrm{1}}} \notag \]
\[ \newcommand{\transpose}[1]{\ensuremath{#1{}^{\mkern-2mu\intercal}}} \newcommand{\dyad}[1]{\ensuremath{#1#1{}^{\mkern-2mu\intercal}}} \newcommand{\nchoose}[1]{\ensuremath{{n \choose #1}}} \newcommand{\generated}[1]{\ensuremath{\langle #1 \rangle}} \newcommand{\bra}[1]{\ensuremath{\langle #1 |}} \newcommand{\ket}[1]{\ensuremath{| #1 \rangle}} \notag \]

Hash functions and random oracles

We have seen pseudorandom generators, functions and permutations, as well as Message Authentication codes, CPA and CCA secure encryptions. This week we will talk about cryptographic hash functions and some of their magical properties. We motivate this by the bitcoin cryptocurrency. As usual our discussion will be highly abstract and idealized, and any resemblance to real cryptocurrencies, living or dead, is purely coincidental.

The “bitcoin” problem

Using cryptography to create a centralized digital-currency is fairly straightforward, and indeed this is what is done by Visa, Mastercard etc.. The main challenge with bitcoin is that it is decentralized. There is no trusted server, there are no “user accounts”, no central authority to adjudicate claims. Rather we have a collection of anonymous and autonomous parties that somehow need to agree on what is a valid payment.

The currency problem

Before talking about cryptocurrencies, let’s talk about currencies in general.I am not an economist by any stretch of the imagination, so please take the dicussion below with a huge grain of salt. I would appreciate any comments on it. At an abstract level, a currency requires two components:

  • A scarce resource
  • A mechanism for determining and transferring ownership of certain quantities of this resource.

The original currencies were based on commodity money. The scarce resource was some commodity having intrinsic value, such as gold or silver, or even salt or tea, and ownership based on physical possession. However, as commerce increased, carrying around (and protecting) the large quantity of the commodities became impractical, and societies shifted to representative money, where the currency is not the commodity itself but rather a certificate that provides the right to the commodity. Representative money requires trust in some central authority that would respect the certificate. The next step in the evolution of currencies was fiat money, which is a currency (like today’s dollar, ever since the U.S. moved off the gold standard) that does not correspond to any commodity, but rather only relies on trust in a central authority. (Another example is the Roman coins, which though originally made of silver, have underdone a continous process of debasement until they contained less than two percent of it.) One advantage (sometimes disadvantage) of a fiat currency is that it allows for more flexible monetary policy on parts of the central authority.

Bitcoin architecture

Bitcoin is a fiat currency without a central authority. A priori this seems like a contradiction in terms. If there is no trusted central authority, how can we ensure a scarce resource? who settles claims of ownership? and who sets monetary policy? Bitcoin (and other cryptocurrencies) is about the solution for these problems via cryptographic means.

The basic unit in the bitcoin system is a coin. Each coin has a unique identifier, and a current owner .This is one of the places where we simplify and deviate from the actual Bitcoin system. In the actual Bitcoin system, the atomic unit is known as a satoshi and one bitcoin (abberviated BTC) is \(10^8\) satoshis. For reasons of efficiency, there is no individual identifier per satoshi and transactions can involve transfer and creation of multiple satoshis. However, conceptually we can think of atomic coins each of which has a unique identifier. Transactions in the system have either the form of “mint coin with identifier \(ID\) and owner \(P\)” or “transfer the coin \(ID\) from \(P\) to \(Q\)”. All of these transactions are recorded in a public ledger.

Since there are no user accounts in bitcoin, the “entities” \(P\) and \(Q\) are not identifiers of any person or account. Rather \(P\) and \(Q\) are “computational puzzles”. A computational puzzle can be thought of as a string \(\alpha\) that specifies some “problem” such that it’s easy to verify whether some other string \(\beta\) is a “solution” for \(\alpha\), but it is hard to find such a solution on your own. (Students with complexity background will recognize here the class NP.) So when we say “transfer the coin \(ID\) from \(P\) to \(Q\)” we mean that whomever holds a solution for the puzzle \(Q\) is now the owner of the coin \(ID\) (and to verify the authenticity of this transfer, you provide a solution to the puzzle \(P\).) More accurately, a transaction involving the coin \(ID\) is self-validating if it contains a solution to the puzzle that is associated with \(ID\) according to the latest transaction in the ledger.

Please re-read the previous paragraph, to make sure you follow the logic.

One example of a puzzle is that \(\alpha\) can encode some large integer \(N\), and a solution \(\beta\) will encode a pair of numbers \(A,B\) such that \(N=A\cdot B\). Another more generic example (that you can keep in mind as a potential implementation for the puzzles we use here) is that \(\alpha\) will be a string in \(\{0,1\}^{2n}\) while \(\beta\) will be a string in \(\{0,1\}^n\) such that \(\alpha = G(\beta)\) where \(G:\{0,1\}^n\rightarrow\{0,1\}^{2n}\) is some pseudorandom generator. The real Bitcoin system typically uses puzzles based on digital signatures, a concept we will learn about later in this course, but you can simply think of \(P\) as specifying some abstract puzzle and every person that can solve \(P\) can perform transactions on the coins owned by \(P\).There are reasons why Bitcoin uses digital signatures and not these puzzles. The main issue is that we want to bind the puzzle not just to the coin but also to the particular transaction, so that if you know the solution to the puzzle \(P\) corresponding to the coin \(ID\) and want to use that to transfer it to \(Q\), it won’t be possible for someone to take your solution and use that to transfer the coin to \(Q'\) before your transaction is added to the public ledger. We will come back to this issue after we learn about digital signatures. In particular if you lost the solution to the puzzle then you have no access to the coin, and if someone stole the solution from you, then you have no recourse or way to get your coin back. People have managed to lose millions of dollars in this way.

The bitcoin ledger

The main idea behind bitcoin is that there is a public ledger that contains an ordered list of all the transactions that were ever performed and are considered as valid in the system. Given such a ledger, it is easy to answer the question of who owns any particular coin. The main problem is how does a collection of anonymous parties without any central authority agree on this ledger? This is an instance of the consensus problem in distributed computing. This seems quite scary, as there are very strong negative results known for this problem; for example the famous Fischer, Lynch Patterson (FLP) result showed that if there is even one party that has a benign failure (i.e., it halts and stop responding) then it is impossible to guarantee consensus in an asynchronuous network. Things are better if we assume synchronicity (i.e., a global clock and some bounds on the latency of messages) as well as that a majority or supermajority of the parties behave correctly. The central clock assumption is typically approximately maintained on the Internet, but the honest majority assumption seems quite suspicious. What does it mean a “majority of parties” in an anonymous network where a single person can create multiple “entities” and cause them to behave arbitrarily (known as “byzantine” faults in distributed parlance)? Also, why would we assume that even one party would behave honestly- if there is no central authority and it is profitable to cheat then they everyone would cheat, wouldn’t they?

The bitcoin ledger consists of an ordered list of transactions. At any given point in time there might be several “forks” that continue the ledger, and different parties do not necessarily have to agree on them. However, the bitcoin architecture is designed to ensure that the parties corresponding to a majority of the computing power will reach consensus on a single ledger.

Perhaps the main idea behind bitcoin is that “majority” will correspond to a “majority of computing power”, or as the original bitcoin paper says, “one CPU one vote” (or perhaps more accurately, “one cycle one vote”). It might not be immediately clear how to implement this, but at least it means that creating fictitious new entities (sometimes known as a Sybill attack after the movie about multiple-personality disorder) cannot help. To implement it we turn to a cryptographic concept known as “proof of work” which was originally suggested by Dwork and Naor in 1991 as a way to combat mass marketing email.This was a rather visionary paper in that it foresaw this issue before the term “spam” was introduced and indeed when email itself, let alone spam email, was hardly widespread.

Consider a pseudorandom function \(\{ f_k \}\) mapping \(n\) bits to \(\ell\) bits. On average, it will take a party Alice \(2^\ell\) queries to obtain an input \(x\) such that \(f_k(x)=0^\ell\). So, if we’re not too careful, we might think of such an input \(x\) as a proof that Alice spent \(2^\ell\) time.

Stop here and try to think if indeed it is the case that one cannot find an input \(x\) such that \(f_k(x)=0^\ell\) using much fewer than \(2^\ell\) steps.

The main question in using PRF’s for proofs of work is who is holding the key \(k\) for the pseudorandom function. If there is a trusted server holding the key, then sure, finding such an input \(x\) would take on average \(2^\ell\) queries, but the whole point of bitcoin is to not have a trusted server. If we give \(k\) to a party Alice, then can we guarantee that she can’t find a “shortcut” to find such an input without running \(2^\ell\) queries? The answer, in general, is no.

Indeed, it is an excellent exercise to prove that (under the PRF conjecture) that there exists a PRF \(\{ f_k \}\) mapping \(n\) bits to \(n\) bits and an efficient algorithm \(A\) such that \(A(k)=x\) such that \(f_k(x)=0^\ell\).

However, suppose that \(\{ f_k \}\) was somehow a “super-strong PRF” that would behave like a random function even to a party that holds the key. In this case, we can imagine that making a query to \(f_k\) corresponds to tossing \(\ell\) independent random coins, and it would not be feasible to obtain \(x\) such that \(f_k(x)=0^\ell\) using much less than \(2^\ell\) cycles. Thus presenting such an input \(x\) can serve as a “proof of work” that you’ve spent \(2^\ell\) cycles or so. By adjusting \(\ell\) we can obtain a proof of spending \(T\) cycles for a value \(T\) of our choice. Now if things would go as usual in this course then I would state a result like the following:

Theorem: Under the PRG conjecture, there exist super strong PRF.

Unfortunately such a result is not known to be true, and for a very good reason. Most natural ways to define “super strong PRF” will result in properties that can be shown to be impossible to achieve. Nevertheless, the intuition behind it still seems useful and so we have the following heuristic:

The random oracle heuristic (aka “Random oracle model”, Bellare-Rogaway 1993): If a “natural” protocol is secure when all parties have access to a random function \(H:\{0,1\}^n\rightarrow\{0,1\}^\ell\), then it remains secure even when we give the parties the description of a cryptographic hash function with the same input and output lengths.

We don’t have a good characterization as to what makes a protocol “natural” and we do have fairly strong counterexamples to this heuristic (though they are arguably “unnatural”). That said, it still seems useful as a way to get intuition for security, and in particular to analyze bitcoin (and many other practical protocols) we do need to assume it, at least given current knowledge.

The random oracle heuristic is very different from all the conjectures we considered before. It is not a formal conjecture since we don’t have any good way to define “natural” and we do have examples of protocols that are secure when all parties have access to a random function but are insecure whenever we replace this random function by any efficiently computable function (see the homework exercises).

We can now specify the “proof of work” protocol for bitcoin. Given some identifier \(ID\in\{0,1\}^n\), an integer \(T \ll 2^n\), and a hash function \(H:\{0,1\}^{2n}\rightarrow\{0,1\}^n\), the proof of work corresponding to \(ID\) and \(T\) will be some \(x\in\{0,1\}^*\) such that the first \(\lceil \log T \rceil\) bits of \(H(ID\| x)\) are zero.The actual bitcoin protocol is slightly more general, where the proof is some \(x\) such that \(H(ID\|x)\), when interpreted as a number in \([2^n]\), is at most \(T\). There are also other issues about how exactly \(x\) is placed and \(ID\) is computed from past history that we ignore here.

From proof of work to consensus on ledger

How does proof of work help us in achieving consensus? The idea is that every transaction \(t_i\) comes up with a proof of work of some \(T_i\) time with respect to some identifier that is unique to \(t_i\). The length of a ledger \((t_1,\ldots,t_n)\) is the sum of the corresponding \(T_i\)’s which correspond to the total number of cycles invested in creating this ledger.

An honest party in the bitcoin network will accept the longest valid ledger it is aware of. (A ledger is valid if every transaction in it of the form “transfer the coin \(ID\) from \(P\) to \(Q\)” is self-certified by a solution of \(P\), and the last transaction in the ledger involving \(ID\) either transferred or minted the coin \(ID\) to \(P\)). If a ledger \(L\) corresponds to the majority of the cycles that were available in this network then every honest party would accept it, as any alternative ledger would be necessarily shorter. (See Reference:ledgerfig.)

The question is then how do we get to that happy state given that many parties might be non-malicious but still selfish and might not want to volunteer their computing power for the goal of creating a consensus ledger. Bitcoin achieves this by giving some incentive, in the form of the ability to mint new coins, to any party that adds to the ledger. This means that if we are already in the situation where there is a consensus ledger \(L\), then every party has an interest in continuing this ledger \(L\), and not any alternative, as they want their minting transaction to be part of the new consensus ledger. In contrast if they “fork” the consensus ledger then their work may well be for vain. Thus one can hope that the consensus ledger will continue to grow. (This is a rather hand-wavy and imprecise argument, see this paper for a more in depth analysis; this is also related to the phenomenon known as preferential attachment.)

Cost to mine, mining pools: Generally, if you know that completing a \(T\)-cycle proof will get you a single coin, then making a single query (which will succeed with probability \(1/T\)) is akin to buying a lottery ticket that costs you a single cycle and has probability \(1/T\) to win a single coin. One difference over the actual lottery is that there is also some probability that you’re working on the wrong fork of the ledger, but this incentivizes people to avoid this as much as possible. Another, perhaps even more major difference, is that things are setup so that this is a profitable enterprise and the cost of a cycle is smaller than the value of \(1/T\) coins. Just like in the lottery, people can and do gather in groups (known as “mining pools”) where they pool together all their computing resources, and then split the award if they win it. Joining a pool doesn’t change your expectation of winning but reduces the variance. In the extreme case, if everyone is in the same pool, then for every cycle you spend you get exactly \(1/T\) coins. The way these pools work in practice is that someone that spent \(C\) cycles looking for an output with all zeroes, only has probability \(C/T\) of getting it, but is very likely to get an output that begins with \(\log C\) zeroes. This output can serve as their own “proof of work” that they spent \(C\) cycles and they can send it to the pool management so they get an appropriate share of the reward.

The real bitcoin: There are several aspects in which the protocol described above differs from the real bitcoin protocol. Some of them were already discussed above: Bitcoin typically uses digital signatures for puzzles (though it has a more general scripting language to specify them), and transactions involve a number of satoshis (and the user interface typically displayes currency is in units of BTC which are \(10^8\) satoshis). The Bitcoin protocol also has a formula designed to factor in the decrease in dollar cost per cycle so that bitcoins become more expensive to mine with time. There is also a fee mechanism apart from the mining to incentivize parties to add to the ledger. (The issue of incentives in bitcoin is quite subtle and not fully resolved, and it is possible that parties’ behavior will change with time.) The ledger does not grow by a single transaction at a time but rather by a block of transactions, and there is also some timing synchronization mechanism (which is needed, as per the consensus impossiblity results). There are other differences as well; see the Bonneau et al paper as well as the Tschorsch and Scheuermann survey for more.

Collision resistance hash functions and creating short “unique” identifiers

Another issue we “brushed under the carpet” is how do we come up with these unique identifiers per transaction. We want each transaction \(t_i\) to be bound to the ledger state \((t_1,\ldots,t_{i-1})\), and so the ID of \(t_i\) should contain also the ID’s all the prior transactions. But yet we want this ID to be only \(n\) bits long. Ideally, we could solve this if we had a one to one mapping \(H\) from \(\{0,1\}^N\) to \(\{0,1\}^n\) for some very large \(N\gg n\). Then the ID corresponding to the task of appending \(t_i\) to \((t_1,\ldots,t_{i-1})\) would simply be \(H(t_1\|\cdots\|t_i)\). The only problem is that this is of course clearly impossible- \(2^N\) is much bigger than \(2^n\) and there is no one to one map from a large set to a smaller set. Luckily we are in the magical world of crypto where the impossible is routine and the unimaginable is occasional. So, we can actually find a function \(H\) that is “essentially” one to one.

A collision-resistant hash function is a map that from a large universe to a small one that is “practically one to one” in the sense that collisions for the function do exist but are hard to find.

The main idea is the following simple result, which can be thought of as one side of the so called “birthday paradox”:

If \(H\) is a random function from some domain \(S\) to \(\{0,1\}^n\), then the probability that after \(T\) queries an attacker finds \(x\neq x'\) such that \(H(x)=H(x')\) is at most \(T^2/2^n\).

Let us think of \(H\) in the “lazy evaluation” mode where for every query the adversary makes, we choose a random answer in \(\{0,1\}^n\) at the time it is made. (We can assume the adversary never makes the same query twice since a repeat query can be simulated by repeating the same answer.) For \(i< j\) in \([T]\) let \(E_{i,j}\) be the event that \(H(x_i)=H(x_j)\). Since \(H(x_j)\) is chosen at random and independently from the prior choice of \(H(x_i)\), the probability of \(E_{i,j}\) is \(2^{-n}\). Thus the probability of the union of \(E_{i,j}\) over all \(i,j\)’s is less than \(T^2/2^n\), but this probability is exactly what we needed to calculate.

This means that a random function \(H\) is collision resistant in the sense that it is hard for an efficient adversary to find two inputs that collide. Thus the random oracle heuristic would suggest that a cryptographic hash function can be used to obtain the following object:

A collection \(\{ h_k \}\) of functions where \(h_k:\{0,1\}^*\rightarrow\{0,1\}^n\) for \(k\in\{0,1\}^n\) is a collision resistant hash function (CRH) collection if the map \((k,x)\mapsto h_k(x)\) is efficiently computable and for every efficient adversary \(A\), the probability over \(k\) that \(A(k)=(x,x')\) such that \(x\neq x'\) and \(h_k(x)=h_k(x')\) is negligible.Note that the other side of the birthday bound shows that you can always find a collision in \(h_k\) using roughly \(2^{n/2}\) queries. For this reason we typically need to double the output length of hash functions compared to the key size of other cryptographic primitives (e.g., \(256\) bits as opposed to \(128\) bits).

Once more we do not know a theorem saying that under the PRG conjecture there exists a collision resistant hash function collection, even though this property is considered as one of the desiderata for cryptographic hash functions. However, we do know how to obtain collections satisfying this condition under various assumptions that we will see later in the course such as the learning with error problem and the factoring and discrete logarithm problems. Furthermore if we consider the weaker notion of security under a second preimage attack (also known as being a “universal one way hash function” or UOWHF) then it is known how to derive such a function from the PRG assumption.

A collection \(\{ h_k \}\) of collision resistant hash functions is an incomparable object to a collection \(\{ f_s \}\) of pseudorandom functions with the same input and output lengths. On one hand, the condition of being collision-resistant does not imply that \(h_k\) is indistinguishable from random. For example, it is possible to construct a valid collision resistant hash function where the first output bit always equals zero (and hence is easily distinguishable from a random function). On the other hand, unlike Reference:prfdef, the adversary of Reference:crhdef is not merely given a “black box” to compute the hash function, but rather the key to the hash function. This is a much stronger attack model, and so a PRF does not have to be collision resistant. (Constructing a PRF that is not collision resistant is a nice and recommended exercise.)

Practical constructions of cryptographic hash functions

While we discussed hash functions as keyed collections, in practice people often think of a hash function as being a fixed keyless function. However, this is because most practical constructions involve some hardwired standardized constants (often known as IV) that can be thought of as a choice of the key.

Practical constructions of cryptographic hash functions start with a basic block which is known as a compression function \(h:\{0,1\}^{2n}\rightarrow\{0,1\}^n\). The function \(H:\{0,1\}^*\rightarrow\{0,1\}^n\) is defined as \(H(m_1,\ldots,m_t)=h(h(h(m_1,IV),m_2),\cdots,m_t)\) when the message is composed of \(t\) blocks (and we can pad it otherwise). See Reference:merkledamgardfig. This construction is known as the Merkle-Damgard construction and we know that it does preserve collision resistance:

The Merkle-Damgard construction converts a compression function \(h:\{0,1\}^{2n}\rightarrow\{0,1\}^n\) into a hash function that maps strings of arbitrary length into \(\{0,1\}^n\). The transformation preserves collision resistance but does not yield a PRF even if \(h\) was pseudorandom. Hence for many applications it should not be used directly but rather composed with a transformation such as HMAC.

Let \(H\) be constructed from \(h\) as above. Then given two messages \(m \neq m' \in \{0,1\}^{tn}\) such that \(H(m)=H(m')\) we can efficiently find two messages \(x \neq x' \in \{0,1\}^{2n}\) such that \(h(x)=h(x')\).

The intuition behind the proof is that if \(h\) was invertible then we could invert \(H\) by simply going backwards. Thus in principle if a collision for \(H\) exists then so does a collision for \(h\). Now of course this is a vacuous statement since both \(h\) and \(H\) shrink their inputs and hence clearly have collisions. But we want to show a constructive proof for this statement that will allow us to transform a collision in \(H\) to a collision in \(h\). This is very simple. We look at the computation of \(H(m)\) and \(H(m')\) and at the first block in which the inputs differ but the output is the same (there must be such a block). This block will yield a collision for \(h\).

Practical random-ish functions

In practice we want much more than collision resistance from our hash functions. In particular we often would like them to be PRF’s as well. Unfortunately, the Merkle-Damgard construction is not a PRF even when \(IV\) is random and secret. This is because we can perform a length extension attack on it. Even if we don’t know \(IV\), given \(y=H_{IV}(m_1,\ldots,m_t)\) and a block \(m_{t+1}\) we can compute \(y' = h(y,m_{t+1})\) which equals \(H_{IV}(m_1,\ldots,m_{t+1})\).

One fix for this is to use a different \(IV'\) in the end of the encryption. That is, we define:

\(H_{IV,IV'}(m_1,\ldots,m_t) = h(IV',H_{IV}(m_1,\ldots,m_t))\)

A variant of this construction (where \(IV'\) is obtained as some simple function of \(IV\)) is known as HMAC and it can be shown to be a pseudorandom function under some pseudorandomness assumptions on the compression function \(h\). It is very widely implemented. In many cases where I say “use a cryptographic hash function” in this course I actually mean to use an HMAC like construction that can be conjectured to give at least a PRF if not stronger “random oracle”-like properties.

The simplest implementation for a compression function is to take a block cipher with an \(n\) bit key and an \(n\) bit message and then simply define \(h(x_1,\ldots,x_{2n})=E_{x_{n+1},\ldots,x_{2n}}(x_{1},\ldots,x_{n})\). A more common variant is known as Davies-Meyer where we also XOR the output with \(x_{n+1},\ldots x_{2n}\). In practice people often use tailor made block ciphers that are designed for some efficiency or security concerns.

Some history

Almost all practically used hash functions are based on the Merkle-Damgard paradigm. Hash functions are designed to be extremely efficientFor example, the Boneh-Shoup book quotes processing times of up to 255MB/sec on a 1.83 Ghz Intel Core 2 processor, which is more than enough to handle not just Harvard’s network but even Lamar College’s. which also means that they are often at the “edge of insecurity” and indeed have fallen over the edge.

In 1990 Ron Rivest proposed MD4, which was already shown weaknesses in 1991, and a full collision has been found in 1995. Even faster attacks have been since found and MD4 is considered completely insecure.

In response to these weaknesses, Rivest designed MD5 in 1991. A weakness was shown for it in 1996 and a full collision was shown in 2004. Hence it is now also considered insecure.

In 1993 the National Institute of Standards proposed a standard for a hash function known as the Secure Hash Algorithm (SHA), which has quite a few similarities with the MD4 and MD5 functions. This function is known as SHA-0, and the standard was replaced in 1995 with SHA-1 that includes an extra “mixing” (i.e., bit rotation) operation. At the time no explanation was given for this change but SHA-0 was later found to be insecure. In 2002 a variant with longer output, known as SHA-256, was added (as well as some others). In 2005, following the MD5 collision, significant weaknesses were shown in SHA-1. In 2017, a full SHA-1 collision was found. Today SHA-1 is considered insecure and SHA-256 is recommended.

Given the weaknesses in MD-5 and SHA-1 , NIST started in 2006 a competition for a new hashing standard, based on functions that seem sufficiently different from the MD5/SHA-0/SHA-1 family. (SHA-256 is unbroken but it seems too close for comfort to those other systems.) The hash function Keccak was selected as the new standard SHA-3 in August of 2015.

The NSA and hash functions.

The NSA is the world’s largest employer of mathematicians, and is very heavily invested in cryptographic research. It seems quite possible that they devote far more resources to analyzing symmetric primitives such as block ciphers and hash functions than the open research community. Indeed, the history above suggests that the NSA has consistently discovered attacks on hash functions before the cryptographic community (and the same holds for the differential cryptanalysis technique for block ciphers). That said, despite the “mythic” powers that are sometimes ascribed to the NSA, this history suggests that they are ahead of the open community but not so much ahead, discovering attacks on hash functions about 5 years or so ahead.

There are a few ways we can get “insider views” to the NSA’s thinking. Some such insights can be obtained from the Snowden documents. The Flame malware has been discovered in Iran in 2012 after operating since at least 2010. It used an MD5 collision to achieve its goals. Such a collision was known in the open literature since 2008, but Flame used a different variant that was unknown in the literature. For this reason it is suspected that it was designed by a western intelligence agency.

Another insight into NSA’s thoughts can be found in pages 12-19 of NSA’s internal Cryptolog magazine which has been recently declassified; one can find there a rather entertaining and opinionated (or obnoxious, depending on your point of view) review of the CRYPTO 1992 conference. In page 14 the author remarks that certain weaknesses of MD5 demonstrated in the conference are unlikely to be extended to the full version, which suggests that the NSA (or at least the author) was not aware of the MD5 collisions at the time.

Cryptographic vs non-cryptographic hash functions:

Hash functions are of course also widely used for non-cryptographic applications such as building hash tables and load balancing. For these applications people often use linear hash functions known as cyclic redundancy codes (CRC). Note however that even in those seemingly non-cryptographic applications, an adversary might cause significant slowdown to the system if he can generate many collisions. This can and has been used to obtain denial of service attacks. As a rule of thumb, if the inputs to your system might be generated by someone who does not have your best interests at heart, you’re better off using a cryptographic hash function.