

Book Review: A New Kind of Science
This review of Stephen Wolfram's new book addresses weaknesses in Wolfram's notions of computational complexity, general relativity, quantum mechanics, and the Bell inequality violation.
"Somebody says, 'You know, you people always say that space is continuous.
How do you know when you get to a small enough dimension that there really are
enough points in between, that it isn't just a lot of dots separated by little
distances?' Or they say, 'You know those quantum mechanical amplitudes you
told me about, they're so complicated and absurd, what makes you think those
are right? Maybe they aren't right.' Such remarks are obvious and are
perfectly clear to anybody who is working on this problem. It does not do any
good to point this out."
 Richard Feynman [7]
1. Introduction
A New Kind of Science [18], the 1280page
treatise by Mathematica creator Stephen Wolfram, has
only a few things to say about quantum computing. Yet the book's goal  to
understand nature in computational terms  is one widely shared by the quantum
computing community. Thus, many in the field will likely be curious: is this
2.5kilogram tome worth reading? Notwithstanding newspaper comparisons
[6] to Darwin's Origin of Species, what is the
book's actual content? This review will not attempt a chapterbychapter
evaluation, but will focus on two specific areas: computational complexity
(Section 2), especially the claimed relevance of Wolfram's Principle of
Computational Equivalence to
NPcompleteness
(Section 2.1); and fundamental physics (Section 3), especially quantum mechanics
(Section 3.1) and Wolfram's attempt to account for Bell inequality violation in a
classical model (Section 3.2).
As a popularization, A New Kind of Science is an
impressive accomplishment. The book's main theme is that simple programs can
exhibit complex behavior. For example, let
p_{i,j}=1
if cell (i,j)
is colored black, and
p_{i,j}=0
if white. Then the 'Rule
110'
cellular automaton is defined by the recurrence
p_{i+1,j} = p_{i,j} + p_{i,j+1}  (1+p_{i,j1})p_{i,j}p_{i,j+1}
for i>0,
given some initial condition at
i=0.
Wolfram emphasizes that such an automaton, even when run with a simple initial
condition such as a single black cell, can generate a complicatedlooking
image with no apparent repetitive or nested structure. This phenomenon,
although well known to programming enthusiasts as well as professionals, will
no doubt surprise many general readers.
Using cellular automata as a framework, Wolfram moves on to discuss a range of
topics  including the second law of thermodynamics, natural selection, plant
and animal morphology, artificial intelligence, fluid dynamics, special and
general relativity, quantum mechanics, efficient algorithms and
NPcompleteness, heuristic search methods,
cryptography and pseudorandomness, data compression, statistical hypothesis
testing, Gödel's Theorem, axiomatic set theory, and the ChurchTuring
thesis. What is noteworthy is that he explains all of these without using
formal notation. To do so, he relies on about
1000
highresolution graphics, which often (though not always) convey the ideas
with as much precision as a formula would. With suitable disclaimers,
A New Kind of Science could form an excellent basis
for an undergraduate generalscience course.
The trouble is that, as the title implies, Wolfram emphatically does not
believe that he is using cellular automata to popularize known ideas. In the
introduction, he describes his finding that onedimensional cellular automata
can produce complex behavior as one of the "more important single discoveries
in the whole history of theoretical science" (p. 2). He refers in the
preface to "a crack in the very foundations of existing science," "new
ideas and new methods that ultimately depend very little on what has gone
before," and "a vast array of applications  both conceptual and
practical  that can now be developed." Comments of this character pervade
the book.
Significantly, there is no bibliography. Instead there are
349
pages of endnotes, which summarize the history, from antiquity to the present,
of each subject that Wolfram addresses. The notes are fascinating; in many
respects they constitute a better book than the main text. However, in both
the main text and in the notes, Wolfram generally brings up prior work only to
dismiss it as misguided, or at best as irrelevant to his concerns. For
example, after relating his 'discovery' that there is apparent complexity in
the distribution of primes, Wolfram acknowledges that "the first few hundred
primes were no doubt known even in antiquity, and it must have been evident
that there was at least some complexity in their distribution" (p. 134).
However [he continues], without the whole intellectual structure that I have
developed in this book, the implications of this observation  and its
potential connection, for example, to phenomena in nature  were not
recognized. And even though there has been a vast amount of mathematical work
done on the sequence of primes over the course of many centuries, almost
without exception it has been concerned not with basic issues of complexity
but instead with trying to find specific kinds of regularities (p. 134).
2. Computational Complexity
In the opening chapter we are promised that Wolfram's new kind of science
"begins to shed new light on various longstanding questions in computational
complexity theory" (p. 14). On pages 758764 we finally learn what this
means. By enumerating simple Turing machines  say, all those with
4
states and
2
symbols  one can lowerbound the number of steps needed by any such machine
to perform a particular task. To illustrate that simple machines can already
display nontrivial behavior, Wolfram exhibits a machine (call it
M)
that decides a language
L_{M} in
time exponential in the length of its input. No machine with at most
4
states can decide
L_{M} more
efficiently, and Wolfram conjectures that
M
is 'irreducible,' meaning that no machine with any
number of states can decide
L_{M} substantially
more efficiently.
It is unlikely that this enumeration approach scales even to
5state,
2symbol
machines. For let
S(n),
the
n^{th} 'Busy
Beaver shift' number, be the maximum number of steps that an
nstate,
2symbol
machine can make before halting, when started on a blank tape. Then as noted
by Wolfram on p. 889, determining
S(5) is
still open; the best known lower bound, due to Marxen and Buntrock [12], is S(5)>=47,176,870.
The situation is even worse if the machine can be started on a nonblank tape.
Wolfram would reply that extremely short programs are already of interest,
since they can exhibit 'irreducible complexity.' And indeed, in Chapter 11 he
gives a remarkable construction, due to his employee Matthew Cook, showing
that even the Rule
110
cellular automaton is a universal computer. A corollary is that there exists
a
2state,
5symbol
universal Turing machine. However, such constructions inevitably rely on a
complicated input encoding. And given a problem of interest, such as matrix
multiplication, there seems to be nothing to suggest that an extremely short
program exists to solve it using a standard
encoding (note 1). So by restricting to
extremely short programs, in effect we trade an infeasible search among
programs for an infeasible search among input encoding schemes.
Might it be worthwhile, though, to examine tradeoffs
between running time and program size? If we do this, then again we must not
allow preprocessing of the input, since otherwise a constantsize program can
always achieve within a polynomial factor of the optimal running time. We
therefore obtain complexity measures that are not 'robust' (in the traditional
sense) under changes to the input representation. Such measures might
nevertheless have some interest for theoretical computer science.
Elsewhere in the book, there are a few errors and oversights regarding
complexity. Wolfram says that minimizing a DNF expression (p. 1096) and
computing a permanent (p. 1146) are NPcomplete; they
are respectively Σ_{2}Pcomplete
(as shown by Umans [16]) and
#Pcomplete
(as shown by Valiant [17]). Also, in Chapter 10, pseudorandom number
generators based on cellular automata are proposed. Wolfram suggests that,
since certain questions involving cellular automata are
NPcomplete, these generators might be a good basis
for cryptography (note 2):
To date [no] system has been devised whose cryptanalysis is known to be
NPcomplete. Indeed, essentially the only problem on which cryptography
systems have so far successfully been based is factoring of integers. [And]
while this problem has resisted a fair number of attempts at solution, it is
not known to be NPcomplete (and indeed its ability to be solved in polynomial
time on a formal quantum computer may suggest that it is not) (p. 10891090).
There are two problems here. First, all cryptanalysis problems are in NP intersect coNP,
and thus cannot be NPcomplete unless
NP=coNP.
Second, there is no shortage of candidate pseudorandom generators (or
equivalently, candidate oneway functions; Håstad et al. [9] showed
that either can be obtained from the other). Attention has focused on
factoring  and on other 'structured' problems, involving elliptic curves,
errorcorrecting codes, lattices, and so on  because of the need for
trapdoor oneway functions in publickey cryptography.
2.1. The Principle of Computational
Equivalence
The final chapter proposes a 'Principle of Computational Equivalence': that
almost all systems that are not 'obviously simple' are in some sense
equivalent to a universal Turing machine. Wolfram [19] emphasizes that
this principle goes beyond the ChurchTuring thesis in two ways: it asserts,
first, that universality is pervasive in nature; and second, that universality
arises in any sufficiently complex system, without needing to be 'engineered
in.' We confess that the principle still seems to us an expression of the
conventional wisdom in theoretical computer science. However, we will not
debate this question in general terms. Instead we will consider a specific
implication that Wolfram offers for computational complexity:
In studying the phenomenon of NP completeness what has mostly been done in the
past is to try to construct particular instances of rather general problems
that exhibit equivalence to other problems. But almost always what is
actually constructed is quite complicated  and certainly not something one
would expect to occur at all often. Yet on the basis of intuition from the
Principle of Computational Equivalence I strongly suspect that in most cases
there are already quite simple instances of general NPcomplete problems that
are just as difficult as any NPcomplete problem. And so, for example, I
suspect that it does not take a cellular automaton nearly as complicated as
[one with 19 colors given previously] for it to be an NPcomplete problem to
determine whether initial conditions exist that lead to particular behavior.
(p. 769)
In computer science, the complexity of 'typical' instances of
NPcomplete
problems has been investigated for decades. Highlights include Levin's theory
of averagecase completeness [10] and studies of phase transitions in
randomly generated combinatorial problems [5]. It remains open to show that
some
NPcomplete
problem is hard on average, under a simple distribution, so long as P!=NP.
However, 'worstcase/averagecase equivalence' has been shown for several
cryptographic problems, including one studied by Ajtai and Dwork [1].
As for the cellular automaton conjecture, its validity depends on how it is
formulated. Suppose we are given a onedimensional, twocolor cellular
automaton on a lattice of bounded size
n,
and an ending condition E in {0,1}^{n}. Then
we can decide in polynomial time whether there exists an initial condition
that evolves to
E
in one or more steps, by using dynamic programming. Extending this technique,
we can decide whether there exists an initial condition that evolves to
E
in exactly
t
steps, where
t=O(log n),
by computing a list of all possible initial configurations for each contiguous
block of
t cells.
Indeed, for any fixed polynomialtime predicate Φ,
let Init_{110}(Φ) be
the following problem. We are given an ending condition
E
with
n
cells, and asked to decide whether there exists an initial condition
I
such that (i) Φ(I) holds,
and (ii) the Rule
110
cellular automaton evolves
I
to
E
in exactly
t
steps. Here
t
is a fixed polynomial in
n.
Then:
Proposition 1. For all
Φ and
polynomials
p,
there is a polynomialtime algorithm that solves a
11/p(n) fraction
of
Init_{110}(Φ) instances.
Proof. Consider a directed graph with
2^{n}
vertices, one for each configuration, and edges corresponding to the action of
Rule
110.
Each vertex has outdegree
1,
so the number of paths of length
t
is 2^{n}.
Thus, if
E
is chosen uniformly at random, then the expected number of
lengtht
paths ending at
E
is
1,
so this number is at most
p(n) with
probability at least
11/p(n).
In this case we can trace
t
steps backward from
E
in time
O(ntp(n)),
maintaining a list of all possible predecessors, and then evaluate
Φ on
each.
Nevertheless, since Rule
110
is universal, the following intuition suggests that
Init_{110}(Φ) should
be
NPcomplete for some
Φ.
Given a Boolean formula
Ψ and
proposed solution
X,
we could create an initial condition
I(Ψ,X)
corresponding to a Turing machine that checks whether
X
satisfies
Ψ;
and if it does, erases
X,
preserves Ψ,
and goes into an 'accept' state
A(Ψ).
Then by having
Φ verify
that the initial condition is of legal form, we could reduce the problem of
whether
Ψ is
satisfiable to that of whether there exists a legal initial condition that
evolves
to E=A(Ψ).
This intuition fails for an interesting reason. Cook's proof that Rule
110
is universal relies on simulating 'cyclic tag systems,' a variant of the tag
systems studied in the 1960's by Cocke and Minsky [4] among others (see also
p. 670 of Wolfram). However, though Wolfram does not discuss this explicitly
in the book, the known simulations of Turing machines by tag systems require
exponential slowdown. To prove that
Init_{110}(Φ) is
NPcomplete,
what is needed is to show that Rule
110
allows efficient simulation of Turing machines.
In summary, there are many fascinating complexity questions about
onedimensional cellular automata, as well as about typical instances of
NPcomplete
problems. But it is unclear why the Principle of Computational Equivalence
should yield more insight into these questions than the standard techniques of
computational complexity.
3. Fundamental Physics
The most interesting chapter of A New Kind of Science
is the ninth, on 'Fundamental Physics.' Here Wolfram confronts general
relativity and quantum mechanics, arguably the two most serious challenges to
a view of nature based on deterministic cellular automata. He conjectures
that spacetime is discrete at the Planck scale, of about 10^{33} centimeters
or
10^{43} seconds.
This conjecture is not new, and has received considerable attention recently
in connection with the holographic principle [3] from black hole
thermodynamics, which Wolfram does not
discuss (note 3). But are new ideas
offered to substantiate the conjecture?
For Wolfram, spacetime is a causal network, in which
events are vertices and edges specify the dependence relations between events.
Pages 486496 and 508515 discuss in detail how to generate such a network
from a simple set of rules. In particular, we could start with a finite
undirected 'space graph'
G,
assumed to be regular with degree
3
(since higherdegree vertices can be replaced by cycles of
degree3
vertices). We then posit a set of update rules, each of which replaces a
subgraph by another subgraph with the same number of outgoing edges. The new
subgraph must preserve any symmetries of the old one. Then each event in the
causal network corresponds to an application of an update rule. If updating
event
B
becomes possible as a result of event
A,
then we draw an edge from
A
to
B.
As pointed out to us by Rowland [13], it is important to keep in mind the
distinction between the causal network and the space graph
G.
Properties of space are defined in terms of
G.
For example, if the number of vertices in
G
at distance at most
n
from any given vertex grows as
n^{D},
then space can be said to have dimension
D.
(As for formalizing this definition, Wolfram says only that there are "some
subtleties. For example, to find a definite volume growth rate one does still
need to take some kind of limit  and one needs to avoid sampling too many or
too few" vertices (p. 1030).) Similarly, Wolfram argues that the curvature
information needed for general relativity, in particular the Ricci
tensor (note 4), can be read from the
connectivity pattern of
G.
Interestingly, to make the model as simple as possible, Wolfram does not
associate a bit to each vertex of
G,
representing (say) the presence or absence of a particle. Instead particles
are localized structures, or 'tangles,' in
G.
An immediate problem is that one might obtain many nonequivalent causal
networks, depending on the order in which update rules are applied to
G.
Wolfram calls a set of rules that allows such nondeterministic evolution a
'multiway system.' He recognizes, but rejects, a possible connection to
quantum mechanics:
The notion of 'manyfigured time' has been discussed since the 1950s in the
context of the manyworlds interpretation of quantum mechanics. There are
some similarities to the multiway systems that I consider here. But an
important difference is that while in the manyworlds approach, branchings are
associated with possible observation or measurement events, what I suggest
here is that they could be an intrinsic feature of even the very lowestlevel
rules for the universe (p. 10356).
It is unclear exactly what distinction is being drawn: is there any physical
event that is not associated with a possible
observation or measurement? In any case, Wolfram opts instead for rule sets
that are 'causal invariant': that is, that yield the same causal network
regardless of the order in which rules are applied. As noted by Wolfram, a
sufficient (though not necessary) condition for causal invariance is that no
'replaceable' subgraph overlaps itself or any other replaceable subgraph.
Wolfram points out an immediate analogy to special relativity, wherein
observers do not in general agree on the order in which spacelike separated
events occur, yet agree on any final outcome of the events. He is vague,
though, about how (say) the Lorentz transformations might be derived in a
causal network model:
There are many subtleties here, and indeed to explain the details of what is
going on will no doubt require quite a few new and rather abstract concepts.
But the general picture that I believe will emerge is that when particles move
faster they will appear to have more nodes associated with them (p. 529).
Wolfram is "certainly aware that many physicists will want to know more
details," he says in the endnotes, about how a discrete model of the sort he
proposes can reproduce known features of physics. But, although he chose to
omit technical formalism from the presentation, "[g]iven my own personal
background in theoretical physics it will come as no surprise that I have
often used such formalism in the process of working out what I describe in
these sections" (p. 1043).
3.1. Quantum Mechanics
Physicists' hunger for details will likely grow further when they read the
section on 'Quantum Phenomena' (p. 537545). Here Wolfram's proposals are
even more unorthodox than he seems to acknowledge. He believes that quantum
mechanics is only an approximation to an underlying classical (and most likely
deterministic) theory. He is not advocating a
hiddenvariable approach such as Bohmian mechanics, in which the state vector
is supplemented by an 'actual' eigenstate of a particular observable. Instead
he thinks that, at the lowest level, the state vector is not needed at all; it
is merely a useful construct for describing some (though presumably not all)
higherlevel phenomena. Indeterminacy arises because of one's inability to
know the exact state of a system:
[I]f one knew all of the underlying details of the network that makes up our
universe, it should always be possible to work out the result of any
measurement. I strongly believe that the initial conditions for the universe
were quite simple. But like many of the processes we have seen in this book,
the evolution of the universe no doubt intrinsically generates apparent
randomness. And the result is that most aspects of the network that
represents the current state of our universe will seem essentially random (p.
543).
Similarly, Wolfram explains as follows why an electron has wave properties:
"...a network which represents our whole universe must also include us as
observers. And this means that there is no way that we can look at the
network from the outside and see the electron as a definite object" (p. 538).
An obvious question then is how Wolfram accounts for the possibility of
quantum computing, assuming BPP!=BQP.
He gives an answer in the final chapter:
Indeed within the usual formalism [of quantum mechanics] one can construct
quantum computers that may be able to solve at least a few specific problems
exponentially faster than ordinary Turing machines. But particularly after my
discoveries in Chapter 9 ['Fundamental Physics'], I strongly suspect that even
if this is formally the case, it will still not turn out to be a true
representation of ultimate physical reality, but will instead just be found to
reflect various idealizations made in the models used so far (p. 771).
In the endnotes, though, where he explains quantum computing in more detail,
Wolfram seems to hedge about which idealizations he has in mind:
It does appear that only modest precision is needed for the initial
amplitudes. And it seems that perturbations from the environment can be
overcome using versions of errorcorrecting codes. But it remains unclear
just what might be needed actually to perform for example the final
measurements required (p. 1148).
One might respond that, with or without quantum computing, Wolfram's proposals
can be ruled out on the simpler ground that they disallow Bell inequality
violations. Wolfram, however, puts forward an imaginative hypothesis to
account for bipartite entanglement. When two particles (or 'tangles' in the
graph
G)
collide, longrange 'threads' may form between them, which remain in place
even if the particles are later separated:
The picture that emerges is then of a background containing a very large
number of connections that maintain an approximation to threedimensional
space, together with a few threads that in effect go outside of that space to
make direct connections between particles (p. 544).
The threads can produce Bell correlations, but are somehow too small (i.e.
contain too few edges) to transmit information in a way that violates
causality. This is reminiscent of, for example, the 'multisimultaneity' model
studied experimentally by Stefanov et al. [15].
There are several objections one could raise against this thread
hypothesis. What we will show in Section 3.2 is that, if
one accepts two of Wolfram's own desiderata  determinism and causal
invariance  then the hypothesis fails. For now, though, we remark
that Wolfram says little about what, to us, is a more natural possibility
than the thread hypothesis. This is an explicitly quantum
cellular automaton or causal network, with a unitary transition
rule. The reason seems to be that he does not want continuity anywhere
in a model, not even in probabilities or amplitudes. In the notes,
he describes an experiment with a quantum cellular automaton as
follows:
One might hope to be able to get an ordinary cellular automaton with a limited
set of possible values by choosing a suitable [phase rotation]
θ
[θ=π/4
and
θ=π/3 are
given as examples in an illustration]. But in fact in nontrivial cases most
of the cells generated at each step end up having distinct values (p. 1060).
This observation is unsurprising, given results in quantum computing to the
effect that almost any nontrivial gate set is universal (that is, can
approximate any unitary matrix to any desired precision, or any orthogonal
matrix in case one is limited to reals). Indeed, Shi [14] has shown that a
Toffoli gate (note 5) plus any gate that
does not preserve the computational basis, or a
controlledNOT
gate plus any gate whose square does not preserve the
computational basis, are both universal gate sets. In any case, Wolfram does
not address the fact that continuity in amplitudes seems more 'benign' than
continuity in measurable quantities: the former, unlike the latter, does not
enable an infinite amount of computation to be performed in a finite time.
Also, as observed by Bernstein and Vazirani [2], the linearity of quantum
mechanics implies that small errors in amplitudes cannot be magnified during a
quantum computation.
3.2. Bell's Theorem and Causal Invariance
Let
R
be a set of graph updating rules, which might be probabilistic. Then consider
the following four assertions (which, though not mathematically precise, will
be clarified by subsequent discussion).

R
satisfies causal invariance. That is, given any initial graph (and choice of
randomness if
R is
probabilistic),
R yields
a unique causal network.

R
satisfies the relativity postulate. That is, assuming the causal network
approximates a flat Minkowski spacetime at a large enough scale, there are no
preferred inertial frames.

R
permits Bell inequality violation.

Any updating rule in
R
is always considered to act on a fixed graph, not on a distribution or
superposition over graphs. This is true even if parts of the initial graph
are chosen at random, and even if
R
is probabilistic.
Our goal is to show that, for any
R,
at least one of these assertions is false. Current physical theory would
suggest that (1)(3) are true and that (4) is false. Wolfram, if we
understand him correctly, starts with (4) as a premise, and then introduces
causal invariance to satisfy (1) and (2), and longrange threads to satisfy
(3). Of course, even to state the twoparty Bell inequalities requires some
notion of randomness. And on pages 299326, Wolfram discusses three
mechanisms for introducing randomness into a system: randomness in initial
conditions, randomness from the environment (i.e. probabilistic updating
rules), and intrinsic randomness (i.e. deterministic rules that produce
pseudorandom output). However, all of these mechanisms are compatible with
(4), and so our argument will show that they are inadequate assuming (1)(3).
Our conclusion is that, in a model of the sort Wolfram considers, randomness
must play a more fundamental role than he allows.
In a standard Bell experiment, Alice and Bob are given input bits
x_{A} and
x_{B} respectively,
chosen uniformly and independently at random. Their goal is, without
communicating, to output bits
y_{A} and
y_{B}
respectively such that
y_{A} XOR y_{B} = x_{A} AND x_{B}.
Under any 'local hidden variable' theory, Alice and Bob can succeed with
probability at most
3/4;
the optimal strategy is for them to ignore their inputs and output (say)
y_{A}=0 and
y_{B}=0.
However, suppose Alice has a qubit
ρ_{A} and
Bob a
ρ_{B},
that are jointly in the Bell state (00>+11>)/sqrt(2).
Then there is a protocol (note 6) by
which they can succeed with probability
(5+sqrt(2))/8, or about 0.802.
We model this situation by letting
A
and
B,
corresponding to Alice and Bob, be disjoint subgraphs of a graph
G.
We suppose that, at a large scale,
G
approximates a Euclidean space of some dimension; and that any causal network
obtained by applying updates to
Gapproximates
a Minkowski spacetime. We can think of
G
as containing longrange threads from
A
to
B,
though the nature of the threads will not affect our conclusions. We encode
Alice's input
x_{A} by
(say) placing an edge between two specific vertices in
A
if and only if
x_{A}=1.
We encode
x_{B}
similarly, and also supply Alice and Bob with arbitrarily many correlated
random bits. Finally we stipulate that, at the end of the protocol, there is
an edge between two specific vertices in
A
if and only if
y_{A}=1,
and similarly for
y_{B}.
A technicality is that we need to be able to identify which vertices
correspond to
x_{A},
y_{A},
and so on, even as
G
evolves over time. We could do this by stipulating that (say) "the
x_{A} vertices
are the ones that are roots of complete binary trees of depth
3,"
and then choosing the rule set to guarantee that, throughout the protocol,
exactly two vertices have this property.
Call a variable 'touched' after an update has been applied to a subgraph
containing any of the variable's vertices. Also, let
Z
be an assignment to all random variables: that is,
x_{A},
x_{B},
the correlated random bits, and the choice of randomness if
R is
probabilistic. Then for all
Z
we require the following, based on what observers in different inertial frames
could perceive:

There exists a sequence of updates under which
y_{A} is
output before any of Bob's variables are touched.

There exists another sequence under which
y_{B} is
output before any of Alice's variables are touched.
Then it is easy to see that, if Bell inequality violation occurs, then causal
invariance must be violated. Given
Z,
let y_{A}^{(1)}(Z),y_{B}^{(1)}(Z)
be the values of y_{A},y_{B} that
are output under rule sequence (i), and let y_{A}^{(2)}(Z),y_{B}^{(2)}(Z) be
the values output under sequence (ii). Then there must exist some
Z
for which either
y_{A}^{(1)}(Z) != y_{A}^{(2)}(Z)
or
y_{B}^{(1)}(Z) != y_{B}^{(2)}(Z)  for
if not, then the entire protocol could be simulated under a local hidden
variable model. It follows that the outcome of the protocol can depend on the
order in which updates are applied.
To obtain Bell inequality violation, something like the following seems to be
needed. We can encode 'hidden variables' into
G,
representing the outcomes of the possible measurements Bob could make on
ρ_{B}.
(We can imagine, if we like, that the update rules are such that observing any
one of these variables destroys all the others. Also, we make no assumption
of contextuality.) Then, after Alice measures
ρ_{A},
using the longrange threads she updates Bob's hidden variables conditioned on
her measurement outcome. Similarly, Bob updates Alice's hidden variables
conditioned on his outcome. Since at least one party must access its hidden
variables for there to be Bell inequality violation, causal invariance is
still violated. But a sort of probabilistic causal invariance holds, in the
sense that if we marginalize out
A
(the 'Alice' part of
G),
then the distribution of values for each of Bob's
hidden variables is the same before and after Alice's update. The lesson is
that, if we want both causal invariance and Bell inequality violation, then we
need to introduce probabilities at a fundamental level  not merely to
represent Alice and Bob's subjective uncertainty about the state of
G,
but even to define whether a set of rules is or is not causal invariant.
Note that we made no assumption about how the random bits were
generated  i.e. whether they were 'truly random' or were the pseudorandom
output of some updating rule. Our conclusion is also unaffected if we
consider a 'deterministic' variant of Bell's theorem due to Greenberger,
Horne, and Zeilinger [8], discussed by Wolfram on p. 1065. There three
parties, Alice, Bob, and Charlie, are given input bits
x_{A},
x_{B},
and
x_{C}
respectively, satisfying the promise that
x_{A} + x_{B} + x_{C} = 0 (mod 2).
The goal is to output bits
y_{A},
y_{B},
and
y_{C} such
that
y_{A} + y_{B} + y_{C} = 1  (1x_{A})(1x_{B})(1x_{C})
(mod 2).
Under a local hidden variable model, there is no protocol that succeeds on all
four possible inputs; but if the parties share the GHZ state (011>+101>+110>000>)/2,
then such a protocol exists. However, although the
output is correct with certainty, assuming causal
invariance one cannot implement the protocol without
introducing randomness into the underlying rules, precisely as for the
twoparty Bell inequality.
After a version of the above argument was circulated, Rowland [13]
claimed that the argument fails for the following reason. We assumed that
there exist two sequences of updating events, one in which Alice's measurement
precedes Bob's and one in which Bob's precedes Alice's. But we neglected the
possibility that a single update, call it
E,
is applied to a subgraph that straddles the longrange threads. The event
E
would encompass both Alice and Bob's measurements, so that neither would
precede the other in any sequence of updates. We could thereby obtain a rule
set
R satisfying
assertions (1), (3), and (4).
We argue that such an
R would
nevertheless fail to satisfy (2). For in effect we start with a flat
Minkowski spacetime, and then take two distinct events that are simultaneous
in a particular inertial frame, and identify them as being the
same event
E.
This can be visualized as 'pinching together' two horizontally separated
points on a spacetime diagram. (Actually a whole 'V' of points must be
pinched together, since otherwise entanglement could not have been created.)
However, what happens in a different inertial frame? It would seem that
E,
a single event, is perceived to occur at two separate times. That by itself
might be thought acceptable, but it implies that there exists a class of
preferred inertial frames: those in which
E
is perceived to occur only once. Of course, even in a flat spacetime, one
could designate as 'preferred' those frames in which Alice and Bob's
measurements are perceived to be simultaneous. A crucial distinction, though,
is that there one only obtains a class of preferred frames after deciding
which event at Alice's location, and which at Bob's
location, should count as the 'measurement.' Under Rowland's hypothesis, by
contrast, once one decides what counts as the measurement at Alice's location,
the decision at Bob's location is made automatically, because of the
identification of events that would otherwise be far apart.
4. Conclusion
Steven Levy [11] opines in Wired that "probably the
toughest criticism [of A New Kind of Science] will
come from those who reject Wolfram's ideas because the evidence for his
contentions is based on observing systems contained inside computers." In
our opinion, however, it is preferable to judge the book on its own terms  to
grant, that is, that many complex systems might indeed be fruitfully
understood in terms of simple computations. The question is, what does the
book tell us about such systems, beyond what was known from 'traditional
science'?
This review focused on two fields, computational complexity and fundamental
physics, which Wolfram claims are transformed by the discoveries in his book.
We made no attempt to assess the book's relevance to other fields such as
evolutionary biology and statistical physics.
In computational complexity, we argued that Wolfram often recapitulates
existing ideas (such as pseudorandomness and the intractability of simple
instances of NPcomplete problems), albeit without
precise definitions or proofs, and with greater claims of significance. On
the other hand, some of the book's contents  such as the explicit
constructions of Turing machines  may be of interest to theoretical computer
scientists.
In physics, the book proposes that spacetime be viewed in terms of causal
networks arising from graph rewriting systems. The causal network models are
intriguing, and in our opinion merit further mathematical study. However,
their relevance to physics is difficult to evaluate without the details that
Wolfram declines to supply. As for the proposal that a deterministic,
relativistically invariant, causal invariant model underlies quantum
mechanics, we argued that it fails  even if quantum mechanics breaks down for
more than two particles, and even if, as Wolfram suggests, one allows
longrange threads to connect entangled particles. Exactly what kinds of
classical models could underlie quantum mechanics is a question of great
importance, but Wolfram makes no serious effort to address the question.
A New Kind of Science was published by Wolfram's own
company, and was not subject to outside editing or peer review. If it were
(say) a creationist tract, then this unusual route to publication would be of
little consequence: for then no amount of editing would have improved it, and
few scientifically literate readers would be misled by it. What is
unfortunate in this case is that outside editing would probably have made a
substantial difference. In an endnote, Wolfram explains that "[p]erhaps I
might avoid some criticism by a greater display of modesty [in the text], but
the cost would be a drastic reduction in clarity" (p. 849). However, were
the book more cautious in its claims and more willing to acknowledge previous
work, it would likely be easier for readers to assess what it does offer: a
cellularautomatonbased perspective on existing ideas in science. Thus, we
believe the book would be not only less susceptible to criticism, but also
clearer.
Acknowledgments
I thank Stephen Wolfram for an enjoyable conversation about this review, David
Reiss for arranging the conversation, Todd Rowland for correspondence about
Section 3.2, and Dave Bacon for helpful comments. Supported by an NSF Graduate
Fellowship and by DARPA grant F306020120524.
Notes
 We are willing to be surprised by a counterexample.
 Wolfram [19] suggested to us that these generators have another advantage for practical cryptography, that they can be implemented in simple hardware without the need for an arithmetic unit.
 Wolfram [19] maintained to us that whether the number of degrees of freedom in a bounded region of spacetime is finite is unrelated to whether spacetime has a discrete structure of the sort he proposes.
 In personal conversation [19], Wolfram said what was especially significant is that the Ricci tensor, but not the full Riemann tensor, can be read from the connectivity pattern; and the Ricci tensor is all that is needed for GR. He said he had calculations to prove this, which were not included in the book for reasons of balance. He expressed hope that physicists would confirm his result by redoing the calculations.
 Indeed one can use almost any classical reversible 3bit gate in place of a Toffoli gate. On p. 1098, Wolfram reports that, out of 40,320 such gates, 38,976 are universal.
 If x_{A}=1 then Alice applies a π/8 phase rotation to ρ_{A}, and if x_{B}=1 then Bob applies a π/8 rotation to ρ_{B}. Both parties then measure in the standard basis and output whatever they observe.
References
[1]
M. Ajtai and C. Dwork. A publickey cryptosystem with worstcase/averagecase
equivalence, Electronic Colloquium on Computational
Complexity (ECCC) 3(65), 1996.
[2]
E. Bernstein and U. Vazirani. Quantum complexity
theory, SIAM Journal on Computing, 26(5):14111473,
1997.
[3]
R. Bousso. The holographic principle, to appear in Reviews
of Modern Physics, 74(3), 2002.
[4]
J. Cocke and M. Minsky. Universality of tag systems with P=2,
Journal of the ACM, 11(1):1520, 1964.
[5]
S. A. Cook and D. Mitchell. Finding hard instances of the satisfiability
problem: a survey, DIMACS Series in Discrete Math and
Theoretical Computer Science 35, pp. 117, 1997.
[6]
G. Farmela. The universe in black and white, The Daily
Telegraph, 13 January 1999.
[7]
R. P. Feynman. The Character of Physical Law, MIT
Press, 1998 (originally published 1965).
[8]
D. M. Greenberger, M. A. Horne, and A. Zeilinger. Bell's theorem without
inequalities, in SixtyTwo Years of Uncertainty: Historical,
Philosophical, and Physical Inquiries into the Foundations of Quantum
Mechanics, A. Miller (ed.), Plenum, 1990.
[9]
J. Håstad, R. Impagliazzo, L. A. Levin, and M. Luby. A pseudorandom
generator from any oneway function, SIAM Journal on
Computing, 28(4):13641396, 1999.
[10]
L. A. Levin. Average case complete problems, SIAM Journal on
Computing, 15(1):285286, 1986.
[11]
S. Levy. The man who cracked the code to everything,
Wired, June 2002.
[12] H. Marxen and J. Buntrock. Attacking the Busy Beaver 5, Bulletin of
the EATCS 40, pp. 247251, 1990. Also http://www.drb.insel.de/~heiner/BB.
[13]
T. Rowland. Email correspondence, June 12, 2002.
[14]
Y. Shi. Both Toffoli and controlledNOT need little help to do universal
quantum computation, submitted, 2002. quantph/0205115.
[15]
A. Stefanov, H. Zbinden, N. Gisin, and A. Suarez. Quantum correlations with
spacelike separated beam splitters in motion: experimental test of
multisimultaneity, Physical Review Letters 88(12),
2002.
[16]
C. Umans. The minimum equivalent DNF problem and shortest implicants,
Proceedings of 39th IEEE Symposium on Foundations of Computer
Science (FOCS), pp. 556563, 1998.
[17]
L. G. Valiant. The complexity of computing the permanent,
Theoretical Computer Science, 8(2):189201, 1979.
[18]
S. Wolfram. A New Kind of Science, Wolfram Media,
2002.
[19]
S. Wolfram. Telephone conversation, June 10, 2002.
  Join the discussion about this article on Mind·X!  