The DNA and Natural Algorithms Group:

Publications

(...with sometimes-fanciful commentary by Erik in lieu of an abstract; see the paper itself for a more sober discussion...)

Universal Spatial Distribution Approximation in Equilibrium Lattice Models with Local Pairwise Interactions. *
Cameron Chalk and Erik Winfree.

Applying the Boltzmann machine neural network perspective to a standard lattice model of multicomponent condensates gave us the Boltzmann liquids model, wherein a contrastive learning rule tunes the interaction energies between different species so as to best approximate a target probability distribution over spatial configurations. We tried it out on avocado distributions, Hopfield distributions, texture distributions, and MNIST distributions, and it seemed to work OK. But what are the limits to the complexity of distributions that liquids can exhibit? Here, we prove both unhelpful impossibility theorems and impractical universality theorems — which nonetheless give some clarity about how “hidden” states may enhance the representational capacity.

[ Conference paper #11 in DNA Computing and Molecular Programming 32 (27 pages, July 2026) PDF, 1.3 MB. ]

Contrastive Hebbian Learning for Multicomponent Liquids. *
Yancheng Du, Cameron Chalk, Salvador Buse, Lulu Qian, Erik Winfree.

Have you ever had a recurring dream? The same scene or situation comes up again and again, maybe months or even years apart, the images and emotions maybe recast differently each time, but instantly recognizable as having the same core? The field of neural computation has had such a recurring dream: the contrastive learning formula dw_ij / dt = ⟨x_i x_j⟩_driven − ⟨x_i x_j⟩_relaxed. It has been derived from the minimization of relative entropy between probability distributions in stochastic models; it has been derived from the minimization of an instability measure for local energy minima in continuous models; it has been derived from minimization of a cost function for a weakly nudged equilibrium continuous system. And again and again -- the formula is the same, but the operating context differs. This dream even recurs in non-neural systems -- mechanical spring networks, electrical resistive networks, chemical reaction networks. And now, also phase separation in multicomponent liquids.

[ Conference paper #13 in DNA Computing and Molecular Programming 32 (27 pages, July 2026) PDF, 11.5 MB. ]
[ Supplementary Material and Executable Code on GitHub. ]

Combinatorial decision-making driven by multicomponent surface condensates. *
Aidan Zentner, Ethan V. Halingstad, Cameron Chalk, Michael P. Brenner, Arvind Murugan, Erik Winfree, and Krishna Shrinivas.

A flock of green parrots descends on a jacaranda tree. Why that tree? Some might call it swarm dynamics: a few birds decide to alight, others pass by, a few more join, some peel away, until, amid a cacophony of squawks, the whole flock settles. This is how collective decision-making works, unifying the inclinations of a hundred bird brains, not as a linear average, but as a nonlinear all-or-none. And so we have a green-and-purple tree in Pasadena in June... A jumble of transcription factors colocalize on a genomic locus. Why that locus? Some might call it phase separation: a few molecules stick to a favorable site, most drift past, a few more bind to the first, others fall off, until the balance of Brownian jostling tips and a droplet forms. This is how collective decision-making works, unifying the preferences of a thousand molecules, not as a linear paverage, but as a nonlinear all-or-none. And so we have dewdrops on the genome...

[ Journal version in PNAS (12 pages, June 2026) PDF, 7.2 MB; SI PDF, 15.4 MB. ]
[ Pre-publication draft on arXiv:2509.08100v2 : article pdf, 18.8 MB (51 pages, June 2026). ]
[ Pre-publication draft on arXiv:2509.08100v1 : article pdf, 16.6 MB (47 pages, September 2025, earlier title). ]
[ Accompanying simulation code on GitHub. ]

Differentiable Programming of Indexed Chemical Reaction Networks and Reaction-Diffusion Systems. *
Inhoo Lee, Salvador Buse, Erik Winfree.

In the echoes of time, no human cry has fanned the winds of change as powerfully as "I want more!" Bigger farms, bigger cities, bigger armies. Biology too feels this drive: bigger populations, bigger bodies, bigger brains. And it works, because it's the easiest knob to turn. Size scales to N with just log(N) bits of information, instead of O(N) bits to specify the details of every new part and O(N²) bits for every new interaction. But when can you just make something bigger and it works better? Artificial neural networks -- inspired by brains -- are the prime computational paradigm for "more of the same" performing more and more sophisticated tasks. In this work, we formulate a model for chemical reaction networks that is similarly "scalable" by allowing "more" species of a general indexed form, interacting via "more" reactions of a general indexed form. So to get a more powerful chemical system, just increase N. Remarkably, this model maps cleanly onto existing GPU-powered software infrastructure that was developed for machine learning (JAX). This allows our software, icrn, to support training indexed chemical reaction networks by example, including pattern recognition in the well-mixed setting and pattern formation in the reaction-diffusion setting. We hope icrn will prove applicable for modeling and exploring the potential of the general-purpose reaction mechanisms of the central dogma of molecular biology where many-to-many "promiscuous" interactions are prevalent, and for designing new kinds of scalable systems in molecular programming.

[ Conference paper #4 in DNA Computing and Molecular Programming 31 (23 pages, August 2025) PDF, 4.3 MB. ]

Learning and Inference in a Lattice Model of Multicomponent Condensates. *
Cameron Chalk, Salvador Buse, Krishna Shrinivas, Arvind Murugan, Erik Winfree.

We used to talk about "computing with soup". You know, chicken stock. A light broth. Clear. Delicious. Refreshing. Dilute molecules bumping into each other from time to time, reacting according to rules, changing color subtly, making decisions. Chemical reaction networks were programmable, we realized. But it's a new day. It's a new cookbook. It's a new flavor. Today, we are computing with stew: thick, mushy, everything always pressed together. It's complete goop in there!

[ Conference paper #5 in DNA Computing and Molecular Programming 30 (24 pages, September 2024) PDF, 8.9 MB. ]

Revisiting Hybridization Kinetics with Improved Elementary Step Simulation. *
Jordan Lovrod, Boyan Beronov, Chenwei Zhang, Erik Winfree, Anne Condon.

People keep on telling you, oh yeah, dynamics is a lot more complicated than thermodynamics. Equilibrium is easy; kinetics can be really hard. But you don't really know what they're talking about until you get entangled in an example. Our example is non-pseudoknotted DNA secondary structure. We've got a decent thermodynamic model based on nearest-neighbor interactions. Melting temperatures, Boltzmann distributions, base pair probabilities -- just ask NUPACK or Vienna. But the kinetics of hybridization, hairpin folding, branch migration... well, good luck. You're about to encounter endless fascinating surprises...

[ Conference paper #5 in DNA Computing and Molecular Programming 29 (24 pages, September 2023) PDF, 2.1 MB. ]
[ find code on GitHub soon. ]

Visions of DNA Nanotechnology at 40 for the Next 40: A Tribute to Nadrian C. Seeman. *
Natasha Jonoska and Erik Winfree.

Inspired by Ned's article DNA Nanotechnology at 40, Natasha and I thought it would be good to ask the community to step back for a minute and think about the past and think about the future. The resulting volume includes 22 chapters divided into sections on perspectives, chemistry and physics, structures, biochemical circuits, and spatial systems. We hoped Ned would appreciate it, but he never got the chance to see the final product. While it only touches on a smattering of the achievements and possibilities in DNA nanotechnology, and in that sense is an unfinished reflection of an unfinished field, I like to think that he would pleased to see his legacy in the strange, sprawling, chaotic, intermingled growth of the science and technology. One can try, of course, but in truth no one can predict where it's going. Not even Ned has a crystal ball!

[ Open Access Springer Book. 431 pages. ]

The Tall Thin Molecular Programmer. *
Erik Winfree.

For everything there is a season. There was a time when a single person could design a whole computer from transistors up to programming languages. Carver Mead and Lynn Conway taught classes at Caltech and MIT in the late 1970's with the explicit goal of giving students the skills to do this, which spawned their textbook, "Introduction to VLSI Systems", that changed the industry. Carver espoused the idea of a "tall thin computer engineer" who could produce better designs by optimizing across all scales of computer organization so that they worked better together. Molecular programming with DNA nanotechnology is now reaching a similar point -- not with respect to commercialization and applications, but with respect to the complexity of the design process. Has the time come for the tall thin molecular programmer?

[ Article 27.2 for International Solid-State Circuits Conference (ISSCC 2023): article pdf, 8.5 MB (2 pages). ]
[ Presentation slides with lecture notes. ]

Two-dimensional tile displacement can simulate cellular automata. *
Erik Winfree and Lulu Qian.

The toehold-mediated strand displacement mechanism can be viewed as a way to selectively mine tunnels through a mountainous energy landscape, thus allowing fast kinetics for desired state changes while preserving barriers to undesired pathways. It has proved to be a central concept for dynamic DNA nanotechnology, foundational for molecular machines and computing systems both. As used in well-mixed DNA strand displacement circuits, the mechanism is essentially one-dimensional, as displacement acts from one end of a helix to the other. The recently discovered tile displacement mechanism shares many qualitative features with strand displacement, despite acting at a scale orders of magnitude larger in DNA origami tiles, but has one unique feature: it's behavior is fundamentally two-dimensional when acting within arrays of tiles. How important is this? Very! And how interesting? See for yourself how it makes beautiful connections to synchronous and asynchronous cellular automata, as well as how it highlights principles from the thermodynamics of computation.

[ Pre-publication draft on arXiv:2301.01929 : article pdf, 2.9 MB (17 pages). ]

Pattern recognition in the nucleation kinetics of non-equilibrium self-assembly. *
Constantine Evans, Jackson O'Brien, Erik Winfree, Arvind Murugan.

In an age when it sometimes seems that everything is a sham, we bring you a SHAM that is no sham. Our SHAM is a self-assembling brain, you might say, a set of molecules that collectively make decisions and spell out their answers. "Is this pattern of concentrations a Horse?" you ask, and the SHAM spells out its nanoscale answer in the words that it is forming: "H!" It's a new way of thinking about neural computation, reflecting old ways. Where some might use Hebbian learning to dig basins in an energy landscape and thereby guide the neural computation thermodynamically down, our SHAM builds barriers in an energy landscape that must be surpassed and thereby challenges our neural computation kinetically up. Where some might take neural mathematics as a prescription for engineering an equivalent molecular circuit, our SHAM takes ubiquitous and inevitable molecular processes, nucleation and crystallization, and finds the neural mathematics within them. Where some might be content to contemplate the concepts and create in silico constructions, our SHAM is real: the molecules synthesized, the fluorescence monitored, the self-assembled shapes imaged by atomic force microscopy. We hope it brings you peace and comfort to know that not all that shimmers is a sham.

[ journal version in Nature (21 pages, January 2024) PDF, 9.9 MB; SI PDF, 81 MB. ]
[ coverage: News & Views by Andrew Phillips (pdf, 200 MB); press releases from Caltech, the University of Chicago, and Maynooth University; New Scientist; Scientific American]
[ more Supplementary Information and code and data, including pre-publication drafts. ]

Detailed Balanced Chemical Reaction Networks as Generalized Boltzmann Machines. *
William Poole, Thomas E. Ouldridge, Manoj Gopalkrishnan, and Erik Winfree.

☯ The foundation for many machine learning models is a description of a probabilty distribution in terms of energies of configurations, coupled with generative processes designed to satisfy detailed balance such that the system equilibrates to the Boltzmann distribution over the energies. This enables concise analytic formulas, clear thinking, and systematic approaches to derivation of generative processes for manipulating probabilistic information -- such as establishing the equivalence of clamping variables and performing inference of conditional distributions. ☯ The foundation for chemical systems in equilibrium with a bath is again a description of the relevant energies, with respect to which a Boltzmann distribution of system states will emerge. The kinetics of such systems are physically required to satisfy detailed balance, down to the level of individual reaction in chemical networks. ☯ Our thesis is that these two things are two sides of the same coin. Any detailed balanced chemical reaction network can be viewed as, and used as, a machine learning model capable of representing a probability distribution and performing inference by clamping -- which now have physical and chemical interpretations to parallel their information processing interpretations. We show that the ability to encode complex distributions relies on exploiting reachability constraints within stochastic chemical reaction networks, and thus relies on the bath exchanging heat (canonical ensemble) but not material (grand canonical ensemble) for critical species. We further provide bounds on the thermodynamic cost of performing probabilistic inference, approaching zero in the limit as the speed also goes to zero, but related to the relative entropy for finite speed. ☯

[ Pre-publication draft on arXiv:2205.06313: article pdf, 815KB (15 pages). ]

Andrés' PhD Thesis: Combinatorics and Stochasticity for Chemical Reaction Networks. *
139 pages. California Institute of Technology. Submitted April 2022.
Andrés Ortiz-Muñoz. Thesis advisors: Erik Winfree and Walter Fontana (Harvard).
Let's talk about inner beauty. Do you have it? Too personal? OK, does your science have it? How about your math? What does it mean for math to be beautiful? A superficial prettiness? A deep harmony and resonance? Is such beauty created from scratch in the mind, as by a master architect? Or is it something ancient newly uncovered, as by an archaeological dig? Or is it a reflection of nature, an image in the water? And is there any way to experience it other than to dive in?
[ PhD thesis, 4.8 MB (class of 2022); Caltech ETD. ]
[ Defense video (125 MB) and slides (49 MB). ]

Predicting DNA kinetics with a truncated continuous-time Markov chain method. *
Sedigheh Zolaktaf, Frits Dannenberg, Mark Schmidt, Anne Condon, and Erik Winfree.

The secondary structure state space for multistranded nucleic acid systems suffers from combinatorial explosion, making exact methods for computing computational infeasible. So kinetic simulations are often used. But the secondary structure landscape for processes of interest often involves high-energy barrier states that make direct simulation of critical rare events computationally infeasible. So a variety of techniques from statistical physics perform biased sampling to efficiently approximate the rates of rare events. But if our goal is to infer model parameters based on data using some kind of gradient descent, very similar simulations would have to be run repeatedly, which is wasteful. The pathway elaboration method is our attempt to address these conflicting goals by "pre-computing" what is expected to be the most relevant part of the state space, then calculating parameter-dependent kinetics using fast matrix methods.

[ journal version in Computational Biology and Chemistry (18 pages, February/June 2023) PDF, 2.5 MB. ]
[ pre-publication draft with title "The pathway elaboration method for mean first passage time estimation in large continuous-time Markov chains with applications to nucleic acid kinetics" on arXiv:2101.03657v2: article pdf, 3.8MB (32 pages). ]
[ find code on GitHub. ]

William's PhD Thesis: Compilation and Inference with Chemical Reaction Networks. *
165 pages. California Institute of Technology. Submitted November 2021.
William Poole. Thesis advisors: Erik Winfree and Richard Murray.
The Yin and the Yang. The Mandala of Life. The Grand Unified Theory. Throughout history, people have striven to take disparate elements and weave them together into a whole, or to recognize that from a certain perspective, they are already the same thing. It is a delight when two things can be seen as the same, maybe even in two different ways! Even rarer is when three things appears as the same in three different ways!! William Poole's thesis blends synthetic biology, molecular programming, and machine learning. Dripping through a French press, we find an intriguing brew of E. coli cell extracts, probabilistic models, and statistical thermodynamics. It is all the natural flow of information!
[ PhD thesis, 6.8 MB (class of 2022); Caltech ETD.]

Verifying polymer reaction networks using bisimulation. *
Robert Johnson and Erik Winfree.

Formally, discrete chemical reaction networks (CRNs) have a finite number of species, but an infinite number of states -- the numbers of molecules in a test tube is unbounded. The point is, this can make it hard to verify that a proposed molecular implementation (a big complicated CRN) is really doing the job it was designed to do (a smaller simpler CRN). Nonetheless, a method based on Milner's notion of bisimulation in concurrency theory was developed in our work on "CRN bisimulation". When considering chemical systems involving polymers of potentially unbounded length, such as the DNA, RNA, and protein polymers in the central dogma of molecular biology, then the number of possible species becomes infinite as well. The point is, the extra infinities can make implementations of polymer reaction networks (PRNs) even harder to verify. And yet, by extending the bisimulation ideas, it can (at least sometimes) be done!

[ journal version in article in Theoretical Computer Science (31 pages, online September 2020): PDF, 2.5 MB. ]
[ conference version in Verification of Engineered Molecular Devices and Programs (VEMDP) (15 pages, June 2014): workshop article, 798 KB ]
[ associated Mathematica-based reaction schema simulator that can simulate CRNs, PRNs, and more. ]

A domain-level DNA strand displacement reaction enumerator allowing arbitrary non-pseudoknotted secondary structures. *
Stefan Badelt, Casey Grun, Karthik Sarma, Brian Wolfe, Seung Woo Shin, and Erik Winfree.

So-called domain-level models were introduced to describe how engineered DNA strand displacement systems should work, prior to assigning actual sequences to the domains. Here we describe our efforts to provide a model rich enough to describe arbitrary secondary structures (so long as no pseudoknots are involved) and to support reaction mechanisms such as association, dissociation, 3-way branch migration, and 4-way branch migration. We use a nifty time-scale separation to make the process tractable and to find a "condensed" reaction network representation. A sequence-independent kinetics model for domain-level reaction steps is developed, founded on rule-of-thumb biophysics, that (imperfectly, but reasonably) predicts reaction kinetics for a wide range of systems taken from the literature.

[ Article in Royal Society Interface 17: 20190866 (28 pages, June 2020): article, 2.3 MB. ]
[ Find Python code on GitHub and try it yourself! ]
[ Verification of Engineered Molecular Devices and Programs (VEMDP) (29 pages, June 2014): workshop article, 1.1 MB, also on arXiv ]
[ Tar archive of workshop version of Peppercorn, for historical reference only. ]

Programming and Simulating Chemical Reaction Networks on a Surface. *
Samuel Clamons, Lulu Qian, and Erik Winfree.

When chemical reactions are constrained to a surface, new possibilities arise and new constraints are imposed, relative to when they take place in a well-mixed solution. As proposed in our 2014 paper, DNA nanotechnology provides potential mechanisms for implementing arbitrary CRNs on a surface, and thus the "surface CRN" model can be considered as a programming language for future technologies. Here, we continue our exploration of how to think about programming surface CRNs. You might be interested in how we emulate synchronous cellular automata in two dimensions, despite the asynchronous and pairwise CRN reactions. Or how surface CRNs can implement continously active recurrent digital logic circuits -- giving another way to emulate cellular automata. Or how surface CRNs can be programmed to "grow" complex patterns. Or how we can think of programmable molecules as "robots" and explore ideas from swarm robotics, but at the nanometer scale. This is just a sampling... the most exciting ideas are yet to be discovered... by you?

[ Article in Royal Society Interface 17: 20190790 (May, 2020): article, 1.9 MB. ]
[ An online simulator is available for your pleasure. ]
[ Find Python code on GitHub and run it locally with extra features! ]

Robert's PhD Thesis: Formal design and analysis for DNA implementations of Chemical Reaction Networks. *
207 pages. California Institute of Technology. Submitted January 2020.
Robert F. Johnson. Thesis advisors: Erik Winfree and Lulu Qian.
Robert Johnson's thesis addresses how to translate from abstract models of computation to models of molecular interactions, and to prove the validity of those translations. It can be done in specific cases, yes, but in general it is a rough landscape full of NP-complete and PSPACE-hard questions. On honor of Robert's work, Erik and Lulu wrote haikus on the subject of proofs and impossibility and molecules. Translations to Japanese have been generously provided by Mizuho Hoshi (星瑞穂). Note, however, that quality translation of haikus is similarly fraught with peril. Success requires a leap of some kind.

Impossible proofs make an invisible bridge to what can be done	証明の橋は掛かれり見えずとも	Sho-u-me-i-no Ha-shi-wa-ka-ka-re-ri Mi-e-zu-to-mo
An unbridged gap awaits molecules to jump therein lies the proof	証明す橋なき川を跳ぶ分子	Sho-u-me-i-su Ha-shi-na-ki-ka-wa-wo To-bu-bu-n-shi

[ PhD thesis, 2.9 MB; Caltech ETD.]

Stochastic Chemical Reaction Networks for Robustly Approximating Arbitrary Probability Distributions. *
Daniele Cappelletti, Andrés Ortiz-Muñoz, David F. Anderson, and Erik Winfree

You've heard them say, "it's a noisy business down there", and you looked at the inner workings of the cell, and yup, that's what you saw in the thick thermal bath. But Asimov told us that any sufficiently advanced technology is indistinguishable from magic, and SETI researchers remind us that any sufficiently compressed information is indistinguishable from randomness. So perhaps there's more to the noise, you wondered? In order to see the nearly-invisible, you have to know what you are looking for. Here we begin an exploration of how stochastic chemical reaction networks can shape noise with almost arbitrary finesse.

[ article in Theoretical Computer Science (32 pages, online August 2019): PDF, 2.1 MB. ]
[ arXiv preprint v1 (33 pages, October 2018): PDF, 1.1 MB. ]
[ Stanford Neuroscience Talk, April 22, 2019. ]

Chemical Reaction Networks and Stochastic Local Search. *
Erik Winfree.

The world is hierarchical. So it seems. In geography, fractals are key to understanding that. In biology, perhaps P≠NP is the key. Whaaaat? Let me explain. That some functions are hard to compute but easy to verify means that a computationally-limited supervisor can verify the performance of a large number of workers doing hard things. And a bigger boss can verify the performance of a bunch of supervisors. From molecules to cells to organs to organisms, even from students to professors to deans to presidents, P≠NP makes hierarchical organization feasible, effective, and naturally self-repairing. In this paper, I argue that "programming by recognition" -- a computational architecture founded on the P≠NP distinction -- is natural for stochastic chemical systems and puts all that randomness to good use. The case is not entirely convincing, but hopefully will stimulate further thought.

[ DNA Computing and Molecular Programming (DNA25) proceedings, Lecture Notes in Computer Science (LNCS), Volume 11648, August 2019, pp 1-20 (20 pages) pdf, 3.8 MB. ]
[ Mathematica notebooks. ]
[ A seminar I gave at the University of Washington. ]

Efficient Parameter Estimation for DNA Kinetics Modeled as Continuous-Time Markov Chains. *
Sedigheh (Nasim) Zolaktaf, Frits Dannenberg, Erik Winfree, Alexandre Bouchard-Côté, Mark Schmidt, Anne Condon.

Nucleic acid secondary structure rearrangements form the basis for a wide range of natural phenomena as well as an incredible diversity of engineered molecular machines. Improving the accuracy of secondary structure kinetics models will be of enormous benefit both for analysis and for design. One approach is to fit model parameters to experimental data, but when the model is a stochastic Monte Carlo simulator like Multistrand, this can be prohibitively time-consuming. Here we explore some approaches for speeding up parameter estimation. Can we develop even better techniques?

[ DNA Computing and Molecular Programming (DNA25) proceedings, Lecture Notes in Computer Science (LNCS), Volume 11648, August 2019, pp 80-99 (20 pages) pdf, 1.5 MB. ]
[ GitHub page for the associated software. ]

Reversible Computation Using Swap Reactions on a Surface. *
Tatiana Brailovskaya, Gokul Gowri, Sean Yu, Erik Winfree.

How many times have you been exhorted to do more with less? It's not a bad goal. For well-mixed chemical reaction networks, no reaction does "less" than A+B → B+A. In fact, it does absolutely nothing. So how can you do more with it? What these three undergraduates realized (much to my surprise) is that reactions like A+B → B+A are not so ineffective if they are operating confined to a surface. Now, the reaction means that A swaps positions with B. How much more can you do with that? For starters, you can build arbitrary feedforward Boolean logic circuits. It almost seems like getting something from nothing.

[ DNA Computing and Molecular Programming (DNA25) proceedings, Lecture Notes in Computer Science (LNCS), Volume 11648, August 2019, pp 174-196 (23 pages) pdf, 1.5 MB. ]

Diverse and robust molecular algorithms using reprogrammable DNA self-assembly. *
Damien Woods, David Doty, Cameron Myhrvold, Joy Hui, Felix Zhou, Peng Yin, Erik Winfree.

Hao Wang was right, although he was most famous for being wrong. Back in the early 1960s he was investigating mathematical logic from the perspective of automatic theorem proving, which entailed going head to head with Gödel's incompleteness and Turing's uncomputability. Looking for simpler cases to establish the line between solvable and unsolvable logics, he saw a connection between a subset of mathematical logic and a problem in mathematical geometry: determining whether a finite set of tiles could be used, with repetition, to tile the entire plane. He famously conjectured that any set of (what are now called) Wang tiles -- square tiles with matching rules on each side -- either tile the plane periodically, or else there is no infinite tiling with complete coverage. If that were true, his subset of mathematical logic would be decidable. But it wasn't true! Shocking him and many others, his student Robert Berger proved that there are tile sets that can tile the plane, but only aperiodically. Shortly thereafter, Roger Penrose came up with his beautiful aperiodic tilings using kites, darts, and other shapes. While periodic tilings were well-established as a mathematical basis for understanding real crystals, not many believed that the Penrose tilings had anything to do with real materials -- until 1984, when Dan Shechtman discovered quasicrystals in an aluminum-manganese alloy! Well, then, what about Hao Wang's other tilings? Perhaps most intriguingly, Wang had designed tiles that, if started with an initial "seed", directly simulate the behavior of a Turing machine. Does this "algorithmic order" within tilings also reflect a way that real molecules can self-organize? Yes, it seems to be so.

[ Nature, volume 567, pages 366-372 (March 21, 2019): article PDF (2.7 MB, 7 pages, corrected), Supp. info. (main) (47 MB, 140 pages, corrected), Supp. info. (sequences) (412 KB, 13 pages), Supp. info. (AFM images) (36 MB, 5 pages), related code and data ]
[ Media: Caltech, UC Davis, INRIA (English), INRIA (French), Irish Times, The California Aggie, Chemistry World, Physics World, Wired, IEEE Spectrum ]
[ Interviews: Maynooth University, Northern Sound Radio, Futureproof with Jonathan McCrea ]
[ Artwork: A carpet of algorithms ]
[ It has come to our attention... that our IBC model has remarkable similarities to the FPGA architecture proposed in US Patent 6133788B1. ]

Automated sequence-level analysis of kinetics and thermodynamics for domain-level DNA strand-displacement systems. *
Joseph Berleant, Christopher Berlind, Stefan Badelt, Frits Dannenberg, Joseph Schaeffer, and Erik Winfree.

Coarse-graining is the act of lumping together states in a detailed model to obtain a more abstract model that -- if the coarse-graining is good -- can be used to accurately predict important behaviors of the original model. Design is the act of creating a detailed implementation based off of an abstract specification -- and if the implementation is good, coarse-graining will bring you right back to the abstract specification you started with (or somewhere close). Here we consider a framework for quantitatively evaluating such relationships in the context of DNA strand displacement systems.

[ Journal version in Royal Society Interface 15: 20180107 (December, 2018): article, 1.2 MB. ]
[ Code on GitHub, including link to pre-installed Amazon Machine Image that you can run out-of-the-box. ]
[ ERRATA: There is an extraneous reaction arrow in Figure 10b. Oops. ]

Effective design principles for leakless strand displacement systems. *
Boya Wang, Chris Thachuk, Andrew D. Ellington, Erik Winfree, and David Soloveichik.

Some theory is too good to be true. Lots of theory seems too good to be true. The most fun part of experimental work is to figure out which is which. In this paper, we make the remarkable discovery that our theory for how to design leakless DNA strand displacement cascades only seems too good to be true. But at it's core, it's the stuff of reality.

[ PNAS, Published online before print December 13, 2018 doi: 10.1073/pnas.1806859115: article PDF (3.9 MB, 10 pages), Supp. info. (27 MB, 26 pages). ]

Optimizing Tile Set Size While Preserving Proofreading with a DNA Self-assembly Compiler. *
Constantine G. Evans and Erik Winfree.

Can you accomplish the same task with less? That's the game here. Given an algorithmic tile set that performs well -- perhaps because it uses proofreading principles -- can you find a smaller tile set that does just as well? Fewer tiles, fewer glues. You might make the infeasible feasible again. Our DNA tile set compiler, Alhambra, can do this now.

[ DNA Computing and Molecular Programming (DNA24) proceedings, Lecture Notes in Computer Science (LNCS), Volume 11145, October 2018, pp 37-54 (18 pages) pdf, 1.2 MB. ]

Verifying Chemical Reaction Network Implementations: A Bisimulation Approach. *
Robert F. Johnson, Qing Dong, and Erik Winfree

The question of whether a molecular implementation is "for all intents and purposes" equivalent to the intended abstract specification is a fundamental question for molecular programming. Building on earlier work by Seung Woo Shin and Qing Dong, we show that bisimulation provides a surprisingly natural framework assessing the correctness of proposed implementations of formal chemical reaction networks (CRNs). While CRN bisimulation can be easily applied in many circumstances of interest, and an algorithm for doing so is frequently effective, we show that in general variants of the problem of finding and checking bisimulation are NP-complete or PSPACE-complete in this context. Differences with the alternative theory of pathway decomposition are also discussed.

[ article in Theoretical Computer Science (44 pages, 2019 (online January 2018)): PDF, 1.9 MB. ]
[ DNA Computing and Molecular Programming (DNA22) proceedings, Lecture Notes in Computer Science (LNCS), Volume 9818, September 2016, pp 114-134 (21 pages) pdf, 393 KB ]

Verifying Chemical Reaction Network Implementations: A Pathway Decomposition Approach. *
Seung Woo Shin, Chris Thachuk, Erik Winfree.

What does it mean for two chemical reaction networks to be logically equivalent? The answer is not as obvious as it may seem -- in fact, we still can't give a fully satisfactory answer. There seem to be many possible notions one could entertain, each with its own strengths and weaknesses. Here we present a refinement of the ideas from Seung Woo's Masters Thesis that focus on the smallest coherent sequences of reactions out of which all other sequences can be built. A strength of the theory is that it can handle the "delayed choice" phenomenon, wherein a single species in a molecular implementation has already committed to participation in a reaction, but not yet fully committed to exactly which one. Combined with bisimulation ideas from Qing Dong's Masters Thesis, we hope that this theory can address most, if not all, methods to implement chemical reaction networks using DNA strand displacement systems -- and perhaps more.

[ article in Theoretical Computer Science (30 pages, first online 2018; in print vol 765, 2019, pp 67-96): PDF, 1.1 MB. ]
[ arXiv preprint v2 (40 pages, May 2017): PDF, 718 KB. ]
[ arXiv preprint v1 (21 pages, November 2014): PDF, 365 KB. ]
[ Verification of Engineered Molecular Devices and Programs (VEMDP) (23 pages, June 2014): workshop article, 298 KB ]
[ Note: The VEMDP version introduces a notion of "futile loops", not present in the MS thesis theory. The arXiv and TCS versions revert to the theory without futile loop elimination, due to a difficulty proving that our basis-finding algorithm terminates. The proof of Theorem B.1 in the VEMDP version is false as stated. ]

Enzyme-free nucleic acid dynamical systems. *
Niranjan Srinivas, James Parkin, Georg Seelig, Erik Winfree, and David Soloveichik

In days gone by, did you ever build electrical circuits on an old-fashioned breadboard? Maybe you plugged in a few capacitors, a few resistors, an inductor, and transistor -- and voila! an AM radio! The wonderful thing was that if you could draw the circuit and analyze the equations, you could build it. And with a little tuning, it would work. Will biochemistry ever be that straightforward? We take a small step, by showing that the standard chemical reaction network formalism can serve as a programming language for DNA strand displacement systems. We provide a CRN-to-DNA compiler and build a three-reaction oscillator experimentally. What will you build?

[ Science online article (10 pages, December 15, 2017): article, 1.6 MB and supplementary, 12 MB. ]
[ bioRxiv preprint (21 pages, May 2017): article, 2.5 MB and supplementary, 10 MB. ]
[ Github page for the Piperine compiler ]
[ Media: UT Austin press release, Scientific American, Kurzweil AI Network ]
[ YouTube: 5-minute overview ]

A cargo-sorting DNA robot. *
Anupama J. Thubagere, Wei Li, Robert F. Johnson, Zibo Chen, Shayan Doroudi, Yae Lim Lee, Gregory Izatt, Sarah Wittman, Niranjan Srinivas, Damien Woods, Erik Winfree, Lulu Qian.

Science is often blown forward by the wind of dreams. Can you imagine a mail room smaller than a single virus? Can you imagine a box of mail packages, each being just a single molecule? Can you imagine a single molecule robot wandering around this room, picking up molecular packages and placing each one in the right molecular bin, according to the zip code on its address? Can you imagine a cargo-sorting DNA robot? We dreamed this dream seven years ago. It wasn't particularly realistic, then. But now it is real.

[ Science: 357: eaan6558, 15 September 2017 (1+9 pages) perspective, 490 KB, article, 2.2 MB and supplementary, 12 MB. ]
[ Media: Caltech press release, Science podcast, Los Angeles Times, ABC News Australia, The Scientist, Communications of the ACM, and more. ]
[ The early days of the project in the hands of a talented BIOMOD team. ]

Inferring Parameters for an Elementary Step Model of DNA Structure Kinetics with Locally Context-Dependent Arrhenius Rates. *
Sedigheh (Nasim) Zolaktaf, Frits Dannenberg, Xander Rudelis, Anne Condon, Joseph M. Schaeffer, Mark Schmidt, Chris Thachuk, Erik Winfree

Models that capture a substantial fraction of the known thermodynamics of multistranded DNA molecules at the analytically-tractable secondary structure level, such as NUPACK, have proven invaluable for the analysis of both biological and artificial nucleic acid systems. Extending such models to predict the kinetics of secondary structure rearrangements and interactions between molecules, as done for example in the Multistrand simulation, has the potential to address a wider range of phenomena. However, thermodynamics does not dictate kinetics -- there is an infinite family of kinetic models that are perfectly consistent with any given thermodynamic model -- and therefore the accuracy of naive kinetics models is limited. Here, we propose a parameterization of multistranded secondary structure kinetics that is based on the Arrhenius model for elementary base-pairing changes, and we prototype a Bayesian Markov Chain Monte Carlo (MCMC) inference method to obtain an ensemble of improved parameter sets -- resulting in markedly increased accuracy when evaluated on a database of experimentally-measured kinetics rates culled from the literature.

[ DNA Computing and Molecular Programming (DNA23) proceedings, Lecture Notes in Computer Science (LNCS), Volume 10467, August 2017, pp 172-187 (16 pages) pdf, 695 KB. ]

Chemical Boltzmann Machines. *
William Poole, Andrés Ortiz-Muñoz, Abhishek Behera, Nick S. Jones, Thomas E. Ouldridge, Erik Winfree, Manoj Gopalkrishnan.

What is the difference between noise and information? What distinguishes stochasticity and thinking? These concepts may seem like opposites, but from a certain perspective, they are closely related. Here we present several theoretical constructions for chemical reaction networks whose stochasticity embodies meaningful information, and whose response to input constitutes perfect Bayesian inference.

[ DNA Computing and Molecular Programming (DNA23) proceedings, Lecture Notes in Computer Science (LNCS), Volume 10467, August 2017, pp 210-231 (22 pages) pdf, 1.1 MB. ]
[ arXiv preprint (21 pages, July 2017): PDF, 2.6 MB. ]
[ Stanford Neuroscience Talk, April 22, 2019. ]

A General-Purpose CRN-to-DSD Compiler with Formal Verification, Optimization, and Simulation Capabilities. *
Stefan Badelt, Seung Woo Shin, Robert F. Johnson, Qing Dong, Chris Thachuk, Erik Winfree.

What does it mean to compile to molecules? To some, this just means that a computer was used design the molecules, somehow. To others, a more rigorous process is implied: that the intended design is specified by a "high level" formal language, and that a systematic process is used to translate the design into the "low level" molecular construction. Here, we go further: the high-level specification language and the low-level implementation language must each have a semantics -- that is to say, the intended/expected behavior of a given system must be well-defined based on its description -- and more importantly, the behavior of the implementation produced by the compiler must come with a guarantee that it is effectively the same as the specification. These features are prototyped by the Nuskell compiler, which translates formal chemical reaction networks into domain-level DNA strand displacement systems.

[ DNA Computing and Molecular Programming (DNA23) proceedings, Lecture Notes in Computer Science (LNCS), Volume 10467, August 2017, pp 232-248 (17 pages) pdf, 1.0 MB. ]

Physical principles for DNA tile self-assembly. *
Constantine G. Evans and Erik Winfree

We do our best to provide a gentle but solid introduction to some key physical principles for DNA tile self-assembly, including algorithmic growth, error rates, proofreading, and nucleation. We discuss how an understanding of these issues allows one to navigate from abstract models of tile assembly, such as the aTAM, to more realistic models of tile assembly, such as the single-crystal kTAM and the mass-action kTAM. Finally, we outline a "unified" model of tile self-asssembly that helps clarify the relationships between different levels of abstraction.

[ Chemical Society Reviews, DOI:10.1039/C6CS00745G (22 pages, May 2017): article with corrections, 4.9 MB (without corrections). ]
[ Erratum: There is a typo in equations in the paragraph below Figure 8 of the uncorrected version. ]

Time Complexity of Computation and Construction in the Chemical Reaction Network-Controlled Tile Assembly Model. *
Nicholas Schiefer and Erik Winfree

In earlier work, we proposed a theoretical model that combines tile self-assembly and chemical reaction networks, showing that fabrication tasks and computation tasks can be performed more efficiently than in either previous model, when measuring in terms of space used and program size. Here, we show that the CRN-TAM is also as fast or faster. In particular, we show that a class of search problems can be solved in polynomial time, using exponential space -- i.e. the CRN-TAM can efficiently use the inherent parallelism of chemistry for this class of problems. (Full proofs will appear in an expanded journal version of the paper, "soon".)

[ DNA Computing and Molecular Programming (DNA22) proceedings, Lecture Notes in Computer Science (LNCS), Volume 9818, September 2016, pp 165-182 (18 pages) pdf, 238 KB ]

Determining hydrodynamic forces in bursting bubbles using DNA nanotube mechanics. *
Rizal F. Hariadi, Erik Winfree, and Bernard Yurke.

"To see a World in a Grain of Sand / And a Heaven in a Wild Flower" -- William Blake.

And so he looked at a tiny bubble
bursting on the surface of an infinite ocean.
Within it, molecules, their world torn asunder.
And in that vigor,
and in that endless churning,
the origin of life.
We followed him deep into this vision.

[PNAS, Published online before print October 26, 2015 doi: 10.1073/pnas.1424673112: article PDF (1.0 MB, 10 pages), article PDF+SI (1.8 MB, 35 pages). ]

Leakless DNA Strand Displacement Systems. *
Chris Thachuk, Erik Winfree, and David Soloveichik

DNA strand displacement systems are powerful in theory, capable of simulating arbitrary formal chemical reaction network dynamics. In practice, they are plagued by problems, such as leak reactions where the outputs of an intended reaction are released -- to some degree -- even in the absence of the intended inputs. Here we present a systematic approach, involving what we call "double long domains", that aims to eliminate leak reactions.

[ DNA Computing and Molecular Programming (DNA21) proceedings, Lecture Notes in Computer Science (LNCS), Volume 9211, July 2015, pp 133-153 (21 pages) pdf, 790 KB ]
[ Erratum: Figure 2a of the LNCS paper has a labeling error in the lower lefthand waste molecule. The top domains should be X2 Y1, not X1 X2. ]

Universal Computation and Optimal Construction in the Chemical Reaction Network-Controlled Tile Assembly Model. *
Nicholas Schiefer and Erik Winfree

The abstract chemical reaction network (CRN) model allows for the specification of complex dynamical behaviors in a well-mixed solution. CRN programs have a systematic implementation as DNA strand displacement cascades. The abstract tile assembly model (aTAM) allows for the specification of complex self-assembly processes within a single growing crystal. aTAM programs have a systematic implementation as DNA tile sets. The CRN-TAM provides a "minimal" integration of these two models, allowing CRN reactions to produce tiles, and allowing tile assembly steps to send signals back to the CRN. Although a compelling implementation is not yet available, we show that the CRN-TAM can do things neither previous model can do alone -- in particular, we show that concise CRN-TAM programs can ("optimally") construct arbitrary algorithmically-defined objects, without the sometimes-dramatic scale-up required in the aTAM.

[ DNA Computing and Molecular Programming (DNA21) proceedings, Lecture Notes in Computer Science (LNCS), Volume 9211, July 2015, pp 34-54 (21 pages) pdf, 432 KB ]

Stochastic Simulation of the Kinetics of Multiple Interacting Nucleic Acid Strands. *
Joseph M. Schaeffer, Chris Thachuk, and Erik Winfree

Multistrand is a simulation package that performs random walks on the secondary structure energy landscape for test tubes of multiple (but not too many!) DNA strands. It has a powerful Python interface that allows setting up complex sets of simulations, as well as powerful analysis methods for making sense of them. (Multistrand was developed as the major component of Joseph's PhD thesis.)

[ DNA Computing and Molecular Programming (DNA21) proceedings, Lecture Notes in Computer Science (LNCS), Volume 9211, July 2015, pp 194-211 (18 pages) pdf, 454 KB ]
[ Erratum: On page 198, there is a sign error; the equation should be ΔG = ΔH - T * ΔS. ]

Niranjan's PhD Thesis: Programming chemical kinetics: engineering dynamic reaction networks with DNA strand displacement. *
245 pages. California Institute of Technology. Submitted June 2015.
Niranjan Srinivas. Thesis advisor: Erik Winfree.
A flute transforms a steady flow of air into an oscillation, a note, a tune. But can a test tube of molecules sing? Yes -- many chemical oscillators are known, transforming a steady decrease of chemical potential into wildly swinging concentrations. One can make a fair argument -- in theory -- that chemical reaction networks can do far more than sing: they can compute, control, and even think. Niranjan has taken a major step in showing that what is possible in theory may indeed be possible also in practice, applying a systematic approach to compile abstract chemical reaction networks into enzyme-free systems of interacting DNA molecules. The specific system he demonstrated is, in fact, a simple oscillator, but the method is general. Niranjan's thesis won Caltech's 2015 Demetriadis-Tsafka-Kokkalis Prize for Nanotechnology.
[ PhD thesis, 15 MB; Caltech ETD.]

Increasing Redundancy Exponentially Reduces Error Rates during Algorithmic Self-Assembly. *
Rebecca Schulman, Christina Wright, and Erik Winfree

Everyone makes mistakes. That applies to molecules too: during self-assembly, for example, sometimes the wrong molecule arrives at the wrong place, and sticks -- resulting in the growth of an ill-formed structure. But there's a solution: double check, triple check, quadruple check. In DNA tile self-assembly theory, there is a natural way to do this, using proofreading tile sets. Here we demonstrate, experimentally, that assembly error rates decrease exponentially with molecular designs that allow increased levels of proofreading.

[ ACS Nano, 2015 (12 pages, online May 12) : article, 2.4 MB; supp. info., 1.5 MB. ]
[ Archive of simulation software, tile sets, and scripts: xgrow-ACS-Nano-2015.tgz. ]

Thermodynamics and kinetics of DNA nanotube polymerization from single-filament measurements. *
Rizal F. Hariadi, Bernard Yurke, and Erik Winfree

Can you make a career of watching the grass grow? Probably. But we prefer watching nano-grass -- or, to be more precise, nanotubes self-assembled from DNA tiles. By watching how fast they grow, or how fast they shrink, as a function of temperature and concentration, we were able to extract the association rate constant as well as the enthalpy and entropy of binding. As usually happens when you watch closely and think carefully, we also observed several other interesting phenomena... enjoy the movies!

[ Chemical Science, 2015 (16 pages, online February 20) : article, 1.5 MB; supp. info., 11 MB. ]
[ movie 1: side-to-side bundling, 645 KB; movie 2: annealing and coalescence, 899 KB; movie 3: depolymerization, 877 KB; movie 4: polymerization, 2 MB. ]

Parallel and scalable computation and spatial dynamics with DNA-based chemical reaction networks on a surface. *
钱璐璐 (Lulu Qian) and Erik Winfree

Abstract chemical reaction networks (CRNs) provide an all-encompassing formal language for modelling well-mixed chemical systems involving a finite number of species. Showing that arbitrary CRNs can in principle be implemented with DNA strand displacement cascades [*] was a major step toward proving the generality and universality of pure-DNA systems. However, well-mixed systems are far less powerful than molecular systems that exploit spatial geometry to encode information combinatorially. While I know of no all-encompassing formal language for molecular machines with geometry, a very broad class of systems can be described as discrete chemical reaction networks wherein each molecule is localized on a surface at grid points, and asynchronous reactions mediate movement and state change. Here we suggest a pure-DNA implementation using a combination of 3-way and 4-way branch migration.

[ DNA Computing and Molecular Programming (DNA20) (18 pages, June 2014) conference preprint, 228 KB ]
[ Lecture Notes in Computer Science (LNCS), Volume 8727, 2014, pp 114-131 (18 pages) pdf, 856 KB ]
[ Sam Clamons wrote an awesome simulator for surface CRNs. ]

Constantine's PhD Thesis: Crystals that count! Physical principles and experimental investigations of DNA tile self-assembly. *
91 pages. California Institute of Technology. Submitted June 2014.
Constantine G. Evans. Thesis advisor: Erik Winfree.
Counting is among the most fundamental computing primitives. One might say that clocks made of gears and springs -- and astronomical calculators such as the Antikythera mechanism of 100 BC -- were the logical precursors of more powerful computing machines like Babbage's difference engine and even the mechanical cash registers that were used for counting customers' purchases through the middle of the twentieth century. Biology, too, uses clocks and counters in a variety of forms: to step through the cell cycle, to grow each human a full set of 24 ribs, to make sure that no duckling gets left behind. How small, and how simple, can a counter be? Constantine has coaxed a set of 22 DNA molecular tiles to count to 31, in binary, as they self-assemble into a roughly 2 x 80 x 900 nanometer crystal. This must be the precursor to something!
[ PhD thesis, 26 MB; Caltech ETD.]

Diversity in the dynamical behaviour of a compartmentalized programmable biochemical oscillator. *
Maximilian Weitz, Jongmin Kim, Korbinian Kapsner, Erik Winfree, Elisa Franco and Friedrich C. Simmel.

Small is different. It's not different mechanistically, of course, but understanding it can be different because many of the simplifying assumptions we use to understand large systems -- systems with bijillions of copies of every type of molecule where macroscopic effects arise from the average case -- no longer work so well. Here we make tiny water-in-oil droplets, about the size of a cell, that contain a programmable biochemical oscillator. Because of stochastic effects, every droplet behaves differently. Can you guess why? We thought we knew, but our first guess was totally wrong. Read on...

[ Nature Chemistry doi:10.1038/nchem.18694 (8 pages, February 16, 2014): cover article, 7.4 MB and supp. info., 24.4 MB. ]
[ A movie of oscillating droplets (more on the journal supp. info. page) ]
[ TUM press release, UCR press release, Caltech press release, reported also in Phys.Org, AZoNano.com, Science 2.0, Science Daily, and elsewhere. ]
[ A painting inspired by this work. ]

Programmable chemical controllers made from DNA. *
Yuan-Jyue Chen, Neil Dalchau, Niranjan Srinivas, Andrew Phillips, Luca Cardelli, David Soloveichik, and Georg Seelig.

When David and Georg realized that DNA strand displacement cascades can be designed to approximate arbitrary formal chemical reaction network (CRN) dynamics, that was theoretically exciting. Now that Yuan and collaborators have successfully demonstrated an example of Luca's version of the CRN-to-DNA implementation scheme in the laboratory, things are getting really exciting!

[ Nature Nanotechnology 8: 755-762, 2013 (8 pages, September 29): article, 568 KB and supp. info., 4.2 MB ]
[ UW Press Release and Nature Nanotechnology News & Views ]

On the biophysics and kinetics of toehold-mediated DNA strand displacement. *
Niranjan Srinivas, Thomas E. Ouldridge, Petr Šulc, Joseph M. Schaeffer, Bernard Yurke, Ard A. Louis, Jonathan P. K. Doye, and Erik Winfree.

Toehold-mediated DNA strand displacement is at the heart of many dynamic DNA nanotechnology devices. And it works much much better than we thought it ought to. So we asked why? It all comes down to the relative rates of bimolecular collisions, zipping and fraying of the double-helix, branch migration steps -- and a previously unknown thermodynamic cost for branch migration intermediates. So now we can say that toehold-mediated DNA strand displacement works just about as well as it should. And we have insights about how to make it work better yet.

[ Nucleic Acids Research doi: 10.1093/nar/gkt801 (18 pages, September 9, 2013): article, 5.8 MB and supp. info., 1.8 MB ]

DNA Sticky End Design and Assignment for Robust Algorithmic Self-assembly. *
Constantine G. Evans and Erik Winfree.

Theories of the logic and kinetics of algorithmic self-assembly make many idealizations that eliminate complexities and clarify essential insights. But those complexities are still there when one tries to create self-assembling systems in the laboratory. Which ones are most important, what effects do they have, and how can one design molecules and systems to minimize assembly errors? Examining these questions from both biophysical and combinatorial angles lead us to a DNA sequence design algorithm that may perform orders of magnitude better than previous methods.

[ DNA Computing and Molecular Programming LNCS 8141: 61-75 (2013) (15 pages, September, 2013): article, 821 KB ]
[ Code can be found here. ]

Rizal's PhD Thesis: Non-equilibrium Dynamics of DNA Nanotubes. *
218 pages. California Institute of Technology. Submitted March 2011, final revision August 2013.
Rizal F. Hariadi. Thesis advisor: Erik Winfree; co-advised by Bernard Yurke.
What can Physics tell us about Life? It's not easy to say, as physics thrives on simplicity, while biology seems to thrive on complexity. But behind any complex phenomenon lies a simple set of driving principles, we like to think. In his thesis, Rizal explores some of the most fundamental principles underlying cellular function -- the self-organization and dynamical behavior of filamentous polymers such as those in the cellular cytoskeleton -- by designing, synthesizing, and quantitatively characterizing DNA nanotubes that self-assemble, grow, shrink, fragment, and (almost) exhibit treadmilling. In these lifeless pieces we see fragments of the soul of life.
[ PhD thesis, 27 MB (better @ 106 MB); Caltech ETD.]

Integrating DNA strand-displacement circuitry with DNA tile self-assembly. *
David Yu Zhang, Rizal F. Hariadi, Harry M. T. Choi, and Erik Winfree.

DNA tile self-assembly provides a molecular architecture for algorithmically programming the growth of complex but static geometrical structures with molecular precision. DNA strand displacement circuitry provides a molecular architecture for algorithmically programming the temporal dynamics of well-mixed solutions by design of chemical reaction pathways. What might be achieved by integrating them to enable programmable control over spatial and temporal self-organization simultaneously? Here, we use an upstream strand-displacement catalytic circuit to control the timing of a downstream tile-assembly system to isothermally grow DNA nanotubes.

[ Nature Communications 4: 1965, 2013 (10 pages, June 12): article, 1.4 MB and supp. info., 3.5 MB, with supp. movies #1, 9.4 MB, #2, 9.4 MB, #3, 13.1 MB, #4, 8.9 MB]

Sungwook's PhD Thesis: Beyond Watson and Crick: Programming the Self-Assembly and Reconfiguration of DNA Nanostructures Based on Stacking Interactions. *
144 pages. California Institute of Technology. Submitted May 2013.
Sungwook Woo. Thesis advisor: Paul W.K. Rothemund.
Central to DNA nanotechnology is the Watson-Crick base pair. The exquisite specificity, predictable strength, and combinatorial diversity of hybridization reactions based on complementary DNA sequences is what enables us to create sophisticated molecular programs. But what other bases for complementary binding interactions might there be? Might we be able to construct a new type of bonding with the specificity and combinatorial diversity of DNA hybridization, but which has new and different properties, perhaps allowing for easier reconfiguration of self-assembled nanomachine parts? In his PhD thesis, Sungwook starts from the simple observation that DNA origami stick together at their edges via blunt-end stacking, and constructs and studies a new system for combining origami based on binary- and shape-coded stacking interactions. Sungwook applies stacking interactions towards two different goals: the creation of large two-dimensional origami crystals on surfaces, and the origanization of expanding protein filaments to create large-scale self-assembled geometries. Just as strand displacement enabled a whole host of molecular programs which were unavailable to simple equilibrium DNA hybridization, perhaps programmable stacking bonds will enable a class of new molecular programs which undergo large scale geometric rearrangement.
[ PhD thesis, 46 MB (better @ 82 MB); Caltech ETD.]

Active Self-Assembly of Algorithmic Shapes and Patterns in Polylogarithmic Time. *
Damien Woods, Ho-Lin Chen, Scott Goodfriend, Nadine Dabby, Erik Winfree, and Peng Yin.

DNA nanotechnology provides the tools for engineering ``smart'' molecular motors that not only move from place to place, but can change their actions based on sensing molecules nearby. Molecular robots, if you will. But don't imagine a single molecular robot. Imagine a swarm of them. Walking on top of each other like bees or ants. What power is there in numbers? What power in the ability to move? On top of one another? Here we develop a theoretical model -- with roots in algorithmic tile self-assembly and L-systems and graph automata -- that can be used to explore what systems of interacting molecular robots can accomplish. As a hint of what's to come, here we examine the fabrication task: build an algorithmically-specified object. We show that our theoretical molecular robots (called ``nubots'') can fabricate objects exponentially faster and more compactly than can be done by passive self-assembly systems such as tile assembly.

[ Pre-publication draft on arXiv: arXiv:1301.2626v1 article, 5.1MB (39 pages). ]
[ Presented at ITCS 2013: Innovations in Theoretical Computer Science. Berkeley, California, pages 353-354, January 10-12, 2013.]
[ Keynote slides from a presentation at SODA 2014.]

Qing's MS Thesis: A bisimulation approach to verification of molecular implementations of formal chemical reaction networks. *
49 pages. SUNY Stony Brook. Submitted December 2012.
Qing Dong. Thesis co-advisors: Steven Skiena and Erik Winfree.
Looking at the world through rose-colored glasses doesn't make everything go right when it is destined to go wrong. That's exactly why it can be used as a technique to verify whether a designed system of molecules -- as described by a detailed and complicated model -- is a correct implementation of the simple and abstract reaction network that you wanted to create. If you can find a mathematical interpretation (the "glasses") that "colors" every molecular species in the implementation, such that every reaction in the detailed model "looks like" a reaction in the abstract specification, then you're good to go. Roughly. Read the thesis for many subtleties.
[ thesis, 394KB; SUNY Stony Brook thesis depository (not yet up). ]
Note that this is a different approach to the same problem tackled in Seung Woo Shin's thesis. Both approaches have their merits!
[Tar archive of Nuskell, the CRN compiler and verifier: Qing's thesis version combined with Seung Woo's code base.]

Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting. *
Niranjan Srinivas, Andreas Krause, Sham Kakade, and Matthias Seeger.

Before he joined the DNA group, Niranjan was a skilled fighter in the world of bandits. They were an unruly and unpredictable bunch, some with eye patches, some with just one arm -- some with many arms! But Niranjan learned their ways, and figured out how to get the best of them. With these types, you have to poke and prod them a bit -- but be wise as to what you know and don't know about how they'll respond. Always aim to hit the place that might be the best spot. And when you see what happens, revise your thinking about what they could do. It's surprising effective that simple strategy can be. Posed mathematically by a proper Bayesian.

[ IEEE Transactions on Information Theory: vol 58, pp 3250-3265, 2012 (pdf, 16 pages, 1.0 MB). ]
[ Pre-publication draft on arXiv : 0912.3995, 2010 (pdf, 17 pages, 735 KB). ]
[ Conference version in ICML: pp 1015–1022, 2010 (pdf, 8 pages, 400 KB). Winner of the ICML 2020 Test of Time Award. ]

Theory of Algorithmic Self-Assembly. *
David Doty.

Dave wrote a delightful introduction to and review of the theory of tile-based algorithmic self-assembly -- complete with a motivational video!

[ Communications of the ACM: vol 55, pp 78-88, 2012 (pdf, 11 pages, 4.7MB) (video on vimeo) (interview at MIRI) ]

Ensemble Bayesian Analysis of Bistability in a Synthetic Transcriptional Switch. *
Pakpoom Subsoontorn, Jongmin Kim, Erik Winfree.

To attain complexity, one must first master simplicity. That's been the mantra for much of our research engineering cell-free biochemical circuits. Previously we developed in vitro transcriptional systems as a bare-bones model of genetic regulatory networks that required only two essential enzymes. Knowing only how to implement negatively regulated "genelets", we constructed a bistable feedback circuit using two such genelets -- comprising six synthetic DNA oligonucleotides. Here, we show how positive regulation can be achieved, and build a single-switch positive feedback circuit that is similarly bistable. With only four synthetic DNA oligonucleotides. This simplicity encouraged us to characterize and model the system within a Bayesian inference framework.

[ ACS Synthetic Biology: ASAP version, June 26, 2012 (18 pages): article, 4.1MB and SI, 934KB. ]
[ Pre-submission rough draft on arXiv: q-bio/1101.0723 (21 pages): article, 1.4MB under the name Bistability of an In Vitro Synthetic Autoregulatory Switch. ]

Direct Atomic Force Microscopy Observation of DNA Tile Crystal Growth at the Single-Molecule Level. *
Constantine Evans, Rizal Hariadi, and Erik Winfree.

It's nice to see what you're knowing. For over 15 years, our group has been modeling DNA tile self-assembly as a crystal growth process where individual tiles attach at a constant rate (independent of where and how they attach) and detach at a rate that depends exponentially on how many base pairs they are attached by. This model formed the basis for our theoretical investigations, for our experimental designs, and for our confidence that algorithmic self-assembly could work with low error rates. We took it for granted. But we never tested it directly. Now we have.

[ JACS 134: 10485-10492, 2012 (8 pages, June 13): article, 5.2MB and supp. info. ]
[ Media: ChemistryViews. ]

Deterministic Function Computation with Chemical Reaction Networks. *
Ho-Lin Chen, David Doty, and David Soloveichik.

Sometimes a little flexibility makes a huge difference. For example, in Shannon's theory of information, the number of bits required to specify an element of a set with no possibility of error (the deterministic information) can be dramatically larger than the number of bits required if you can tolerate some chance of error, no matter how small (the probabilistic information). This work establishes a perhaps even more dramatic distinction for computation within well-mixed stochastic chemical reaction networks. Whereas it was previously known that, accepting an arbitrarily small probability that the output will be in error, well-mixed stochastic chemical reaction networks can simulate Turing machines, and that in contrast questions of reachability (rather than probability) are decidable, in this work the exact class of functions that can be deterministically computed by well-mixed stochastic chemical reaction networks (with no possibility of error) is elegantly characterized in terms of semilinear functions. This is a huge gap!

[ Journal version: Natural Computing 13: 517--534, 2014 (18 pages): article, 920KB ]
[ Conference version: DNA Computing and Molecular Programming 18: pages 25--42, 2012 (18 pages): article, 401KB ]
[ Pre-publication draft on arXiv: arXiv:1204.4176v1 (15 pages). ]

Robust self-replication of combinatorial information via crystal growth and scission. *

Rebecca Schulman, Bernard Yurke, and Erik Winfree.
What is the simplest chemical self-replicator? Well, that depends on just what you mean by "self-replication". If your interest is ultimately in Darwinian evolution -- and perhaps the origin of life here on Earth or elsewhere -- then an interesting self-replicator must carry combinatorial information that guides its function and can be mutated to explore a vast range of functions. In 1965, Graham Cairns-Smith proposed that crystals -- specifically, polytypic minerals -- may have been the first such chemical self-replicators capable of Darwinian evolution. Here, we experimentally explore the general principles and mechanisms needed for Cairns-Smith's hypothesis, but using DNA tile crystals rather than mineral crystals, and find that they are sound in practice.

[ PNAS, Published online before print April 9, 2012, doi: 10.1073/pnas.1117813109 (6 pages): article, 1.0 MB, supplementary information, 13 MB. ]
[ Caltech press release. ]
[ BBC article providing context. ]

Joseph's PhD Thesis: Stochastic Simulation of the Kinetics of Multiple Interacting Nucleic Acid Strands. *
104 pages. California Institute of Technology. Submitted February 2013.
Joseph's MS Thesis: The Multistrand Simulator: Stochastic Simulation of the Kinetics of Multiple Interacting DNA Strands. *
123 pages. California Institute of Technology. Submitted February 2012.
Joseph Schaeffer. Thesis advisor: Erik Winfree.
Perhaps the central enabling concept for computer engineering is the abstraction hierarchy: a tower of machine models that range from low level details to high level concepts -- and that come with methods for moving from one level to another with provable notions of equivalence. DNA (and RNA) secondary structure kinetics will be near the bottom of the stack for molecular programming, as it is currently envisioned. In his PhD thesis (refined and improved from his Master's thesis), Joseph takes important steps in defining the model for multiple interacting strands, developing an efficient simulator, and providing tools for analysis. This work should provide a strong foundation for future efforts to build the abstraction hierarchy for molecular programming.
[ PhD thesis, 4 MB; Caltech ETD.] [ MS thesis, 4 MB; Caltech ETD.]
[ Multistrand is now publicly released; see the LNCS paper. ]

Parallelism and Time in Hierarchical Self-Assembly. *
Ho-Lin Chen and David Doty.

The abstract Tile Assembly Model examines a single tile-based crystal as it forms from a seed within an inexhaustible bath of free tiles. But in solution, many crystals may grow in parallel, and if they interact with each other by self-assembly, it is natural to think that this parallelism could be exploited to grow well-defined structures much more quickly. Somewhat remarkably, and counter-intuitively, the authors show that if basic elements of chemical kinetics are taken into account (low concentration species encounter each other less frequently than high concentration species) then the parallelism of hierarchical self-assembly provides no advantage at all (caveat, caveat). If you want to build large complex things quickly, look elsewhere.

[ Journal version: SIAM Journal of Computing 46: 661-709, 2017 (49 pages) article, 843KB ]
[ Pre-publication draft on arXiv: arXiv:1104.5226v1 (46 pages). ]
[ SODA 2012: Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1163-1182. article, 987 KB (20 pages). ]

The tile assembly model is intrinsically universal. *
David Doty, Jack H. Lutz, Matthew J. Patitz, Robert T. Schweller, Scott M. Summers, and Damien Woods.

We are all familiar with intrinsic universality. We know there exists a single Turing machine that can simulate any other Turing machine, step for step but with some linear spatial and temporal scaling. We know there exists a single cellular automata rule that can simulate any other cellular automaton, step for step but with some linear spatial and temporal scaling. This gives the models a sense of wholesomeness. They can simulate themselves efficiently; nothing's missing. But the abstract Tile Assembly Model, so dear to my heart, appeared able only to simulate certain subsets of itself -- an awkward incompleteness. Until now!

[ Pre-publication draft on arXiv: cd.DS/1111.3097 (54 pages): article, 1.3 MB ]
[ FOCS 2012: Proceedings of the 53rd Annual IEEE Symposium on Foundations of Computer Science, pp 302--310, 2012, unofficial article, 1.26 MB (54 pages). ]

Program Size and Temperature in Self-Assembly. *
Ho-Lin Chen, David Doty, and Shinnosuke Seki.

Asymptotic upper and lower bounds for the tile assembly complexity of NxN squares are known. But the precise number? Well, it's related to the Kolmogorov complexity of the number N, and that's uncomputable. However, the relation is not tight, and a polynomial time algorithm to find the smallest tile program is known for "temperature" T=2. Here, the authors generalize that result for all temperatures.

[ Pre-publication draft on arXiv: arXiv:1011.3493v2 (16 pages). ]
[ ISAAC 2011: Proceedings of the 22nd International Symposium on Algorithms and Computation, LNCS 7074, pp. 445-453. unofficial article, 514 KB ]

Seung Woo's MS Thesis: Compiling and verifying DNA-based chemical reaction network implementations. *
120 pages. California Institute of Technology. Submitted September 2011.
Seung Woo Shin. Thesis advisor: Erik Winfree.
Compilers are useless if the low-level implementation that they produce as output is not semantically equivalent to the high-level program that was provided as input. Proving a compiler's output to be correct is an essential element of computer science -- and a non-trivial one. Here, Seung Woo tackles this problem in molecular programming, first building a general-purpose compiler for implementing abstract chemical reaction networks using DNA strand displacement cascades, then defining a notion of equivalence for abstract chemical reaction networks, and finally providing a (usually) efficient algorithm for testing whether a compiled network is correct with respect to that notion. The results of testing on a number of systems from the literature rather surprised me.
[ thesis, 1 MB; Caltech ETD. ] [Tar archive of Nuskell, the CRN compiler and verifier: Seung Woo's thesis version.]

Timing molecular motion and production with a synthetic transcriptional clock. *
Elisa Franco, Eike Friedrichs, Jongmin Kim, Ralf Jungmann, Richard Murray, Erik Winfree, and Friedrich C. Simmel.

The brilliance of genetic regulatory networks is that they are programmable manufacturing plants. They can build all sorts of things. Protein motors, self-assembling scaffolds, enzymes, ion channels... they even build biochemical circuits, and transcription factors that regulate what is built, and when. Can chemical engineers learn to build non-biological molecular systems that operate according to similarly powerful principles, as programmable molecular manufacturing plants? As an exercise toward that goal, here we use in vitro transcriptional systems -- a stripped-down model for genetic regulatory networks in which RNA rather than protein plays the important signalling and control roles, with only two essential enzymes needed for RNA production and degradation -- to construct a synthetic biochemical clock that controls downstream nucleic acid machinery.

[ PNAS, 108: E784-E793, 2011 (10 pages): article, 9.7 MB, summary, 1.7 MB, supplementary information, 10.4 MB. ]
[ Highlighted in Biopolymers, reported by HFSP. ]

Neural network computation with DNA strand displacement cascades. *
钱璐璐 (Lulu Qian), Erik Winfree, and Jehoshua Bruck.

At last, we have a small test tube of over a hundred DNA strands, playing games with you, trying to read your mind, acting like a lovely tiny brain.

[ Nature: 475: 368-372, 2011 (5 pages) article, 459KB and SI, 19.8MB ]
[ Check out our YouTube video, Part I (design) and Part II (experiments). ]
[ Also see current and archival media coverage, including the Caltech press release (in Chinese), a Nature News & Views by Anne Condon, a Nature Podcast and an article in IEEE Engineering & Technology. ]
[ Erratum: In both inequalities on page 17 of the SI, the indices are botched. It should be $\sum_j (w^+_{j,i}x^1_j + w^-_{j,i}x^0_j)$ on the LHS.]
[ Erratum: In Fig. S11 of the SI and throughout the section, the "inference score" was calculated to provide a score of 1, in case (2), even in situations where the network didn't reconstruct all of the common bits. Also, in case (3), the network computation was deemed correct if at least one (but not necessarily all) output bits became invalid. In the case of the network experimentally demonstrated in Fig 3, both the looser and stricter criteria agree. ]

Programmable molecular recognition based on the geometry of DNA nanostructures. *
Sungwook Woo and Paul W. K. Rothemund.

Benoit Mandelbrot famously noted that the geometry of natural objects is often fractal -- that self-similar structures appear at many scales, resulting in a fractional dimensional exponent. Self-similarity can appear not only in static objects, but in dynamic behavior patterns, or even in concepts. Here, Sungwook and Paul show that the principles of sequence-specific binding, which we enjoy at the nanometer scale in DNA and RNA, reappear at a ten times larger scale when engineering nanostructures (and eventually nanomachines) using DNA origami.

[ Nature Chemistry: online 10 July 2011 (8 pages): article, 3.5MB and SI, 7.2MB See also Nature Chemistry News and View ; reporting in Nature Methods ]

Scaling up digital circuit computation with DNA strand displacement cascades. *
钱璐璐 (Lulu Qian) and Erik Winfree.

People sometimes say that DNA computing is useless. They're probably just thinking about the wrong application. A postdoc in our lab often wears a T-shirt with a classic XKCD comic. Inspired by the failure of math to explain romance, ( √ ♥ = ? ) we decided to try our latest and greatest DNA technology on the square root problem. It was challenging: designing 130 DNA strands to all work together as a circuit in a single test tube! Not only did it get all the answers right for inputs less than three ( √ < 3 ), but it even computed up to the square root of 15 (rounded down). Is that useless?

[ Science: 332: 1196-1201, 2011 (6 pages) article, 860KB and SI, 5.7MB ]
[ Check out our YouTube video, The Seesaw Magic Book. (In Chinese) ]
[ Try it yourself: we wrote an online compiler that will give you DNA sequences for any digital circuit. ]
[ Also see current and archival media coverage, including the Caltech press release (in Chinese), a Science Perspective by John Reif, and Nature News. ]
[ Erratum: The horizontal axis for Supplementary Figure S2E should be in minutes, not hours. ]
[ Erratum: The experimental data in Supplementary Figures S6B and S6D should be swapped. ]

A simple DNA gate motif for synthesizing large-scale circuits. *
钱璐璐 (Lulu Qian) and Erik Winfree

What is the core of life? Is it, as the poets sometimes say, the ebb and flow of the tides, the in and out of breathing, the ups and downs of fortune? Or is it, as the electrical engineers might say, all in the circuitry, the processing of information, the dynamics of behavior? In this work, we present a simple DNA device, called the "seesaw gate", that should make both poets and engineers happy. It operates by a simple back-and-forth process -- a strand comes in on one side, pushing that side "down" and kicking off the one on the other side as it goes "up". And then visa versa, like a seesaw. Released strands are then free to interact with other seesaw gates, allowing networks to be designed systematically. Remarkably, the ebb and flow of activity in these networks can perform computations of arbitrary complexity in principle -- in fact, we describe a compiler that translates digital logic circuits into functionally equivalent seesaw gate networks, and we argue that the simplicity of the motif should make networks containing thousands of gates possible. If this theoretical proposal pans out experimentally, could it become a core technology for embedding circuitry in synthetic biochemical systems?

[ DNA 14 conference preprint: (13 pages): .pdf, 417 KB. ]
[ Conference version in LNCS 5347 pages 70-89 (2009): .pdf, 414 KB. ]
[ Journal version in Royal Society Interface 8: 1281-1297 (2011): article, 2.1 MB. ]

Synthetic in vitro transcriptional oscillators. *
Jongmin Kim and Erik Winfree.

The first* purely chemical oscillator was discovered a little over 60 years ago. Belousov's detailed paper was rejected twice, appearing later only as an abbreviated conference abstract and as a full paper only posthumously. Thus ingloriously began the scientific study of how chemical circuitry can lead to non-trivial dynamical behavior. Eventually, of course, the study of simple chemical oscillators and complex biological clocks flowered into a rich interdisciplinary field. In this paper, we probe the middle ground by showing how cell-free transcription and degradation can be systematically designed to create a variety of oscillator architectures -- merging the programmability of biology with the simplicity of in vitro reactions.

[ Molecular Systems Biology: 7: 465, 2011 (15 pages): article, 2.6MB and SI, 9.1MB and models.zip, 2.8MB ]
*Leon Glass pointed out to me that Belousov's was not the first discovered. That credit appears to go to W. C. Bray and A. L Caulkins circa 1921. Leon says he tried it, and it works. Thanks for the pointer, Leon!

Remote Toehold: A Mechanism for Flexible Control of DNA Hybridization Kinetics. *
Anthony J. Genot, David Yu Zhang, Jonathan Bath, and Andrew J. Turberfield.

The concept of a "toehold" for initiating branch migration and strand displacement has become central to our thinking about kinetic control in DNA nanomachines and DNA circuits. Toeholds were originally formulated as single-stranded extensions contiguous with the double-stranded domain where branch migration occurs, so that a complementary strand can bind to the toehold and initiate branch migration up to six orders of magnitude faster than in the absense of a toehold. Here, Genot and colleagues show that toeholds need not be contiguous with the double-stranded branch migration domain; localizing the invading strand at a remote site within the target molecule can also serve to significantly accelerate the targeted strand displacement reaction. The resulting design flexibility is essential for some complex DNA nanomachines; furthermore, the intervening region between the toehold and the branch migration domain can be used as a knob to modulate the reaction rate. An idea worth knowing about.

[ JACS: ASAP (6 pages): article, 1.9MB and SI, 600KB ]

Dynamic DNA Nanotechnology using Strand Displacement Reactions. *
David Yu Zhang and Georg Seelig.

This is what happens when you write a really good introduction to your thesis: It metamorphoses into a top notch review article.

[ Nature Chemistry: 3: 103-113, 2011 (11 pages): article, 645KB ]

Cooperative Hybridization of Oligonucleotides. *
David Yu Zhang.

When two Watson-Crick oligonucleotides hybridize to each other, they cancel each other out. This is useful. For example, it allows those oligonucleotides to serve as opposing signals in their single-stranded form. What if you want to build a DNA strand displacement circuit in which two dynamic signals cancel each other out, but the two signals must have unrelated sequences? Well, Dave's got an elegant trick for you. And it gets better. When the signals cancel out, they also release a new strand with a new sequence juxtaposition that can be fed into downstream circuit components. He demonstrates these features by constructing elegant Boolean logic gates and analog concentration comparators. Even more useful!

[ JACS: 133: 1077-1086, 2011 (10 pages): article, 2MB and SI, 360KB ]

Elongational-flow-induced scission of DNA nanotubes in laminar flow. *
Rizal F. Hariadi and Bernard Yurke.

Everyone knows what happens when an astronaut hops out of his spaceship a little too close to a black hole. As he drifts toward the black hole, the gravitational pull becomes stronger and stronger, until it rips him apart, legs from his body, toes from his feet -- even molecules and atoms are eventually rent asunder. Interestingly, it's not the magnitude of the gravitational force that is so disruptive, but the change in gravity with distance, increasing as 1/R². (An equally strong, but constant, force would result in no worse than a relatively pleasant free-fall sensation.) As it is, the astronaut's toes are eventually pulled so much harder than his knees, that the differential force is greater than the forces holding the tissue together -- and pop! Loosely speaking, Rizal and Bernie have built a microfluidic black hole, and rent asunder billions of DNA nanotube astronauts.

[ Physical Review E: 82: 046307, 2010 (11 pages): .pdf, 1.6MB ]
[ Also enjoy this movie of DNA nanotubes being compressed and stretched as they enter and leave a side chamber that was accidentally created in a defectively manufactured microfluidic channel. This elongational flow is not sufficient to break the nanotubes. ]

Dave's PhD Thesis: Dynamic DNA Strand Displacement Circuits. *
290 pages. California Institute of Technology, Defended May 2010.
David Yu Zhang. Thesis advisor: Erik Winfree.
It's like martial arts choreography. Strands flying. Collisions in mid-air. An improbable cascade of events. A quick chop, and what was once one becomes two. Then the silent ones lurking in the shadows suddenly appear, a brief moment of havoc... and, what, you've completed the catalytic cycle? That's right.
[ .pdf, 5.9 MB; Caltech ETD. ]

Optimizing Tile Concentrations to Minimize Errors and Time for DNA Tile Self-Assembly Systems. *
陳和麟 (Ho-Lin Chen) and 高銘揚 (Ming-Yang Kao)

Well, it's Christmas season, and the Grinch isn't giving you any extra tiles. Success is the best revenge, so how do you do the most with what you've got? Remarkably, when you optimize the stoichiometry of the different tile types -- and no, it's not quite as simple as using them in direct proportion to how often they occur in the final assembled shape, but it's almost that simple -- you simultaneously get the best error rate and the best assembly time. So there, Grinch!

[ final: LNCS 6518 pages 13-24, 2011: .pdf, 422 KB. ]

DNA-based Fixed Gain Ampliﬁers and Linear Classiﬁer Circuits. *
David Yu Zhang and Georg Seelig

It's great that an entropy-driven DNA catalyst can amplify a signal to detect the presence of a target nucleic acid molecule. The amplified DNA signal can then trigger some downstream chemical activity, just under the right circumstances. But what if the circumstance you're trying to detect isn't the presence or absence of a specific single species, but rather a specific pattern of analog concentration levels for a spectrum of nucleic acid targets? One option is to implement a threshold for each species, then process those YES/NO answers with digital logic -- AND and OR gates. Another option, explored here, is to first implement analog logic that appropriately combines the signals from each species, then takes a threshold to yield the final YES/NO answer. To do so, Dave and Georg propose a scheme for modifying our entropy-driven catalyst system to serve as a signal multiplier, coupled with a mechanism for taking a difference and effecting a threshold. The resulting neural network-like architecture could be quite handy for chemical pattern recognition.

[ preliminary: DNA 16 conference proceedings, 300 KB, 8 pages. ]
[ final: LNCS 6518 pages 176-186, 2011: .pdf, 275 KB. ]

Efficient Turing-universal computation with DNA polymers. *
钱璐璐 (Lulu Qian), David Soloveichik, and Erik Winfree

You've gotta love those little cartoons of Turing machines, you know, with the robotic handcar rolling along on the track writing zeros and ones all over the place? It's a seminal model of abstract computing machines, sure, but real computers aren't little motorized machines like toy trains. Right? Computers don't even need moving parts -- just a smooth and soundless flow of electrons through a staggering maze of wires carved into a cold piece of silicon. The Turing machine model is great for theorists, but of little use for anything else. Right? Wrong! The Turing machine is a wonderful, practical, implementable model of an autonomous molecular machine that modifies a linear heteropolymer. Charles Bennett realized this many years ago during his studies of the thermodynamics of computing. We don't quite figure out how to design a DNA Turing machine here, but we get pretty close -- we give a theoretical construction for its close cousin, the stack machine.

[ preliminary: DNA 16 conference proceedings, 600 KB, 14 pages. ]
[ final: LNCS 6518 pages 123-140, 2011: .pdf, 407 KB. ]

Simple Evolution of Complex Crystal Species. *
Rebecca Schulman and Erik Winfree

How pervasive is the principle of evolution? How complex must be the seed of life? Here we present our take on the simple origins of complexity, extending Graham Cairns-Smith's ideas: that the logic itself of crystallization can in principle give rise to unbounded complexity in pattern formation -- with very simple crystal repeat blocks, in very simple environments, and without any intrinsic chemical or mechanical function. Perhaps life is logically inevitable.

[ preliminary: DNA 16 conference proceedings, 332 KB, 11 pages.]
[ final: LNCS 6518 pages 147-161, 2011: .pdf, 768 KB. ]

Towards Domain-based Sequence Design for DNA Strand Displacement Reactions. *
David Yu Zhang

Dave describes his motivations and methods for designing the DNA sequences of molecular machines. These rule-of-thumb principles have worked out pretty well in his hands ( see his other publications ) so there must be some merit to them. It's an insightful paper, and practical. Plus, he wrote some quick-and-easy software that he's been using for a few years. Feel free to try it out.

[ preliminary: DNA 16 conference proceedings, 249 KB, 11 pages. ]
[ final: LNCS 6518 pages 162-175, 2011: .pdf, 292 KB. ]

Molecular robots guided by prescriptive landscapes. *
Kyle Lund, Anthony T. Manzo, Nadine Dabby, Nicole Michelotti, Alexander Johnson-Buck, Jeanette Nangreave, Steven Taylor, 裴仁军 (Renjun Pei), Milan N. Stojanovic, Nils G. Walter, Erik Winfree, and 颜颢 (Hao Yan).

How small can a robot be? When I was in school, I built a small robot by taking a remote-control car, adding some light sensors and bump sensors, and installing a tiny on-board computer. I programmed it up to scurry around following walls. Lots of fun. But while I played, researchers in academia and industry were building pint-sized robots that carried out collective tasks in distributed "swarms", or that connected to each other in reconfigurable ways to form "meta-robots". Researchers wanted to make the robots smaller and smaller, so that they could have lots and lots of them, so that they could do more and more interesting things. Some even began using silicon chip fabrication techniques to build MEMS (microelectromechanical systems) robots only millimeters on a side. Once you get that small, there's not really much room for a CPU and memory, much less sensors and motors! How can you program a robot -- or a swarm of robots -- when each one has such limited capabilities? It's a deep and important question. Fast forward to today. Can you make a robot from a single molecule? Can a single molecule sense its environment, make decisions, and take actions? The answer is a resounding yes! But how? Our first baby steps -- born of Milan Stojanovic's seminal work on "smart and agile" deoxyribozyme "spiders" -- are reported here in a collaboration lead by Milan that involved chemical synthesis and characterization at Col umbia, atomic force microscopy at Arizona State University, and real-time single-molecule fluorescence microscopy at Michigan State University.

[Nature 465:206-210, 2010 (5 pages, pdf, 920 KB) and supporting information (pdf, 9 MB). (Also check out a lovely perspective piece in Nature, and a Nature blog, the NSF press release with an interview of Milan, the Caltech press release, a commentary at The Scientist that includes a single-molecule fluorescence movie of spiders from the Walter lab, Chemical & Engineering News, Chemistry World, Popular Science, MIT Technology Review, Physics World, an excellent Discover Magazine blog , a blog, comments from a Biophysical Society Meeting, a discussion of boffinry , and something I don't understand in German. )]

DNA as a Universal Substrate for Chemical Kinetics. *
David Soloveichik, Georg Seelig, and Erik Winfree

If chemistry is programmable, what's the programming language? Well, there are many possibilities... but let's not go with a fad. For well over one hundred years, chemists have been using the elegant and refined mathematical language of mass action chemical kinetics. Traditionally, it has been used descriptively, as a means of describing chemical systems encountered in the real world. As chemical engineers interested in programming chemistry, we wish to use the language of chemical kinetics prescriptively, as a means of describing the behaviors we aim to achieve. It's a rich language, capable of expressing behaviors as varied and sophisticated as stable attractors, oscillators, chaotic dynamics, signal processing and computation -- but much of the evidence for this has come from theoretical studies of chemical reaction networks that are possible in principle, rather than from the study of real systems. That is, for the past hundred years, many intriguing chemical reaction networks existed only in a kind of theoretical dream world, with no physical implementation known. Finally, here we show how any chemical reaction network you dream up can be implemented using systems of DNA logic gates, transforming chemical kinetics from a modeling language into a programming language. Go ahead, make your dreams come true!

[ Full paper in PNAS, 107: 5393-5398, online March 4, 2010 (6 pages): .pdf, 850 KB and supplementary information, 650 KB]
[ Try our Mathematica notebooks for compiling CRNs into DNA gate schemes. To go all the way to DNA sequences you'll need some sequence design tools. ]
[ DNA 14 conference preprint, revised: (10 pages): .pdf, 508 KB. Extended abstract in LNCS 5347 pages 57-69 (2009): .pdf, 582 KB. ]

Robustness and modularity properties of a non-covalent DNA catalytic reaction. *
David Yu Zhang and Erik Winfree.

What makes a molecule tick? There's one way to find out: Poke it. Prod it. Dave thoroughly harrassed his entropy-driven DNA catalyst system and found out a lot that way. A few surprises: * The system is exquisitely sensitive to mutations in the catalyst strand sequence, but relatively insensitive to mutations in the fuel strand sequence. * The system is almost unaffected by a high background of random poly-{A,C,T} strands, but performance rapidly degrades with high concentrations of random poly-{A,C,G,T} strands. These results can be explained in terms of design features of the entropy-driven catalyst system. You'll also find information about modeling, leak reactions, poisoning reactions, strand purity, 5'/3' orientation, and bulky add-ons to the catalyst. The bottom line: this is stuff you'll want to know if you plan on designing robust and modular strand displacement circuits.

[ Nucleic Acids Research: 38: 4182-4197, 2010 (16 pages): .pdf, 1.8MB ]

Time-Complexity of Multilayered DNA Strand Displacement Circuits. *
Georg Seelig and David Soloveichik

Of central importance in the theory of computation -- and in life in general -- is how long it takes to get things done. For standard models such as digital circuits and Turing machines, time complexity is very well understood. Does our understanding of classical computation directly yield results for computation in biochemical circuits? Of course, it depends on how you do things. For example, as shown here, circuits built from stochiometric gates (one output molecule per input molecule) are asymptotically slower than those using catalytic gates (each input molecule can produce many output molecules). This suggests that the time complexity of formal chemical reaction networks is likely to develop into an interesting, rich -- and probably suprising -- theory.

[ article appears in DNA 15, LNCS 5877 pages 144-153, 2009: .pdf, 451 KB. ]

Programmable Control of Nucleation for Algorithmic Self-Assembly. *
Rebecca Schulman and Erik Winfree.

When writing a computer program, it is easy -- so trivial you don't even think about it -- to make sure that your algorithm starts with the correct input and proceeds from there. Models of algorithmic self-assembly usually assume that assembly begins with a "seed tile" which provides the input information. But molecular self-assembly is an inherently parallel process; assembly begins anywhere it is energetically favored. In this paper we show that tile sets can be constructed that introduce arbitrarily large barriers to spurious nucleation (starting self-assembly from something other that the seed tile). By combining this construction with previous methods for proofreading tile sets, we can now construct tile sets that are robust to all major types of errors, without introducing significant slow-down. Control of nucleation also has some very interesting applications, such as enzyme-free detection and amplification, and enzyme-free in-vitro evolution.

[ SIAM Journal on Computing vol. 39, pp. 1581-1616, December 4, 2009, (36 pages) .pdf, 1.5 MB. ]
[ preprint cond-mat/0607317 on arXiv.org (22 pages): .pdf, 728 KB. ]
[ extended abstract in DNA Computers 10, LNCS volume 3384: 319-328, 2005, (10 pages): .pdf, 176 KB. ]

Self-assembly of carbon nanotubes into two-dimensional geometries using DNA origami templates. *
Hareem T. Maune, Si-Ping Han, Robert D. Barish, Marc Bockrath, William A. Goddard III, Paul W. K. Rothemund, Erik Winfree.

Carbon nanotubes are amazing molecules -- rolled up sheets of hexagonal carbon mesh with astounding thermal and electrical properties, they're the nanocircuit engineer's dream. But they're oh so hard to handle! Too small and too slippery to pick up and put them where you want, most researchers either study individual nanotubes, make do with regular arrays of nanotubes, look for chance circuits in randomly scattered piles of nanotubes, or rely on bulk properties of tangled tubes. Here we suggest a way to self-assemble complex nanotube circuits in parallel -- by sticking them in precise locations on DNA origami nanobreadboards. In the future, we envision this technique being expanded from the two-nanotube devices demonstrated in this paper, to multiple nanotube circuits on individual origami, to large-scale nanotube circuits on origami placed on lithographically-patterned surfaces. Dream on!

[ Nature Nanotechnology, 5: 61-66, 2010 (online 8 November, 2009; 5 pages): paper, 1.3 MB, supplementary information, 2.1 MB.]
[ Media: Caltech press release, Eric Drexler's blog. ]

Synthetic Networks. *
Jongmin Kim.

Jongmin's very readable review about principles of network architecture in cells, synthetic networks in vivo, synthetic networks in vitro, design challenges related to enzyme saturation and waste products, and the prospects for constructing a minimal cell.

[ In Automation in Proteomics and Genomics: An Engineering Case-Based Approach, eds. Gil Alterovitz, Roseann Benson, and Marco Ramoni, pp. 251-271, 2009 (21 pages): paper, 2.0 MB. ]

Control of DNA Strand Displacement Kinetics Using Toehold Exchange. *
David Yu Zhang and Erik Winfree.

Computer programs work by transferring control from one line of the program to another. Many DNA programs work by transferring control from one part of a molecule to another. That's what toehold exchange does. It is an essential mechanism in many of our DNA strand displacement circuits. Here we study the device physics of toehold exchange and show that kinetics can be predicted remarkably accurately from the thermodynamics of toehold binding. With this understanding, we can make our DNA programs transition smoothly from step to step.

[ JACS, 131: 17303-17314, 2009 (12 pages): paper, 2.2 MB, supplementary information, 200 KB, addendum, 46 KB.]
[ Erratum: In equation 9 on page 17311, there should be a factor of /RT in the exponent of the numerator. ]
[ Erratum: The flowchart approximation of Figure 8, derived on SI page 13, uses an "average" base pair energy of -1.4 kcal/mol at 25C, which is an underestimate. The true value at 25 C should be -1.7 kcal/mol, and the formulas in the flowchart should be adjusted. The value of -1.4 kcal/mol is a better value for "average" base pair energies at 37C, or for "somewhat weak" base pairs (say 70% A/T) at 25C, in which case the original flowchart holds. ]

Placement and orientation of individual DNA shapes on lithographically patterned surfaces. *

RJ Kershner, LD Bozano, CM Micheel, AH Hung, AR Fornof, JN Cha, CT Rettner, M Bersani, J Frommer, PWK Rothemund, GM Wallraff.
Electrical engineers excel at making things using top-down patterning, such as using lithograph to etch circuits on large wafers of silicon, but it is quite difficult to achieve feature sizes below 20nm. Molecular engineers excel at making things using bottom-up self-assembly, such as using DNA hybridization to fold a virus genome into DNA origami, but it is quite difficult to put all the precisely-constructed molecules in the right places on a large scale. Patterning from 10cm down to 20nm. Patterning from 100nm down to 3Å. A match made in heaven. (Joint Caltech/IBM work.)

[Nature Nanotechnology, 3: 557-561, 16 August, 2009 (5 pages): paper, 784 KB, supplementary information, 9.9 MB, and supplementary movie, 3.3 MB.]
(Comments in the press... but please take them with a grain of salt. Working nanoscale circuits are much much further from reality than some of these would suggest. Let's be clear: the work here doesn't attempt to make functional circuits, it just provides a step toward the solution for how to position DNA origami on a lithographically-patterned substrate. We have no idea when or if this approach will pay off for a commercial application. Anyway: Caltech Press Release, IBM Press Release, guardian.co.uk, BBC, CNET, Wired, EE Times, Discover Magazine )

An Information-Bearing Seed for Nucleating Algorithmic Self-Assembly. *
Robert D. Barish, Rebecca Schulman, Paul W. K. Rothemund, and Erik Winfree.

What is a seed? The tiny seed of a giant sequoia tree, sprouting after the fire. The invisible seed of an idea, from which a thousand possibilities grow. A crystal seed, determining the order of all that grows from it. The seed of man and woman, carrying with it the future of humanity. Clearly, it's important stuff. Why? The seed carries the information, the creative part, the inspiration -- and what follows is mere mechanism, the consequences, the algorithm. In this work, we use DNA origami as a highly effective seed for growing DNA tile crystals. Arbitrary information can be put on the seed; it directs the growth of DNA crystals and determines their morphology... much as a genome determines phenotype of an organism... or even, as an idea creates the future.

[PNAS, 106: 6054-6059, 2009 (6 pages): .pdf, 1.7 MB, supplementary information, 2.6 MB, and appendix, 124 KB. ]
(Comments in the press: Caltech Press Release, New Scientist, Foresight )

Statistical Learning of Arbitrary Computable Classifiers. *
David Soloveichik.

It's possible to learn. But there's a conundrum. If the thing you're trying to learn is really complex, you'll need a really complex model. That means you'll need a lot of data to distinguish between the good models and the bad ones. Can you know when you've got enough data? Remarkably, yes... most of the time. But not until you've seen the data -- a distinction that explains why other researchers (with different learning assumptions) have claimed that this is impossible.

[arXiv preprint: cs.LG/0806.3537v2 (5 pages): .pdf, 84 KB. ]

Dynamic Allosteric Control of Noncovalent DNA Catalysis Reactions. *
David Yu Zhang and Erik Winfree.

"When you're hot, you're hot; when you're not, you're not." Some find in that phrase solace for fickle fortune. Others see an engineering principle. It's the ideal for a switchable catalyst: when it's ON, it works full speed ahead, but when it's OFF, nothing doing. In this paper, we design and demonstrate a modification of the catalyst for a previously studied DNA hybrization reaction (Zhang et al 2007) that can be turned ON and OFF by an exogenous DNA strand. This is accomplished by designing the catalyst so that it has two conformations -- switchable by the exogenous strand -- one of which hides the toehold critical for initiating the catalytic reaction. I can almost hear those poor DNA molecules' lament.

[JACS, 130:13921–13926, 2008 (6 pages): .pdf, 850 KB. ]

Programmability of Chemical Reaction Networks. *
Matthew Cook, David Soloveichik, Erik Winfree, and Jehoshua Bruck.

So you'd like to program chemistry, would you? Well, it's tough. Suppose you already had a handle on the molecular design problem: "No problem," you say, "given a specification for a set of chemical reaction equations, I can construct molecules that react according to plan." Even that's not good enough, because there are limits to what a system can do when the parts are bouncing around like a bag of marbles. So, it's tough. But we can give you some hints. In this paper, we compare stochastic chemical reaction networks to a variety of known models of computation (including one of our eccentric favorites, John Conway's FRACTRAN) and examine the limits of computability. You'll meet molecular counts, probabilities, vector addition systems, primitive recursive functions... and discover just when you can and can't perform uniform computation with chemistry. And all this is for well-mixed solutions with a fixed number of chemical species -- if you want to include polymers and geometrical structures, that's a whole other game. Maybe easier, actually.

[In Algorithmic Bioprocesses, Springer Berlin Heidelberg, Eds. A. Condon, D. Harel, J. N. Kok, A. Salomaa, E. Winfree, pp. 543-584, 2009 (pdf, 937 KB).]
[ Draft preprint (45 pages): .pdf, 824 KB. ]

Programming DNA Tube Circumferences. *
Peng Yin, Rizal F. Hariadi, Sudheer Sahu, Harry M. T. Choi, Sung Ha Park, Thomas H. LaBean, John H. Reif.

To form a good crystal, must the monomer unit be a compact, well-folded structure? Or can it be a floppy mess? Peng shows that what's really important is the structure in the context of the resulting crystal -- so yes, a monomer can be a floppy mess, and it's still alright! This philosophy leads to the "single-stranded tile" (SST) motif. Because each tile consists of just one strand with four domains, each specifying connectivity in one of the four lattice directions. This allows a straightforward programming of crystals ribbons and tubes with specified width. Lovely.

[ Science, vol 321: 824-826, 2008 (3 pages): online .pdf, 240 KB. (26 pages): online supplementary, 3.6 MB. ]
[ See the blog at Foresight, blurb at nanotechweb.org, research highlight in Nature Nanotechnology, and Caltech's Press Release. ]

David's PhD Thesis: Molecules computing : self-assembled nanostructures, molecular automata, and chemical reaction networks. *
133 pages. California Institute of Technology, Defended May 2008.
David Soloveichik. Thesis advisor: Erik Winfree.
We call it the song of the little nightingale. According to the text of the Clauser Prize announcement, this thesis provides seminal contributions to the theory of molecular computing, addressing fundamental questions like: Can we program the self-assembly of complex molecular structures? How can the behavior of molecular systems be made more robust to imperfections and noise? How complex can chemical circuitry become? And, what are the fundamental limits? That's a lovely song to sing.
[.pdf, 2.5 MB; Caltech ETD.]

Error suppression mechanisms for DNA tile self-assembly and their simulation. *
Kenichi Fujibayashi, David Yu Zhang, Erik Winfree, Satoshi Murata.

With the pieces floating all over the place, how can self-assembly be restricted just to the active attachment sites where you want it? How can you prevent the pieces from sticking to each other to soon? Well, one possibility is to put a lock on their binding sites, and design the active attachment sites (and no others!) to serve as a key. Here, we present two possible implementations of this concept for the algorithmic self-assembly of DNA tiles. They are analyzed by theory and simulation; surprisingly, for our designs, locking both the input and output sides of tiles provides substantially greater error suppression than just locking one side.

[ Natural Computing, vol 8, pp. 589–612 (2009) (24 pages): online .pdf, 3.4 MB. ]

Robust Stochastic Chemical Reaction Networks and Bounded Tau-Leaping. *
David Soloveichik

Biology is robust. Step on it, squash it, zap it, shake it -- odds are, it will just keep going. What makes biological systems so robust, and what are the implications of this robustness? To study this question, forget the arms and legs and blood and guts -- biological organisms are just chemistry, intricate networks of chemical reactions. What's special about robust chemical reaction networks? That should be a simpler question. In this paper, David makes a surprising connection: robustness allows stochastic chemical reaction networks to be simulated efficiently. In fact, this insight leads to an elegant and fast algorithm for simulating stochastic chemical reaction networks and even results in formal statements limiting how fast a simulator could possibly be. The result provides a new and fundamental framework for analyzing the efficiency of stochastic simulation algorithms. The bottom line is, robustness doesn't only make life easier for the organism, it makes life easier for those of us studying the organism.

[ Journal of Computational Biology 16(3): 501-522 (2009), (22 pages): .pdf, 364 KB.
arXiv version, with corrections & additions: cs.CC/0803.1030 (29 pages): .pdf, 427 KB. ]

Toward Reliable Algorithmic Self-Assembly of DNA Tiles: A Fixed-Width Cellular Automaton Pattern. *
Kenichi Fujibayashi, Rizal Hariadi, Sung Ha Park, Erik Winfree, Satoshi Murata.

In biological morphogenesis, genetic information is expressed as biochemical processes that create an organism. In algorithmic self-assembly, information in DNA is expressed as complex folding and crystallization processes that construct intricately patterned supramolecular objects. Here, we combine several techniques developed in this lab -- Erik's DNA tiles, Paul's DNA origami, Rebecca's ribbon crystals, and Rob's (as yet unpublished) origami seeds -- to self-assemble a fixed-width cellular automaton pattern related to Sierpinski triangles. (One commentator calls them "snakeskin nanobelts", an odd but evocative phrase.) Along the way, we gained some insights about assembly errors and how to prevent them -- about how algorithmic crystals aggregate and grow together and about lattice defect and computation error rates.

[ Nano Letters, 8(17): 1791-1797, 2008 (5 pages): .pdf, 884 KB and supplementary information, 1.9 MB. Highlighted in Nature. Made the cover of Nano Letters!]

Self-Assembling Tile Systems that Heal from Small Fragments. *
Ho-Lin Chen, Ashish Goel, Chris Luhrs, Erik Winfree.

Tile self-assembly that is governed by cooperative attachment is capable of forming complex algorithmic patterns from seeds. If holes are punched in the structure but the seed remains intact, properly designed tile sets can successfully self-heal the damage. However, with these constructions, if the structure is fragmented into many pieces, only the fragment containing the seed will correctly regrow into the "adult" structure. Here we set out to address the Fantasia Broomstick Problem: can tile sets be designed such that every fragment regrows into the adult structure? That's too ambitious. But what we can design is a a tile set with O(log n) tile types that grows into an n x n square such that after fragmentation, every fragment that is at least size O(log n) will regrow into the full square again. More general constructions, for arbitrary shapes, are also possible -- where the minimal size for fragment regrowth relates to the program size complexity of the adult shape.

[ Accepted at DNA Computing 13 (2007); manuscript from preliminary proceedings (16 pages): .pdf, 288 KB. ]

Engineering Entropy-Driven Reactions and Networks Catalyzed by DNA. *
David Yu Zhang, Andrew J. Turberfield, Bernard Yurke, and Erik Winfree.

The entropy of the universe is always increasing. That sounds like a force -- something that keeps increasing can push something else, can't it? The problem, of course, is that using entropy to do work sounds like trying to plow a field by herding stray cats -- it just ends in chaos. But does it have to? Nope: chemists, in fact, are quite familiar with entropy-driven reactions. In this paper, we show how to design systems of DNA molecules with catalytic reactions that are driven by entropy. And it doesn't end in chaos, far from it: we argue that our reactions can be wired together into arbitrary analog or digital circuits, which means they can process information and thus create order. So entropy can drive the production of order? Yup. No wonder the universe is such a beautiful place!

[ Science, 318:1121-1125, 2007 (5 pages) .pdf, 604k and supplementary info, 640k. (A labeling correction for Fig. 4.) ]
[ Chosen as an editor's choice; also see Roy Bar-Ziv's excellent Perspectives essay, an interesting commentary in The Scientist and a nice article in The New Scientist. ]

Computation with Finite Stochastic Chemical Reaction Networks. *
David Soloveichik, Matthew Cook, Erik Winfree, and Jehoshua Bruck.

Some people think of chemistry as a bag of colored marbles. Shake really hard. When the marbles hit each other, they change colors, according to rules. So there's a bit of structure, but it's a chaotic mess -- at any given moment, it's anyone's guess what will happen next. Can chemistry do computation, then? At least since Jacob & Monod, and perhaps before, it has been generally recognized that biochemical systems, such as genetic regulatory networks, can operate much like electrical circuits -- the concentration of some chemical species can carry an ON/OFF signal. Arbitrary digital circuit logic can be performed. If enough marbles turn green, we'll say the output was "1". Bennett even realized that if we string the marbles together like a necklace of beads, then Turing-universal computation can be performed. That's strictly more powerful, theoretically, than digital circuits. What we show here is that finite stochastic chemical reaction networks -- bags of marbles without strings -- can also perform Turing-universal computation. Reasonably quickly, too! This result holds if we accept some probability, no matter how small, that the chemistry will produce the wrong output... but remarkably, the result fails if we insist that the chemistry always and without exception produces the correct output. It pays to be tolerant, if even ever so slightly.

[ Natural Computing, vol. 7, pp. 615-633, 2008, (19 pages): online (pdf, 690 KB)]
[ Technical Report CaltechPARADISE:2007.ETR085 (19 pages): (pdf, 530 KB). ]

Rebecca's PhD Thesis: The Self-Replication and Evolution of DNA Crystals. *
133 pages. California Institute of Technology, Defended May 2007.
Rebecca Schulman. Thesis advisor: Erik Winfree.
Few questions are so profound as that of life's origins. It may be that the answer is not ours to know. But we can know how it could have happened -- that is, we can ask "What chemical systems have the capacity to self-replicate and evolve?" This too is a profound question, and a rich one; it asks about the design space for evolvable chemical systems, and there must be many many ways. A good place to start, then, would be the simplest. In that spirit, Rebecca's thesis examines how DNA tiles could be programmed to faithfully copy information, spawn daughter crystals, mutate, grow under selective pressures, and thus evolve in potentially complex evolutionary landscapes. What is truly remarkable is how such a simple chemical mechanism -- the selective sticking that underlies tile-based self-assembly -- can exhibit so many profoundly life-like characteristics. Perhaps the essence of life is not so hard to come by, after all?
[.pdf, 3.1 MB; Caltech ETD.]

Synthesis of crystals with a programmable kinetic barrier to nucleation. *
Rebecca Schulman and Erik Winfree.

Rock candy is my favorite form of sugar. It sparkles with beautiful flat facets. It's tasty. And it practically makes itself, growing right on the stick -- almost as you watch! All you have to do is slowly cool a pot-full of supersaturated hot sugar water. Ahh, the beauty of crystal growth. But if the solid form is thermodynamically favorable (it is growing, after all) why does it grow just on the stick, and not precipitate out of solution everywhere? The answer, of course, is that there is a kinetic barrier to nucleation: small crystals are more likely to melt than to grow. But your stick had on it some pre-formed large crystals (sugar grains) that were more likely to grow than to melt. Seed crystals, we call them. They suck up all the sugar, and voila, big beautiful rock candy! All goes to show: control over nucleation is wonderful to have. But rock candy is just a metaphor. Control over nucleation is critical for a whole slew of biological processes -- such as the growth of the cytoskeleton -- where a structure needs to be built in the right place and at the right time, but nowhere else. It's natural to think that control over nucleation will be key to self-assembling molecular technologies as well. That's what our paper is about. We design DNA tiles that crystallize into ribbons -- but there's a big kinetic barrier to nucleation unless we add a seed. Hopefully, this mechanism for controlling nucleation can be used as a general subroutine for programming complex crystal growth processes.

[PNAS, (vol. 104, pp. 15236-15241, 2007) (6 pages): .pdf, 429 KB and supplementary information, 1.4 MB. ]

Reducing Facet Nucleation during Algorithmic Self-Assembly. *
陳和麟 (Ho-Lin Chen), Rebecca Schulman, Ashish Goel, Erik Winfree.

A single bit flip in the source code and the program won't compile. A single mistake in calculating the digits of pi, and every subsequent digit is garbage. This is what happens in brittle algorithms: corrupted information gets used to compute more information, which is thus corrupted, and the infection grows. Our initial efforts to embed algorithmic logic into DNA self-assembly were brittle. A single DNA tile that attached at the wrong site would utterly disrupt the algorithmic pattern being formed. So we invented a way to design "proofreading" tile sets that allowed a few errors to occur without disrupting algorithmic growth. Unfortunately, our construction turned out to still have problems when crystals grew with large facets -- it did not suppress errors occurring from spontaneous nucleation of new layers of tiles on the facets. A few years ago, Ho-Lin and Ashish proposed an improved construction for DNA tile sets, which they call "snaked proofreading" tile sets, that has excellent theoretical properties in this regard. Here, Ho-Lin and Rebecca demonstrate experimentally that the fundamental principle of the new construction is sound: we can design tile sets that logically suppress nucleation on facets. In effect, those nasty bit flips will now most of the time just flip right back.

[Nano Letters, (on-line August 24, 2007) (7 pages): .pdf, 653 KB and supplementary information, 93 KB. ]

An autonomous polymerization motor powered by DNA hybridization. *
Suvir Venkataraman, Robert M. Dirks, Paul W. K. Rothemund, Erik Winfree, Niles A. Pierce.

Can a DNA molecule walk? Can we design a molecular motor from scratch? If so, how could it work? Consider macroscopic motors for a minute: they come in all varieties, using all sorts of principles -- internal combustion engines, steam engines, Wankel rotary engines, electric motors, pneumatic motors, solenoids, rockets, jets... each best suited to different tasks. The molecular world has similar diversity. In biology, we see rotary motors like ATPase and the flagellar motor, walking motors like kinesin, linear motors like RNA polymerase and the ribosome, and waving motors like cilia. Not to mention muscle. Among the most mind-bending are the polymerization motors of pathogenic bacteria, such as Rickettsia and Listeria, that live inside eukaryotic host cells. By displaying proteins that catalyze the polymerization of the host cell's actin, these bacteria create a "comet tail" behind them that pushes them forcefully through the cell and even into neighboring cells. This was the motor principle targeted in our work, which was lead by Niles' group. Perhaps most fascinating is that the DNA polymers grow by insertion between the polymer tail and the DNA catalyst strands anchored on the "surrogate bacterial cell". This insertion takes place by a series of conformational rearrangements without ever the two sides losing their strong attachment to each other. Quite a dance!

[Nature Nanotechnology, (vol. 2, pp. 490-494, 2007) (5 pages): .pdf, 738 KB and supplementary information, 581 KB. ]

How Crystals that Sense and Respond to Their Environment Could Evolve. *
Rebecca Schulman and Erik Winfree.

In 1965, Graham Cairns-Smith proposed that the first (and simplest) Darwinian entities were clay crystals, and that as they evolved to reproduce faster and more robustly, they catalyzed organic synthesis... and eventually their invented technology took over the reins... and this "genetic takeover" resulted in us, their organic brainchildren. Sounds crazy, doesn't it? Especially since crystal evolution hasn't been demonstrated experimentally. Well, read his books. In any case, a few years ago we realized that we might be able to evolve our DNA crystals in the lab -- it's not clay, but the principles might be similar. So we asked ourselves, what might be an interesting selective pressure that our crystals could be subjected to, so that evolution leads to increasing functional complexity? The answer was quite surprising: without any novel chemistry or physics, purely informational aspects of crystal growth are in principle sufficient for a crystal to "intelligently" respond to the presence or absence in the environment of material required for growth -- and crystals containing information that enables them to respond more adeptly will grow and reproduce faster. This is intriguing.

[Natural Computing, (Volume 7, pages 219-237, 2008) (19 pages): online .pdf, 485 KB or (17 pages): preprint .pdf, 256 KB. ]

Combining Self-Healing and Proofreading in Self-Assembly. *
David Soloveichik, Matthew Cook, Erik Winfree.

If you cut a crystal, does it bleed? Normally, you'd think not... but algorithmic crystal growth has some surprisingly life-like properties... so it makes you wonder. For example, there are crystal monomer solutions that, depending on the information contained in a seed crystal, can grow into any algorithmically describable shape -- much like a biological developmental program. But what if the growth process is error-prone? What if the crystal suffers damage from a hostile environment (i.e. you cut it)? Previous work showed that even under such conditions (considered separately), algorithmic crystals can grow and maintain their integrity. But different constructions and different arguments were used in the two cases. Here, we introduce a new construction and a new proof technique that shows that both challenges can be met simultaneously! No blood.

[Natural Computing, (Volume 7, pages 203-218, 2008) (16 pages): online .pdf, 472 KB or (12 pages): preprint .pdf, 310 KB. ]

Thermodynamic Analysis of Interacting Nucleic Acid Strands. *
Robert M. Dirks, Justin S. Bois, Joseph M. Schaeffer, Erik Winfree, Niles A. Pierce.

If you're thinking about the molecular machines that may have inhabited the RNA World in times gone by, if you're planning on engineering a modern-day DNA World, or even if you're just curious, you may want to know what happens when you toss a collection of strands together in a test tube. Will they fold up? Into what shapes? Will they interact? Form multi-stranded assemblies? With what concentrations? For single strands, existing dynamic programming algorithms can find the minimum-free-energy fold and even calculate the partition function exactly, so that the probabilities of alternative folds can be compared. (At least, with respect to the standard nearest-neighbor model of secondary structure.) The bimolecular case has also been tackled. Here, Niles & his boys lead the charge to solve the full problem for any number of strands. An exact solution, with fast algorithms, is provided. The math is elegant, involving symmetry groups, convex optimization, and what seem like remarkable "coincidences". Nice.

[SIAM Review, vol. 49, pp 65-88, 2007 (24 pages): preprint .pdf, 653 KB. ]

Complexity of Self-Assembled Shapes. *
David Soloveichik and Erik Winfree.

If you can effectively describe a shape -- e.g. write a computer program that draws it -- then can the shape be self-assembled by spontaneous local processes from just a few kinds of pieces? The answer may be "no" if you are concerned about the exact size of the shape, but if it is only the form of the shape that matters, then the answer is always "yes". In fact, the Kolmogorov complexity of a shape provides upper and lower bounds for the number of tile types necessary to self-assemble it (at some scale). Proof of this result makes use of a construction for programmable "blocks" that execute a Turing machine simulation to guide the growth process, and introduces a new proof technique for establishing when growth is independent of the order in which tiles are added. This work suggests that scale plays the same role in self-assembly as does time in computability: when you ignore it and just ask, "can it be done at all?", then the theory becomes clean and elegant.

[SIAM Journal on Computing 36 (6) 1544-1569, 2007, (26 pages) paper, 810 KB.]
[preprint cs.CC/0412096 on arXiv.org (25 pages): .pdf, 488 KB.]
[extended abstract in DNA Computers 10, LNCS volume 3384:344-354, 2005, (10 pages): .pdf, 272 KB, color, .pdf, 271 KB, B/W]

Jongmin's PhD Thesis: In vitro Synthetic Transcriptional Networks. *
118 pages. California Institute of Technology, Defended December 2006.
Jongmin Kim. Thesis advisor: Erik Winfree.
This thesis describes presents a simplified architecture for designing and synthesizing arbitrary dynamics in in vitro biochemical networks. Inspired by neural networks and genetic regulatory networks, the in vitro systems make use of just RNA polymerase and RNases in addition to DNA constructs that define the network structure. Theory demonstrates that arbitrary dynamics can be synthesized in general, and experiments demonstrate individual switches as well as circuits that implement bistability and oscillation.
[.pdf, 18 MB; Caltech ETD.]

Construction of an in vitro bistable circuit from synthetic transcriptional switches. *
Jongmin Kim, Kristin S. White, Erik Winfree.

How simple can life be? To some, the answer to this question is a small bungalow on Fiji, a hammock, and the sound of waves sloshing and slurping against the sand. Others are more inclined to think about the origin of life, about minimal organisms, about what it would take to create life from scratch. Being broad-minded, we appreciate both perspectives. While our research on the first aspect of the problem is still in progress, this paper reports our efforts in the second direction -- efforts to recreate, in simplified form, the information-processing heart of cellular processes: genetic regulatory networks. We do it in vitro using just DNA, RNA, and a small handful of enzymes... though not without encountering a few challenges and surprises!

[Molecular Systems Biology, 2:68, 2006 (12 pages): paper and supplementary info; also see Michael Simpson's excellent News & Views essay. Raw data is available from our Extra Supplementary Material page. ]

Enzyme-Free Nucleic Acid Logic Circuits. *
Georg Seelig, David Soloveichik, David Yu Zhang, Erik Winfree.

It's a game of molecular dominoes. One falls, bumps into the next, triggering its fall. Sometimes two separate cascades converge on a single domino, and only when both hit at once does the domino fall. Or maybe either input is enough. In fact, you can build any kind of logic circuit out of dominoes. You can make a domino cascade that multiplies binary numbers, if you want. It's fun. But maybe not so deep scientifically, not so useful. But suppose you now do the same thing with molecules, say DNA strands. Now you have a way to embed logical control in chemistry, to sense and react to molecular events. Now you're at the same frontier of molecular information processing that biology has been exploiting and refining for almost 4 billion years. We're a bit behind in the game.

[Science, 314:1585-1588, 2006 (4 pages) paper and supplementary information; also see Walter Fontana's excellent Perspectives essay, Udi Shapiro's News and Views essay in Nature Nanotechnology, a note in Scientific American, Caltech's Press Release, or a general-interest article we wrote for a Caltech newsletter (we all helped write it, even though they mistakenly only put my name on it). ]

Catalyzed Relaxation of a Metastable DNA Fuel. *
Georg Seelig, Bernard Yurke, Erik Winfree.

DNA machines need fuel. Whether the machine is a motor, a manipulator, or a computer, if it is to be capable of driving a system to perform a periodic behavior -- and what respectable machine can't do that? -- then it needs fuel. ATP is an excellent fuel: small, high energy content, simple... but, how can we put it?... holding it requires small hands. Not even the rain has such small hands. So certainly not human-designed DNA machines, yet. So for DNA machines, we will make do with a bigger fuel -- but one that can be specifically targeted. That, more or less, was the motivation for this work. (The original work pursuing this line of reasoning was Turberfield et al 2003 [ref 1] and further progress was made in "DNA Hybridization Catalysts and Catalyst Circuits" [ref 37, see below]. Here, we finally have a system that works quite nicely.) {apologies to ee cummings}

[JACS, 128(37): 12211-12220, 2006 (5 pages): .pdf, 1.4 MB. ]

Sturdier DNA Nanotubes via Ligation. *
Patrick O'Neill, Paul W. K. Rothemund, Ashish Kumar and Deborah K. Fygenson

Deborah's group decides to ligate DNA nanotubes. They are stable under a fluorescence microscope up to 70C and, unlike noncovalent tubes, resist opening under AFM. That's very, very useful!

[Nano Letters 6(7): 1379-1383, 2006 (5 pages): .pdf, 286 KB. See also supplementary pdf, 27KB. ]

A DNA Superstructure-based Replicator without Product Inhibition. *
David Yu Zhang and Bernard Yurke.

DNA replication, when one becomes two, marks the fundamental moment in life. How simple can the machinery for replication be? Can it be designed from scratch? Dave and Bernie present here a proposal for a clock-work machine -- an information-bearing superstructure made entirely of multi-base DNA building-block motifs -- that replicates its overall structure when exposed to the proper sequence of raw building blocks and fuel. So it's theoretically possible. This throws down the gauntlet: Can an autonomous scheme be devised? Can a simpler implementation be found, one practical enough for experimental demonstration? Can a chemical self-replicator be rationally designed?

[Natural Computing 5(2): 183-202, 2006 (20 pages): .pdf, 1.3 MB. ]

Folding DNA to create nanoscale shapes and patterns. *
Paul W. K. Rothemund

Paul sends a swarm of staple strands to tie viral DNA in knots...thereby self-assembling 100 x 100 nm objects with roughly 6 nm resolution from the 7 kilobase single-stranded genomic DNA of M13mp18. Rectangles. Squares. Triangles. Stars. Even a smiley-face. About 50 billion copies of each, in a typical reaction, and with very high yields. It works like magic. We did some calculations... Paul's smiley faces constitute the most concentrated happiness ever experienced on earth. Each spot in such a structure contains a unique address and can be addressed as such by DNA hybridization, allowing one to "write" on the DNA origami objects. Words. Pictures. Snowflakes. A map of North and South America. We did some more calculations... Paul probably made more maps than have ever been produced in the history of mankind -- we're definitely talking quantity over quality here. The applications of this technology are likely to be less whimsical. For example, it can be used as a "nanobreadboard" for attaching almost arbitrary nanometer-scale components, and there are few other ways to obtain such precise control over the arrangement of components at this scale. You'll never look at M13 phage DNA the same way again...

[Nature 440, 297-302 (16 March 2006). article, .pdf, 575 KB. News and View, .pdf, 300 KB. Supplementary material: .pdf, part 1, 6.3 MB; .pdf, part 2, 193 KB. Caltech's Press Release. ]

Design of DNA origami. *
Paul W. K. Rothemund

Paul talks a little about the design software and future possibilities.

[Proceedings of the International Conference on Computer-Aided Design (ICCAD) 2005: .pdf, 646 KB.]

Scaffolded DNA Origami: from Generalized Multicrossovers to Polygonal Networks. *
Paul W. K. Rothemund

Paul makes a DNA origami especially for Ned Seeman, and sketches how polygonal networks and polygonal three-dimensional structures can be created.

[in Nanotechnology: Science and Computation, pages 3-21, 2006. Preprint (22 pages): .pdf, 1.4 MB.]

Self-Healing Tile Sets. *
Erik Winfree.

Wandering in the dreamy fields of abstraction and metaphor, I like to think of algorithmic self-assembly as a toy model of organismal development. A tile set can direct the growth of an algorithmic pattern from a seed assembly "in the same way" as genetic DNA directs the growth of a salamander from an egg. Similar principles seem to be at work in both systems: information flow is essential, sophisticated programming is possible, the physics and chemistry is messy but fault-tolerance can be achieved. The adult salamander, however, can do something that has not previously been shown in algorithmic self-assembly -- after losing its tail, the salamander can regrow a new one! Perhaps more remarkable, actually, is the ability of almost all multicellular organisms to heal severe wounds within their bodies. In this paper I investigate a (somewhat) analogous issue for algorithmic self-assembly, and show that self-healing tile sets are possible.

[in Nanotechnology: Science and Computation, pages 55-78, 2006. Preprint (24 pages): .pdf, 431 KB.]
[Note, a one-page abstract of this work appeared in the FNANO 2005 conference proceedings. Unfortunately, the 3x3 self-healing transformation in the figure shown there had an error in the boundary and seed tile blocks. EW 8/05]

Two Computational Primitives for Algorithmic Self-Assembly: Copying and Counting. *
Robert D. Barish, Paul W. K. Rothemund, and Erik Winfree

Here we demonstrate crystals that count and tubes that copy. This is interesting as the second example of algorithmic self-assembly. Counting is a useful primitive for bottom-up fabrication tasks such as constructing a memory chip with address demultiplexers. Copying is a useful primitive for Darwinian evolution. The error rates in this work, however, strongly motivate research into "proofreading" and other methods for fault-tolerant self-assembly.

[Nano Letters 5(12): 2586-2592, 2005 (7 pages): .pdf, 515 KB. See also the Supplementary Materials (.pdf), 622 KB and our Extra Supplementary Materials page.]

Complexity of Compact Proofreading for Self-Assembled Patterns. *
David Soloveichik and Erik Winfree.

Presumably, there is some cost for obtaining low error rates during self-assembly. Maybe assembly must be slower. Or the same pattern can be assembled more reliably, but using more tile types. Or the pattern must be assembled at a larger scale. Or all of the above. In fact, Chen and Goel (2005) proved that if you can accept "just a little slower", "just a few more tiles", and "just a slightly larger scale", then any self-assembled pattern can be made as reliable as you want. But suppose you insist that the pattern remain at exactly the original scale, as do Reif, Sahu, and Yin (2005) ? We give evidence that for some patterns -- which we term "robust" -- it is still possible to do arbitrarily strong proofreading with modest costs in speed and tile set size, but for other patterns -- which we term "fragile" -- it is not... at least, not with any proofreading scheme that uses a certain kind of redundancy.

[Published in DNA Computing 11, LNCS 3892: 305-324 (2006): .pdf, 595 KB.]
[Preprint, typos corrected 3/29/2006 (20 pages): .pdf, 237 KB.]

Self-Replication and Evolution of DNA Crystals. *
Rebecca Schulman and Erik Winfree.

What is the simplest physical system capable of extensive Darwinian evolution? Such a system might be a candidate for the origin of life. One would expect that it must have certain features: it must be capable of of reproducing as an organism to populate an environmental niche; of storing an unbounded amount of information in its "genome"; of copying that information with low error rates; and of expressing that information as a phenotype with selective advantages or disadvantages. Graham Cairns-Smith has argued, since the 1960's, that mineral crystals (such as clays) provide the most compelling example of a simple and abundant physical system capable of extensive Darwinian evolution: Clay crystals can store information as a pattern of inhomogeneities that are propagated from layer to layer, with few errors; they can reproduce by random fragmentation; and they can express a variety of morphological phenotypes. To be blunt, he proposes that your great-great-great-great-great-....-grandparent was a lump of clay. I think he may be right. But the evidence is thin: to date, no one has managed to catch clay in the act of replicating information or evolving. In this paper, we propose using experimentally-tractable DNA tile self-assembly to investigate how evolution could conceivably work on information-bearing crystals.
Won the conference best paper award! (of 94 papers)

[ ECAL 2005, in LNCS 3630:734-743, 2005. Preprint (10 pages): .pdf, 2.3 MB, .ps, 9.3 MB. ]

The Computational Power of Benenson Automata. *
David Soloveichik and Erik Winfree.

In a series of experimental papers, Benenson et al demonstrated how information encoded in DNA can be processed autonomously by a molecular automaton consisting of a type IIS restriction enzyme (FokI) and a collection of DNA "rule molecules". They constructed several two-symbol, two-state finite state machines as well as several 4-input AND/NOT gates. Ultimately, applications envisioned include diagnosis of mRNA and delivery of a theraputic molecule. Here, we formalize the class of logical computations that can be performed by such systems and show that, in our abstraction, the finite-input computations that can be performed by Benenson automata can be exactly characterized by a connection to branching programs -- which indicates that (if restriction enzymes slightly more powerful than FokI can be found) Benenson automata are considerably more powerful than we had originally imagined!

[Preprint: cs.CC/0412097 on arXiv.org (18 pages): .pdf, 223 KB.]
[Journal version: Theoretical Computer Science 244(2-3):279-297, 2005, (19 pages): .pdf, 192 KB. Erratum: .pdf, 279 KB. ]
[Note: Thanks to remarkably thorough comments from the reviewers, the clarity and precision of the definitions and proofs in the journal version have been improved substantially. Read this version, not the arxiv version!]

Algorithmic Self-Assembly of DNA Sierpinski Triangles. *
Paul W. K. Rothemund, Nick Papadakis, Erik Winfree.

Our first demonstration of algorithmic crystals, wherein molecularly-encoded information directs the growth process to create a complex pattern. The DNA crystals are, at the molecular level, a two-dimensional woven fabric of short DNA strands. Both because this programmable growth could be considered a super-simplified toy model of organismal development, and because DNA is the central information molecule in biology, I like to call it "weaving the tapestry of life". This is a substantial personal victory for me: I proposed that this should be possible in 1995 as a graduate student -- nearly 10 years later, Paul's efforts made it actually happen.

[PLoS Biology 2 (12) e424, 2004, (13 pages): .pdf, 4.6 MB.] See also our Extra Supplementary Materials page. (PLoS Biology has a synopsis and a primer by Chengde Mao for this paper. Also it was highlighted in Nature by Philip Ball. And there's a Caltech Press Release. )

Design and Characterization of Programmable DNA Nanotubes. *
Paul W. K. Rothemund, Axel Ekani-Nkodo, Nick Papadakis, Ashish Kumar, Deborah Kuchnir Fygenson, Erik Winfree.

DNA tiles designed to make sheets sometimes roll up into tubes that are abstractly analogous to protein microtubules that self-assemble from tubulin. Way cool! Fortuitously discovered during Paul's work on the DNA Sierpinski triangles, DNA nanotubes have opened up a whole host of interesting possibilities that we never dreamed of before...

[JACS 126(50):16344-16353, 2004, (9 pages): article, 891 KB, supp, 5.5 MB.] See also our Extra Supplementary Materials page.
[Note added Feb. 2013: We presented a correct equation for the persistence length of a DNA tube but gave an incorrect derivation. Here is the erratum stating changes to the paper and the correction to the supplementary information which gives a valid proof. Remember kids, area moment of inertia is for bending, mass moment of inertia is for spinning ice skaters!]

Neural Network Computation by in vitro Transcriptional Circuits. *
Jongmin Kim, John J. Hopfield, Erik Winfree.

Is it possible to design biochemical circuits in the same way we design electrical circuits -- where the information processing and dynamics is what we're interested in, rather than the chemistry per se? What kinds of dynamical and decision-making behavior can be achieved this way? In this paper, we give a concrete proposal for how to use synthetic DNA molecules and a few enzymes (RNA polymerase and some ribonucleases) to make general-purpose threshold gates that can be wired together to make arbitrary circuits. The mathematical description turns out to be closely related to continuous-time analog neural networks. We analyze a few example circuits.

[Advances in Neural Information Processing Systems (NIPS) 17, 2004, pg 681-688. preprint, (8 pages): .pdf, 426 KB.]

DNA Hybridization Catalysts and Catalyst Circuits. *
Georg Seelig and Bernard Yurke and Erik Winfree.

Energy stored in metastable configurations of DNA molecules can in principle be used to power DNA nanomachines and other molecular processes. Essential to using this energy productively is the ability to regulate how and when the energy is released. Starting with a previously-demonstrated DNA hybridization catalyst for doing exactly that, we introduce schemes for an improved catalyst and for networks of catalysts that operate as computing circuits. We show preliminary experimental results investigating these proposals -- which uncover a few interesting surprises!

[ in DNA Computers 10, LNCS volume 3384:329-343, 2005, (15 pages): .pdf, 361 KB ]

Paradigms for Computational Nucleic Acid Design. *
Robert M. Dirks, Milo Lin, Erik Winfree and Niles A. Pierce.

What constitutes a good design criterion for choosing DNA sequences that (should) fold up into target nanostructures? We argue that both positive (the target structure should have low energy) and negative (all other structures should have high energy) design criteria should be considered. Secondary structure formalisms for describing and analyzing DNA molecules allow these criteria to be rigorously treated computationally.

[preprint with clear figures: .pdf, 6.4 MB]
[published in Nucleic Acids Research: 32(4):1392-1403, 2004, (17 pages, in color): .pdf, 503KB; supplemental.pdf, 2.7MB; sequences.gz, 2.8MB]

Proofreading Tile Sets: Error Correction for Algorithmic Self-Assembly. *
Erik Winfree and Renat Bekbolatov.

Experimental studies of algorithmic self-assembly have observed error rates between 1 and 10 %, measured as the fraction of DNA tiles that fail to perfectly match their neighbors. It is difficult to perform extended computations when every elemental logical operation is so likely to be wrong. Here, we provide a general-purpose technique for designing tile sets that are robust to assembly errors. Kinetic and thermodynamic models of self-assembly are key to our investigations. It is suggested that proofreading tile sets can square the error rate for comparable physical conditions -- i.e. bring 10% to 1% and bring 1% to .01%.

[in DNA Computers 9, LNCS volume 2943:126-144, 2004, (19 pages, in color): .pdf, 2.0 MB, .ps, 18 MB]
See also our Extra Supplementary Material page. [Note: Typo in Fig 7b: "FC" on the left should be "FM".]

Self-Assembled Circuit Patterns. *
Matthew Cook, Paul W. K. Rothemund, and Erik Winfree.

Can DNA self-assembly be used for patterning, as a scaffold for functional devices such as molecular electronic circuits? We show that several circuit patterns, including demultiplexers, random-access memory, and Hadamard matrix transforms, can be self-assembled (in principle) from a small number of tile types.

[in DNA Computers 9, LNCS volume 2943:91-107, 2004. (17 pages, in color): .pdf, 608 KB, .ps, 3.2 MB]

One Dimensional Boundaries for DNA Tile Self-Assembly. *
Rebecca Schulman, Shaun Lee, Nick Papadakis, and Erik Winfree.

Discusses our progress toward using DNA tiles to self-assembly the boundary for a self-assembled Sierpinski triangle. Experimental results show that individual tiles behave well, with the expected tile-tile interactions, but size and shape distributions of self-assembled structures were sometimes surprising and unsatisfactory. We understand this by analyzing models of 1D polymerization, for example distinguishing accretion from a seed vs. parallel aggregation. This work identifies control over nucleation as an important target for future study in algorithmic self-assembly.

[in DNA Computers 9, LNCS volume 2943:108-125, 2004, (19 pages, in color): .pdf, 2.1 MB, .ps.gz, 8.7 MB, .ps, 49 MB]
See also our Extra Supplementary Material page.

DNA Computing by Self-Assembly. *
Erik Winfree.

A brief review of algorithmic self-assembly of DNA.

[in NAE's The Bridge, 33(4):31-38, 2003 (8 pages): preprint in color .pdf, 350KB, or as published in B/W .pdf, 275KB]

Protein Design is NP-Hard. *
Niles Pierce and Erik Winfree.

Shows that a commonly-used formulation for protein design -- amino acid sequence selection to stabilize a given backbone fold using a pairwise energy model -- is NP-Hard if the energy function is considered part of the specification of the problem (as it is in common practice, since this is the implicit formulation of the target backbone fold). Therefore, it may be advantageous to exploit specific properties of protein folding energetics, rather than relying on general-purpose (worst-case) exponential-time algorithms.

[in Protein Engineering, v15, pp. 779-782, 2002 (4 pages): PRODES.pdf, 1.1 MB]

Combinatorial Optimization Problems in Self Assembly. *
Leonard Adleman, Qi Cheng, Ashish Goel, Ming-deh-Huang, David Kempe, Pablo Moisset, Paul Rothemund

Can one find the minimal number of tiles required to uniquely produce a given shape? Here we prove that the problem is NP-complete. Also, for a large class of tile sets we give a method to find concentrations of tiles that yield the fastest time of self-assembly.

[in STOC 2002, (10 pages): .pdf, 731 KB, .ps, 537 KB]

Paul's PhD Thesis: Theory and Experiments in Algorithmic Self-Assembly. *
283 pages, in black and white. University of Southern California, December 2001.
Paul W. K. Rothemund. Thesis advisor: Leonard Adleman.
This thesis describes theory on the uniqueness of self-assembled structures with an expanded account of material in the paper "The Program-Size Complexity of Self-Assembled Squares" (see below), experiments in algorithmic capillary force-based self-assembly as well as capillary force-based assembly of Penrose tilings, and DNA computation for breaking the Data Encryption Standard (DES). It has appendices detailing the frequency of certain tile configurations (vertex stars) in the Penrose tiling and an initial experiment in making capillary force-based gears.
[pwkr_thesis_nov15.ps, 49.8 MB, or pwkr_thesis_nov15.ps.gz, 8.8 MB, or pwkr_thesis_nov15.pdf, 9.1 MB]

Algorithmic Self-Assembly of DNA: Theoretical Motivations and 2D Assembly Experiments. *
Erik Winfree.

Overview discussion of Wang tiles, computation by self-assembly, and previous experimental results, as well as some additional experiments using the system described in Nature 394, 539-544, Aug. 6, 1998 -- see below.

[in Journal of Biomolecular Structure and Dynamics, 11 (S2): 263-270, 2000, (8 pages, in color): algSA_JBSD.pdf, 746 KB, or algSA_JBSD.ps.gz, 3.3 MB]
[Erratum: In figure 6 on page 266, the lower right sticky end sequence of the B tile should be AGTGA instead of ATGTA, to be complementary with the upper left one of the A tile.]

Two Dimensional and Two States in DNA Nanotechnology. *
Nadrian C. Seeman, Furong Liu, Chengde Mao, Xiaoping Yang, Lisa A. Wenzler, Ruojie Sha, Weiqiong Sun, Zhiyong Shen, Xiaojun Li, Jing Qi, Yuwen Zhang, Tsu-Ju Fu, Junghuei Chen and Erik Winfree.

This brief review, written by Ned Seeman, describes the state of DNA nanotechnology in the year 2000, including the recent results described in Nature 394, 539-544, Aug. 6, 1998 -- see below.

[in Journal of Biomolecular Structure and Dynamics, 11 (S2): 253-262, 2000, (10 pages, B/W): TwoDimTwoStates.pdf, 554 KB]

String Tile Models for DNA Computing by Self-Assembly. *
Erik Winfree, Tony Eng, Grzegorz Rozenberg.

How much computation can be done with linear self-assembly? With some tricky strand routing through complex DNA tiles, you can do a lot!

[in DNA Based Computers 6, LNCS 2054, pp. 63-88, 2001, (26 pages, in color): stringtiles.pdf, 679 KB, or stringtiles.ps, 1.6 MB]
[early draft, 6/2000: stringtiles_DNA6.ps.gz, 225 KB]

The Program-Size Complexity of Self-Assembled Squares. *
Paul W. K. Rothemund and Erik Winfree.

Are there small self-assembly programs for building squares of a particular size? Yes! Also contains a nice presentation of our model of self-assembly.

[in STOC 2000, (10 pages): squares_STOC.ps.gz, 114 KB, squares_STOC.ps, 761 KB, or squares_STOC.pdf, 243 KB]
[ Erratum: In Figure 5, actually N=50. ]

Construction, analysis, ligation, and self-assembly of DNA triple crossover complexes. *
Thomas H. LaBean, Hao Yan, Jens Kopatsch, Furong Liu, Erik Winfree, John H. Reif, Nadrian C. Seeman.

2D crystals can be made from fancier DNA complexes that allow for baroque strand routing.

[in Journal of the American Chemical Society, 122(9): 1848-1860, 2000, (13 pages): triplex.ps.gz, 906 KB, or triplex.pdf, 712 KB]

Using lateral capillary forces to compute by self-assembly. *
Paul W. K. Rothemund.

Here Paul used macroscopic plastic tiles to test ideas about algorithmic self-assembly. The plastic tiles self-assemble at the interface of oil and water, and hydrophilic and hydrophobic patches on the edges of the tiles mediate the specific binding interactions between them. Tiles in this paper encode binding interactions for creating Sierpinski triangles as well as Penrose tilings. Paul had mild success creating unnucleated patterns with Sierpinski tiles and perhaps created the most complex set of specific capillary bonds ever made.

[in Proceedings of the National Academy of Sciences, 97(3): 984-989, 2000, (6 pages): Rothemund-PNAS-capillary.pdf, 911KB]

Experimental Progress in Computation by Self-Assembly of DNA Tilings. *
Thomas H. LaBean, Erik Winfree, John H. Reif.

Describes experimental characterization of three-helix DNA complexes, shows that multiple TX can assemble around a template strand, and illustrates how they could be used to self-assemble an addition table.

[in DNA Based Computers 5, pp. 123-140, 1999, (18 pages): LaBeanTX.pdf, 497 KB, or LaBeanTX.ps.gz, 600 KB, or LaBeanTX.ps, 2.3 MB]

Error Correction in DNA Computing: Misclassification and Strand Loss. *
Kevin Chen and Erik Winfree.

Update on Roweis et al: shows that misclassification can be ameliorated by repeated separation, and strand loss can be recovered by PCR, resulting in reliable computation at the cost of moderately more time and space. *IF* there is no sequence-specific bias to the errors.

[in DNA Based Computers 5, pp. 49-63, 1999, (16 pages): kevin.pdf, 304 KB, or kevin.ps.gz, 100 KB, or kevin.ps, 416 KB]

Design and self-assembly of two-dimensional DNA crystals. *
Erik Winfree, Furong Liu, Lisa Wenzler, Nadrian C. Seeman.

Periodic lattices of DNA double-crossover molecules, of the type proposed in "On the Computational Power of DNA Annealing and Ligation", are demonstrated experimentally.

[in Nature 394, 539-544, Aug. 6, 1998, (6 pages): lattice.pdf, 3.8 MB, or lattice.ps.gz, 710 KB, or lattice.ps, 1.1 MB]
[supplementary material: Supp. Material 1, 5KB, or Supp. Material 2, 9KB]
[Typo: the PAGE gel was 19:1, not 90:1]

Erik's PhD Thesis: Algorithmic Self-Assembly of DNA. *
110 pages, Caltech, June 1998.
Erik Winfree . Thesis advisor: John Hopfield.
The contents of several of the above papers are included here; in addition there are many more experimental results. Note that pages 61, 72, and 87 are in color.
[thesis.ps.gz, 5 MB, or thesis.pdf, 2.6 MB, or thesis.ps, 21 MB, or the Caltech official library copy]

Simulations of Computing by Self-Assembly. *
Erik Winfree.

A kinetic and thermodynamic analysis of error rates in a simple model of the two-dimensional self-assembly process proposed in my first paper on this topic, "Annealing and Ligation".

[presented at DNA Based Computers IV; published as Caltech CS Tech Report 1998.22, (25 pages): simulation.ps.gz, 392 KB, or simulation.pdf, 1.0 MB, or simulation.ps, 3.1 MB]

Whiplash PCR for O(1) Computing. *
Erik Winfree.

Some comments and extensions to the "one-pot" PCR-like technique of [Hagiya,Arita,Kiga,Sakamoto,Yokoyama in DNA Based Computers III], including a a way for a single DNA strand to simulate the execution of circuit.

[presented at DNA Based Computers IV; published as Caltech CS Tech Report 1998.23, (14 pages): whiplash.pdf, 331 KB, or whiplash.ps.gz, 136 KB, or whiplash.ps, 666 KB]

Universal Computation via Self-assembly of DNA: Some Theory and Experiments. *
Erik Winfree, Xiaoping Yang, Nadrian C. Seeman.

This follow-up to "Annealing and Ligation" (below) examines the relationship of DNA self-assembly to formal language theory -- finding ties to regular, context-free, and recursively enumerable languages. Additionally, preliminary laboratory investigations into the self-assembly required for universal computation is reported.

[in DNA Based Computers II, pgs 191-213, 1998 (22 pages): self-assem.ps.gz, 546 KB, or self-assem.pdf, 627 KB, or self-assem.ps, 2.6 MB]
[Note, there are a few typographical errors in important definitions. EW, 6/98]

A Sticker Based Architecture for DNA Computation. *
Sam Roweis, Erik Winfree, Richard Burgoyne, Nickolas V. Chelyapov, Myron F. Goodman, Paul W. K. Rothemund, Leonard M. Adleman.

A new representation for encoding bits in DNA is presented and examined, and generally applicable methods for decreasing separation error rates are discussed. [in DNA Based Computers II, pgs 1-29, 1998 (26 pages): stickers.pdf, 611 KB, or stickers.ps.gz, 231 KB, or stickers.ps, 1.2 MB]
This appeared in journal form as two papers:

A Sticker-Based Model for DNA Computatation.
Sam Roweis, Erik Winfree, Richard Burgoyne, Nickolas V. Chelyapov, Myron F. Goodman, Paul W. K. Rothemund, Leonard M. Adleman.
[ Journal of Computational Biology, 5(4): 615-29, 1998 .pdf, 1.3 MB]
and
On the Reduction of Errors in DNA Computation.
Sam Roweis and Erik Winfree.
[ Journal of Computational Biology, 6(1): 65--5, 1999 .pdf, 884 KB]

On Applying Molecular Computation to the Data Encryption Standard. *
Leonard M. Adleman, Paul W. K. Rothemund, Sam Roweis, Erik Winfree.

An algorithm for breaking DES is designed for the Stickers model. Size, space, and error rates of the resulting machine are considered.

[in DNA Based Computers II, pgs 31-44, 1998 (21 pages): des.pdf, 185 KB, or des.ps.gz, 77 KB, or des.ps, 211 KB]
The journal version: [ Journal of Computational Biology, 6(1): 53-63, 1999 .pdf, 771 KB]

On the Computational Power of DNA Annealing and Ligation. *
Erik Winfree.

Here I show how one might create a "one-pot" mixture of DNA which can perform universal computation by self-assembly of DNA double crossover units. A.k.a. "weaving the tapestry of life".

[in DNA Based Computers, pgs 199-221, 1996, (13 pages): ligation.ps.gz, 157 KB, or ligation.ps, 951KB, or ligation.pdf, 314KB]
[Note, there are strand polarity errors in several figures. EW, 5/96]

Complexity of Restricted and Unrestricted Models of Molecular Computation. *
Erik Winfree.

Here I show some limits on what can be computed using some proposed operations on DNA. In particular, how affinity separation w/ or w/o amplification relates to branching programs and non-deterministic branching programs. These limits have since been overcome by the inclusion of additional operations, such as ligation.

[in DNA Based Computers, pgs 187-198, 1996, (8 pages): models.ps.gz, 72KB, or models.ps, 195KB, or models.pdf, 209KB]

A DNA and restriction enzyme implementation of Turing Machines. *
Paul W. K. Rothemund.

Here Paul gives a construction for simulating Minsky's 4 symbol, 7 state universal Turing machine using DNA, the type IIS restriction enzyme Fok I, and ligase. The encoding of machine state and current symbol used in this paper was later used by Kobi Benenson of Udi Shapiro's group to create DNA finite state machines and hence such machines have been called Rothemund-Shapiro machines.

[in DNA Based Computers, pgs 75-120, 1996, (29 pages below): dimacs.ps, 578KB, or dimacs.pdf, 330KB]

Organizing Centers in a Cellular Excitable Medium. *
Arthur T. Winfree, Erik M. Winfree, and Herbert Seifert.

Excitable media support directed wave propagation that, unlike linear waves, are directed and self-sustaining but annihilate each other when they collide. Examples include, in one dimension, action potential propagation along a neuron; in two dimensions, a slow-moving fire burning across a quick-growing plain of grass leaving a thin zone of burnt grass in its wake; in three dimensions, the electrical-contractile waves that make heart muscle beat. The Belousov-Zhabotinski reaction provides a simple chemical analog in any dimension, and the Greenberg-Hastings cellular automaton provides a simple discrete model that exhibits many of the same qualitative behaviors. The most striking behavior of excitable media in two dimensions is that after part of a wavefront is rubbed out, a spiral pattern develops as the wavefront swirls around its end. In three dimensions, one observes a wavesheet rather than a wavefront -- and if one rubs out a hole in that wavesheet, what develops is a remarkable three dimensional spiral. In fact there are an infinite variety of topologically distinct three dimensional spirals, each corresponding to the Seifert surface of a distinct knot. We simulated two such spirals in a three dimensional Greenberg-Hastings cellular automaton.

[in Physica 17D (1985), 109-115 (7 pages): .pdf, 487KB.]
P.S. This is my first published work! My contribution, as a high school student, was to write the program for simulating the cellular automaton and visualizing initial conditions and the resulting three dimensional pattern formation.

Reference details for DNA Computing conference proceedings:

DNA Based Computers: DIMACS Workshop, held April 4, 1995 (eds Richard J. Lipton and Eric B. Baum) American Mathematical Society, 1996.

DNA Based Computers II: DIMACS Workshop, held June 10-12, 1996 (eds Laura F. Landweber and Eric B. Baum) American Mathematical Society, 1998.

DNA Based Computers III: DIMACS Woskhop, held June 23-25, 1997 (eds Harvey Rubin and David H. Wood) American Mathematical Society, 1999.

Proceedings of the Fourth DIMACS Meeting on DNA Based Computers, held at the University of Pennsylvania, June 16-19, 1998. (Never published as a book; selected articles appear in https://www.sciencedirect.com/journal/biosystems/vol/52/issue/1 but alas not mine.)

DNA Based Computers V: DIMACS Workshop, held June 14-15, 1999. (eds. Erik Winfree and David K. Gifford) American Mathematical Society, 2000.

DNA Based Computers VI: held June 13-17, 2000. (eds. Anne Condon and Grzegorz Rozenberg) Lecture Notes in Computer Science 2054, Springer, 2001.

Send comments to

winfree@caltech.edu