Thursday, September 03, 2009

SIMPLICITY IS THE HIGHEST FORM OF SOPHISTICATION

Do you ever wonder how God or No-God keeps expanding and perfecting the performance of life? Here's a great essay about it.


by Jim Manzi

Jerry Coyne has a fairly scathing review of Robert Wright’s book The Evolution of God in The New Republic. Here’s how the review begins:

Over its history, science has delivered two crippling blows to humanity’s self-image. The first was Galileo’s announcement, in 1632, that our Earth was just another planet and not, as Scripture implied, the center of the universe. The second—and more severe—landed in 1859, when Charles Darwin published On the Origin of Species, demolishing, in 545 pages of closely reasoned prose, the comforting notion that we are unique among all species—the supreme object of God’s creation, and the only creature whose earthly travails could be cashed in for a comfortable afterlife.

But there are some for whom the true evolutionary tale of human life is not sufficiently inspiring or flattering. After all, the tale seems to hold no moral other than this: like all species, we are the result of a purely natural and material process. While many religious people have been persuaded by Darwin’s overwhelming evidence, there still remains a need to find greater meaning behind it all—to see our world as part of an unfolding and divinely scripted plan.

And so the faithful—the ones who care about science at all—have tweaked the theory of evolution to bring it into line with their needs, to make it more congenial. Although life may indeed have evolved, they say, the process was really masterminded by God, whose ultimate goal was to evolve a species, our species, that is able to apprehend and therefore to admire its creator.

Coyne is an eminent evolutionary biologist, but here makes an enormous claim about the philosophical implications of science: that evolution through natural selection demonstrates that there is no divine plan for the universe. I think this claim is, in fact, a gigantic leap of faith unsupported by any scientific findings. Let me try to explain why.

I’ll need to begin by considering evolution at a reasonably concrete level. It’s very helpful to look at a representation of the core algorithmic processes that comprise evolution through natural selection, but abstract away, for the moment, many of its biochemical complexities. Genetic Algorithms (GAs) are computer-software implementations of the same kind of information algorithm that takes place in the biological process of evolution through natural selection. Today, GAs are a widely deployed software engineering tool used to solve such prosaic problems as optimally scheduling trucks on a delivery route or identifying the best combination of process-control settings to get maximum output from a factory.

Consider the example of a chemical plant with a control panel that has 100 on/off switches used to regulate the manufacturing process. You are given the task of finding the combination of switch settings that will generate the highest total output for the plant. How would you solve the problem? One obvious approach would be to run the plant briefly with each possible combination of switch settings and select the best one. Unfortunately, even in this very simplified example there are 2^100 possible combinations. This is a surprisingly gigantic number — much larger, for instance, than the number of grains of sand on Earth. We could spend a million lifetimes trying various combinations of switches and never get to most of the possible combinations.
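To get a feel for how large 2^100 is, here is a rough back-of-the-envelope calculation. The testing rate of one billion combinations per second is purely an assumption chosen to be generous; a real plant run would be vastly slower.

```python
# Rough arithmetic on brute-force search over 100 on/off switches.
combinations = 2 ** 100

# ASSUMPTION (not from the essay): we can somehow test one billion
# switch combinations every second, around the clock.
trials_per_second = 10 ** 9
seconds_per_year = 60 * 60 * 24 * 365

# Whole years needed to try every combination at that rate.
years_needed = combinations // (trials_per_second * seconds_per_year)

print(f"combinations: {combinations:,}")
print(f"years at a billion trials per second: {years_needed:,}")
```

Even at that imaginary rate, the exhaustive approach takes tens of trillions of years, which is why some smarter search procedure is needed.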

But there’s a trick that can help us. Once we start to try combinations, we might begin to notice patterns like “when switches 17 and 84 are set to ‘on,’ production tends to increase when I put switch 53 to the ‘off’ position.” Such insights could help us to narrow down our search, and get to the answer faster. This might not seem to be much help in the face of such an enormous number of possibilities, but the power of these rules is also surprising.

To illustrate this, think of a simple game: I pick a random whole number between one and a billion, and you try to guess it. If the only thing I tell you when you make each guess is whether you are right or wrong, you would have very little chance of guessing my number even if I let you guess non-stop for a year. If, however, I tell you whether each guess is high or low, there is a procedure that will get the exact answer within about 30 guesses. You should always guess 500 million first. For all subsequent guesses, you should always pick the mid-point of the remaining possibilities. If, for example, the response to your opening guess of 500 million is that you are too high, your next guess should be the mid-point of the remaining possibilities, or 250 million. If the response to this second guess is “too low,” then your next guess should be the mid-point of 250 million and 500 million, or 375 million, and so on. You can find my number within about a minute.
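The guessing procedure just described is ordinary binary search. A minimal sketch follows; the function name `guess_number` and its parameters are invented here for illustration.

```python
def guess_number(secret, low=1, high=1_000_000_000):
    """Find `secret` using only too-high / too-low feedback.

    Returns the number of guesses that were needed.
    """
    guesses = 0
    while True:
        guess = (low + high) // 2  # mid-point of the remaining possibilities
        guesses += 1
        if guess == secret:
            return guesses
        if guess > secret:         # "too high": discard the upper half
            high = guess - 1
        else:                      # "too low": discard the lower half
            low = guess + 1


# The opening guess is 500 million, and no secret in range needs more than 30.
print(guess_number(123_456_789))
```

Each answer halves the remaining range, and 2^30 is just over a billion, so 30 guesses always suffice.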

A Genetic Algorithm works on roughly the same principle. To return to our problem of the 2^100 possible combinations of switch settings, we can use a GA as an automated procedure to sort through the vast “search space” of possibilities — and thus home in quickly on the best one. This procedure has the same three elements as our procedure for guessing the number: a starting guess, a feedback measurement that gives some indication of how good any guess is, and an iterative method that exploits this feedback to improve subsequent guesses.

In order to establish the initial guess for the GA, imagine writing a vertical column of 100 zeroes and ones on a piece of paper. If we agree to let one=“turn the switch on” and zero=“turn the switch off,” this could be used as a set of instructions for operating the chemical plant. The first of the hundred would tell us whether switch 1 should be on or off, the second would tell us what to do with switch 2, and so on all the way down to the 100th switch.

This is a pretty obvious analogy to what happens with biological organisms and their genetic codes — and therefore, in a GA, we refer to this list as a “genome.” The mapping of genome to physical manifestation is termed the genotype-phenotype map.

Our goal, then, is to find the genome that will lead the plant to run at maximum output. The algorithm creates an initial bunch of guesses — genomes — by randomly generating, say, 1,000 strings of 100 zeros and ones. We then do 1,000 sequential production runs at the factory, by setting the switches in the plant to the combination of settings indicated by each genome and measuring the output of the plant for each; this measured output is termed the “fitness value.” (Typically, in fact, we construct a software-based simulation of the factory that allows us to run such tests more rapidly.) Next, the program selects the 500 of the 1,000 organisms that have the lowest fitness values and eliminates them. This is the feedback measurement in our algorithm — and it is directly analogous to the competition for survival of biological entities.
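The initial population and the selection step might be sketched as follows. The essay does not say how plant output depends on the switches, so `plant_output` below is a hypothetical stand-in (it simply counts the switches that are on); all function names are likewise invented for illustration.

```python
import random

GENOME_LENGTH = 100      # one digit per switch
POPULATION_SIZE = 1000   # number of initial guesses


def random_genome(rng):
    """A random string of zeros and ones, one per switch."""
    return [rng.randint(0, 1) for _ in range(GENOME_LENGTH)]


def plant_output(genome):
    # HYPOTHETICAL fitness function: a real one would run (or simulate)
    # the plant. Here "output" is just the number of switches set to on.
    return sum(genome)


def select_survivors(population, fitness):
    """Keep the fitter half of the population and eliminate the rest."""
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[: len(population) // 2]


rng = random.Random(0)  # fixed seed so the run is repeatable
population = [random_genome(rng) for _ in range(POPULATION_SIZE)]
survivors = select_survivors(population, plant_output)
print(len(survivors), "genomes survive the first round")
```

Note that the selection step only ranks genomes by measured output. It needs no theory of why one combination of switches works better than another.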

Next comes the algorithmic process for generating new guesses, which has two major components: crossover and mutation. These components are directly modeled on the biological process of reproduction. First, the 500 surviving organisms are randomly paired off into 250 pairs of mates. The GA then proceeds through these pairs of organisms one at a time. For each pair it flips a coin. If the coin comes up heads, then organism A “reproduces” with organism B by simply creating one additional copy of each; this is called direct replication. If it comes up tails, then organism A reproduces with organism B via “crossover”: The program selects a random “crossover point,” say at the 34th of the 100 positions, and then creates one offspring that has the string of zeroes and ones from organism A up to the crossover point and those from organism B after the crossover point, and an additional offspring that has the string of zeroes and ones from organism B up to the crossover point and those from organism A after the crossover point. The 500 resulting offspring are added to the population of 500 surviving parents to create a new population of 1,000 organisms. Finally, a soupçon of mutation is added by randomly flipping roughly every 10,000th digit from zero to one or vice versa.
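A small sketch of the reproduction step, under two assumptions: the crossover point is drawn uniformly from the interior positions, and "roughly every 10,000th digit" is read as an independent 1-in-10,000 chance of flipping each digit. The function names are invented for illustration.

```python
import random


def crossover(parent_a, parent_b, point):
    """Swap the tails of two genomes at the given crossover point."""
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2


def reproduce(parent_a, parent_b, rng):
    """Coin flip: heads is direct replication, tails is crossover."""
    if rng.random() < 0.5:
        return parent_a[:], parent_b[:]       # one extra copy of each parent
    point = rng.randrange(1, len(parent_a))   # ASSUMPTION: uniform interior point
    return crossover(parent_a, parent_b, point)


def mutate(population, rng, rate=1 / 10_000):
    """Flip each digit in place with a small probability (assumed per digit)."""
    for genome in population:
        for i in range(len(genome)):
            if rng.random() < rate:
                genome[i] = 1 - genome[i]


# The essay's example: crossover at the 34th of 100 positions.
organism_a = [0] * 100
organism_b = [1] * 100
first, second = crossover(organism_a, organism_b, 34)
print(first[:34] == organism_a[:34], first[34:] == organism_b[34:])
```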

The new generation is now complete. Fitness is evaluated for each; the bottom 500 are eliminated, and the surviving 500 reproduce through the same process of direct replication, crossover, and mutation to create the subsequent generation. This cycle is repeated over and over again through many generations. The average fitness value of the population moves upward through these iterations, and the algorithm, in fits and starts, closes in on the best solution.
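Putting the pieces together, the whole generational cycle can be sketched in a few dozen lines. As before, this is an illustration under stated assumptions rather than a production GA: `plant_output` is a hypothetical stand-in for the factory (it counts switches that are on), and the mutation rate is applied independently to each digit.

```python
import random

GENOME_LENGTH = 100
POPULATION_SIZE = 1000
MUTATION_RATE = 1 / 10_000  # ASSUMPTION: independent chance per digit


def plant_output(genome):
    # HYPOTHETICAL fitness: number of switches on, standing in for real output.
    return sum(genome)


def next_generation(population, fitness, rng):
    # Selection: eliminate the less fit half of the population.
    survivors = sorted(population, key=fitness, reverse=True)[: len(population) // 2]

    # Randomly pair off the survivors as mates.
    mates = survivors[:]
    rng.shuffle(mates)

    offspring = []
    for a, b in zip(mates[0::2], mates[1::2]):
        if rng.random() < 0.5:                      # heads: direct replication
            offspring += [a[:], b[:]]
        else:                                       # tails: crossover
            point = rng.randrange(1, GENOME_LENGTH)
            offspring += [a[:point] + b[point:], b[:point] + a[point:]]

    # Parents plus offspring form the new population, then mutation is applied.
    new_population = survivors + offspring
    for genome in new_population:
        for i in range(GENOME_LENGTH):
            if rng.random() < MUTATION_RATE:
                genome[i] = 1 - genome[i]
    return new_population


def run(generations, seed=0):
    """Run the GA and record the average fitness of each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(POPULATION_SIZE)]
    history = [sum(map(plant_output, population)) / len(population)]
    for _ in range(generations):
        population = next_generation(population, plant_output, rng)
        history.append(sum(map(plant_output, population)) / len(population))
    return population, history


population, history = run(generations=30)
print(f"average fitness: {history[0]:.1f} at the start, {history[-1]:.1f} after 30 generations")
```

With this toy fitness function the average climbs steadily. A real plant would have a far more rugged landscape, and the climb would come in the fits and starts described above.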

This seems like a laborious process — but it works: it helps us get the factory to very high output much faster than we could otherwise. Computer scientists were inspired to do it this way because they observed the same three fundamental algorithmic operators — selection, crossover, and mutation — accomplish a similar task in the natural world. Notice that the method searches a space of possible solutions far more rapidly than random search, but it neither requires nor generates beliefs about the causal relationship between patterns within the genome and fitness beyond the raw observation of the survival or death of individual organisms. This is what makes the approach applicable to such a vast range of phenomena. That such a comparatively simple concept can explain so much about the way nature works is what makes genetic evolution a scientific paradigm of stupendous beauty and power. As Leonardo put it, simplicity is the highest form of sophistication.

We can make two simple observations about the properties of this GA. First, our factory evolution process did not begin ex nihilo. It required pre-existing building blocks in the form of the initial population and the rules of the algorithm. Second, one of the 2^100 possible combinations of switch settings will produce the highest output. With enough time, the algorithm will always converge on this one answer. The algorithm is therefore the opposite of goalless: it is, rather, a device designed to tend toward a specific needle in a haystack — the single best potential result.

These two observations are highly relevant to our consideration of the philosophical problems of creation (first cause) and purpose (final cause).

It is obvious from the factory analogy that evolution does not eliminate the problem of ultimate origins. Physical genomes are composed of parts, which in turn are assembled from other subsidiary components according to physical laws. We could, in theory, push this construction process back through components and sub-components all the way to the smallest sub-atomic particles currently known, but we would still have to address the problem of original creation. Even if we argue, by analogy with the GA that spontaneously generates its initial population, that prior physical processes created matter, we are still left with the more profound question of the origin of the rules of those physical processes themselves.

This, of course, is a very old question that far pre-dates modern science. A scientific theory is a falsifiable rule that relates cause to effect. If you push the chain of causality back far enough, you either find yourself more or less right back where Aristotle was more than 2,000 years ago in stating his view that any conception of any chain of cause-and-effect must ultimately begin with an Uncaused Cause, or just accept the problem of infinite regress. No matter how far science advances, an explanation of ultimate origins seems always to remain a non-scientific question.

Now consider the relationship of the second observation to the problem of final cause. The factory GA, as we saw, had a goal. Evolution in nature is more complicated, but the complications don’t mean that the process is goalless, just that determining this goal would be so incomprehensibly hard that in practice it falls into the realm of philosophy rather than science. Science cannot tell us whether evolution through natural selection has some final cause; and if we believe, for some non-scientific reason, that evolution has a goal, then science cannot, as of now, tell us what that goal might be.

One important complication is that evolution in nature proceeds against a more complex fitness function than “see how much output this factory creates.” The natural fitness landscape is defined by survival and reproduction, and it is constantly changing as the environment changes — for example, as new species arise or the climate becomes colder. It is prohibitively difficult to calculate the result of this process, but it is, in principle, calculable; the fitness landscape, after all, is only the product of the interaction of other physical processes.

A second major complication is that genetic strings in nature have complex structures and can evolve to some arbitrary length, unlike our factory example, where the genome had a single string with a fixed length of 100 positions. But, even in nature, the genome must always have finite dimension, as regulated by physical laws; and therefore the total number of potential combinations of genetic components remains finite. It is often said, correctly, that the number of possible genetic combinations is “all but infinite”; but of course this is just a very loaded way of saying “finite.”

The combination of a constantly changing fitness landscape and an extraordinarily large number of possible genomes means that scientists appropriately proceed as if evolution were goalless, but from a philosophical perspective a goal may remain present in principle.

But what about the random elements of evolution – how can randomness possibly comport with a goal? First, note that in the factory example, this did not impact the goal, merely the path taken to the goal. Further, it’s especially important that we be clear about our terms on this subject, since a lot of philosophical baggage can get swept into the term “random”. It is often used loosely in discussions of evolution to imply senselessness, a basic lack of understandability, in occurrences. But in fact, even the “random” elements of evolution that influence the path it takes toward its goal — for example, mutation and crossover — are really pseudo-random. For example, if a specific mutation is caused by radiation hitting a nucleotide, both the radiation and its effect on the nucleotide are governed by normal physical laws. Human uncertainty in describing evolution, which as a practical matter we refer to as randomness, is reducible entirely to the impracticality of building a model that comprehensively considers things such as the idiosyncratic path of every photon in the universe compounded by the quantum-mechanistic uncertainty present in fundamental physical laws that govern the motion of such particles. So, said more precisely, the evolutionary process does not add any incremental randomness to outcomes beyond what is already present in other physical laws, simply such great complexity that scientists are well-advised to treat it as if it were goalless. We currently lack the capability to compute either the goal or the path of evolution, but that is a comment about our limitations as observers, not about the process itself.

The theory of evolution, then, has not eliminated the problems of ultimate origins and ultimate purpose with respect to the development of organisms; it has ignored them. These problems are defined as non-scientific questions, not because we don’t care about the answers, but because attempting to solve them would impede practical progress. Accepting evolution, therefore, requires neither the denial of a Creator nor the loss of the idea of ultimate purpose. It resolves neither issue for us one way or the other. The field of philosophical speculation that does not contradict any valid scientific findings is much wider open to Wright than Coyne is willing to accept.

Originally at The Daily Dish

