2.22. Lecture 16: Evolution

Before this class you should:

  • Read Think Complexity, Chapter 11

Before next class you should:

  • Read Think Complexity, Chapter 12

Note taker: Ahsan Minhas

2.22.1. Overview

This lecture introduces a simple computational model of evolution. The goal is to show how a few basic mechanisms can lead to adaptation, diversity, and a basic form of speciation.

2.22.2. Why evolution is misunderstood

The lecture begins by noting that evolution is often misunderstood. As an example, it cites a survey in which some respondents said they believe humans have always existed in their current form.

Possible reasons include:

  • conflict with personal beliefs

  • misinformation

  • lack of understanding of scientific concepts

The lecture focuses on explaining evolution using simple models rather than biological detail.

2.22.3. Key ingredients of evolution

A small number of ingredients are sufficient for evolution:

  • replicators (agents that reproduce)

  • variation between individuals

  • differential survival or reproduction

These are the minimum requirements for evolution to occur in a simulation.

2.22.4. Genotypes and fitness

Each agent represents an organism and has a genotype. In this model, the genotype is represented as a binary string, where each bit represents a trait.

A simple representation of an agent is:

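class Agent:
    """An organism with a genotype, evaluated on a fitness landscape."""

    def __init__(self, genotype, landscape):
        self.genotype = genotype    # sequence of bits, e.g. [0, 1, 1, 0]
        self.landscape = landscape  # object that maps genotypes to fitness

    def get_fitness(self):
        # Fitness is looked up in the landscape rather than stored on the agent.
        return self.landscape.get_fitness(self.genotype)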

Each agent also has a fitness value between 0 and 1. Fitness depends on the genotype and is determined using a fitness landscape.

2.22.5. Fitness landscape

A fitness landscape maps genotypes to fitness values. Some genotypes have higher fitness than others.

A simple implementation is:

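class FitnessLandscape:
    """Lookup table with one fitness value per possible genotype."""

    def __init__(self, values):
        # For N-bit genotypes, values needs 2**N entries.
        self.values = values

    def get_fitness(self, genotype):
        # Read the bits as a binary number, most significant bit first.
        index = int("".join(str(bit) for bit in genotype), 2)
        return self.values[index]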

In this model, the genotype acts like an address. The bits are combined to form a binary number, which is converted into an index. This index is used to look up a fitness value in the landscape.

This means different genotypes correspond to different positions in the landscape, and the value at that position determines the fitness of the agent.
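
As a minimal sketch of the address idea, the snippet below looks up one 3-bit genotype in a small table. The helper name genotype_to_index and the table values are assumptions made for illustration; they are not taken from the lecture.

```python
def genotype_to_index(genotype):
    """Read the bits as a binary number, most significant bit first."""
    return int("".join(str(bit) for bit in genotype), 2)

# Hypothetical fitness table: one value per possible 3-bit genotype (2**3 = 8).
values = [0.1, 0.4, 0.2, 0.9, 0.3, 0.7, 0.5, 0.6]

genotype = [1, 0, 1]                 # binary 101
index = genotype_to_index(genotype)  # 5
fitness = values[index]              # 0.7
print(index, fitness)
```

Because the index is a binary number, a genotype with N bits can address 2**N positions, so the table has to grow exponentially with genotype length.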

The landscape can be visualized as a graph where the horizontal axis represents genotype positions and the vertical axis represents fitness. High points represent highly fit genotypes, while low points represent poorly fit genotypes.

2.22.6. Agent and simulation model

The model uses an Agent class and a Simulation class.

Each agent has:

  • a location in the fitness landscape

  • a genotype

  • a fitness value

The simulation updates the population over time by repeatedly computing fitness values, determining which agents are removed, selecting agents for reproduction, and replacing removed agents with copies of the selected parent agents.

A simple structure for the simulation is:

class Simulation:
    def __init__(self, agents):
        self.agents = agents

    def step(self):
        fitnesses = [agent.get_fitness() for agent in self.agents]

        # determine which agents are removed
        # select parent agents for reproduction
        # replace removed agents with copies of the parents

At each time step, fitness is first computed for every agent. The model then determines which agents are removed. In the baseline version, this removal is random. In the selection version, lower-fitness agents are more likely to be removed. After that, parent agents are selected for reproduction. In the baseline version this is random, while in the selection version higher-fitness agents are more likely to be chosen. Finally, the removed agents are replaced with copies of the selected parent agents.

In the baseline version of the model, there is no selection. This means that survival and reproduction are random rather than being influenced by fitness.

2.22.7. Baseline model (no selection)

In the initial model, survival is random and does not depend on fitness. Reproduction is also random, so higher-fitness agents have no advantage over lower-fitness agents.

This can be represented as:

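import copy
import random

def step(self):
    # Computed for tracking only; the baseline choices below ignore it.
    fitnesses = [agent.get_fitness() for agent in self.agents]

    # Both choices are uniform, so fitness plays no role.
    removed_index = random.randrange(len(self.agents))
    parent_index = random.randrange(len(self.agents))

    self.agents[removed_index] = copy.deepcopy(self.agents[parent_index])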

This results in:

  • no advantage for higher-fitness agents

  • mean fitness changing randomly over time

  • behaviour similar to a random walk

Although the population changes, this does not explain adaptation, because nothing consistently pushes the population toward fitter genotypes. Since survival and reproduction are random, any increase in mean fitness is due to chance and is not sustained by selection.
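
The random-walk behaviour can be sketched with a self-contained script that tracks mean fitness over time. To keep it short, each agent is reduced to its fitness value alone; that simplification, the function names, and the parameter values are all assumptions of this sketch.

```python
import random

def neutral_step(fitnesses, rng):
    """One baseline step: a random agent is replaced by a copy of a random agent."""
    removed = rng.randrange(len(fitnesses))
    parent = rng.randrange(len(fitnesses))
    fitnesses[removed] = fitnesses[parent]

def run_neutral(num_agents=100, num_steps=500, seed=1):
    rng = random.Random(seed)
    # Assumption: an agent is represented only by its fitness value.
    fitnesses = [rng.random() for _ in range(num_agents)]
    means = []
    for _ in range(num_steps):
        neutral_step(fitnesses, rng)
        means.append(sum(fitnesses) / len(fitnesses))
    return means

means = run_neutral()
print(means[0], means[-1])
```

Running this with different seeds should show the mean drifting up in some runs and down in others, with no consistent direction.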

2.22.8. Differential survival

Differential survival is introduced so that fitness affects survival.

Agents with higher fitness are more likely to survive, while lower-fitness agents are more likely to be removed. This can be implemented by making the removal probability decrease with fitness (for example, proportional to 1 - fitness) and making the reproduction probability increase with fitness (for example, proportional to fitness).

For example:

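import copy
import random

def step(self):
    fitnesses = [agent.get_fitness() for agent in self.agents]
    indices = range(len(self.agents))

    # Weighted draws: low fitness -> likelier removed, high -> likelier parent.
    # random.choices needs a positive total weight (not all 0 or all 1).
    removed_index = random.choices(indices, weights=[1 - f for f in fitnesses])[0]
    parent_index = random.choices(indices, weights=fitnesses)[0]

    self.agents[removed_index] = copy.deepcopy(self.agents[parent_index])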

In this version, agents with low fitness have a higher chance of being removed, and agents with high fitness have a higher chance of being chosen as parents. This causes the population to move toward genotype positions associated with higher values in the fitness landscape.

This leads to:

  • an increase in mean fitness over time

  • movement toward higher fitness regions

However, the number of distinct genotype positions occupied by the population decreases over time, so diversity is limited. Since genotype positions correspond to locations in the fitness landscape, this means more agents become concentrated around a smaller number of fit regions.
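
The loss of diversity can be checked with a small self-contained run that counts distinct genotypes after every step. This is a sketch under stated assumptions: the landscape is a hypothetical one with an independent random fitness per genotype, and the function names, parameter values, and the small constant eps are illustrative only.

```python
import random

def selection_step(genotypes, fitness_of, rng):
    """Remove a likely low-fitness agent and copy a likely high-fitness one."""
    fitnesses = [fitness_of[g] for g in genotypes]
    indices = range(len(genotypes))
    # eps keeps the total weight positive in edge cases (assumption of this sketch).
    eps = 1e-9
    removed = rng.choices(indices, weights=[1 - f + eps for f in fitnesses])[0]
    parent = rng.choices(indices, weights=[f + eps for f in fitnesses])[0]
    genotypes[removed] = genotypes[parent]

def run_selection(num_bits=8, num_agents=100, num_steps=2000, seed=1):
    rng = random.Random(seed)
    fitness_of = {}   # hypothetical landscape: random fitness per genotype
    genotypes = []
    for _ in range(num_agents):
        g = tuple(rng.randint(0, 1) for _ in range(num_bits))
        fitness_of.setdefault(g, rng.random())
        genotypes.append(g)

    def mean_fitness():
        return sum(fitness_of[g] for g in genotypes) / len(genotypes)

    means = [mean_fitness()]
    distinct = [len(set(genotypes))]
    for _ in range(num_steps):
        selection_step(genotypes, fitness_of, rng)
        means.append(mean_fitness())
        distinct.append(len(set(genotypes)))
    return means, distinct

means, distinct = run_selection()
print("mean fitness:", means[0], "->", means[-1])
print("distinct genotypes:", distinct[0], "->", distinct[-1])
```

Without mutation, a copy can only duplicate a genotype that already exists, so the count of distinct genotypes can never go up from one step to the next.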

2.22.9. Mutation

Mutation is added to introduce variation.

When an agent is copied, there is a small probability of mutation (approximately 0.05). Since the genotype is a binary string, mutation is implemented by choosing one bit position at random and flipping it from 0 to 1 or from 1 to 0.

A simple implementation is:

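import random

def mutate(genotype, p=0.05):
    """Return a copy of genotype; with probability p, flip one random bit."""
    new_genotype = genotype[:]  # copy so the parent's genotype is unchanged
    if random.random() < p:
        i = random.randrange(len(new_genotype))
        new_genotype[i] = 1 - new_genotype[i]  # 0 -> 1, 1 -> 0
    return new_genotype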

Mutation changes the genotype of copied agents, allowing the population to explore new positions in the fitness landscape.

Effects of mutation include:

  • new genotypes appearing

  • increased diversity

  • a balance between mutation and selection over time
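
As a rough usage sketch, the snippet below copies one parent many times and counts how many copies carry a mutation. It restates the mutation rule so that it runs on its own; the optional rng parameter, the helper count_differences, and the example parent are assumptions added for illustration.

```python
import random

def mutate(genotype, p=0.05, rng=random):
    """Return a copy of genotype; with probability p, flip one random bit."""
    new_genotype = list(genotype)
    if rng.random() < p:
        i = rng.randrange(len(new_genotype))
        new_genotype[i] = 1 - new_genotype[i]
    return new_genotype

def count_differences(a, b):
    """Number of positions where two genotypes differ."""
    return sum(x != y for x, y in zip(a, b))

parent = [0, 1, 1, 0, 1, 0, 0, 1]
rng = random.Random(42)
children = [mutate(parent, p=0.05, rng=rng) for _ in range(1000)]

# Each child differs from the parent in either 0 bits or exactly 1 bit.
num_mutants = sum(count_differences(parent, c) == 1 for c in children)
print(num_mutants, "of", len(children), "copies mutated")
```

With p = 0.05, about 5% of copies are expected to differ from the parent on average, and each mutant is exactly one bit flip away, so new genotypes appear next to existing ones in genotype space.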

2.22.10. Speciation and genotype distance

To compare agents, a distance measure between genotypes is defined.

Distance is calculated as the number of differing bits, also known as the Hamming distance, using XOR. If two agents have the same bit in a given position, the XOR result for that position is 0. If they differ, the XOR result is 1. Summing these values across all bit positions gives the total number of differences between the two genotypes. This provides a simple measure of genetic distance.
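
The XOR-and-sum calculation can be sketched as follows, assuming genotypes are equal-length lists of 0s and 1s. The function name genotype_distance is an assumption, not a name from the lecture.

```python
def genotype_distance(a, b):
    """Count the positions where two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("genotypes must have the same length")
    # x ^ y is 1 when the two bits differ and 0 when they match.
    return sum(x ^ y for x, y in zip(a, b))

# Positions 0 and 2 differ, positions 1 and 3 match.
print(genotype_distance([0, 1, 1, 0], [1, 1, 0, 0]))  # 2
```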

Population dispersion is measured as the average distance between agents. When agents are close together in genotype space, they form a cluster. In this model, different clusters can be interpreted as different species because agents within the same cluster are genetically similar, while agents in different clusters are much more genetically separated.
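
One possible way to compute dispersion is to average the distance over every pair of agents. The notes do not say exactly which pairs are averaged, so the all-pairs choice and the function names below are assumptions.

```python
from itertools import combinations

def genotype_distance(a, b):
    """Number of differing bits between two equal-length bit lists."""
    return sum(x ^ y for x, y in zip(a, b))

def mean_pairwise_distance(genotypes):
    """Average distance over all unordered pairs of genotypes."""
    pairs = list(combinations(genotypes, 2))
    if not pairs:
        return 0.0  # fewer than two agents: no pairs to compare
    return sum(genotype_distance(a, b) for a, b in pairs) / len(pairs)

population = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
]
# Pair distances are 1, 4 and 3, so the mean is 8 / 3.
print(mean_pairwise_distance(population))
```

The all-pairs average needs on the order of n squared comparisons for n agents, so a large population might be sampled instead.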

This is a simplified computational definition of species. It does not capture all aspects of biological speciation, but it demonstrates how distinct groups can emerge from differences in genotype over time.

2.22.11. Clusters and species

At equilibrium, agents form clusters in the fitness landscape.

If the landscape changes:

  • the population shifts to new regions because genotypes that were previously associated with high fitness may no longer be optimal

  • new clusters can form as selection begins favouring different regions of the updated landscape

  • distances between clusters can increase as groups adapt to different local fitness peaks

If clusters are far apart compared to distances within clusters, they can be interpreted as different species.
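
The comparison of within-cluster and between-cluster distances can be sketched as below. The notes do not specify how agents are assigned to clusters, so this example assumes the cluster membership is already known; the function names and the two example clusters are hypothetical.

```python
from itertools import combinations, product

def genotype_distance(a, b):
    """Number of differing bits between two equal-length bit lists."""
    return sum(x ^ y for x, y in zip(a, b))

def mean_within(cluster):
    """Average distance between pairs of agents in the same cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(genotype_distance(a, b) for a, b in pairs) / len(pairs)

def mean_between(cluster_a, cluster_b):
    """Average distance between agents taken from two different clusters."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(genotype_distance(a, b) for a, b in pairs) / len(pairs)

# Hypothetical clusters: one near 000000, the other near 111111.
cluster_a = [[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 1, 0]]
cluster_b = [[1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 0], [1, 1, 1, 1, 0, 1]]

within_a = mean_within(cluster_a)
within_b = mean_within(cluster_b)
between = mean_between(cluster_a, cluster_b)
print(within_a, within_b, between)
```

Here the between-cluster average is several times larger than either within-cluster average, which is the pattern the species interpretation relies on.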

A simple way to visualize these changes is to plot summary statistics before and after the landscape changes. For example, one graph can show the average distance within clusters, and another can show the average distance between clusters.

import matplotlib.pyplot as plt

states = ["Before change", "After change"]
within_cluster = [1.2, 1.4]
between_cluster = [2.1, 4.8]

plt.figure()
plt.plot(states, within_cluster, marker="o")
plt.title("Average Distance Within Clusters")
plt.ylabel("Distance")
plt.show()

plt.figure()
plt.plot(states, between_cluster, marker="o")
plt.title("Average Distance Between Clusters")
plt.ylabel("Distance")
plt.show()

In this example, the average distance within clusters stays relatively small, while the average distance between clusters increases after the landscape changes. A pattern like this is consistent with the population splitting into more distinct, species-like groups.

2.22.12. Conclusion

This lecture shows that three simple mechanisms can produce important evolutionary behaviour:

  • reproduction

  • differential survival

  • mutation

These lead to:

  • increasing fitness

  • increasing diversity

  • formation of distinct clusters (speciation)