2.22. Lecture 16: Evolution
Before this class you should:
Read Think Complexity, Chapter 11
Before next class you should:
Read Think Complexity, Chapter 12
Note taker: Ahsan Minhas
2.22.1. Overview
This lecture introduces a simple computational model of evolution. The goal is to show how a few basic mechanisms can lead to adaptation, diversity, and a basic form of speciation.
2.22.2. Why evolution is misunderstood
The lecture begins by noting that evolution is often misunderstood. As an example, it cites a survey in which some respondents said they believe humans have always existed in their current form.
Possible reasons include:
conflict with personal beliefs
misinformation
lack of understanding of scientific concepts
The lecture focuses on explaining evolution using simple models rather than biological detail.
2.22.3. Key ingredients of evolution
A small number of ingredients are sufficient for evolution:
replicators (agents that reproduce)
variation between individuals
differential survival or reproduction
These are the minimum requirements needed for evolution in a simulation.
2.22.4. Genotypes and fitness
Each agent represents an organism and has a genotype. In this model, the genotype is represented as a binary string, where each bit represents a trait.
A simple representation of an agent is:
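class Agent:
    def __init__(self, genotype, landscape):
        self.genotype = genotype    # list of bits, e.g. [1, 0, 1]
        self.landscape = landscape  # landscape used to look up fitness

    def get_fitness(self):
        # Fitness is looked up in the landscape using the genotype.
        return self.landscape.get_fitness(self.genotype)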
Each agent also has a fitness value between 0 and 1. Fitness depends on the genotype and is determined using a fitness landscape.
2.22.5. Fitness landscape
A fitness landscape maps genotypes to fitness values. Some genotypes have higher fitness than others.
A simple implementation is:
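class FitnessLandscape:
    def __init__(self, values):
        self.values = values  # one fitness value per possible genotype

    def get_fitness(self, genotype):
        # Read the bits as a binary number and use it as an index.
        index = int("".join(str(bit) for bit in genotype), 2)
        return self.values[index]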
In this model, the genotype acts like an address. The bits are combined to form a binary number, which is converted into an index. This index is used to look up a fitness value in the landscape.
This means different genotypes correspond to different positions in the landscape, and the value at that position determines the fitness of the agent.
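As a small worked example of the lookup described above, the sketch below assumes a 3-bit genotype and a landscape of eight randomly generated fitness values. The helper name `genotype_to_index` and the random values are illustrative and not taken from the lecture.

```python
import random

def genotype_to_index(genotype):
    """Read a list of bits as a binary number (most significant bit first)."""
    return int("".join(str(bit) for bit in genotype), 2)

# Assumed setup: 3 bits give 2**3 = 8 possible genotypes,
# so the landscape needs 8 fitness values, each between 0 and 1.
n_bits = 3
rng = random.Random(0)
values = [rng.random() for _ in range(2 ** n_bits)]

genotype = [1, 0, 1]
index = genotype_to_index(genotype)  # binary 101 is decimal 5
fitness = values[index]
print(index, round(fitness, 3))
```

Note that a genotype with N bits needs a landscape with 2**N values, so this table-based representation is only practical for short genotypes.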
The landscape can be visualized as a graph where the horizontal axis represents genotype positions and the vertical axis represents fitness. High points represent highly fit genotypes, while low points represent poorly fit genotypes.
2.22.6. Agent and simulation model
The model uses an Agent class and a Simulation class.
Each agent has:
a location in the fitness landscape
a genotype
a fitness value
The simulation updates the population over time by repeatedly computing fitness values, determining which agents are removed, selecting agents for reproduction, and replacing removed agents with copies of the selected parent agents.
A simple structure for the simulation is:
class Simulation:
    def __init__(self, agents):
        self.agents = agents

    def step(self):
        fitnesses = [agent.get_fitness() for agent in self.agents]
        # determine which agents are removed
        # select parent agents for reproduction
        # replace removed agents with copies of the parents
Each time step has four stages. First, fitness is computed for every agent. Second, the model chooses which agents to remove: at random in the baseline version, and with lower-fitness agents more likely to be removed in the selection version. Third, parents are chosen for reproduction: at random in the baseline version, and with higher-fitness agents more likely to be chosen in the selection version. Finally, the removed agents are replaced with copies of the selected parents.
The baseline version therefore has no selection: survival and reproduction do not depend on fitness.
2.22.7. Baseline model (no selection)
In the initial model, survival is random and does not depend on fitness. Reproduction is also random, so higher-fitness agents have no advantage over lower-fitness agents.
This can be represented as:
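import copy
import random

def step(self):
    # Fitness is computed but deliberately ignored in the baseline model.
    fitnesses = [agent.get_fitness() for agent in self.agents]
    # Pick the agent to remove and the parent uniformly at random.
    removed_index = random.randrange(len(self.agents))
    parent_index = random.randrange(len(self.agents))
    self.agents[removed_index] = copy.deepcopy(self.agents[parent_index])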
This results in:
no advantage for higher fitness agents
mean fitness changing randomly over time
behaviour similar to a random walk
Although the population changes, this does not explain adaptation, because nothing consistently pushes the population toward fitter genotypes. Since survival and reproduction are random, any increase in mean fitness arises by chance and can just as easily be reversed; it is not the result of selection.
2.22.8. Differential survival
Differential survival is introduced so that fitness affects survival.
Agents with higher fitness are more likely to survive, while lower fitness agents are more likely to be removed. This can be implemented by making the removal probability inversely related to fitness, and making the reproduction probability directly related to fitness.
For example:
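import copy

def step(self):
    fitnesses = [agent.get_fitness() for agent in self.agents]
    # weighted_choice(weights) is assumed to return index i with probability
    # proportional to weights[i]; it is not defined in these notes.
    removed_index = weighted_choice([1 - f for f in fitnesses])  # low fitness: likely removed
    parent_index = weighted_choice(fitnesses)  # high fitness: likely parent
    self.agents[removed_index] = copy.deepcopy(self.agents[parent_index])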
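The step function above relies on `weighted_choice`, which the notes do not define. One possible sketch, assuming it should return an index with probability proportional to the corresponding weight, uses `random.choices` from the standard library. The fallback to a uniform choice when every weight is zero is an added assumption; it covers the case where every agent has fitness 1, so that every removal weight `1 - f` is 0.

```python
import random

def weighted_choice(weights):
    """Return an index i with probability proportional to weights[i].

    Hypothetical helper: the lecture notes use this name but do not define it.
    """
    if sum(weights) <= 0:
        # Assumption: with no usable weights, fall back to a uniform choice.
        return random.randrange(len(weights))
    return random.choices(range(len(weights)), weights=weights)[0]

# An agent with weight 0 is never picked while another weight is positive.
print(weighted_choice([0, 0, 1]))
```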
In this version, agents with low fitness have a higher chance of being removed, and agents with high fitness have a higher chance of being chosen as parents. This causes the population to move toward genotype positions associated with higher values in the fitness landscape.
This leads to:
an increase in mean fitness over time
movement toward higher fitness regions
However, the number of distinct genotypes in the population decreases over time: copying can only duplicate genotypes that already exist, so without a source of new variation diversity can only fall. Because each genotype corresponds to a location in the fitness landscape, agents become concentrated at a small number of high-fitness locations.
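To compare the two behaviours side by side, the sketch below runs a stripped-down simulation with and without selection and records mean fitness and the number of distinct genotypes after each step. It is a simplification under stated assumptions rather than the lecture's code: genotypes are stored directly as integer landscape indices, the landscape is random, and the function name `run`, the population size, and the step count are all illustrative.

```python
import random

def run(values, n_bits, n_agents=50, n_steps=500, selection=True, seed=1):
    """Return a list of (mean fitness, distinct genotype count), one per step."""
    rng = random.Random(seed)
    # Assumption: each genotype is stored as its landscape index.
    population = [rng.randrange(2 ** n_bits) for _ in range(n_agents)]
    history = []
    for _ in range(n_steps):
        fitnesses = [values[g] for g in population]
        if selection:
            # A tiny epsilon keeps the weights from summing to zero.
            removal_weights = [1 - f + 1e-9 for f in fitnesses]
            parent_weights = [f + 1e-9 for f in fitnesses]
        else:
            # Baseline: every agent is equally likely to be removed or copied.
            removal_weights = parent_weights = [1] * n_agents
        removed = rng.choices(range(n_agents), weights=removal_weights)[0]
        parent = rng.choices(range(n_agents), weights=parent_weights)[0]
        population[removed] = population[parent]
        mean_fitness = sum(values[g] for g in population) / n_agents
        history.append((mean_fitness, len(set(population))))
    return history

n_bits = 8
landscape_rng = random.Random(0)
values = [landscape_rng.random() for _ in range(2 ** n_bits)]

baseline = run(values, n_bits, selection=False)
selected = run(values, n_bits, selection=True)
print("baseline  (mean fitness, distinct genotypes):", baseline[-1])
print("selection (mean fitness, distinct genotypes):", selected[-1])
```

In both runs the count of distinct genotypes can never rise, because each step only overwrites one genotype with a copy of another.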
2.22.9. Mutation
Mutation is added to introduce variation.
When an agent is copied, there is a small probability of mutation (approximately 0.05). Since the genotype is a binary string, mutation is implemented by choosing one bit position at random and flipping it from 0 to 1 or from 1 to 0.
A simple implementation is:
import random

def mutate(genotype, p=0.05):
    new_genotype = genotype[:]  # copy so the parent's genotype is unchanged
    if random.random() < p:
        # flip one randomly chosen bit
        i = random.randrange(len(new_genotype))
        new_genotype[i] = 1 - new_genotype[i]
    return new_genotype
Mutation changes the genotype of copied agents, allowing the population to explore new positions in the fitness landscape.
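As a quick check of what this mutation rule does, the sketch below copies one parent genotype many times and counts how many copies differ from it. The parent genotype, the number of copies, and the fixed seed are illustrative choices; `mutate` is repeated so the block runs on its own.

```python
import random

def mutate(genotype, p=0.05):
    """With probability p, flip one randomly chosen bit in a copy."""
    new_genotype = genotype[:]
    if random.random() < p:
        i = random.randrange(len(new_genotype))
        new_genotype[i] = 1 - new_genotype[i]
    return new_genotype

random.seed(0)  # fixed seed so repeated runs give the same counts
parent = [0, 1, 1, 0, 1]
children = [mutate(parent) for _ in range(1000)]
mutants = [child for child in children if child != parent]
print(f"{len(mutants)} of {len(children)} copies were mutated")
```

Each mutated copy differs from the parent in exactly one bit, so under this rule a single mutation moves an agent to a genotype at distance 1 from its parent.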
Effects of mutation include:
new genotypes appearing
increased diversity
a balance between mutation and selection over time
2.22.10. Speciation and genotype distance
To compare agents, a distance measure between genotypes is defined.
Distance is calculated as the number of differing bits using XOR. If two agents have the same bit in a given position, the XOR result for that position is 0. If they differ, the XOR result is 1. Summing these values across all bit positions gives the total number of differences between the two genotypes. This provides a simple measure of genetic distance.
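The distance just described (the Hamming distance between two bit strings) can be sketched as follows, assuming genotypes are equal-length lists of 0s and 1s. The function name `distance` is illustrative.

```python
def distance(genotype_a, genotype_b):
    """Count the positions at which two equal-length bit lists differ."""
    if len(genotype_a) != len(genotype_b):
        raise ValueError("genotypes must have the same length")
    # a ^ b is 1 when the bits differ and 0 when they match.
    return sum(a ^ b for a, b in zip(genotype_a, genotype_b))

a = [0, 1, 1, 0]
b = [1, 1, 0, 0]
# XOR per position gives 1, 0, 1, 0, so the genotypes differ in two places.
print(distance(a, b))
```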
Population dispersion is measured as the average distance between agents. When agents are close together in genotype space, they form a cluster. In this model, different clusters can be interpreted as different species because agents within the same cluster are genetically similar, while agents in different clusters are much more genetically separated.
This is a simplified computational definition of species. It does not capture all aspects of biological speciation, but it demonstrates how distinct groups can emerge from differences in genotype over time.
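The dispersion measure mentioned above can be sketched as the mean distance over all unordered pairs of agents. Averaging over all pairs is an assumption, since the notes only say "average distance between agents"; the function names and the example populations are illustrative.

```python
from itertools import combinations

def distance(a, b):
    """Number of differing bits between two equal-length bit lists."""
    return sum(x ^ y for x, y in zip(a, b))

def mean_pairwise_distance(genotypes):
    """Average distance over all unordered pairs of genotypes."""
    pairs = list(combinations(genotypes, 2))
    if not pairs:
        return 0.0  # fewer than two genotypes: no pairs to compare
    return sum(distance(a, b) for a, b in pairs) / len(pairs)

tight = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
spread = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 0, 1]]
print(mean_pairwise_distance(tight))   # small: genotypes are nearly identical
print(mean_pairwise_distance(spread))  # larger: genotypes are far apart
```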
2.22.11. Clusters and species
At equilibrium, agents form clusters in the fitness landscape.
If the landscape changes:
the population shifts to new regions because genotypes that were previously associated with high fitness may no longer be optimal
new clusters can form as selection begins favouring different regions of the updated landscape
distances between clusters can increase as groups adapt to different local fitness peaks
If clusters are far apart compared to distances within clusters, they can be interpreted as different species.
A simple way to visualize these changes is to plot summary statistics before and after the landscape changes. For example, one graph can show the average distance within clusters, and another can show the average distance between clusters.
import matplotlib.pyplot as plt
states = ["Before change", "After change"]
within_cluster = [1.2, 1.4]
between_cluster = [2.1, 4.8]
plt.figure()
plt.plot(states, within_cluster, marker="o")
plt.title("Average Distance Within Clusters")
plt.ylabel("Distance")
plt.show()
plt.figure()
plt.plot(states, between_cluster, marker="o")
plt.title("Average Distance Between Clusters")
plt.ylabel("Distance")
plt.show()
In this example, which uses hard-coded illustrative values, the average distance within clusters stays relatively small, while the average distance between clusters increases after the landscape changes. A pattern like this in simulation output would support the idea that the population is splitting into more distinct species-like groups.
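The two summary statistics plotted above could be computed from a population along the following lines. This is a sketch that assumes the agents have already been assigned to clusters; the notes do not say how clusters are identified, and the function names and example genotypes are illustrative.

```python
from itertools import combinations, product

def distance(a, b):
    """Number of differing bits between two equal-length bit lists."""
    return sum(x ^ y for x, y in zip(a, b))

def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0

def within_cluster_distance(clusters):
    """Mean distance over pairs of genotypes that share a cluster."""
    return mean(distance(a, b)
                for cluster in clusters
                for a, b in combinations(cluster, 2))

def between_cluster_distance(clusters):
    """Mean distance over pairs of genotypes drawn from different clusters."""
    return mean(distance(a, b)
                for c1, c2 in combinations(clusters, 2)
                for a, b in product(c1, c2))

# Assumed input: two clusters, each a list of 6-bit genotypes.
clusters = [
    [[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1]],
    [[1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 0, 1]],
]
print(within_cluster_distance(clusters))
print(between_cluster_distance(clusters))
```

When the between-cluster value is much larger than the within-cluster value, the clusters are well separated in genotype space, which is the condition the notes use to interpret them as different species.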
2.22.12. Conclusion
This lecture shows that three simple mechanisms can produce important evolutionary behaviour:
reproduction
differential survival
mutation
These lead to:
increasing fitness
increasing diversity
formation of distinct clusters (speciation)