2.17. Lecture 16: Evolution
Before this class you should:
Read Think Complexity, Chapter 11
Before next class you should:
Read Think Complexity, Chapter 12
Note Taker: Liam Rocheleau
2.17.1. Theory of Evolution
The theory of evolution claims that new species emerge from existing species through natural selection: certain traits make individuals more successful at surviving and reproducing, which leads to those traits propagating among their descendants. About 34% of Americans do not believe in the theory of evolution (Pew Research Center survey). Downey suggests the following reasons why this is the case:
Conflict between evolution and religious beliefs
Misinformation, often by members of the first group
Lack of knowledge on the theory of evolution
Evolution can be easier to understand when simulated rather than explained via theory.
2.17.2. Simulating Evolution
The following features are sufficient to produce evolution:
Replicators: agents that can reproduce in some way. We will start with perfect copying and move on to imperfect copying
Variation: differences between individuals
Differential survival or reproduction: variation between individuals affects their ability to survive or reproduce
2.17.3. Genotypes and Fitness
We define a population of agents that represent individual organisms. Each agent has genetic information called its genotype, which is copied on replication. To generate variation, we create a population with a variety of genotypes. To generate differential survival and reproduction, we define a function that maps each genotype to a fitness, where fitness is a quantity related to the agent's ability to survive or reproduce (similar to the Sugarscape model). This function is referred to as the fitness landscape.
2.17.4. Notion of Fitness Landscape
The fitness landscape is a function that maps each genotype to a scalar fitness value, which determines the genotype’s “chance of success”. It helps to picture genotypes as locations on a 2D landscape whose “height” at each location is the fitness. A landscape can be static or dynamic: in a dynamic landscape the high points move over time, reflecting changes in which characteristics are optimal, while a static landscape does not change with time. We use a static landscape in our simulations for simplicity.
2.17.5. Fitness Landscape Class
To simulate evolution, we first need to define a class for the fitness landscape. Agents (individual “organisms” in the simulation) will refer to the FitnessLandscape object in order to measure their fitness.
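import numpy as np

class FitnessLandscape:
    def __init__(self, N):
        self.N = N  # number of bits in a genotype
        # fitness contribution of each bit when it is 1 or when it is 0
        self.one_values = np.random.random(N)
        self.zero_values = np.random.random(N)

    def fitness(self, loc):
        # pick each bit's contribution, then average over the N bits
        fs = np.where(loc, self.one_values, self.zero_values)
        return fs.mean()

# see the lab 12 notebook for the full FitnessLandscape class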
We will use a random fitness landscape in our simulations for simplicity.
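To make the mapping concrete, here is a minimal sketch with N = 3 and hand-picked contribution values. The numbers are made up for illustration (the class above draws them at random), and the standalone `fitness` function is only a stand-in for the method. Each bit of the genotype selects either its `one_values` entry or its `zero_values` entry, and the fitness is the mean of the selected entries.

```python
import numpy as np

# Hypothetical contribution values for a landscape with N = 3.
one_values = np.array([0.9, 0.2, 0.5])
zero_values = np.array([0.1, 0.8, 0.4])

def fitness(loc):
    """Mean of the per-bit contributions selected by the genotype `loc`."""
    fs = np.where(loc, one_values, zero_values)
    return fs.mean()

loc = np.array([1, 0, 1])
f = fitness(loc)  # the bits select 0.9, 0.8 and 0.5, so f = (0.9 + 0.8 + 0.5) / 3
print(f)
```

Because every fitness is a mean of values between 0 and 1, it is itself always between 0 and 1.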
2.17.6. Agent Class
As mentioned previously, the agents in our model represent individual organisms. Each agent has a genotype, represented by its location on the fitness landscape, and each genotype is a length-N binary array. The fitness landscape maps each location in N-dimensional space to a fitness value.
class Agent:
    def __init__(self, loc, fit_land):
        self.loc = loc
        self.fit_land = fit_land
        self.fitness = fit_land.fitness(self.loc)

    def copy(self):
        return Agent(self.loc, self.fit_land)
The attributes of each agent are:
loc: the location of the agent in the fitness landscape
fit_land: a reference to a FitnessLandscape object
fitness: the fitness of the agent in the FitnessLandscape represented as a number between 0 and 1
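As a rough sketch of how a varied starting population might be built, the snippet below gives each agent a random length-N binary genotype. The helper name `make_random_agents`, the seed, and the population size are assumptions made for this illustration; the two classes are repeated in condensed form only so the snippet runs on its own.

```python
import numpy as np

class FitnessLandscape:
    def __init__(self, N):
        self.N = N
        self.one_values = np.random.random(N)
        self.zero_values = np.random.random(N)

    def fitness(self, loc):
        return np.where(loc, self.one_values, self.zero_values).mean()

class Agent:
    def __init__(self, loc, fit_land):
        self.loc = loc
        self.fit_land = fit_land
        self.fitness = fit_land.fitness(self.loc)

def make_random_agents(fit_land, num_agents):
    """Hypothetical helper: one agent per random length-N binary genotype."""
    locs = np.random.randint(2, size=(num_agents, fit_land.N))
    return [Agent(loc, fit_land) for loc in locs]

np.random.seed(17)  # arbitrary seed, only for repeatability
fit_land = FitnessLandscape(N=8)
agents = make_random_agents(fit_land, num_agents=100)
fits = np.array([agent.fitness for agent in agents])
print(fits.min(), fits.mean(), fits.max())
```

All agents share one FitnessLandscape object, so their fitness values are directly comparable.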
2.17.7. Simulation Class
Now that we have the fitness landscape and agents, we define the Simulation class, which contains methods that will simulate a simple model of evolution.
class Simulation:
    def __init__(self, fit_land, agents):
        self.fit_land = fit_land
        # stored as an array so it can be indexed by index_dead in step()
        self.agents = np.asarray(agents)

    def step(self):
        n = len(self.agents)
        fits = self.get_fitnesses()
        # see who dies
        index_dead = self.choose_dead(fits)
        num_dead = len(index_dead)
        # replace the dead with copies of the living
        replacements = self.choose_replacements(num_dead, fits)
        self.agents[index_dead] = replacements

    def choose_dead(self, fits):
        # each agent dies with probability 0.1, regardless of fitness
        n = len(self.agents)
        is_dead = np.random.random(n) < 0.1
        index_dead = np.nonzero(is_dead)[0]
        return index_dead

    def choose_replacements(self, n, fits):
        # draw n agents uniformly at random, with replacement
        agents = np.random.choice(self.agents, size=n, replace=True)
        replacements = [agent.copy() for agent in agents]
        return replacements

# see the lab 12 notebook for the full Simulation class
Attributes of a simulation are:
fit_land: a reference to a FitnessLandscape object
agents: a list of agent objects
In each time step, the simulation reads the agents' fitnesses, chooses which agents die, and replaces the dead with copies of agents chosen at random from the population. In this base class, each agent dies with probability 0.1 regardless of its fitness, and every agent is equally likely to be chosen as a replacement.
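The effect of one time step can be sketched on a plain array of fitness values, without any agent objects. This is a simplified stand-in for the step method, not the class itself; the population size of 1000 and the seed are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

fits = rng.random(1000)  # stand-in for the fitnesses of 1000 agents

# Each agent dies with probability 0.1, independent of its fitness.
is_dead = rng.random(fits.size) < 0.1
index_dead = np.nonzero(is_dead)[0]

# Replace the dead with copies drawn uniformly at random from the population.
replacements = rng.choice(fits, size=index_dead.size, replace=True)
new_fits = fits.copy()
new_fits[index_dead] = replacements

print(index_dead.size, fits.mean(), new_fits.mean())
```

The population size stays constant, and since neither choice looks at fitness, the mean fitness only drifts randomly from step to step.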
2.17.8. Evidence of Evolution
Neither choose_dead nor choose_replacements depends on fitness, so this simulation has no differential survival or reproduction. As a result, there should be no evolution. But how can we tell? The most inclusive definition of evolution is a change in the distribution of genotypes. Because genotypes are high-dimensional in this simulation, changes in their distribution are hard to visualize. Instead, we use changes in the distribution of fitness as evidence of evolution. Jaggedness in the CDF plot of fitness (see the lab 12 notebook) indicates fewer unique values, meaning a loss of diversity in genotypes.
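The link between a jagged CDF and lost diversity can be checked numerically. The sketch below uses two small, made-up fitness samples and a hypothetical helper `empirical_cdf`; it is not the plotting code from the notebook.

```python
import numpy as np

def empirical_cdf(values):
    """Return the sorted values and the fraction of values <= each one."""
    xs = np.sort(values)
    ps = np.arange(1, len(xs) + 1) / len(xs)
    return xs, ps

# Made-up samples: after some steps, copies have replaced several agents.
fits_before = np.array([0.31, 0.42, 0.48, 0.55, 0.63, 0.70])
fits_after = np.array([0.42, 0.42, 0.42, 0.55, 0.55, 0.70])

xs, ps = empirical_cdf(fits_after)
n_before = len(np.unique(fits_before))  # 6 distinct fitness values
n_after = len(np.unique(fits_after))    # 3 distinct fitness values
print(n_before, n_after)
```

Repeated values stack up at the same x position, which produces the tall vertical jumps that make the CDF look jagged.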
2.17.9. Instrument/Mean Fitness
As a side note, we define Instrument classes to help measure changes over the course of the simulation. For example, the MeanFitness object computes the mean fitness at each time step. To see how this model functions, see the lab 12 notebook.
class Instrument:
    def __init__(self):
        self.metrics = []

class MeanFitness(Instrument):
    def update(self, sim):
        mean = np.nanmean(sim.get_fitnesses())
        self.metrics.append(mean)
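A small usage sketch shows how an instrument accumulates one value per time step. `StubSim` is a hypothetical stand-in that exposes only get_fitnesses(), and the fitness values are made up; the real simulation would call update after each step.

```python
import numpy as np

class Instrument:
    def __init__(self):
        self.metrics = []

class MeanFitness(Instrument):
    def update(self, sim):
        mean = np.nanmean(sim.get_fitnesses())
        self.metrics.append(mean)

class StubSim:
    """Hypothetical stand-in for a simulation; only get_fitnesses() exists."""
    def __init__(self, fits):
        self.fits = np.asarray(fits, dtype=float)

    def get_fitnesses(self):
        return self.fits

instrument = MeanFitness()
for fits in ([0.2, 0.4], [0.4, 0.6], [0.6, np.nan]):
    instrument.update(StubSim(fits))

print(instrument.metrics)  # one mean per "time step"; NaN entries are ignored
```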
2.18. Differential Survival, Mutation, and Speciation
2.18.1. Differential Survival and Reproduction
Now for the final pieces: differential survival and reproduction. Differential survival means that an agent's chance of surviving each time step depends on the fitness of its genotype. Here is a modified simulation class that overrides choose_dead to implement this:
class SimWithDiffSurvival(Simulation):
    def choose_dead(self, fits):
        n = len(self.agents)
        # an agent dies when a uniform random draw exceeds its fitness
        is_dead = np.random.random(n) > fits  # modified line from lab notebook to fix typo
        index_dead = np.nonzero(is_dead)[0]
        return index_dead
If an individual’s fitness is less than a random value between 0 and 1, then the agent dies. Therefore, agents with genotypes in high fitness locations on the fitness landscape have a higher chance of surviving. Differential reproduction has a similar effect, but it describes how higher fitness agents are more likely to propagate their genotype upon replication than lower fitness individuals. Here is a modified simulation class that overrides choose_replacements to implement this:
class SimWithDiffReproduction(Simulation):
    def choose_replacements(self, n, weights):
        p = weights / np.sum(weights)
        agents = np.random.choice(self.agents, size=n, replace=True, p=p)
        replacements = [agent.copy() for agent in agents]
        return replacements
Both of these aspects of the simulation increase the mean fitness over time until it levels off at some threshold. Running a simulation with both of these effects causes the mean fitness to increase even faster.
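The two rules can be worked through for a tiny, made-up population of three agents. Under differential survival an agent dies when a uniform draw exceeds its fitness, so its death probability is 1 minus its fitness; under differential reproduction its chance of being copied is its fitness divided by the total. The sketch below computes both and checks them against random draws (the sample size and seed are arbitrary).

```python
import numpy as np

fits = np.array([0.2, 0.5, 0.8])  # made-up fitnesses of three agents

# Differential survival: P(death) = P(uniform draw > fitness) = 1 - fitness.
p_death = 1 - fits

# Differential reproduction: P(chosen) is proportional to fitness.
p_chosen = fits / fits.sum()

# Empirical check with many random draws.
rng = np.random.default_rng(1)
died = rng.random((100_000, fits.size)) > fits
death_rate = died.mean(axis=0)

picks = rng.choice(fits.size, size=100_000, replace=True, p=p_chosen)
pick_rate = np.bincount(picks, minlength=fits.size) / picks.size

print(p_death, death_rate)
print(p_chosen, pick_rate)
```

In this example the fittest agent is four times as likely to be copied as the least fit one, and four times less likely to die.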
2.18.2. Simulation Results, Differential Survival and Reproduction
This simulation starts to explain adaptation: increasing mean fitness means that the population is becoming better adapted to the landscape. However, the number of occupied locations decreases over time, so this model does not yet explain diversity. Diversity can be introduced by imperfect replication, that is, mutation.
2.18.3. Mutation
Here is a subclass of Agent that overrides the copy method:
class Mutant(Agent):
    def copy(self, prob_mutate=0.05):
        if np.random.random() > prob_mutate:
            # perfect copy
            loc = self.loc.copy()
        else:
            # imperfect copy: flip the bit at a randomly chosen index
            direction = np.random.randint(self.fit_land.N)
            loc = self.mutate(direction)
        return Mutant(loc, self.fit_land)
In our model of mutation, there is a 5% chance of mutation upon agent replication. Mutation means choosing a random direction from the current location, i.e. choosing a random bit in the genotype, and flipping it.
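The copy method relies on a mutate method that is not shown in these notes. A minimal sketch of the bit flip it describes, written as a standalone function and assuming genotypes are stored as integer arrays of 0s and 1s, could look like this:

```python
import numpy as np

def mutate(loc, direction):
    """Return a copy of `loc` with the bit at index `direction` flipped."""
    new_loc = loc.copy()
    new_loc[direction] ^= 1  # XOR with 1 turns 0 into 1 and 1 into 0
    return new_loc

loc = np.array([0, 1, 1, 0], dtype=np.int8)
child = mutate(loc, direction=2)
print(loc, child)  # the parent is unchanged; the child differs in one bit
```

Copying before flipping matters: without the copy, parent and child would share one array and the mutation would change both.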
2.18.4. Simulation Results, Mutation
As before, the population evolves toward the location with maximum fitness. To measure diversity in the population, we can plot the number of occupied locations after each time step. As mutations occur, the number of occupied locations increases rapidly. Eventually, the system reaches an equilibrium where mutation occupies new locations at the same rate that differential survival empties lower-fitness locations.
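Counting occupied locations amounts to counting distinct genotypes. One possible way to do this, sketched here with a hypothetical helper `count_occupied` and a made-up population of four agents, is to convert each genotype to a tuple and collect the tuples in a set:

```python
import numpy as np

def count_occupied(locs):
    """Number of distinct genotypes among the rows of `locs`."""
    return len({tuple(int(bit) for bit in loc) for loc in locs})

locs = np.array([
    [0, 1, 1],
    [0, 1, 1],  # same location as the first agent
    [1, 0, 0],
    [0, 1, 0],
])
occupied = count_occupied(locs)
print(occupied)
```

Calling this after every step and appending the result to a list gives the curve described above.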
2.18.5. Speciation
Before we can model new species, we need the ability to identify clusters of agents in the landscape, which means we need to define distance. The distance between two genotypes can be expressed as the number of bit flips needed to get from one location on the landscape to the other (recall that genotypes, and by extension their locations, are length-N binary arrays).
# in FitnessLandscape class
def distance(self, loc1, loc2):
    return np.sum(np.logical_xor(loc1, loc2))
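As a quick worked example, the two made-up genotypes below differ in their first and third bits, so the distance between them is 2. The function is written standalone here (without self) only so the snippet runs on its own.

```python
import numpy as np

def distance(loc1, loc2):
    """Number of positions at which two binary genotypes differ."""
    return np.sum(np.logical_xor(loc1, loc2))

a = np.array([0, 1, 1, 0, 1])
b = np.array([1, 1, 0, 0, 1])
d = distance(a, b)
print(d)
```

This count of differing bits is also known as the Hamming distance.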
2.18.6. Population Dispersion
To quantify population dispersion, we can compute the mean of the distances between pairs of agents and plot the mean distance over time.
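A minimal sketch of this measurement, assuming the population is small enough to visit every pair (for a large population one might sample pairs at random instead), with a hypothetical helper name and made-up genotypes:

```python
from itertools import combinations

import numpy as np

def mean_pairwise_distance(locs):
    """Mean number of differing bits over all pairs of genotypes."""
    dists = [np.sum(np.logical_xor(a, b)) for a, b in combinations(locs, 2)]
    return np.mean(dists)

locs = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 1],
])
# Pairwise distances are 1, 3 and 2, so the mean is 2.
dispersion = mean_pairwise_distance(locs)
print(dispersion)
```

A population concentrated at a single location has dispersion 0, and the value grows as agents spread across the landscape.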
2.18.7. Looking For New Species
Here, we run the simulation to steady state, where many agents are at the optimal location. After 500 steps, we call FitnessLandscape.set_values to change the fitness landscape, then resume the simulation. After the change, the mean fitness increases again as the population migrates across the new landscape, eventually finding the new optimal location. If we compute the distance between the agents' locations before and after the change, we see they differ by more than 6 on average. The distances between clusters are much larger than the distances within clusters, so we can interpret these clusters as distinct species.
2.18.8. Takeaways
In this chapter, we developed an agent-based model to simulate evolution. Using the concepts of genotype and fitness, along with newly learned modelling techniques such as instruments, we arrived at the following takeaways:
Mutation, differential survival, and reproduction are sufficient to model evolution with increasing fitness, increasing diversity, and a simple form of speciation.
This model is not meant to be realistic; evolution in real life is far more complicated.
By seeing the process along with the results, we see evolution as a surprisingly simple and inevitable idea.