2.17. Lecture 16: Evolution

Before this class you should:

  • Read Think Complexity, Chapter 11

Before next class you should:

  • Read Think Complexity, Chapter 12

Note Taker: Liam Rocheleau

2.17.1. Theory of Evolution

The theory of evolution claims that new species emerge from existing species through natural selection: traits that make a species more successful at surviving or reproducing propagate within that species’ descendants. About 34% of Americans do not believe in the theory of evolution (Pew Research Center survey). Downey suggests the following reasons for this:

  • Conflict between evolution and religious beliefs

  • Misinformation, often by members of the first group

  • Lack of knowledge on the theory of evolution

Evolution can be easier to understand when simulated rather than explained via theory.

2.17.2. Simulating Evolution

The following features are sufficient to produce evolution:

  • Replicators: agents that can reproduce in some way. We will start with perfect copying and move on to imperfect copying

  • Variation: differences between individuals

  • Differential survival or reproduction: variation between individuals affects their ability to survive or reproduce

2.17.3. Genotypes and Fitness

We define a population of agents that represent individual organisms. Each agent has genetic information called its genotype, which is copied on replication. To generate variation, we create a population with a variety of genotypes. To generate differential survival and reproduction, we define a function that maps genotype to fitness, where fitness is a quantity related to survival or reproduction (similar to the Sugarscape model). This function is referred to as the fitness landscape.
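As a minimal sketch of the "variety of genotypes" step, assuming genotypes are binary arrays of a fixed length N, a random starting population could be generated as follows. The helper name make_random_genotypes and the seed are hypothetical and do not come from the lab notebook.

```python
import numpy as np

def make_random_genotypes(num_agents, N, rng):
    """Return a (num_agents, N) array of random bits, one genotype per row.

    Hypothetical helper for illustration only.
    """
    return rng.integers(0, 2, size=(num_agents, N), dtype=np.uint8)

rng = np.random.default_rng(seed=17)
genotypes = make_random_genotypes(num_agents=100, N=8, rng=rng)

print(genotypes.shape)  # (100, 8)
print(genotypes[0])     # one genotype: a length-8 array of 0s and 1s
```

Each row can then serve as the starting location of one agent.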

2.17.4. Notion of Fitness Landscape

The fitness landscape is a function that maps each genotype to a scalar fitness value, i.e. the “chance of success” of that genotype. It helps to picture genotypes as locations on a 2D landscape, where the “height” at each location is the fitness. A landscape can be static or dynamic: dynamic landscapes have high points that move over time, reflecting changes in which characteristics are optimal, while static landscapes do not change with time. We use a static landscape in our simulations for simplicity.

2.17.5. Fitness Landscape Class

To simulate evolution, we first need to define a class for the fitness landscape. Agents (individual “organisms” in the simulation) will refer to the FitnessLandscape object in order to measure their fitness.

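import numpy as np

class FitnessLandscape:
    def __init__(self, N):
        self.N = N  # number of bits in a genotype
        # fitness contribution of each position when its bit is 1 or 0
        self.one_values = np.random.random(N)
        self.zero_values = np.random.random(N)

    def fitness(self, loc):
        # pick the contribution matching each bit, then average over positions
        fs = np.where(loc, self.one_values, self.zero_values)
        return fs.mean()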

    # see the lab 12 notebook for the full FitnessLandscape class

We will use a random fitness landscape in our simulations for simplicity.
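To make the mapping from genotype to fitness concrete, here is a small worked sketch. It restates the class so that it runs on its own, then overwrites the random per-bit values with fixed numbers (chosen only for illustration) so the arithmetic can be followed by hand.

```python
import numpy as np

class FitnessLandscape:
    def __init__(self, N):
        self.N = N
        self.one_values = np.random.random(N)
        self.zero_values = np.random.random(N)

    def fitness(self, loc):
        fs = np.where(loc, self.one_values, self.zero_values)
        return fs.mean()

fit_land = FitnessLandscape(3)

# Illustrative fixed values replacing the random ones.
fit_land.one_values = np.array([0.9, 0.2, 0.5])
fit_land.zero_values = np.array([0.1, 0.8, 0.3])

loc = np.array([1, 0, 1])
# Bits 1, 0, 1 select 0.9, 0.8, 0.5, so the fitness is (0.9 + 0.8 + 0.5) / 3.
example_fitness = fit_land.fitness(loc)
print(example_fitness)
```

Because every per-bit value lies between 0 and 1, the mean does too, which is why fitness can later be read as a probability.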

2.17.6. Agent Class

As mentioned previously, the agents in our model represent individual organisms. Each agent has a genotype, represented by its location on the fitness landscape, and each genotype is a length-N binary array. The fitness landscape maps each location in N-dimensional space to a fitness value.

class Agent:
   def __init__(self, loc, fit_land):
      self.loc = loc
      self.fit_land = fit_land
      self.fitness = fit_land.fitness(self.loc)

   def copy(self):
      return Agent(self.loc, self.fit_land)

The attributes of each agent are:

  • loc: the location of the agent in the fitness landscape

  • fit_land: a reference to a FitnessLandscape object

  • fitness: the fitness of the agent in the FitnessLandscape represented as a number between 0 and 1

2.17.7. Simulation Class

Now that we have the fitness landscape and agents, we define the Simulation class, which contains methods that will simulate a simple model of evolution.

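class Simulation:
    def __init__(self, fit_land, agents):
        self.fit_land = fit_land
        # stored as a NumPy array so step() can assign with an index array
        self.agents = np.asarray(agents)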

    def step(self):
        n = len(self.agents)
        fits = self.get_fitnesses()

        # see who dies
        index_dead = self.choose_dead(fits)
        num_dead = len(index_dead)

        # replace the dead with copies of the living
        replacements = self.choose_replacements(num_dead, fits)
        self.agents[index_dead] = replacements

    def choose_dead(self, fits):
        n = len(self.agents)
        is_dead = np.random.random(n) < 0.1
        index_dead = np.nonzero(is_dead)[0]
        return index_dead

    def choose_replacements(self, n, fits):
        agents = np.random.choice(self.agents, size=n, replace=True)
        replacements = [agent.copy() for agent in agents]
        return replacements

    # see the lab 12 notebook for the full Simulation class

Attributes of a simulation are:

  • fit_land: a reference to a FitnessLandscape object

  • agents: a list of agent objects

At each time step, the simulation reads the agents’ fitnesses and chooses which agents die. In this base class, each agent dies with probability 0.1 regardless of its fitness. The dead agents are then replaced by copies of agents chosen at random from the population.

2.17.8. Evidence of Evolution

choose_dead and choose_replacements don’t depend on fitness, so the simulation does not have differential survival or reproduction. As a result, there is no evolution. But how can we tell? The most inclusive definition of evolution is a change in the distribution of genotypes. Because genotypes are high-dimensional in this simulation, changes in their distribution are hard to visualize. Instead, we use changes in the distribution of fitness as evidence of evolution. Jaggedness in the CDF plot (see the lab 12 notebook) indicates fewer unique values, meaning there is a loss of diversity in genotypes.
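The loss of diversity can also be checked numerically rather than read off a plot. The following is a condensed sketch, not the lab notebook's code: it folds the fitness-independent step into a single hypothetical function, neutral_step, and counts unique fitness values before and after. The population size, N, step count, and seed are arbitrary choices.

```python
import numpy as np

np.random.seed(17)  # arbitrary seed so repeated runs match

class FitnessLandscape:
    def __init__(self, N):
        self.N = N
        self.one_values = np.random.random(N)
        self.zero_values = np.random.random(N)

    def fitness(self, loc):
        return np.where(loc, self.one_values, self.zero_values).mean()

class Agent:
    def __init__(self, loc, fit_land):
        self.loc = loc
        self.fit_land = fit_land
        self.fitness = fit_land.fitness(loc)

    def copy(self):
        return Agent(self.loc, self.fit_land)

def neutral_step(agents):
    """One step with random deaths and random replacements (no selection)."""
    n = len(agents)
    index_dead = np.nonzero(np.random.random(n) < 0.1)[0]
    parents = np.random.choice(agents, size=len(index_dead), replace=True)
    agents[index_dead] = [parent.copy() for parent in parents]

def count_unique_fitnesses(agents):
    return len({agent.fitness for agent in agents})

N = 8
fit_land = FitnessLandscape(N)
locs = np.random.randint(2, size=(100, N), dtype=np.uint8)
agents = np.array([Agent(loc, fit_land) for loc in locs], dtype=object)

before = count_unique_fitnesses(agents)
for _ in range(500):
    neutral_step(agents)
after = count_unique_fitnesses(agents)

print(before, after)  # the count can only stay the same or shrink
```

Without mutation, every replacement is a copy of an existing genotype, so the number of distinct fitness values can never grow. This matches the jagged CDF described above.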

2.17.9. Instrument/Mean Fitness

As a side note, we define Instrument classes to help measure changes over the course of the simulation. For example, the MeanFitness object computes the mean fitness at each time step. To see how this model functions, see the lab 12 notebook.

class Instrument:
    def __init__(self):
        self.metrics = []

class MeanFitness(Instrument):
    def update(self, sim):
        mean = np.nanmean(sim.get_fitnesses())
        self.metrics.append(mean)
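As a usage sketch, an instrument only needs an object that exposes get_fitnesses(). The StubSim class below is a hypothetical stand-in for Simulation, used so the example runs on its own; the fitness values are made up.

```python
import numpy as np

class Instrument:
    def __init__(self):
        self.metrics = []

class MeanFitness(Instrument):
    def update(self, sim):
        mean = np.nanmean(sim.get_fitnesses())
        self.metrics.append(mean)

class StubSim:
    """Hypothetical stand-in for Simulation; exposes only get_fitnesses()."""
    def __init__(self, fits):
        self.fits = np.array(fits)

    def get_fitnesses(self):
        return self.fits

instrument = MeanFitness()
instrument.update(StubSim([0.2, 0.4, 0.6]))     # one "time step"
instrument.update(StubSim([0.5, 0.7, np.nan]))  # nanmean skips the NaN entry

print(instrument.metrics)  # one mean per update, roughly 0.4 then 0.6
```

Calling update once per time step leaves a time series in instrument.metrics that can be plotted after the run.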

2.18. Differential Survival, Mutation, and Speciation

2.18.1. Differential Survival and Reproduction

Now for the final pieces: differential survival and reproduction. Differential survival means that an agent’s chance of surviving a time step depends on the fitness of its genotype. Here is a modified simulation class that overrides choose_dead to implement this:

class SimWithDiffSurvival(Simulation):

    def choose_dead(self, fits):
        n = len(self.agents)
        # an agent dies when a uniform random draw exceeds its fitness
        # (line modified from the lab notebook to fix a typo)
        is_dead = np.random.random(n) > fits
        index_dead = np.nonzero(is_dead)[0]
        return index_dead

An agent dies if its fitness is less than a random value drawn uniformly between 0 and 1, so an agent with fitness f survives each step with probability f. Agents whose genotypes sit at high-fitness locations on the landscape therefore have a higher chance of surviving. Differential reproduction has a similar effect, but it works through replication: higher-fitness agents are more likely than lower-fitness agents to be chosen as parents and so to propagate their genotype. Here is a modified simulation class that overrides choose_replacements to implement this:

class SimWithDiffReproduction(Simulation):

    def choose_replacements(self, n, weights):
        p = weights / np.sum(weights)
        agents = np.random.choice(self.agents, size=n, replace=True, p=p)
        replacements = [agent.copy() for agent in agents]
        return replacements

Both of these aspects of the simulation increase the mean fitness over time until it levels off at some threshold. Running a simulation with both of these effects causes the mean fitness to increase even faster.
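The two selection rules can be checked empirically outside the simulation. This is a sketch under assumed values: three made-up fitnesses and 100,000 random trials. It confirms that the survival rate tracks fitness and that the share of replacements tracks fitness divided by total fitness.

```python
import numpy as np

rng = np.random.default_rng(3)      # arbitrary seed
fits = np.array([0.2, 0.5, 0.9])    # illustrative fitness values
trials = 100_000

# Differential survival: an agent dies when a uniform draw exceeds its fitness,
# so it survives when the draw is at most its fitness.
draws = rng.random((trials, len(fits)))
survival_rate = (draws <= fits).mean(axis=0)

# Differential reproduction: parents are drawn with probability
# proportional to fitness.
p = fits / fits.sum()
parents = rng.choice(len(fits), size=trials, replace=True, p=p)
parent_share = np.bincount(parents, minlength=len(fits)) / trials

print(survival_rate)  # close to [0.2, 0.5, 0.9]
print(parent_share)   # close to [0.125, 0.3125, 0.5625]
```

Either rule alone shifts the population toward fitter genotypes, which is consistent with both pushing mean fitness upward.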

2.18.2. Simulation Results, Differential Survival and Reproduction

This simulation starts to explain adaptation: increasing fitness means that the population is adapting to the landscape. However, the number of occupied locations decreases over time, so this model does not yet explain diversity. Diversity requires a source of new genotypes, which imperfect replication (mutation) provides.

2.18.3. Mutation

Here is a class that extends the Agent class and overrides the copy method:

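class Mutant(Agent):

    def copy(self, prob_mutate=0.05):
        if np.random.random() > prob_mutate:
            # perfect copy
            loc = self.loc.copy()
        else:
            # imperfect copy: flip one randomly chosen bit
            direction = np.random.randint(self.fit_land.N)
            loc = self.mutate(direction)
        return Mutant(loc, self.fit_land)

    def mutate(self, direction):
        # return a copy of the genotype with the bit at index `direction` flipped
        new_loc = self.loc.copy()
        new_loc[direction] ^= 1
        return new_loc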

In our model of mutation, there is a 5% chance of mutation upon agent replication. Mutation means choosing a random direction from the current location, i.e. choosing a random bit in the genotype, and flipping it.
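The mutation rule can be sanity-checked in isolation. The sketch below uses a hypothetical standalone function, copy_with_mutation, rather than the Mutant class. It copies a parent genotype many times and confirms that each copy differs from the parent in at most one bit, with about 5% of copies mutated.

```python
import numpy as np

def copy_with_mutation(loc, rng, prob_mutate=0.05):
    """Copy a genotype; with probability prob_mutate, flip one random bit.

    Hypothetical helper for illustration only.
    """
    new_loc = loc.copy()
    if rng.random() < prob_mutate:
        new_loc[rng.integers(len(loc))] ^= 1
    return new_loc

rng = np.random.default_rng(5)  # arbitrary seed
parent = np.zeros(10, dtype=np.uint8)

copies = [copy_with_mutation(parent, rng) for _ in range(10_000)]
flips = np.array([int(np.sum(child != parent)) for child in copies])
fraction_mutated = np.mean(flips > 0)

print(sorted(set(flips.tolist())))  # only 0 or 1 bit differs per copy
print(fraction_mutated)             # close to 0.05
```

Because each mutation moves an agent exactly one step on the landscape, the population explores new locations gradually.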

2.18.4. Simulation Results, Mutation

As before, the population evolves toward the location with maximum fitness. To measure diversity in the population, we can plot the number of occupied locations after each time step. As mutations occur, the number of occupied locations increases rapidly. Eventually, the system reaches an equilibrium in which mutation occupies new locations at the same rate that differential survival empties lower-fitness locations.
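One way to count occupied locations is to count distinct genotypes. The function below is a hypothetical sketch that assumes the agents' locations have been stacked into a 2-D array with one genotype per row. It is not necessarily how the lab notebook measures this.

```python
import numpy as np

def count_occupied(locs):
    """Number of distinct genotypes (rows) in a 2-D array of bits."""
    return len(np.unique(locs, axis=0))

# Four agents, two of which share the genotype [0, 0, 1].
locs = np.array([
    [0, 0, 1],
    [0, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
], dtype=np.uint8)

print(count_occupied(locs))  # 3
```

Recording this count after every time step gives the diversity curve described above.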

2.18.5. Speciation

Before we can model new species, we need the ability to identify clusters of agents in the landscape, which means we need to define distance. The distance between two genotypes can be expressed as the number of bit flips needed to get from one location on the landscape to the other (recall that genotypes, and by extension their locations, are length-N binary arrays).

# in FitnessLandscape class

def distance(self, loc1, loc2):
    return np.sum(np.logical_xor(loc1, loc2))
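As a worked example, the same calculation is written here as a standalone function so that it runs without the class. The two genotypes are made up for illustration.

```python
import numpy as np

def distance(loc1, loc2):
    """Number of positions at which two binary genotypes differ."""
    return np.sum(np.logical_xor(loc1, loc2))

a = np.array([1, 0, 1, 1, 0])
b = np.array([1, 1, 1, 0, 0])

# The genotypes differ at index 1 and index 3.
print(distance(a, b))  # 2
print(distance(a, a))  # 0
```

This is the Hamming distance between the two bit arrays.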

2.18.6. Population Dispersion

To quantify population dispersion, we can compute the mean of the distances between pairs of agents and plot the mean distance over time.
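A minimal sketch of that measurement, assuming the genotypes are available as the rows of an array. The helper name mean_pairwise_distance is hypothetical.

```python
from itertools import combinations

import numpy as np

def mean_pairwise_distance(locs):
    """Mean number of differing bits over all unordered pairs of genotypes."""
    dists = [np.sum(np.logical_xor(a, b)) for a, b in combinations(locs, 2)]
    return np.mean(dists)

locs = np.array([
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
])

# Pair distances are 1, 4, and 3, so the mean is 8 / 3.
dispersion = mean_pairwise_distance(locs)
print(dispersion)
```

This loop visits every pair, so its cost grows quadratically with population size. For large populations, sampling a subset of pairs is a possible alternative.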

2.18.7. Looking For New Species

Here, we run the simulation to a steady state, where many agents are at the optimal location. After 500 steps, we call FitnessLandscape.set_values to change the fitness landscape, then resume the simulation. After the change, the mean fitness increases again as the population migrates across the new landscape, eventually finding the new optimal location. If we compute the distance between the agents’ locations before and after the change, we see they differ by more than 6 on average. The distances between clusters are much larger than the distances within each cluster, so we can interpret these clusters as distinct species.

2.18.8. Takeaways

In this chapter, we developed an agent-based model to simulate the theory of evolution. Using the concepts of genotype and fitness, along with newly learned modelling techniques such as instruments, we arrived at the following takeaways:

  • Mutation, together with differential survival and reproduction, is sufficient to model evolution with increasing fitness, increasing diversity, and a simple form of speciation.

  • This model is not meant to be realistic; evolution in real life is far more complicated.

  • By seeing the process along with the results, we see evolution as a surprisingly simple and inevitable idea.