2.8. Lecture 7: Graphs

Before this class you should:

  • Read Think Complexity, Chapter 2

  • Read the Wikipedia page about graphs at https://en.wikipedia.org/wiki/Graph_(discrete_mathematics) and answer the following questions:

    1. What is a simple graph? For our discussion today, we will assume that all graphs are simple graphs. This is a common assumption for many graph algorithms – so common it is often unstated.

    2. What is a regular graph? What is a complete graph? Prove that a complete graph is regular.

    3. What is a path? What is a cycle?

    4. What is a forest? What is a tree? Note: a graph is connected if there is a path from every node to every other node.

Before next class you should:

Note taker: James Watt

2.8.1. Overview

This lecture introduces graphs as mathematical structures used to represent relationships between objects. Topics covered include:

  • Basic terminology (nodes, edges, neighbors, degree)

  • Common graph classes (simple, regular, complete)

  • Paths, cycles, and closed walks

  • Trees and forests

  • NetworkX (Python) graph construction and visualization

  • Complete graphs and random graphs in NetworkX

  • Breakout room activity: prerequisite and task-dependency graphs

2.8.2. Definitions

A graph consists of a set of nodes (vertices) and edges connecting pairs of nodes.

Node (vertex): An element of the graph representing an entity (e.g., a person or a city).

Edge: A connection between two nodes representing a relationship or interaction.

Neighbors: Two nodes are neighbors if they are connected by an edge.

Degree: The number of edges incident to a node (equivalently, the number of neighbors in a simple graph).

Simple graph: An undirected graph with no loops and no multiple edges.

Regular graph: A graph in which every node has the same degree.

Complete graph: A graph in which every pair of distinct nodes is joined by an edge. In a complete graph with \(N\) nodes, each node has degree \(N-1\), so every node has the same degree and the graph is regular. The converse does not hold: a regular graph is not necessarily complete.
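The definitions of degree, regular, and complete can be checked on small examples. The sketch below is illustrative only: it assumes a simple graph stored as a dict that maps each node to the set of its neighbors, and the helper names (degrees, is_regular, is_complete) are hypothetical, not part of the lecture code.

```python
def degrees(adj):
    """Map each node to its degree, given an adjacency dict of neighbor sets."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def is_regular(adj):
    """True if every node has the same degree."""
    return len(set(degrees(adj).values())) <= 1


def is_complete(adj):
    """True if every node is adjacent to all other nodes (simple graph assumed)."""
    n = len(adj)
    return all(len(neighbors) == n - 1 for neighbors in adj.values())


# Complete graph on 4 nodes: every node is adjacent to the other 3.
k4 = {u: {v for v in range(4) if v != u} for u in range(4)}

# 4-cycle 0-1-2-3-0: every node has degree 2, but 0 and 2 are not adjacent.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}

print(degrees(k4))                       # {0: 3, 1: 3, 2: 3, 3: 3}
print(is_complete(k4), is_regular(k4))   # True True
print(is_complete(c4), is_regular(c4))   # False True
```

The 4-cycle is the counterexample for the converse: it is regular (every degree is 2) but not complete.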

2.8.4. Trees and forests

Tree: A connected graph with no cycles.

Forest: A graph with no cycles (a disjoint union of one or more trees).

A graph is connected if there exists a path between every pair of nodes.
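One way to test these definitions in code, sketched under the assumption that the graph is simple and stored as a dict of neighbor sets: a graph is a forest exactly when its number of edges equals its number of nodes minus its number of connected components, and a tree is a forest with a single component. The function names are hypothetical.

```python
from collections import deque


def count_components(adj):
    """Count connected components with breadth-first search."""
    seen = set()
    components = 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return components


def is_forest(adj):
    """A simple graph is acyclic exactly when edges == nodes - components."""
    n_edges = sum(len(neighbors) for neighbors in adj.values()) // 2
    return n_edges == len(adj) - count_components(adj)


def is_tree(adj):
    """A tree is a forest with exactly one component, i.e. a connected forest."""
    return is_forest(adj) and count_components(adj) == 1


path = {0: {1}, 1: {0, 2}, 2: {1}}                # connected, no cycle
two_trees = {0: {1}, 1: {0}, 2: {3}, 3: {2}}      # no cycle, not connected
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}      # contains a cycle

print(is_tree(path), is_forest(path))             # True True
print(is_tree(two_trees), is_forest(two_trees))   # False True
print(is_tree(triangle), is_forest(triangle))     # False False
```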

2.8.5. Case study: COVID-19 contact tracing

Contact tracing can be modeled using a graph to visualize and analyze how interactions between individuals may contribute to disease spread.

  • Nodes represent people

  • Edges represent interactions (who interacts with whom)

  • Edges can carry weights that represent the level of interaction
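The modeling choices above can be sketched in plain Python. Everything in this example is an assumption made for illustration: the people (P1 to P5), the weights (read here as hours of contact), and the helper names are hypothetical, not data from the lecture.

```python
from collections import deque

# Hypothetical contact data: contacts[a][b] is the number of hours a and b
# spent together. The graph is undirected, so each edge is stored both ways.
contacts = {
    "P1": {"P2": 3.0, "P3": 0.5},
    "P2": {"P1": 3.0, "P4": 1.0},
    "P3": {"P1": 0.5},
    "P4": {"P2": 1.0, "P5": 2.0},
    "P5": {"P4": 2.0},
}


def total_contact(graph, person):
    """Weighted degree: total contact hours for one person."""
    return sum(graph[person].values())


def within_hops(graph, source, max_hops):
    """People reachable from `source` in at most `max_hops` edges (BFS)."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in graph[person]:
            if other not in hops:
                hops[other] = hops[person] + 1
                queue.append(other)
    del hops[source]
    return hops


print(total_contact(contacts, "P2"))   # 4.0
print(within_hops(contacts, "P1", 2))  # {'P2': 1, 'P3': 1, 'P4': 2}
```

Here within_hops plays the role of tracing: starting from one person, it lists direct contacts (1 hop) and contacts of contacts (2 hops).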

2.8.6. NetworkX (Python)

NetworkX is a Python package used to create, manipulate, and study graphs.

2.8.6.1. Imports and setup

import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
import seaborn as sns

from utils import decorate, savefig

np.random.seed(17)

2.8.6.2. Set the color palette

colors = sns.color_palette("pastel", 5)
sns.set_palette(colors)

2.8.6.3. Directed graphs

This example represents a directed social network with three nodes.

How to add nodes:

G = nx.DiGraph()

G.add_node("Alice")
G.add_node("Bob")
G.add_node("Chuck")

list(G.nodes())

How to add edges:

G.add_edge("Alice", "Bob")
G.add_edge("Alice", "Chuck")
G.add_edge("Bob", "Alice")
G.add_edge("Bob", "Chuck")

list(G.edges())

How to draw the graph:

  • draw_circular draws the nodes in a circle

  • savefig writes the figure to a PDF file

nx.draw_circular(G, node_color="C0", node_size=2000)
plt.axis("equal")
savefig("chap02-1.pdf")

Directed graph example (corresponds to chap02-1 output).
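To see what "directed" means in practice, the following sketch rebuilds the same three-node graph and queries it. It uses only standard NetworkX DiGraph methods (has_edge, in_degree, out_degree, successors, predecessors); adding all edges at once with add_edges_from is a shortcut chosen here, since it also creates the nodes.

```python
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("Alice", "Bob"),
    ("Alice", "Chuck"),
    ("Bob", "Alice"),
    ("Bob", "Chuck"),
])

# An edge exists only in the direction it was added.
print(G.has_edge("Alice", "Chuck"))   # True
print(G.has_edge("Chuck", "Alice"))   # False

# Out-degree counts edges leaving a node; in-degree counts edges arriving.
print(G.out_degree("Chuck"), G.in_degree("Chuck"))   # 0 2

# successors and predecessors follow edges forward and backward.
print(sorted(G.successors("Alice")))     # ['Bob', 'Chuck']
print(sorted(G.predecessors("Alice")))   # ['Bob']
```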

2.8.6.4. Undirected graph example

  • positions maps each city to its coordinates

  • Keys in positions can be used to add nodes to the graph

  • drive_times maps pairs of cities to driving times between them

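# NYC and Philly need positions too, because drive_times refers to them;
# the coordinates follow Think Complexity, Chapter 2.
positions = dict(
    Albany=(-74, 43),
    Boston=(-71, 42),
    NYC=(-74, 41),
    Philly=(-75, 40),
)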

positions["Albany"]

G = nx.Graph()
G.add_nodes_from(positions)
G.nodes()

drive_times = {
    ("Albany", "Boston"): 3,
    ("Albany", "NYC"): 4,
    ("Boston", "NYC"): 4,
    ("NYC", "Philly"): 2,
}

G.add_edges_from(drive_times)
G.edges()

nx.draw(G, positions, node_color="C1", node_size=2000)
nx.draw_networkx_edge_labels(G, positions, edge_labels=drive_times)
plt.axis("equal")
savefig("chap02-2.pdf")

Undirected graph example with edge labels (chap02-2 output).
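Passing the drive_times dict to add_edges_from uses only its keys, so the times themselves are not stored in the graph; in the example they appear only as drawn labels. As a possible extension (not part of the lecture code), the sketch below stores each time as an edge attribute named "weight" and asks NetworkX for the quickest route.

```python
import networkx as nx

drive_times = {
    ("Albany", "Boston"): 3,
    ("Albany", "NYC"): 4,
    ("Boston", "NYC"): 4,
    ("NYC", "Philly"): 2,
}

G = nx.Graph()
# Store each drive time on its edge under the attribute name "weight".
G.add_weighted_edges_from((u, v, t) for (u, v), t in drive_times.items())

print(G["Albany"]["Boston"]["weight"])   # 3

# Shortest path by total drive time rather than by number of edges.
route = nx.shortest_path(G, "Albany", "Philly", weight="weight")
total = nx.shortest_path_length(G, "Albany", "Philly", weight="weight")
print(route, total)   # ['Albany', 'NYC', 'Philly'] 6
```

The route through NYC costs 4 + 2 = 6, while the detour through Boston costs 3 + 4 + 2 = 9.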

2.8.6.5. Complete graphs

A complete graph can be generated by adding an edge between every pair of nodes. The helper all_pairs uses yield, which makes it a generator that produces the pairs one at a time.

def all_pairs(nodes):
    # Yield each unordered pair of distinct nodes exactly once;
    # the test i < j skips self-pairs and reversed duplicates.
    for i, u in enumerate(nodes):
        for j, v in enumerate(nodes):
            if i < j:
                yield u, v

def make_complete_graph(n):
    # Build a graph on nodes 0..n-1 with an edge for every pair.
    G = nx.Graph()
    nodes = range(n)
    G.add_nodes_from(nodes)
    G.add_edges_from(all_pairs(nodes))
    return G

Example:

complete = make_complete_graph(10)
complete.number_of_nodes()

The neighbors method returns the neighbors for a given node:

list(complete.neighbors(0))

Complete graph example (chap02-3 related output).
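As a quick sanity check on all_pairs, a complete graph with n nodes should have n(n-1)/2 edges, and every node should have degree n-1. The sketch below repeats the generator so that it runs on its own, using only the standard library; the comparison with itertools.combinations is an addition for illustration.

```python
from itertools import combinations


def all_pairs(nodes):
    """Yield each unordered pair of distinct nodes exactly once."""
    for i, u in enumerate(nodes):
        for j, v in enumerate(nodes):
            if i < j:
                yield u, v


n = 10
edges = list(all_pairs(range(n)))

print(len(edges))                         # 45
print(len(edges) == n * (n - 1) // 2)     # True

# itertools.combinations produces the same pairs in the same order.
print(edges == list(combinations(range(n), 2)))   # True

# Each node appears in exactly n - 1 edges, so its degree is n - 1.
degree = {node: sum(node in edge for edge in edges) for node in range(n)}
print(set(degree.values()))               # {9}
```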

2.8.6.6. Random graphs

Random graphs can be generated by including each possible edge with a fixed probability \(p\).

The helper function flip returns True with probability \(p\) and False with probability \(1-p\).

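def flip(p):
    # Return True with probability p, False otherwise.
    return np.random.random() < p

def random_pairs(nodes, p):
    # Yield each possible edge independently with probability p.
    for edge in all_pairs(nodes):
        if flip(p):
            yield edge

def make_random_graph(n, p):
    # Build a graph on nodes 0..n-1, keeping each possible edge with probability p.
    G = nx.Graph()
    nodes = range(n)
    G.add_nodes_from(nodes)
    G.add_edges_from(random_pairs(nodes, p))
    return G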

Example:

np.random.seed(10)
random_graph = make_random_graph(10, 0.3)
len(random_graph.edges())

nx.draw_circular(random_graph, node_color="C3", node_size=2000)
plt.axis("equal")
savefig("chap02-4.pdf")

Random graph example (chap02-4 output).
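Because each of the n(n-1)/2 possible edges is included independently with probability \(p\), the expected number of edges is \(p \cdot n(n-1)/2\); for n = 10 and p = 0.3 that is 13.5. The sketch below checks this by simulation. To stay self-contained it swaps in the standard-library random module and itertools.combinations for np.random and all_pairs; the seed and the number of trials are arbitrary choices.

```python
import random
from itertools import combinations


def random_edge_count(n, p, rng):
    """Count the edges of one random graph with n nodes and edge probability p."""
    return sum(1 for _ in combinations(range(n), 2) if rng.random() < p)


n, p, trials = 10, 0.3, 2000
rng = random.Random(0)   # arbitrary seed, used only for reproducibility
counts = [random_edge_count(n, p, rng) for _ in range(trials)]

n_pairs = n * (n - 1) // 2
expected = p * n_pairs
mean = sum(counts) / trials

print(n_pairs, expected)   # 45 13.5
print(mean)                # close to 13.5; the exact value depends on the seed
```

A single graph, such as the one drawn above, can have noticeably more or fewer edges than 13.5; only the average over many graphs settles near the expected value.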

2.8.7. Breakout rooms activity

The server stopped working, so the class switched to breakout rooms.

Breakout room roles:

  • One member drawing

  • One member programming

  • One member presenting

2.8.7.1. Group A + C: Course prerequisite graph

Tasks:

  1. Draw/program the directed graph

  2. Does the graph contain a cycle?

  3. What is the longest prerequisite chain?

  4. Assuming 1 course per semester, what is the minimum number of semesters needed to complete all courses?

Group A results:

  • No cycle, because there is no path that leads back to a previously visited node.

  • Longest prerequisite chain: Math1210 -> Engg3130 (5 semesters)

  • Minimum time to complete all courses: 6 semesters
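The questions in this activity can also be answered programmatically. The sketch below is not the graph used in class: the course names A to F and their prerequisites are placeholders, and the helper names are hypothetical. It relies on graphlib from the standard library (Python 3.9 or later), whose TopologicalSorter raises CycleError when the graph contains a cycle.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical prerequisite data: each course maps to the courses that must
# be completed first. These are placeholders, not the graph used in class.
prereqs = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],
    "E": ["D"],
    "F": [],
}


def has_cycle(prereqs):
    """True if the prerequisite graph contains a cycle."""
    try:
        list(TopologicalSorter(prereqs).static_order())
    except CycleError:
        return True
    return False


def longest_chain(prereqs):
    """Return one longest prerequisite chain (the graph must be acyclic)."""
    order = TopologicalSorter(prereqs).static_order()
    best = {}  # course -> longest chain ending at that course
    for course in order:
        chains = [best[p] for p in prereqs.get(course, [])]
        best[course] = max(chains, key=len, default=[]) + [course]
    return max(best.values(), key=len)


chain = longest_chain(prereqs)
print(has_cycle(prereqs))    # False
print(chain)                 # ['A', 'B', 'D', 'E']

# At one course per semester, any topological order is a valid schedule,
# so the minimum number of semesters equals the number of courses.
print(len(prereqs))          # 6

# Making A depend on E closes the loop A -> B -> D -> E -> A.
cyclic = dict(prereqs, A=["E"])
print(has_cycle(cyclic))     # True
```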

2.8.7.2. Group B + D: Task dependency graph

Tasks:

  1. Draw/program the directed graph

  2. Identify bottleneck tasks

  3. Find the longest chain

  4. Which task failure would delay the most downstream tasks?

Group B results:

  • Bottleneck task: a task that multiple other processes are waiting on; for this graph, it is “run the integration tests”

  • The starting point does not count as a bottleneck because prior tasks are not shown

  • Task failure that causes the most downstream issues: “install”
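One way to answer the downstream question is to count, for each task, how many other tasks are reachable from it. The sketch below does this with a depth-first search. The task names and dependencies are placeholders invented for illustration, not the graph used in class.

```python
# Hypothetical task graph: each task maps to the tasks that depend on it
# directly. The names are placeholders, not the graph used in class.
dependents = {
    "setup": ["compile", "lint"],
    "compile": ["unit_tests", "package"],
    "lint": [],
    "unit_tests": ["package"],
    "package": ["deploy"],
    "deploy": [],
}


def downstream(graph, task):
    """Return every task reachable from `task` (depth-first search)."""
    seen = set()
    stack = list(graph[task])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen


# Number of tasks that would be delayed if each task failed.
impact = {task: len(downstream(dependents, task)) for task in dependents}
print(impact)
# {'setup': 5, 'compile': 3, 'lint': 0, 'unit_tests': 2, 'package': 1, 'deploy': 0}

worst = max(impact, key=impact.get)
print(worst)   # setup
```

In this placeholder graph the first task has the largest downstream set, since every other task depends on it directly or indirectly.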

2.8.8. Announcements

A post is going up today about the lab test:

  • 10 peers for 20 minutes

  • Jupyter notebook, 4 questions

  • Up until next Thursday

  • Small-world graphs (lower probability)

  • Coding question, then a written part

  • Link to Lab 1 test tutor posting after class

2.8.9. Note

At this point in the course, the textbook reading matters, and the lecture assumes students arrive prepared.