There is no implementation of *graph* in **Python Standard Library**. The truth is that *graph* structure is rarely put
into standard libraries - I can come up with only one example of programming language which has this structure by
default: **Erlang** and its `digraph`

. Anyway - today I want to focus on its implementation in **Python**, because
it’s one of things in which I feel lack of pointers with comparision to **C/C++** languages.

Let’s say we want to implement some graph algorithm (like *Dijkstra*) in **Python**, but we want to write as less code
as possible for *graph* structure implementation. So our requirements are:

- small amount of code (we want to be able to code it fast and forget even faster )
- ability to change our structure dynamically (we want to add some edges or vertices after
*graph*initialization) -
*graph*is**undirected**(for each two vertices there can be at most one edge and edges don’t have directions)

Generally there are many approaches in implementing *graphs*, but I want to focus on those two:

- adjacency list
- adjacency matrix

In this post I want to go through the first approach - the second I will describe next week.

##
*Graph* as adjacency list in **Python**

*Graph* represented as an adjacency list is a structure in which for each vertex we have a list of adjacent vertices.
So for *graph* from this picture:

the adjacent lists for each vertex will look like this:

- for :
- for :
- for :
- for :

In **C++** we can achieve this kind of structure by creating a node structure for storing an array of pointers
to other nodes (better to use `std::vector`

than regular array):

On a snippet above there is only a structure, but you can add edges just by pushing back initialized nodes to
`neighbours`

, whole *graph* structure can be just a *list* (*vector*, *array*, etc) of nodes.

In **Python** it is not much harder. For algorithmic contests it should be enough to implement whole graph structure
with adding edges and vertices like this:

I’ve used here two basic data structures from **Python Standard Library**: `dict`

and `set`

. My `Graph`

is
a dictionary in which I store vertices labels. Usage of `dict`

allows quick access to elements.
In Python `dict`

is implemented as a hash table, so the average time complexity of accessing an element is -
the same as when using `std::vector`

in **C++**.
For storing list of adjacent vertices I’ve used `set`

data structure, because set gets rid of duplicates. Actually in
**Python** `set`

is also implemented as hash table - it is a `dict`

which stores dummy values. Using this structure
allows us to check if vertex is adjacent to other vertex in average time complexity (the worst is
where is number of edges in *graph*).

### There is shorter implementation…

When you are a hardcore contestant in algorithmic competitions you can make implementation of a *graph* structure shorter
by using `defaultdict`

(it’s a bit modified `dict`

, average time complexity of operations is the same):

It seems that it’s possible to have a structure implementation of graph in only one line…
This `deafultdict`

uses a default value `set()`

for not initialized keys - it’s quite dangerous. *What will happen
when you add an edge for not existing vertex?* The previous implementation will throw `KeyError`

exception in this case,
but this one will just create a new vertex along with this edge. Actually it can be a desired behaviour, anyway, you
should be aware of this - it’s not a fault tolerant implementation.
Also if you forget to add an edge to both edge vertices funny things can happen.
Probably it’s better to use it in this way for directed graphs and for undirected write a separate function which adds
an edge.

### Deleting vertex and edge

Deleting an edge is quite easy - we just have to remove corresponding vertices from two sets. For deleting a vertex we
need to iterate over its adjacency set, remove an edge between this vertex and each neighbour, then finally remove it
from our dictionary. The average time complexity is (degree of vertex - number of its neighbours),
because checking if vertex is in set of edges is in , deleting elements from set is and deleting
from dictionary is also implemented in , so only iterating through vertex neighbours matters. For
*sparse graphs* each vertex degree is small, so it will be almost .
Unfortunately the worst case complexity scenario can become , where is number of vertices.
It’s because worst scenario complexity of deleting from set is ) and our vertex can have at most
neighbours, when *graph* is dense.

### Pros and cons of using list adjacency approach in **Python**

List adjacency implementation of a *graph* is easy to understand, also it’s quite readable (if you don’t use any magic
approaches like `defaultdict(set)`

).
It’s good for *sparse graphs*, because it doesn’t use any additional memory - only what’s necessary to store vertices
and edges, so .
One of the biggest cons is that for *dense graphs* the dictionary operations can have a worst case scenario
about . The deletion of vertex can have quite bad time complexity (even …) - it can be done much
faster using other approaches. I guess in **C++** it can be done better even with this approach, because we can just set
a value of a node to `null`

and after deletion of vertex, pointers of this vertex in other vertices adjacency lists will
point to this `null`

. It’s good to get rid of those `null pointers`

in adjacency lists, but it can be postponed (for example
only done for each adjacency list when searching adjacent vertex). I think this amortized approach can’t be easily
done in **Python**, because it lacks pointers.

You can find code from this post here: gist.
Hopefully you enjoyed reading about this approach for implementing *graph* structure. Let me know your thoughts or
remarks in comments!

## Leave a Comment