Graph Theory Basics

Hello people…! In this post, I will talk about Graph Theory Basics, which are its terminologies, types and implementations in C. Graphs are difficult to code, but they have the most interesting real-life applications. When you want to talk about the real-life applications of graphs, you just cannot resist talking about the Facebook’s Graph Search! Now, I don’t really know that algorithm, but it uses graphs to find out your closest friends, or any other associations you have with the other users.

Other applications include finding the shortest paths, and by shortest path, I mean in the “universal sense”, it could be the shortest path of a Traveling Salesman, or the shortest path to win Snake and Ladder or even the shortest way to solve the Rubik’s Cube…! Amazing, right…?! So, let’s get started…!

Graphs are much clear when defined in mathematical terms. A Graph G (V, E), consists of two sets,

  • Set of Vertices, V
  • Set of Edges, E

As the name suggest, V, is the set of vertices or, the set of all nodes in a given graph and E, is the set of all the edges between these nodes, or the associations between them. So, |V| denotes the number of nodes in a graph and |E| denotes the number of edges. Let us take up a small example to understand these terms –

Undirected and Directed Graph

Undirected and Directed Graph

As you can see, we have two types of graphs, Directed and Undirected Graphs. In Undirected Graphs,the direction for an edge is not defined. This is generally used to indicate that the edge is actually bi-directional in nature, i.e, one can go from V1 → V2, as well as from V2 → V1. This is used to represent the graph where the states (nodes) are re-doable, such as, in a Rubik’s Cube, you can go from one configuration of the cube to the other as well as the vice-versa.

The other type, the Directed Graph restricts the… “traversal”, if you say… to only one direction. For example, in the Snakes and Ladders game, you can play dice and go from Position 5 → Position 10, but you can’t roll the dice such that it gets you from Position 10 ← Position 5… That’s obvious…! So, it really depends on the requirement of the situation, which graph you choose. Generally, we only have to deal with graphs which are purely Directed or purely Undirected. We do not talk about the hybrid type. Some of the other type of graphs are shown below –

Types of Graphs

Types of Graphs

All the graphs which we have discussed till now are Simple Graphs, they do not contain any loops. The ones which do contain loops are Non-Simple. A given graph G can be drawn in any way as long as the sets V and E remain the same. Such a graph is shown in the figure. The same graph is just drawn differently, they both have the same set of vertices and edges. Such graphs are called as Isomorphic Graphs, as the name suggests “iso” means “same”, “morphic” means “shape”, the graphs which have the same shape.

All these Graphs are Connected Graphs, i.e., for any given pair of vertices V1 and V2 ∈ V, there always exists a path between these two vertices. The last graph is the Weighted Graph. Here, the edges are given “weights”. To understand a Weighted Graph, you can think of the vertices as cities and the edges as the distance between them (so they will have some value). You could be asked the shortest path between two cities. So in the context of a Weighted graph, the shortest path may not be the one with least edges. One must keep that in mind.

Now, we will look at the way the graphs are implemented. There are mainly two ways to implement graphs, they are –

  • Adjacency Matrix
  • Adjacency List

Adjacency Matrix

It is a very simple representation where we take a two-dimensional matrix of the size |V| × |V|, i.e., the declaration in C would look like,

int AdjacencyMatrix[V + 1][V + 1];

As the arrays are zero-indexed, we generally prefer to keep the sizes V + 1.Now, initially all the values would be zeroes. We put ‘1’ in an element of the two-dimensional array, AdjacencyMatrix[i][j], if there exists an edge between Vi → Vj. Remember, arrays are zero-indexed. Let us take an example to understand this,

Adjacency Matrix

Adjacency Matrix

I hope it is clear from the example, how we can represent the graph using an Adjacency Matrix. It is very easy to code. All you have to do is create a two-dimensional matrix and assign the values, so, I won’t post the code, but if you have any doubts regarding the code, feel free to comment them. The Adjacency Matrix has its pros and cons which we will discuss shortly.

Adjacency List

It is an array of pointers of size |V| + 1, where each pointer points to a linked list. The linked list holds the nodes which are adjacent to the ith vertex. By adjacent, we mean those vertices that can be accessed from ith node by making a single move. It is a little complex implementation if you are uncomfortable with linked lists. But it is this implementation that we will use for Graph Search Algorithms. This is because we can easily and efficiently know which of the vertices of V are neighbours of a given vertex. And that is what you want, if your are to do a Graph Search. Now, let us take an example to understand the concept of Adjacency Lists.

Adjacency List

Adjacency List

I hope the sketch has made it clear as to how we use the Adjacency List to implement a Graph. As I mentioned before, it is one of the most versatile implementations of the Graph Data Structure. Now, try to code the implementation in C, or any language you like. Please try this at least once by yourself so that you can get brain deep into the Graph Data Structure. It is only a linked list. And one more thing, don’t get confused between pointer-to-an-array and array-of-pointers…! So, if we put the value of |V| + 1 in a variable ‘vertices‘, now the declaration of our Adjacency List would be,

struct node * adjacency_list[vertices];

Remember this is an “array of pointers”, the declaration for “pointer to an array” would look like,

struct node (* adjcency_list)[vertices];

This is due to operator precedence. Some people make mistakes here, so, watch out! If you want to code in C++, we can have a vector where each element of the vector can hold another vector or a list, so, the declaration would be, like,

vector< list< int > > adjacencyList(vertices + 1);

Try to code for an hour or two… If you don’t get the code, it’s ok…! I’ve put my C code below. Those who got the code right… You are fabulous…! 😉 … But please take a minute to go through my code, I might have written a slightly better code than yours.

CC++ STLJavaC#
    
    
    
    

Other implementations of Adjacency List

Comparison between Adjacency Matrix and Adjacency List

Adjacency Matrix Adjacency List
For a Directed Graph, it consumes O(|V|2) space which is often under-utilized in the implementation. For an Undirected Graph also, it consumes O(|V|2) space which is also under-utilized as the generated matrix is symmetric about diagonal and values just repeat. For a Directed Graph it consumes O(|V| + |E|) space which is less, and is utilized optimally. For an Undirected Graph it consumes O(|V| + 2 ✗ |E|) space, which is O(|V| + |E|), which is less.
As the memory allocation is contiguous, data access is slightly faster. It is basically a Linked List, so memory allocation is not contiguous, hence traversal is slow as as compared to the traversal in an array. However, this can be eliminated if we use a Vector of C++ STL Library in the place of the Linked List. But C++ STL Vector is not recommended for graphs that keep growing.
Inserting an edge takes O(1) time, i.e., constant time. Therefore, inserting |E| edges take O(|E|) time, as direct access is available. Inserting an edge can take O(|E|) in the worst case, which is, there is an edge between every pair of vertices, one must traverse the whole Linked List over-and-over causing insertion of |E| nodes to take O(|E|2) time. However, if one follows Head Insertion, inserting a node will take O(1) time, and inserting |E| nodes will take O(|E|) time.
Deleting an edge is very simple, we just set the corresponding element value to 0, which takes O(1) time. Deleting an edge is time-taking as, one would have to traverse the Linked List, which is slow (non-contiguous). In the worst case, it takes O(|E|) time.

 

With the knowledge of Adjacency Matrix and Adjacency List alone you won’t be able to do the competitive programming questions, because you do not know how to explore your graph to get the desired result. For this, you need to know the Breadth First Search (BFS) Algorithm and the Depth First Search (DFS) Algorithm. This post is only to give an introduction. Feel free to comment your doubts..! Keep practising…. Happy Coding…! 🙂

36 thoughts on “Graph Theory Basics

  1. Where you’ve written:

    list< vector > adjacencyList;

    Shouldn’t it be:

    vector< list > adjacencyList(vertices);

    Because this is an array (vector) of |V|+1 (verticies) linked lists (list)?

  2. Hi Vamsi,

    thank you for this great tutorial on Graph theory, since I’m novice to programming I’m still trying to grasp my mind over this, but I believe I’m on the track 🙂
    I’ve just wanted to notice you that the link to:
    ” Adjacency List in C# ” (http://theoryofprogramming.com/adjacency-list-in-c-sharp/) is broken, but that it can be reached through search at the following link http://theoryofprogramming.com/adjacency-list-implementation-in-c-sharp/

    That’s it for now and I’m sure going to spend a lots of time on these pages. And thanks once more to both of you guys!

  3. Hello, thank you for this introduction, helped me a lot. Could you add a compiler and execute on your page if possible? I tried in other sites but couldnt get it to work

  4. Great Tutorial for graph beginners.Kudos to you and your team for such an elaborative and thorough tutorial. 🙂

  5. Sir,i guess in undirected graph set of edges will be reflexive,i.e. both{1,2} and {2,1} will be different edges as it is undirected,it can go either way.Correct me if I am wrong.Thanks in advance 🙂

    • Consider a complete graph (graph where there is is an edge between a pair of distinct vertices)… So in the linked list corresponding to a vertex, you would have |E| entries… As I stated in my post, if you follow head insertion, inserting |E| items into a linked list by head insertion takes O(|E|) time (because head insertion takes O(1) time and we are inserting O(|E|) elements)… Now, if you want that list to be sorted, you would have to traverse the linked list for the appropriate position to insert. In the worst case, you would traverse the whole list every time and insert it at the tail… Think of it as |E| tail insertions which will take O(|E|2) time… Logically, the second method is a little like insertion sort over |E| items. Think about it for a while, I’m sure you’ll get it 🙂

  6. Hi Vamsi,

    Thanks for the tutorial.

    Is there any specific reason for using vecor< list <pair > > > for adjacency list?

    we could use array of vectors like vector< pair> adjList[nodes +1];

    or vector<vector>>

    • Yes, you can do that… I used a vector of lists because I wanted it to resemble an adjacency list “visually” as much as possible. In the days of C, we learn adjacency list as an array pointers, pointing to linked lists, so here, the vector resembles that array, and the list resembles the linked list… But then, you can implement it as you seem fit… 🙂

  7. can you please tell me how to create adjacency list of character
    Number of nodes = 5
    A,G,T,a,b (nodes)
    Number of edges = 4;
    there is a edge between A and G;
    there is a edge between A and a;
    there is a edge between T and G;
    there is a edge between a and b;

    • Hi, Pravin..! 🙂 … Here’s the modified code to create a graph of character nodes using adjacency list. As you have mentioned in your input that you want to create nodes for upper and lower case letters, I created adjacency list just big enough to accommodate them (26 + 26)… And then I map the upper case letters to the first 26 indices and the lower case letters to the last 26 indices… Try to run the code which I have given you… I’m sure you’ll understand… Feel free to comment if you have any more doubts 🙂

      • thank you
        But I have two more doubt
        1) what if I add two more edges of integer say
        1->2 and 2->3 in the same modified program by you.?

        2) And can’t I eliminate printing of which I don’t have nodes in my adjacency list as I am not having node Z but in code provided by you printing the node of which I don’t have use.

        • You can have nodes for integers too… You could start using adjacency list index 52 onwards for your integers… Then your integer nodes get mapped to (integerValue + 52)th index… The point is you can have anything in your adjacency list… You only need to know how to map them properly. Regarding your second point… You can just cancel the inner loop by putting a condition before it… Something like –

          if (adjacencyList[i] == null) {
              continue;
          }
          

          With this, you can avoid printing the nodes which don’t have edges… But, in that case, ‘G’ won’t get printed for the test case in your previous comment… If you want that to be printed too, then you’ll have to keep a track of what was entered. Then your code gets clumsy and uses more memory… It’s your wish… Feel free to comment if you have any more doubts 🙂

  8. There is typo that I found 😛 . As you said you have used Camel Case naming convention . So the Function that you have declared “AddEdge()” , I think it should be addEdge ().

  9. Pingback: Breadth First Search Algorithm

  10. Pingback: graph theory | programmingalgos

  11. Can you tell more about what are you doing in the ‘add’ function? And how the returned value is treated by ‘adjacency_list[v]’?

    • Sure..! Now, if you look at the Adjacency List diagram, you will see that adjacency_list[v] stores the address of the first node. That is, it acts as a head pointer to the linked list. Now, when we call add() function, we pass this value, the address of the first node to the head variable. So, now, head points to the first node.
      null
      So inside the add() function, the situation looks like –
      null
      Now, we create a new node and assign put some values into it –
      null
      The blue part of the node indicates the next variable whereas the white part indicates the val variable. Now, we assign head to p->next. Then, p is returned. So, the situation becomes –
      null
      As you can see, p->next is pointing to head and the value of p is assigned to adjacency_list[v], which is the address of the newly allocated node. So, essentially, the new node is at the starting of the previous linked list. This is head insertion. This is what we do in add() function. Take your time and think properly for a while and I’m sure you’ll understand what’s happening…! Feel free to ask if you have any more doubts… 🙂

  12. In the below statement there is a small mistake.it should be “In UnDirected Graphs the direction……..”
    “In Directed Graphs the direction for and edge is not defined. This is generally used to indicate that the edge is actually bi-directional in nature, i.e, one can go from Node 1 to Node 2 as well as Node 2 to Node 1.”

Leave a Reply