Quick Graph Construction

Introduction


The study of graphs makes a big part of computer science. It plays an important role in competitive programming problems, too. Before delving into graphs algorithms, it is important to know how to build effective graphs in the first place. Forget the full-fledged Graph class with add_edge(), delete_edge(), get_weight() and other functions defined in it, because coding it up is too time-consuming. In contests, you need more efficient ways to construct graphs. I will introduce a few classic techniques in this article, which you can use as templates in graph problems. But before you start reading, make sure you are already familiar with coding conventions in competitive programming and basic properties of graphs. 

0. Input Data Format


In contest problems, graphs have different forms. They can be maps, social networks, traffic flows, etc. However, no matter what real-life entities they depend on, the formats of input data are often very similar. It’s usually the size of the graph tailed by edge data. Here are some input data patterns of different types of graphs. 

Note that real problems might not strictly stick to these formats. But as long as you are familiar with the general pattern, it’s not hard to apply one of the following three methods to construct the corresponding graph. 

1. Adjacency Matrix with Primitive Arrays


This is probably the easiest way to build a graph. Abstractly, a graph is just a set of pairs. If two nodes are connected by an edge, they form a pair. Therefore, we can describe a graph with N nodes by merely storing all the pairs it contains. If you remember the concept of the cartesian product, the grids of an N by N array can store all possible pairings among N items, just like the x-y plane gives you all possible (x,y) tuples. This way, graph construction means simply ticking existing pairs in a 2D array.

What about weighted graph? Just change the ones to edge weights!

For undirected graphs, adjacency matrices are always symmetrical. However, it’s not necessarily so in directed graphs. 

Pretty simple, right? However, adjacency matrix trades efficiency for convenience. If your graph has more than 1,000 nodes, this method might give you an MLE (Memory Limit Exceed) or TLE (Time Limit Exceed). Since an entire row has to be traversed to find a node’s outgoing edges, adjacency matrix can compromise the optimal runtime of many graph algorithms. For example, BFS and DFS in adjacency matrices have a runtime of O(|V|^2) rather than O(|V|+|E|). Hungarian algorithm with adjacency matrices have a runtime of O(|V|^3) rather than O(|V||E|). Only use adjacency matrix if you have a small graph. Finally, let’s see the code to construct a graph with an adjacency matrix. 

Consider this input.

You can construct the graph like this.



#include <cstdio>
#define maxn 505 //Since n <= 500, 505 should be enough for every test ease.
using namespace std;
int n,m,g[maxn][maxn]; //Define adjacency matrix

int main(){
  scanf("%d%d", &n, &m); //Read n and m 
  //read the edges
  for(int i = 1; i <= m; i++){
    int x,y,z; scanf("%d%d%d", &x, &y, &z);
    g[x][y] = z;
  }
  return 0;
}


That concludes our discussion of adjacency matrices. Again, if any style-wise stuff of the above code confuses you, you might find this article about contest coding style useful.

2. Adjacency List with Struct and Vector Array


Since adjacency matrices only works for small graphs, we need a technique that can handle graphs with a large number of nodes. This is when adjacency list comes under the spotlight. Rather than listing all possible pairs and leaving lots of empty memory, adjacency list only stores the pairs that exist.  

How to store this data structure in C++? Fixed-size arrays would make us fall back to adjacency matrices: the list has to be stored by a resizable container. STL vector would be a good choice. Consider an array of vectors called g.

And there’s not much of a difference for directed graphs, for which the list only stores outgoing edges.

Up to this point, our adjacency list can only store unweighted graphs. Each vector contains a list of adjacent nodes’ indices. What if each edge has a weight? Then we need to encapsulate edge data with the help of struct.



struct Edge{
  /*
  to: The node to which an edge is pointing to
  w: Edge weight
  */
  int to, w; 
};


Now, we can use an array of vector<Edge> to store a weighted graph. 

With that in mind, let’s write some C++ code to build a graph with adjacency list. Consider this example. 



#include <cstdio> 
#define maxn 10005

using namespace std;
int n,m;
struct Edge{int to, w;};

vector<Edge> adj[maxn];

int main(){
  scanf("%d%d" &n, &m); //Read graph size
  //Read edges
  for(int i = 1; i <= m; i++){
    int x,y,z; scanf("%d%d%d", &x, &y, &z);
    adj[x].push_back((Edge){y,z});
    /*
      If the graph is undirected, we should add this edge to both nodes it connects.
      adj[y].push_back((Edge){x,z});
    */
  }
  return 0;
}


Again, feel free to check out this article if any of the above code confuses you.

3. Adjacency List with Primitive Arrays


Welcome to the most sophisticated graph construction method! If you are new to competitive programming, the previous two techniques should be enough for now. This last technique specializes in solving more advanced graph problems such as network flow, where there is a need to access reverse edges or reconstruct paths quickly. Since it only uses primitive arrays, this method is often preferred by some old school contestants.

The basic idea of this technique is to store all the edges in a big array, hook up edges from the same node with linked lists, then use another array called head to keep track of the starting edge of each linked list. I understand this might sound confusing. Let’s walk through the idea step by step. 

Firstly, as we read the input data, store the edges inside an array called arr. This array is nothing but a big warehouse of edges. Its purpose is to allow quick access of edge data through simple array indices instead of pointers. 

Our second step is to connect adjacent nodes with a linked list. This name might remind you of complicated error-prone pointer operations. However, don’t forget that we are talking about contest coding. EVERYTHING HAS TO BE SIMPLE! Here, since we used the “edge warehouse.” We can build a linked list by simply adding a variable called next to our struct declaration, which is the warehouse index of the next edge. 



struct Edge{
  int to, w;
  //This variable stores the index of the next edge in arr (the edge warehouse)
  int next; 
}


To indicate the end of the list, we can set the ‘next’ value of the last edge in a list to -1. That way, by starting from a head edge and keep jumping to the edge indicated by the previous edge’s next value until a -1 is found, we can traverse all the outgoing edges of a particular node.

Finally, don’t forget to use a head array to store all the starting edge of the linked lists. 

Time for some code. For this input data.



#include <cstdio>
#define maxn 10005
#define maxm 100005

using namespace std;
int n,m,arr[maxm], head[maxn];
struct Edge{int to, w, next;};

int main(){
  //Read graph size
  scanf("%d%d" &n, &m); 
  
  //Initialize head array with our "End of List" indicator: -1
  memset(head,-1,sizeof(head));
  
  //Read edges
  for(int i = 1; i <= m; i++){
    int x,y,z; scanf("%d%d%d", &x, &y, &z);
    
    //Add the edge to the edge warehouse
    arr[i] = (Edge){y,z,head[x]};
    
    //Update head
    head[x] = i; 
  }
  return 0;
}


Notice that the above code is just a basic version of method 3. Contestants sometimes add "plugins" to it to make it suit different graph problems. If you find this last method to be unnecessarily obscure, don’t worry. Just leave it for now and come back when you actually need it for some problems. 

Summary


The three quick graph construction techniques we talked about apply to different problems. Adjacency matrices are super convenient for small graphs,  adjacency list 1 is less convenient but can handle most graph problems, and adjacency list 2 is the most troublesome but can be critical in some advanced graph algorithms. 


Copyright © 2019 by Kangmin Tan. All rights reserved.

For permission to reprint articles from this site, please contact kangmint@usc.edu.

Categories contest
%d bloggers like this: