I am currently implementing the infamous Kosaraju algorithm for computing the top 5 strongly connected components in a directed graph.

**Given an input file:**Get input file here

Quote

The file contains the edges of a directed graph. Vertices are labeled as positive integers from 1 to 875714. Every row indicates an edge, the vertex label in first column is the tail and the vertex label in second column is the head (recall the graph is directed, and the edges are directed from the first column vertex to the second column vertex). So for example, the 11th row looks like : "2 47646". This just means that the vertex with label 2 has an outgoing edge to the vertex with label 47646

**Output Format:**

Output the sizes of the 5 largest SCCs in the given graph, in decreasing order of sizes, separated by commas.

I get an implementation error for StackOverflow if I use the full input file size. However, if I reduce it to only a few lines things work out! So clearly - the array is not able to hold all the numbers in the input file!

Quote

Exception in thread "main" java.lang.StackOverflowError

at Kosaraju.dfs(Kosaraju.java:61)

at Kosaraju.dfs(Kosaraju.java:67)

at Kosaraju.dfs(Kosaraju.java:61)

at Kosaraju.dfs(Kosaraju.java:67)

The main variables are:

static int[] fiveLargestSCCs = {0, 0, 0, 0, 0}; // this array will be modified static int n = 875714; // total # of vertices in graph static int timer = 0; static boolean[] isExplored; // array holding marked vertices static int[] markGraph = new int[n+1]; static int countSCC; static Map<Integer, List<Integer>> adj = new HashMap<Integer, List<Integer>>(); // adjacency list of G static Map<Integer, List<Integer>> revAdj = new HashMap<Integer, List<Integer>>(); // adjacency list of reversed G

Read input file into adjacency lists and perform Kosaraju on graph G:

public static void main(String[] args) throws IOException { // Scanner in = new Scanner(new File("SCC.txt")); FileReader fr = new FileReader( "SCC2.txt" ); BufferedReader br = new BufferedReader( fr ); String line; while ( ( line = br.readLine() ) != null ) { String[] s = line.trim().split(" "); System.out.println("S[0] : " + s[0].trim()); int startPoint = Integer.parseInt(s[0]); int endPoint = Integer.parseInt(s[1]); if (!adj.containsKey(startPoint)) adj.put(startPoint, new ArrayList<Integer>()); adj.get(startPoint).add(endPoint); if (!revAdj.containsKey(endPoint)) revAdj.put(endPoint, new ArrayList<Integer>()); revAdj.get(endPoint).add(startPoint); } br.close(); fr.close(); isExplored = new boolean[n+1]; for (int i = n; i >= 1; i--) if (!isExplored[i]) dfs(revAdj, i, 0); isExplored = new boolean[n+1]; for (int i = n; i >= 1; i--) if (!isExplored[markGraph[i]]) {countSCC = 0; dfs(adj, markGraph[i], 1); pushSCC(countSCC); } for (int i = 0; i <= 4; i++){ if (i != 0) System.out.print(","); System.out.print(fiveLargestSCCs[i]); } }

The DFS function:

static void dfs(Map<Integer,List<Integer>> graph, int currentPoint, int typeOfProcess){ // typeOfProcess = 0: dfs on the reversed graph; typeOfProcess = 1: dfs on the original graph. isExplored[currentPoint] = true; if (typeOfProcess == 1) countSCC++; if (graph.containsKey(currentPoint)) { Iterator<Integer> endPointIt = graph.get(currentPoint).iterator(); while (endPointIt.hasNext()) { int endPoint = endPointIt.next(); if (!isExplored[endPoint]) dfs(graph, endPoint, typeOfProcess); } } if (typeOfProcess == 0){ timer++; markGraph[timer] = currentPoint; } }

Update the array with SCC sizes:

static void pushSCC(int scc){ if (scc<=fiveLargestSCCs[4]) return; fiveLargestSCCs[4] = scc; for (int i = 3; i >= 0; i--) if (fiveLargestSCCs[i]<fiveLargestSCCs[i+1]) { int t = fiveLargestSCCs[i+1]; fiveLargestSCCs[i+1] = fiveLargestSCCs[i]; fiveLargestSCCs[i] = t; } }

To repeat, the error occurred at the following line:

if (!isExplored[endPoint]) dfs(graph, endPoint, typeOfProcess);

Any suggestions?!