O(n^2) examples. In the comments, O(log n) was briefly discussed.

In this tutorial we're going to dive deeper down the rabbit hole of time complexity and look at O(log n), O(n log n), O(n!), and the math behind Big O.

--

**O(log n)**

Consider a traditional linear algorithm:

int n = 10;
for(int i = 1; i <= n; i++){ //n
    cout << i << endl;
}

A graph of its growth rate looks like this:

For the algebraically inclined, this is f(x) = x. There is a direct, one-to-one correlation between the input and the output.

Now consider a log(n) algorithm:

int n = 10;
for(int i = 1; i <= n; i *= 2){ //log n
    cout << i << endl;
}

A graph of a log(n) growth rate:

What makes this log(n)? First a definition of what exactly a logarithm is:

Quote

In mathematics, the logarithm of a number to a given base is the exponent to which the base must be raised in order to produce that number.

In computing, the base is usually two [the binary logarithm], but in Big O the base is ultimately irrelevant: by the change-of-base identity, log_2(n) = log_10(n) / log_10(2), so any two bases differ only by a constant factor, and Big O discards constant factors.

In our function above, 'i' doubles each iteration. In the linear example, 'i' increments by one each iteration.

Taking a look at each output:

Quote

//linear
1
2
3
4
5
6
7
8
9
10

//log(n)
1
2
4
8

We arrived at the "destination" much faster with our second algorithm -- specifically, in less than half the "time" [4 iterations versus 10]. If we were to graph these two functions together to see the correlation, we would see something akin to this:

The math behind it:

log_2(10) = ~3.3 [2^3.3 = ~9.849]

We had 4 iterations [1, 2, 4, 8], thus the above function is roughly log(n). Note: the larger the data set, the more closely the iteration count tracks log_2(n).

**O(n log n)**

This growth rate typically arises when a linear loop wraps a logarithmic one (it is also the hallmark of efficient comparison sorts such as merge sort). Remember that a loop "inside" of another loop multiplies its growth rate with its "partner". Thus, an example of O(n log n) would be a combination of our previous two examples:

int n = 10;
for(int i = 1; i <= n; i++){ //n
    for(int j = 1; j <= n; j *= 2){ //log n
        cout << j << " ";
    }
    cout << endl;
}

Output:

Quote

1 2 4 8
1 2 4 8
1 2 4 8
1 2 4 8
1 2 4 8
1 2 4 8
1 2 4 8
1 2 4 8
1 2 4 8
1 2 4 8

Notice we have 'n' sets of 'log n' data.

A graph of this growth rate:

In and of itself that may not be very helpful, so let's overlay 'n' and log n for comparison:

**O(n!)**

The factorial growth rate is often confused with the actual calculation of a number's factorial:

//a naive implementation
long factorial(long num){
    if(num <= 1) return 1; //0! and 1! both equal 1
    return num * factorial(num - 1);
}

The above factorial finder is actually O(n), for example:

initial parameter: 5

5*(5-1)

5*(4)*(4-1)

5*(4)*(3)*(3-1)

5*(4)*(3)*(2)*(2-1)

In terms of the number of iterations, the above function will never execute more than its initial parameter, thus it has a linear growth rate. The value it's calculating, on the other hand, grows massively: the result itself grows at an n! rate even though the computation is linear.

This raises the question: what type of problem does have an O(n!) solution?

Answer: A brute force of the traveling salesman problem.

There are 'n' paths to choose from at the first city. From the next city, there are (n-1) remaining paths. From the one after that, (n-2). Notice a pattern yet? Assuming we don't try to optimize (by the way, solving this NP-hard problem efficiently is way outside the scope of this tutorial), the growth rate would be n!.

Considering that for a route of 15 cities, there are 1,307,674,368,000 path choices, brute forcing is most definitely not the way to go.

A visualization of n!'s growth rate:

Note: I used the gamma function to graph over the real number set; we are only concerned with the growth rate starting at x = 1. Watch it rocket up as we zoom out to a much larger data set:

To sum up the use of the factorial growth rate...don't!

The Math of Big O

At the most basic level, we are looking to see how our function's runtime/number of iterations grows as the data set passed to it gets larger. You may recall from algebra that we often compared tables of x and y values to ascertain the relationship between them: slope, x and y intercepts, etc... This same principle/approach can work for Big O as well. We did it in the first couple of examples by examining the output with regard to the input.

In any case, you are comparing actual performance with your chosen metric of evaluation. Ensure that the baseline (or control group) is accurate before you do any comparisons, otherwise the entire exercise is moot. For example, using the "real" time of execution may not be the best choice due to varying computer architecture, processing power/speed, so on and so forth. It would be more advantageous to determine a platform agnostic time complexity [is my algorithm linear, exponential, etc...] and move on from there.

I haven't found it necessary [yet, I suppose] to calculate actual values (like in the above log n example). Once you recognize the pattern, an exact interpretation generally isn't needed and violates the spirit of Big O. Big O is meant to provide an upper bound for a function, the idea being that you know what threshold your algorithm won't cross, either in terms of time or space complexity.

--

Happy Coding!