First, a definition:
Quote
The wiki article from which this is taken is an excellent reference if you need to quickly know the Big O of common/popular algorithms. What it absolutely fails to do is explain how one determines the actual Big O [I will use this term interchangeably with upper bound or limit]. For those not mathematically inclined, their eyes probably glaze over when reading the symbol-infested portions. At the risk of repeating myself, this article is simply here to show another way, one I think is easier (i.e. you could have a solid grasp of the concept without ever taking Calculus).
Some Basic Rules:
1. Nested loops are multiplied together.
2. Sequential loops are added.
3. Only the largest term is kept, all others are dropped.
4. Constants are dropped.
5. Conditional checks are constant (i.e. O(1)).
That's it, really. I used the word loop, but the concept applies to conditional checks, full algorithms, etc., since a whole is the sum of its parts. I can see the worried look on your face; this would all be frivolous without some examples [see code comments]:
//linear
for(int i = 0; i < n; i++)
{
    cout << i << endl;
}
Here we iterate 'n' times. Since nothing else is going on inside the loop (other than constant-time printing), this algorithm is said to be O(n). The common bubble-sort:
//quadratic
for(int i = 0; i < n; i++)
{
    for(int j = 0; j < n; j++)
    {
        //do swap stuff, constant time
    }
}
Each loop is 'n'. Since the inner loop is nested, it is n*n, thus it is O(n^2). Hardly efficient. We can make it a bit better by doing the following:
//quadratic
for(int i = 0; i < n; i++)
{
    for(int j = 0; j < i; j++)
    {
        //do swap stuff, constant time
    }
}
Outer loop is still 'n'. The inner loop now executes 'i' times, so the body runs 0 + 1 + 2 + ... + (n-1) = n(n-1)/2 times in total. This is still bounded by O(n^2), since only the largest term is kept and the constant 1/2 is dropped, but in practice it does about half the work of the previous version.
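If you want to see this rule-of-thumb counting line up with real code, here is a minimal sketch (my own illustration, not from the wiki article) that fills in the "do swap stuff" comment with an actual bubble sort and counts how many times the inner body runs. The inner bound is written as j < n - 1 - i, a common bubble-sort form, rather than j < i, but the counting argument is identical: the body runs (n-1) + (n-2) + ... + 1 + 0 = n(n-1)/2 times.

//counts how many times the inner loop body runs; for n = 5 it prints "10 vs n(n-1)/2 = 10"
#include <iostream>
#include <vector>
#include <utility>
using namespace std;

int bubbleSort(vector<int>& a)
{
    const int n = (int)a.size();
    int innerIterations = 0;
    for(int i = 0; i < n; i++)
    {
        for(int j = 0; j < n - 1 - i; j++) //inner loop shrinks by one each pass
        {
            innerIterations++;
            if(a[j] > a[j + 1]) //constant time comparison...
            {
                swap(a[j], a[j + 1]); //...and constant time swap
            }
        }
    }
    return innerIterations;
}

int main()
{
    vector<int> v = {5, 1, 4, 2, 8};
    const int n = (int)v.size();
    cout << bubbleSort(v) << " vs n(n-1)/2 = " << n * (n - 1) / 2 << endl;
    return 0;
}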
An example of constant dropping:
//linear
for(int i = 0; i < 2*n; i++)
{
    cout << i << endl;
}
At first you might say that the upper bound is O(2n); however, we drop constants, so it becomes O(n). The 2 is just a constant factor: we iterate 2n times, but the number of iterations still grows linearly with 'n', and that growth rate is all Big O cares about.
An example of sequential loops:
//linear
for(int i = 0; i < n; i++)
{
    cout << i << endl;
}

//quadratic
for(int i = 0; i < n; i++)
{
    for(int j = 0; j < i; j++)
    {
        //do constant time stuff
    }
}
You wouldn't write this exact example in an implementation, but doing something similar certainly is in the realm of possibilities. Since the loops are sequential, we add each loop's Big O: n + n^2. O(n^2 + n) is not an acceptable answer, though, since we keep only the largest term and drop the rest. The upper bound is O(n^2). Why? Because it has the largest growth rate (upper bound or limit for the Calculus inclined): as 'n' grows, the n^2 term quickly dwarfs the n term.
Loops that run a fixed, constant number of times are common as well; an example:
for(int i = 0; i < n; i++)
{
    for(int j = 0; j < 2; j++)
    {
        //do stuff
    }
}
Outer loop is 'n', inner loop is 2, thus we have 2n; dropping the constant gives us O(n).
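Rule 5 from the list above (conditional checks are constant) hasn't had an example of its own yet, so here is a minimal sketch; the array, 'threshold', and function name are just my own illustration:

//still linear: the if-check costs constant time per iteration, so n * O(1) = O(n)
#include <iostream>
using namespace std;

void printLargeValues(const int* values, int n, int threshold)
{
    for(int i = 0; i < n; i++)
    {
        if(values[i] > threshold) //constant time check (rule 5)
        {
            cout << values[i] << endl; //constant time print
        }
    }
}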
In short, Big O is simply a way to measure the efficiency of an algorithm. The goal is constant or linear time, hence the various data structures and their implementations. Keep in mind that a "faster" structure or algorithm is not necessarily better. For example, see the classic hash table versus binary tree debate. While not 100% factual, it is often said that a hash-table is O(1) and is therefore better than a tree. From a discussion on the subject in a recent class I took:
Quote
That said, hash-tables aren't purely O(1). Poor choices in hash algorithm or table size, and issues like primary clustering, can make operations on hash-tables run in worse-than-constant time in reality.
The point is, saying "hash-tables are superior to trees" without some qualifications is ridiculous. But then, it doesn't take a genius to know that sweeping generalizations are often problematic.
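To ground that debate in code, here is a small sketch (my own, not from the class discussion) using the C++ standard containers; std::unordered_map is hash-based and std::map is a balanced binary search tree in typical implementations, and the complexities in the comments are the usual textbook figures, not guarantees for any particular workload:

#include <iostream>
#include <map>
#include <string>
#include <unordered_map>
using namespace std;

int main()
{
    unordered_map<string, int> hashTable; //lookup: O(1) on average, O(n) if the hashing goes badly
    map<string, int> tree;                //lookup: O(log n) even in the worst case, keys kept in sorted order

    hashTable["apples"] = 3;
    tree["apples"] = 3;

    cout << hashTable["apples"] << " " << tree["apples"] << endl; //prints "3 3"
    return 0;
}

Both do the same job here; which one is "better" depends on whether average-case speed, worst-case guarantees, or ordered traversal matters for the problem at hand.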
The above is always something good to keep in mind when dealing with theoretical computer science concepts. Hopefully you found this both interesting and helpful. Happy coding!