11 Replies - 3619 Views - Last Post: 10 August 2012 - 01:26 PM

#1 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3549
  • View blog
  • Posts: 10,993
  • Joined: 05-May 12

Big O(): singly linked list vs. tree insertion

Posted 09 August 2012 - 01:44 PM

After the recent discussion about the C# List<T> insertion big O characteristics, and the latest question in C# about linked lists, it made me recall what I was taught in school: Insertion into a linked list is O(1), while insertion into a tree is normally O(log n). No problem there, but I also further recall that with a tree, if all the nodes are in one big long chain, then it exhibits O(n).

So here's the question: If it's a big long chain, then it's essentially a linked list. So why is it O(n) if the structure is called a tree, but O(1) if the structure is called a linked list?

I was doing some more digging, I noticed that Wikipedia now has this interesting detail for insertion in the middle table on the linked list page:

Quote

search time + O(1)


For a singly linked list, isn't the search time going to be O(n)?

I guess the other way I can phrase my question is, why does the big O for a tree insertion include the operations needed to search for the insertion point and the pointer swaps while the big O for a linked only counts the pointer swaps, but not the search for the insertion point?

Any insight would help as to why this subject matter is taught this way would be helpful.

BTW, one of the links from Wikipedia article was a great read: Number crunching: Why you should never, ever, EVER use linked-list in your code again

This post has been edited by Skydiver: 09 August 2012 - 01:47 PM


Is This A Good Question/Topic? 0
  • +

Replies To: Big O(): singly linked list vs. tree insertion

#2 jon.kiparsky  Icon User is online

  • Pancakes!
  • member icon


Reputation: 7644
  • View blog
  • Posts: 12,890
  • Joined: 19-March 11

Re: Big O(): singly linked list vs. tree insertion

Posted 09 August 2012 - 01:56 PM

It's a communication issue, not an algorithms issue. Insert doesn't mean the same thing in the two cases. In a simple linked list, you can insert by inserting: just do the insert at the head of the list. In an unsorted binary tree (a what? I know, but it could happen) you would still need to navigate from head to a leaf, which requires log(n) traversal*. So to make the cases comparable, you'd need to talk about a sorted linked list, in which case insert is O(N) for the list and O(log N) for the tree.

Basically, the language of big-O is fine, but it's often used in a pretty sloppy fashion, as you see here.

*assuming the tree somehow magically ends up balanced!
Was This Post Helpful? 1
  • +
  • -

#3 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3549
  • View blog
  • Posts: 10,993
  • Joined: 05-May 12

Re: Big O(): singly linked list vs. tree insertion

Posted 09 August 2012 - 03:21 PM

A binary tree only requires the nodes having 0, 1, 2 child nodes, and no cycles. There is no requirement that there be any "sorting" applied.

In an an unsorted binary tree one could just as easily be inserting at the root instead of at a leaf like the simple linked list is inserting at the front of the list. Wouldn't make for a balanced tree, but it could represent the turns taken by an explorer in a maze.
Spoiler


But this is an atypical usage. Perhaps it's that a binary tree is typically sorted and so one typically has to traverse it to insert and so we get O(log n), while a singly linked list is typically used for insertions at the head and so we get O(1).

But that still doesn't account for the O(1) claim for inserting into the middle of linked list (be it singly or doubly linked). Or is that claim of O(1) for insertion in the middle incorrect?
Was This Post Helpful? 0
  • +
  • -

#4 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3549
  • View blog
  • Posts: 10,993
  • Joined: 05-May 12

Re: Big O(): singly linked list vs. tree insertion

Posted 09 August 2012 - 03:33 PM

Or is the O(1) claim for a linked list made when comparing to insertion into an array which is O(n)? The insertion operation itself is touch only 2 (or 4) pointers for the linked list versus moving at most n items for the array.

If that's the case, then shouldn't it be an apples to apples comparison where inserting into a binary tree should also be compared to insertion into an array? The insertion operation for the tree is also touch only 1 pointer for the tree, versus moving at most n items for the array.
Was This Post Helpful? 0
  • +
  • -

#5 blackcompe  Icon User is offline

  • D.I.C Lover
  • member icon

Reputation: 1152
  • View blog
  • Posts: 2,530
  • Joined: 05-May 05

Re: Big O(): singly linked list vs. tree insertion

Posted 09 August 2012 - 03:55 PM

Quote

So here's the question: If it's a big long chain, then it's essentially a linked list. So why is it O(n) if the structure is called a tree, but O(1) if the structure is called a linked list?


Insertion into an unbalanced tree is O(n) is the worst case (degenerated to a linked link). It's O(log(n)) for a balanced tree (AVL, Red-Black, B-Tree) in the worst case. As stated, inserting into a linked list is a constant operation, since your appending onto the tail pointer, which is known at all times. Note that I am referring to binary search trees.

Quote

I was doing some more digging, I noticed that Wikipedia now has this interesting detail for insertion in the middle table on the linked list page:


That's appropriate since it takes searchTime (O(n)) to find the node at that index and constant time to switch around the pointers to include a new node at that index to or delete a node at that index.

Quote

I guess the other way I can phrase my question is, why does the big O for a tree insertion include the operations needed to search for the insertion point and the pointer swaps while the big O for a linked only counts the pointer swaps, but not the search for the insertion point?


You're overcomplicating it. Don't pay attention to swapping pointers. All you need to know is that it's done in constant time. Focus on the number of repeating steps.

Its better to use a popular Algorithms book than to depend on Wikipedia (if your a novice). User generated content can be a bit confusing at times -- that's not to say it's wrong. For instance, I wouldn't have wrote searchTime + O(1) for inserting/delete into the middle. I would have simply put O(n), and I'm sure many people would put the same. However, given you have a reference to the node that's in the previous position relative to where you'd like to insert/remove, searchTime is O(1), and thus the entire operations is O(1). So, technically, I'd be wrong. Hopefully that correctly addresses the source of your confusion.

Quote

BTW, one of the links from Wikipedia article was a great read: Number crunching: Why you should never, ever, EVER use linked-list in your code again


And this is a perfect example of what I was talking about. The following was taken from that page:

Quote

http://en.wikipedia...._(C%2B%2B)#List Vectors are inefficient at removing or inserting elements other than at the end. Such operations have O(n) (see Big-O notation) complexity compared with O(1) for linked-lists. This is offset by the speed of access — access to a random element in a vector is of complexity O(1) compared with O(n) for general linked-lists and O(log n) for link-trees.


It's wrong to say insertion/removal from the middle of a linked list is O(1) as I just said. It should be searchTime + O(1). As an aside, with an array-based data structure you'll absolutely never be able to remove in constant time, since you have to shift the elements.

I found this interesting though (also taken from his page):

Quote

A vector on the other hand is all about having data in adjacent memory. An insertion or removal of an item might mean that data must be shuffled, but this is cheap for the vector. Dirt cheap (yes I like that term). The vector, with its adjacent memory layout maximizes cache utilization and minimizes cache line misses. This makes the whole difference, and that difference is huge as I will show you.


Basically, he's arguing that due to "locality of reference," using an (array-based) vector over a linked list is better since nodes can be all over memory, whereas an array's memory is allocated contiguously, and thus be in the cache as one. Remember that cache access time is many magnitudes faster than accessing main memory. One could imagine that even a constant time linked list insertion could take longer than inserting into a vector because of the memory overhead.

This post has been edited by blackcompe: 09 August 2012 - 04:00 PM

Was This Post Helpful? 3
  • +
  • -

#6 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3549
  • View blog
  • Posts: 10,993
  • Joined: 05-May 12

Re: Big O(): singly linked list vs. tree insertion

Posted 09 August 2012 - 04:47 PM

View Postblackcompe, on 09 August 2012 - 03:55 PM, said:

Its better to use a popular Algorithms book than to depend on Wikipedia (if your a novice). User generated content can be a bit confusing at times -- that's not to say it's wrong. For instance, I wouldn't have wrote searchTime + O(1) for inserting/delete into the middle. I would have simply put O(n), and I'm sure many people would put the same.


My two favorite books about data structures and algorithms are buried in boxes right now. Part of the reason why I'm posting here and started of my original post with what I recalled learning: O(1) inserts for linked lists and O(log n) for binary trees. Also having an online reference that I can point to makes it a little easier to make sure that we are reading the same thing and highlight things if I'm just misreading or misinterpreting something.
Was This Post Helpful? 0
  • +
  • -

#7 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3549
  • View blog
  • Posts: 10,993
  • Joined: 05-May 12

Re: Big O(): singly linked list vs. tree insertion

Posted 09 August 2012 - 04:54 PM

View Postblackcompe, on 09 August 2012 - 03:55 PM, said:

As stated, inserting into a linked list is a constant operation, since your appending onto the tail pointer, which is known at all times.


Tail is known at all times for a doubly linked list. For a traditional singly linked list most students are first taught to just have a head pointer. Then after being shown the O(n) inefficiency of appending to the end of the list, they are taught or given an exercise how to manage both a head and tail pointer. And then usually a doubly linked list is introduced.

If all you have is the traditional singly linked list with just a head pointer, then O(1) insertion at the middle doesn't seem possible without the use of extra pointers.
Was This Post Helpful? 0
  • +
  • -

#8 blackcompe  Icon User is offline

  • D.I.C Lover
  • member icon

Reputation: 1152
  • View blog
  • Posts: 2,530
  • Joined: 05-May 05

Re: Big O(): singly linked list vs. tree insertion

Posted 09 August 2012 - 05:22 PM

Quote

If all you have is the traditional singly linked list with just a head pointer, then O(1) insertion at the middle doesn't seem possible without the use of extra pointers.


Obviously, but you can probably assume the most (time) efficient implementation is used. Pretty much all industrial-grade container libraries use doubly-linked lists.
Was This Post Helpful? 0
  • +
  • -

#9 sepp2k  Icon User is offline

  • D.I.C Lover
  • member icon

Reputation: 2101
  • View blog
  • Posts: 3,204
  • Joined: 21-June 11

Re: Big O(): singly linked list vs. tree insertion

Posted 09 August 2012 - 05:26 PM

View Postblackcompe, on 10 August 2012 - 02:22 AM, said:

Pretty much all industrial-grade container libraries use doubly-linked lists.


Except list implementations used by most functional languages. Though those have different performance considerations of course.
Was This Post Helpful? 1
  • +
  • -

#10 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3549
  • View blog
  • Posts: 10,993
  • Joined: 05-May 12

Re: Big O(): singly linked list vs. tree insertion

Posted 10 August 2012 - 12:33 AM

I managed to find my third favorite data structures and algorithms book:

Quote

The running time for LIST-INSERT on a list of n elements is O(1).

Cormen, Lieserson, and Rivest. Introduction to Algorithms. MIT. 1990. p. 205

Of course the code he shows is always an insert to the head of the list. Interestingly when they talk about LIST-DELETE, they also say that it an O(1) operation. Then they quickly append that if you need to delete specific item, then it becomes O(n).

(Now, I'm left wondering if I lent my Sedgwick and Tenenbaum books and they aren't buried because they should have been right along side Cormen.)
Was This Post Helpful? 0
  • +
  • -

#11 mojo666  Icon User is offline

  • D.I.C Addict
  • member icon

Reputation: 352
  • View blog
  • Posts: 770
  • Joined: 27-June 09

Re: Big O(): singly linked list vs. tree insertion

Posted 10 August 2012 - 09:55 AM

Expanding on one of Blackcompe's points:

Quote

Compared to other base standard sequence containers (vectors and deques), lists perform generally better in inserting, extracting and moving elements in any position within the container for which we already have an iterator, and therefore also in algorithms that make intensive use of these, like sorting algorithms.

-Quote taken from cplusplus


The important thing to note is that in a list, you are not always in a situation where you need to search. You may be executing an algorithm where you always have the reference to the area where the insert will occur. In such cases, search time is 0.

Contrast this to contiguous memory structures. Here, even though you can also avoid the search, you still have to move all the elements after an insert so such structures still suffer linear runtime.

Technically, the same logic for lists can be said of trees. However, I am unaware of any situation where one tracks his traversal through a tree and inserts at that point. I believe insertions always occur at leaves, thus it is not possible to seperate insertion from a traversal when discussing trees. That is why tree inserts are O(log(n)) average to O(n) worst case, while list inserts depend more on the context of the algorithm that is using it.
Was This Post Helpful? 2
  • +
  • -

#12 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3549
  • View blog
  • Posts: 10,993
  • Joined: 05-May 12

Re: Big O(): singly linked list vs. tree insertion

Posted 10 August 2012 - 01:26 PM

Most tree inserts happen at the leaves, but if your tree construction code knows that it is building a tree, it can often be holding on to the node that is going to become the parent. Example: building quad and oct trees. You don't go and call "addArea()" or "addVolume()" and start at the root of the tree. You have already created one area or volume node and and still above your depth threshold so you subdivide the area or volume and call "addArea/Volume()" passing the current node as the parent. If I recall correctly, there is also a version of dynamic Huffman tree construction that holds on to the last node that was inserted and its parent. This is to avoid traversing the tree for a common case where a node simply needs to get incremented, and if need be promoted.

Thanks for the explanation guys, I think I have a slightly clearer grasp on why the the list inserts into the middle don't always include traversal time, but tree insertions always do.
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1