[Challenge] Finding ith Order Statistics Algorithm

  • (2 Pages)
  • +
  • 1
  • 2

17 Replies - 8693 Views - Last Post: 06 March 2014 - 11:37 PM

#1 Dogstopper   User is offline

  • The Ninjaducky
  • member icon

Reputation: 2972
  • View blog
  • Posts: 11,223
  • Joined: 15-July 08

[Challenge] Finding ith Order Statistics Algorithm

Post icon  Posted 07 February 2014 - 07:59 AM

Hey all!

This week's algorithm design and analysis will be on a topic called Order Statistics. The ith order statistic of a set of n elements is the ith smallest element. We can find the first and nth order statistic (min and max) easily in O(n) time by doing an algorithm like this:

let min <- A[1]
for i = 2 to A.length
    if min > A[i]
         let min <- A[i]
return min



And a similar idea for maximum. However, what I want to know is given an unsorted array A[1...n], can we find any ith order statistic also in O(n)? If so, give an algorithm and prove your statement.

For example, if I gave you A = [4,1,2,7,5] and asked for the 3rd order statistic, I would return 4

Hint: You can't just sort it and return the ith element

Please put your psuedocode in [spoiler ] tags and I'll post the right answer in a week.

Is This A Good Question/Topic? 2
  • +

Replies To: [Challenge] Finding ith Order Statistics Algorithm

#2 cfoley   User is offline

  • Cabbage
  • member icon

Reputation: 2397
  • View blog
  • Posts: 5,030
  • Joined: 11-December 07

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 07 February 2014 - 08:33 AM

Spoiler


Attempt2:

Spoiler

This post has been edited by cfoley: 07 February 2014 - 11:43 AM

Was This Post Helpful? 1
  • +
  • -

#3 AdamSpeight2008   User is offline

  • MrCupOfT
  • member icon

Reputation: 2298
  • View blog
  • Posts: 9,535
  • Joined: 29-May 08

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 07 February 2014 - 02:52 PM

Spoiler

This post has been edited by AdamSpeight2008: 07 February 2014 - 02:52 PM

Was This Post Helpful? 1
  • +
  • -

#4 BetaWar   User is offline

  • #include "soul.h"
  • member icon

Reputation: 1649
  • View blog
  • Posts: 8,520
  • Joined: 07-September 06

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 07 February 2014 - 07:53 PM

Spoiler

Was This Post Helpful? 2
  • +
  • -

#5 TwoOfDiamonds   User is offline

  • D.I.C Regular

Reputation: 54
  • View blog
  • Posts: 272
  • Joined: 27-July 12

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 08 February 2014 - 12:03 AM

Spoiler

Was This Post Helpful? 2
  • +
  • -

#6 Dogstopper   User is offline

  • The Ninjaducky
  • member icon

Reputation: 2972
  • View blog
  • Posts: 11,223
  • Joined: 15-July 08

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 08 February 2014 - 09:32 AM

Nice job TwoOfDiamonds! That's right along the lines of what I was going for. There are other ways of doing it while preserving the array, but you nailed the concept.

Note that there is a way to guarantee O(n) even in the worst case, so see if you can figure out how to accomplish that.
Was This Post Helpful? 1
  • +
  • -

#7 cfoley   User is offline

  • Cabbage
  • member icon

Reputation: 2397
  • View blog
  • Posts: 5,030
  • Joined: 11-December 07

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 08 February 2014 - 10:21 AM

I had the same method as my first attempt. I don't understand why it's O(N), not O(N log N)

To preserve the order:

Spoiler


To guarantee O(N):

Spoiler

Was This Post Helpful? 1
  • +
  • -

#8 Dogstopper   User is offline

  • The Ninjaducky
  • member icon

Reputation: 2972
  • View blog
  • Posts: 11,223
  • Joined: 15-July 08

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 08 February 2014 - 11:09 AM

Cfoley, solve the recursion. Easy using Master Theorem to show that it's in fact O(n). In order to guarantee O(n) in the worst case, though, I have not seen the correct algorithm yet.
Was This Post Helpful? 1
  • +
  • -

#9 AdamSpeight2008   User is offline

  • MrCupOfT
  • member icon

Reputation: 2298
  • View blog
  • Posts: 9,535
  • Joined: 29-May 08

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 09 February 2014 - 10:36 AM

Dogstopper If the snippet system was working, I'd could show you QuickSelect I think I named it QuickFind and it'll show that the algorithm is what TwoOfDiamonds described. Not that I'm a bitter reputation hunting whore hound.
Was This Post Helpful? 1
  • +
  • -

#10 Skydiver   User is offline

  • Code herder
  • member icon

Reputation: 7292
  • View blog
  • Posts: 24,679
  • Joined: 05-May 12

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 09 February 2014 - 09:31 PM

I don't see anything wrong with BetaWar's approach. Radix sort is O(n), so once the data is sorted indexing into the sorted result would be O(1), since O(n) > O(1), the solution would be O(n). Why go through the complexity of implementing a custom QuickSort?
Was This Post Helpful? 0
  • +
  • -

#11 macosxnerd101   User is offline

  • Games, Graphs, and Auctions
  • member icon




Reputation: 12691
  • View blog
  • Posts: 45,880
  • Joined: 27-December 08

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 09 February 2014 - 09:40 PM

Actually, Radix Sort is O(n log n). Radix Sort is usually written as O(kn); however, k = O(log(n)). Let arr be an array of n elements from {0, ..., n-1}, randomly ordered. Then it takes k iterations, where k = log(n). Think of it this way. If we have elements 0-99, we take 2 iterations (log(100)). If we have elements 0-999, we take 3 iterations (log(1000)). Thus, for n elements, k = O(log(n)).

We can sort an array of bits in linear time, but that is it.
Was This Post Helpful? 0
  • +
  • -

#12 mojo666   User is offline

  • D.I.C Addict
  • member icon

Reputation: 409
  • View blog
  • Posts: 885
  • Joined: 27-June 09

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 09 February 2014 - 09:46 PM

Quote

I don't understand why it's O(N), not O(N log N)


Spoiler


EDIT

Quote

Actually, Radix Sort is O(n log n). Radix Sort is usually written as O(kn); however, k = O(log(n))


Assuming we have all the elements in the range, this is correct, but n refers to the number of elements in the array, k refers to the largest individual element. It could be that k is way bigger than n making O(kn) much bigger than O(n^2) such as [0, 5, 1090823467892164]

This post has been edited by mojo666: 09 February 2014 - 09:51 PM

Was This Post Helpful? 2
  • +
  • -

#13 Skydiver   User is offline

  • Code herder
  • member icon

Reputation: 7292
  • View blog
  • Posts: 24,679
  • Joined: 05-May 12

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 09 February 2014 - 10:00 PM

Oh, I think I see where my misunderstanding is... I knew that radix sort was of O(kn), but to me k was always a constant value that depended the size of the key. Since the key size doesn't change it's just the same as O(3n) or O(12n) which is the same as O(n). So for example given a database whose key is is 32-bit integer, the key size will be constant. It seems that my understanding is flawed and I need to study some more.
Was This Post Helpful? 0
  • +
  • -

#14 cfoley   User is offline

  • Cabbage
  • member icon

Reputation: 2397
  • View blog
  • Posts: 5,030
  • Joined: 11-December 07

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 10 February 2014 - 08:17 AM

Spoiler

Was This Post Helpful? 0
  • +
  • -

#15 Dogstopper   User is offline

  • The Ninjaducky
  • member icon

Reputation: 2972
  • View blog
  • Posts: 11,223
  • Joined: 15-July 08

Re: [Challenge] Finding ith Order Statistics Algorithm

Posted 10 February 2014 - 11:49 AM

BetaWar's approach works very well...with numbers. However, most of the real world applications that we see have complex data structures that can be compared but may not have an value off of which to base a counting/radix sort.

And everyone that has used a pivot is right on the money regarding the low probability for a bad pivot, but using the pivot method, there is still a chance to degrade to O(n^2).

AdamSpeight2008, QuickSelect is a neat way of doing it that uses randomized pivots.

Thanks for participating so far y'all! It's been neat to see the different approaches taken.
Was This Post Helpful? 0
  • +
  • -

  • (2 Pages)
  • +
  • 1
  • 2