## 28 Replies - 5648 Views - Last Post: 06 August 2015 - 12:37 PM

### #1

# Challenge: Keep track of the median of a stream of integers

Posted 26 July 2015 - 07:32 PM

The *real* challenge is how to do this efficiently. So here are some benchmarks that you can try to hit (where 'n' is the number of integers queried from the stream so far). It is also interesting to just come up with all the algorithms that meet these various space and time requirements.

1. O(n lg n) time and constant space

2. O(n) time and space

3. O(lg n) time and O(n) space

4. O(1) amortized time and O(n) space

5. O(1) time and O(1) space

Remember to post solutions in spoiler tags. Numbers 1 and 2 should be easy to get. The solution I saw today will naively meet 3. It can be easily sped up to 4. I'm pretty sure there is an easy way to do this in constant space in exact time but I haven't worked though all the details yet.

##
**Replies To:** Challenge: Keep track of the median of a stream of integers

### #2

## Re: Challenge: Keep track of the median of a stream of integers

Posted 26 July 2015 - 07:41 PM

### #3

## Re: Challenge: Keep track of the median of a stream of integers

Posted 26 July 2015 - 08:43 PM

If you are familair with Haskell then infinite lists are streams. If you are familiar with Python then generators fit the bill. If you are familiar with C# then IEnumerators constructed using the yield keyword form streams. in C++ and many other languages as mentioned the file IO classes (in C++ literally called stream) form streams of characters. In the case of C++ if you just keep reading integers from the stream then they are streams of integers (so C++ streams are sorta like heterogeneous streams really).

Additionally one can view functions from integers to objects as being streams. If you plug in 0 you get the first element, plug in 1 you get the second, etc... These sorts of streams can have some really strange run time properties.

Here is how I would set things up in the following languages. You should be able to find something that fits the bill for your language of choice here. All the following code is untested and incomplete but should give you the right idea.

Python:

def pi_digits(): while True: digit = #compute digit of pi or some such other thing yield digit #now build a function that efficiently yields the median digit at any given time def medians(digits): #store some needed state here for digit in digits: #update to find the new median thus far yield median for med in medians(pi_digits): print(med)

Lua, Ruby, .NET (see below), and many other dynamic languages all have a feature like this. So this should generalize to many langauges.

C#:

public class Median { public static IEnumerable PiDigits() { //setup some state here while(true) { //compute some new stuff and update the state yield digit; } } public static IEnumerable Medians(IEnumerable digits) { //setup some state here foreach (int number in digits) { //compute the new median yield median; } } public static void Main() { foreach(int med in Medians(PiDigits())) { Console.out.println(med); } } }

This should work for any .NET language as stated above.

Haskell (the easiest language to do this in):

digits :: [Int] digits = --some infinite list --here the "..." is for some accumulator elements need to maintain the state --you could also use the state monad if you were feeling extra monadic today --you could also get creative and do this with folds/scans/maps medians :: ... -> [Int] -> [Int] medians ... digits = --compute the stream of medians the given inputs --here the "..." is the the initial accumulators main = do mapM_ print (medians ... digits)

This directly generalizes for any lazy language. If you are using a strict functional language and it doesn't have lazy lists ready to go you can make a type called 'Stream' which has a single element called 'head' and another called 'tail' which is a function from unit to the stream type.

C++ (this should work for most imperative languages like Java):

int main() { while(true) { int digit; cin >> digit; //do some computation cout << median; } }

So basically just reading from the console and outputting to the console for a possibly indefinite amount of time. You should be able to do this in any language worth it's salt.

If you don't know the best way to implement streams in your language of choice just ask and I'll see if I can give you a decent solution to work with. Streams are really neat things from a theoretical point of view and a god sent to real programmers. Learn to use them!!

This post has been edited by **ishkabible**: 26 July 2015 - 08:53 PM

### #4

## Re: Challenge: Keep track of the median of a stream of integers

Posted 26 July 2015 - 08:45 PM

Quote

And was hoping for some textual representation (ie., an example) of the input.

### #5

## Re: Challenge: Keep track of the median of a stream of integers

Posted 26 July 2015 - 09:02 PM

so say my input is the digits of pi 31415926535897932384626433832795028841971693993751...

First I would read in 3 and the output would be 3

Next I would read in 1 and 4 and the output would be 3

Next I would read in 1 and 5 and the output would be 3 (again)

Next I would read in 2 and 9 and the output would still be 3

etc...

Does that clarify things? I can't actually write the whole infinite input out but the idea is clear, right?

This post has been edited by **ishkabible**: 26 July 2015 - 09:06 PM

### #6

## Re: Challenge: Keep track of the median of a stream of integers

Posted 26 July 2015 - 09:04 PM

### #7

## Re: Challenge: Keep track of the median of a stream of integers

Posted 26 July 2015 - 11:36 PM

### #8

## Re: Challenge: Keep track of the median of a stream of integers

Posted 27 July 2015 - 12:45 AM

Here is your solution in Haskell />

This post has been edited by **ishkabible**: 27 July 2015 - 01:13 AM

### #9

## Re: Challenge: Keep track of the median of a stream of integers

Posted 27 July 2015 - 07:39 AM

### #10

## Re: Challenge: Keep track of the median of a stream of integers

Posted 27 July 2015 - 09:04 AM

macosxnerd101, on 27 July 2015 - 12:36 AM, said:

Unless I am missing something, this doesn't work.

Let us consider the sequence of 0, 1, 2, 3, 4, 5, 6, ...

Med [0, 1, 2]

Receive (3, 4) - > There is no element i (3 or 4) that satisfies Med[0] < i < Med[2]. (yet the Median is now 2).

### #11

## Re: Challenge: Keep track of the median of a stream of integers

Posted 27 July 2015 - 09:06 AM

### #12

## Re: Challenge: Keep track of the median of a stream of integers

Posted 27 July 2015 - 11:21 AM

A good overview of the other ways to solve this can be found here: http://www.ardendert...integer-stream/

### #13

## Re: Challenge: Keep track of the median of a stream of integers

Posted 27 July 2015 - 11:56 AM

Quote

.

.

.

I'm pretty sure there is an easy way to do this in constant space in exact time but I haven't worked though all the details yet.

This is only possible if by "Integer" you mean "An integer within a given range or other finite set of values". If we are reading integers of any possible value, then every integer we have already read can become the median in the future, so every integer needs to be stored.

For cases where we do have a range, we can implement a counting algorithm as outlined below.

This post has been edited by **mojo666**: 27 July 2015 - 12:29 PM

### #14

## Re: Challenge: Keep track of the median of a stream of integers

Posted 27 July 2015 - 12:23 PM

macosxnerd101, on 27 July 2015 - 05:06 PM, said:

So apparently when implementing your idea I accidentally fixed (at least) this mistake because I misinterpreted you. The implementation I provided however still doesn't wind up working, on the same input not less, but it has to read in 8 digits before it fails rather than 3. It really seemed like there should be an O(1) space and time solution to this! I just can't seem to figure it out. The more and more I think about it however, the more and more I disagree that this should have an O(1) time and space solution.

Quote

Yes this is for arbitrary streams; the digits of pi was just an example. However I encourage people to think about how it makes this problem very easily solvable if the digits are bounded by a small number!

solution:

edit: this is the same solution as above

Also one last thing: I said that you can push this all the way down to linear space an constant amortized time. I made a mistake when I asserted that. The implementation I was thinking of still requires an amortized O(lg n) operation every once and a while. In worst case this operation actually happens every time.

here is the solution:

This post has been edited by **ishkabible**: 27 July 2015 - 12:26 PM

### #15

## Re: Challenge: Keep track of the median of a stream of integers

Posted 30 July 2015 - 12:17 PM

This post has been edited by **dday9**: 30 July 2015 - 12:17 PM