5 Replies - 424 Views - Last Post: 11 October 2018 - 05:26 AM

#1 chloeCodes

  • D.I.C Head

Reputation: 4
  • Posts: 223
  • Joined: 05-January 17

MapReduce query: Changing reduce() function

Posted 10 October 2018 - 07:20 AM

I have been working on a simple word-count program that, given a text input, prints the number of occurrences of each word.
The reduce function looks like this:
def reducer(self, word, counts):
    yield (word, sum(counts))



Now I want to adjust the reduce() function so that only words that occur 10 or more times are written to the output file. I thought it might look like this:
def reducer(self, word, counts):
    if sum(counts) > 10:
        yield (word, sum(counts))



However, this doesn't work. I know that I only have to change the reduce method and can leave the map method as it is.


Replies To: MapReduce query: Changing reduce() function

#2 Programmer2004

  • D.I.C Head

Reputation: 18
  • Posts: 96
  • Joined: 25-October 17

Re: MapReduce query: Changing reduce() function

Posted 10 October 2018 - 07:22 AM

I'm not very familiar with Python, but maybe I can help. What exactly happens when you run this code?

#3 chloeCodes

Re: MapReduce query: Changing reduce() function

Posted 10 October 2018 - 07:27 AM

Thank you! A 0 is printed next to every word in the input string :/

#4 Programmer2004

Re: MapReduce query: Changing reduce() function

Posted 10 October 2018 - 07:34 AM

Hmm, the first thing that comes to my mind is that the declaration is incorrect. You wrote that you want to adjust the reduce function, but you declared the function as reducer.

#5 chloeCodes

Re: MapReduce query: Changing reduce() function

Posted 10 October 2018 - 07:43 AM

The mapper and reducer functions are declared as def mapper() and def reducer(), respectively. Both work correctly to produce the count of each word. Now I want to adjust reducer() so that only words that occur in the input text file 10 times or more appear in the output text file.
So there's nothing wrong with the method name I declared.

#6 chloeCodes

Re: MapReduce query: Changing reduce() function

Posted 11 October 2018 - 05:26 AM

Solved:

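count is an iterable that can only be consumed once, and the code iterates it twice: the first sum() in the if condition exhausts it, so the second sum() sees an empty iterable and returns 0.

You need to store the result once, then check it and output it. Otherwise the logic is correct, except that "10 or more" calls for >= rather than >.

def reducer(self, word, count):
    # Sum once; count is exhausted after this call.
    _count = sum(count)
    # >= so that words occurring exactly 10 times are included.
    if _count >= 10:
        yield (word, _count)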
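
For anyone who wants to see the exhaustion for themselves, here is a minimal sketch in plain Python, outside any MapReduce framework. It assumes the framework hands the reducer a one-shot generator of counts; the names buggy_reducer, fixed_reducer and ones are made up for this illustration and are not part of any library.

```python
def buggy_reducer(word, counts):
    # The first sum() consumes the generator; the second sum() then
    # runs over an empty generator and returns 0.
    if sum(counts) >= 10:
        yield (word, sum(counts))


def fixed_reducer(word, counts):
    # Consume the generator exactly once and reuse the stored total.
    total = sum(counts)
    if total >= 10:
        yield (word, total)


def ones(n):
    """Hypothetical stand-in for the framework: a one-shot generator of n ones."""
    return (1 for _ in range(n))


print(list(buggy_reducer("the", ones(12))))  # [('the', 0)]
print(list(fixed_reducer("the", ones(12))))  # [('the', 12)]
print(list(fixed_reducer("cat", ones(3))))   # []
```

If the counts arrived as a list instead of a generator, both versions would behave the same, which is why the bug is easy to miss when testing by hand with a list.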

