3 Replies - 302 Views - Last Post: 16 October 2017 - 09:42 AM

#1 alexz003


TypeError: object of type 'int' has no len() in ThreadPool

Posted 15 October 2017 - 06:38 PM

I am having a weird issue that I can't seem to figure out and am hoping someone can give me some insight.

I am using a ThreadPool from multiprocessing.pool and, for some reason, after generating a list of threads, I can't iterate over that list.

Source:
    file_count = 1
    item_count = 0
    while os.path.isfile('imdb.pkl{}'.format(file_count)):
        
        url_list = pickle.load(open('imdb.pkl{}'.format(file_count), 'rb'))
    
        threads = [pool.apply_async(get_single_title_info,
                            args=(url_list[i][1][url_list[i][1].index('/tt') + 1 : url_list[i][1].rindex('/')],), 
                            callback=title_info_callback) for i in range(len(url_list))]

        thread_length = len(url_list)
        print(thread_length)
	
        print('Threads working...')

        thread_result = [threads[i].get() for i in range(0, thread_length)]


        file_count += 1



Output:
223
Threads working...
Traceback (most recent call last):
  File "run.py", line 375, in <module>
    get_info_from_imdb()
  File "run.py", line 365, in get_info_from_imdb
    thread_result = [threads[i].get() for i in range(0, thread_length)]
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
TypeError: object of type 'int' has no len()



Here you can see that I'm printing out the total number of threads to process (223), but when I run the for loop I get the error above. I've tried rewriting the for loop using range(len(url_list)), range(thread_length), and for thread in threads: _ = thread.get() but keep coming up with the same error.

Thanks for any help provided!

Edit:
Formatting

This post has been edited by alexz003: 15 October 2017 - 06:44 PM



Replies To: TypeError: object of type 'int' has no len() in ThreadPool

#2 jon.kiparsky


Posted 15 October 2017 - 07:03 PM

It's not exactly clear to me what the problem is, but I would start by suggesting that you make your list comprehensions a little more idiomatic. Specifically, you want to iterate over the items in a list, not over their indices. Something more like this would in general be preferable

thread_result = [thread.get() for thread in threads]


but it's hard to say if there's any reason why it wouldn't work based on the little sample you've given us.
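
The same idea can be applied to the loop that submits the work. Here is a rough sketch, assuming each entry of url_list is a pair whose second element is a URL ending in a slash. The names title_id_from_url and fake_title_info and the sample URLs are made up for illustration; fake_title_info only stands in for get_single_title_info, which we haven't seen.

```python
from multiprocessing.pool import ThreadPool


def title_id_from_url(url):
    # Same slice as in the question: the text between the '/' that precedes
    # 'tt' and the last '/' in the URL.
    return url[url.index('/tt') + 1:url.rindex('/')]


def fake_title_info(title_id):
    # Hypothetical stand-in for get_single_title_info.
    return {'id': title_id}


# Made-up sample data in the assumed (label, url) shape.
url_list = [
    ('example 1', 'http://www.imdb.com/title/tt0111161/'),
    ('example 2', 'http://www.imdb.com/title/tt0068646/'),
]

pool = ThreadPool(processes=2)

# Iterate over the entries themselves instead of over range(len(url_list)).
threads = [pool.apply_async(fake_title_info, args=(title_id_from_url(url),))
           for _, url in url_list]
thread_result = [thread.get() for thread in threads]

pool.close()
pool.join()
print(thread_result)
```

Pulling the slice into a small function also makes it easy to test on a single URL. Note that if a URL has no trailing slash, the slice comes back empty, so it may be worth checking what the pickled URLs actually look like.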

#3 alexz003


Posted 15 October 2017 - 07:50 PM


I took your advice and rewrote it a bit.

Source:
    thread_count = 15
    pool = ThreadPool(processes=thread_count)
    
    file_count = 1
    item_count = 0
    while os.path.isfile('imdb.pkl{}'.format(file_count)):
        
        url_list = pickle.load(open('imdb.pkl{}'.format(file_count), 'rb'))


        threads = []

        for i in range(len(url_list)):
            # Turn URL into ID
            title_id = url_list[i][1][url_list[i][1].index('/tt') + 1 : url_list[i][1].rindex('/')]
            
            # Add thread to threadpool
            threads.append(pool.apply_async(get_single_title_info,
                            args=(title_id,)))

        thread_length = len(threads)
        print(thread_length)
        
        print('Threads working...')
        

        for i in range(len(url_list)):
            threads[i].get()

        file_count += 1



I should add that I have also tried thread_result = [thread.get() for thread in threads], but I get the same error. What information could I supply to help figure this out?

This post has been edited by alexz003: 15 October 2017 - 07:53 PM


#4 woooee


Posted 16 October 2017 - 09:42 AM

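One line that looks wrong at first glance is
for i in range(len(url_list))]  ## not a syntax error in context
Read on its own it seems to be missing a colon and to have an extra "]", but it is the tail of the list comprehension that starts at threads = [pool.apply_async(..., so the "]" closes that list and no colon is needed. The rewritten version with an explicit for loop does the same job.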

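The problem surfaces at threads[i].get(). pool.apply_async() returns an AsyncResult object, and its get() method re-raises any exception that was raised inside the worker function. The last frame of the traceback is "raise self._value" in pool.py, so the TypeError most likely comes from inside get_single_title_info (some call to len() on an int), not from the loop over the threads.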
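
A minimal sketch of how an exception inside a worker shows up at get(), and one way to see where it really came from. flaky_worker is a made-up stand-in for get_single_title_info (which we haven't seen), and with_traceback is a hypothetical helper, not part of multiprocessing.

```python
import traceback
from multiprocessing.pool import ThreadPool


def flaky_worker(value):
    # Hypothetical stand-in for get_single_title_info: it calls len() on
    # whatever it is given, which fails for an int.
    return len(value)


def with_traceback(func):
    # Wrap a worker so the traceback from inside the worker thread is
    # printed before the exception travels back to the caller.
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            traceback.print_exc()
            raise
    return wrapper


pool = ThreadPool(processes=2)

# Submitting the tasks succeeds: apply_async returns AsyncResult objects.
results = [pool.apply_async(with_traceback(flaky_worker), args=(item,))
           for item in ['abc', 42]]

outcomes = []
for result in results:
    try:
        outcomes.append(result.get())
    except TypeError as exc:
        # get() re-raises the exception that was raised inside the worker.
        outcomes.append('worker failed: {}'.format(exc))

pool.close()
pool.join()
print(outcomes)
```

Wrapping get_single_title_info the same way should print the file and line inside the worker where len() is being called on an int, which is the information missing from the traceback posted above.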

This post has been edited by woooee: 16 October 2017 - 09:50 AM
