
Threading

#1 erik.price

Posted 04 April 2010 - 07:01 PM

1. What are Threads?
Traditional, non-multithreaded programs use a single thread of execution, in which each statement is executed sequentially until the program terminates. A multithreaded program, on the other hand, has multiple threads of execution and can therefore execute instructions in parallel (on a multicore CPU). On a single-core computer, however, this parallelism is simulated by switching execution between threads.

Many programs can benefit from multithreading, even in single-core environments, but others will really only benefit in multicore environments.
Programs that are compute-bound (that is, they perform computations and depend only on the speed of the CPU) are not a good fit for threads on a single-core computer.
However, if your program is IO-bound, meaning that it spends a significant amount of time waiting for a file to be read, bytes to be transferred over a network, or an image to be rendered, multithreading can be beneficial.

Consider your favorite browser as an example for multithreading. It is a largely IO-bound program, as it needs to download text, images, and anything else that makes your daily lolcat browsing possible. One thread could wait for the HTML page to finish downloading, while another renders an image, and so on.
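
To make the IO-bound case concrete, here is a minimal sketch that fakes the waiting with sleep. The method name fake_download and the file names are made up for illustration, and the timings in the comments are rough expectations, not measurements.

```ruby
# Simulate an IO-bound task: the code just waits, as it would for a
# slow disk or network. fake_download is a hypothetical stand-in.
def fake_download(name)
  sleep 0.2
  "#{name} done"
end

names = ["page.html", "logo.png", "style.css"]

# One after another: the waits add up (roughly 3 * 0.2 seconds).
start = Time.now
sequential = names.map { |n| fake_download(n) }
sequential_time = Time.now - start

# One thread per task: the waits overlap (roughly 0.2 seconds in total).
start = Time.now
threads = names.map { |n| Thread.new { fake_download(n) } }
# Thread#value waits for the thread and returns the result of its block.
threaded = threads.map { |t| t.value }
threaded_time = Time.now - start

puts "sequential: #{sequential_time.round(2)}s"
puts "threaded:   #{threaded_time.round(2)}s"
```

Because the threads spend nearly all their time waiting, the threaded version should finish sooner even on a single core.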


2. How Do I Use Threads in Ruby?

In Ruby, it is extremely easy to create and use threads. Simply create a new Thread object and pass it a block containing what you want that thread to do.
Here’s an example of this:

t1 = Thread.new do
	puts "I'm A Thread!"
end


Now, this thread will only print “I’m A Thread!” to the screen, and then exit, but hopefully it’ll show you how to create a Thread.
There is no need to explicitly start a Thread either; they will begin to execute the block they are given the moment resources are available.

Note: In Ruby, not all implementations are created equal. In Ruby 1.8, all threads within the program run inside a single native thread and are scheduled by the interpreter itself, which is written in C (so-called green threads). This means that a Thread in Ruby 1.8 can never run in parallel, even in multicore environments.

In Ruby 1.9, however, each Ruby Thread is backed by a native thread. Some C libraries included in the 1.9 implementation are not thread-safe, so as a precaution Ruby 1.9 is very restrictive with its thread scheduling: a global interpreter lock allows only one thread to run Ruby code at a time.

In JRuby, the Java implementation of Ruby, the interpreter takes advantage of Java Threads, which usually map themselves to native threads. There is some significant overhead here, and this may cause a loss of speed as well.

If you want to use threads in a Ruby program, use Ruby 1.9, as the 1.8 implementation is drastically slower (in my test, 1.8 ran 3-5 times slower than 1.9).


In every Ruby program, there is at least one thread running at all times: the main thread. This thread is special, because when it finishes execution, the interpreter stops running. The program will exit even if other threads are still running, which is a bit of an annoyance, but luckily there’s an easy workaround. Simply call the Thread#join method on any Thread that you want to allow to finish executing.
Example:
t1 = Thread.new do
  #do nothing, but take a while doing it
  0.upto(10) { sleep 1 }
end
t1.join #allow thread to finish
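
As a small extension of the join example, this sketch shows two related calls: join with a time limit, and Thread#value. The variable names are only for illustration.

```ruby
slow = Thread.new { sleep 0.5; :finished }

# join with a limit (in seconds) gives up waiting and returns nil
# if the thread is still running when the limit expires.
first_try = slow.join(0.05)
puts "still running" if first_try.nil?

# join without a limit blocks until the thread is done
# and returns the thread itself.
second_try = slow.join
puts "joined: #{second_try == slow}"

# value joins the thread and returns the last expression of its block.
result = slow.value
puts "result: #{result.inspect}"
```

The limit is handy when the main thread should not wait forever on a worker that might hang.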


In the main thread, if an exception occurs and isn’t handled, it causes the Ruby interpreter to print an error message and exit. In other threads, however, this doesn’t happen, at least by default: only that thread dies. If you want unhandled exceptions in any Thread to cause the interpreter to exit with an error, use the class method Thread.abort_on_exception=. If you want to set this property for a specific thread, you can use the instance method of the same name: Thread#abort_on_exception=
Example:
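t1 = Thread.new do
  #raise exception; the thread dies, but the program keeps running
  raise "BOOM!"
end
sleep 0.1 #give t1 time to run before changing the setting
Thread.abort_on_exception = true
t2 = Thread.new do
  #now the exception aborts the whole program with an error
  raise "BOOM!"
end
sleep 0.1 #keep the main thread alive long enough for t2 to run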


Using this method will force you to handle all the exceptions that occur in your threads, which, in my opinion, is good practice.
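
One way to handle a thread's exception, sketched here under the assumption that the default abort_on_exception setting (false) is in effect: Thread#join re-raises the thread's unhandled exception in the thread that calls join, so it can be rescued there. The report_on_exception call only exists on newer Rubies, so it is guarded; it just silences the automatic warning.

```ruby
worker = Thread.new do
  # Assumption: report_on_exception= is not available on old Rubies,
  # hence the respond_to? guard. It only suppresses the warning output.
  if Thread.current.respond_to?(:report_on_exception=)
    Thread.current.report_on_exception = false
  end
  raise ArgumentError, "BOOM!"
end

begin
  worker.join            # the thread's exception is re-raised here
  outcome = "no error"
rescue ArgumentError => e
  outcome = "caught: #{e.message}"
end

puts outcome
```

Rescuing inside the thread's own block works too; which one fits better depends on who knows how to recover.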

Because Threads are defined using blocks, they have access to all variables that exist in the scope where they are created, meaning they can access globals, local variables, instance variables, class variables, etc. Due to the nature of Ruby’s scope rules, any variable declared within the thread’s block does not exist outside of it. You can have multiple threads access the same variables, but you need to make sure that they don’t interfere with each other. We’ll touch upon this in a little bit with the Mutex class.
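
A quick sketch of those scope rules, using made-up variable names:

```ruby
greeting = "hello"   # local variable defined before the thread
log = []

t = Thread.new do
  inner = greeting.upcase   # declared inside the thread's block
  log << inner              # outer variables can be read and modified
end
t.join

puts log.inspect
# inner was declared inside the block, so it does not exist out here.
puts defined?(inner).inspect
```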

A few of Ruby’s global variables are created to be thread local, meaning that their value is dependent on the thread that they are being accessed from. One such variable is $~ which represents the last regular expression match. This is useful if you are matching regular expressions in more than one thread, as you don’t want to find that you’ve been using results from a set of data that had nothing to do with what that specific thread was doing!

Thread local variables:

	$SAFE	#safe level for program execution
	$!	#last exception object raised
	$@	#stack trace of last exception
	$_	#last string read by Kernel.gets or Kernel.readline
	$~	#MatchData object produced by the last pattern matching operation
	$&	#most recently matched text
	$`	#string preceding the match in $&
	$'	#string following the match in $&
	$+	#string corresponding to the last matched group in $&
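
Here is a small sketch of what thread local means for $~. The strings and patterns are arbitrary; the point is that a match made in one thread does not overwrite the match data of another.

```ruby
"hello" =~ /ll/              # sets $~ for the main thread

seen_in_thread = Thread.new do
  before = $~                # this thread has made no match of its own yet
  "world" =~ /or/            # sets $~ for this thread only
  [before, $~[0]]
end.value

main_match = $~[0]           # still the main thread's match

puts seen_in_thread.inspect
puts main_match
```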


When there are more threads than CPU cores available (as is most often the case), a process called thread scheduling kicks in. Each thread has a priority, and higher priority threads will be run before threads with lower priority. To set the priority of a Thread, use Thread#priority=, and to check the priority, use Thread#priority. Every thread begins with the priority of the thread that created it. The main thread has a priority of 0, so every thread it creates will also have a priority of 0 unless it is changed individually. There is no way to change the priority of a thread before it runs, but you can make it change its own priority as the first action it takes.

NOTE: Since Ruby 1.9 uses native threads, attempts to change the priority of a thread under a GNU/Linux operating system will be ignored.
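
A minimal sketch of reading and setting priorities, assuming the standard C implementation of Ruby. The value -1 is an arbitrary choice, and whether the scheduler actually honors it depends on the implementation and the operating system.

```ruby
parent_priority = Thread.current.priority

worker = Thread.new do
  inherited = Thread.current.priority   # starts with the creator's priority
  Thread.current.priority = -1          # lower it as the very first action
  [inherited, Thread.current.priority]
end

inherited, lowered = worker.value
puts "parent: #{parent_priority}, inherited: #{inherited}, lowered: #{lowered}"
```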


A thread scheduler is what determines which thread gets to run when, and there are two main types: preemptive and cooperative. Preemptive schedulers only allow a thread to run for a certain amount of time before it must relinquish control to another thread. This way, each thread gets to run, and none get “starved” (not executed at all). Cooperative schedulers, on the other hand, rely on each individual thread to give up control at some point. If a thread is compute-bound and is running on a cooperative scheduler, it will starve other threads until it sleeps, waits for IO, or another, higher priority thread is created. If you are writing a compute-bound thread, it is good practice to occasionally call Thread.pass (a class method) to let other threads have a chance to run once in a while.
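
As a sketch of that advice, here is a compute-bound loop that calls Thread.pass every so often. The loop size and the interval of 50,000 iterations are arbitrary; Thread.pass is only a hint, so the exact interleaving is up to the scheduler.

```ruby
progress = []

cruncher = Thread.new do
  sum = 0
  1.upto(200_000) do |i|
    sum += i
    Thread.pass if i % 50_000 == 0   # hint to the scheduler: let others run
  end
  progress << :cruncher_done
  sum
end

reporter = Thread.new { progress << :reporter_ran }

total = cruncher.value   # wait for the computation and fetch its result
reporter.join
puts total
```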

A Ruby thread has 5 different states: runnable, sleeping, aborting, terminated (normal), and terminated (abnormal). A thread is said to be alive if it is in the "runnable" or "sleeping" state. A runnable thread is one that is currently running, or will run the next time resources are available. A sleeping thread is one that is sleeping through the Kernel#sleep method, or waiting for IO. A thread that terminated normally finished its block of code and exited without a problem, and an abnormal termination means that the thread ended with an exception. An aborting thread is in the process of terminating.
You can query the state of a thread through the Thread#status method. This will return one of five values:
	"run"		#thread is runnable
	"sleep"		#thread is sleeping
	"aborting"	#thread is aborting
	false		#thread terminated normally
	nil		#thread terminated abnormally
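
The following sketch puts threads into four of these states and queries them. The "aborting" state is left out because it is hard to catch reliably. The report_on_exception call is guarded since it only exists on newer Rubies; it just keeps the output quiet.

```ruby
sleeper  = Thread.new { sleep }   # sleeps until woken or killed
finished = Thread.new { :ok }
failed   = Thread.new do
  if Thread.current.respond_to?(:report_on_exception=)
    Thread.current.report_on_exception = false
  end
  raise "BOOM!"
end

finished.join
begin
  failed.join                     # re-raises the thread's exception
rescue RuntimeError
  # expected: the thread terminated abnormally
end
sleep 0.01 until sleeper.status == "sleep"

statuses = [Thread.current.status, sleeper.status,
            finished.status, failed.status]
alive = [sleeper.alive?, finished.alive?]

puts statuses.inspect
puts alive.inspect
sleeper.kill
```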


3. Example

This example is IO-bound, so it will benefit from the use of threads even in a single-core environment: a concurrent file reader. In this case, the files are websites, which will be downloaded. The downloaded data is simply discarded at the end.
require 'open-uri' #for open
def conRead(urls) #urls is an array of strings containing website urls
  thread_list = [] #keep track of our threads
  urls.each do |f|
    #add a new thread to download each site
    thread_list << Thread.new {
      #on Ruby 2.5 and later, use URI.open(f) here instead;
      #plain open stopped accepting URLs in Ruby 3.0
      open(f) do |x|
        x.read
      end
      puts "read: " + f #show what we've done
    }
  end
  thread_list.each {|x| x.join} #wait for each thread to complete
end
urls = [] #set up an array of urls
urls << "http://ruby-lang.org" #add some random sites
urls << "http://google.com"
urls << "http://slashdot.com"
urls << "http://dreamincode.net"
urls << "http://xkcd.com"
urls << "http://engadget.com"
urls << "http://lifehacker.com"

conRead(urls) #read them concurrently


You should see that the sites do not finish in the order in which they were started. They tend to finish in order of which site is smallest and has the lowest ping. To get a feel for why having threads here helps at all, let’s create a sequential version, where each website is loaded only after the last has finished:
def seqRead(urls) 
  urls.each do |f|
    open(f) do |x|
      x.read
    end
    puts "read: " + f
  end
end


Now, in order to compare the speed of the two methods, we’ll use the Benchmark module to, well, perform a benchmark:
require 'benchmark' #for Benchmark.bm
Benchmark.bm do |x|
  #urls is the same array of site names used before
  #run concurrent read method
  x.report("Thr:") { conRead(urls) }
  #run sequential read method
  x.report("Seq:") { seqRead(urls) }
end


Obviously, the results will vary between runs, but the data should still show a consistent advantage for the threading method. Here are my results:

      user     system      total        real
Thr:  0.140000   0.080000   0.220000 (  2.380668)
Seq:  0.110000   0.060000   0.170000 (  5.335737)


As you can see, threading shaved a good 3 seconds off the total run time.


4. Thread Exclusion

When two or more threads share access to the same variables, precautions must be taken so that operations on those variables appear atomic. What does this mean? Well, an atomic operation is a set of operations combined so that they appear to be a single operation. Consider incrementing a variable as an example:

a = a + 1


To make this work, Ruby first loads the value of a from memory. Then it adds 1 to the value, but doesn’t yet write the change back into memory. The last step is to overwrite the original value in memory with the newly computed one. This is fine in a single-threaded program, but imagine a multithreaded one in which two threads are incrementing the same value. One thread could complete the first two steps and, before the value is written back to memory, pass execution to another thread, which reads the unchanged value and increments that. When execution passes back to the first thread, it writes the value it computed, meaning that the value ends up incremented only once, rather than twice.

This is a pretty trivial and oversimplified example, but it should illustrate the problem.

To solve problems like this, we use locks. Basically, a lock guards a set of variables so that only one thread can use them at a time, which bundles a whole bunch of statements into one atomic operation. In Ruby, we use the Mutex class to accomplish this. In Ruby 1.9, Mutex is a core class; in Ruby 1.8, you must require 'thread' to use it.

Here’s an example of how it can be useful:
require 'thread' #for backwards compatibility with 1.8

x = 0
y = 0
total = 0

t1 = Thread.new {
  loop do
    x += 1
    sleep 0.1
    y += 1
  end
}

spy = Thread.new {
  loop do
    total += (x-y).abs #find the difference
  end
}

sleep 1 #allow the threads to run for 1 second
puts "x: #{x}\ny: #{y}\nDif: #{total}"


Run this, and you should see that the difference is pretty substantial. For me, it was almost 2,000,000.

Now, let’s run the same program, this time, utilizing locks:
require 'thread' #for backwards compatibility with 1.8
@lock = Mutex.new #our lock

x = 0
y = 0
total = 0

t1 = Thread.new {
  loop do
    #only one thread at a time can hold the lock, so
    #no other thread synchronizing on it can run its
    #block until this block has finished
    @lock.synchronize {
      x += 1
      sleep 0.1
      y += 1
    }
  end
}

spy = Thread.new {
  loop do
    @lock.synchronize {
      total += (x-y).abs
    }
  end
}

sleep 1
puts "x: #{x}\ny: #{y}\nDif: #{total}"


When you run the program this time, you should see that the difference is 0, because our Mutex guarantees that our ‘spy’ thread will never be able to see the values of x and y until they are both incremented.
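
The same idea applied to the increment from the start of this section, as a minimal sketch. The read and the write are split into separate steps on purpose, and Thread.pass invites a thread switch at the worst possible moment. The thread and iteration counts are arbitrary.

```ruby
require 'thread'   # only needed on Ruby 1.8

counter = 0
lock = Mutex.new

threads = (1..5).map do
  Thread.new do
    100.times do
      lock.synchronize do
        current = counter       # step 1: read the value
        Thread.pass             # another thread may run now, but it cannot enter this block
        counter = current + 1   # step 2: write the new value back
      end
    end
  end
end
threads.each { |t| t.join }

puts counter   # 5 threads * 100 increments each
```

Without the synchronize block, increments could be lost whenever a switch happens between the read and the write.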

Whenever you use a Mutex, you must be careful not to create a deadlock. Deadlock occurs when two or more threads are each waiting to acquire a resource that is currently locked by another of them. Because every thread is waiting, none of them can release the locks it already holds, and the program grinds to a halt.

If this is confusing, imagine you and a friend are going through the fridge and both of you want to have a bagel. There is only one left, which you grab, but there is also only one stick of cream cheese, which your friend grabs. You want the cream cheese, and your friend wants the bagel. You have both “locked” one resource, and each of you is waiting for the other one to finish making a delicious bagel. Nothing happens, since you both really want a delicious bagel. This is deadlock. Now, instead of you and delicious bagels, think of threads and variables. Each thread holds a lock that the other one wants, and neither is willing to give its own up.

A simple way to avoid deadlock is to lock Mutexes in the same order in every thread. If in Thread A, you lock Mutex m before Mutex n, in Thread B, you should do the same.
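
To tie the ordering rule back to the bagel story, here is a sketch with two hypothetical Mutexes. Both threads lock the bagel first and the cream cheese second, so neither can end up holding one lock while waiting on the other.

```ruby
require 'thread'   # only needed on Ruby 1.8

bagel = Mutex.new
cream_cheese = Mutex.new
breakfasts = []

# Both threads take the locks in the same order:
# bagel first, then cream cheese.
eaters = ["you", "friend"].map do |name|
  Thread.new do
    bagel.synchronize do
      sleep 0.01   # widen the window in which a deadlock could occur
      cream_cheese.synchronize do
        breakfasts << name
      end
    end
  end
end
eaters.each { |t| t.join }

puts breakfasts.sort.inspect
```

If one of the threads locked cream_cheese first instead, each could grab its first lock and then wait forever for the second.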

That’s about it for the tutorial. I hope it was informative, and maybe you’ll walk away with some new knowledge!

Replies To: Threading

#2 Guest_derek*


Posted 26 June 2010 - 06:40 PM

Plagiarist.

http://ruby-doc.org/...ut_threads.html