Concurrent updates to shared data are a common problem to solve when using multiple threads. The solution is to let the system know that one thread is updating a shared structure and to give that thread 'exclusive' access to the structure while the update is being performed. Without exclusive access, two or more threads may modify the same structure at the same time, which can corrupt it or silently lose updates.
The C# lock statement provides a simple way to mark a statement block as a critical section by obtaining a mutual-exclusion lock for a given object. In our case, that object is the one that multiple threads attempt to update.
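As a minimal sketch of the lock statement on its own, the following program has several threads add to one shared list, with every Add wrapped in a lock on the list. The class name LockSketch and the method FillFromThreads are made up for this illustration and are not part of the tutorial's program.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class LockSketch
{
    // Hypothetical shared structure for this illustration only.
    private static readonly List<int> _shared = new List<int>();

    // Starts threadCount threads that each add itemsPerThread items,
    // waits for them to finish, and returns the final item count.
    public static int FillFromThreads(int threadCount, int itemsPerThread)
    {
        lock (_shared) { _shared.Clear(); }

        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() =>
            {
                for (int i = 0; i < itemsPerThread; i++)
                {
                    // Critical section: only one thread at a time runs this block.
                    lock (_shared)
                    {
                        _shared.Add(i);
                    }
                }
            });
            threads[t].Start();
        }

        foreach (Thread thread in threads)
        {
            thread.Join();
        }

        lock (_shared) { return _shared.Count; }
    }

    static void Main()
    {
        // 4 threads x 10000 items each
        Console.WriteLine(FillFromThreads(4, 10000)); // prints 40000
    }
}
```

Because every Add happens inside the lock, no item is lost and the count always matches the number of items added.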
The file system provides a classic example for using multiple threads. Start at the root folder of the 'C' volume and queue a thread to enumerate the files in that folder. Then, for each subfolder, queue a thread to enumerate the files at that level, and so on until every folder on the volume has been scanned. At that point we have enumerated all the files on the volume.
It is a simple matter to write a recursive function that queues a thread-pool work item to get the files in a folder and then calls itself for each subfolder of that folder. The recursive function is called "EnumerateVolume".
Here is the code.
I need access to the ThreadPool, Directory, List<string> and Console types, hence the 'using' statements.
CODE
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using System.IO;
namespace MultipleThreadTutorialProject
{
}
The 'class' Program needs to define a couple of objects that can be accessed by all the threads that are created. The List<string> _files object is a list that will contain the complete paths of all the files on the 'C:\' volume. It is the structure that will be updated by each thread. We'll use _files itself as the lock object that provides exclusive access to the list.
The object _lock is the object used to provide the mutual-exclusion lock for access to the Console. Without this lock, lines written to the console may get scrambled. (I haven't seen this with the Console object. The .NET documentation describes the Console streams as synchronized, so a single WriteLine call comes out intact; the lock still keeps a group of consecutive writes together.) You can see scrambled lines using other logging mechanisms.
CODE
class Program
{
    //
    // this is our 'shared' structure, simply a list that will contain the paths
    // of all the files on the c drive
    //
    private static readonly List<string> _files = new List<string>();
    //
    // this is our locking mechanism, just declare a class static object. Then
    // using the lock keyword in C# marks the statement block as a critical section by obtaining
    // the mutual-exclusion lock for a given object, executing the block, and then releasing the lock.
    //
    private static readonly object _lock = new object();

    static void Main(string[] args)
    {
        //
        // since we will use the ThreadPool, we need to set the maximum based on the number of
        // processors.
        //
        int processorCount = Environment.ProcessorCount;
        int maxThreads = processorCount * 100;
        if (ThreadPool.SetMaxThreads(maxThreads, processorCount /* cannot be less than processor count */))
        {
            //
            // just checking to see if max threads was set correctly
            //
            int maxCompletionPorts;
            ThreadPool.GetMaxThreads(out maxThreads, out maxCompletionPorts);
            lock (_lock)
            {
                Console.WriteLine("Max threads '{0}', Max completion ports '{1}'", maxThreads, maxCompletionPorts);
                Console.WriteLine();
            }
            //
            // one of the easiest ways to create a lot of threads is to enumerate the folders and
            // files of a volume recursively
            //
            string folder = "C:\\";
            DateTime start = DateTime.Now;
            EnumerateVolume(folder);
            DateTime end = DateTime.Now;
            TimeSpan ts = end - start;
            lock (_lock)
            {
                Console.WriteLine("Time for EnumerateVolume() to queue up all threads '{0}' msecs", ts.TotalMilliseconds);
            }
            //
            // now we need to wait until all the threads complete. We poll the pool until every
            // worker thread is available again; this assumes nothing else is using the ThreadPool.
            //
            int availableWorkerThreads = 1;
            int availableCompletionPortThreads = 0;
            while (availableWorkerThreads < maxThreads)
            {
                ThreadPool.GetAvailableThreads(out availableWorkerThreads, out availableCompletionPortThreads);
                //
                // read the count under the same lock the worker threads use to update the list
                //
                int fileCount;
                lock (_files)
                {
                    fileCount = _files.Count;
                }
                lock (_lock)
                {
                    Console.WriteLine(" total files so far '{0}'", fileCount);
                }
                Thread.Sleep(100);
            }
            end = DateTime.Now;
            ts = end - start;
            lock (_lock)
            {
                Console.WriteLine("Time for all threads to complete '{0}' msecs", ts.TotalMilliseconds);
                Console.WriteLine("\nPress any key to exit");
                Console.Read();
            }
        }
        else
        {
            Console.WriteLine("ThreadPool.SetMaxThreads() returned false, unable to set max threads.");
        }
    }
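Polling GetAvailableThreads works for this demo, but it can be misled if other code is using the pool, or if the loop checks before any work item has started. One alternative, sketched here under the assumption that .NET Framework 4 or later is available, is to count outstanding work items with a CountdownEvent. The class CompletionSketch and the method RunAndWait are hypothetical names, and Thread.Sleep stands in for the real file enumeration.

```csharp
using System;
using System.Threading;

static class CompletionSketch
{
    // Queues workItemCount dummy work items, blocks until all have finished,
    // and returns how many actually ran.
    public static int RunAndWait(int workItemCount)
    {
        int completed = 0;

        // Start at 1 so the count cannot reach zero while items are still being queued.
        using (var pending = new CountdownEvent(1))
        {
            for (int i = 0; i < workItemCount; i++)
            {
                pending.AddCount(); // one more outstanding work item
                ThreadPool.QueueUserWorkItem(state =>
                {
                    try
                    {
                        Thread.Sleep(5); // stand-in for the Directory.GetFiles work
                        Interlocked.Increment(ref completed);
                    }
                    finally
                    {
                        pending.Signal(); // this work item is finished
                    }
                });
            }

            pending.Signal(); // queuing is done: release the initial count
            pending.Wait();   // blocks until every work item has signalled
        }

        return completed;
    }

    static void Main()
    {
        Console.WriteLine("Completed work items: {0}", RunAndWait(50)); // prints 50
    }
}
```

In the tutorial's program the equivalent would be to call AddCount just before each QueueUserWorkItem in EnumerateVolume and Signal at the end of GetFiles, but that is a variation on the author's code, not what the code here does.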
Here is the recursive function.
CODE
    private static void EnumerateVolume(string folder)
    {
        //
        // Now for the fun part. EnumerateVolume starts at the root of the c drive and recursively
        // moves through the folders queueing up work for threads to add files to the
        // static list object _files
        //
        //
        // queue the work item
        //
        ThreadPool.QueueUserWorkItem(new WaitCallback(GetFiles), folder);
        //
        // get the sub directories for this folder if there are any
        // and for each sub folder call EnumerateVolume()
        // the try/catch block is to pass over access denied exceptions
        //
        try
        {
            string[] subFolders = Directory.GetDirectories(folder);
            foreach (string subFolder in subFolders)
            {
                EnumerateVolume(subFolder);
            }
        }
        catch (Exception e)
        {
            lock (_lock)
            {
                Console.WriteLine(" ***Exception '{0}'", e.Message);
            }
        }
    }
Now all we need is the function "GetFiles()", which does its work in each worker thread.
The Thread.Sleep(0) at the top immediately gives up the rest of the thread's time slice so other threads can execute.
Also note that the 'lock' statement block contains only the statement:
_files.Add(file);
You want the locked section to be as short as possible, so we hold the lock on the _files object only while we are changing it.
CODE
    static void GetFiles(object stateInfo)
    {
        //
        // immediately give up our timeslice to give another thread the attention of the cpu.
        //
        Thread.Sleep(0);
        //
        // check on the number of available worker threads
        //
        int availableWorkerThreads, availableCompletionPortThreads;
        ThreadPool.GetAvailableThreads(out availableWorkerThreads, out availableCompletionPortThreads);
        //
        // we need to lock around our output to the console so our lines won't get scrambled
        // try removing this lock and see what happens
        //
        lock (_lock)
        {
            Console.WriteLine(" Available threads '{0}'", availableWorkerThreads);
        }
        string folder = stateInfo as string;
        //
        // get the files in this folder if there are any
        // and add them to the _files object
        //
        try
        {
            string[] files = Directory.GetFiles(folder);
            if (files.Length > 0)
            {
                foreach (string file in files)
                {
                    //
                    // lock around the code to add the file to the _files object
                    // we want to minimize the time we hold the lock, so we lock only around the add
                    //
                    lock (_files)
                    {
                        _files.Add(file);
                    }
                }
            }
        }
        catch (Exception e)
        {
            lock (_lock)
            {
                Console.WriteLine(" ***Exception '{0}'", e.Message);
            }
        }
        //
        // take up some time in the thread to slow things down a bit.
        //
        Thread.Sleep(20);
    }
} // end of class Program
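GetFiles takes the lock once per file, which keeps each critical section tiny. A possible variation, shown only as a sketch, is to take the lock once per folder and add the whole array with List<T>.AddRange. The class BatchAddSketch, the helper AddBatch and the made-up paths are assumptions for illustration; in the real program the array would come from Directory.GetFiles.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class BatchAddSketch
{
    private static readonly List<string> _files = new List<string>();

    // Hypothetical helper: adds one folder's worth of paths under a single lock.
    public static void AddBatch(string[] files)
    {
        if (files.Length == 0)
        {
            return; // nothing to add, so do not take the lock at all
        }
        lock (_files)
        {
            _files.AddRange(files);
        }
    }

    // Reads the count under the same lock the writers use.
    public static int Count
    {
        get { lock (_files) { return _files.Count; } }
    }

    static void Main()
    {
        var threads = new List<Thread>();
        for (int t = 0; t < 8; t++)
        {
            int id = t; // copy the loop variable so each thread captures its own value
            var thread = new Thread(() =>
            {
                // Stand-in for Directory.GetFiles(folder): 100 made-up paths.
                var batch = new string[100];
                for (int i = 0; i < batch.Length; i++)
                {
                    batch[i] = "folder" + id + "\\file" + i;
                }
                AddBatch(batch);
            });
            threads.Add(thread);
            thread.Start();
        }
        foreach (Thread thread in threads)
        {
            thread.Join();
        }
        Console.WriteLine(Count); // prints 800
    }
}
```

The trade-off is fewer lock acquisitions against a slightly longer hold time per acquisition. Which one performs better depends on how many files each folder holds, so it is worth measuring rather than assuming.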
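Try removing the lock(_files) statement. In my tests, the _files list wasn't visibly corrupted, but the total number of files was typically less than the number found with the lock statement in place. List<T>.Add is not thread-safe, so when two threads add at the same time one entry can overwrite the other and an update is lost.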
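The same lost-update effect can be reproduced in a smaller, safer form with a plain counter. This is a sketch, not the tutorial's code: it uses an int instead of the list, because racing on an unsynchronized List<T> can also throw exceptions. The names LostUpdateSketch and Run are hypothetical.

```csharp
using System;
using System.Threading;

static class LostUpdateSketch
{
    private static readonly object _gate = new object();
    private static int _unsafeCount; // incremented without a lock
    private static int _safeCount;   // incremented inside a lock

    public static void Run(int threadCount, int incrementsPerThread,
                           out int unsafeTotal, out int safeTotal)
    {
        _unsafeCount = 0;
        _safeCount = 0;

        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() =>
            {
                for (int i = 0; i < incrementsPerThread; i++)
                {
                    // Read-modify-write with no lock: two threads can read the
                    // same old value, and one of the increments is then lost.
                    _unsafeCount++;

                    // The same operation inside a critical section.
                    lock (_gate)
                    {
                        _safeCount++;
                    }
                }
            });
            threads[t].Start();
        }

        foreach (Thread thread in threads)
        {
            thread.Join();
        }

        unsafeTotal = _unsafeCount;
        safeTotal = _safeCount;
    }

    static void Main()
    {
        int unsafeTotal, safeTotal;
        Run(8, 100000, out unsafeTotal, out safeTotal);
        Console.WriteLine("expected {0}, locked {1}, unlocked {2}",
            8 * 100000, safeTotal, unsafeTotal);
    }
}
```

The locked total always equals the expected value. The unlocked total can never exceed it and on a multi-core machine will often come out lower, although how much lower varies from run to run.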