5 Replies - 4985 Views - Last Post: 27 December 2010 - 09:08 PM

#1 d_rop4nme

  • D.I.C Head

Reputation: 17
  • Posts: 149
  • Joined: 18-April 10

Multithreading for loop maintain order

Posted 27 December 2010 - 03:48 PM

I started experimenting with multithreading for a CPU-intensive batch process I'm running: I'm trying to condense multiple single-page TIFFs into single PDF documents. This works fine with a foreach loop or a standard for loop, but it can be very slow for several 100-page documents. Based on some examples I found, I tried the following multithreaded version. It gives a significant performance improvement, but it obliterates the page order: instead of 1, 2, 3, 4 the pages come out as 1, 3, 4, 2, 6, 5, depending on which thread completes first. My question is: how can I use this technique while maintaining the page order, and if I can, will that negate the performance benefit of the multithreading? Thank you in advance.

            // Assumes PDFsharp (PdfSharp.Pdf, PdfSharp.Drawing) plus
            // System.Collections.Concurrent and System.Threading.Tasks.
            PdfDocument doc = new PdfDocument();
            string mail = textBox1.Text;
            // One TIFF path per line of the text box.
            string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);

            int counter = split.Length;

            // Partition the index range [0, counter) into chunks.
            var rangePartitioner = Partitioner.Create(0, counter);
            // Loop over the partitions in parallel.
            Parallel.ForEach(rangePartitioner, (range, loopState) =>
            {
                // Loop over each range element without a delegate invocation.
                for (int i = range.Item1; i < range.Item2; i++)
                {
                    // Local variable, so threads cannot overwrite each other's path.
                    string f_prime = split[i].Replace(" ", "");
                    // Pages are appended in whatever order the threads reach this line,
                    // which is what scrambles the page order.
                    PdfPage page = doc.AddPage();
                    XGraphics gfx = XGraphics.FromPdfPage(page);
                    XImage image = XImage.FromFile(f_prime);
                    gfx.DrawImage(image, 0, 0);
                }
            });


This post has been edited by d_rop4nme: 27 December 2010 - 03:48 PM


Replies To: Multithreading for loop maintain order

#2 Curtis Rutland

  • (╯□)╯︵ (~ .o.)~

Reputation: 5106
  • Posts: 9,283
  • Joined: 08-June 10

Posted 27 December 2010 - 04:22 PM

I don't know exactly how I would do this, but I would probably add each result to a dictionary or a list indexed by its page number. Then, once all tasks are done, compose them together in order.
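
A minimal sketch of that suggestion, with no PDF library involved: each parallel iteration stores its result in the array slot for its page number, and the pages are composed in order only after all the work is done. `RenderPage` and the page count are hypothetical stand-ins for the real conversion, and the code is written as a C# 9+ top-level program.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the expensive per-page work (e.g. decoding a TIFF).
string RenderPage(int pageNumber)
{
    return "page-" + pageNumber;
}

int pageCount = 8;
string[] rendered = new string[pageCount];

// Each iteration writes only to its own slot, so no locking is needed,
// and the order in which iterations finish does not matter.
Parallel.For(0, pageCount, i =>
{
    rendered[i] = RenderPage(i + 1);
});

// Sequential, in-order composition after all parallel work has finished.
string document = string.Join(",", rendered);
Console.WriteLine(document); // prints page-1,page-2,page-3,page-4,page-5,page-6,page-7,page-8
```

The trade-off is that every result stays in memory until the final composition step, which is the cost raised in the next reply.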

#3 d_rop4nme

Posted 27 December 2010 - 06:49 PM

I was thinking of something like that, or of putting all the images into byte arrays and then assembling them at the end. I haven't really been able to conceptualize it yet. Again, I'm worried that if I do that and hold 1,200 TIFFs of 70 KB or more each in memory, it's going to hurt performance so badly that it negates the gains from the multithreading.

#4 Curtis Rutland

Posted 27 December 2010 - 07:25 PM

I agree, but I'm not sure I see any other way around it. Perhaps you can put each page in a List as it finishes, and trigger another method on a different thread. That method takes a lock on the list so it can't be modified while the list is processed. Keep an int with the current page number; if any item in the list matches it, add that item to the PDF, increment the index, and release the lock. You might have some left over in the list at the end, and you can process the rest in order.

Maybe what would work better is a polling loop on another thread (with a Thread.Sleep to avoid a busy loop) that continuously checks the dictionary for the page that matches the current index. If it's found, write it and increment the index. Keep this thread running until the entire set is processed.
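
The polling idea could look roughly like the sketch below. It is an illustration under assumptions, not a tested PDF pipeline: the strings stand in for converted pages, the `written` list stands in for the PDF, and names such as `finished` and `writer` are made up for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

int pageCount = 8;
var finished = new ConcurrentDictionary<int, string>();
var written = new List<string>();   // touched only by the writer task

// Writer: waits for the page matching the current index, writes it, moves on.
var writer = Task.Run(() =>
{
    int next = 0;
    while (next < pageCount)
    {
        string page;
        if (finished.TryRemove(next, out page))
        {
            written.Add(page);   // stand-in for appending the page to the PDF
            next++;
        }
        else
        {
            Thread.Sleep(1);     // avoid a busy loop while the page is not ready
        }
    }
});

// Producers: convert pages in parallel, finishing in arbitrary order.
Parallel.For(0, pageCount, i =>
{
    finished[i] = "page-" + i;   // hypothetical conversion result
});

writer.Wait();
Console.WriteLine(string.Join(",", written)); // prints page-0,page-1,page-2,page-3,page-4,page-5,page-6,page-7
```

Because the writer removes each page as soon as it is written, only pages that finished ahead of their turn are held in memory.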

Or perhaps you can create one large image filled with dummy data, then overwrite just the portion each image belongs to after each is converted. Again, you'll have to lock the image or a locker object each time you do it, since it's a critical section, but perhaps that will work. Edit: I guess this won't work with the way you are creating the PDF.

Part of the problem is that multithreaded processing is designed for things that don't have to happen in any particular order. If you need order, then you may not be able to do this with multiple threads.
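
For work where only the order of the results matters, not the order of execution, one option not raised in this thread is PLINQ's `AsOrdered`, which runs the projection in parallel but returns results in source order. The sketch below uses made-up file names and an upper-casing step as a stand-in for the expensive conversion.

```csharp
using System;
using System.Linq;

string[] paths = { "a.tif", "b.tif", "c.tif", "d.tif" };   // hypothetical input

string[] converted = paths
    .AsParallel()
    .AsOrdered()                          // keep results in source order
    .Select(p => p.ToUpperInvariant())    // stand-in for the expensive work
    .ToArray();

Console.WriteLine(string.Join(",", converted)); // prints A.TIF,B.TIF,C.TIF,D.TIF
```

This does not help when the side effect itself must happen in order, such as adding pages to a shared document inside the loop.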

This post has been edited by insertAlias: 27 December 2010 - 07:28 PM


#5 d_rop4nme

Posted 27 December 2010 - 08:52 PM

Here are two nice solutions that significantly increased the speed of the operation while maintaining page order. I got some help on these, but hopefully this will help someone. I ran through 1,200 pages in a couple of seconds with this on a server with dual hyper-threaded Xeons, and the work spread out nicely over all the schedulers.

// Assumes PDFsharp (PdfSharp.Pdf, PdfSharp.Drawing) plus
// System.Collections.Concurrent and System.Threading.Tasks.
PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
// One TIFF path per line of the text box.
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);

int counter = split.Length;

// Partition the index range [0, counter) into chunks.
var rangePartitioner = Partitioner.Create(0, counter);

// Create every page up front, sequentially, so page i is always at position i.
PdfPage[] pages = new PdfPage[counter];
for (int i = 0; i < counter; ++i)
{
    pages[i] = doc.AddPage();
}

// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // Loop over each range element without a delegate invocation.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        // Local variable, so threads cannot overwrite each other's path.
        string f_prime = split[i].Replace(" ", "");
        // Draw onto the page that was reserved for index i.
        PdfPage page = pages[i];
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(f_prime);
        gfx.DrawImage(image, 0, 0);
    }
});




or

PdfDocument doc = new PdfDocument();
string mail = textBox1.Text;
// One TIFF path per line of the text box.
string[] split = mail.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);

int counter = split.Length;

// Partition the index range [0, counter) into chunks.
var rangePartitioner = Partitioner.Create(0, counter);

// Guards AddPage so each page's index is read before another thread adds a page.
object pageLock = new object();

// Loop over the partitions in parallel.
Parallel.ForEach(rangePartitioner, (range, loopState) =>
{
    // i only counts iterations here; it is not used as the page index.
    for (int i = range.Item1; i < range.Item2; i++)
    {
        PdfPage page;
        int pageIndex;
        lock (pageLock)
        {
            page = doc.AddPage();
            // Assumption: PdfDocument.PageCount (PDFsharp) now includes this page.
            pageIndex = doc.PageCount - 1;
        }
        // Pick the image by the page's position, so page k always shows image k.
        string f_prime = split[pageIndex].Replace(" ", "");
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage image = XImage.FromFile(f_prime);
        gfx.DrawImage(image, 0, 0);
    }
});
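
The second listing depends on a page's position in the document rather than on the loop variable. A library-free sketch of that idea, assuming the append and the position lookup are done under one lock, might look like this. The `slots` list is a hypothetical stand-in for the document's page list, and the strings stand in for image files.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

string[] inputs = { "p0", "p1", "p2", "p3", "p4", "p5" };  // hypothetical page sources
var slots = new List<string[]>();   // stand-in for the document's page list
object gate = new object();

Parallel.For(0, inputs.Length, i =>
{
    string[] slot = new string[1];
    int index;
    lock (gate)
    {
        // Appending and reading the position happen as one atomic step.
        slots.Add(slot);
        index = slots.Count - 1;
    }
    // i is ignored; the slot's position decides which input it receives.
    slot[0] = inputs[index];
});

var contents = new List<string>();
foreach (string[] s in slots)
{
    contents.Add(s[0]);
}
Console.WriteLine(string.Join(",", contents)); // prints p0,p1,p2,p3,p4,p5
```

Only the short append step is serialized; the expensive work after the lock still runs in parallel.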




Thanks for the ideas, insertAlias.

This post has been edited by d_rop4nme: 27 December 2010 - 08:53 PM


#6 Curtis Rutland

Posted 27 December 2010 - 09:08 PM

Glad that you got it done. Good question.
