Optimization challenge: Count Characters

  • (2 Pages)
  • +
  • 1
  • 2

20 Replies - 9976 Views - Last Post: 01 October 2012 - 03:26 PM

#1 janne_panne  Icon User is offline

  • WinRT Dev
  • member icon

Reputation: 429
  • View blog
  • Posts: 1,047
  • Joined: 09-June 09

Optimization challenge: Count Characters

Posted 13 January 2012 - 01:40 PM

Optimization challenge: Count Characters

Introduction
Optimization is an important part of software development, especially if the software is a game or some kind of scientific software which performs thousand or millions of calculations each second. In such application some method which is called all the time (path finding in a game or some formula in calculations) has to be fast. It might take only 100 ticks when called once so the optimization wouldn't be noticed but if it's called all the time, just reducing the time taken by 20 ticks might make your whole software few percents faster.

The Challenge
In this challenge you have to optimize a very basic method which is called ten thousand times. The method counts how many times one character appears in a string.

The only method you have to edit to give right results is CountCharactersOptimized(string str, char c). You are NOT allowed to change the signature of this method in any way, for example to (string str, string c). But you can edit the inside of the method any way you want. Create a loop, use string class methods, whatever it is that you desire.

Here is the code you can copy-paste into your Console application:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Optimization_Challenge_1
{
    class Program
    {
        private static Random random = new Random();

        static void Main(string[] args)
        {
            string str = GetRandomString(10000);
            char c = GetRandomChar();
            System.Diagnostics.Stopwatch st = new System.Diagnostics.Stopwatch();

            st.Start();
            for (int i = 0; i < 10000; i++) {
                CountCharactersSlow(str, c);
            }
            st.Stop();

            Console.WriteLine("CountSlow: " + CountCharactersSlow(str, c));
            long elapsedSlow = st.ElapsedTicks;
            Console.WriteLine("TimeSlow: " + elapsedSlow);

            st.Restart();
            for (int i = 0; i < 10000; i++) {
                CountCharactersOptimized(str, c);
            }
            st.Stop();

            Console.WriteLine("CountOptimized: " + CountCharactersOptimized(str, c));
            long elapsedOptimized = st.ElapsedTicks;
            Console.WriteLine("TimeOptimized: " + elapsedOptimized);

            decimal difference = ((decimal)elapsedOptimized / (decimal)elapsedSlow);
            if (difference > 1)
                Console.WriteLine("CountCharactersOptimized is slower by " + (difference - 1).ToString("P"));
            else
                Console.WriteLine("CountCharactersOptimized is faster by " + (1 - difference).ToString("P"));
            
            Console.ReadKey();
        }

        static string GetRandomString(int length)
        {
            string str = "";
            for (int i = 0; i < length; i++)
                str += (char)random.Next((int)'a', (int)'f');
            return str;
        }

        static char GetRandomChar()
        {
            return (char)random.Next((int)'a', (int)'f');
        }

        static int CountCharactersSlow(string str, char c)
        {
            return str.Count(c1 => c1 == c);
        }

        static int CountCharactersOptimized(string str, char c)
        {
            //return str.Count(c1 => c1 == c);
        }
    }
}



Example run:
CountSlow: 2002
TimeSlow: 9091487
CountOptimized: 2002
TimeOptimized: 3893097
CountCharactersOptimized is faster by 57,18 %



As you can see in the Main method, the results will be in percentages. Milliseconds/Ticks wouldn't tell us enough because someone might have faster computer.

For testing you may reduce the number of calls or the length of the string because the Slow method takes about 9 seconds on my computer (which is slow like the method) when ran in debug mode.

Please don't use solution where you cache the answer and then just return the same answer :P This might not be the case in real world application that the method is called with the same string every time.

Rewards

The one who is able to reduce the time taken the most will be rewarded with the awesome "Optimization guru" title which you can include in your profile signature as text. However, no one will be telling you that you had the fastest time, you have to check it yourself in this thread.

Final words

Good luck and have fun optimizing. Creating 50% faster method might be easy. But don't stop there, try to be 80% or 90% faster!

Remember to use Spoiler and Code tags when posting code. But you can post the results outside of the code tags.

Feel free to post your unsuccessful solutions too.

Is This A Good Question/Topic? 2
  • +

Replies To: Optimization challenge: Count Characters

#2 lordofduct  Icon User is offline

  • I'm a cheeseburger
  • member icon


Reputation: 2533
  • View blog
  • Posts: 4,633
  • Joined: 24-September 10

Re: Optimization challenge: Count Characters

Posted 13 January 2012 - 05:27 PM

Simple approach, averaged about 99% efficiency gains.

Spoiler


Note I ran these on my machine, not sure if that played a role in the efficiency percentage yield. We are doing percentage yield, and I don't explicitly use any multi-threading so it technically shouldn't matter...

Those specs are:

Intel i7 quad-core @2.93ghz w/ hyper-threading (8 threads total)
12 GB DDR3

This post has been edited by lordofduct: 13 January 2012 - 05:57 PM

Was This Post Helpful? 2
  • +
  • -

#3 AdamSpeight2008  Icon User is offline

  • MrCupOfT
  • member icon


Reputation: 2257
  • View blog
  • Posts: 9,445
  • Joined: 29-May 08

Re: Optimization challenge: Count Characters

Posted 13 January 2012 - 05:44 PM

The obvious one (since you don't specify a framework) is .AsParallal.
.net 4.5 ~7 - 15% faster. Try on more cores.
Spoiler

Was This Post Helpful? 0
  • +
  • -

#4 AdamSpeight2008  Icon User is offline

  • MrCupOfT
  • member icon


Reputation: 2257
  • View blog
  • Posts: 9,445
  • Joined: 29-May 08

Re: Optimization challenge: Count Characters

Posted 13 January 2012 - 06:28 PM

Why does lordofduct's count a different number of matches? Counting 10 times less than the slow method.

This post has been edited by AdamSpeight2008: 13 January 2012 - 06:30 PM

Was This Post Helpful? 0
  • +
  • -

#5 lordofduct  Icon User is offline

  • I'm a cheeseburger
  • member icon


Reputation: 2533
  • View blog
  • Posts: 4,633
  • Joined: 24-September 10

Re: Optimization challenge: Count Characters

Posted 13 January 2012 - 07:23 PM

There may be a bug in it...

[edit]
and that bug explains a bit why it was so easy for me to get 99%. I thought it seemed fishy. Didn't think much of the 'count' being lower, I for some reason thought it was the 'tick count'.

70% efficiency:

Spoiler

This post has been edited by lordofduct: 13 January 2012 - 07:30 PM

Was This Post Helpful? 0
  • +
  • -

#6 AdamSpeight2008  Icon User is offline

  • MrCupOfT
  • member icon


Reputation: 2257
  • View blog
  • Posts: 9,445
  • Joined: 29-May 08

Re: Optimization challenge: Count Characters

Posted 13 January 2012 - 09:44 PM

Parallel Variant of lordofduct's ~64% faster (on my 2-Core Machine)
Spoiler

Was This Post Helpful? 0
  • +
  • -

#7 wiero  Icon User is offline

  • D.I.C Head

Reputation: 48
  • View blog
  • Posts: 78
  • Joined: 29-June 11

Re: Optimization challenge: Count Characters

Posted 16 January 2012 - 04:28 AM

~ 73% , however lordofduct's solution has ~ 75,5% on my machine
Spoiler

Was This Post Helpful? 0
  • +
  • -

#8 janne_panne  Icon User is offline

  • WinRT Dev
  • member icon

Reputation: 429
  • View blog
  • Posts: 1,047
  • Joined: 09-June 09

Re: Optimization challenge: Count Characters

Posted 16 January 2012 - 05:02 AM

My solution, about 80% faster:

Spoiler

This post has been edited by janne_panne: 16 January 2012 - 05:03 AM

Was This Post Helpful? 0
  • +
  • -

#9 AdamSpeight2008  Icon User is offline

  • MrCupOfT
  • member icon


Reputation: 2257
  • View blog
  • Posts: 9,445
  • Joined: 29-May 08

Re: Optimization challenge: Count Characters

Posted 16 January 2012 - 05:43 AM

View Postjanne_panne, on 16 January 2012 - 01:02 PM, said:

Spoiler

1. Doesn't count all of the occurrences (only count number of c repetitions at start of string.
2. Potential infinite loop (as i doesn't increment)
3. You need to check at least N chars, when N is number of char in string.
Was This Post Helpful? 0
  • +
  • -

#10 wiero  Icon User is offline

  • D.I.C Head

Reputation: 48
  • View blog
  • Posts: 78
  • Joined: 29-June 11

Re: Optimization challenge: Count Characters

Posted 16 January 2012 - 05:44 AM

janne_panne
i checked your solution and i must say that, i achieved 80% only in visual studio debug mode , in release mode it's about 68% and almost same results display when running from console. Any ideas what may be the reason?
Was This Post Helpful? 0
  • +
  • -

#11 janne_panne  Icon User is offline

  • WinRT Dev
  • member icon

Reputation: 429
  • View blog
  • Posts: 1,047
  • Joined: 09-June 09

Re: Optimization challenge: Count Characters

Posted 16 January 2012 - 06:25 AM

View PostAdamSpeight2008, on 16 January 2012 - 02:43 PM, said:

1. Doesn't count all of the occurrences (only count number of c repetitions at start of string.
2. Potential infinite loop (as i doesn't increment)
3. You need to check at least N chars, when N is number of char in string.


I think you misread my code snippet or your brain just malfunctioned :P

It does count all occurrences. Variable i is getting assigned inside the while loop's if statement section: while ((i = str.IndexOf(c, i + 1)) > -1) which takes the loop forward. For example:
If we are looking for 'a' in string "abcabc":

First iteration:
i = -1 -> i + 1 = 0
str.IndexOf('a', 0) returns 0 because 'a' is in position 0 of the string

Second iteration:
i = 0 -> i + 1 = 1
str.IndexOf('a', 1) returns 3 because it starts looking for 'a' at position 1 ('b') and finds second 'a' at position 3

Third iteration:
i = 3 -> i + 1 = 4
str.IndexOf('a', 4) can't find more 'a's in the string and returns -1 and while loop breaks.

wiero said:

i checked your solution and i must say that, i achieved 80% only in visual studio debug mode , in release mode it's about 68% and almost same results display when running from console. Any ideas what may be the reason?


Good find. It appears that pretty much all solutions posted here yield about 10-20 percentage units slower results in release. I don't know the reason since it would be logical that release builds are faster than debug builds, and usually that is the case. Someone who knows better could shed some light on this matter.
Was This Post Helpful? 0
  • +
  • -

#12 AdamSpeight2008  Icon User is offline

  • MrCupOfT
  • member icon


Reputation: 2257
  • View blog
  • Posts: 9,445
  • Joined: 29-May 08

Re: Optimization challenge: Count Characters

Posted 16 January 2012 - 06:39 AM

Assignment mid-expression is an abomination!
Was This Post Helpful? 0
  • +
  • -

#13 Momerath  Icon User is offline

  • D.I.C Lover
  • member icon

Reputation: 1010
  • View blog
  • Posts: 2,444
  • Joined: 04-October 09

Re: Optimization challenge: Count Characters

Posted 23 January 2012 - 10:59 PM

My parallel version gets 84.72% speed increase (outside VS, inside I get 69.42% increase)
Spoiler


Things to note: partitions must divide evenly into the size of the string. The number of partitions used here was optimized by testing all even divisions of 10,000. Your results will vary depending on many factors, mainly the number of processor cores but also what other activities your system is currently processing.

Results aren't consistent (hard to control what tasks might interrupt), the result posted was the average of 1000 runs.

This post has been edited by Momerath: 23 January 2012 - 11:00 PM

Was This Post Helpful? 1
  • +
  • -

#14 Nakor  Icon User is offline

  • Professional Lurker
  • member icon

Reputation: 444
  • View blog
  • Posts: 1,492
  • Joined: 28-April 09

Re: Optimization challenge: Count Characters

Posted 03 March 2012 - 11:27 PM

A little late to the game, but it looked interesting so here's my go at it. It seems to be about a 70-75% percent speed increase.

Spoiler

This post has been edited by Nakor: 03 March 2012 - 11:29 PM

Was This Post Helpful? 0
  • +
  • -

#15 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3552
  • View blog
  • Posts: 11,010
  • Joined: 05-May 12

Re: Optimization challenge: Count Characters

Posted 31 May 2012 - 03:28 AM

Late to the game, but getting about 90%. There was no restriction that the unsafe code was not allowed so:
Spoiler

This post has been edited by Skydiver: 31 May 2012 - 03:28 AM

Was This Post Helpful? 0
  • +
  • -

  • (2 Pages)
  • +
  • 1
  • 2