How can I make this faster? (Small Geometric functions).

  • (3 Pages)
  • +
  • 1
  • 2
  • 3

42 Replies - 2311 Views - Last Post: 12 September 2020 - 09:17 PM Rate Topic: -----

#1 random Goose   User is offline

  • D.I.C Head

Reputation: 1
  • View blog
  • Posts: 50
  • Joined: 07-August 20

How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 08:09 AM

Some time ago I wrote small script in Python 3 for calculations points X,Y of circle or part of circle(circle curve). Was surprised with speed (it was ~12 seconds for 10 000 circles with 360 x-y points each!). I thought - I can do a bit faster than this! Re-write it in C++, I got ~350-400ms first time! After that I get a bit obsessed enthusiastic with that. So now(few days of experiments after work) it's ~60-32ms! I want less! I feel, it cant be done, but i don't have any ideas any more, how to do that.
P.S:(I'm not a programmer(as maybe obvious), but drop-out psychology student (please don't laugh), just have a big interest in Computers and Programming). *Pardon my English please, I'm from cold Europe. :bigsmile:

#include <cmath>
#include <chrono>
#include <vector>
#include <iostream>
#include <string>

using namespace std;

typedef unsigned long int uLint;
const float PI = M_PI;

struct fpa{
	float* vX;
	float* vY;
	uLint sz = 0;
};

/* // Unfortunately I didn't save actual first version, (did Ctr+S all the time:( )
 * // but(except few calls to convert angle and do something else) it was dumber and (even) more naive than something like this:
vf compass_VecPair(const float radius, float angle){
	angle = fmodf(angle, 360.0f);	//check angle.
	vf result;
	result.push_back( radius * (sinf( (angle*PI)/180.0f )) );						// A side length. Changed from: sin( degs_toRads(angle) )
	result.push_back( sqrtf( (radius*radius) - (result[0] * result[0]) ) );			// B side length.
	result[1] = (angle > 90.0f && angle < 270.0f) ? -result[1] : result[1];
	return result;
}
*/
// Current version:
// remember DON'T PASS END_ANGLE > 360.0.
fpa compass(const float radius, const float start_Angle, const float end_Angle, const float angle_Step){
	const uLint n_sz = uLint((end_Angle - start_Angle) / angle_Step);
	float* tmp_Ar = new float[n_sz];
	float tmp_compArr[n_sz];
	tmp_Ar[0] = start_Angle;
	tmp_compArr[0] = start_Angle;
	uLint i = 1;
	for(; i < n_sz; ++i){
		tmp_compArr[i] = angle_Step*i;
		tmp_Ar[i] = sinf(( tmp_compArr[i] * PI) / 180.0f);
	}
	for(i=0; i < n_sz; ++i){
		tmp_Ar[i] *= radius;
	}
	const float r = radius*radius;
	float* tmpY = new float[n_sz];
	for(i=0; i < n_sz; ++i){
		tmpY[i] = sqrtf( r - (tmp_Ar[i] * tmp_Ar[i]) );
		tmpY[i] = (tmp_compArr[i] > 90.0f && tmp_compArr[i] < 270.0f) ? -tmpY[i] : tmpY[i];
	}
	fpa result;	result.sz = n_sz;
	result.vX = tmp_Ar;
	result.vY = tmpY;
	return result;
}

const string stars_str = "\n***************************************\n";
const string time_MS_str = "Time spent (ms) -----> \t";
const string time_SEC_str = "\nTime spent (sec)-----> \t";
const unsigned int RUNS = 10;
const unsigned int CYCLES = 10001;
float sum = 0.0f;

vector<fpa> v;

int test_Compass(){
	using std::chrono::system_clock;
	auto t1 = chrono::steady_clock::now();
	for(unsigned int i=1; i < CYCLES; ++i){
		v.push_back( compass(float(i), 0.0f, 360.0f, 1.0f) );
	};
	auto t2 = chrono::steady_clock::now();
	cout << time_SEC_str << chrono::duration_cast<chrono::seconds>(t2 - t1).count() << endl <<
			time_MS_str << chrono::duration_cast<chrono::milliseconds>(t2 - t1).count();
	return int(chrono::duration_cast<chrono::milliseconds>(t2 - t1).count());
}

int main(){
	for(unsigned int i=0; i < RUNS; ++i){
		sum += test_Compass();
	}
	cout<< "\nRuns: "<< RUNS << "; Cycles per run: " << CYCLES-1 <<
		endl << "\tAverage time (ms): " << sum/RUNS << endl;
	return 0;
}


Quote

want less! I feel, it cant be done


I meant it CAN be done! *misspell. :online2long:

Is This A Good Question/Topic? 0
  • +

Replies To: How can I make this faster? (Small Geometric functions).

#2 modi123_1   User is online

  • Suitor #2
  • member icon



Reputation: 15894
  • View blog
  • Posts: 63,609
  • Joined: 12-June 08

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 08:17 AM

I am getting a host of errors when trying to compile.

identifier "M_PI" is undefined	cplusplus	10
expression must have a constant value	cplusplus	34
'M_PI': undeclared identifier	cplusplus	10
expression did not evaluate to a constant	cplusplus	34
failure was caused by a read of a variable outside its lifetime	cplusplus	32
see usage of 'end_Angle'	cplusplus	32
C3863	array type 'float [n_sz]' is not assignable	cplusplus	36
C3863	array type 'float [n_sz]' is not assignable	cplusplus	39

Was This Post Helpful? 0
  • +
  • -

#3 random Goose   User is offline

  • D.I.C Head

Reputation: 1
  • View blog
  • Posts: 50
  • Joined: 07-August 20

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 08:29 AM

Don't get it, maybe:
const float PI = 3.14159265358979f;


That's weird.
It compiles on mine machine.
g++ -O3 -mtune=native -mfma -msse4.2 -mavx2 -fabi-version=0 -fno-trapping-math -fno-math-errno -ffast-math -fprefetch-loop-arrays -std=c++20 -Wall -no-pie -fno-pie -o main main.cpp
Was This Post Helpful? 0
  • +
  • -

#4 random Goose   User is offline

  • D.I.C Head

Reputation: 1
  • View blog
  • Posts: 50
  • Joined: 07-August 20

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 08:41 AM

Did I messed up somewhere!?
Was This Post Helpful? 0
  • +
  • -

#5 random Goose   User is offline

  • D.I.C Head

Reputation: 1
  • View blog
  • Posts: 50
  • Joined: 07-August 20

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 08:55 AM

Compiles both on g++ and clang 10. On my machine. I don't understand. :surrender:
clang++-10 -O3 -mtune=native -mfma -msse4.2 -mavx2 -fno-trapping-math -fno-math-errno -ffast-math -std=c++20 -Wall -o main main.cpp


$ clang++-10 --version
clang version 10.0.0-4ubuntu1~18.04.2
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ g++ --version
g++ (Ubuntu 10.1.0-2ubuntu1~18.04) 10.1.0

It compiles even online. I don't know how to help :(
Compile and Execute C++ Online (GNU GCC v7.1.1)
https://www.tutorial..._cpp_online.php
Was This Post Helpful? 0
  • +
  • -

#6 modi123_1   User is online

  • Suitor #2
  • member icon



Reputation: 15894
  • View blog
  • Posts: 63,609
  • Joined: 12-June 08

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 09:14 AM

Regardless, tweaked around the code to get it working.

I am not sure why you are using uLint when nothing in there really needs more than an int.

Also the 'const' modifier in the 'fpa compass' parameters seemed odd.

Your for loops will most likely be where you gain any traction. It's more common to define your loop control variable in the for loop instead of once outside.

Of your three for loops you can combine #2 and #3.

On second thought - make it one for loop for (int i = 0; i < n_sz; i++)

You have an 'ELSE' condition with this line that doesn't need to happen as it assigns its value back to itself, so just make it an actual 'IF' statement.
tmpY[i] = (tmp_compArr[i] > 90.0f && tmp_compArr[i] < 270.0f) ? -tmpY[i] : tmpY[i];


After all that you can look at things like storing the S and MS into a collection instead of COUT them on the spot, and print them out AFTER your test_compass is done in the loop.

Finally, after all that is done, you could even look into making it multi-threaded and parallel process the results.


After fiddling with those suggestions - excluding threading - my starting values were:
Time spent (sec)----->  0
Time spent (ms) ----->  112
Time spent (sec)----->  0
Time spent (ms) ----->  107
Time spent (sec)----->  0
Time spent (ms) ----->  108
Time spent (sec)----->  0
Time spent (ms) ----->  113
Time spent (sec)----->  0
Time spent (ms) ----->  108
Time spent (sec)----->  0
Time spent (ms) ----->  109
Time spent (sec)----->  0
Time spent (ms) ----->  108
Time spent (sec)----->  0
Time spent (ms) ----->  107
Time spent (sec)----->  0
Time spent (ms) ----->  109
Time spent (sec)----->  0
Time spent (ms) ----->  111
Runs: 10; Cycles per run: 10000
        Average time (ms): 109.2


Ending:
Time spent (sec)----->  0
Time spent (ms) ----->  94
Time spent (sec)----->  0
Time spent (ms) ----->  92
Time spent (sec)----->  0
Time spent (ms) ----->  89
Time spent (sec)----->  0
Time spent (ms) ----->  91
Time spent (sec)----->  0
Time spent (ms) ----->  90
Time spent (sec)----->  0
Time spent (ms) ----->  90
Time spent (sec)----->  0
Time spent (ms) ----->  91
Time spent (sec)----->  0
Time spent (ms) ----->  88
Time spent (sec)----->  0
Time spent (ms) ----->  89
Time spent (sec)----->  0
Time spent (ms) ----->  91
Runs: 10; Cycles per run: 10000
        Average time (ms): 90.5

Was This Post Helpful? 1
  • +
  • -

#7 modi123_1   User is online

  • Suitor #2
  • member icon



Reputation: 15894
  • View blog
  • Posts: 63,609
  • Joined: 12-June 08

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 09:53 AM

Amusingly, replacing the 'for' loop in test_Compass with a 'parallel_for' hit some results.

Time spent (sec)----->  0
Time spent (ms) ----->  30
Time spent (sec)----->  0
Time spent (ms) ----->  24
Time spent (sec)----->  0
Time spent (ms) ----->  25
Time spent (sec)----->  0
Time spent (ms) ----->  24
Time spent (sec)----->  0
Time spent (ms) ----->  26
Time spent (sec)----->  0
Time spent (ms) ----->  25
Time spent (sec)----->  0
Time spent (ms) ----->  24
Time spent (sec)----->  0
Time spent (ms) ----->  25
Time spent (sec)----->  0
Time spent (ms) ----->  26
Time spent (sec)----->  0
Time spent (ms) ----->  24
Runs: 10; Cycles per run: 10000
        Average time (ms): 25.3

Was This Post Helpful? 0
  • +
  • -

#8 jimblumberg   User is online

  • member icon

Reputation: 5880
  • View blog
  • Posts: 17,867
  • Joined: 25-December 09

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 10:31 AM

Your original code doesn't compile for me either:

Quote

||=== Build: Debug in c++homework (compiler: gcc9-2) ===|
main.cpp||In function ‘fpa compass(float, float, float, float)’:|
main.cpp|34|error: ISO C++ forbids variable length array ‘tmp_compArr’ [-Wvla]|


C++ doesn't allow Variable Length Arrays, for some reason n_sz it not being detected as a compile time constant (not all const are compile time const).

Also if you're worried about "speed" you should be compiling with optimizations, also you are continually pushing elements into your vector which can also be slow.

Quote

No optimizations:

Time spent (sec)-----> 0
Time spent (ms) -----> 177
Time spent (sec)-----> 0
Time spent (ms) -----> 171
Time spent (sec)-----> 0
Time spent (ms) -----> 170
Time spent (sec)-----> 0
Time spent (ms) -----> 171
Time spent (sec)-----> 0
Time spent (ms) -----> 170
Time spent (sec)-----> 0
Time spent (ms) -----> 170
Time spent (sec)-----> 0
Time spent (ms) -----> 173
Time spent (sec)-----> 0
Time spent (ms) -----> 170
Time spent (sec)-----> 0
Time spent (ms) -----> 170
Time spent (sec)-----> 0
Time spent (ms) -----> 170
Runs: 10; Cycles per run: 10000
Average time (ms): 171.2

-O2

Time spent (sec)-----> 0
Time spent (ms) -----> 79
Time spent (sec)-----> 0
Time spent (ms) -----> 75
Time spent (sec)-----> 0
Time spent (ms) -----> 75
Time spent (sec)-----> 0
Time spent (ms) -----> 73
Time spent (sec)-----> 0
Time spent (ms) -----> 72
Time spent (sec)-----> 0
Time spent (ms) -----> 72
Time spent (sec)-----> 0
Time spent (ms) -----> 73
Time spent (sec)-----> 0
Time spent (ms) -----> 72
Time spent (sec)-----> 0
Time spent (ms) -----> 72
Time spent (sec)-----> 0
Time spent (ms) -----> 80
Runs: 10; Cycles per run: 10000
Average time (ms): 74.3



Also where are you deleting the memory you're allocating with new?


Lastly you should really profile the code to find where the bottlenecks (if any) are happening.

Jim
Was This Post Helpful? 2
  • +
  • -

#9 random Goose   User is offline

  • D.I.C Head

Reputation: 1
  • View blog
  • Posts: 50
  • Joined: 07-August 20

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 11:40 AM

Thank you for advice! This is just awesome!
Couple of things from that, for me, was a bit counter-intuitive. But it makes sense now.
It was a good idea to ask some one (possibly) smarter then me. :bigsmile:
Current(fixed, as was suggested) version->

#include <cmath>
#include <chrono>
#include <vector>
#include <iostream>
#include <string>

using namespace std;
const float PI = 3.14159265358979f;

struct fpa{
	float* vX;
	float* vY;
	unsigned int sz = 0;
};

// Current version:
// remember DON'T PASS END_ANGLE > 360.0.
fpa compass(float radius, float start_Angle, float end_Angle, float angle_Step){
	const unsigned int n_sz = (end_Angle - start_Angle) / angle_Step;
	const float r = radius*radius;
	float* tmp_Ar = new float[n_sz];	tmp_Ar[0] = start_Angle;
	float tmp_compArr[n_sz];			tmp_compArr[0] = start_Angle;
	float* tmpY = new float[n_sz];
	for(unsigned int i=0; i < n_sz; ++i){
		tmp_compArr[i] = angle_Step*i;
		tmp_Ar[i] = sinf(( tmp_compArr[i] * PI) / 180.0f);
		tmp_Ar[i] *= radius;
		tmpY[i] = sqrtf( r - (tmp_Ar[i] * tmp_Ar[i]) );
		if((tmp_compArr[i] > 90.0f) && (tmp_compArr[i] < 270.0f)){ tmpY[i] = -tmpY[i];}
	}
	fpa result;
	result.sz = n_sz;
	result.vX = tmp_Ar;		result.vY = tmpY;
	return result;
}

const string start_str = "\t**Start of test.**\n";
const string nRuns_str = "Number of Runs: ";
const string nCycles_str = " Number of Cycles per Run: ";
const string avr_TimeMS_str = "\tAverage time (ms): ";

const unsigned int RUNS = 10;
const unsigned int CYCLES = 10001;
float sum = 0.0f;
vector<float> vMS;

int test_Compass(){
    using std::chrono::system_clock;
    
    auto t1 = chrono::steady_clock::now();
    for(unsigned int i=1; i < CYCLES; ++i){
        compass(float(i), 0.0f, 360.0f, 1.0f);
    };
    auto t2 = chrono::steady_clock::now();
    vMS.push_back( chrono::duration_cast<chrono::milliseconds>(t2 - t1).count() );
    return int(chrono::duration_cast<chrono::milliseconds>(t2 - t1).count());
}

int main(){
    cout << start_str;
    for(unsigned int i=0; i < RUNS; ++i){
        sum += test_Compass();
    }
    float minMS = vMS[0]; float maxMS = vMS[0];
    for(auto n : vMS){
        if(minMS > n){minMS=n;}
        if(maxMS < n){maxMS=n;}
    }
    cout<< "\tmin ms: "<< minMS << "; max ms: "<< maxMS<< endl;
    cout<< nRuns_str<< RUNS << nCycles_str<< CYCLES-1<< endl<<
        avr_TimeMS_str<< sum/RUNS << endl;
    return 0;
}


Quote

**Start of test.**
min ms: 18; max ms: 27
Number of Runs: 10 Number of Cycles per Run: 10001
Average time (ms): 19.1


Amazing!
Was This Post Helpful? 0
  • +
  • -

#10 random Goose   User is offline

  • D.I.C Head

Reputation: 1
  • View blog
  • Posts: 50
  • Joined: 07-August 20

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 11:48 AM

Quote

Also where are you deleting the memory you're allocating with new?

Default destructor is not good if it's
struct
?

Quote

Also if you're worried about "speed" you should be compiling with optimizations,

g++ -O3 -mtune=native -mfma -msse4.2 -mavx2 -fabi-version=0 -fno-trapping-math -fno-math-errno -ffast-math -fprefetch-loop-arrays -std=c++20 -Wall -no-pie -fno-pie -o main main.cpp

Quote

also you are continually pushing elements into your vector which can also be slow.

Thanks. I didn't thought about that.
Was This Post Helpful? 0
  • +
  • -

#11 random Goose   User is offline

  • D.I.C Head

Reputation: 1
  • View blog
  • Posts: 50
  • Joined: 07-August 20

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 11:58 AM

After adding->
struct fpa{
	float* vX;
	float* vY;
	unsigned int sz = 0;
	~fpa();
};
fpa::~fpa(){
    delete[] vX;
    delete[] vY;
}

I'am getting runtime error, with that:

Quote

double free or corruption (!prev)
Aborted (core dumped)
------------------
(program exited with code: 134)

Was This Post Helpful? 0
  • +
  • -

#12 Skydiver   User is online

  • Code herder
  • member icon

Reputation: 7530
  • View blog
  • Posts: 25,321
  • Joined: 05-May 12

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 12:47 PM

See Bresenham’s circle drawing algorithm for fast circle drawing.
Was This Post Helpful? 1
  • +
  • -

#13 jimblumberg   User is online

  • member icon

Reputation: 5880
  • View blog
  • Posts: 17,867
  • Joined: 25-December 09

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 01:32 PM

Look at line 21 and 22 in your last code. Where are you deleting the memory for that new call on line 21. Line 22 should not compile, that is a VLA which is expressly forbidden in C++ (at this time at least).

Quote

Default destructor is not good if it's

A "default" destructor doesn't delete anything. If you call new[] you must call delete[].

It is also considered a bad practice (IMO) to have some foreign entity free memory that it doesn't own. You may want to consider either using smart pointers or std::vector instead of the manual memory management.

Quote

I'am getting runtime error, with that

First are you by chance calling that destructor somewhere in your code? That "double" is a very important word in that error message. It is saying that that destructor is being called more that once (bad juju).

I would add the following compile switches to what you have, -pedantic, -pendatic-errors, Wvla. The first two additions force g++ to stop using all compiler specific hacks, and the last will warn you when you try to use Variable Length Arrays.

Jim
Was This Post Helpful? 1
  • +
  • -

#14 random Goose   User is offline

  • D.I.C Head

Reputation: 1
  • View blog
  • Posts: 50
  • Joined: 07-August 20

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 02:33 PM

Quote

First are you by chance calling that destructor somewhere in your code?

No. Only presented code here is being compiled and run. That`s weird.

Quote

It is saying that that destructor is being called more that once (bad juju).

What should I do with that? I don't see where it even could be done.

Quote

I would add the following compile switches to what you have

Thank you.

Quote

It is also considered a bad practice to have some foreign entity free memory that it doesn't own.

Well I didn't know that. I have to google some information about this kind of things. Thanks.
Was This Post Helpful? 0
  • +
  • -

#15 random Goose   User is offline

  • D.I.C Head

Reputation: 1
  • View blog
  • Posts: 50
  • Joined: 07-August 20

Re: How can I make this faster? (Small Geometric functions).

Posted 06 September 2020 - 03:04 PM

Quote

What should I do with that?

My bad, I fixed that. Memory is should be freed now.

#include <cmath>
#include <chrono>
#include <vector>
#include <iostream>
#include <string>

using namespace std;
const float PI = 3.14159265358979f;


struct fpa{
	float* vX;
	float* vY;
	unsigned int sz = 0;
	~fpa(){
		delete[] vX;
		delete[] vY;
	}
};

// Current version:
// remember DON'T PASS END_ANGLE > 360.0.
fpa compass(float radius, float start_Angle, float end_Angle, float angle_Step){
	const unsigned int n_sz = (end_Angle - start_Angle) / angle_Step;
	const float r = radius*radius;
	float* tmp_compArr = new float[n_sz];	tmp_compArr[0] = start_Angle;
	float* tmp_Ar = new float[n_sz];		tmp_Ar[0] = start_Angle;
	float* tmpY = new float[n_sz];
	for(unsigned int i=0; i < n_sz; ++i){
		tmp_compArr[i] = angle_Step*i;
		tmp_Ar[i] = sinf(( tmp_compArr[i] * PI) / 180.0f);
		tmp_Ar[i] *= radius;
		tmpY[i] = sqrtf( r - (tmp_Ar[i] * tmp_Ar[i]) );
		if((tmp_compArr[i] > 90.0f) && (tmp_compArr[i] < 270.0f)){ tmpY[i] = -tmpY[i];}
	}
	delete[] tmp_compArr;
	fpa result;
	result.sz = n_sz;
	result.vX = tmp_Ar;		result.vY = tmpY;
	return result;
}

const string start_str = "\t**Start of test.**\n";
const string nRuns_str = "Number of Runs: ";
const string nCycles_str = " Number of Cycles per Run: ";
const string avr_TimeMS_str = "\tAverage time (ms): ";

const unsigned int RUNS = 10;
const unsigned int CYCLES = 10001;
float sum = 0.0f;
vector<float> vMS;

int test_Compass(){
    using std::chrono::system_clock;
    
    auto t1 = chrono::steady_clock::now();
    for(unsigned int i=1; i < CYCLES; ++i){
        compass(float(i), 0.0f, 360.0f, 1.0f);
    };
    auto t2 = chrono::steady_clock::now();
    vMS.push_back( chrono::duration_cast<chrono::milliseconds>(t2 - t1).count() );
    return int(chrono::duration_cast<chrono::milliseconds>(t2 - t1).count());
}

int main(){
    cout << start_str;
    for(unsigned int i=0; i < RUNS; ++i){
        sum += test_Compass();
    }
    float minMS = vMS[0]; float maxMS = vMS[0];
    for(auto n : vMS){
        if(minMS > n){minMS=n;}
        if(maxMS < n){maxMS=n;}
    }
    cout<< "\tmin ms: "<< minMS << "; max ms: "<< maxMS<< endl;
    cout<< nRuns_str<< RUNS << nCycles_str<< CYCLES-1<< endl<<
        avr_TimeMS_str<< sum/RUNS << endl;
    return 0;
}




But after all was freed(supposedly properly), it slowed down a bit.

Quote

**Start of test.**
min ms: 29; max ms: 38
Number of Runs: 10 Number of Cycles per Run: 10000
Average time (ms): 29.9

Maybe I should really try Bresenham’s circle drawing algorithm...
Was This Post Helpful? 0
  • +
  • -

  • (3 Pages)
  • +
  • 1
  • 2
  • 3