Best way to Efficiently Parse a LArge Amount of Data

  • (2 Pages)
  • +
  • 1
  • 2

28 Replies - 705 Views - Last Post: 04 July 2012 - 06:31 PM Rate Topic: -----

#1 bearsomg  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 11
  • Joined: 03-July 12

Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 10:07 AM

Hello. I am currently writing a 3D rendering engine in C++ with Direct3D 10 that is focused on animating one type of model and motion data. I need an efficient way to parse in the animation data. The motion file is set up like this:

-Header-

-number of bone keyframes- (unsigned long)

Each Bone Motion Frame:
  • Bone name (char[20])
  • Keyframe Number (unsigned long)
  • Position vector (float[3])
  • Rotation quaternion (float[4])
  • interpolation parameters (char[64])



-number of face morph keyframes- (unsigned long)

Each face morph keyframe:
  • Face name (char[15])
  • Keyframe Number (unsigned long)
  • Weight (float)


I currently have structs set up to pull the data from a raw buffer like so:

typedef struct _BoneFrame {
   char name[15];          /* bone name */
   unsigned long keyFrame; /* key frame */
   float pos[3];           /* position (x, y, z) */
   float rot[4];           /* rotation (x, y, z, w) */
   char interpolation[64]; /* interpolation parameters */
} BoneFrame;

typedef struct _FaceFrame {
   char name[15];          /* face name */
   unsigned long keyFrame; /* key frame */
   float weight;           /* weight (0.0 - 1.0) */
} FaceFrame;



When the file is loaded, the program will first read the entire file into a buffer, then the buffer is parsed into BoneFrame and FaceFrame arrays. First it reads the number of bone frames, then goes into a for loop that continues until it reaches the last bone frame data block. It then does the same thing, except for the face morph frames.

_numBoneKeyframe = *((unsigned long *) data);
data += sizeof(unsigned long);

_boneFrameRaw = (BoneFrame *)data;
data += sizeof(BoneFrame)*_numBoneKeyframe;

_numFaceKeyframe = *((unsigned long *)data);
data += sizeof(unsigned long);

_faceFrameRaw = (FaceFrame*)data;



This is able to pull the raw data from the file into the program, but in order for it to be used, it needs to be parsed. The issue is that the motion file does not have the frames put in any particular order, it seems like they are just randomly put in there within their respective sections. In order to use this data in the animation, it needs to be separated like this:

Array of bones
For each bone in array:
Bone Name
Number of keyframes for this bone
Array of keyframe data for this bone

Array of faces
For each face in array:
Face Name
Number of keyframes for this face
Array of keyframe data for this face



So, in order to build these arrays, the program will first loop through all the bone frame data blocks in the array, and determine the number of bones in the animation:
for (int i=0;i<(int)_numBoneKeyframe;i++)
	{
		bool dupe = false;
		for (int j=0;j<i;j++)
		{
			if (!strcmp(_boneFrameRaw[i].name, _boneFrameRaw[j].name))
			{
				dupe = true;
			}
		}
		if (!dupe)
		{
			numUniqueBones++;
		}
	}


Then it will loop through again, and this time get the name of each bone, and put it in the bone array:
_boneList = (BoneMotion*)malloc(sizeof(BoneMotion)*numUniqueBones);
	int numFilledMotions = 0;
	for (int i=0;i<(int)_numBoneKeyframe;i++)
	{
		bool dupe = false;
		for (int j=0;j<i;j++)
		{
			if (!strcmp(_boneFrameRaw[i].name, _boneFrameRaw[j].name))
			{
				dupe = true;
			}
		}
		if (!dupe)
		{
			_boneList[numFilledMotions].name = (char*)malloc(sizeof(char)*strlen(_boneFrameRaw[i].name)+1);
			strcpy_s(_boneList[numFilledMotions].name, sizeof(char)*strlen(_boneFrameRaw[i].name)+1, _boneFrameRaw[i].name);
			numFilledMotions++;
		}
	}


Then get the number of keyframes for each bone:
for (int i=0;i<numUniqueBones;i++)
	{
		_boneList[i].numKeyFrame = 0;
		for (int j=0;j<(int)_numBoneKeyframe;j++)
		{
			if (!strcmp(_boneFrameRaw[j].name, _boneList[i].name))
			{
				_boneList[i].numKeyFrame++;
			}
		}
	}


And finally build the keyframe list for each bone and sort the keyframes:
for (int i=0;i<numUniqueBones;i++)
	{
		int numFullKeyframes = 0;
		_boneList[i].keyFrameList = new BoneKeyframe[_boneList[i].numKeyFrame];
		for (int j=0;j<(int)_numBoneKeyframe;j++)
		{
			if (!strcmp(_boneFrameRaw[j].name, _boneList[i].name))
			{
				_boneList[i].keyFrameList[numFullKeyframes].keyframe = (float)_boneFrameRaw[j].keyFrame;
				_boneList[i].keyFrameList[numFullKeyframes].pos = _boneFrameRaw[j].pos;
				_boneList[i].keyFrameList[numFullKeyframes].rot = _boneFrameRaw[j].rot;
				numFullKeyframes++;
			}
		}
		qsort(_boneList[i].keyFrameList, numFullKeyframes, sizeof(BoneKeyframe), compare);
	}



The same process is repeated for the face keyframes.


However, this is incredibly inefficient. What would be a better way to approach this? The motion files can get up to and probably over 50MB, and when I try to parse a file that large it takes 30 minutes or windows kills the process because it thinks it stalled.


Thank you in advance.

Is This A Good Question/Topic? 0
  • +

Replies To: Best way to Efficiently Parse a LArge Amount of Data

#2 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3553
  • View blog
  • Posts: 11,014
  • Joined: 05-May 12

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 10:52 AM

A quick note: you are mixing the use of malloc() and C++ new in your code above. You probably want to consistently use the C++ new.

I don't see the point of you doing your initial passes to count the unique bones, and then get the names of the bones, then get the number of keyframes, and then finally get the bone keyframes finally sorted. That's 4 passes through the data.

I would get the number of bone keyframes from the file, and allocate a large array BoneFrame. I would read the all the BoneFrame data into the large array and then call qsort() where my compare function uses BoneFrame.name as the primary key, and BoneFrame.keyname as the secondary key. Now I can do a single pass through the large BoneFrame array, and build up the BoneMotion structure. The logic for building up the BoneMotion structure could dramatically be simplified by using std::vector<BoneMotion>.
Was This Post Helpful? 1
  • +
  • -

#3 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3553
  • View blog
  • Posts: 11,014
  • Joined: 05-May 12

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 11:29 AM

I'll ask a silly question: Since you stated in the beginning that you are writing the 3D rendering engine, why do you even need to re-order the frames in the order you listed above? Since it's your engine, you should be able to write it so that it can skip around in the frames array as needed. If you say that you need it for speed, what is your engine doing between the keyframes? Why can't it be searching for the next keyframe or building up an index of the frames during its slack time?
Was This Post Helpful? 0
  • +
  • -

#4 baavgai  Icon User is offline

  • Dreaming Coder
  • member icon

Reputation: 5805
  • View blog
  • Posts: 12,643
  • Joined: 16-October 07

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 11:39 AM

You need to figure out if you're writing C or C++. You mostly have C here, but if you're using C++ you should go with that entirely.

This looks like a weird memory leak:
_boneList[numFilledMotions].name = (char*)malloc(sizeof(char)*strlen(_boneFrameRaw[i].name)+1);


The struct has the space allocated, after all.

Just using an std::map would let your avoid all your checking and double checking. If you were doing C, a memcpy once you've figured out what you want would save a considerable amount of time.

If you don't need to ditch data, if it's not too big, you could just manipulated it in place.

Just for fun, in C...
typedef struct {
	BoneFrame *data;
	unsigned long size;
} BoneFrames;

void parse(BoneFrames *bfs, char *data) {
	int i=0;
	bfs->size = *((unsigned long *) data);
	bfs->data = (BoneFrame *)(data + sizeof(unsigned long));
	
	while(i<bfs->size) {
		int j=i+1;
		char *name = bfs->data[i].name;
		while(j<bfs->size) {
			if (strcmp(bfs->data[j].name, name)==0) {
				bfs->size--;
				bfs->data[j] = bfs->data[bfs->size];
			} else {
				j++;
			}
		}
	}
}



There, all data moved around in place. No extra allocations.
Was This Post Helpful? 0
  • +
  • -

#5 GWatt  Icon User is offline

  • member icon

Reputation: 270
  • View blog
  • Posts: 3,068
  • Joined: 01-December 05

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 12:06 PM

How do you store the data? If you have no special pointers that need to be restored you can do direct file read/write to restore/preserve your model:
BOOL saveModel(const char *fileName, HEADER *header, ulong bfc, BoneFrame *bfs, ulong ffc, FaceFrame *ffs) {
    FILE *fp=fopen(fileName, "wb");
    if (fp==NULL) {
        return FALSE;
    }
    fwrite(header, sizeof *header, 1, fp);
    fwrite(&bfc, sizeof bfc, 1, fp);
    fwrite(bfs, sizeof *bfs, bfc, fp);
    fwrite(&ffc, sizeof ffc, fp);
    fwrite(ffs, sizeof *ffs, ffc, fp);
    fclose(fp);

    return TRUE;
}

BOOL readModel(const char *fileName, HEADER *header, ulong *bfc, BoneFrame **bfs, ulong *ffc, FacFrame **ffs) {
    FILE *fp=fopen(fileName, "rb");
    if (fp==NULL) {
        return FALSE;
    }
    fread(header, sizeof *header, 1, fp);
    fread(bfc, sizeof *bfc, 1, fp);
    *bfs=calloc(sizeof **bfs, *bfc);
    fread(*bfs, sizeof **bfs, *bfc, fp);
    fread(ffc, sizeof *ffc, 1, fp);
    *ffs=calloc(sizeof **ffs, *ffc);
    fread(*ffs, sizeof **ffs, *ffc, fp);

    fclose(fp);
    return TRUE;
}



This will read your data back in the same order you wrote them.
Was This Post Helpful? 0
  • +
  • -

#6 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3553
  • View blog
  • Posts: 11,014
  • Joined: 05-May 12

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 12:10 PM

View Postbaavgai, on 03 July 2012 - 11:39 AM, said:

If you don't need to ditch data, if it's not too big, you could just manipulated it in place.

Just for fun, in C...
typedef struct {
	BoneFrame *data;
	unsigned long size;
} BoneFrames;

void parse(BoneFrames *bfs, char *data) {
	int i=0;
	bfs->size = *((unsigned long *) data);
	bfs->data = (BoneFrame *)(data + sizeof(unsigned long));
	
	while(i<bfs->size) {
		int j=i+1;
		char *name = bfs->data[i].name;
		while(j<bfs->size) {
			if (strcmp(bfs->data[j].name, name)==0) {
				bfs->size--;
				bfs->data[j] = bfs->data[bfs->size];
			} else {
				j++;
			}
		}
	}
}



There, all data moved around in place. No extra allocations.


Assuming that I don't get stuck in an infinite loop (I think the code is missing an i++ somewhere), don't I lose keyframes with the code above?

With data:
size:4
name:"Skull", frame:1
name:"Skull", frame:12
name:"Pelvis", frame:1
name:"Pelvis", frame:23



Won't I end up with BoneFrames:
name:"Skull", frame:1
name:"Pelvis", frame:1
size:2



What happened to the skull frame 12, and pelvis frame 23?
Was This Post Helpful? 0
  • +
  • -

#7 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3553
  • View blog
  • Posts: 11,014
  • Joined: 05-May 12

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 12:17 PM

View PostGWatt, on 03 July 2012 - 12:06 PM, said:

How do you store the data? If you have no special pointers that need to be restored you can do direct file read/write to restore/preserve your model:
BOOL saveModel(const char *fileName, HEADER *header, ulong bfc, BoneFrame *bfs, ulong ffc, FaceFrame *ffs) {
    FILE *fp=fopen(fileName, "wb");
    if (fp==NULL) {
        return FALSE;
    }
    fwrite(header, sizeof *header, 1, fp);
    fwrite(&bfc, sizeof bfc, 1, fp);
    fwrite(bfs, sizeof *bfs, bfc, fp);
    fwrite(&ffc, sizeof ffc, fp);
    fwrite(ffs, sizeof *ffs, ffc, fp);
    fclose(fp);

    return TRUE;
}

BOOL readModel(const char *fileName, HEADER *header, ulong *bfc, BoneFrame **bfs, ulong *ffc, FacFrame **ffs) {
    FILE *fp=fopen(fileName, "rb");
    if (fp==NULL) {
        return FALSE;
    }
    fread(header, sizeof *header, 1, fp);
    fread(bfc, sizeof *bfc, 1, fp);
    *bfs=calloc(sizeof **bfs, *bfc);
    fread(*bfs, sizeof **bfs, *bfc, fp);
    fread(ffc, sizeof *ffc, 1, fp);
    *ffs=calloc(sizeof **ffs, *ffc);
    fread(*ffs, sizeof **ffs, *ffc, fp);

    fclose(fp);
    return TRUE;
}



This will read your data back in the same order you wrote them.


The reading/writing doesn't seem to be the issue. It's the grouping the keyframes together so that all the keyframes for a particular bone or face are all clustered together (and presumably order in ascending order). As he stated, the keyframes are not stored "in no particular order", but his rendering engine wants them to be ordered.
Was This Post Helpful? 0
  • +
  • -

#8 GWatt  Icon User is offline

  • member icon

Reputation: 270
  • View blog
  • Posts: 3,068
  • Joined: 01-December 05

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 12:26 PM

Actually, he did state that the frames are stored in no particular order. The exact wording:

Quote

The issue is that the motion file does not have the frames put in any particular order, it seems like they are just randomly put in there within their respective sections.

Was This Post Helpful? 0
  • +
  • -

#9 bearsomg  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 11
  • Joined: 03-July 12

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 12:39 PM

View PostSkydiver, on 03 July 2012 - 10:52 AM, said:

A quick note: you are mixing the use of malloc() and C++ new in your code above. You probably want to consistently use the C++ new.

I don't see the point of you doing your initial passes to count the unique bones, and then get the names of the bones, then get the number of keyframes, and then finally get the bone keyframes finally sorted. That's 4 passes through the data.

I would get the number of bone keyframes from the file, and allocate a large array BoneFrame. I would read the all the BoneFrame data into the large array and then call qsort() where my compare function uses BoneFrame.name as the primary key, and BoneFrame.keyname as the secondary key. Now I can do a single pass through the large BoneFrame array, and build up the BoneMotion structure. The logic for building up the BoneMotion structure could dramatically be simplified by using std::vector<BoneMotion>.


Ah, thank you for pointing that out. I feel so clueless now... If I sort the entire BoneFrame array in order of each bone name, then in order of the keyframe frame number, I should be able to just loop through the array once, checking which "section" (given by bone name) the current index in the array is, and put the data into the respective index in the _boneList array. I still can't figure out how to make the sort function though. I just looked at my dump of the motion data file, and it looks like the first bone keyframe entries in the motion data are each unique bone, at frame 0, in the order that I need them to be when the sort is finished. But after there entries, the frame data is just shoved in there randomly.

View PostSkydiver, on 03 July 2012 - 11:29 AM, said:

I'll ask a silly question: Since you stated in the beginning that you are writing the 3D rendering engine, why do you even need to re-order the frames in the order you listed above? Since it's your engine, you should be able to write it so that it can skip around in the frames array as needed. If you say that you need it for speed, what is your engine doing between the keyframes? Why can't it be searching for the next keyframe or building up an index of the frames during its slack time?


I have one MotionController class, that takes care of advancing the current frame based on the deltatime and the desired fps. Each time the update function is called, it also checks to see if we passed the keyframe we are currently interpolating to, and if we did then it will increment the index into the keyframe array up 1, so the next tiem update is called it will interpolate between the last keyframe and the next keyframe. This saves processing time, so each time I request the current position and rotation, all it has to do is calculate the interpolated position and rotation between the keyframes, using the current keyframe index to directly get the keyframe at that index in the array. That way, it doesn't need to search for the next keyframe and waste time doing so. With the frame data separated into bone name, then ordered by frame number, incrementing the current index by 1 will automatically reference the next keyframe. The engine itself is calculating physics, updating bone positions, and skinning vertices each frame, so any processing power that can be saved must be saved.
Was This Post Helpful? 0
  • +
  • -

#10 bearsomg  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 11
  • Joined: 03-July 12

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 12:45 PM

View PostGWatt, on 03 July 2012 - 12:06 PM, said:

How do you store the data? If you have no special pointers that need to be restored you can do direct file read/write to restore/preserve your model:
BOOL saveModel(const char *fileName, HEADER *header, ulong bfc, BoneFrame *bfs, ulong ffc, FaceFrame *ffs) {
    FILE *fp=fopen(fileName, "wb");
    if (fp==NULL) {
        return FALSE;
    }
    fwrite(header, sizeof *header, 1, fp);
    fwrite(&bfc, sizeof bfc, 1, fp);
    fwrite(bfs, sizeof *bfs, bfc, fp);
    fwrite(&ffc, sizeof ffc, fp);
    fwrite(ffs, sizeof *ffs, ffc, fp);
    fclose(fp);

    return TRUE;
}

BOOL readModel(const char *fileName, HEADER *header, ulong *bfc, BoneFrame **bfs, ulong *ffc, FacFrame **ffs) {
    FILE *fp=fopen(fileName, "rb");
    if (fp==NULL) {
        return FALSE;
    }
    fread(header, sizeof *header, 1, fp);
    fread(bfc, sizeof *bfc, 1, fp);
    *bfs=calloc(sizeof **bfs, *bfc);
    fread(*bfs, sizeof **bfs, *bfc, fp);
    fread(ffc, sizeof *ffc, 1, fp);
    *ffs=calloc(sizeof **ffs, *ffc);
    fread(*ffs, sizeof **ffs, *ffc, fp);

    fclose(fp);
    return TRUE;
}



This will read your data back in the same order you wrote them.


If I could optimize the data (get it in all the right orders) then write it back to the file, I would. However, the program used to make the animations and motion data will be unable to read the file back in.


View PostSkydiver, on 03 July 2012 - 12:17 PM, said:

View PostGWatt, on 03 July 2012 - 12:06 PM, said:

How do you store the data? If you have no special pointers that need to be restored you can do direct file read/write to restore/preserve your model:
BOOL saveModel(const char *fileName, HEADER *header, ulong bfc, BoneFrame *bfs, ulong ffc, FaceFrame *ffs) {
    FILE *fp=fopen(fileName, "wb");
    if (fp==NULL) {
        return FALSE;
    }
    fwrite(header, sizeof *header, 1, fp);
    fwrite(&bfc, sizeof bfc, 1, fp);
    fwrite(bfs, sizeof *bfs, bfc, fp);
    fwrite(&ffc, sizeof ffc, fp);
    fwrite(ffs, sizeof *ffs, ffc, fp);
    fclose(fp);

    return TRUE;
}

BOOL readModel(const char *fileName, HEADER *header, ulong *bfc, BoneFrame **bfs, ulong *ffc, FacFrame **ffs) {
    FILE *fp=fopen(fileName, "rb");
    if (fp==NULL) {
        return FALSE;
    }
    fread(header, sizeof *header, 1, fp);
    fread(bfc, sizeof *bfc, 1, fp);
    *bfs=calloc(sizeof **bfs, *bfc);
    fread(*bfs, sizeof **bfs, *bfc, fp);
    fread(ffc, sizeof *ffc, 1, fp);
    *ffs=calloc(sizeof **ffs, *ffc);
    fread(*ffs, sizeof **ffs, *ffc, fp);

    fclose(fp);
    return TRUE;
}



This will read your data back in the same order you wrote them.


The reading/writing doesn't seem to be the issue. It's the grouping the keyframes together so that all the keyframes for a particular bone or face are all clustered together (and presumably order in ascending order). As he stated, the keyframes are not stored "in no particular order", but his rendering engine wants them to be ordered.


Yes. In the file (as I said in my last post, apparently you guys were replying while I was writing it ;)), the first entries are all the bones, in the correct order, at frame 0. After that, the entries are completely random, until you reach the face section. After that the same setup is repeated for the face keyframes.
Was This Post Helpful? 0
  • +
  • -

#11 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3553
  • View blog
  • Posts: 11,014
  • Joined: 05-May 12

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 12:55 PM

View Postbaavgai, on 03 July 2012 - 11:39 AM, said:

You need to figure out if you're writing C or C++. You mostly have C here, but if you're using C++ you should go with that entirely.

This looks like a weird memory leak:
_boneList[numFilledMotions].name = (char*)malloc(sizeof(char)*strlen(_boneFrameRaw[i].name)+1);


The struct has the space allocated, after all.


Actually I was tricked into thinking the same thing, too. But on closer inspection _boneList is type of (BoneMotion*), not (BoneFrame*) that I think both of us were thinking. Since we don't have the definition of BoneMotion, I can only guess that the code above is correct in that it needs to allocate room for the name.

View PostGWatt, on 03 July 2012 - 12:26 PM, said:

Actually, he did state that the frames are stored in no particular order. The exact wording:

Quote

The issue is that the motion file does not have the frames put in any particular order, it seems like they are just randomly put in there within their respective sections.


Oops. The 'not' was not supposed to be in my statement

Quote

As he stated, the keyframes are not stored "in no particular order", but his rendering engine wants them to be ordered.


View Postbearsomg, on 03 July 2012 - 12:45 PM, said:

If I could optimize the data (get it in all the right orders) then write it back to the file, I would. However, the program used to make the animations and motion data will be unable to read the file back in.


That would mean that the program has some kind of index file of the keyframes if you can't change the order in the file. Why can't you read the index file to figure out the order of the keyframes? Or is the keyframe data file documented, but the index file proprietary?

This post has been edited by Skydiver: 03 July 2012 - 12:50 PM

Was This Post Helpful? 0
  • +
  • -

#12 bearsomg  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 11
  • Joined: 03-July 12

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 12:58 PM

View PostSkydiver, on 03 July 2012 - 12:52 PM, said:

Actually I was tricked into thinking the same thing, too. But on closer inspection _boneList is type of (BoneMotion*), not (BoneFrame*) that I think both of us were thinking. Since we don't have the definition of BoneMotion, I can only guess that the code above is correct in that it needs to allocate room for the name.


Yes, name is just a char*, it does not have a predefined allocation size. This is the declaration for BoneMotion and BoneKeyframe (the struct for a single keyframe, each BoneMotion has an array of these:

typedef struct _BoneKeyframe {
	float keyframe;
	D3DXVECTOR3 pos;
	D3DXQUATERNION rot;
} BoneKeyframe;
typedef struct _BoneMotion {
	char *name;
	unsigned long numKeyFrame;
} BoneMotion;



Sorry, I typed that wrong, and I can't seem to find the edit post button:


typedef struct _BoneMotion {
	char *name;
	unsigned long numKeyFrame;
	BoneKeyframe *keyFrameList;
} BoneMotion;


Was This Post Helpful? 0
  • +
  • -

#13 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3553
  • View blog
  • Posts: 11,014
  • Joined: 05-May 12

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 01:22 PM

View Postbearsomg, on 03 July 2012 - 12:39 PM, said:

If I sort the entire BoneFrame array in order of each bone name, then in order of the keyframe frame number, I should be able to just loop through the array once, checking which "section" (given by bone name) the current index in the array is, and put the data into the respective index in the _boneList array. I still can't figure out how to make the sort function though.


How did you get your initial compare() function to work in the code where you called qsort() to sort the keyframes? Just cribbed some old pre-existing code?

Anyway, here's an example of sorting where POINT.x is the primary key, and POINT.y is the secondary key:
typedef struct _point
{
    int x, y;
} POINT;

void DumpPoints(POINT * point, int n)
{
    for(int i = 0; i < n; ++i, ++point)
        printf("%d: %d, %d\n", i, point->x, point->y);
    printf("\n");
}

int mycompare(const void * pv1, const void *pv2)
{
    const POINT * p1 = reinterpret_cast<const POINT *>(pv1);
    const POINT * p2 = reinterpret_cast<const POINT *>(pv2);

    if (p1->x != p2->x)
        return p1->x - p2->x;

    return p1->y - p2->y;
}

int _tmain(int argc, _TCHAR* argv[])
{
    POINT points[5] = { { 3, 5 }, { 6, 7 }, { 3, 1, }, { 1, 2 }, { 6, 4 } };

    DumpPoints(points, 5);
    qsort(points, 5, sizeof(POINT), mycompare);
    DumpPoints(points, 5);
    return 0;
}



It should give you a basis for figuring out how to store by name as the primary key and frame number as the secondary key.
Was This Post Helpful? 0
  • +
  • -

#14 Salem_c  Icon User is online

  • void main'ers are DOOMED
  • member icon

Reputation: 1655
  • View blog
  • Posts: 3,133
  • Joined: 30-May 10

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 02:49 PM

Since you're on windows, and therefore lack a decent profiler (unless you shell out the $$$ to MS), you should do this
http://msdn.microsof...v=vs.85%29.aspx

At key points in the code, put a few things like
printf("Start of reading bones @T=%lu\n", GetTickCount() );


You should end up with say a few dozen prints, and some idea of where the big time differences are.
Perhaps add more prints to the inner parts of the expensive routines to find out more detail.

The first step to optimisation is knowing where the problem is to begin with. For this, you need some measurements on your existing code (guessing does NOT work).

When you've found the function / loop which is taking 90% of the time, you can focus your attention on that particular bit of the code.
Was This Post Helpful? 1
  • +
  • -

#15 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 3553
  • View blog
  • Posts: 11,014
  • Joined: 05-May 12

Re: Best way to Efficiently Parse a LArge Amount of Data

Posted 03 July 2012 - 03:16 PM

Salem_C makes a good point about profiling, but I'm willing to buy somebody a year's supply of Oreos that the slowest bit of code (other than the code to actually read from disk into memory) is the last chunk of code in the OP, where the actual keyframe data is copied from _boneFrameRaw into the _boneList[i].keyFrameList and then qsort()'d.

(PM me if your mailing address in case I am wrong and there is something else that was posted that is slower.)
Was This Post Helpful? 0
  • +
  • -

  • (2 Pages)
  • +
  • 1
  • 2