4 Replies - 938 Views - Last Post: 17 February 2009 - 05:51 AM

#1 JayX   User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 2
  • Joined: 13-February 09

My code seems so badly written it's crashing the server..!

Post icon  Posted 13 February 2009 - 06:18 AM

Hi guys, apologies for the length of the post.. I'd rather give too much information than too little and waste peoples time.

I'm working on a project in work at the moment where I have to convert insurance policy data provided by different companies to a format that is readable by our programming guys for lookups. I'm basically passed data in various CSV formats and I'm asked to spit it back out, independently of who provided it, in the same 'one line' pipe-delimited format. It's not ideal, but this was set up before I became involved and it's too late to change now it seems! (Typical!)

I took over when the guys trying to do it in Excel got hit by the 65k limit of 2003 and we were dealing with files at 130,000+ lines and I said I'd give it a shot as I had an understanding of php/sql and wanted to expand my knowledge in the area. So I've written the php/sql and it's working fine for the company the formatting was based around, but since then we have a client providing their policies in a completely different format.

Every week I receive a set of files from them in fixed width format, with 3 different width settings based on the value of the 10th character. If this is a '1' it's a policy information line (only one of which can exist), '2' is information on the policy holder (one line per policy holder) and '3' is a list of medical conditions (one line per condition per policy holder). I made a table for each type of line.

The format I need to supply this in as stated before is on one line, so columns tend to be INSURED1, CONDITION1_1, CONDITION2_1, CONDITION3_1, INSURED2, CONDITION1_2, CONDITION2_2, CONDITION3_2 etc.

Currently I dump the raw files into a standard MyISAM db in the same formatting they are received as they contain more details than required and useful for more detailed lookups. I then need to combine the policy line with every line about the insured along with all their conditions.

Doing this in one attempt didn't seem to work, so I thought about combining the insured/condition sections first as this was the more complicated bit, then adding the results to the policy data and importing. This is where I'm having problem. The other sections all work fine, and even tho my code isn't great due to my inexperience, it works and it doesn't take a ridiculous amount of time.. however this next bit is possibly crashing the queried table (It's hard to tell as it's not saying crashed, but phpmyadmin/websites trying to access it completely halt and even killing the process doesn't work. I'm not ruling out the MySQLD being badly setup, but my code certainly isn't helping).

Here is the code I'm using for the insured/condition combine.. I'm sure there's a better and more efficient way of doing this and also setting up proper indexes would help (I'm reading into this at the moment, but currently the DB has been crashed in the same way as above when I attempted to simply drop an index that had been made before so I'm not able to test anything until SQLD eventually goes back up!)

<?php 
include('includes/dbconnect.php'); 
ini_set('max_execution_time', 0);

mysql_query("TRUNCATE TABLE `insdcond`") or die(mysql_error());
echo "Table emptied.. <br />";
mysql_query("ALTER TABLE `insdcond`
  DROP `condition1`,
  DROP `condition2`,
  DROP `condition3`,
  DROP `condition4`,
  DROP `condition5`;"); 
echo "Columns dropped.. <br />";
mysql_query ("INSERT INTO `insdcond` SELECT * FROM `insured`;") or die(mysql_error());
echo "Data copied.. <br />";
mysql_query("ALTER TABLE `insdcond` 
	ADD `condition1` VARCHAR( 30 ) NULL AFTER `conditions`,
	ADD `condition2` VARCHAR( 30 ) NULL AFTER `condition1`,
	ADD `condition3` VARCHAR( 30 ) NULL AFTER `condition2`,
	ADD `condition4` VARCHAR( 30 ) NULL AFTER `condition3`,
	ADD `condition5` VARCHAR( 30) NULL AFTER `condition4`;") or die ("HAF! HAF! (2)");
echo "Columns added.. <br /><br />";

$query = "SELECT DISTINCT 
c1.policyNumber as polNo, c1.personNumber as perNo, c1.conditionName as cond1,
c2.policyNumber, c2.personNumber, c2.conditionName as cond2,
c3.policyNumber, c3.personNumber, c3.conditionName as cond3,
c4.policyNumber, c4.personNumber, c4.conditionName as cond4,
c5.policyNumber, c5.personNumber, c5.conditionName as cond5

FROM condition c1

LEFT JOIN condition c2 on c1.policyNumber = c2.policyNumber
AND c1.personNumber = c2.personNumber
AND c1.conditionName != c2.conditionName

LEFT JOIN condition c3 on c1.policyNumber = c3.policyNumber
AND c1.personNumber = c3.personNumber
AND c1.conditionName != c3.conditionName
AND c2.conditionName != c3.conditionName

LEFT JOIN condition c4 on c1.policyNumber = c4.policyNumber
AND c1.personNumber = c4.personNumber
AND c1.conditionName != c4.conditionName
AND c2.conditionName != c4.conditionName
AND c3.conditionName != c4.conditionName

LEFT JOIN condition c5 on c1.policyNumber = c5.policyNumber
AND c1.personNumber = c5.personNumber
AND c1.conditionName != c5.conditionName
AND c2.conditionName != c5.conditionName
AND c3.conditionName != c5.conditionName
AND c4.conditionName != c5.conditionName

GROUP BY concat(c1.policyNumber,c1.personNumber)";

$result = mysql_query($query) or die ("HAF! HAF! (3)");
$num_rows = mysql_num_rows($result);

$counter=1;

	while ($row = mysql_fetch_array($result))
		{
			extract($row);
			
			set_time_limit(100);
							
	$cond1 = addslashes($cond1);
	$cond2 = addslashes($cond2);
	$cond3 = addslashes($cond3);
	$cond4 = addslashes($cond4);
	$cond5 = addslashes($cond5);
			
			mysql_query("UPDATE saga_insdcond SET condition1 = '$cond1', condition2 = '$cond2', condition3 =
			'$cond3', condition4 = '$cond4', condition5 = '$cond5'
			WHERE 
			policyNumber = '$polNo' AND personNumber = '$perNo'") or die(mysql_error());
			
			echo "$counter <br />";
		}
		$counter++; 
		?>



Currently there are 300,000 lines in the conditions DB and about 190,000 lines in insured. I'm sure you can guess which part of the code I think is horribly inefficient! Does anyone have a better way of doing this? When I tried it with smaller line counts it worked fine, but the amount of work required to pull this instruction off is probably ludicrous and the server is bailing.

Please feel free to ask for any info that might help out.. TIA.

This post has been edited by JayX: 13 February 2009 - 06:20 AM


Is This A Good Question/Topic? 0
  • +

Replies To: My code seems so badly written it's crashing the server..!

#2 Hary   User is offline

  • D.I.C Regular

Reputation: 44
  • View blog
  • Posts: 427
  • Joined: 23-September 08

Re: My code seems so badly written it's crashing the server..!

Posted 15 February 2009 - 08:08 AM

Quote

however this next bit is possibly crashing the queried table (It's hard to tell as it's not saying crashed, but phpmyadmin/websites trying to access it completely halt and even killing the process doesn't work. I'm not ruling out the MySQLD being badly setup, but my code certainly isn't helping).


It looks like MySQL is creating a lot of temp tables in memory, while the RAM is exhausted. The OS starts swapping and the MySQLd is halted and cannot be stopped.

The "group by" in combination with a concat is quite hard, you might try to have MySQL generate its query plan for the query and try to optimize using that knowledge. You might have to change the order of the joins for it...
Was This Post Helpful? 1
  • +
  • -

#3 baavgai   User is offline

  • Dreaming Coder
  • member icon


Reputation: 7205
  • View blog
  • Posts: 15,017
  • Joined: 16-October 07

Re: My code seems so badly written it's crashing the server..!

Posted 15 February 2009 - 05:02 PM

You code is vivid example of why programmers shouldn't mess with databases. ;)

If I understand what I'm seeing, you don't need to bring any of this data down to the client. You certainly don't need all that object creation. If your design was marginally normalized, it might not need any object creation at all.

This code may work.
drop TABLE if exists cscratch;

CREATE TABLE cscratch (
	id INT NOT NULL AUTO_INCREMENT, 
	seq INT, 
	policyNumber VARCHAR(30),
	personNumber VARCHAR(30),
	conditionName VARCHAR(30),
	PRIMARY KEY (id)
);

INSERT INTO cscratch(policyNumber, personNumber, conditionName)
	SELECT DISTINCT policyNumber, personNumber, conditionName FROM condition;

CREATE TEMPORARY TABLE cscratch2 ( id INT, seq INT );

insert into cscratch2 select a.id, count(b.id) + 1 from cscratch a left outer join cscratch b on a.policyNumber = b.policyNumber and a.personNumber = b.personNumber and a.id>b.id group by a.id;

UPDATE cscratch a SET seq = (select seq from cscratch2 b WHERE a.id=b.id);

drop TABLE if exists cscratch2;

UPDATE saga_insdcond a
	SET condition1 = ( select conditionName from cscratch b where b.seq=1 and a.policyNumber = b.policyNumber and a.personNumber = b.personNumber ),
	condition2 = ( select conditionName from cscratch b where b.seq=2 and a.policyNumber = b.policyNumber and a.personNumber = b.personNumber ),
	condition3 = ( select conditionName from cscratch b where b.seq=3 and a.policyNumber = b.policyNumber and a.personNumber = b.personNumber ),
	condition4 = ( select conditionName from cscratch b where b.seq=4 and a.policyNumber = b.policyNumber and a.personNumber = b.personNumber ),
	condition5 = ( select conditionName from cscratch b where b.seq=5 and a.policyNumber = b.policyNumber and a.personNumber = b.personNumber )
	WHERE concat(policyNumber, personNumber) in (select distinct concat(policyNumber, personNumber) from cscratch);
	
drop TABLE if exists cscratch;



This would probably be even easier if you didn't have the condition1, condition2, condition3... conditionN business. This kind of think begs for a child table.

Hope this helps.
Was This Post Helpful? 1
  • +
  • -

#4 JayX   User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 2
  • Joined: 13-February 09

Re: My code seems so badly written it's crashing the server..!

Posted 17 February 2009 - 05:34 AM

Hary - ah thanks for the insight, I'm going to move PHP/SQL over to a dedicated box I have full control of, so will certainly make life easier if it bails on me again. Cheers!

Baavgai - to call me a programmer is a huge insult to anyone who can write more than a 'Hello World' in any language! ;)

Thanks for the code, I tried to run it but I got the same problem again, I'll definitely give it another bash when I move PHP/SQL over to the new server and see if it will run then. It's good to know a proper way of moving from a (semi) normalised table to the mess that's requested of me!
Was This Post Helpful? 0
  • +
  • -

#5 baavgai   User is offline

  • Dreaming Coder
  • member icon


Reputation: 7205
  • View blog
  • Posts: 15,017
  • Joined: 16-October 07

Re: My code seems so badly written it's crashing the server..!

Posted 17 February 2009 - 05:51 AM

View PostJayX, on 17 Feb, 2009 - 06:34 AM, said:

Thanks for the code, I tried to run it but I got the same problem again, I'll definitely give it another bash when I move PHP/SQL over to the new server and see if it will run then. It's good to know a proper way of moving from a (semi) normalised table to the mess that's requested of me!


You could just be hitting a wall with memory or storage. Step through what I gave you and see where it gets stuck. You might be able to break down a step further and drop or release some allocated resources.

For normalized... If you have a personNumber or policyNumber, they should be numeric. If you have a one to many relationship, like a person can only have one policy, it should be a field in the person record and just pass about the person reference. If it's many to many, it might be worth putting an identity on that table so you only have to pass one value around. Don't be afraid to put indexes where you need them. If the conditions are standard, the should also have some numeric id. Remember, it's much easier to manipulate a unique int than a unique varchar.

If you have a schema that's bottle necking on you, post it to the database forum for some suggestions. Good Luck.
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1