12 Replies - 3293 Views - Last Post: 09 April 2010 - 12:20 PM

#1 bwcamy

  • New D.I.C Head

Reputation: 0
  • Posts: 9
  • Joined: 23-July 09

cfloop issue

Posted 09 April 2010 - 07:38 AM

Hello,

I have an application that loops through records to process them. The application will be initiated and start processing through the loop at, say, 4 pm. It will process for a while. Then at, say, 4:30 pm, it will still be running, but it somehow starts itself over again at the beginning. So, at 4:30, I will see the records midstream of the data processing, as well as the records at the beginning of the data stream being processed for a second time.

I cannot see any errors in the ColdFusion logs, nor are there any second triggers of the process at the time the duplication starts.

I have two separate files that process large volumes of data using a cfloop - one loops in a from-to structure, and the other loops through a query. They both have experienced this duplication / simultaneously running issue.

Any ideas of what might cause this? What might I look at to try to find a resolution to this?

TIA!
Amy


Replies To: cfloop issue

#2 Craig328

  • I make this look good

Reputation: 1866
  • Posts: 3,389
  • Joined: 13-January 08

Re: cfloop issue

Posted 09 April 2010 - 07:42 AM

Amy:

Is this process running as a scheduled task perchance? Can you post the looping code you're using?

#3 bwcamy

Posted 09 April 2010 - 07:53 AM

Well, there is a scheduled task involved with both processes. In both cases, one file schedules a task. The task is file #2, and file #2's job is to execute the cfc file that contains the looping.

However, one of my two loops previously did not run via the scheduled task. File 1 called the cfc directly, and I still had the duplication issue.

I'll post the abbreviated code in another post shortly.

#4 Craig328

Posted 09 April 2010 - 08:19 AM

Well, while you do that, I have a couple of questions/thoughts/suggestions for how you might troubleshoot the issue.

Does the loop tend to restart at a certain point? That is, is the number of records the loop process runs through fairly consistent before it starts over? Do you have error catching that might be causing the page containing the loop process to be called again? Can you run the looping process in a browser (rather than as a scheduled task on the server)? If so, have you tried dropping in a cfflush tag with some bit of info (like perhaps the loop iteration count) at the end of each successful loop iteration so that you can track progress via the browser? Finally, have you tried wiping out the existing scheduled task and creating a new one?

My initial thought is that you're running into either a timeout problem OR your scheduled task is set up to run 30 minutes later. CF did have some weirdness with scheduled task scheduling/running some time back. You may be running afoul of something like that.

Anyway, post your code when you can so we can see if you have anything obvious there. Also, what version of CF are you running, what platform is it running on and, if the loop process accesses a database, what database product/version are you using?

#5 bwcamy

Posted 09 April 2010 - 09:10 AM

Here is one snippet - I abbreviated it because the full code set is really long. I'm still working to abbreviate the second process.
<cfloop from="#arguments.startRecord#" to="#arguments.endRecord#" index="i">
    <cftry>
        <cfscript>
            // uses cfscript to get customer data from XML
        </cfscript>

        <!--- convert XML data into plain text for use in email --->

        <!--- if Contact_no exists --->
        <cfif len(trim(this.Customer.ID)) GT 0>
            <cfif this.saveCustomerObj.CheckCustomer(customerID: this.Customer.ID, datasource: sendEmailDB) is false>
                <cfscript>
                    this.Customer.RecipientID = this.saveCustomerObj.SaveRecipient(recipient: this.Customer, datasource: sendEmailDB);
                    this.saveCustomerObj.SaveCustomer(Customer: this.customer, datasource: sendEmailDB);
                </cfscript>
                <cfset this.recipientID = this.Customer.RecipientID>
            <cfelse>
                <cfscript>
                    if (this.saveCustomerObj.CheckCustomerEmail(customer: this.customer, datasource: sendEmailDB) is false)
                    {
                        // update customer email
                        this.saveCustomerObj.UpdateCustomer(Customer: this.customer, datasource: sendEmailDB);
                        this.saveCustomerObj.UpdateRecipient(Customer: this.customer, datasource: sendEmailDB);
                    }
                    this.recipientID = this.saveCustomerObj.GetRecipientID(this.Customer.ID, sendEmailDB);
                </cfscript>
            </cfif>

        <!--- else, no contact_no...save recipient and get ID --->
        <!--- Modify to check recipient ID --->
        <cfelse>
            <cfset this.RecipientID = this.saveCustomerObj.SaveRecipient(recipient: this.Customer, datasource: sendEmailDB)>
        </cfif>

        <!--- Personalize email content --->
        <cfset this.HTMLEmail = this.EmailXMLObj.EMAIL.HTML_EMAIL.XMLText>
        <cfset this.textEmail = this.EmailXMLObj.EMAIL.TEXT_EMAIL.XMLText>

        <cfset this.opening = "<">
        <cfset this.closing = ">">

        <!--- Series of replacenocase here for email content --->

        <cfif this.customer.ID eq ''>
            <cfset this.customer.ID = 0>
        </cfif>

        <!--- condition omitted from this abbreviated snippet --->
        <cfif>
            <!--- send email out with one set of parameters by calling cfc --->
        <cfelse>
            <!--- send email out with a different set of parameters by calling cfc --->
        </cfif>

        <!--- query attributes and SQL omitted from this abbreviated snippet --->
        <cfquery>

        </cfquery>

        <cfquery>

        </cfquery>

        <cfcatch>
            <!--- mail attributes and body omitted from this abbreviated snippet --->
            <cfmail>

            </cfmail>
        </cfcatch>
    </cftry>
</cfloop>


#6 Craig328

Posted 09 April 2010 - 10:36 AM

Okay. There's nothing in that code snippet that should cause it to restart on its own. It was heavily redacted, so I can't say whether the missing parts would have any effect, but from what it looks like you're doing, it doesn't appear so.

Earlier you said "I have an application that loops through records to process them". How many records are we talking about here and how are you getting these records (query result set, web feed, XML document)?

However, in the interim, I'll suggest this: can you copy the pertinent pages off, make changes to them so they don't send out emails and such to your customers or change your customer database and run them manually through a browser? In addition to the changes to nullify database updates/inserts/deletes and sending out emails try:

  • putting a cfoutput before the looping starts (<cfset total = arguments.endRecord - arguments.startRecord + 1><cfoutput>Records: [#total#]</cfoutput><br>)
  • putting a cfoutput tag at the top of your loop process (<cfoutput>Counter: [#i#] at #Now()#</cfoutput><br>)
  • putting a cfflush down at the end of the loop (<cfflush interval="1">)

...and let it run.
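
Put together, and assuming the loop can be copied into a plain .cfm page that is run from a browser, the three additions might look roughly like this. It is only a sketch: the `url.startRecord` and `url.endRecord` parameters are stand-ins for however the real arguments arrive, the record processing is left as a comment, and a plain `<cfflush>` at the end of each iteration is used here in place of `interval="1"`.

```cfml
<!--- Hypothetical test page: startRecord/endRecord are read from the URL, which is an assumption --->
<cfparam name="url.startRecord" type="numeric" default="1">
<cfparam name="url.endRecord" type="numeric" default="100">

<!--- A from/to loop includes both ends, hence the + 1 --->
<cfset total = url.endRecord - url.startRecord + 1>
<cfoutput>Records: [#total#]</cfoutput><br>

<cfloop from="#url.startRecord#" to="#url.endRecord#" index="i">
    <cfoutput>Counter: [#i#] at #Now()#</cfoutput><br>

    <!--- record processing goes here, with emails and database writes disabled --->

    <!--- push everything written so far to the browser --->
    <cfflush>
</cfloop>
```

If the loop ever starts over, the counter in the browser output should visibly drop back to the first record.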

What you'll see is the count of how many records you're about to process, plus a counter and start time for each loop iteration. I suspect that you'll discover that the loop proceeds all the way through to the end without repeating. In fact, I'd be very surprised to hear it didn't do exactly that. If it runs fine (and you'll have a count of the records processed and an idea of the time required to process them) then it's likely something having to do with the scheduled task itself...most likely the timeout setting for that task.

If the time you need to process the records exceeds the timeout time allotted to the scheduled task then you're probably on the right track to nailing this issue down, I suspect.
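
For reference, a timeout like this can be set in two places in CF: on the page itself with cfsetting, and on the task with cfschedule. A minimal sketch follows; the task name, URL, five-minute delay and one-hour value are all made up for illustration and are not taken from Amy's application.

```cfml
<!--- At the top of the page the task calls: allow this request up to one hour.
      The value is in seconds; 3600 is only an example. --->
<cfsetting requestTimeout="3600">

<!--- In the page that schedules the work: a one-time task with its own timeout.
      Task name and URL are hypothetical. --->
<cfset runAt = DateAdd("n", 5, Now())>
<cfschedule action="update"
    task="processRecordsOnce"
    operation="HTTPRequest"
    url="http://localhost/tasks/processRecords.cfm"
    startDate="#DateFormat(runAt, 'mm/dd/yyyy')#"
    startTime="#TimeFormat(runAt, 'HH:mm')#"
    interval="once"
    requestTimeOut="3600">
```

Comparing these values with the processing time measured in the browser test is one quick way to confirm or rule out the timeout theory.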

There are other ways you might consider for processing batch records like this. For instance, does all the processing need to occur at once or can it be broken down into separate smaller tasks? If you're getting the records from a database then you might consider building a task that creates a new table in the database (if it doesn't exist) that contains the record ID, a datestamp and a processed flag column, pulling up a block of records (a fraction of what you're doing now), looping over them and upon completion of each record, setting a "processed" flag for that record in the new table you just created. You call that as a scheduled task as many times a day as you need to process all the records and when the last record is done, run a query to drop the new table you created.

That last suggestion is, of course, a rough description of what you might do. It can be refined much more than that and is just a gross example for how you might consider accomplishing your task differently than you're doing now.
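
As a rough illustration of that idea against MySQL 5, the work table and one scheduled run might look like the following. The table name `email_queue`, its columns, the log file name and the batch size of 500 are assumptions; `sendEmailDB` is borrowed from the snippet posted earlier and is assumed to hold the datasource name.

```cfml
<!--- One-time setup: work table with record ID, datestamp and processed flag --->
<cfquery datasource="#sendEmailDB#">
    CREATE TABLE IF NOT EXISTS email_queue (
        record_id  INT NOT NULL PRIMARY KEY,
        queued_at  DATETIME NOT NULL,
        processed  TINYINT(1) NOT NULL DEFAULT 0
    )
</cfquery>

<!--- Each scheduled run: pull one block of unprocessed records --->
<cfquery name="batch" datasource="#sendEmailDB#">
    SELECT record_id
    FROM email_queue
    WHERE processed = 0
    ORDER BY record_id
    LIMIT 500
</cfquery>

<cfloop query="batch">
    <cftry>
        <!--- process the record identified by batch.record_id here --->

        <!--- flag the record only after it has been handled --->
        <cfquery datasource="#sendEmailDB#">
            UPDATE email_queue
            SET processed = 1
            WHERE record_id = <cfqueryparam value="#batch.record_id#" cfsqltype="cf_sql_integer">
        </cfquery>

        <cfcatch>
            <!--- leave processed = 0 so a later run picks the record up again --->
            <cflog file="emailQueue" text="Record #batch.record_id# failed: #cfcatch.message#">
        </cfcatch>
    </cftry>
</cfloop>
```

A record that always fails would be retried on every run in this sketch, so a real version would probably want an attempt counter or an error flag as well.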

This post has been edited by Craig328: 09 April 2010 - 10:36 AM


#7 bwcamy

Posted 09 April 2010 - 10:53 AM

There is no specific point that the issue occurs at. It only ever occurs with higher volumes: never with, say, 2 records or 10 records, but rather with 5K or 25K. It doesn't duplicate every time I process a large number of records, though.

There is a cftry / cfcatch inside the loop, so if the process errs, cf should catch it and go on to the next record.

Previously, I did run the process from the browser, and still had the problem that way. I have not tried a cfflush yet. But I have never been able to produce the issue in a testing environment - it has only occurred on production, but I could put the cfflush into the results file to review the next time that it does occur.

The task isn't scheduled at regular intervals - file #1 of my application schedules it to occur one time. So, each time the app runs, a new task is scheduled and set to run at an interval of "once". I can post the code that is scheduling the tasks as well.

Because the interval is set to "once", and because the duplication has occurred when there wasn't even a scheduled task involved in the process, I don't think it is the scheduling / timing of the task that is causing the duplication.

The weird thing that makes me feel like it isn't a timeout issue is that the initial instance of the loop continues to run until it is complete, while the secondary instance is simultaneously running.

I am running CF8 on Windows Server 2003. The queries within the loop access a MySQL 5 database.

Thanks so much!

#8 xheartonfire43x

  • D.I.C Regular

Reputation: 46
  • Posts: 454
  • Joined: 22-December 08

Posted 09 April 2010 - 11:04 AM

I had an issue once trying to loop over big files. It was an import of a couple of files our client would upload every day, and we needed to move the data into a database. I found that running the import through ColdFusion was such a hellish process: CF would either 1. totally crash from too much data processing, or 2. do really funky things like what you have described. I ended up writing the loop in VB. It ran much faster (and wouldn't time out at all). Writing a simple loop and SQL query is really easy in VB, and you can download the Express version from MS to write the code and then find a friend who has the Pro edition to export it with an installer.

#9 bwcamy

Posted 09 April 2010 - 11:09 AM

Usually it's processing between 5K up to 35K records. In the instance I posted, it is pulling the data from an XML variable. In the other instance I haven't had a chance to post yet, it is pulling the data from a cfhttp query.

Those are some really good ideas for the debugging and possibilities for restructuring the code.

The only problem is that I can't get the error to reproduce in testing, but I can put the debugging code up on production so that the next time the error occurs, I can use it for diagnostics. Since I can't get it to reproduce though, there's pretty much no way around that, I guess.

I am sure that the loop is running through and completing. But while it is completing, it is running a second time. I can tell by the data going into the database. The records look like this:

Recipient 20000
recipient 20001
recipient 20002
recipient 20003
recipient 20004
recipient 1
recipient 20005
recipient 2
recipient 20006
recipient 3
...
recipient(last in list)
recipient (whatever position it is on in the second iteration)
recipient (continuation of the second iteration until it completes)

So, if the task was timing out, I'm not sure how the original loop is able to continue. Also, if the task was starting again due to a timeout of the original, then shouldn't I be able to see an indication that the task executed again in the schedule log?

xheartonfire43x, on 09 April 2010 - 10:04 AM, said:

I had an issue once trying to loop over big files. It was an import of a couple of files our client would upload every day, and we needed to move the data into a database. I found that running the import through ColdFusion was such a hellish process: CF would either 1. totally crash from too much data processing, or 2. do really funky things like what you have described. I ended up writing the loop in VB. It ran much faster (and wouldn't time out at all). Writing a simple loop and SQL query is really easy in VB, and you can download the Express version from MS to write the code and then find a friend who has the Pro edition to export it with an installer.


Yeah, this is kind of what I am thinking. I actually worked in VB.NET a couple of years ago. But our apps are all completely CF, so I'm not sure how it would go over with the Powers that Be to intermingle .NET.

So, maybe I'm just going to have to break up the process into smaller subtasks or some other alternative arrangement...

#10 Craig328

Posted 09 April 2010 - 11:10 AM

Hm.

That does sound odd. Well, it's in situations like these that my experience tells me a new method of doing what you're doing may be needed. Like any decent dev, I always like to understand why something I wrote that should be working isn't. That said, sometimes (and especially with Adobe CF) you simply have to understand that while all things have an answer, sometimes the effort to find that answer is more trouble than it's worth...especially if you can arrive at your destination by an alternative, less-troubled way.

From what you just mentioned:

  • The issue doesn't occur in smaller batches...only larger ones
  • It doesn't occur in dev, only production
  • It doesn't have a set, specific, repetitive schedule


From those, the first question I'd ask is: can the task be done in dev? Now, that relies entirely on what "dev" is for you. If it means dev code and dev data then, of course, the answer is no. If your dev environment accesses production data (or can), perhaps consider running the task via dev.

However, as I mentioned previously, perhaps the best way to skin this cat is not to skin it at all. You may just need to get a different cat. Consider, could you build and maintain a database table of the records that need processing along with a date/time stamp and a processed flag and then run a process that pulls records for processing in batches of like 500-1K? That would seem to pass under the limit where you start to see issues...you'd just need to run that process more often is all. Several smaller processes in place of one big one.

Look at the factors that seem to point to an increased risk of failure and then model your solution to avoid those factors while still getting the job done. There isn't a bona fide reason that a 5K - 25K iterative loop process ought to fail...but failing it is. First priority when it comes to a fail is to make it so the process does not fail. How you do it is secondary. If you can dig and dig and dig and finally find the source of why the one big process fails, great. But it doesn't sound like anything obvious from what you've mentioned. Perhaps coming up with a new way to do it is what you need.

#11 Craig328

Posted 09 April 2010 - 11:20 AM

Your last post hit before I finished my last one.

Quote

I am sure that the loop is running through and completing. But while it is completing, it is running a second time. I can tell by the data going into the database. The records look like this:

Recipient 20000
recipient 20001
recipient 20002
recipient 20003
recipient 20004
recipient 1
recipient 20005
recipient 2
recipient 20006
recipient 3
...
recipient(last in list)
recipient (whatever position it is on in the second iteration)
recipient (continuation of the second iteration until it completes)

So, if the task was timing out, I'm not sure how the original loop is able to continue.


Yeah, that screams to me that you need to conduct the looping via records in the database table.

  • Get the records you're going to need to process.
  • Stick them into a table.
  • Go ahead and run your looping code but on each iteration, pull one record from that table.
  • Process that record and, if successfully processed, delete it from the table at the end of the loop iteration.


Doing it that way means you won't ever be processing the same record twice. You're only pulling from the database table and each loop iteration contains logic and a query to remove that record once processed. That should take care of the issue. Nice part is, if you have it set a "processing" flag when you pull it, you can have more than one process running against it simultaneously. Process A pulls record 1 and sets an "in use" flag. Process B pulls the next record that isn't "in use" and pulls record 2. Process C does the same and pulls record 3. Process A finishes and deletes record 1 and pulls the next record not in use which is record 4...and so on.

Might actually run faster that way to boot. It's more mechanical but it's also likely to be a lot less prone to failure and repeats.
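
A rough sketch of that claim-and-delete loop against MySQL 5 is below. Everything named here is hypothetical: an `email_queue` table with an integer `record_id` and a nullable `claimed_by` VARCHAR(35) column standing in for the "in use" flag, and `sendEmailDB` holding the datasource name. The claim is done with a single UPDATE so that two processes cannot take the same row.

```cfml
<!--- Unique token for this process; CreateUUID() is a built-in CF function --->
<cfset workerToken = CreateUUID()>
<cfset moreWork = true>

<cfloop condition="moreWork">
    <!--- Claim the next free record in one statement --->
    <cfquery datasource="#sendEmailDB#">
        UPDATE email_queue
        SET claimed_by = <cfqueryparam value="#workerToken#" cfsqltype="cf_sql_varchar">
        WHERE claimed_by IS NULL
        ORDER BY record_id
        LIMIT 1
    </cfquery>

    <!--- Read back the row this process just claimed, if any --->
    <cfquery name="mine" datasource="#sendEmailDB#">
        SELECT record_id
        FROM email_queue
        WHERE claimed_by = <cfqueryparam value="#workerToken#" cfsqltype="cf_sql_varchar">
        LIMIT 1
    </cfquery>

    <cfif mine.recordCount EQ 0>
        <!--- nothing left to claim --->
        <cfset moreWork = false>
    <cfelse>
        <!--- process the record identified by mine.record_id here --->

        <!--- remove it once it has been processed successfully --->
        <cfquery datasource="#sendEmailDB#">
            DELETE FROM email_queue
            WHERE record_id = <cfqueryparam value="#mine.record_id#" cfsqltype="cf_sql_integer">
        </cfquery>
    </cfif>
</cfloop>
```

Error handling is deliberately left out: if processing throws, the request ends and the row stays claimed, so a real version would need a way to release or flag such rows.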

This post has been edited by Craig328: 09 April 2010 - 11:22 AM


#12 bwcamy

Posted 09 April 2010 - 12:11 PM

Okay, I think you are completely right.

I really didn't want to have to rework the project, but given the circumstances, since there are no obvious explanations, I can either spend who knows how long chasing my tail trying to figure out this quirky problem, or I can just cut my losses and put a better solution in place.

I really think that writing the records to the table first and then flagging them as complete will be better on many levels anyway.

Thank you so much for your input and for helping me work through this! It is very much appreciated!

#13 Craig328

Posted 09 April 2010 - 12:20 PM

The only thing you'll want to watch now is the number of database transactions this will cause. For 25K records you'll have 50K transactions (1 pull, 1 delete per record). You can modify it to do one big pull and, if you autonumber the records when they go in, just loop over the query recordset, refer to the record ID all the way through the process, and still delete each record at the end of its iteration. So, for 25K records, you'd get 1 pull and 25K deletes. Nothing to panic about...just something to keep an eye on.
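
The single-pull variant could be sketched like this, again using a hypothetical `email_queue` table with an integer `record_id` column and `sendEmailDB` as the datasource name:

```cfml
<!--- One pull for the whole run --->
<cfquery name="todo" datasource="#sendEmailDB#">
    SELECT record_id
    FROM email_queue
    ORDER BY record_id
</cfquery>

<cfloop query="todo">
    <!--- process the record identified by todo.record_id here --->

    <!--- one delete per record, at the end of its iteration --->
    <cfquery datasource="#sendEmailDB#">
        DELETE FROM email_queue
        WHERE record_id = <cfqueryparam value="#todo.record_id#" cfsqltype="cf_sql_integer">
    </cfquery>
</cfloop>
```

Keep in mind that the recordset is a snapshot taken at the start, so a second instance launched midway would still pull every row not yet deleted. If overlapping runs remain a possibility, the per-record pull with an "in use" flag is the safer choice.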

Always trade-offs with web dev. :)

Good luck!