3 Replies - 1066 Views - Last Post: 08 April 2013 - 11:12 AM Rate Topic: -----

#1 techgadgeteer  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 2
  • Joined: 08-April 13

Large Query Problem

Posted 08 April 2013 - 09:00 AM

I am having problems trying to re-create the query below so that I do not have to load 30,000 items into memory with the first query (name="hide") just to check against it. I have tried multiple ways to do this and this was the only solution that works. The reason I need to find a better solution is because it is taking a toll on the server - between the CPU, memory, MySQL, JRUN, etc. I have cached the first query using "cachedwithin="0,0,15,0" which helps but is more of a band-aid.

The real issue seems to be that I am trying to match two different fields from the "hide" table, making the query below it quite cumbersome (even more than it is). If anyone has some suggestions I'd greatly appreciate any help.



<cfquery name="hide" datasource="products" cahc>
SELECT MFGPart, UPC
from Hide_Table
</cfquery>


<cfquery name="get_products" datasource="products">
Select distinct
products_table.MFGPart, 
products_table.MFG_Name, 
products_table.retail_price, 
products_table.Product_title, 
products_table.unique_prodid,
Item_Override.call_for_price,
Item_Override.LongDescription_Add, Item_Override.Discount_In_Cart,  Item_Override.Discount_In_Email, 
Item_Override.Frontpage,  Item_Override.description as override_Product_title, Item_Override.price as override_price,
Item_Override.memberprice as override_memberprice, Item_Override.longdescription_add as override_longdescription_add, Item_Override.Preorder as override_preorder, 
Item_Inventory.weight, menu_nav.classID, menu_nav.CatSubID, menu_nav.CatMainID AS categoryid, class.class, sub_category.category as category, 
main_category.maincategory, IF(Item_Override.price > 0,Item_Override.price, IF(products_table.MAP > 0, products_table.map,
IF(products_table.retail_price > 0, products_table.retail_price, 0))) as sell_price,
IF(Item_Inventory.QTY='In',5,IF(Item_Inventory.QTY='Out',0,Item_Inventory.whse1 + Item_Inventory.whse2 + Item_Inventory.whse3 + Item_Inventory.whse4 + Item_Inventory.whse5 + Item_Inventory.whse6)) as Total_qty,
IF(products_table.upc is NULL, products_table.MFGPart, products_table.upc) as retailer_sku
FROM products_table
LEFT JOIN menu_nav ON products_table.unique_prodid = menu_nav.MFGPart
LEFT JOIN Item_Override on products_table.unique_prodid = Item_Override.MFGPart
LEFT JOIN Item_Inventory on products_table.unique_prodid = Item_Inventory.MFGPart
LEFT JOIN main_category ON menu_nav.catMainID = main_category.CatMainID
LEFT JOIN sub_category ON menu_nav.catsubid = sub_category.CatSubID
LEFT JOIN class on menu_nav.classid = class.classid
Where products_table.hide is NULL
AND ((store_exclusive is NULL) or (store_exclusive = 1))
AND products_table.MFGPart = '#MFGPart#'
AND (products_table.upc not IN (#QuotedValuelist(noshow.storesku)#) and products_table.unique_prodid not IN (#QuotedValuelist(noshow.productid)#))
AND products_table.MFG_Name = '#MFG_Name#'
order by retail_price, MFG_Name, productname;
</cfquery> 




I have also tried this suggestion from another post on DevShed, but unfortunately it would hang when I tried to run it.

SELECT products.customID, products.upc
  FROM products
LEFT OUTER
  JOIN hide
    ON hide.customID = products.customID
 WHERE hide.customID IS NULL
UNION
SELECT products.customID, products.upc
  FROM products
LEFT OUTER
  JOIN hide
    ON hide.storesku = products.upc
 WHERE hide.storesku IS NULL



Is This A Good Question/Topic? 0
  • +

Replies To: Large Query Problem

#2 Craig328  Icon User is offline

  • I make this look good
  • member icon

Reputation: 1926
  • View blog
  • Posts: 3,471
  • Joined: 13-January 08

Re: Large Query Problem

Posted 08 April 2013 - 09:36 AM

Welcome to DIC techgadgeteer!

I've looked at your code and I'm not sure we have enough to go on here to help you solve your problem.

You said that your first query returns 30K records, right? You're bringing back MFGpart and UPC as the two sole fields from that query and then, from what you've posted, you're trying to doing a second query using (I guess) the MFGpart value from the first query? I'm having to guess on this because, the way your code is now, unless you declared a MFGpart variable elsewhere in this request, your second query will error as there isn't a MFGpart variable for you to reference at line 32 of your second query (or any of the other variables your second query references for that matter). I'm guessing you're looping over your first query's result set and doing the second query within that loop? Like I said, can't tell from the limited amount of code you have here.

If that's what you're doing, then yeah, if the first query is returning 30K rows and you're doing that big second query inside a 30K loop (meaning you're trying to execute that second query 30K times), it's no small wonder you're killing your database. For the next part, I'm going to go ahead and assume this is what you're doing. If you're not doing that then my next text could well be worthless to you.

So, if you're needing the MFGPart values from your hide query as filter values for the where statement of your second query, there are three ways (maybe more) to go about doing this. Your primary issue is that you're running one big ass query 30K+ times. That has to go. One solution is to build a quoted value list from the MFGpart column that is returned by your hide query and then use that list in a single query call in your second query. In that case, if you don't need the UPC value you're bringing back from that query, I'd suggest you add a DISTINCT and pull just the MFGpart but it's not mandatory that you do so. After that query you'll want to build a list using CF's QuotedValueList() function like so:

<cfset variables.MFGpartList = QuotedValueList(hide.MFGpart,",")>


...and then use that new list at line 32 in your second query, like so:

AND products_table.MFGPart IN (#PreserveSingleQuotes(variables.MFGpartList)#)


You'll want to look at the PreserveSingleQuotes() function to familiarize yourself with its operation but basically, this solution is what I like to call more "mechanical" in that you go step by step and build something. In this case, you're replacing your 30K loop with a single query doing the WHERE statement filter using a list of the MFGpart values from the first query. This may or may not be appropriate per your data and the other variables you're using in the query though.

The second solution is to scrap the first query altogether and incorporate it as a WHERE statement subquery in the second query. Again, this requires a knowledge of the data you're working with as to whether this solution is appropriate. That solution would look something like this:

AND products_table.MFGPart IN (SELECT DISTINCT MFGPart FROM Hide_Table)


That's basically doing though what the first solution does but it does mandate the distinct value and assume that you didn't need the UPC value from Hide_Table that you're pulling during your first query.

The third solution is to add a JOIN to Hide_Table in your second query. That would look like this:

LEFT JOIN Hide_Table ON products_table.MFGPart = Hide_Table.MFGpart


...and you'd lose line 32 of your WHERE statement entirely. This has the benefit of doing the filtering via a JOIN and not via a mechanically included list of items. Your database SHOULD run this query faster than it would with a comma delimited list of quoted values but it's not guaranteed.

The concern I have is that your second query seems like it was written by someone who knows something about optimizing queries for best return and the solutions I've suggested (especially the last one) are ones that would already be there if they were feasible and possible. It may be that they're neither feasible or appropriate so that's why the query was written without them. It may also be that you're not the guy who wrote the second query originally and may lack the SQL skills to do so (not a shame as many CF devs aren't equally adept with SQL).

Anyway, give those solutions a spin and see how they do for you. As I said, my post was predicated on there being a missing CFLOOP there. If there isn't of if the situation is different than what you described and I assumed, feel free to post back here with more info.

Good luck!
Was This Post Helpful? 2
  • +
  • -

#3 techgadgeteer  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 2
  • Joined: 08-April 13

Re: Large Query Problem

Posted 08 April 2013 - 10:47 AM

Hi Craig,

Thank you for the great (and fast) response. I did screw things up in the original post when I was changing the variable names for the sake of this posting. The second query had the incorrect field names in the quoted value lists. It should have read as follows:

AND (products_table.upc not IN (#QuotedValuelist(noshow.upc)#) and products_table.unique_prodid not IN (#QuotedValuelist(noshow.MFGPart)#))




I do not run a loop from the results of the first query. Instead I compare the values against a quotedvaluelist within the second query for both fields to see if either of them are a match. I think this provides the same benefits of doing the CFSET statement without having that extra step. My biggest issue is that I am trying to find a match of either of those two fields. Originally I only had to match one field which made things simple (join statement), but the retailer doesn't always have the correct MFGpart number and at other times they don't always use the UPC code - so I had to redesign the queries to match EITHER of the fields. So basically, the second query only brings back a match if the item being passed to the page ISN'T found in the "hide" table - based on either the MFGPart or the UPC.

It sounds like I'm doing things the 'right' way (if there is such a thing) for the intended purpose, but I am hoping I may have overlooked something that might save me from having to have that first query.

Thank you again for your help.

Ron

also wanted to mention. I was the one who wrote these queries, so you might then think I know what I'm doing - and maybe at times I do, but it is because of the amazing help I've had over the years from experts on forums like these that have really helped me so much. So, thank you to everyone that shares their knowledge on forums.
Was This Post Helpful? 0
  • +
  • -

#4 Craig328  Icon User is offline

  • I make this look good
  • member icon

Reputation: 1926
  • View blog
  • Posts: 3,471
  • Joined: 13-January 08

Re: Large Query Problem

Posted 08 April 2013 - 11:12 AM

So, you're only running that second query one time? And that's what's crushing your server? You're saying that because the list you're pushing into the query has 30K items in it that that's what you suspect is killing the query?

So, this is actually more a SQL question than anything else. Tell me, what CF server and version and what database product are you using? Also, if you can post a copy of the actual SQL that's getting executed (you can truncate some of the verbatim quoted value list items if you like, for bevity) from something like a CFDUMP request of the returned recordset, I might could suggest a few things you can try to fine tune your query. In short terms, what you want to do is get the SQL that CF is submitting to the database and then post that SQL into some kind of database IDE like SQL Server Mgmt Studio (if using MS SQL) or TOAD or Navicat (if using MySQL) etc and run the query there.

Once you get the beast into a query analyzer like that, you can fiddle with it to see if some of the joins are bogging your query and that might could lead to clues as to what you could do to tweak the query to speed it up.
Was This Post Helpful? 1
  • +
  • -

Page 1 of 1