9 Replies - 751 Views - Last Post: 06 April 2016 - 01:02 PM

#1 kyle_denney  Icon User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 171
  • Joined: 10-August 12

How to optimize a query with multiple INNER JOINs and subqueries?

Posted 06 April 2016 - 11:22 AM

I am struggling to meet a clients need with SQL Server. The client expects certain test data results where a student can take a test and be ranked on that test. The issue comes when a student scores the same on two different tests. The client wants to pick just one option in those cases and select whatever the highest score is regardless. I have the query working as it was intended but it takes multiple subqueries and it is slow - it takes around 35 seconds to run on a 10GB SQL Server 2012 Windows Server.

I think the query can be optimized perhaps eliminating the subquery or the removal of the duplicates that are created as a result of the subquery but I am not a database administrator. The data itself gets narrowed down through a range of INNER JOINS across multiple tables because the results are displayed and grouped based on what school the student attends along with what the test was they took.

The @userid is an INT such as 27 The @gradyears is a comma-separated list of graduation years such as: '2015,2016,2017'

Here is the query in question:


                DECLARE @tempStudents TABLE(StudentIDs VARCHAR(MAX))
                DECLARE @temptabledata TABLE(TableName VARCHAR(500), CountOfStudentsInTable VARCHAR(500))
                DECLARE @countofstudents INT = 0
                DECLARE @tabletoquery VARCHAR(255)

                SET @tabletoquery = 'Career Cluster'

                INSERT INTO @tempTable
                SELECT c.clustername AS Clustername, d.district AS LocationName, cty.county AS County, cc.monsterid AS ClusterCount, d.IRN AS IRN, 0 AS GradYear
                FROM tblCareerCluster cc 
                INNER JOIN tblClusters c ON c.clusterid = cc.clusterid
                LEFT JOIN tblStudentPersonal sp ON sp.monsterid = cc.monsterid
                INNER JOIN tblStudentSchool ss ON ss.monsterid = cc.monsterid
                LEFT JOIN (

                SELECT cc.monsterid AS CountOfClusters, MIN(c.clustername) AS Clustername
                FROM tblCareerCluster cc 
                INNER JOIN tblClusters c ON c.clusterid = cc.clusterid
                LEFT JOIN tblStudentPersonal sp ON sp.monsterid = cc.monsterid
                INNER JOIN tblStudentSchool ss ON ss.monsterid = cc.monsterid
                INNER JOIN tblSchools s ON s.schoolid = ss.schoolid
                INNER JOIN tblSchoolDistricts sd ON sd.schoolid = s.schoolid
                INNER JOIN tblDistricts d ON d.districtid = sd.districtid
                INNER JOIN tblRegionUserRegionGroups rurg ON rurg.districtid = d.districtid
                INNER JOIN tblGroups g ON g.groupid = rurg.groupid
                INNER JOIN tblUserGroups ug ON ug.groupid = g.groupid
                WHERE cc.ranking = (SELECT DISTINCT  TOP 1 cc2.ranking AS Ranking
                    FROM tblCareerCluster cc2 
                    WHERE cc2.monsterid = cc.monsterid  
                    ORDER BY cc2.ranking DESC)   
                AND ss.graduationyear IN (SELECT Items FROM FN_Split(@gradyears, ',')) AND sp.optin = 'Yes'
                AND g.groupname = @groupname AND ug.userid = @userid            
                GROUP BY cc.monsterid

                ) a ON a.CountOfClusters = cc.monsterid
                INNER JOIN tblSchools s ON s.schoolid = ss.schoolid
                INNER JOIN tblSchoolDistricts sd ON sd.schoolid = s.schoolid
                INNER JOIN tblDistricts d ON d.districtid = sd.districtid
                INNER JOIN tblCounties cty ON cty.countyid = ss.countyid
                INNER JOIN tblRegionUserRegionGroups rurg ON rurg.districtid = d.districtid
                INNER JOIN tblGroups g ON g.groupid = rurg.groupid
                INNER JOIN tblUserGroups ug ON ug.groupid = g.groupid
                WHERE cc.ranking = (SELECT DISTINCT  TOP 1 cc2.ranking AS Ranking
                    FROM tblCareerCluster cc2 
                    WHERE cc2.monsterid = cc.monsterid  
                    ORDER BY cc2.ranking DESC)
                AND ss.graduationyear IN (SELECT Items FROM FN_Split(@gradyears, ',')) AND sp.optin = 'Yes'
                AND g.groupname = @groupname AND ug.userid = @userid
                GROUP BY cc.ranking, c.clustername, d.district, cty.county, cc.monsterid, cc.clusterid, s.IRN, d.IRN, d.districtid  

                ;WITH TempEmpDetails (ClusterCount, duplicateRecordCount)
                AS
                (
                    SELECT ClusterCount,ROW_NUMBER() OVER(PARTITION BY ClusterCount ORDER BY ClusterCount)
                    AS duplicateRecordCount  FROM @tempTable
                )
                DELETE FROM TempEmpDetails WHERE duplicateRecordCount > 1   

                SET @count = (SELECT Count(ClusterCount) AS ClusterCount FROM @tempTable)

                INSERT INTO @temptabledata (TableName, CountOfStudentsInTable) VALUES (@tabletoquery, @count)



Is This A Good Question/Topic? 0
  • +

Replies To: How to optimize a query with multiple INNER JOINs and subqueries?

#2 modi123_1  Icon User is online

  • Suitor #2
  • member icon



Reputation: 14039
  • View blog
  • Posts: 56,179
  • Joined: 12-June 08

Re: How to optimize a query with multiple INNER JOINs and subqueries?

Posted 06 April 2016 - 11:40 AM

Quote

I have the query working as it was intended but it takes multiple subqueries and it is slow - it takes around 35 seconds to run on a 10GB

Why would this be ran each time against the entire 10gb of data?

Why are you using so many inner joins?

Have you ran this with statistics turned on? Reviewed your execution plan for bottle necks?

set statistics profile on

Folks really are going to have a hard time since the majority of the table definitions are not known nor any of the data..

http://www.codeproje..._JOINS_orig.jpg
Was This Post Helpful? 0
  • +
  • -

#3 kyle_denney  Icon User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 171
  • Joined: 10-August 12

Re: How to optimize a query with multiple INNER JOINs and subqueries?

Posted 06 April 2016 - 11:49 AM

I apologize for the confusion. The 10 GB is the size of the RAM for the database not the size of the database itself. The amount of records is around 3.5 million.

The indexes are basically as follows:

tblStudentPersonal - PK: monsterid
tblCareerCluster - PK: clusterid, FK: monsterid
tblStudentSchool - FK: schoolid, monsterid

The other major tables that are in the joins the id fields that are being joined upon are the indexes for those tables respectively.

Thanks for the feedback.
Was This Post Helpful? 0
  • +
  • -

#4 kyle_denney  Icon User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 171
  • Joined: 10-August 12

Re: How to optimize a query with multiple INNER JOINs and subqueries?

Posted 06 April 2016 - 12:02 PM

As to why I am using so many inner joins - it is because the data the client wants to display has so many constraints on it. For instance, the entire structure is built around the idea that you can pick a subset of students who took tests who graduate only on certain years, who only go to certain schools within certain districts and who are opted into the reporting system. Those groups of students are predefined from the teachers in the system who pick the grouping or in some cases the system builds pre-defined groups as needed and that is why the joins on groups are there as well.

Is there a way to alleviate them but still have the constraints?
Was This Post Helpful? 0
  • +
  • -

#5 modi123_1  Icon User is online

  • Suitor #2
  • member icon



Reputation: 14039
  • View blog
  • Posts: 56,179
  • Joined: 12-June 08

Re: How to optimize a query with multiple INNER JOINs and subqueries?

Posted 06 April 2016 - 12:07 PM

Then start looking at the execution plan and the statistics profile.

All those 'constraints' sound like WHERE clause limitations and not INNER JOIN material.
Was This Post Helpful? 0
  • +
  • -

#6 kyle_denney  Icon User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 171
  • Joined: 10-August 12

Re: How to optimize a query with multiple INNER JOINs and subqueries?

Posted 06 April 2016 - 12:16 PM

View Postmodi123_1, on 06 April 2016 - 12:07 PM, said:

Then start looking at the execution plan and the statistics profile.

All those 'constraints' sound like WHERE clause limitations and not INNER JOIN material.


Is there a way to retrieve the information for the 'WHERE clause limitations' without using INNER JOINs?

I thought INNER JOINs were faster than doing subqueries?

Thanks!
Was This Post Helpful? 0
  • +
  • -

#7 modi123_1  Icon User is online

  • Suitor #2
  • member icon



Reputation: 14039
  • View blog
  • Posts: 56,179
  • Joined: 12-June 08

Re: How to optimize a query with multiple INNER JOINs and subqueries?

Posted 06 April 2016 - 12:17 PM

There wouldn't be a need for subqueries, as far as I can tell. Basic joins and a WHERE clause with conditions.
Was This Post Helpful? 0
  • +
  • -

#8 kyle_denney  Icon User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 171
  • Joined: 10-August 12

Re: How to optimize a query with multiple INNER JOINs and subqueries?

Posted 06 April 2016 - 12:24 PM

Would you be able to create a small working example with the code given? If you need more information to make that happen let me know. I am having a hard time envisioning how that would work.
Was This Post Helpful? 0
  • +
  • -

#9 modi123_1  Icon User is online

  • Suitor #2
  • member icon



Reputation: 14039
  • View blog
  • Posts: 56,179
  • Joined: 12-June 08

Re: How to optimize a query with multiple INNER JOINs and subqueries?

Posted 06 April 2016 - 12:27 PM

No, not really. Again - I am not going to burn a whole mess of my time recreating tables off what is there, guess at values, and then rework your query.
Was This Post Helpful? 0
  • +
  • -

#10 dag72  Icon User is offline

  • D.I.C Head

Reputation: 1
  • View blog
  • Posts: 205
  • Joined: 02-March 11

Re: How to optimize a query with multiple INNER JOINs and subqueries?

Posted 06 April 2016 - 01:02 PM

You may also want to consider using the 'Inner Hash Join', instead of so many Inner Join. The Hash can efficiently process large unsorted, non-index inputs, useful for intermediate results in complex queries and reduce the need for demoralisation.

This post has been edited by dag72: 06 April 2016 - 01:14 PM

Was This Post Helpful? 0
  • +
  • -

Page 1 of 1