5 Replies - 290 Views - Last Post: 27 May 2019 - 05:20 PM Rate Topic: -----

#1 masterori   User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 75
  • Joined: 27-January 14

Best way to speed up joins across multiple table relationships?

Posted 26 May 2019 - 05:05 PM

Hi,

Let's say I have some models like so...

class School():
    id = Column(..., primary_key=True)
    ...

    teachers = Relationship(..., load='selectin')  # many-to-many as could teach at multiple schools
    students = Relationship(..., load='selectin')  # many-to-many as could go to multiple schools
    ... # more many-to-many

class Student():
    id = ...
    name = ...
    ...

    # one-to-many relationships here 

class Teacher():
    id = ...
    name = ...
    ...

    course = Relationship(..., load='selectin') # one-to-many relationships here 


Now, I would like to get all students and teachers that goes/teaches at a particular school. Using raw sql then it's pretty fast, but if I use an ORM (e.g SQLAlchemy), then it takes about 30 seconds for say ~10,000 rows. This time stays about the same even if I execute using raw sql through the ORM.

e.g
# something like
raw_sql_str = 'select t.name as teacher_name, t.attr1 as teacher_attr1, t.attr2 as teacher_attr2, ... , s.name as student_name, ... from school sch join school_teacher on sch.school_id = school_teacher.school_id join teacher as t on t.id = school_teacher.teacher_id join school_student on sch.school_id = school_student.school_id join student as s on s.id = school_student.student_id ...'

db.session.execute(raw_sql_str)



I was playing around with the different loading strategies, and selectin is definitely faster than subquery. Now I am wondering, let's say after I get all data, I want to write to a file. So this will take a few seconds as well, and totaling up to about ~1 min for a 1.5 MB file. Is this slow for a file that size?

How can I speed up these joins? What's the appropriate combination of eager vs lazying loading to use here? Since I need to get all data to create that file, is eager the way to go for all relationship loading? I remember reading from SO that lazy is used on the "collection side" of a one-to-many, then should I lazy load the "course" relationship?

Hope my questions are not too confusing! I'm still trying to improve my understanding of sql and databases

Is This A Good Question/Topic? 0
  • +

Replies To: Best way to speed up joins across multiple table relationships?

#2 modi123_1   User is online

  • Suitor #2
  • member icon



Reputation: 15226
  • View blog
  • Posts: 60,945
  • Joined: 12-June 08

Re: Best way to speed up joins across multiple table relationships?

Posted 26 May 2019 - 07:29 PM

What's your table structure like?

At worst it should have:

One table for the school info and key.
One table for teacher info and key
One table for relating teacher to school.
One table for student info and key, and school key.

At worst each call - because calling info for teachers and students should be two calls - would be a join from the relational table to the teacher table or just the student table... all looking for a given school id.
Was This Post Helpful? 0
  • +
  • -

#3 masterori   User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 75
  • Joined: 27-January 14

Re: Best way to speed up joins across multiple table relationships?

Posted 27 May 2019 - 10:29 AM

View Postmodi123_1, on 26 May 2019 - 07:29 PM, said:

What's your table structure like?

At worst it should have:

One table for the school info and key.
One table for teacher info and key
One table for relating teacher to school.
One table for student info and key, and school key.

At worst each call - because calling info for teachers and students should be two calls - would be a join from the relational table to the teacher table or just the student table... all looking for a given school id.

The tables would be like so...

school (m) -- (m) teacher
school (m) -- (m) student
school (m) -- (m) other (other related table)
student (1) -- (1) A (some related table)
teacher (1) -- (1) B (some related table)
other1 (1) -- (1) Child (some related table)

so tables...
- school
- teacher
- student
- other
- school_student
- school_teacher
- school_other
- student_A_table
- teacher_B_table
- other_child_table



Due to how ORM works, when I load teacher, student and other, then their child relationships also get loaded (either lazy or eager). I can avoid this by telling the ORM to not load those, or go with the raw sql string and execute like that. There would be 6 joins (correct?)

select
    school.id as school_id,
    t.name as teacher_name,
    stu.name as student_name,
    ...
from
    school
    join school_teacher on school.id = school_teacher.school_id
    join teacher as t on t.id = school_teacher.teacher_id
    join school_student on school.id = school_student.school_id
    join student as stu on stu.id = school_student.student_id
    join school_other ...
    join other ...
where school.id = <num>
    and student_A_table.some_id = teacher_B_table.some_id
    and ...


Let's say student table is the largest with ~20,000 rows, and the query with filters and all returns about ~10,000 rows. This takes about 30 sec. Is that too slow for 10,000 rows? Can I make it faster any other way?
Was This Post Helpful? 0
  • +
  • -

#4 modi123_1   User is online

  • Suitor #2
  • member icon



Reputation: 15226
  • View blog
  • Posts: 60,945
  • Joined: 12-June 08

Re: Best way to speed up joins across multiple table relationships?

Posted 27 May 2019 - 10:45 AM

Yes, that is long.

You should not be joining on students. Again, joining students and teachers are not the right way to go.
Was This Post Helpful? 0
  • +
  • -

#5 masterori   User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 75
  • Joined: 27-January 14

Re: Best way to speed up joins across multiple table relationships?

Posted 27 May 2019 - 04:30 PM

I'm not too sure what you mean. Are you saying I shouldn't make the join for students and teacher as one query?
The table structure is like this

school <--> school_teacher <--> teacher
 - id       - school_id        - id
 - ...      - teacher_id       - ...

school <--> school_student <--> student
 - id       - school_id        - id
 - ...      - student_id       - ...

and so on...


This post has been edited by masterori: 27 May 2019 - 04:34 PM

Was This Post Helpful? 0
  • +
  • -

#6 modi123_1   User is online

  • Suitor #2
  • member icon



Reputation: 15226
  • View blog
  • Posts: 60,945
  • Joined: 12-June 08

Re: Best way to speed up joins across multiple table relationships?

Posted 27 May 2019 - 05:20 PM

Correct.

Also, you dont need a join to the school table. Doesn't seem to be a reason for that since then school id is in the relationship table.
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1