Hey, I'm currently working on an assignment for my data structures course and unfortunately the book is a bit light on the details, and while I think my solution isn't bad there may be something I didn't notice. Also by solution I mean the general idea/algorithm there's no actual code. Also I don't necessarily want the solution I'd just like to know if my solution hits the best O-complexity for it or not, but I don't mind a few hints if there's a better approach.

Here's the problem description:

There are two sorted lists of comparable items. the individual lists themselves have no duplicates, but their intersections may or may not be empty. Determine if their intersection is empty and compute the symmetric difference.

*important details this is will be in java, but I won't be using any java api that does this stuff for me since that would just defeat the purpose of the assignment. Additionally while I may be using java this isn't about learning java but understanding the data structures. Lastly the semester started a few weeks ago we've only learned some algorithm analysis, and a little about lists, stacks, linked lists, and very little about tree's. So there may or may not be some superior data structure to use. I'm just using what I know

Okay, here's my solution to the problem:

First check that the lists aren't empty, then check if the upper and lower bounds of the two lists have some overlap. for example if set A1 has a lower bound which is greater than the upper bound of list A2 the intersection is empty and the symmetric difference is the union of both lists. Also if either of the extremes equal (lower bound of A1 = upper bound A2) then that's the intersection.

next assuming the lists had such an overlap using whichever list has more elements(less comparisons I think), create a Binary Search Tree. Now using the other list with less elements (assuming the number of elements weren't equal) start adding it to the Binary Search Tree. If an element already exists inside the tree then it's not part of the symmetric difference so remove it from the tree. Once this is done the tree will have the symmetric difference of the two sets.

I had a few more notes and ideas but in general this is it. Also correct me if I'm wrong (this is one of the parts where the book seems to be lacking), but I think the worst case for building a binary search tree is O(N) as if it was a linked list. With the average case being O(N log N) run time. Here's the part where I'd like some clarity After I build the tree from the first list and I begin to compare/add the values of the second list, this would still be O(N log N) right? because the program would be doing a O(log N) operations up to N times for all the elements in the list. Is this right? Also sorry if this is the wrong sub-thread, I wasn't sure where the best place to post this was.

## 8 Replies - 832 Views - Last Post: 07 October 2016 - 03:03 PM

### #1

# Is this an efficient solution? Or is there a better approach?

Posted 01 October 2016 - 07:38 AM

##
**Replies To:** Is this an efficient solution? Or is there a better approach?

### #2

## Re: Is this an efficient solution? Or is there a better approach?

Posted 01 October 2016 - 08:30 AM

jon93c, on 01 October 2016 - 04:38 PM, said:

I think the worst case for building a binary search tree is O(N) as if it was a linked list. With the average case being O(N log N) run time.

O(N) is less than O(N log N), so it can't be your worst case. Depending on how you build your BST, it might be your

*best*case (if you keep a pointer to the max node, allowing O(1) inserts at the end and you never balance). For an unbalanced BST, the worst case will O(N^2) and the average will be O(N log N). For a balanced BST best, average and worst case will all be O(N log N).

Quote

Here's the part where I'd like some clarity After I build the tree from the first list and I begin to compare/add the values of the second list, this would still be O(N log N) right?

If your BST is not balanced, the worst case lookup is O(N), so your whole algorithm will be O(N^2) (even if we ignore the cost of building the BST). If it is balanced, it will be O(N log N), yes.

There is an O(N) solution to this problem though (which does not involve creating any auxiliary data structures), so yours is not the most efficient one.

### #3

## Re: Is this an efficient solution? Or is there a better approach?

Posted 01 October 2016 - 08:37 AM

While this sounds like it works, I would consider a simpler approach. Suppose you have two lists, A and B, and you're looking at A[0] and B[0]. These two elements are either equal, or they are not. If they are equal, they are part of the intersection. If they are not, then one is less than the other, let us suppose for the moment that it is A[0]. What do we now know about the elements of B vis a vis A[0], and what do we know about A[0]?

### #4

## Re: Is this an efficient solution? Or is there a better approach?

Posted 01 October 2016 - 09:04 AM

Here is a big hint: look into merge sort and see how it works. Take what you learn from it and apply it to your problem.

### #5

## Re: Is this an efficient solution? Or is there a better approach?

Posted 01 October 2016 - 12:21 PM

Yea I meant to specify that the tree would be balanced, but I thought of a better solution and I'm using that one instead. Thanks for the help guys!

This post has been edited by **jon93c**: 01 October 2016 - 12:21 PM

### #6

## Re: Is this an efficient solution? Or is there a better approach?

Posted 02 October 2016 - 11:30 AM

Quote

First check that the lists aren't empty, then check if the upper and lower bounds of the two lists have some overlap. for example if set A1 has a lower bound which is greater than the upper bound of list A2 the intersection is empty and the symmetric difference is the union of both lists. Also if either of the extremes equal (lower bound of A1 = upper bound A2) then that's the intersection.

This bit won't change your worst case so in that sense, it doesn't matter. Where it might be important is when you are processing lots of pairs of lists, and many of them could be empty or non-overlapping.

If this is about using the Java API then start thinking about what data structures you could use. Any algorithm I can think of involves scanning through one set and seeing if the item is in the other set. The "merge sort like" algorithm hinted at above sounds particularly efficient. However, you can get the same time complexity just by picking the right collection to put the second set into.

Edit: to be clear, it's not a tree. Trees tend to give log(N) time for a lookup or N.log(N) for N lookups. You can do better than that.

This post has been edited by **cfoley**: 02 October 2016 - 11:31 AM

### #7

## Re: Is this an efficient solution? Or is there a better approach?

Posted 06 October 2016 - 01:17 PM

Sorry for reviving a thread that might be a little old, but I'd like to know if the route I ended up taking was the best method or if there was something better.

first ensure the two lists have some common ground and not that one of the lists is completely greater than the other list.

compare the first element in list1 to he first element in list2 then take the one that is smaller and add it to the symmetricDifference for example lets say list1 (element 1) is greater than list 2 (element 1), then I would compare it to the second element of list 2 and so on. Eventually an element in list2 might be greater than the first element in list1 then the first element in list1 gets added to symmetricDifference and the index for list1 increases and starts comparing the next element in list1. It continues to do this until one of the lists runs out of elements and at this point all of the elements in the list that still has elements remaining are in the symmetricDifference. If at any point the two lists have the same element this element is not added to the list, and the indexes of both lists are increased by one. Also the boolean listIntersectionIsNull is set to false since the intersection of the lists is not empty. this is at worst case O(N) which would be if both lists were completely different and the symetricDifference was the union of both lists.

In retrospect maybe instead of adding the remaining elements after one of the lists has no next element to the symmetricDifference I could of just referenced that list if there was some call to an element of an index greater than the max index of the list symmetricDifference. This would let the program avoid adding all the remaining elements to the symmetricDifference

Is there anything I missed? Is this a good solution to the problem or is there a better one?

first ensure the two lists have some common ground and not that one of the lists is completely greater than the other list.

compare the first element in list1 to he first element in list2 then take the one that is smaller and add it to the symmetricDifference for example lets say list1 (element 1) is greater than list 2 (element 1), then I would compare it to the second element of list 2 and so on. Eventually an element in list2 might be greater than the first element in list1 then the first element in list1 gets added to symmetricDifference and the index for list1 increases and starts comparing the next element in list1. It continues to do this until one of the lists runs out of elements and at this point all of the elements in the list that still has elements remaining are in the symmetricDifference. If at any point the two lists have the same element this element is not added to the list, and the indexes of both lists are increased by one. Also the boolean listIntersectionIsNull is set to false since the intersection of the lists is not empty. this is at worst case O(N) which would be if both lists were completely different and the symetricDifference was the union of both lists.

In retrospect maybe instead of adding the remaining elements after one of the lists has no next element to the symmetricDifference I could of just referenced that list if there was some call to an element of an index greater than the max index of the list symmetricDifference. This would let the program avoid adding all the remaining elements to the symmetricDifference

Is there anything I missed? Is this a good solution to the problem or is there a better one?

### #8

## Re: Is this an efficient solution? Or is there a better approach?

Posted 06 October 2016 - 01:58 PM

Sounds like you've got the right idea. The description is a little hard to read, so there might be a flaw hidden in there that I'm not seeing, but the overall concept of walking down a pair of sorted lists is what we were hinting at.

### #9

## Re: Is this an efficient solution? Or is there a better approach?

Posted 07 October 2016 - 03:03 PM

Are the lists stored as doubly linked lists that you implement therefore have access to the prev/next references, or will you be forced to use the Java lists?

Page 1 of 1