Page 1 of 1

Python Singly Linked Lists

#1 atraub  Icon User is offline

  • Pythoneer
  • member icon

Reputation: 759
  • View blog
  • Posts: 2,010
  • Joined: 23-December 08

Posted 22 January 2011 - 10:56 PM

*
POPULAR

This is a step by step guide to the process of creating a data structure in Python; specifically a linked list. Skip to the bottom to see the code in its' completed form.

I started working on a custom data structure recently. This structure would require some sort of underlying list or queue to be used. My goal was was to optimize insertion, pop, and peek to all be O(1). Unfortunately, the Python List and the Python Queue fail to fit the bill. While I bet Python has something in one of its libraries, what self respecting programmer can resist the urge to make something from scratch.

To fulfill the need, I didn't want to use any builtin Lists. Why? Because Lists allocate a certain amount of memory beneath the scenes. When you add in a value that won't fit, a nasty process of reallocating memory occurs. Since this is my first made-from-scratch Python data structure, I'm going to use a seemingly easy one to implement, and then discuss optimizing it in a later tutorial.

The structure that we'll be using is called a Singly Linked list. This implementation will be accomplished by creating Node objects. These Nodes will be simple. They will contain an element, and a link to the next node in line. With multiple Nodes, the structure will look something like this:
[a]->-[b]->-[c]->-[d]->-[e]

With a few accessors and mutators, we can create a simple Node class. I also create a __str__ method to simplify printing. All-together, here's what I have so far:
class LinkedList:
    '''Linked List'''
    class __Node:
        '''private node object.  Node's consist of an element and a pointer to the next Node'''
        def __init__(self, element=None):
            self.element = element
            self.next = None

        def __str__(self):
            '''string representation of Node'''
            return str(self.element)

        def hasNext(self):
            '''returns true if the Node points to another Node'''
            return self.next != None

        def getNext(self):
            '''get next Node'''
            return self.next

        def getElement(self):
            '''get the element of the currentNode'''
            return self.element

        def setNext(self,nextItem):
            '''set the next Node of the currentNode'''
            self.next = nextItem

        def setElement(self, element):
            '''set the element of the current Node'''
            self.element = element


As you can see, I opted to make Node a private class inside of Linked List. Why Private? There's no reason the user should ever see 'Node' objects. The user does not need to know they exist! To make something private, just add two underscores to the front of it. This is an important feature of Python; we'll want to remember it.

Next on our list is initiating a Linked List. My approach is to always have an empty Node at the front, this Node is known as a sentinel. This ensures me that I never have to worry about some of the complications that an empty data structure could encounter. One other thought. How will we keep track of the length of this beast? While we could iterate through the list every time we need the size O(n), I think keeping a length attribute is the smarter decision O(1).
    def __init__(self):
        '''Creates the LinkedList'''
        self.first = LinkedList.__Node()
        self.length = 0


So far, so good. While you might be tempted to go straight into writing the append code, I personally want to take care of some more house keeping first. To use python's len keyowrd, the class needs a function called __len__.
    def __len__(self):
        '''returns the length of the list O(1)'''
        return self.length

    def isEmpty(self):
        '''returns true if the list is empty'''
        return self.length == 0


Not much glam, but this is where things get interesting. Given the nature of the Singly Linked list, adding an element to the front of it should be really easy. All we need to do is create a new Node with an element in it. The new Node should point to whatever self.front is pointing to, and the self.front needs to point at the new node. Also, we'll need to increment length.
def addToFront(self,element):
        '''Optimized method to add an element to the front of the list O(1)'''
        toAdd = LinkedList.__Node(element)
        toAdd.setNext(self.first.getNext())
        self.first.setNext(toAdd)
        self.length += 1


At this point, I hope you're thinking to yourself, "this Linked List is awesome!" And why shouldn't you? So far, it can mimic pushing onto a stack with O(1) goodness. We could even implement a pop front very easily too:
    def popFront(self):
        '''Optimized method to add pop front element from list O(1)'''
        if self.length > 0:
            val = self.first.getNext().getElement()
            self.first.setNext(self.first.getNext().getNext())
            self.length -= 1
            return val
        else:
            raise IndexError("pop from an empty List")


Aside from peeking, our linked list already fulfills all the requirements of a stack. As a practice exercise, you could implement your own peek algorithm. However, our goal wasn't to make a stack, it was to make a linked list. A more complicated feature would be appending. If we're adding to the end, all we have to do is keep looking at the next Node if there's a node following it. Notice this is an O(n) algorithm, we have to look at EVERY node in our list in order to append. Unfortunately, with the current specs of our Linked List, this is inevitable.
    def append(self, element):
        '''Add element to last position of the list'''
        currentNode = self.first
        while currentNode.hasNext():
            currentNode = currentNode.getNext()
        currentNode.setNext(LinkedList.__Node(element))
        self.length+=1

There is a certain beauty in the simplicity of this code. This is where the simplicity starts to fade a bit. Now it's time to implement pop(). To pop an element, we must find it in the list. we must then make the previous element point to the element following the one we want to remove. We'll need to decrement length of course, and we want to return the value we popped. Once again, this method seems to grow at a rate of O(n).
    def pop(self, index =None):
        '''Removes and returns an element in the list. last element by default.'''
        if len(self) == 0:
            raise IndexError("pop from empty list")
        
        #utilize default parameter    
        if index ==None: 
            index = len(self)-1
        if type(index) != int:
            raise TypeError("Index must be an integer or a slice not a "+str(type(index)).split("'")[1])

        #If the index is out of bounds
        if index >= self.length or index < 0:
            raise IndexError("Index out of bounds")

        previous = self.first
        for i in range(index):
            previous = currentNode.getNext()

        toRemove = previous.getNext()
        afterNext = None
        if toRemove.hasNext():
            afterNext = toRemove.getNext()
        previous.setNext(afterNext)
        self.length-=1
        return toRemove.getElement()



The insert algorithm is going to handle very similarly, so for the sake of brevity you'll have to look in the spoiler tags to see it.At this point, I found myself needing to iterate the list to get to a specific node a lot, so I decided to make a private function to handle that for me.
    def __getNodeAtPosition(self, index):
        '''(Private) Gets a Node at a given index'''
        currentNode = self.first
        for i in range(index+1):#Adds 1 to account for initial Node (sentinel)
            currentNode = currentNode.getNext()
        return currentNode
This function reduces a lot of repeated code in insert, append, pop, and a few functions we haven't discussed yet.

I also found myself checking to see if a given index is valid a lot, so I implemented a private function to handle that too.
    def __checkIndex(self, index):
        '''(Private) check if the index is an acceptable value.'''
        if type(index) != int:
            raise TypeError("Index must be an integer or a slice not a "+str(type(index)).split("'")[1])

        #If the index is out of bounds
        if index >= self.length or index < 0:
            raise IndexError("Index out of bounds")


An important method of this data structure is __getitem__(self, index). By implementing this function, your data structure can be indexed using the notation of myList[x] and it can also be used in for loops in the context of for index in myList. On the surface, implementing __getitem__ appears easy.
    def __getitem__(self, index):
        '''Allows for indexing, index must be an integer'''
        self.__checkIndex(index)
        return self.__getNodeAtPosition(index).getElement()

For simple tests this appeared to work. But what happens if we try to slice it? We get an error. A slice object is not an integer, therefore we're in trouble. Slicing one of Pythons builtin Lists yields a new List. Don't believe me? Try going into idle and typing the following:

>>> x = [1,2,3,4,5,6,7,8]

>>> x[5:].append(9)

>>> print(x)

You'll find that x's value hasn't changed. This is due to the fact that when we sliced it, we created a new list. To implement slicing, you must realize that a slice object has a start, stop, and step attribute. You must also be aware that these attributes are read only. Seems easy right? Don't forget, you can leave any of these attributes as None (like my x[5:] example where stop and step are None) and function should behave properly. Ok, still not the hardest problem ever, right? It gets worse! Slices can utilize negative numbers. My solution for slicing looks like this:
    def __getitem__(self, index):
        '''Allows for indexing, index must be an integer'''
        #accounts for slicing
        if type(index) == slice:
            return self.__sliceList(index)

        return self.__getNodeAtPosition(self.__checkIndex(index)).getElement()


    def __sliceList(self,theSlice):
        '''(Private) function to handle slicing.  Returns a Linked List'''
        retList = LinkedList()

        if theSlice.step == None:
            step = 1
        else:
            step = theSlice.step

        if theSlice.start == None:
            if step > 0:
                start = 0
            else:
                start = self.length-1
        else:
            start = theSlice.start

        if theSlice.stop == None:
            if step > 0:
                stop = self.length
            else:
                stop = -1
        else:
            stop = theSlice.stop

        for eachItem in range(start,stop,step):
            retList.append(self.__getitem__(eachItem))
        return retList       

Pretty nasty, but it works. Unfortunately, due to our implementation, I believe slicing is an O(n**2) algorithm. You'll also notice that __checkIndex now returns a value. This was done to incorporate negative values. You might be thinking to yourself, "I can reduce some repeated code here" You're right. Check the code in the spoiler tags to see a much cleaner __sliceList function.

The last technique I'd like to point out was for the __str__ function. I wanted my __str__ function to mimic the one in Python's builtin List class. Here's how I did it
    def __str__(self):
        '''returns a string representation of the list'''
        if self.length == 0:    
            return '[]'
        retString = "["
        currentElement = self.first.getNext()
        for i in range(self.length):
            retString += str(currentElement) +", "
            currentElement = currentElement.getNext()
        
        return retString[:-2] + ']'
What I'd like to point out is that I add a comma and a space after every element, and I simply slice the last comma-space off at the end so that I don't have to use any annoying conditionals to try to determine which value is last.

There's a lot more to my Singly Linked List, but that's enough in-depth discussion of it for now. Feel free to look at the code in the spoiler tag to see the completed version. Notice that you can add 2 singly linked lists together, and that my linked list even supports addition to lists (although lists won't support adding my linkedList to them).

The code below has been further cleaned and optimized. Many of the above methods have been modified. For example, __checkIndex now correctly handles negatives. This is the final version of the Singly Linked List class unless I notice something horrendous. The next tutorial will cover optimization techniques and Doubly Linked Lists.


Spoiler

This post has been edited by atraub: 26 January 2011 - 07:36 AM


Is This A Good Question/Topic? 5
  • +

Replies To: Python Singly Linked Lists

#2 skorned  Icon User is offline

  • New D.I.C Head

Reputation: 13
  • View blog
  • Posts: 41
  • Joined: 30-August 08

Posted 24 January 2011 - 11:38 AM

Yeah, I was gonna mention how redundant your code for pop() would be once you've implemented getNodeAtPosition(index), and have the length of the list. You could just go straight to
getNodeAtPosition(self.length)
. But I see that you've done that in your final code, although I don't think you've mentioned this DRY optimization in your tutorial.
Overall, beautiful tutorial though. I would still stick to the beautiful list in python, though I've been forced to implement a linkedlist in java for my school work...
Was This Post Helpful? 1
  • +
  • -

#3 atraub  Icon User is offline

  • Pythoneer
  • member icon

Reputation: 759
  • View blog
  • Posts: 2,010
  • Joined: 23-December 08

Posted 24 January 2011 - 11:47 AM

You're right, I didn't point out that there are further optimizations in the spoiler tag. I'll update the tutorial accordingly.
Was This Post Helpful? 0
  • +
  • -

#4 kewene  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 1
  • Joined: 28-October 12

Posted 28 October 2012 - 11:47 PM

Quote

I have implemented a similar Queue data structure with a singly linked list also. However, using a reference to both a 'head' and 'tail' results in greater/more efficient possibilities for queuing/dequeuing etc.
Great tutorial and coding (I have borrowed a few methods from your implementation). Cheers.

Was This Post Helpful? 0
  • +
  • -

#5 atraub  Icon User is offline

  • Pythoneer
  • member icon

Reputation: 759
  • View blog
  • Posts: 2,010
  • Joined: 23-December 08

Posted 11 July 2013 - 04:59 PM

View Postkewene, on 29 October 2012 - 02:47 AM, said:

I have implemented a similar Queue data structure with a singly linked list also. However, using a reference to both a 'head' and 'tail' results in greater/more efficient possibilities for queuing/dequeuing etc.
Great tutorial and coding (I have borrowed a few methods from your implementation). Cheers.


The tail wouldn't be very useful unless it was a doubly linked list, which is beyond the scope of this tutorial.
Was This Post Helpful? 0
  • +
  • -

#6 macosxnerd101  Icon User is online

  • Self-Trained Economist
  • member icon




Reputation: 10486
  • View blog
  • Posts: 38,857
  • Joined: 27-December 08

Posted 11 July 2013 - 07:20 PM

Actually, a tail would allow for Theta(1) time for appending a node to the end of the list. It would be very useful.
Was This Post Helpful? 0
  • +
  • -

#7 atraub  Icon User is offline

  • Pythoneer
  • member icon

Reputation: 759
  • View blog
  • Posts: 2,010
  • Joined: 23-December 08

Posted 22 July 2013 - 10:17 AM

Haha I hadn't even thought of appending. Man, it's been so long since I've done any real programming, I'm getting really rusty.
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1