Page 1 of 1

Birst Sort Explained Rate Topic: -----

#1 AdamSpeight2008   User is offline

  • MrCupOfT
  • member icon

Reputation: 2298
  • View blog
  • Posts: 9,535
  • Joined: 29-May 08

Posted 07 August 2008 - 08:48 PM

BIRST Sort Explained
Bytewise
Index
Radix
SorT
Link to snippet:- BIRST Sort Improved
It is made from three tables

1. The Byte Index
The first is called the Byte Index and only contains 256 entries.
Each entry is made up of two parts;-
A pointer to the first node
A pointer to the last node
  Private Structure ByteIndice
        Dim First As Long
        Dim Last As Long
    End Structure


2. Radix SorT
The second part is the radix part and contains the same number of entries as there are entries in the input data.
Each entry is made up of two parts;-
A pointer to the location of the actual data
A pointer to the next node, which also has the byte index radix.
Private Structure PtrNextPtr
 Dim Ptr As Long
 Dim NextPtr As Long
End Structure


3. Position Array
The Third is used a temporary array to copy the data into during the final sort.

Algorithm
Attached File  BirstSort.bmp (633.5K)
Number of downloads: 253
1. Find the longest length string (m) in the list of strings
Since we have to look at every string we set the List_Pos for each word to it position in the list.
  For i As Integer = 0 To WordList.Count - 1
   List_Pos(i) = i
   If WordList(i).Length > m Then
     m = WordList(i).Length
   End If
  Next i


2. Start string position m-1 (m-1 since the string position start at 0. Zero) work down to the 0th position.
For r As Integer = m - 1 To 0 Step -1


3. Reset the Byte Index Array, I use -1
For i As Integer = 0 To 255
  byteindex(i).First = -1 : byteindex(i).Last = -1
Next i


4. For Each String
For i As Integer = 0 To WordList.Count - 1


5. Get the ListPosition(i)
Ptr_Pos = List_Pos(i)


6. If r is greater than the length of the string, we used the byteindex of zero, else we use the character code of the letter at position r of the string.
#
If r >= WordList(Ptr_Pos).Length Then
 ' Position is beyond the lenght of the string
 ' So give it a default value
 b = 0
Else
 ' Otherwise get value of character at current radix position
 b = Asc(WordList(Ptr_Pos).Substring(r, 1))
End If


7. Set the Node Pointers. Next to a invalid value (-1), Ptr to Ptr_Pos
 TmpNP(i).NextPtr = -1 ' end of node list
 TmpNP(i).Ptr = Ptr_Pos


8. If byteindex(b).first isn't invalid then there is already a node chain attached to it, attached this one on the end.
Otherwise attach the first node of the node chain.
If byteindex(b).First > -1 Then 
 ' Radix value has node chain attached to it
 ' so add a link to it by change Last node's next to point to this one
 TmpNP(byteindex(b).Last).NextPtr = i
Else
 ' Pointer to first node of this radix value
 byteindex(b).First = i
End If
' Pointer to last node of the radix value


9. Set the byteindex(b).last to i
byteindex(b).Last = i


10. Start at the byteindex(0) follow the node chains, Setting List_Post(K) to the TmpNP(Curr).ptr.
Next bit looks complicated but all it is doing is follow the node chains.
k = -1
For j As Integer = 0 To 255 Step 1
 'Check to see if radix value has a node chain attached to it
 If byteindex(j).First > -1 Then
  ' Get first link in Node chain
  Nxt = byteindex(j).First
  Do
   Curr = Nxt
   k = k + 1
   List_Pos(k) = TmpNP(Curr).Ptr
   Nxt = TmpNP(Curr).NextPtr
  Loop Until Nxt = -1
 End If
Next j


11. Loop over the other string positions
Next r


At this stage TmpNP() contain enough information to work out the where each string goes when sort.
Example
TmpNP(0).ptr Contains the index of the string in the first position.
TmpNP(99).ptr Contains the index of the string in the hundredth position.
12. Copy the strings in the correct position into a temporary array
Dim Temp_List(WordList.Count) As String
For i As Integer = 0 To WordList.Count - 1
 ' Set the TempList(i) to the word list at the indicate list position
 Temp_List(i) = WordList(List_Pos(i))
Next i


13. Copy back into the original array.
 ' Move back into Wordlist
  WordList.Clear()
  WordList.AddRange(Temp_List)


Analysis
Order: O(n) <= 2MN + 256M + 2M
Memory Footprint
Byte Index Array = 256 * 2 * 4 bytes =>2048b / 2Kb
Radix Sort = n * 2 * 4 bytes
LIst Positon = n * 4 bytes
Since were only reposition the string only twice (copying string being a slow operation).
The sort only use longs which are a nice easy size for the processor to it is quick.
Notes
This algorithm can be extended to half word radix but the potential increase in execution speed is let down by the code complexity;- odd & even length string, bigger memory footprint.

Is This A Good Question/Topic? 0
  • +

Page 1 of 1