Page 1 of 1

The DAWG (Directed Acyclic Word Graph) Rate Topic: -----

#1 AdamSpeight2008  Icon User is offline

  • MrCupOfT
  • member icon


Reputation: 2240
  • View blog
  • Posts: 9,410
  • Joined: 29-May 08

Posted 17 June 2012 - 02:30 AM

This tutorial take utilizes the code snippet Option(Of T) (NuGet: OptionOfT).


Basic Layout of a CharNode

 +-------+-----------+
 |       |           | Next
 | Value | EndOfWord |--->--
 |       |           |
 +-------+-----------+
     |
     | Down
     |







We'll be putting the DAWG to good use in another Tutorial.

Spoiler



CharNodeBase

Public MustInherit Class CharNodeBase
  Public ReadOnly Value As Char
  Public Property EndOfWord As Boolean
  Public Property [Next] As CharNodeBase
  Public Property Down As CharNodeBase

  Protected Friend Sub New(Value As Char)
    Me.Value = Value
  End Sub

  Public Overrides Function ToString() As String
    Return String.Format("[{0}{1}]", Value, If(EndOfWord, ":", " "))
  End Function

  Public Shared Operator =(X As CharNodeBase, c As Char) As Boolean
    Return X.Value = c
  End Operator

  Public Shared Operator <>(X As CharNodeBase, c As Char) As Boolean
    Return X.Value = c
  End Operator

  Public MustOverride Function CreateNew(Of T As CharNodeBase)(Value As Char) As T 
End Class



CharNode

Public Class CharNode
  Inherits CharNodeBase

  Public Sub New()
    MyBase.New(Nothing)
  End Sub

  Public Sub New(Value As Char)
    MyBase.New(Value)
  End Sub
  Public Overrides Function CreateNew(Of T As CharNodeBase)(Value As Char) As T
    Dim tmp As CharNodeBase = New CharNode(Value)
    Return DirectCast(tmp, T)
  End Function
End Class






DAWGlist_base

Imports OptionOfT

Public MustInherit Class DAWGlist_base(Of T As {New, CharNodeBase})
  Protected Friend _DIC As T

  Protected Friend Function AppendRestTo(Curr As T, Rest As String) As T
    For Each c In Rest
      Curr.Next = Curr.CreateNew(Of T)(c)
      Curr = CType(Curr.Next, T)
    Next
    'Return Curr
    Return DirectCast(Curr, T)
  End Function

  Public Sub Add(Word As String)
    Add(_DIC, Word, 0)
  End Sub

  Protected Friend Function Add(ByVal Curr As T, Word As String, Index As Integer) As OptionOfT.Option
    If String.IsNullOrEmpty(Word) = False Then
      Dim fn = Function(c As T, w As String, i As Integer) As OptionOfT.Option
                 Dim res = AppendRestTo(c, Word.Substring(i))
                 res.EndOfWord = True
                 Return [Option].Create(Of T)(res)
               End Function
      If _DIC Is Nothing Then
        Dim tt = New T
        _DIC = tt.CreateNew(Of T)(Word(0))
        Return fn(_DIC, Word, 1)
      Else
        Return RecurseThroughDAWG(Curr, Word, 0,
                                  Function(c, w, i) fn(c, w, i),
                                  Function(c, w, i)
                                    c.Down = Curr.CreateNew(Of T)(w(i))
                                    Return fn(CType(c.Down, T), w, i + 1)
                                  End Function)
      End If
    Else
      Return [Option].Create(Of T)()
    End If
  End Function

  Public Function Find(Word As String) As Boolean
    If _DIC Is Nothing Then Return False
    Dim r = Find(_DIC, Word, 0)
    If TypeOf r Is [Option].Nowt Then Return False
    Dim s = DirectCast(r, [Option].Some(Of T))
    Return If(s.Value IsNot Nothing, s.Value.EndOfWord, False)
  End Function
  Public Function Retrive(Word As String) As CharNodeBase
    Dim r = Find(_DIC, Word, 0)
    If TypeOf r Is [Option].Nowt Then Return Nothing
    Dim s = DirectCast(r, [Option].Some(Of T))
    Return If(s.Value IsNot Nothing, s.Value, Nothing)

  End Function
  Protected Friend Function Find(ByVal Curr As T, Word As String, Index As Integer) As OptionOfT.Option
    Return RecurseThroughDAWG(Curr, Word, 0,
               Function(c, w, i)
                 If i < Word.Length Then
                   Return [Option].Create(Of T)()
                 Else
                   Return [Option].Create(Of T)(c)
                 End If
               End Function,
               Function(c, w, i)
                 If i < Word.Length Then
                   Return [Option].Create(Of T)()
                 Else
                   Return [Option].Create(Of T)(c)
                 End If
               End Function)
  End Function

  Protected Friend Function RecurseThroughDAWG(ByVal Curr As T, Word As String, Index As Integer,
                       NextAct As Func(Of T, String, Integer, OptionOfT.Option),
                       DownAct As Func(Of T, String, Integer, OptionOfT.Option)
                      ) As OptionOfT.Option
    Dim ni = Index + 1
    Dim wl = Word(Index)
    If Curr = wl Then
      If Curr.Next IsNot Nothing Then
        If ni < Word.Length Then Return RecurseThroughDAWG(CType(Curr.Next, T), Word, ni, NextAct, DownAct)
        Return [Option].Create(Curr)
      End If
      Return NextAct(Curr, Word, ni)
    Else
      If Curr.Down IsNot Nothing Then Return RecurseThroughDAWG(CType(Curr.Down, T), Word, Index, NextAct, DownAct)
      Return DownAct(Curr, Word, Index)
    End If
  End Function

End Class



DAWGlist
Option Strict On
Public Class DAWGList
  Inherits DAWGlist_base(Of CharNode)
  
End Class







Extending The Capabilities and make a DAWG act like a Dictionary

CharNodeData
b]Basic Layout of a CharNodeDate[/b]
+-------+-----------+------+
|       |           |      | Next
| Value | EndOfWord | Data |--->--
|       |           |      |
+-------+-----------+------+
    |
    | Down
    |



As you can see is very similar to a CharNode, just with an extra section for the Data.
So we can utilize this and inherit the structure on function of the CharData.

Public Class CharNodeData(Of T)
  Inherits CharNodeBase

  Public Property Data As T

  Public Sub New()
    MyBase.New(Nothing)
  End Sub

  Public Sub New(Value As Char)
    MyBase.New(Value)
  End Sub

  Public Overrides Function ToString() As String
    Return String.Format("{0}:=(1)", MyBase.ToString(), Me.Data)
  End Function

  Public Overrides Function CreateNew(Of T1 As CharNodeBase)(Value As Char) As T1
    Dim Tmp As CharNodeBase = New CharNodeData(Of T)(Value)
    Return CType(Tmp, T1)
  End Function
End Class



Now lets extend the DAWGlist to behave like a Dictionary, inheriting from DAWGlist_base(Of CharNodeData(Of T)) so the underlaying data types act on the correct types.

DAWGlistData

Imports OptionOfT.Option

Public Class DAWGListData(Of T)
  Inherits DAWGlist_base(Of CharNodeData(Of T))

  Public Sub New()
    MyBase.New()
  End Sub

  Public Overloads Sub Add(ByVal Word As String, Data As T)
    Dim r = MyBase.Add(_DIC, Word, 0)
    If TypeOf r Is Nowt Then Exit Sub
    Dim t = DirectCast(r, Some(Of CharNodeData(Of T))).Value
    t.Data = Data
  End Sub

  Public Function GetData(Word As String) As OptionOfT.Option
    Dim res = Find(_DIC, Word, 0)
    Return res
  End Function

  Public Sub SetData(Word As String, Data As T)
    Dim res = Find(_DIC, Word, 0)
    If TypeOf res Is Some Then
      Dim s = DirectCast(res, Some(Of CharNodeData(Of T)))
      Dim t = s.Value
      t.Data = Data
    End If
  End Sub

End Class






So that's the DAWG and we'll be putting it to good use in another Tutorial soon.
If you want to play with the DAWG before then. (NuGet: DAWG)

This post has been edited by AdamSpeight2008: 17 June 2012 - 11:25 AM


Is This A Good Question/Topic? 1
  • +

Replies To: The DAWG (Directed Acyclic Word Graph)

#2 TechKid  Icon User is offline

  • D.I.C Head
  • member icon

Reputation: 3
  • View blog
  • Posts: 82
  • Joined: 04-September 10

Posted 25 November 2012 - 09:42 AM

Awesome stuff!
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1