#1 shashwat.x

Parsing the Parse File

Posted 05 December 2011 - 06:13 AM

Hi there,

This relates to an NLP application I am trying to build. I have a program that generates all possible parses of an ambiguous sentence (in my native language) in the following format:

__EarlyStartSymbol -> VAKYA EOF
  --------- ambiguous VAKYA: 1 of 3 ---------
    VAKYA -> KVS SP KP KPAI
      KVS -> KVS1
        KVS1 -> KVS1_
          KVS1_ -> SPx
            SPx -> S
              S -> SITA
                SITA
      SP -> SPx
        SPx -> S
          S -> DILLI
            DILLI
      KP -> K
        K -> GAYI
          GAYI
      KPAI
  --------- ambiguous VAKYA: 2 of 3 ---------
    VAKYA -> SP KP KPAI
      SP -> SPx
        SPx -> S
          S -> SITA
            SITA
      --------- ambiguous KP: 1 of 2 ---------
        KP -> SPx AUXK
          SPx -> S
            S -> DILLI
              DILLI
          AUXK -> GAYI
            GAYI
      --------- ambiguous KP: 2 of 2 ---------
        KP -> SPx K
          SPx -> S
            S -> DILLI
              DILLI
          K -> GAYI
            GAYI
      --------- end of ambiguous KP ---------
      KPAI
  --------- ambiguous VAKYA: 3 of 3 ---------
    VAKYA -> SP KP KPAI
      SP -> VSH SPx
        VSH -> SPx
          SPx -> S
            S -> SITA
              SITA
        SPx -> S
          S -> DILLI
            DILLI
      KP -> K
        K -> GAYI
          GAYI
      KPAI
  --------- end of ambiguous VAKYA --------




In the output above, VAKYA means "sentence".

As you can see, there are 4 possible parses in this case: one from the first VAKYA alternative, two from the second (because its KP is itself ambiguous), and one from the third. We can enumerate these parses in the order in which they appear. I need a method that, given a parse number and the output above, returns the complete parse tree for that parse number only. For example, for parse number 2, my output should be:
VAKYA -> SP KP KPAI
  SP -> SPx
    SPx -> S
      S -> SITA
        SITA
  KP -> SPx AUXK
    SPx -> S
      S -> DILLI
        DILLI
    AUXK -> GAYI
      GAYI
  KPAI
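
One possible way to approach this, offered as a starting point rather than a definitive answer, is sketched below in Python. It reads the dump into a tree in which an ambiguity block becomes a list of alternatives, counts the parses under each item, and then picks the requested parse by index. The sketch rests on several assumptions that the dump does not confirm: indentation is always two spaces per level, alternatives sit two spaces deeper than their marker line, the input is well formed, and when one rule has several ambiguous children the last one varies fastest. All function names (`read_lines`, `parse_siblings`, `count`, `select`, `render`, `count_parses`, `nth_parse`) are hypothetical.

```python
import re

# Matches "--- ambiguous X: i of n ---" (group 1 = i) and
# "--- end of ambiguous X ---" (group 1 is None).
MARKER = re.compile(r"-+ (?:ambiguous \S+: (\d+) of (\d+)|end of ambiguous \S+) -+$")


def read_lines(text):
    """Return (indent, content) pairs for the non-blank lines of the dump."""
    out = []
    for raw in text.splitlines():
        if raw.strip():
            stripped = raw.lstrip(" ")
            out.append((len(raw) - len(stripped), stripped.rstrip()))
    return out


def parse_siblings(lines, pos, indent):
    """Parse consecutive items at exactly `indent`; return (items, next_pos).

    An item is ("node", label, children) or ("amb", [alternative, ...]),
    where each alternative is itself a list of items.
    Assumption: 2-space indentation steps and well-formed marker lines.
    """
    items = []
    while pos < len(lines) and lines[pos][0] == indent:
        text = lines[pos][1]
        m = MARKER.match(text)
        if m is None:
            children, pos = parse_siblings(lines, pos + 1, indent + 2)
            items.append(("node", text, children))
        elif m.group(1) == "1":
            alts = []
            while True:
                alt, pos = parse_siblings(lines, pos + 1, indent + 2)
                alts.append(alt)
                if MARKER.match(lines[pos][1]).group(1) is None:
                    pos += 1  # step over the "end of ambiguous" marker
                    break
            items.append(("amb", alts))
        else:
            break  # unexpected marker: leave it to the caller
    return items, pos


def count(items):
    """Number of distinct parses encoded by a list of sibling items."""
    total = 1
    for item in items:
        if item[0] == "node":
            total *= count(item[2])
        else:
            total *= sum(count(alt) for alt in item[1])
    return total


def select(items, k):
    """Return parse k (0-based) of `items` as nested (label, children) pairs.

    Assumption: among siblings, the last item varies fastest.
    """
    result = []
    for item in reversed(items):
        k, r = divmod(k, count([item]))
        if item[0] == "node":
            result.append((item[1], select(item[2], r)))
        else:
            for alt in item[1]:
                c = count(alt)
                if r < c:
                    result.extend(reversed(select(alt, r)))
                    break
                r -= c
    result.reverse()
    return result


def render(nodes, depth=0):
    """Turn nested (label, children) pairs back into indented lines."""
    lines = []
    for label, children in nodes:
        lines.append("  " * depth + label)
        lines.extend(render(children, depth + 1))
    return lines


def count_parses(text):
    """Total number of parses in the dump."""
    items, _ = parse_siblings(read_lines(text), 0, 0)
    return count(items)


def nth_parse(text, number):
    """Return parse `number` (1-based, in order of appearance) as text."""
    items, _ = parse_siblings(read_lines(text), 0, 0)
    if not 1 <= number <= count(items):
        raise ValueError("parse number out of range")
    return "\n".join(render(select(items, number - 1)))
```

Calling `nth_parse(dump, 2)` on the dump above should give the second parse with all marker lines removed; note that this sketch keeps the `__EarlyStartSymbol` line as the root and re-indents everything below it. `count` is recomputed on every call, so caching its results per item would be worth considering for large dumps.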



I have spent over a day and a half on this and cannot come up with an efficient algorithm. Any help will be greatly appreciated.

Thanks.
