AITEC Contract Research Projects in FY1996 : Software

(12) A Study on Parallel Robust Parsing based on GLR Algorithm

Dr. Susumu Kunifuji, Professor, Japan Advanced Institute of Science and Technology

Robust LR Parser (R-LRPar)

Thanaruk Theeramunkong, Manabu Okumura,
Susumu Kunifuji and Hiroki Imai
at Japan Advanced Institute of Science and Technology

[Software Features]

This software analyzes natural language on the basis of a given context-free grammar. The analysis produces the syntactic trees of the input. These trees serve as input to subsequent processing, such as semantic analysis, discourse analysis, and reasoning. The software can analyze not only well-formed sentences but also ill-formed ones.
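To make the idea of "syntactic trees from a context-free grammar" concrete, here is a minimal sketch in Python. It is not the GLR algorithm and not part of R-LRPar: it is a small memoized recursive parser over a hypothetical toy grammar in Chomsky normal form, and the names BINARY, LEXICON, and parse are assumptions made for illustration only.

```python
from functools import lru_cache

# Hypothetical toy grammar in Chomsky normal form (not taken from R-LRPar).
BINARY = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "NP": [("NP", "PP"), ("Det", "N")],
    "PP": [("P", "NP")],
}
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "a": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def parse(words):
    """Return every syntactic tree rooted at S that covers all of `words`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def trees(symbol, start, end):
        # All trees rooted at `symbol` that span words[start:end].
        found = []
        if end - start == 1 and symbol in LEXICON.get(words[start], []):
            found.append((symbol, words[start]))
        for left, right in BINARY.get(symbol, []):
            # Both children get a strictly shorter span, so recursion ends.
            for mid in range(start + 1, end):
                for left_tree in trees(left, start, mid):
                    for right_tree in trees(right, mid, end):
                        found.append((symbol, left_tree, right_tree))
        return tuple(found)

    return list(trees("S", 0, len(words)))


if __name__ == "__main__":
    for tree in parse("I saw a man with a telescope".split()):
        print(tree)
```

With this toy grammar, the ambiguous sentence "I saw a man with a telescope" yields two trees (the prepositional phrase attaches either to the verb phrase or to the noun phrase), while an ill-formed input such as "saw I man" yields none. A plain parser of this kind simply fails on ill-formed input, which is the gap a robust parser is meant to fill.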

This software has the following characteristics:

This software parses an input sentence under constraints given in the form of a context-free grammar. As the result, it outputs the syntactic trees that represent the interpretations of the sentence. These interpretations can be passed on to subsequent processing, such as a semantic analysis module, a discourse analysis module, or an inference module. The parser is based on the generalized LR (GLR) parsing method, a fast parsing algorithm, so the software is expected to parse input data efficiently.
This software can analyze not only grammatical inputs but also ungrammatical (ill-formed) inputs. Although it parses _ill-formed_ inputs robustly, the efficiency of analyzing _grammatical_ inputs is not sacrificed, because the extra processing steps needed for ill-formed inputs are not executed for grammatical ones.
This software includes a tagger, used as a preprocessor, that assigns the most appropriate lexical category to each word of an input before parsing. The tagger is based on the rule-based tagging method developed by Brill. Please read the tagger's license before using it.
This software also provides a way to rank the syntactic interpretations of a sentence. The scoring method is currently based on syntactic constraints, but a user can easily change the scoring definitions.
This software runs both on a general-purpose computer and on a loosely-coupled parallel machine called PIM. Two load-balancing schemes are provided, "on-demand dynamic load balancing" and "random dynamic load balancing", and a user can select either of them.
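The two load-balancing schemes named in the last item can be illustrated with a small simulation. This is only a generic sketch of what the terms usually mean, written in Python rather than KL1; it does not reproduce how R-LRPar distributes work on PIM. The function names, the task costs, and the worker model are all assumptions: "random" sends each task to a randomly chosen worker, and "on-demand" gives each task to whichever worker becomes free first.

```python
import heapq
import random


def random_balancing(costs, n_workers, seed=0):
    """Assign each task to a randomly chosen worker; return per-worker load."""
    rng = random.Random(seed)
    loads = [0] * n_workers
    for cost in costs:
        loads[rng.randrange(n_workers)] += cost
    return loads


def on_demand_balancing(costs, n_workers):
    """Give each task to the worker that becomes idle first."""
    # Heap entries are (time at which the worker is free, worker id).
    idle = [(0, worker) for worker in range(n_workers)]
    heapq.heapify(idle)
    loads = [0] * n_workers
    for cost in costs:
        free_at, worker = heapq.heappop(idle)
        loads[worker] += cost
        heapq.heappush(idle, (free_at + cost, worker))
    return loads


if __name__ == "__main__":
    costs = [5, 1, 1, 1, 1, 1]  # hypothetical task costs
    print("random   :", random_balancing(costs, 2))
    print("on-demand:", on_demand_balancing(costs, 2))
```

In this model the finishing time of a run is the largest per-worker load. On-demand assignment tends to even out the loads when task costs vary, at the price of coordination between workers, whereas random assignment needs no coordination but may leave some workers overloaded.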

[Required Environment]

   Prolog version software
        Software requirement
            Prolog package (e.g., SICStus Prolog)
            C compiler (e.g., cc, gcc)
            Perl, shell (sh) package
        Environment
            SunOS 4.1.4, Solaris 2.x
   KL1 version software
        Software requirement
            KL1 interpreter (PIMOS)
            C compiler (e.g., cc, gcc)
            Perl, shell (sh) package
        Environment
            PIMOS (PIM Operating System)
            PIM/m (a Parallel Inference Machine developed by ICOT)

[File Configuration]

H8-12 --+-- README                      basic software information
        |
        +-- INSTALL                     installation instruction
        |
        +-- doc --+-- Manual            user manual
        |         |
        |         +-- Specification     specification
        |
        +-- R-LRPar --+-- btagger       tagger
                      |
                      +-- tagger_util   utility programs for tagger
                      |
                      +-- LR_prolog     sequential parser (Prolog)
                      |
                      +-- LR_kl1        parallel parser (KL1)

[Source Size]

   R-LRPar Prolog version       1200 lines
   R-LRPar KL1 version          1500 lines
   Converted grammar           64000 lines
   Others                        100 lines
   Size after compressing          3 M (gzipped tar format)

[FTP]

README
Program and Documents in Japanese [3.6M]

www-admin@icot.or.jp
