AITEC Contract Research Projects in FY1996 : Software
(12) A Study on Parallel Robust Parsing based on GLR Algorithm
Dr. Susumu Kunifuji, Professor, Japan Advanced Institute of Science and Technology
Robust LR Parser (R-LRPar)
by
Thanaruk Theeramunkong, Manabu Okumura,
Susumu Kunifuji and Hiroki Imai
at Japan Advanced Institute of Science and Technology
[Software Features]
This software analyzes natural language according to a given
context-free grammar and produces the syntactic trees of the analyzed
input. These trees serve as input to subsequent processing, such as
semantic analysis, discourse analysis, and reasoning. The software can
analyze both well-formed and ill-formed sentences.
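As a rough illustration of what "producing the syntactic trees" of an input under a context-free grammar means, the sketch below enumerates every tree for an ambiguous sentence. It is not R-LRPar's algorithm: it uses a simple CYK-style chart rather than GLR parsing, and the toy grammar, the lexicon, and the function name `parse` are all assumptions made for this example.

```python
from functools import lru_cache

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],
    ("NP", "PP"): ["NP"],
    ("P", "NP"): ["PP"],
    ("Det", "N"): ["NP"],
}
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def parse(words, start="S"):
    """Return every syntactic tree of `words` rooted in `start`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def trees(i, j):
        # All (category, tree) pairs that span words[i:j].
        if j - i == 1:
            return tuple((cat, (cat, words[i]))
                         for cat in LEXICON.get(words[i], []))
        found = []
        for k in range(i + 1, j):  # try every split point
            for lcat, ltree in trees(i, k):
                for rcat, rtree in trees(k, j):
                    for parent in BINARY.get((lcat, rcat), []):
                        found.append((parent, (parent, ltree, rtree)))
        return tuple(found)

    return [tree for cat, tree in trees(0, len(words)) if cat == start]


if __name__ == "__main__":
    for tree in parse("I saw the man with the telescope".split()):
        print(tree)
```

With this toy grammar the sentence has two trees, one attaching the prepositional phrase to the verb phrase and one attaching it to the noun phrase. A later stage such as semantic analysis would have to choose between them.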
This software has the following characteristics:
- This software parses an input sentence under constraints given in
the form of a context-free grammar and outputs the syntactic trees
that represent the interpretations of the sentence. These trees can
then be passed to subsequent processing, such as a semantic analysis
module, a discourse analysis module, or an inference module. The
parser is based on the Generalized LR (GLR) parsing method, one of
the fastest known parsing algorithms for context-free grammars, so
the software is expected to parse input efficiently.
- This software can analyze not only grammatical inputs but also
ill-formed (ungrammatical) inputs. Although it parses _ill-formed_
inputs robustly, the efficiency of analyzing _grammatical_ inputs is
not sacrificed, because the extra processing steps needed for
ill-formed inputs are not executed for grammatical ones.
- This software includes a tagger, used as a preprocessor, that
assigns the most appropriate lexical category to each word of an
input before parsing. The tagger is based on the rule-based tagging
method developed by Brill. Please see its license before using it.
- This software also provides a way to assign priorities to the
syntactic interpretations of a sentence. The scoring method is
currently based on syntactic constraints, but a user can easily
change the scoring definitions.
- This software runs on both a general-purpose computer and a
loosely-coupled parallel machine called PIM. Two load-balancing
schemes are provided, "on-demand dynamic load balancing" and "random
dynamic load balancing", and a user can select either of them.
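The robustness characteristic above (grammatical inputs pay nothing for the handling of ill-formed ones) can be pictured as a two-stage procedure: try a strict parse first, and run recovery only if it fails. The sketch below is a minimal illustration under stated assumptions, not R-LRPar's method. It uses a CYK recognizer instead of GLR, the grammar, lexicon, and function names are hypothetical, and the recovery strategy (retrying with one word deleted) is an assumption, since the actual recovery steps are not described here.

```python
# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY = {("NP", "VP"): "S", ("V", "NP"): "VP", ("Det", "N"): "NP"}
LEXICON = {"she": "NP", "reads": "V", "the": "Det", "a": "Det", "book": "N"}


def recognizes(words, start="S"):
    """CYK recognizer: True if `words` can be derived from `start`."""
    n = len(words)
    if n == 0:
        return False
    chart = {}
    for i, word in enumerate(words):
        chart[i, i + 1] = {LEXICON[word]} if word in LEXICON else set()
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cats = set()
            for k in range(i + 1, j):
                for left in chart[i, k]:
                    for right in chart[k, j]:
                        if (left, right) in BINARY:
                            cats.add(BINARY[left, right])
            chart[i, j] = cats
    return start in chart[0, n]


def robust_parse(words):
    """Return (accepted_words, recovery_used); accepted_words is None on failure."""
    words = list(words)
    # Stage 1: strict parse. Well-formed input stops here and pays no recovery cost.
    if recognizes(words):
        return words, False
    # Stage 2 (assumed recovery strategy): retry with each single word deleted.
    for i in range(len(words)):
        candidate = words[:i] + words[i + 1:]
        if recognizes(candidate):
            return candidate, True
    return None, True


if __name__ == "__main__":
    print(robust_parse("she reads the book".split()))
    print(robust_parse("she reads the a book".split()))
```

The second call contains a spurious determiner, so the strict stage fails and the recovery stage accepts the sentence with one word removed. The first call never enters the recovery loop.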
[Required Environment]
Prolog version
  Software requirements
    Prolog package (e.g., SICStus Prolog)
    C compiler (e.g., cc, gcc)
    Perl, shell (sh) package
  Environment
    SunOS 4.1.4, Solaris 2.x
KL1 version
  Software requirements
    KL1 interpreter (PIMOS)
    C compiler (e.g., cc, gcc)
    Perl, shell (sh) package
  Environment
    PIMOS (PIM Operating System)
    PIM/m (a Parallel Inference Machine developed by ICOT)
[File Configuration]
H8-12 --+-- README                      basic software information
        |
        +-- INSTALL                     installation instructions
        |
        +-- doc --+-- Manual            user manual
        |         |
        |         +-- Specification     specification
        |
        +-- R-LRPar --+-- btagger       tagger
                      |
                      +-- tagger_util   utility programs for tagger
                      |
                      +-- LR_prolog     sequential parser (Prolog)
                      |
                      +-- LR_kl1        parallel parser (KL1)
[Source Size]
R-LRPar Prolog version      1,200 lines
R-LRPar KL1 version         1,500 lines
Converted grammar          64,000 lines
Others                        100 lines
Size after compression      3 MB (gzipped tar format)
[FTP]
- README
- Program and Documents in Japanese [3.6M]
www-admin@icot.or.jp