(25) A proposed framework of LTB Japanese Grammar and implementing its rules

Dr. Hiroshi Sano, Tokyo University of Foreign Studies
(sano@stargate.fs.tufs.ac.jp)


IFS - Japanese Morphological Grammar Rules

Introduction

The IFS Japanese Morphological Grammar Rules are an implementation, for UNIX-based computers, of the grammar rules of the Morphological Analysis System LAX, which is free software released by ICOT (Institute for New Generation Computer Technology). The IFS grammar rules are designed for the JUMAN Morphological Analysis Engine. Some dictionary data has also been added.

With these rules, Japanese sentences can be analyzed (segmented into morphemes). The morphological grammar system on which the rules are based is, for the most part, independent of any particular part-of-speech system of a syntactic grammar. The results of the analysis are therefore versatile and can be put to a wide variety of uses, such as linguistic research and the development of software that applies natural language processing.
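
To make the idea of rule-based segmentation concrete, the following is a toy sketch in Python. It is not the IFS rules and not JUMAN's algorithm: the lexicon, the word-class names (noun, particle, verb, aux), and the connection table are all hypothetical and chosen only for illustration. It shows the general technique of accepting a segmentation only when every pair of adjacent word classes is permitted by a connection table.

```python
# Toy lexicon: surface form -> possible word classes.
# All entries and class names are hypothetical, not taken from the IFS dictionaries.
LEXICON = {
    "本": ["noun"],
    "を": ["particle"],
    "読ん": ["verb"],
    "だ": ["aux"],
}

# Hypothetical connection table: allowed (left class, right class) pairs.
# "BOS" and "EOS" mark the beginning and end of the sentence.
CONNECT = {
    ("BOS", "noun"),
    ("noun", "particle"),
    ("particle", "verb"),
    ("verb", "aux"),
    ("aux", "EOS"),
}


def segment(text, pos=0, prev="BOS"):
    """Yield every segmentation of text[pos:] that the connection table permits."""
    if pos == len(text):
        # The last word must be allowed to end the sentence.
        if (prev, "EOS") in CONNECT:
            yield []
        return
    for end in range(pos + 1, len(text) + 1):
        surface = text[pos:end]
        for word_class in LEXICON.get(surface, []):
            if (prev, word_class) in CONNECT:
                for rest in segment(text, end, word_class):
                    yield [(surface, word_class)] + rest


if __name__ == "__main__":
    for analysis in segment("本を読んだ"):
        print(analysis)
```

With this toy data, the sentence "本を読んだ" has exactly one permitted segmentation (本 / を / 読ん / だ), while a string such as "を本" has none, because the table does not allow a particle at the start of a sentence.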

Function-word (JOJI and SETSUJI) dictionaries are released with the rules. There is no general dictionary for content words. The release also includes dictionaries converted from the freely distributed IPAL verb and adjective dictionaries, and from the JUMAN 1.0 dictionary with 38,000 entries (without nouns).

Purpose

Natural language processing

The IFS Japanese Morphological Grammar Rules use JUMAN, Chasen, and Breakfast as morphological analysis engines. This software package contains morphological grammar rules that are ready to be compiled with the utility programs supplied with JUMAN (Chasen, Breakfast), together with dictionary data and utility programs.

Assumed Users

The assumed users are researchers in natural language processing and developers of applications that use natural language processing.

System Requirements

Computers:
UNIX-based computers (such as SunOS Release 4.x) with GNU software (glibc, gcc, etc.) installed.
Analysis Engine:
JUMAN 2.0 or later, or Chasen 1.0 or later (can be obtained by FTP from NAIST or Kyoto University). The system utility programs (makemat, makeint, maketree, etc.) are essential.
AWK:
Probably already installed on BSD UNIX. If not, obtain and install it.
PERL:
Probably already installed on BSD UNIX. If not, obtain and install it.
Disk Space:
10MB.
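
As a convenience, the tools named in the list above can be checked before building. The following is a minimal sketch in Python, which is not part of the package. It assumes that the tools are on the PATH under exactly the names gcc, awk, perl, makemat, makeint, and maketree; on a given system the JUMAN utilities may instead live in the JUMAN build directory.

```python
import shutil

# Tool names taken from the requirements list. Whether they are on PATH under
# exactly these names on a given system is an assumption.
REQUIRED_TOOLS = ["gcc", "awk", "perl", "makemat", "makeint", "maketree"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on the current PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found.")
```

The `which` parameter exists only so that the lookup can be replaced in a test. In normal use, `missing_tools(REQUIRED_TOOLS)` is enough.
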

Archive Files

README/ Readme file
COPYRIGHT/ Copyright declaration
RULE/ Morphological analysis data
DICT/ Dictionary data
MISC/ Utility programs


RULE/JUMAN.connect.c Japanese morphological grammar rules
RULE/JUMAN.grammar Analysis engine
DICT/CLOSEWORD/*.txt Dictionary / closed word data
DICT/OPENWORD/ (Dictionary / open word data)
DICT/OPENWORD/BNST/*.txt Test data
DICT/OPENWORD/IPAL/*.txt Data converted from IPAL dictionary index
DICT/OPENWORD/LCT1/*.txt Verbs, adverbs, adjectives, etc.
DICT/OPENWORD/LCT2/*.txt Nouns
DICT/OPENWORD/SIZN/*.txt Sample data sufficient to analyze a certain story
MISC/LIB/*.sh Shell scripts for building the morphological analysis parser
MISC/CORPUS/BUNSETU/*.TXT Test data
MISC/CORPUS/CNST/kenpo* Sample corpus for seminars
MISC/CORPUS/CNST/words.txt Sample dictionary for seminars
MISC/LIB/PROG/AWK/*.awk Utility programs (written in AWK)
MISC/LIB/PROG/PERL/*.pl Utility programs (written in PERL)

FTP


Related Sites

juman3.1
JUMAN3.1 distributed by Kyoto University NAGAO Laboratory
Chasen
Chasen1.0 distributed by NAIST MATSUMOTO Laboratory
breakfast
Breakfast version 4.0.4f distributed by FUJITSU Co., Ltd.

sano@fs.tufs.ac.jp
All Rights Reserved, Copyright (C) 1997, SANO Hiroshi
Last modified: May 16 1997

