Abstract

This system demonstrates the effectiveness of the Parallel Inference Machine (PIM) for the motif extraction problem, which is to extract common patterns (motifs) from a protein database. It uses the minimum description length principle and genetic algorithms.

Features

The experimental motif extraction system automatically extracts common patterns in some protein categories, such as cytochrome c. The system regards a motif as a stochastic rule in order to deal with exceptions to the classification of proteins.
- The Minimum Description Length (MDL) principle was adopted as the criterion for motif evaluation, to avoid overfitting motifs to the sample data.
- Genetic Algorithms (GA) were employed as a motif search method to reduce the effects of the combinatorial explosion and to reduce search time.
- High parallelism on the PIM was achieved by exploiting trial, divide-and-conquer, and data parallelism.
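
To make the combination of MDL scoring and genetic search concrete, here is a minimal sequential sketch in Python. It is not the system's implementation, which ran in parallel on the PIM. The motif syntax (fixed-length patterns with `x` as a wildcard), the two-part encoding in `description_length`, and the names `matches`, `description_length`, `mutate`, `crossover`, and `search` are all assumptions made for illustration. A motif is treated as a stochastic rule: sequences that match it belong to the category with one probability, and sequences that do not match belong with another.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
WILDCARD = "x"                      # assumed wildcard symbol


def matches(motif, seq):
    """Return True if the motif occurs anywhere in seq ('x' matches any residue)."""
    m = len(motif)
    return any(
        all(c == WILDCARD or c == seq[i + j] for j, c in enumerate(motif))
        for i in range(len(seq) - m + 1)
    )


def bernoulli_bits(pos, n):
    """Bits needed to encode n binary labels with pos positives (ML estimate)."""
    if n == 0 or pos == 0 or pos == n:
        return 0.0
    p = pos / n
    return -(pos * math.log2(p) + (n - pos) * math.log2(1 - p))


def description_length(motif, data):
    """Two-part code length (assumed encoding): model bits plus data bits.

    data is a list of (sequence, is_member) pairs.
    """
    hit, miss = [], []
    for seq, label in data:
        (hit if matches(motif, seq) else miss).append(label)
    # One bit per position for "wildcard or not", plus the residue if specific.
    model_bits = sum(
        1 + (0 if c == WILDCARD else math.log2(len(ALPHABET))) for c in motif
    )
    # One probability parameter per non-empty group, at (1/2) log2(n) bits each.
    param_bits = sum(0.5 * math.log2(len(g)) for g in (hit, miss) if g)
    data_bits = bernoulli_bits(sum(hit), len(hit)) + bernoulli_bits(sum(miss), len(miss))
    return model_bits + param_bits + data_bits


def mutate(motif, rng):
    """Replace one randomly chosen position with a random symbol."""
    i = rng.randrange(len(motif))
    return motif[:i] + rng.choice(ALPHABET + WILDCARD) + motif[i + 1:]


def crossover(a, b, rng):
    """One-point crossover of two equal-length motifs."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def search(data, length=5, pop_size=30, generations=40, seed=0):
    """Simple generational GA that minimizes description length."""
    rng = random.Random(seed)
    symbols = ALPHABET + WILDCARD
    pop = ["".join(rng.choice(symbols) for _ in range(length)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: description_length(m, data))
        parents = pop[: pop_size // 2]  # keep the better half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return min(pop, key=lambda m: description_length(m, data))
```

Under this assumed encoding, an over-specific motif pays more model bits without reducing the data bits, so it scores worse than a more general motif that separates the samples equally well. That is the sense in which the MDL criterion discourages overfitting. With very few samples, the all-wildcard motif can even be the cheapest description.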
Figure: Configuration of Experimental Motif Extraction System