Structured Motif Extractor

In recent years, the completion of human genome sequencing has raised a wide range of challenging problems in raw data analysis. In particular, discovering the information implicitly encoded in biological sequences plays a prominent role in identifying genetic diseases and in deciphering biological mechanisms. This information is usually represented by patterns that occur frequently in the sequences. Biological observations have made one class of patterns particularly interesting: frequent structured patterns. For these patterns, it is biologically meaningful to look for both "exact" and "approximate" repetitions within the available sequences.

SME (Structured Motif Extractor) implements algorithms for discovering frequent structured patterns, in both "exact" and "approximate" form, in a collection of input biological sequences.
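To make the distinction between "exact" and "approximate" occurrences concrete, here is a minimal sketch. It is not SME's algorithm, and this page does not define the pattern model. The sketch assumes that a structured pattern consists of two short words ("boxes") separated by a gap of bounded length, and that an approximate occurrence tolerates a limited number of mismatches (Hamming distance) in each box. The function names `hamming`, `occurs`, and `support`, their parameters, and the sample sequences are all hypothetical, chosen only for illustration.

```python
from typing import List


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(seq: str, box1: str, box2: str,
           min_gap: int, max_gap: int, max_mismatches: int = 0) -> bool:
    """Return True if box1, then a gap of min_gap..max_gap characters,
    then box2 occurs in seq, allowing up to max_mismatches per box.

    Assumed pattern model (not taken from the SME papers).
    """
    n1, n2 = len(box1), len(box2)
    for i in range(len(seq) - n1 + 1):
        if hamming(seq[i:i + n1], box1) > max_mismatches:
            continue
        for gap in range(min_gap, max_gap + 1):
            j = i + n1 + gap  # start of the candidate second box
            if j + n2 > len(seq):
                break  # larger gaps would run past the end of seq
            if hamming(seq[j:j + n2], box2) <= max_mismatches:
                return True
    return False


def support(sequences: List[str], box1: str, box2: str,
            min_gap: int, max_gap: int, max_mismatches: int = 0) -> int:
    """Number of sequences containing at least one occurrence."""
    return sum(
        occurs(s, box1, box2, min_gap, max_gap, max_mismatches)
        for s in sequences
    )


# Made-up sample data, for illustration only.
sequences = ["ACGTTTGACA", "ACGAATGACA", "ACCTTTGACA", "GGGGGGGGGG"]

exact_support = support(sequences, "ACG", "GAC", 3, 4)
approx_support = support(sequences, "ACG", "GAC", 3, 4, max_mismatches=1)

print(exact_support)   # 2
print(approx_support)  # 3
```

Under these assumptions, the pattern would count as frequent if its support reached a user-chosen threshold. The third sequence contributes only in the approximate case, because its first box `ACC` differs from `ACG` in one position. This brute-force check only tests a given pattern; discovering frequent patterns without knowing them in advance is the harder problem that the referenced papers address.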

To access the system, click here.

If the system’s page does not load, please contact us to access the system.

References:

G. Terracina, A fast technique for deriving frequent structured patterns from biological data sets, Information Sciences, to appear.

L. Palopoli and G. Terracina, Discovering frequent structured patterns from string databases: an application to biological sequences, Proc. of the 5th International Conference on Discovery Science (DS 2002), Lübeck, Germany, 2002, Lecture Notes in Computer Science, Springer-Verlag, pp. 34-46.