I have a set of letter sequences and want to find the patterns that occur often.
The sequences are below.
---------TAAA-GAGAG----T--T-------T
-------------T------A---------T----
----------AAA-GAGAG---C-----------T
------C------------T-A----T-----TA-
-----AC----------------TT--------C-
-------------------T---------------
---A---------------T---------------
-------------------------T----T----
------C---------------------T-----T
----------AAA-GAGAG---C------------
----A--------------------T-------A-
--------G-G-----------G-------T----
----A--------------------T---------
---------T-------------------------
-----AC------T--------------------T
--GA-G------------GT--------------T
-----A--G-AAA-GAGAG-AA-------------
-----------------------------------
-T-------C-G-------T---TT------T--T
TT-----------------T---------------
-------------------T---TT-T--------
---A----G------------A--------T----
-----------------------------------
-------T--AA--G-GAG---C------------
-T-A----------------A--------------
------------T------------------T---
-----------G-----------------------
--G-A------------------C---T---T---
----A---G---A-------A------T-------
--------G-------------------T-----T
-TG--------A---------A-T-----------
--G--A--------GAGAG---CT-----------
---A-------G------------T----G---A-
T-------G----T---------------------
-T----C-GCAA--GAGAG-A-C-----T--TT--
-----A----AAA-GAGAG-A-T------G---A-
-T---------G-----------------------
---A---T---------------------G-----
---A---------T-------A---T---A---A-
-----------------------------------
TC--A----T----------G-------T-T--G-
-T----CT---G-T-----T-A----T-T--T---
-------------T-----TA------------A-
--G----T-----------------T-T--T----
---A------AA--GAGAG---C-----T------
--------------GAGAG-A-C------------
----------AA--GAGAG---C-G----------
As you can see, there is a pattern in the middle: "GAGAG".
So I can say that G+A+G+A+G come from the same data.
I want to split out the lines that contain it and group them.
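Once a pattern such as "GAGAG" is known, the split itself could be sketched roughly like this in Java. The class and method names (PatternGrouper, splitByPattern) are made up for illustration, and the sketch assumes a line belongs to the group only if it contains the pattern as one unbroken run.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PatternGrouper {
    // Hypothetical helper: puts lines that contain the pattern under key true,
    // and all other lines under key false.
    static Map<Boolean, List<String>> splitByPattern(List<String> lines, String pattern) {
        Map<Boolean, List<String>> groups = new LinkedHashMap<>();
        groups.put(true, new ArrayList<>());
        groups.put(false, new ArrayList<>());
        for (String line : lines) {
            groups.get(line.contains(pattern)).add(line);
        }
        return groups;
    }

    public static void main(String[] args) {
        // First three lines of the data above.
        List<String> lines = List.of(
            "---------TAAA-GAGAG----T--T-------T",
            "-------------T------A---------T----",
            "----------AAA-GAGAG---C-----------T");
        Map<Boolean, List<String>> groups = splitByPattern(lines, "GAGAG");
        System.out.println("with pattern:    " + groups.get(true));
        System.out.println("without pattern: " + groups.get(false));
    }
}
```

A line like "-------T--AA--G-GAG---C------------", where the run is interrupted by '-', would land in the "without" group under this assumption.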
Counter-example: the first line contains T+A+A+A, but in the other lines T does not appear together with A+A+A; only A+A+A always appears together. So in this case T and A do not form a group.
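One possible way to tell these two cases apart is to count, for every unbroken run of letters, how many lines contain it. The Java sketch below is only an illustration under two assumptions: '-' means "no data", and a pattern must sit inside one run without gaps. The names FrequentRuns and countSubstrings are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FrequentRuns {
    // For each substring of at least minLen letters that lies inside a gap-free run,
    // counts the number of lines that contain it (each line counts at most once).
    static Map<String, Integer> countSubstrings(List<String> lines, int minLen) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            Set<String> seenInLine = new HashSet<>();
            // Split on runs of '-' so only unbroken letter runs remain.
            for (String run : line.split("-+")) {
                for (int start = 0; start < run.length(); start++) {
                    for (int end = start + minLen; end <= run.length(); end++) {
                        seenInLine.add(run.substring(start, end));
                    }
                }
            }
            for (String s : seenInLine) {
                counts.merge(s, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Three lines taken from the data above.
        List<String> lines = List.of(
            "---------TAAA-GAGAG----T--T-------T",
            "----------AAA-GAGAG---C-----------T",
            "----------AAA-GAGAG---C------------");
        Map<String, Integer> counts = countSubstrings(lines, 3);
        for (String s : List.of("GAGAG", "AAA", "TAAA")) {
            System.out.println(s + " appears in " + counts.getOrDefault(s, 0) + " line(s)");
        }
    }
}
```

On these three lines "AAA" and "GAGAG" are each found in all three, while "TAAA" is found in only one, so a minimum line count would keep A+A+A and G+A+G+A+G as groups and drop T+A+A+A. The threshold and the minimum length are choices that depend on the data.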
Does anybody know an algorithm for this kind of problem, or how to find such a pattern?
I am trying to do this in Java.
Thank you.