Monday, 16 November 2020

Find unique sequences within DNA strings

I have a file which contains a bunch of sequences. Each string has a prefix of AAGCTT and a suffix of GCGGCCGC.

Between these two patterns lie unique sequences. I want to find these sequences and count how many times each one occurs.

Example below:

AAGCTTCTGCCCACACACCGAAACATGAATCGATCACATACTAGAATCAGGCAGTCAGAGATATCAAAGATGATGAGTTCGGCGGCCGC

String CTGCCCACACACCGAAACATGAATCGATCACATACTAGAATCAGGCAGTCAGAGATATCAAAGATGATGAGTTCG is present 1000 times.
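
One possible way to get that count is sketched below in Python. It assumes the file holds one sequence per line and that every sequence of interest starts with the prefix and ends with the suffix, as described above. The function names `extract_insert` and `count_inserts` are made up for this illustration, and lines that do not fit the pattern are simply skipped.

```python
from collections import Counter

PREFIX = "AAGCTT"
SUFFIX = "GCGGCCGC"


def extract_insert(seq):
    """Return the part of seq between PREFIX and SUFFIX, or None if seq does not fit."""
    seq = seq.strip().upper()
    # The length check stops the prefix and suffix from overlapping on short lines.
    if (
        len(seq) >= len(PREFIX) + len(SUFFIX)
        and seq.startswith(PREFIX)
        and seq.endswith(SUFFIX)
    ):
        return seq[len(PREFIX):len(seq) - len(SUFFIX)]
    return None


def count_inserts(lines):
    """Count how often each non-empty insert occurs in an iterable of lines."""
    counts = Counter()
    for line in lines:
        insert = extract_insert(line)
        if insert:  # skips non-matching lines (None) and empty inserts ("")
            counts[insert] += 1
    return counts
```

Passing an open file handle to `count_inserts` and iterating over `.most_common()` on the result would list each sequence with its count, most frequent first. Slicing off a fixed prefix and suffix is used here instead of a regular expression so that an insert that happens to contain GCGGCCGC itself is not cut short.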

Thanks in advance!
