Monday, 19 August 2019

How to detect a specific pattern in raw text when the filler content and its length are unknown?

I have raw text files that contain a specific pattern I am interested in, for example:

ABCD 12 # 55\n6 # 2. 6 # 1. 53 # 1. 40 # 1. 27 # 1. 14 # 1. 2 # 0. 49 #
        0. 36 # 0. 24 # 0. 11 # 54\n7 # 2. 5 # 1. 52 # 1. 40 # 1. 
27 #   1. 14 # 1. 2 # 0. 49 # 0. 36 # 0. 24 # 0. 11 # 53\n
8 # 2. 5 # 1.    52 # 1. 39
# 1. 27 # 1. 14 # 1. 1 # 0. 49 # 0. 36 # 0. 23 # 
0. 11 # 52\n
9 # 2. 5 # 1. 52 # 1. 39 # 1. 26 # 1. 14 # 1. 1 # 0. 48 # 0. 36 # 0. 23       # 0. 11 # 51\n
10 # 2. 5 # 1.     ABCDEFK A B C D

Now I want to 1) detect whether the text contains a part made of a repeated sequence ... # {some symbol} # {some symbol} {potential line break} # {some symbol} # {some symbol} ... and 2) extract all data that sits between two # characters.

I struggle to formalize this as a regex because the pattern is irregular in both length and content.

Can somebody help?
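
One possible starting point, as a minimal sketch rather than a definitive solution: normalize the whitespace first, then use one regex to detect a run of # separators and another to pull out whatever lies between two consecutive # characters. The function names, the `max_len` limit on a field, and the `min_fields` threshold are hypothetical choices made for illustration. The sketch also assumes that the `\n` in the sample may be either a real line break or the literal two characters backslash and n, and treats both as plain whitespace.

```python
import re


def normalize(raw: str) -> str:
    # Assumption: "\n" may appear as a real newline or as the literal
    # two-character sequence backslash + "n"; both become a single space.
    text = raw.replace("\\n", " ")
    return re.sub(r"\s+", " ", text)


def has_hash_run(raw: str, min_fields: int = 2, max_len: int = 30) -> bool:
    # Detect a "#", followed by at least `min_fields` fields that each hold
    # 1..max_len non-# characters and are each closed by another "#".
    # Both thresholds are illustrative guesses, not values from the data.
    pattern = r"#(?:[^#]{1,%d}#){%d,}" % (max_len, min_fields)
    return re.search(pattern, normalize(raw)) is not None


def extract_fields(raw: str) -> list[str]:
    # The lookahead (?=#) requires a closing "#" without consuming it,
    # so that same "#" can open the next field.
    fields = re.findall(r"#([^#]*)(?=#)", normalize(raw))
    return [field.strip() for field in fields]


if __name__ == "__main__":
    sample = "ABCD 12 # 55\\n6 # 2. 6 # 1.\n53 # 1. 40 # tail A B C"
    print(has_hash_run(sample))
    print(extract_fields(sample))
```

Text before the first # and after the last # is never returned, so header and trailer filler of any length is ignored. If the fields follow a stricter shape (digits only, for instance), replacing `[^#]` with a narrower character class would make the detection less permissive.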
