Thursday, November 14, 2019

Sort lines in text file between patterns

I am trying to sort the lines between patterns in Bash or Python. Within each block, the lines should be sorted numerically by the second field, using "," as the delimiter.

Given the following text input file:

Sample1
T1,64,0.65  MEDIUM
T2,60,0.45  LOW
T3,301,0.68  MEDIUM
T4,65,0.75  HIGH
T5,59,0.72  MEDIUM
T6,51,0.82  HIGH
Sample2
T1,153,0.77  HIGH
T2,152,0.61  MEDIUM
T3,154,0.67  MEDIUM
T4,283,0.66  MEDIUM
T5,161,0.65  MEDIUM
Sample3
T1,147,0.71  MEDIUM
T2,154,0.63  MEDIUM
T3,45,0.63  MEDIUM
T4,259,0.77  HIGH

I expect as output:

Sample1
T6,51,0.82  HIGH
T5,59,0.72  MEDIUM
T2,60,0.45  LOW
T1,64,0.65  MEDIUM
T4,65,0.75  HIGH
T3,301,0.68  MEDIUM
Sample2
T2,152,0.61  MEDIUM
T1,153,0.77  HIGH
T3,154,0.67  MEDIUM
T5,161,0.65  MEDIUM
T4,283,0.66  MEDIUM
Sample3
T3,45,0.63  MEDIUM
T1,147,0.71  MEDIUM
T2,154,0.63  MEDIUM
T4,259,0.77  HIGH

I have tried to adapt this suggestion by glenn jackman, found in another post, but as far as I have tested it only works with 2 patterns:

> gawk -v cmd="sort -k7" -v p=1 '
>     /^PATTERN2/ {          # when we see the 2nd marker:
>         close(cmd, "to")   # close the write end so that sort emits its output
>         while ((cmd |& getline line) > 0) print line
>         p=1
>     }
>     p  {print}             # if p is true, print the line
>     !p {print |& cmd}      # if p is false, send the line to `sort`
>     /^PATTERN1/ {p=0}      # when we see the first marker, turn off printing
> ' FILE
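
For comparison, here is a minimal Python sketch of the per-block sort described above. It rests on a few assumptions that are not confirmed by the post: every header line starts with "Sample", every other non-blank line has an integer in its second comma-separated field, and blank lines can be dropped. The function name `sort_sections` and the `header_prefix` parameter are hypothetical names chosen for illustration.

```python
def sort_sections(lines, header_prefix="Sample"):
    """Return lines with each block under a header sorted by field 2 (numeric)."""
    result = []
    block = []  # data lines collected since the last header

    def flush():
        # Emit the pending block sorted by the integer in the second
        # comma-separated field, then empty it for the next block.
        result.extend(sorted(block, key=lambda line: int(line.split(",")[1])))
        block.clear()

    for line in lines:
        if line.startswith(header_prefix):  # assumed header convention
            flush()
            result.append(line)
        elif line.strip():                  # ignore blank lines
            block.append(line)
    flush()  # the last block has no header after it
    return result


# Small demonstration using two blocks taken from the input above.
sample = [
    "Sample2",
    "T1,153,0.77  HIGH",
    "T2,152,0.61  MEDIUM",
    "T3,154,0.67  MEDIUM",
    "T4,283,0.66  MEDIUM",
    "T5,161,0.65  MEDIUM",
    "Sample3",
    "T1,147,0.71  MEDIUM",
    "T2,154,0.63  MEDIUM",
    "T3,45,0.63  MEDIUM",
    "T4,259,0.77  HIGH",
]
for line in sort_sections(sample):
    print(line)
```

Because each block is flushed whenever a new header is seen, and once more at the end of the input, the approach is not limited to two patterns. To run it on a file, the lines could be read with `open(path).read().splitlines()` and passed to the function.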
