Friday, 1 April 2022

awk: processing a log and searching for a pattern

I am working with log files arranged in the following format:

Finding intramodel H-bonds
Constraints relaxed by 0.5 angstroms and 20 degrees
Models used:
    1.1 SarsCov2_06I_nsp5holo_rep1.pdb
    1.2 SarsCov2_06I_nsp5holo_rep1.pdb
    1.3 SarsCov2_06I_nsp5holo_rep1.pdb
    1.4 SarsCov2_06I_nsp5holo_rep1.pdb
    1.5 SarsCov2_06I_nsp5holo_rep1.pdb
    1.6 SarsCov2_06I_nsp5holo_rep1.pdb
    1.7 SarsCov2_06I_nsp5holo_rep1.pdb
    1.8 SarsCov2_06I_nsp5holo_rep1.pdb
    1.9 SarsCov2_06I_nsp5holo_rep1.pdb
    1.10 SarsCov2_06I_nsp5holo_rep1.pdb
    1.11 SarsCov2_06I_nsp5holo_rep1.pdb
    1.12 SarsCov2_06I_nsp5holo_rep1.pdb
    1.13 SarsCov2_06I_nsp5holo_rep1.pdb
    1.14 SarsCov2_06I_nsp5holo_rep1.pdb

30 H-bonds
H-bonds (donor, acceptor, hydrogen, D..A dist, D-H..A dist):
SarsCov2_06I_nsp5holo_rep1.pdb #1.1/? THR 26 N      SarsCov2_06I_nsp5holo_rep1.pdb #1.1/A UNL 888 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.1/? THR 26 H       3.477  2.692
SarsCov2_06I_nsp5holo_rep1.pdb #1.1/? GLU 166 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.1/A UNL 888 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.1/? GLU 166 H      3.129  2.160
SarsCov2_06I_nsp5holo_rep1.pdb #1.1/A UNL 888 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.1/? THR 26 O    SarsCov2_06I_nsp5holo_rep1.pdb #1.1/A UNL 888 H      3.433  2.633
SarsCov2_06I_nsp5holo_rep1.pdb #1.2/? GLY 143 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.2/A UNL 888 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.2/? GLY 143 H      3.108  2.169
SarsCov2_06I_nsp5holo_rep1.pdb #1.2/? GLU 166 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.2/A UNL 888 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.2/? GLU 166 H      3.136  2.180
SarsCov2_06I_nsp5holo_rep1.pdb #1.2/A UNL 888 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.2/? THR 26 O    SarsCov2_06I_nsp5holo_rep1.pdb #1.2/A UNL 888 H      3.426  2.631
SarsCov2_06I_nsp5holo_rep1.pdb #1.3/? GLN 189 NE2   SarsCov2_06I_nsp5holo_rep1.pdb #1.3/A UNL 888 S   SarsCov2_06I_nsp5holo_rep1.pdb #1.3/? GLN 189 1HE2   3.568  2.993
SarsCov2_06I_nsp5holo_rep1.pdb #1.3/A UNL 888 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.3/? PHE 140 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.3/A UNL 888 H      3.119  2.247
SarsCov2_06I_nsp5holo_rep1.pdb #1.4/? HIS 163 NE2   SarsCov2_06I_nsp5holo_rep1.pdb #1.4/A UNL 888 N   no hydrogen                                          3.044  N/A
SarsCov2_06I_nsp5holo_rep1.pdb #1.4/A UNL 888 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.4/? ARG 188 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.4/A UNL 888 H      3.046  2.116
SarsCov2_06I_nsp5holo_rep1.pdb #1.5/? GLU 166 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.5/A UNL 888 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.5/? GLU 166 H      3.146  2.335
SarsCov2_06I_nsp5holo_rep1.pdb #1.6/? ASN 142 ND2   SarsCov2_06I_nsp5holo_rep1.pdb #1.6/A UNL 888 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.6/? ASN 142 1HD2   2.977  2.229
SarsCov2_06I_nsp5holo_rep1.pdb #1.6/? GLY 143 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.6/A UNL 888 N   SarsCov2_06I_nsp5holo_rep1.pdb #1.6/? GLY 143 H      3.312  2.502
SarsCov2_06I_nsp5holo_rep1.pdb #1.6/A UNL 888 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.6/? THR 190 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.6/A UNL 888 H      2.993  2.138
SarsCov2_06I_nsp5holo_rep1.pdb #1.7/? HIS 163 NE2   SarsCov2_06I_nsp5holo_rep1.pdb #1.7/A UNL 888 N   no hydrogen                                          3.208  N/A
SarsCov2_06I_nsp5holo_rep1.pdb #1.7/A UNL 888 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.7/? ARG 188 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.7/A UNL 888 H      2.928  2.013
SarsCov2_06I_nsp5holo_rep1.pdb #1.8/? ASN 142 ND2   SarsCov2_06I_nsp5holo_rep1.pdb #1.8/A UNL 888 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.8/? ASN 142 1HD2   3.146  2.477
SarsCov2_06I_nsp5holo_rep1.pdb #1.8/A UNL 888 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.8/? THR 190 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.8/A UNL 888 H      2.986  2.138
SarsCov2_06I_nsp5holo_rep1.pdb #1.9/A UNL 888 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.9/? ARG 188 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.9/A UNL 888 H      3.295  2.571
SarsCov2_06I_nsp5holo_rep1.pdb #1.11/? ASN 142 ND2  SarsCov2_06I_nsp5holo_rep1.pdb #1.11/A UNL 888 O  SarsCov2_06I_nsp5holo_rep1.pdb #1.11/? ASN 142 1HD2  3.013  2.056
SarsCov2_06I_nsp5holo_rep1.pdb #1.11/? GLU 166 N    SarsCov2_06I_nsp5holo_rep1.pdb #1.11/A UNL 888 O  SarsCov2_06I_nsp5holo_rep1.pdb #1.11/? GLU 166 H     3.158  2.376
SarsCov2_06I_nsp5holo_rep1.pdb #1.12/? HIS 41 NE2   SarsCov2_06I_nsp5holo_rep1.pdb #1.12/A UNL 888 S  no hydrogen                                          3.850  N/A
SarsCov2_06I_nsp5holo_rep1.pdb #1.12/? GLY 143 N    SarsCov2_06I_nsp5holo_rep1.pdb #1.12/A UNL 888 N  SarsCov2_06I_nsp5holo_rep1.pdb #1.12/? GLY 143 H     3.125  2.149
SarsCov2_06I_nsp5holo_rep1.pdb #1.12/A UNL 888 N    SarsCov2_06I_nsp5holo_rep1.pdb #1.12/? THR 190 O  SarsCov2_06I_nsp5holo_rep1.pdb #1.12/A UNL 888 H     3.071  2.229
SarsCov2_06I_nsp5holo_rep1.pdb #1.13/? ASN 142 ND2  SarsCov2_06I_nsp5holo_rep1.pdb #1.13/A UNL 888 S  SarsCov2_06I_nsp5holo_rep1.pdb #1.13/? ASN 142 2HD2  3.767  2.968
SarsCov2_06I_nsp5holo_rep1.pdb #1.13/? GLN 189 NE2  SarsCov2_06I_nsp5holo_rep1.pdb #1.13/A UNL 888 O  SarsCov2_06I_nsp5holo_rep1.pdb #1.13/? GLN 189 1HE2  2.948  2.173
SarsCov2_06I_nsp5holo_rep1.pdb #1.13/A UNL 888 N    SarsCov2_06I_nsp5holo_rep1.pdb #1.13/? PHE 140 O  SarsCov2_06I_nsp5holo_rep1.pdb #1.13/A UNL 888 H     2.950  2.140
SarsCov2_06I_nsp5holo_rep1.pdb #1.14/? ASN 142 ND2  SarsCov2_06I_nsp5holo_rep1.pdb #1.14/A UNL 888 N  SarsCov2_06I_nsp5holo_rep1.pdb #1.14/? ASN 142 1HD2  3.175  2.426
SarsCov2_06I_nsp5holo_rep1.pdb #1.14/? ASN 142 ND2  SarsCov2_06I_nsp5holo_rep1.pdb #1.14/A UNL 888 O  SarsCov2_06I_nsp5holo_rep1.pdb #1.14/? ASN 142 1HD2  3.043  2.185
SarsCov2_06I_nsp5holo_rep1.pdb #1.14/A UNL 888 N    SarsCov2_06I_nsp5holo_rep1.pdb #1.14/? THR 190 O  SarsCov2_06I_nsp5holo_rep1.pdb #1.14/A UNL 888 H     3.245  2.377

I need to find the first occurrence of the pattern "GLU 166 N" and print the model number that appears on the same line just before it, in the form #1.number/?. In the example above, the expected output is 1, since the first matching line contains #1.1/?.

I would start with basic pattern matching:

awk '/GLU 166 N/' file

But how do I correctly extract the number that appears just before the pattern and print it as output?
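One possible approach, offered as a minimal sketch rather than a definitive answer: each H-bond line starts with the file name followed by the model identifier, so the second whitespace-separated field holds the #1.number/? token, and splitting that field on "." and "/" isolates the number. The file name hbonds_sample.log and the array name parts are hypothetical, and the sample holds only three lines copied from the log above.

```shell
# Hypothetical sample file: three H-bond lines copied from the log above.
cat > hbonds_sample.log <<'EOF'
SarsCov2_06I_nsp5holo_rep1.pdb #1.1/? THR 26 N      SarsCov2_06I_nsp5holo_rep1.pdb #1.1/A UNL 888 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.1/? THR 26 H       3.477  2.692
SarsCov2_06I_nsp5holo_rep1.pdb #1.1/? GLU 166 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.1/A UNL 888 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.1/? GLU 166 H      3.129  2.160
SarsCov2_06I_nsp5holo_rep1.pdb #1.2/? GLU 166 N     SarsCov2_06I_nsp5holo_rep1.pdb #1.2/A UNL 888 O   SarsCov2_06I_nsp5holo_rep1.pdb #1.2/? GLU 166 H      3.136  2.180
EOF

result=$(awk '
/GLU 166 N/ {
    # Assumption: field 2 is the model token, e.g. "#1.1/?".
    # Splitting on "." or "/" gives "#1", "1", "?".
    split($2, parts, "[./]")
    print parts[2]
    exit    # stop at the first matching line
}' hbonds_sample.log)

echo "$result"    # prints 1
```

Since the log reports intramodel H-bonds, the donor and acceptor on a given line share the same model number, so field 2 should remain valid even if the pattern matches in the acceptor column. That is an assumption about this log layout, so it is worth checking against a full file before relying on it.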
