Wednesday, December 27, 2017

Find a row in a data frame using regex in R?

I have a translation table (trans_df):

   rs1065852 rs201377835 rs28371706 rs5030655 rs5030865 rs3892097 rs35742686
1          G           C          G         A         C         C          T
2          G           C          G         A         C         C        del
3          A           C          G         A         C         T          T
4        del         del        del       del       del       del        del
5          G           C          G       del         C         C          T
6          G           C          G         A         C         C          T
7          G           C          G         A         C         C          T
8          A           C          G         A         C         C          T
9          G           C          A         A         C         C          T
10         G           C          G         A         C         C          T
11         G           C          G         A         C         C          T
   rs5030656 rs5030867 rs28371725 rs59421388
1        CTT         T          C          C
2        CTT         T          C          C
3        CTT         T          C          C
4        del       del        del        del
5        CTT         T          C          C
6        CTT         G          C          C
7        del         T          C          C
8        CTT         T          C          C
9        CTT         T          C          C
10       CTT         T          C          T
11       CTT         T          T          C

and a one-row input (input):

  rs1065852 rs201377835 rs28371706 rs5030655 rs5030865 rs3892097 rs35742686
1       G|A           C        G|A         A         C       T|C          T
  rs5030656 rs5030867 rs28371725 rs59421388
1       CTT         T        C|T          C

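I want to find the rows of trans_df that match the input row, treating each input cell as a regular expression (for example, G|A should match either G or A). I have achieved this by going through the columns by position: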

Reduce(intersect,
       lapply(seq_len(ncol(trans_df)),
              function(i) grep(pattern = input[, i], trans_df[, i])))

Is there a way to do this in one step, passing the whole input row as the pattern (pattern = input)? Please advise.
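
For reference, grep() and grepl() take a single pattern, so a whole row cannot be passed as pattern directly; the per-column loop can, however, be written by pairing columns by name instead of by position. The following is only a minimal sketch under some assumptions: match_rows is a hypothetical helper name, the data is a cut-down three-column subset of the tables above, and each pattern is anchored with ^( )$ so that, for instance, C would not also match CTT (the original grep() call is unanchored).

```r
# Illustrative subset of the data in the question (three columns, four rows).
trans_df <- data.frame(
  rs1065852  = c("G", "G", "A", "del"),
  rs28371706 = c("G", "G", "G", "del"),
  rs3892097  = c("C", "C", "T", "del"),
  stringsAsFactors = FALSE
)
input <- data.frame(
  rs1065852  = "G|A",
  rs28371706 = "G|A",
  rs3892097  = "T|C",
  stringsAsFactors = FALSE
)

# Hypothetical helper: indices of the rows of `table` in which every cell
# matches the regex held in the same-named column of `patterns`.
match_rows <- function(patterns, table) {
  hits <- Map(
    function(pat, col) {
      # Anchoring is an assumption: the whole cell must match the pattern.
      grepl(paste0("^(", as.character(pat), ")$"), as.character(col))
    },
    patterns[names(table)],  # align the patterns with the table's columns
    table
  )
  # Keep only the rows where all columns matched.
  which(Reduce(`&`, hits))
}

matched <- match_rows(input, trans_df)
print(matched)
```

Combining logical vectors with `&` plays the same role as Reduce(intersect, ...) over the grep() indices, and pairing by column name means the result does not depend on input and trans_df having their columns in the same order.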
