Friday, 26 February 2021

How do I create a new column that contains part of a string based on a pattern in R?

Apologies if this has been solved before; I haven't been able to find a solution to this problem. I am trying to pull the letter "N" out of a string, together with the characters at the -1 and +1 positions, and report the result in a new column. There may be more than one instance of N in the string, and I would like to report all of them.

I can filter the peptides containing N using dt_contains_N <- dt[str_detect(dt$Peptide, "N"),] but I'm not sure how to extract the motif. I was thinking of something like dt_N_motif <- dt[substring(dt$Peptide, regexpr("N", dt$Peptide) + 1)] but I'm not sure how to use the N-position column to extract the N-1, N and N+1 positions.

For example, a simplified view of my data table looks like: dt <- data.frame(Peptide = c("GESNEL", "SADNNEW", "SADNNEW"), N_pos = c(4, 4, 5))

Peptide  N_pos
GESNEL   4
SADNNEW  4
SADNNEW  5

and I would like it to look like this:

Peptide  N_pos  Motif
GESNEL   4      SNE
SADNNEW  4      DNN
SADNNEW  5      NNE

Any help would be great,

Thanks!
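
One possible approach, offered as a minimal sketch rather than a definitive answer: since each row already carries the position of the N of interest, base R's substr() can cut out the three-character window directly. This assumes N_pos is a 1-based character position within Peptide, as the example data suggests; the column name Motif is taken from the desired output above.

```r
# Example data from the question
dt <- data.frame(Peptide = c("GESNEL", "SADNNEW", "SADNNEW"),
                 N_pos = c(4, 4, 5))

# substr() works element-wise here: for each row it takes the characters
# from N_pos - 1 to N_pos + 1 (assumes N_pos is a 1-based position).
# as.character() guards against Peptide being stored as a factor.
dt$Motif <- substr(as.character(dt$Peptide), dt$N_pos - 1, dt$N_pos + 1)

print(dt)
```

Note that when N is the first or last residue of a peptide, substr() simply returns a shorter string (two characters instead of three), so such rows may need separate handling depending on what the motif should look like at the termini.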
