Sunday, 22 November 2020

Replace multiple values in data table column after multiple pattern match

Here is a snippet that might help a few R beginners like me. I had been following this thread while working on a melted data table:

Replace entire string anywhere in dataframe based on partial match with dplyr

I was looking for an easy way to replace the entire string in one column of a data table whenever it partially matches a pattern. I could not find a direct fit on the forum, hence this post:

library(data.table)
library(stringr)
dt<-data.table(x=c("A_1", "BB_2", "CC_3"),y=c("K_1", "LL_2", "MM_3"),z=c("P_1","QQ_2","RR_3"))
> dt
      x    y    z
1:  A_1  K_1  P_1
2: BB_2 LL_2 QQ_2
3: CC_3 MM_3 RR_3

Replace multiple values in column y, using multiple patterns to match:

dt[,2]<-str_replace_all(as.matrix(dt[,2]),c("K_.*" = "FORMULA","LL_.*" = "RACE","MM_.*" = "CAR"))

Wrapping the column in as.matrix() avoids the warning that str_replace_all() gives when its input is the one-column data table dt[,2] rather than a character vector. The result is:

> dt[,2]<-str_replace_all(as.matrix(dt[,2]),c("K_.*" = "FORMULA","LL_.*" = "RACE","MM_.*" = "CAR"))
> dt
      x       y    z
1:  A_1 FORMULA  P_1
2: BB_2    RACE QQ_2
3: CC_3     CAR RR_3
>

Not very elegant, but it worked for me, and it seemed to be a quick solution even when the column is large.

Requires library(stringr). Any suggestions for improvement are appreciated.
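
One possible tidier variant, offered as a sketch rather than a definitive answer: data.table's := operator can update column y in place, and inside dt[...] the name y is already a plain character vector, so as.matrix() is not needed. The variable name lookup and the ^ anchors on the patterns are my own additions, not part of the original snippet; the anchors just keep a pattern such as "K_" from matching in the middle of a longer string.

```r
library(data.table)
library(stringr)

dt <- data.table(
  x = c("A_1", "BB_2", "CC_3"),
  y = c("K_1", "LL_2", "MM_3"),
  z = c("P_1", "QQ_2", "RR_3")
)

# Named character vector: names are regex patterns, values are replacements.
# "lookup" is a hypothetical name; the ^ anchors are an assumption added here.
lookup <- c("^K_.*" = "FORMULA", "^LL_.*" = "RACE", "^MM_.*" = "CAR")

# Inside dt[...], y is a character vector, so it can be passed straight to
# str_replace_all(); := assigns the result back to column y by reference.
dt[, y := str_replace_all(y, lookup)]

print(dt)
```

Note that str_replace_all() applies the named patterns one after another, so a replacement value that itself matches a later pattern would be rewritten again. That does not happen with the values used here, but it is worth checking for other lookup tables.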
