Saturday, 19 September 2020

Detect repeating patterns across two columns in a data frame that has an ID variable

I have a data set with three columns (ID, D, AE). I want to retrieve the records where any two values in column D co-occur with any value in column AE three times or more:


sample <- data.frame(
  ID = c(1,1,1,1,2,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5),
  D  = c('a','b','c','d','e','f','g','y','z','a','b','g','y','d','e','f','g','b','a','y'),
  AE = c('m','h','j','k','m','h','j','k','m','j','m','h','l','j','k','m','h','m','o','s')
)

I want the output to look something like:

output <- data.frame(
  ID = c(1, 1),
  D  = c('a', 'b'),
  AE = 'm'
)

Please note that I want the output to contain the records where any two values from column D, within a given ID, occurred with a value from column AE three times or more.
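
One possible reading of this requirement, sketched in base R: within each ID, form every unordered pair of distinct D values, combine each pair with every distinct AE value seen in that same ID, and keep the (pair, AE) combinations that turn up in three or more IDs. This interpretation is an assumption, since the question does not spell out whether "three times" counts IDs or rows, and the helper names `triples`, `counts`, `frequent`, `hits`, `D1`, `D2` and `n_ids` are made up for illustration.

```r
# Sample data from the question.
sample <- data.frame(
  ID = c(1,1,1,1,2,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5),
  D  = c('a','b','c','d','e','f','g','y','z','a','b','g','y','d','e','f','g','b','a','y'),
  AE = c('m','h','j','k','m','h','j','k','m','j','m','h','l','j','k','m','h','m','o','s'),
  stringsAsFactors = FALSE
)

# Step 1: per ID, cross every unordered pair of D values with every AE value.
triples_per_id <- lapply(split(sample, sample$ID), function(g) {
  d <- sort(unique(as.character(g$D)))
  if (length(d) < 2) return(NULL)          # no pair can be formed
  pairs <- t(combn(d, 2))                  # one row per pair, sorted so (a, b) == (b, a)
  ae <- unique(as.character(g$AE))
  idx <- expand.grid(p = seq_len(nrow(pairs)), a = seq_along(ae))
  data.frame(
    ID = g$ID[1],
    D1 = pairs[idx$p, 1],
    D2 = pairs[idx$p, 2],
    AE = ae[idx$a],
    stringsAsFactors = FALSE
  )
})
triples <- do.call(rbind, triples_per_id)

# Step 2: count in how many IDs each (D1, D2, AE) combination appears.
# Each combination occurs at most once per ID, so length() counts IDs.
counts <- aggregate(ID ~ D1 + D2 + AE, data = triples, FUN = length)
names(counts)[names(counts) == "ID"] <- "n_ids"

# Step 3: keep combinations seen in three or more IDs (threshold from the question).
frequent <- counts[counts$n_ids >= 3, ]

# Step 4: list the IDs in which each frequent combination occurs.
hits <- merge(triples, frequent[c("D1", "D2", "AE")])
```

Under this reading, the only combination in the sample data that reaches the threshold is the pair `a`, `b` with AE value `m`, which appears in IDs 1, 3 and 5. Note that the co-occurrence is counted at the ID level: the AE value does not have to sit on the same row as either D value. The full cross product can grow quickly on large data, so a data.table or dplyr version may be preferable there.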
