Monday, 13 May 2019

How can I delete rows in which the same word appears two or more times in a row?

I want to remove every row in which the same word occurs two or more times consecutively, i.e. in adjacent columns of the sequence. I need this as preparation for a sequential pattern mining analysis.

I already tried the distinct() function, but it does not give the result I want: it only keeps one row per unique combination of ROI and next_roi.

library(dplyr)

r_seq_5 <- r_seq_5 %>% # remove duplicated rows based on the ROIs
  distinct(ROI, next_roi, .keep_all = TRUE)

I also tried the duplicated() function, but it only removes rows that are exact duplicates of an earlier row; it does not look for repeated words within a row.

r_seq_5 <- r_seq_5[!duplicated(r_seq_5),] # remove duplicates


   #       Su Score result ROI       next_roi  third_roi  four_roi   five_roi   
   #  1     1    90 high   Elsewhere Elsewhere Teacher    Teacher    Teacher   
   #  2     1    90 high   Elsewhere Teacher   Teacher    Teacher    Teacher   
   #  3     1    90 high   Teacher   Pen       Teacher    Elsewhere  Smartboard

This is the table. It does not matter if Teacher appears two or three times in a row of the table, as long as the occurrences are not directly next to each other.

The desired result is:

# 3     1    90 high   Teacher   Pen       Teacher    Elsewhere  Smartboard
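
The rule described above (drop a row when any ROI column holds the same value as the column right before it) could be sketched in base R as follows. This is only a minimal sketch, not a confirmed solution: the data frame is rebuilt by hand from the table in the question, and the names roi_cols, has_adjacent_repeat, repeat_flag and r_seq_5_clean are hypothetical helpers introduced for illustration. It also assumes the five ROI columns are already in sequence order.

```r
# Example data copied by hand from the table in the question.
r_seq_5 <- data.frame(
  Su        = c(1, 1, 1),
  Score     = c(90, 90, 90),
  result    = c("high", "high", "high"),
  ROI       = c("Elsewhere", "Elsewhere", "Teacher"),
  next_roi  = c("Elsewhere", "Teacher", "Pen"),
  third_roi = c("Teacher", "Teacher", "Teacher"),
  four_roi  = c("Teacher", "Teacher", "Elsewhere"),
  five_roi  = c("Teacher", "Teacher", "Smartboard"),
  stringsAsFactors = FALSE
)

# Columns that make up the sequence, in order (assumed from the table header).
roi_cols <- c("ROI", "next_roi", "third_roi", "four_roi", "five_roi")

# Hypothetical helper: TRUE if any element equals the element right before it.
# na.rm = TRUE means comparisons involving missing values are ignored.
has_adjacent_repeat <- function(x) {
  any(x[-1] == x[-length(x)], na.rm = TRUE)
}

# Check each row of the ROI columns; as.matrix() yields a character matrix.
repeat_flag <- apply(as.matrix(r_seq_5[, roi_cols]), 1, has_adjacent_repeat)

# Keep only the rows without an adjacent repeat.
r_seq_5_clean <- r_seq_5[!repeat_flag, ]
print(r_seq_5_clean)
```

Because only neighbouring columns are compared, a row such as Teacher, Pen, Teacher is kept even though Teacher occurs twice.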
