Wednesday, February 10, 2021

How can I identify patterns over several rows in a column and fill a new column with information about that pattern using R?

I am working with a dataset that has a dummy variable indicating whether a participant is near a chair or away from it. Each participant has an ID, and the dummy D = 1 when the participant is near the chair and 0 when they are a certain number of feet away. The data updates every 30 seconds or so (it isn't perfectly timed, so I can't use the timestamps to help identify the pattern). If a participant stops at the chair for 2 minutes, we would have 4 observations where D = 1. Using the dummy variable, I want to identify when the participant starts moving away from the chair and then moves back towards it. I would also like to fill a new column with a value (something like TRUE) whenever a row fits that pattern of leaving, wandering, and coming back.

Note: the overall event could end with the individual either away from the chair or near it.

df1 illustrates an example of the dataframe I'm working with, and df2 is the desired final dataframe. I assume I have to use regex or grep to get the final output, but I'm not particularly familiar with either and I could use some guidance!

df1

TIME        ID        D
12:30:10    2         0
12:30:42    2         0
12:30:59    2         1
12:31:20    2         0
12:31:50    2         0
12:32:11    2         0
12:32:45    2         1
12:33:10    2         1
12:33:33    2         1
12:33:55    2         1
12:34:15    2         0
12:34:30    2         0

What I'd like to have in the end:

df2

TIME        ID        D      Pattern
12:30:10    2         0      FALSE
12:30:42    2         0      FALSE
12:30:59    2         1      TRUE
12:31:20    2         0      TRUE     
12:31:50    2         0      TRUE
12:32:11    2         0      TRUE
12:32:45    2         1      TRUE
12:33:10    2         1      FALSE
12:33:33    2         1      FALSE
12:33:55    2         1      FALSE
12:34:15    2         0      FALSE
12:34:30    2         0      FALSE

Eventually I want to end up with a data frame where the observations only include the rows that are TRUE.

Once again, the dataframe includes several IDs and events like this.
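For reference, here is a minimal base R sketch of the behaviour described above. It uses run-length encoding (`rle`) rather than regex or grep, and may not be the best approach. It assumes that rows are already sorted by time within each ID, that D contains only 0 and 1 with no missing values, and that a "pattern" is any run of 0s with a 1 directly before and after it. The function name `flag_away_and_back` is made up for illustration.

```r
# Example data copied from df1 above
df1 <- data.frame(
  TIME = c("12:30:10", "12:30:42", "12:30:59", "12:31:20", "12:31:50", "12:32:11",
           "12:32:45", "12:33:10", "12:33:33", "12:33:55", "12:34:15", "12:34:30"),
  ID = 2,
  D = c(0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0)
)

# Hypothetical helper: flags every run of 0s that sits between two runs of 1s,
# together with the last 1 before the run and the first 1 after it.
flag_away_and_back <- function(d) {
  r <- rle(d)
  n_runs <- length(r$values)
  ends <- cumsum(r$lengths)        # last row index of each run
  starts <- ends - r$lengths + 1   # first row index of each run
  pattern <- logical(length(d))    # all FALSE to begin with
  for (i in seq_len(n_runs)) {
    is_interior <- i > 1 && i < n_runs
    if (is_interior && r$values[i] == 0 &&
        r$values[i - 1] == 1 && r$values[i + 1] == 1) {
      # Extend by one row on each side to include the bounding 1s
      pattern[(starts[i] - 1):(ends[i] + 1)] <- TRUE
    }
  }
  pattern
}

# Apply within each ID; ave() returns numeric here, so convert back to logical
df1$Pattern <- as.logical(ave(df1$D, df1$ID, FUN = flag_away_and_back))

# Keep only the rows that belong to a leave-and-return pattern
df2 <- df1[df1$Pattern, ]
```

With the example data this flags the rows from 12:30:59 through 12:32:45, matching the Pattern column shown in df2. The trailing 0s at 12:34:15 and 12:34:30 stay FALSE under these assumptions because the participant has not returned by the end of the data.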

Thanks!
