Tuesday, 18 February 2020

How to recognize unknown patterns in a data frame by row?

I have a data frame of agricultural use codes (1-5) for 15 consecutive years. Each row is a polygon representing a field. Ultimately I need R to go through the rows, recognize patterns of use, and tell me how often each one occurs. Unfortunately, my real data set has over 1 million features, so the possible patterns are not known in advance.

# Example data: 500 fields (rows) with a random use code for each of 15 years
a <- data.frame(replicate(15, sample(0:5, 500, replace = TRUE)))
colnames(a) <- paste0("use", 2005:2019)
a <- cbind(id = 1:500, a)

id use2005 use2006 use2007 use2008 use2009 use2010 use2011 use2012 use2013 use2014 use2015 ...
1  1       1       1       1       1       2       2       1       4       4       4       ...
2  4       4       4       4       5       5       5       0       5       5       5       ...
3  1       4       3       2       3       2       4       5       1       1       1       ...
4  1       1       1       1       1       2       2       1       4       4       4       ...
5  4       2       2       2       2       5       3       3       3       3       3       ...

So in this arbitrary example, the code should recognize that ids 1 and 4 have the same pattern.
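One possible way to make identical rows detectable is to collapse each row's codes into a single string key, so that rows with equal keys share a pattern. The following is only a minimal sketch in base R on a small hand-made data frame; the toy values and the names `use_cols`, `pattern`, `groups` and `shared` are illustrative assumptions, not part of the real data set.

```r
# Toy data (assumed for illustration): 5 fields, 4 years;
# ids 1 and 4 have the same sequence of use codes
a <- data.frame(
  id      = 1:5,
  use2005 = c(1, 4, 1, 1, 4),
  use2006 = c(1, 4, 4, 1, 2),
  use2007 = c(2, 5, 3, 2, 2),
  use2008 = c(4, 0, 2, 4, 3)
)

# Columns holding the yearly use codes (everything except the id)
use_cols <- setdiff(names(a), "id")

# Collapse each row into one string key, e.g. "1 1 2 4"
a$pattern <- do.call(paste, a[use_cols])

# Group ids by key and keep only keys shared by more than one field
groups <- split(a$id, a$pattern)
shared <- groups[lengths(groups) > 1]
print(shared)
```

Because `paste()` is vectorised over the columns, no explicit loop over the rows is needed in this sketch.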

In the end I imagine the result to be some sort of frequency distribution to see if there are certain patterns in the agricultural use of my fields.

For example:

1 1 1 1 1 2 1 1 1 3 2 4 1 1 1

[50] - occurs 50 times

5 5 5 5 5 1 1 1 1 4 4 4 2 2 3

[35] - occurs 35 times

and so forth with all existing combinations...
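A frequency distribution of this kind could be sketched by counting identical row keys with `table()`. The example below is an assumption-laden illustration, not a tested solution for the real data: it uses only 3 years and 2 codes so that repeated patterns actually occur, and the names `pattern`, `freq` and `freq_df` are hypothetical.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Small random example: 200 fields, 3 years, use codes 1-2
a <- data.frame(replicate(3, sample(1:2, 200, replace = TRUE)))
colnames(a) <- paste0("use", 2005:2007)

# One string per row describing its sequence of use codes
pattern <- do.call(paste, a)

# Count each distinct pattern and sort from most to least frequent
freq <- sort(table(pattern), decreasing = TRUE)

# Same information as a plain data frame: one row per existing pattern
freq_df <- data.frame(pattern = names(freq), n = as.integer(freq))
print(freq_df)
```

Only patterns that actually occur in the data are counted, so the full set of possible combinations never has to be enumerated. For more than a million rows, grouped counts from packages such as data.table or dplyr may scale better, though that has not been checked here.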

Unfortunately I have no idea how to approach this. I have no experience with pattern recognition.

Thank you!
