I have a data frame that holds agricultural use codes (1-5) for 15 consecutive years. Each row is a polygon representing a field. Ultimately I need R to go through the rows, recognize patterns of use, and tell me how often each pattern occurs. My real data set has over 1 million features, so the possible patterns are not known in advance.
# Example data: 500 fields, one use code per year for 2005-2019
a <- data.frame(replicate(15, sample(0:5, 500, replace = TRUE)))
colnames(a) <- paste0("use", 2005:2019)
id <- 1:500
a <- cbind(id, a)
id use2005 use2006 use2007 use2008 use2009 use2010 use2011 use2012 use2013 use2014 use2015 ...
1 1 1 1 1 1 2 2 1 4 4 4 ...
2 4 4 4 4 5 5 5 0 5 5 5 ...
3 1 4 3 2 3 2 4 5 1 1 1 ...
4 1 1 1 1 1 2 2 1 4 4 4 ...
5 4 2 2 2 2 5 3 3 3 3 3 ...
So in this arbitrary example, the code should recognize that ids 1 and 4 have the same pattern.
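One possible way to check this, as a minimal sketch rather than a definitive method: collapse each row's year columns into a single string and group the ids by that string. The small data frame `ex` below is hypothetical and only reproduces the first four years of the five rows printed above; `key`, `groups` and `shared` are illustrative names.

```r
# Hypothetical mini example: first four years of the five rows printed above
ex <- data.frame(
  id      = 1:5,
  use2005 = c(1, 4, 1, 1, 4),
  use2006 = c(1, 4, 4, 1, 2),
  use2007 = c(1, 4, 3, 1, 2),
  use2008 = c(1, 4, 2, 1, 2)
)

# Collapse the use columns of each row into one pattern string
key <- do.call(paste, c(ex[-1], sep = " "))

# List the ids that belong to each distinct pattern
groups <- split(ex$id, key)

# Keep only the patterns shared by more than one field
shared <- groups[lengths(groups) > 1]
print(shared)
```

Here `shared` holds a single entry, the pattern "1 1 1 1" with ids 1 and 4.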
In the end I imagine the result as some sort of frequency distribution that shows whether certain patterns recur in the agricultural use of my fields.
For example:
1 1 1 1 1 2 1 1 1 3 2 4 1 1 1
[50] - occurs 50 times
5 5 5 5 5 1 1 1 1 4 4 4 2 2 3
[35] - occurs 35 times
and so forth with all existing combinations...
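A frequency table of this kind could be sketched as follows, assuming a data frame shaped like `a` in the example above, with `id` in the first column and one use column per year. The names `pattern` and `freq` are illustrative, and `set.seed` is an addition made only so the run is repeatable. With purely random codes almost every pattern is unique, so the counts only become informative on real data.

```r
set.seed(1)  # assumption: added only for reproducibility
a <- data.frame(replicate(15, sample(0:5, 500, replace = TRUE)))
colnames(a) <- paste0("use", 2005:2019)
a <- cbind(id = 1:500, a)

# One string per field, built from the 15 yearly use codes
pattern <- do.call(paste, c(a[-1], sep = " "))

# Count each distinct pattern and sort, most frequent first
freq <- sort(table(pattern), decreasing = TRUE)

# Show the most frequent patterns with their counts
head(as.data.frame(freq, stringsAsFactors = FALSE))
```

No pattern has to be known beforehand, because `table` counts whatever strings occur. For around a million rows, grouped counting in a package such as dplyr (`count`) or data.table may be faster, though that has not been measured here.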
Unfortunately I have no idea how to approach this. I have no experience with pattern recognition.
Thank you!