My dataset contains a free-text field with information on building plans. I need to split the content of this field into two parts: the first containing only the number of planned buildings, the second only the type of building. I have a reference lexicon listing the building types.
Example
Plans <- c("build 10 houses ", "5 luxury apartments with sea view", "renovate 20 cottages", " transform 2 bungalows and a school", "1 hotel")
Reference list
Types <- c("houses", "cottages", "bungalows", "luxury apartments")
Desired output: two columns, Number and Type, with this content:
Number Type
10 houses
5 apartments
20 cottages
2 bungalows
Tried
matches <- unique(grep(paste(Types, collapse = "|"), Plans, value = TRUE))
I can match the plans against the types, but I can't extract the numbers and types into two columns. I tried str_split_fixed and grepl with the [:digit:] and [:alpha:] character classes, but neither gave me the result I need.
Many thanks for help!
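One possible direction, sketched in base R: build a single pattern from the reference list with two capture groups (the digits, then the type) and pull the groups out with regexec() and regmatches(). This is only a sketch under a few assumptions: the number always comes directly before the type, each plan contains at most one number/type pair of interest, and the names `pattern`, `hits` and `result` are made up for illustration. Note that it returns "luxury apartments" as it appears in the reference list, not the shorter "apartments" shown in the desired output.

```r
Plans <- c("build 10 houses ", "5 luxury apartments with sea view",
           "renovate 20 cottages", " transform 2 bungalows and a school",
           "1 hotel")
Types <- c("houses", "cottages", "bungalows", "luxury apartments")

# Pattern: a run of digits, whitespace, then one of the reference types.
# Group 1 captures the number, group 2 captures the type.
pattern <- paste0("([0-9]+)[[:space:]]+(", paste(Types, collapse = "|"), ")")

# regexec() locates the full match and each capture group;
# regmatches() returns them as one character vector per plan.
hits <- regmatches(Plans, regexec(pattern, Plans))

# Plans without a match (here "1 hotel") yield character(0); drop them.
hits <- hits[lengths(hits) > 0]

# Element 1 is the full match, 2 is the number, 3 is the type.
result <- data.frame(
  Number = as.integer(vapply(hits, `[`, character(1), 2)),
  Type   = vapply(hits, `[`, character(1), 3),
  stringsAsFactors = FALSE
)

print(result)
```

If a plan can mention several number/type pairs, gregexpr() with the same pattern would be a starting point instead of regexec(), since regexec() only reports the first match in each string.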