I have collected tweets and would like to extract the emoji Unicode from each tweet. The emoji Unicode is in a format enclosed in "<" and ">", and I have used gsub to remove all text before and after the emoji:
tweets$text <- gsub(".*(<.*>).*", "\\1", tweets$text)
However, because there may be several emojis per tweet, I have decided to split each column after the character ">".
In some columns, there are strings that consist only of alphabetic characters and do not start with "<".
My question is: How do I remove the string if it does not start with a "<"?
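One possible approach, sketched under assumptions: treat the split pieces as a character vector and blank out every piece that does not begin with "<". The sample values and the helper name keep_bracketed are hypothetical, and the "<U+....>" tokens are placeholders, since the exact emoji format is not shown above.

```r
# Hypothetical input: placeholder tokens standing in for the real emoji codes.
pieces <- c("<U+1F600>", "hello", "<U+2764>", "world", NA)

# Replace every piece that does not begin with "<" by NA.
# grepl() returns FALSE for NA input, so missing values stay NA.
keep_bracketed <- function(x) {
  x[!grepl("^<", x)] <- NA_character_
  x
}

cleaned <- keep_bracketed(pieces)

# Drop the NA entries entirely if only the emoji codes are wanted.
emoji_only <- cleaned[!is.na(cleaned)]
```

If the pieces live in several data frame columns, the same helper could be applied column by column, for example with df[cols] <- lapply(df[cols], keep_bracketed), where df and cols are assumed names for the data frame and its split columns.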