Saturday, 27 July 2019

How to remove strings that start with a letter using gsub in R?

I have collected tweets and would like to extract the emoji Unicode codes from each tweet. Each code is enclosed in angle brackets ("<" and ">"), and I have used gsub to remove all text before and after the emoji:

tweets$text <- gsub(".*(<.*>).*", "\\1", tweets$text)

However, because there may be several emojis per tweet, I have decided to split each column after the character ">".

In some columns, there are strings that consist only of letters and do not start with "<".
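For illustration, here is a minimal sketch of why a single greedy gsub call keeps only one code per tweet, along with one possible alternative that collects every bracketed code without splitting. The sample tweets and the <U+....> encoding are assumptions made for this example, and regmatches with gregexpr is a swapped-in technique, not the approach described in the question.

```r
# Hypothetical sample tweets; the <U+....> form is an assumed emoji encoding.
tweets <- data.frame(
  text = c("good morning <U+1F600> everyone <U+2764> bye", "no emoji here"),
  stringsAsFactors = FALSE
)

# The leading ".*" is greedy, so only the last bracketed token survives.
# A tweet with no "<...>" token is returned unchanged.
last_only <- gsub(".*(<.*>).*", "\\1", tweets$text)

# gregexpr() locates every "<...>" token; regmatches() extracts them.
# The result is a list with one character vector per tweet.
emoji <- regmatches(tweets$text, gregexpr("<[^>]+>", tweets$text))
print(emoji)
```

A tweet without any bracketed token yields an empty character vector in the list, so no letter-only strings are produced in the first place.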

My question is: How do I remove the string if it does not start with a "<"?
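One way this might be approached, sketched under the assumption that the split pieces sit in a character vector (the vector name and its values are hypothetical), is to test each string with the anchored pattern "^<" and either drop or blank the ones that fail:

```r
# Hypothetical pieces after splitting a tweet at ">"; the values are made up.
pieces <- c("<U+1F600>", "hello", "<U+2764>", "world", NA)

# grepl("^<", x) is TRUE only where the string begins with "<".
# Missing values (NA) yield FALSE.
starts_with_bracket <- grepl("^<", pieces)

# Option 1: drop the non-matching strings entirely.
kept <- pieces[starts_with_bracket]

# Option 2: keep every position (e.g. data frame cells) but blank the text.
# "^[^<].*" matches a whole string whose first character is not "<".
blanked <- gsub("^[^<].*", "", pieces)
```

Option 1 changes the length of the vector, so it suits a list of pieces. Option 2 preserves the length, which is usually what a data frame column needs.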
