links = ['http://ift.tt/2nodMfW', 'http://ift.tt/2my50gc', 'http://ift.tt/10xcuxv', 'https://www.twitter.com/NPR']
Objective: get only the links that contain the /yyyy/mm/dd/ddddddddd/ format, e.g. /2017/03/10/519650091/
For some reason I just cannot get it right: the result always includes the Facebook, Twitter, and 2017/03/20170311-format links.
import re

sel_links = []

def selectedLinks(links):
    r = re.compile("^(/[0-9]{4}/[0-9]{2}/[0-9]{2}/[0-9]{9})$")
    for link in links:
        if r.search(link) != "None":
            sel_links.append(link)
    return set(sel_links)

selectedLinks(links)
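A likely cause: r.search() returns either a match object or the value None, never the string "None", so the comparison != "None" is true for every link and nothing gets filtered out. Separately, the ^ and $ anchors require the whole string to be just the date path, so a full URL could never match even once the comparison is fixed. Below is a minimal sketch of one possible fix, assuming the goal is to keep any URL that contains the segment anywhere in it. The names DATE_ID_PATTERN, selected_links and sample_links, and the example.com URLs, are made up for illustration and are not from the question.

```python
import re

# Matches /yyyy/mm/dd/ followed by exactly nine digits and a closing slash,
# anywhere in the URL (no ^ or $ anchors).
DATE_ID_PATTERN = re.compile(r"/[0-9]{4}/[0-9]{2}/[0-9]{2}/[0-9]{9}/")


def selected_links(links):
    """Return the set of links containing a /yyyy/mm/dd/ddddddddd/ segment."""
    # search() returns None when there is no match, so test against None itself.
    return {link for link in links if DATE_ID_PATTERN.search(link) is not None}


# Hypothetical sample data: the first two URLs are invented for illustration,
# the last two are taken from the question.
sample_links = [
    "http://www.example.com/2017/03/10/519650091/some-story",
    "http://www.example.com/2017/03/20170311",
    "http://ift.tt/2nodMfW",
    "https://www.twitter.com/NPR",
]

print(selected_links(sample_links))
```

Building the set inside the function also avoids the module-level sel_links list, which would otherwise keep accumulating entries across repeated calls.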