Tuesday, July 11, 2017

Scraping links in Pattern library for Python

I found code similar to this in a course I was taking. It collects all of the links of a certain format that appear in the source code of a webpage. I understand everything except the last line, which says the following:

 print link.attrs.get('href', '')

This works, but I'm unsure how the instructor figured out how to do this. I've looked through the documentation and I can't figure out what .get does. Could someone please let me know how I can find this information?

Documentation for Pattern Library: http://ift.tt/1kyDb3D

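import requests
from pattern import web

# Download the page source
xml = requests.get("http://ift.tt/2uPZs4j").text
# Link format of interest (not used in this excerpt)
pattern = 'http://ift.tt/2ugsmxz'

# Parse the source and collect every <a> element
dom = web.Element(xml)
all_links = dom.by_tag('a')

for link in all_links:
    # Print each link's href attribute
    print link.attrs.get('href', '')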
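
One way to look at the line in question: if link.attrs is an ordinary Python dictionary (an assumption here, though it would explain why .get is not described in Pattern's own documentation), then .get is just the built-in dict.get method, which returns the value for a key or a fallback when the key is missing. The sketch below uses made-up dictionaries in place of link.attrs; the names attrs_with_href, attrs_without_href and href_of are hypothetical and not part of Pattern.

```python
# Hypothetical stand-ins for link.attrs, assuming it behaves like a plain dict.
attrs_with_href = {'href': 'http://example.com/page', 'class': 'external'}
attrs_without_href = {'class': 'anchor-only'}


def href_of(attrs):
    # dict.get(key, default) returns attrs[key] when the key exists,
    # and the default ('' here) instead of raising KeyError when it does not.
    return attrs.get('href', '')


print(href_of(attrs_with_href))
print(repr(href_of(attrs_without_href)))
```

To check the assumption on the real object, running type(link.attrs) in an interpreter shows which class it is, and help(dict.get) prints the built-in documentation for the method.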
