I found code similar to this in a course I was taking. It collects all of the links of a certain format that appear in a webpage's source. I understand everything except the last line, which says the following:
print link.attrs.get('href', '')
This works, but I'm unsure how the instructor figured out how to do this. I've looked through the documentation and I can't figure out what .get does. Could someone please let me know how I can find this information?
Documentation for Pattern Library: http://ift.tt/1kyDb3D
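import requests
from pattern import web  # assumed import; the original snippet did not show it

xml = requests.get("http://ift.tt/2uPZs4j").text
pattern = 'http://ift.tt/2ugsmxz'  # link format of interest (not used below)
dom = web.Element(xml)
all_links = dom.by_tag('a')  # every <a> element on the page
for link in all_links:
    print link.attrs.get('href', '')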
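
One likely reason .get is missing from the Pattern documentation is that it may not be a Pattern method at all. Assuming link.attrs behaves like an ordinary Python dict of attribute names and values (an assumption on my part, not something confirmed in the post), .get would be the built-in dict.get(key, default). The sketch below uses plain dicts as stand-ins for link.attrs; the href_of helper and the sample attribute values are made up for illustration.

```python
# Plain dicts standing in for link.attrs (assumption: attrs is dict-like).
attrs_with_href = {'href': 'http://example.com/page', 'class': 'external'}
attrs_without_href = {'name': 'top'}  # an <a> tag with no href attribute


def href_of(attrs):
    # dict.get(key, default) returns attrs[key] when the key exists,
    # and the default ('' here) when it does not, instead of raising KeyError.
    return attrs.get('href', '')


print(href_of(attrs_with_href))     # the stored URL
print(href_of(attrs_without_href))  # an empty line, since '' is the default
```

To check the assumption in the real code, running type(link.attrs) in the interpreter should show what kind of object it is, and help(link.attrs.get) should show where .get is documented.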