Hacker Times
babo on Feb 24, 2009 | on: How to scrape data from sites you can't log into
I do my scraping with BeautifulSoup from Python, usually from an IPython shell. With a urllib2 opener you can handle cookies and the User-Agent header pretty easily; the latter also matters for some sites.
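A rough sketch of that opener setup, in case it helps. It assumes Python 3, where urllib2 became urllib.request and cookielib became http.cookiejar; the make_opener helper, the User-Agent string, and the example.com URL are made up for illustration.

```python
import http.cookiejar
import urllib.request


def make_opener(user_agent):
    """Build an opener that stores cookies and sends a custom User-Agent.

    Hypothetical helper for illustration, not part of any library.
    """
    jar = http.cookiejar.CookieJar()
    # HTTPCookieProcessor saves cookies from responses into the jar and
    # sends them back on later requests made through the same opener.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Replace the default "Python-urllib/x.y" User-Agent header.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


# The User-Agent value here is an arbitrary placeholder.
opener, jar = make_opener("Mozilla/5.0 (compatible; example-scraper)")

# Fetching a page would look like this (placeholder URL, left commented out
# so the snippet runs without network access):
# html = opener.open("http://example.com/").read()
```

The bytes returned by opener.open(...).read() can then be handed to BeautifulSoup for parsing.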
geoscripting on Feb 24, 2009
There's also mechanize for Python. It can be found here: http://wwwsearch.sourceforge.net/mechanize/. It handles cookies by default, and is a pretty good tool.