
I do my scraping with BeautifulSoup in Python, usually from an IPython shell. With a urllib2 opener you can easily handle cookies and the User-Agent header; the latter also matters for some sites.
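For anyone who wants to try this, here is a minimal sketch of the opener setup. It assumes Python 3, where urllib2 was folded into urllib.request and cookielib became http.cookiejar. The function names and the User-Agent string are made up for illustration.

```python
import http.cookiejar
import urllib.request


def build_opener(user_agent):
    """Return (opener, cookie_jar): the opener keeps cookies and sends user_agent."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    # Replace the default "Python-urllib/x.y" header, which some sites reject.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, cookie_jar


def fetch(opener, url):
    """Download url through the opener and return the body as text."""
    with opener.open(url, timeout=10) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Cookies set by one response are stored in the jar and sent back on later requests through the same opener. The string returned by fetch() can then go straight into BeautifulSoup, e.g. BeautifulSoup(html, "html.parser") with the bs4 package.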


There's also mechanize for Python, which can be found here: http://wwwsearch.sourceforge.net/mechanize/ . It handles cookies by default and is a pretty good tool.
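A rough sketch of what that looks like, assuming the mechanize package is installed. The function name and the User-Agent value are placeholders, not something from the mechanize docs.

```python
def fetch_with_mechanize(url, user_agent="example-scraper/0.1"):
    """Fetch url with a mechanize Browser and return the raw response body."""
    # Imported here so this snippet still loads when mechanize is not installed.
    import mechanize

    browser = mechanize.Browser()
    # A Browser keeps its own cookie jar, so no extra cookie setup is needed.
    browser.addheaders = [("User-Agent", user_agent)]
    response = browser.open(url)
    return response.read()
```

Reusing one Browser for several requests keeps the session cookies between them. By default mechanize checks robots.txt before fetching a page.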



