Hacker News >100pts feed

February 11, 2012 / Mad Coding

Today I wrote a quick scraper using Scrapy to grab content from the Hacker News website for stories with more than 100 points. My main motivation is to make it easier to read the news when I don’t have Internet access on my phone. The old RSS feed I was using showed only the title and didn’t include the content of the page being discussed. So I pulled out Scrapy and Python and coded this up. I also made use of the readability-lxml package to strip unnecessary HTML.
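The filtering and feed-building steps can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the actual source: the dictionary keys (`title`, `link`, `score_text`, `content_html`) and the function names are hypothetical, the score is assumed to arrive as text like "123 points", and the cleaned-up HTML is assumed to have been produced already (for example by readability-lxml). The feed is deliberately minimal and leaves out the `updated` element that a fully valid Atom feed requires.

```python
import re
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
POINTS_RE = re.compile(r"(\d+)\s+points?")


def parse_points(score_text):
    """Return the integer score from text like '123 points', or 0 if none is found."""
    match = POINTS_RE.search(score_text)
    return int(match.group(1)) if match else 0


def popular_stories(stories, threshold=100):
    """Keep only the stories whose score is strictly above the threshold."""
    return [s for s in stories if parse_points(s["score_text"]) > threshold]


def build_atom_feed(stories, feed_title="Hacker News >100pts"):
    """Serialize stories into a minimal Atom document and return it as a string."""
    # Use Atom as the default namespace so tags are written without a prefix.
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element("{%s}feed" % ATOM_NS)
    ET.SubElement(feed, "{%s}title" % ATOM_NS).text = feed_title
    for story in stories:
        entry = ET.SubElement(feed, "{%s}entry" % ATOM_NS)
        ET.SubElement(entry, "{%s}title" % ATOM_NS).text = story["title"]
        ET.SubElement(entry, "{%s}id" % ATOM_NS).text = story["link"]
        ET.SubElement(entry, "{%s}link" % ATOM_NS, href=story["link"])
        # The page body goes in as escaped HTML so it can be read offline.
        content = ET.SubElement(entry, "{%s}content" % ATOM_NS, type="html")
        content.text = story["content_html"]
    return ET.tostring(feed, encoding="unicode")
```

Usage would look something like `build_atom_feed(popular_stories(scraped_stories))`, where `scraped_stories` is whatever list of dictionaries the spider yields.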

You can access the feed via http://feeds.dannysu.com/hackernews100.atom

I also want to give a shout-out to the people at Mozilla for the new developer tools. Just this week I discovered the new way to inspect HTML elements by hovering over them or clicking around the page. In Chrome, by contrast, I had to go through the HTML code to match it up with what I was seeing visually. Having the new developer tools made scraping and verifying things much easier.

UPDATE: Source code added to my GitHub.