If you ever need to extract data from a web page or an HTML or XML file quickly and easily, Beautiful Soup, a Python library designed to help with things like web scraping, is here for you. Beautiful Soup sits on top of popular Python parsers such as lxml and html5lib, and it’s great for web scraping because it automatically converts incoming documents to Unicode and outgoing documents to UTF-8.
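As a rough illustration of the parsing and encoding behavior described above, here is a minimal sketch. It assumes the library is installed as the `beautifulsoup4` package and uses Python's built-in `html.parser` so that no extra parser is needed. The sample HTML and the variable names are made up for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, supplied as UTF-8 encoded bytes
# (the kind of raw input you might get from a file or HTTP response).
html_bytes = """
<html><body>
  <h1>Café menu</h1>
  <a href="/espresso">Espresso</a>
  <a href="/latte">Latte</a>
</body></html>
""".encode("utf-8")

# Parse the bytes; Beautiful Soup decodes them to Unicode internally.
# "html.parser" is the standard-library parser; "lxml" or "html5lib"
# could be passed instead if those packages are installed.
soup = BeautifulSoup(html_bytes, "html.parser", from_encoding="utf-8")

# Text pulled out of the tree is an ordinary Python str (Unicode).
heading = soup.h1.get_text()

# Collect the href attribute of every <a> tag.
links = [a["href"] for a in soup.find_all("a")]

# Serializing the tree back to bytes uses UTF-8 by default.
encoded = soup.encode()

print(heading)
print(links)
```

Passing `from_encoding` is optional here; when it is left out, Beautiful Soup tries to detect the encoding of the incoming bytes on its own.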
Beautiful Soup has an easy command-line install and has been used in several cool projects since its inception. It may or may not have gotten its name from a song in Lewis Carroll’s famous Alice’s Adventures in Wonderland. If you’ve got some web scraping or data extraction to do, do yourself a favor and install it.