Introducing the liveblog-indexer gem

require 'liveblog-indexer'

# class name assumed from the gem name
lbi = LiveBlogIndexer.new(filepath: '/tmp',
                          word_index: 'words_indexed.json', url_index: 'indexed.json')
lbi.crawl ''

This gem can index a single liveblog page, and it can also crawl over all liveblog pages. It does this by reading each page's formatted.xml, which contains a link to the previous page. Within a page, it scans each section for words to index, ignoring the text inside specific HTML elements, including pre, code, and time.
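
The crawl described above can be sketched roughly as follows. This is not the gem's own code: the page markup (a `prev` element with an `href` attribute, `section` elements), the in-memory `PAGES` hash standing in for fetched formatted.xml files, and the helper names `visible_text` and `crawl` are all assumptions made for illustration.

```ruby
require 'rexml/document'

IGNORED = %w[pre code time].freeze # elements whose text is not indexed

# Hypothetical stand-in for fetching each page's formatted.xml.
PAGES = {
  'day2.xml' => '<page><prev href="day1.xml"/>' \
                '<section>Testing the <code>crawl</code> method</section></page>',
  'day1.xml' => '<page><section>Testing Ruby <time>10:15</time>' \
                '<pre>puts 1</pre></section></page>'
}.freeze

# Collect the text of an element, skipping ignored descendant elements.
def visible_text(element)
  element.children.map do |child|
    case child
    when REXML::Text then child.value
    when REXML::Element
      IGNORED.include?(child.name) ? ' ' : visible_text(child)
    else ''
    end
  end.join(' ')
end

# Follow previous-page links from start_url, building word => [urls].
def crawl(start_url, pages = PAGES)
  index = Hash.new { |hash, key| hash[key] = [] }
  seen = {}
  url = start_url
  while url && pages.key?(url) && !seen[url]
    seen[url] = true # guard against link cycles
    doc = REXML::Document.new(pages[url])
    REXML::XPath.each(doc, '//section') do |section|
      visible_text(section).downcase.scan(/[a-z0-9']+/).uniq.each do |word|
        index[word] << url unless index[word].include?(url)
      end
    end
    prev = REXML::XPath.first(doc, '//prev')
    url = prev && prev.attributes['href']
  end
  index
end

index = crawl('day2.xml')
```

In this sketch the word "testing" ends up mapped to both pages, while "crawl", which only appears inside a code element, is never indexed.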

Note: Two files are saved in JSON format: words_indexed.json and urls_indexed.json. The file path and the filenames can be set in initialize().
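
As a rough illustration of how two such JSON files could be written and read back, here is a minimal sketch. The `save_indexes` helper and the shape of the two hashes are hypothetical; the structure the gem actually stores in each file is not described here.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical helper: write both indexes as JSON files under filepath.
def save_indexes(words, urls, filepath:, word_index: 'words_indexed.json',
                 url_index: 'urls_indexed.json')
  File.write(File.join(filepath, word_index), JSON.pretty_generate(words))
  File.write(File.join(filepath, url_index), JSON.pretty_generate(urls))
end

dir = Dir.mktmpdir # temporary directory, used here instead of '/tmp'
save_indexes({ 'ruby' => ['day1.xml'] }, { 'day1.xml' => ['ruby'] }, filepath: dir)

# Reading a file back yields the same structure that was saved.
reloaded = JSON.parse(File.read(File.join(dir, 'words_indexed.json')))
```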