Yunqa • The Delphi Inspiration

Delphi Components and Applications


DIGoogleReader

DIGoogleReader is an advanced plugin for DIHtmlParser that illustrates how Google web search result pages can be parsed.

Overview

DIGoogleReader contains the TDIGoogleReader component class, which parses Google web search result pages and extracts the individual results. It fires the OnResult event for each result. The application can then access the detailed properties of that result.

DIGoogleReader is fully Unicode-enabled and returns results in all languages.
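
The one-event-per-result pattern described above can be sketched in plain Delphi. This is only an illustration of the pattern, not the actual TDIGoogleReader interface: the class TSketchReader, the event type TSketchResultEvent, its Title and Url parameters, and the "title|url" input format are all hypothetical stand-ins invented for this example, and no HTML parsing takes place.

```pascal
program OnResultSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical event signature. The real OnResult event of
  // TDIGoogleReader may take different parameters.
  TSketchResultEvent = procedure(Sender: TObject;
    const Title, Url: string) of object;

  // Hypothetical stand-in for a reader that reports one result at a time.
  TSketchReader = class
  private
    FOnResult: TSketchResultEvent;
  public
    procedure ParseLines(Lines: TStrings);
    property OnResult: TSketchResultEvent read FOnResult write FOnResult;
  end;

  // Application-side object that receives each result.
  TCollector = class
  public
    Count: Integer;
    procedure HandleResult(Sender: TObject; const Title, Url: string);
  end;

procedure TSketchReader.ParseLines(Lines: TStrings);
var
  i, p: Integer;
begin
  // Assumed input: one 'title|url' pair per line instead of real HTML.
  for i := 0 to Lines.Count - 1 do
  begin
    p := Pos('|', Lines[i]);
    // Fire the event once per recognized result, if a handler is assigned.
    if (p > 0) and Assigned(FOnResult) then
      FOnResult(Self, Copy(Lines[i], 1, p - 1),
        Copy(Lines[i], p + 1, MaxInt));
  end;
end;

procedure TCollector.HandleResult(Sender: TObject; const Title, Url: string);
begin
  Inc(Count);
  WriteLn(Count, ': ', Title, ' -> ', Url);
end;

var
  Reader: TSketchReader;
  Collector: TCollector;
  Input: TStringList;
begin
  Reader := TSketchReader.Create;
  Collector := TCollector.Create;
  Input := TStringList.Create;
  try
    Input.Add('First title|http://example.com/1');
    Input.Add('Second title|http://example.com/2');
    Reader.OnResult := Collector.HandleResult;
    Reader.ParseLines(Input);
  finally
    Input.Free;
    Collector.Free;
    Reader.Free;
  end;
end.
```

Reporting each result through an event lets the application decide what to do with it (display, store, filter) without the reader having to build and return a complete list first. For the real event signature and result properties, consult the DIGoogleReader sources and demo project.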

What DIGoogleReader is not

DIGoogleReader was initially intended, and still is, as a learning example of how to write advanced plugins for DIHtmlParser. Very soon, however, people found it extremely useful for analyzing Google searches. Unfortunately, Google does not like this a great deal, so I encourage you to read Google's license terms before you put DIGoogleReader to practical use, especially commercial use.

At the time of writing, DIGoogleReader was tested successfully against many Google web search result pages, as the example result pages located right next to the demo project demonstrate. However, there is no guarantee that the parsing algorithm works with all Google result pages, especially since Google may change its page layout at any time without notice.

If Google does one day introduce a new page layout that breaks the existing DIGoogleReader algorithm, I reserve the right not to adjust DIGoogleReader to those changes right away, or perhaps not at all. Remember: DIGoogleReader is first and foremost a demonstration of how to solve complex tasks easily with DIHtmlParser.

Example Project

The screenshot shows the compiled demo project running. It reads, extracts, and displays search results from a Google search result page previously saved to disk.

The demo's source code is included, as well as a precompiled binary.

Requirements

DIGoogleReader includes full sources for the plugin and the demo application. Compiling them requires DIHtmlParser, which handles the low-level HTML reading and parsing.

DIHtmlParser is available as a separate package on this site, so make sure to download it before you recompile the demo application or write your own application.

products/googlereader/index.txt · Last modified: 2022/02/04 16:59 by 127.0.0.1