We recently had a client, a multi-national retailer with both a physical and online presence, who wanted a way to gather specific business intelligence (BI) data from the Web on a daily basis. After several unsuccessful attempts to build this capability themselves, they came to us for a solution.
On the surface the requirements looked difficult, and it was easy to see why their own IT staff had failed to find a solution. They had been thinking "inside the box", however, and had not considered third-party services. The specifications required that the application perform all of these tasks:
Retrieve new product listings from competitors' web sites.
Retrieve current pricing for all products listed on competitors' web sites.
Retrieve competitors' press releases and public financial reports as indexed by Google.
Track all inbound links pointing to competitors' web sites from other web sites.
Once the data was acquired it needed to be processed for reporting purposes and then stored in the data warehouse for future access.
After reviewing current web-based data acquisition technology, including "spiders" that crawl the Web and return data which then has to be processed through HTML filters, we determined that the Google API and Web Services offered the best solution.
The Google API provides remote access to all of the search engine's exposed functionality and offers a communication layer accessed through the Simple Object Access Protocol (SOAP), a web services standard. Because SOAP is an XML-based technology, it is easily integrated into legacy web-enabled applications.
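To make the mechanics concrete, here is a minimal sketch of what one of these SOAP calls looks like. It assumes a generic WSDL-aware Python client (zeep is used purely for illustration), the service's published GoogleSearch.wsdl contract, and a placeholder license key and query; the parameter names follow that contract.

```python
# Minimal sketch: one SOAP search request against the Google API.
# Assumes the zeep library and the published GoogleSearch.wsdl contract;
# the WSDL location, license key and query below are placeholders.
from zeep import Client

WSDL_URL = "http://api.google.com/GoogleSearch.wsdl"
LICENSE_KEY = "YOUR-GOOGLE-API-KEY"

client = Client(WSDL_URL)

# doGoogleSearch is the search operation exposed by the WSDL; the arguments
# mirror the parameters defined in the service contract.
result = client.service.doGoogleSearch(
    key=LICENSE_KEY,
    q="site:competitor.example new products",
    start=0,
    maxResults=10,
    filter=True,
    restrict="",
    safeSearch=False,
    lr="",
    ie="latin1",
    oe="latin1",
)

# Each hit comes back as structured data (URL, title, snippet) rather than
# as raw HTML that would have to be scraped and filtered.
for hit in result.resultElements:
    print(hit.URL, hit.title)
```

Because the request and response are simply XML messages, the same call can be issued from any language with a SOAP library, which is what made integration with the client's web-enabled legacy code straightforward.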
The API met all of the application's requirements in that it:
Provided a methodology for querying the Web using non-HTML interfaces.
Enabled us to schedule regular search requests designed to harvest new and updated information on the target topics.
Presented data in a format that could be easily integrated with the client's legacy systems.
Using the Google API, SOAP and WSDL, our developers were able to define messages that fetched cached pages, searched the Google document index and retrieved the responses without having to filter out HTML or reformat the data. The resulting data was then handed off to the client's legacy systems for validation, reporting and further processing before reaching the data warehouse.
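Continuing the sketch above, the two message types the application relied on, an index search followed by retrieval of Google's cached copy of a page, looked roughly like this. doGetCachedPage is the cached-page operation named in the service contract; how the cached content comes back is an assumption here, and the downstream hand-off is only indicated by a comment.

```python
# Sketch of the two message types used by the application, reusing the
# client and LICENSE_KEY from the previous example.

def search(query, start=0, max_results=10):
    """Run one doGoogleSearch request and return its structured result set."""
    return client.service.doGoogleSearch(
        key=LICENSE_KEY, q=query, start=start, maxResults=max_results,
        filter=True, restrict="", safeSearch=False,
        lr="", ie="latin1", oe="latin1",
    )

def cached_copy(url):
    """Fetch Google's cached version of a page (assumed to come back as bytes)."""
    return client.service.doGetCachedPage(key=LICENSE_KEY, url=url)

# Harvest product pages and their cached counterparts, then hand the
# structured records to the client's legacy validation and reporting layer.
for hit in search("site:competitor.example product").resultElements:
    record = {
        "url": hit.URL,
        "title": hit.title,
        "snippet": hit.snippet,
        "cached_page": cached_copy(hit.URL),
    }
    # legacy_pipeline.submit(record)   # validation, reporting, warehousing
    print(record["url"], record["title"])
```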
During the Proof of Concept phase we ran tests in which we were able to reliably identify and retrieve updated public relations and investor relations information, with results that exceeded the client's expectations.
In our next test we retrieved the most recent product pages listed in Google and then ran a second query to retrieve the Google "cached page" versions. We ran these two data sets through difference filters and were able to produce accurate price increase and decrease reports as well as detect new products.
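The difference filtering itself is ordinary set and dictionary work once prices have been extracted from the two data sets. The sketch below assumes that extraction has already happened and that each data set maps a product identifier to a price; the identifiers and prices shown are invented for illustration.

```python
# Illustrative difference filter: given prices parsed from the live product
# pages and from Google's cached copies, report new products and price moves.
# How the {product_id: price} mappings are parsed out of the pages is left
# to the client's own extraction rules.

def diff_prices(current, previous):
    new_products = sorted(set(current) - set(previous))
    increases = {p: (previous[p], current[p])
                 for p in current if p in previous and current[p] > previous[p]}
    decreases = {p: (previous[p], current[p])
                 for p in current if p in previous and current[p] < previous[p]}
    return new_products, increases, decreases

# Example with placeholder data
live   = {"SKU-1001": 19.99, "SKU-1002": 24.99, "SKU-2000": 9.99}
cached = {"SKU-1001": 21.99, "SKU-1002": 24.99}

new_items, ups, downs = diff_prices(live, cached)
print("new:", new_items)        # ['SKU-2000']
print("increases:", ups)        # {}
print("decreases:", downs)      # {'SKU-1001': (21.99, 19.99)}
```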
For our final test we used the Google API's support for the "link:" operator to quickly build lists of inbound links.
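A sketch of that last step, reusing the client and key from the earlier examples; the "link:" query is just an ordinary doGoogleSearch request, paged through in blocks of ten results.

```python
# Sketch: build an inbound-link list with the "link:" operator, paging
# through the results in blocks of ten per request.
def inbound_links(domain, max_pages=5):
    links = []
    for start in range(0, max_pages * 10, 10):
        result = client.service.doGoogleSearch(
            key=LICENSE_KEY, q=f"link:{domain}", start=start, maxResults=10,
            filter=True, restrict="", safeSearch=False,
            lr="", ie="latin1", oe="latin1",
        )
        links.extend(hit.URL for hit in result.resultElements)
        if len(result.resultElements) < 10:
            break
    return links

print(inbound_links("www.competitor.example"))
```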
These limited tests demonstrated that the Google API was capable of producing the BI data the client asked for, and that the data could be returned in a pre-defined format, eliminating the need for post-retrieval filters.
The client was pleased with the results of our Proof of Concept phase and authorized us to proceed with building the solution. The application is now in daily use and is exceeding the client's performance expectations by a wide margin.