Create Your Own Price Comparison Web Application

Summary: RumboJ is a price comparison web application that compares product prices across shopping websites. It currently supports only selected product categories (mobile phones and watches) and compares data only between Amazon.com and Flipkart.com.

Architecture:

Description: RumboJ harvests web pages with web crawlers. Loaded documents are parsed, reduced, and indexed. Index data is held in RAM. At application start, RumboJ loads the index data from a backup file into RAM so that subsequent searches are faster. An administrator controls the crawling process through REST services and can start crawlers, stop crawlers, and check crawling status, all from the GUI. As crawlers add data to the index, a background job runs periodically and writes the index data held in RAM to a backup directory. There is a separate crawler for each shopping website. Identical products from multiple websites are merged, and each website's price is added to the product details.
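The load-at-start and periodic-backup cycle described above can be sketched roughly as follows. This is a minimal illustration and not RumboJ's actual code: the class name InMemoryPriceIndex, its methods, and the tab-separated backup format are all assumptions made for the example, and a plain map stands in for the Lucene index. It assumes product keys and site names contain no tab characters.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical in-memory store: product key -> (site -> price). */
class InMemoryPriceIndex {
    private final Map<String, Map<String, Double>> products = new ConcurrentHashMap<>();
    private final AtomicBoolean dirty = new AtomicBoolean(false);

    /** Records one site's price; offers with the same product key are merged. */
    void addOffer(String productKey, String site, double price) {
        products.computeIfAbsent(productKey, k -> new ConcurrentHashMap<>()).put(site, price);
        dirty.set(true);
    }

    /** Returns a read-only view of the per-site prices for a product. */
    Map<String, Double> pricesFor(String productKey) {
        Map<String, Double> prices = products.get(productKey);
        return prices == null
                ? Collections.<String, Double>emptyMap()
                : Collections.unmodifiableMap(prices);
    }

    /** Writes the index to disk only if it changed since the last backup. */
    boolean backupIfDirty(Path file) throws IOException {
        if (!dirty.compareAndSet(true, false)) {
            return false; // nothing new since the last backup
        }
        try {
            List<String> lines = new ArrayList<>();
            for (Map.Entry<String, Map<String, Double>> product : products.entrySet()) {
                for (Map.Entry<String, Double> offer : product.getValue().entrySet()) {
                    lines.add(product.getKey() + "\t" + offer.getKey() + "\t" + offer.getValue());
                }
            }
            // Write to a temporary file first so a failed write
            // does not clobber the previous backup.
            Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
            Files.write(tmp, lines, StandardCharsets.UTF_8);
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
            return true;
        } catch (IOException e) {
            dirty.set(true); // retry on the next run
            throw e;
        }
    }

    /** Rebuilds the in-memory index from a backup file at application start. */
    static InMemoryPriceIndex load(Path file) throws IOException {
        InMemoryPriceIndex index = new InMemoryPriceIndex();
        if (Files.exists(file)) {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                String[] parts = line.split("\t");
                if (parts.length == 3) {
                    index.addOffer(parts[0], parts[1], Double.parseDouble(parts[2]));
                }
            }
        }
        index.dirty.set(false); // freshly loaded data already matches the file
        return index;
    }

    /** Starts the periodic background job; the caller shuts the executor down on exit. */
    ScheduledExecutorService startPeriodicBackup(Path file, long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                backupIfDirty(file);
            } catch (IOException e) {
                // Swallow the error so one failed write does not cancel future runs.
                System.err.println("Backup failed: " + e.getMessage());
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

The dirty flag keeps the background job from rewriting an unchanged index. In a Lucene-based setup the same idea would apply to whichever in-memory directory holds the index, but the details would differ from this sketch.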

Components: RumboJ consists of several components that handle indexing, searching, updating the per-website prices of each product, HTML parsing, and other supporting operations. The following list shows the components used.

Description:
·     Java 8, Java EE 7
·     Apache Lucene
·     PhantomJS
·     jQuery
·     Bootstrap
·     Spring MVC, Bean, Security
·     Apache Tomcat


Achievements:
·     Each user request is served in approximately 1.5 s
·     The data scraper simulates human behaviour (scrolling up and down, staying on the page for some time) while loading shopping web pages, to avoid getting blocked
·     Fresh Tor IP addresses were used at regular intervals during data extraction, to avoid IP address blocks
·     The HTTP header information sent by the data scraper was carefully edited to mimic a real web browser
·     Each product's data is stored in just 10 KB of RAM
·     The same product from different shopping websites is matched using the product title and product description
·     A spell checker suggests product keywords if the entered string is misspelled
·     Chat feature for queries on products (yet to be implemented)
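As an illustration of the title-based product matching listed above, here is a minimal sketch that compares two listing titles by the Jaccard similarity of their word sets. The post does not say which similarity measure RumboJ uses, so the technique, the class name TitleMatcher, and the threshold passed by the caller are all assumptions for the example.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: decides whether two listing titles describe the same product. */
class TitleMatcher {

    /** Lowercases the text and splits it into a set of alphanumeric tokens. */
    static Set<String> tokens(String text) {
        Set<String> result = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    /** Jaccard similarity: shared tokens divided by all distinct tokens. */
    static double jaccard(String a, String b) {
        Set<String> tokensA = tokens(a);
        Set<String> tokensB = tokens(b);
        if (tokensA.isEmpty() && tokensB.isEmpty()) {
            return 0.0; // nothing to compare
        }
        Set<String> intersection = new HashSet<>(tokensA);
        intersection.retainAll(tokensB);
        Set<String> union = new HashSet<>(tokensA);
        union.addAll(tokensB);
        return (double) intersection.size() / union.size();
    }

    /** Treats two titles as the same product when similarity reaches the threshold. */
    static boolean sameProduct(String a, String b, double threshold) {
        return jaccard(a, b) >= threshold;
    }
}
```

For example, TitleMatcher.sameProduct("Apple iPhone 6 (Gold, 16GB)", "Apple iPhone 6 16GB Gold", 0.6) holds because both titles reduce to the same five tokens, while a title for the 64GB Plus model shares only three of eight distinct tokens and falls below that threshold. A real matcher would likely also weigh the product description and model numbers.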

Visit the project on GitHub: https://github.com/RagulkumarRaj/RumboJ
