Friday, September 28, 2012
How redBus uses BigQuery to master big data
By Pradeep Kumar, redBus
This guest post was written by Pradeep Kumar. Pradeep is a technical architect at redBus, an online travel agency in India that provides a unified online bus ticketing service. We recently published a business case study for redBus and wanted to dive into some more technical detail for the readers of the Google Developers Blog.
Our company has been providing Internet bus ticketing for India since 2006. There are more than 10,000 bus routes available for booking, and we have dozens of machines processing booking requests. Each step in the booking process produces a lot of data – on search terms, route availability, server health and more. We needed tools to to be able to process this data quickly and easily to determine whether decreases in customer bookings are the result of server problems or simply less demand.
While we typically use relational databases to store and analyze data, we knew we needed something more powerful if we wanted to analyze 500GB or more, so we started to look at open source frameworks like Hadoop and analysis platforms like Hive and Pig. We found that these frameworks require considerable in-house expertise and infrastructure investments and wouldn’t give us answers to our questions as fast as we wanted. We decided to try out Google BigQuery as a trusted tester, with hopes that it would give us the ability to perform quick iterative analysis without much up-front investment. Our initial tests went very well, so we started building our analysis tools on top of BigQuery.
BigQuery allows us to run SQL-like queries to understand the bus routes in highest demand and what types of searches users are performing. We’ve also used it to build internal dashboards that give us a snapshot of system health.
For more information on how we structured our immutable tables, pipelined our data into BigQuery for analysis using RabitMQ, and to see example SQL queries we’ve used, check out my article on developers.google.com.
Pradeep Kumar is a technical architect at redBus.
Posted by Scott Knaster, Editor
Labels:
bigquery,
guest post
Location:
Mountain View, CA, USA
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment