by Andrew Gasparovic
Sabre processes more than 12 billion shopping requests and serves over 1 billion travelers every year. Through our next generation of AI-powered solutions, we are focused on optimizing the retailing, distribution, and fulfillment experience of the travel industry. In my role as chief architect for Sabre Labs, I spearhead the long-term technology choices we are making to enhance the development, deployment, and operation of our software. We’ve embarked on a multi-year strategy to transform our technology and solutions. Rearchitecting our infrastructure to become fully cloud native is a central tenet of this transformation. In these early iterations of our plan, we have seen the significant impact of machine learning and its potential in bringing truly personalized travel experiences to customers. As part of this effort, we’ve established a 10-year partnership with Google to help accelerate our transformation and bring innovation to the travel industry.
Database choices require tradeoffs
The complexity and scale of the travel industry place high demands on the cloud services we use. We tend to place more emphasis on how a particular cloud service will affect an application’s reliability, performance, or development time than on its functionality alone. When it comes to databases, this often means making a tradeoff between latency and consistency.
The tradeoff exists for any database that serves multiple copies of its data in different availability zones or geographic regions for reliability. A database designed to ensure that everyone sees a consistent view of the latest data might update those copies synchronously using a consensus algorithm, which affects how quickly the data can be served. On the other hand, a database that’s optimized for faster data serving might update each copy asynchronously and not guarantee consistent reads across records.
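To make the tradeoff concrete, here is a toy model in Python. It is only a sketch: it uses simple majority-quorum replication as a stand-in for a real consensus protocol, and the zone names, round-trip times, and seat values are made up for illustration. It does not reflect how any particular database is implemented.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    zone: str
    rtt_ms: float            # assumed round trip from the writer to this zone
    version: int = 0
    value: str = "seat 12A"


def majority(replicas):
    return len(replicas) // 2 + 1


def write_sync(replicas, version, value):
    """Acknowledge only after a majority of replicas has stored the write."""
    chosen = sorted(replicas, key=lambda r: r.rtt_ms)[:majority(replicas)]
    for r in chosen:
        r.version, r.value = version, value
    return max(r.rtt_ms for r in chosen)   # wait for the slowest of the majority


def write_async(replicas, version, value):
    """Acknowledge after the nearest replica stores the write; the rest lag."""
    nearest = min(replicas, key=lambda r: r.rtt_ms)
    nearest.version, nearest.value = version, value
    return nearest.rtt_ms


def read_majority(replicas_contacted):
    """Consistent read: ask a majority and keep the newest version seen."""
    return max(replicas_contacted, key=lambda r: r.version).value


def make_replicas():
    return [Replica("zone-a", 1.0), Replica("zone-b", 8.0), Replica("zone-c", 30.0)]


# Synchronous path: slower acknowledgement, but any majority sees the write.
sync_replicas = make_replicas()
sync_latency = write_sync(sync_replicas, 1, "seat 14C")
sync_read = read_majority(sync_replicas[1:])    # zone-b and zone-c

# Asynchronous path: fast acknowledgement, but a far replica can serve old data.
async_replicas = make_replicas()
async_latency = write_async(async_replicas, 1, "seat 14C")
async_read = async_replicas[2].value            # single read from zone-c

print(sync_latency, sync_read)
print(async_latency, async_read)
```

In this model the synchronous write is acknowledged after 8 ms instead of 1 ms, but a majority read always returns the new seat. The asynchronous write is acknowledged sooner, yet a read served from the farthest zone still returns the old seat until replication catches up.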
Cloud Spanner and Bigtable – two of Google Cloud’s managed databases – are both highly effective services, and each one could support many of our travel applications. But as you will see, the latency vs. consistency tradeoff made it clear which one was best suited for two of our most critical cases.
Google Cloud Spanner facilitates strong, global consistency for Sabre
An airline’s reservation database stores a passenger’s booking information, seat selection, tickets, special requests, and other critical information about their trip. As a result, this data sits at the consistency end of that consistency/latency spectrum. Sabre typically processes thousands of reservation updates per second on behalf of our carrier customers. An airline’s reservation database must be replicated across, and served from, many availability zones so that it remains available in the event of an outage. It also requires ACID properties for transactional updates across records, since airlines often make changes to multiple passengers and multiple flights at the same time.
We needed a system that could handle bursts of concurrent updates, as would occur during a snowstorm when hundreds of thousands of passengers might be automatically moved to alternate flights. Spanner is a great fit for the reservations case because of its unique consistency guarantees. The service processes over 1 billion requests per second at peak and offers a five-nines (99.999 percent) availability SLA to support our applications. Spanner also helps us maintain compliance, business continuity, redundancy, and reliability using the same secure-by-design infrastructure, built-in data protection and replication, and multi-layered security that are essential to our Google Cloud workloads.
Spanner’s client libraries also provide built-in mechanisms to handle retrying in the event of write conflicts with another transaction and allow developers to choose stale reads in read-only transactions for improved performance. Of course, consistency across multiple zones or regions doesn’t come for free. It means higher write latencies than if we wrote to a comparable database running in a single availability zone, but for an application that manages flight reservations, it’s a tradeoff that makes sense.
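The retry behavior described above can be pictured with a small Python sketch. It does not use Spanner’s client library: `TransactionAborted`, `run_with_retry`, the flight numbers, and the seat counts are all hypothetical names and values chosen for illustration, and the write conflict is simulated.

```python
class TransactionAborted(Exception):
    """Hypothetical signal that a conflicting transaction won and ours must retry."""


def run_with_retry(work, max_attempts=5):
    """Call work(attempt) until it commits or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return work(attempt)
        except TransactionAborted:
            if attempt == max_attempts:
                raise
            # A real client would typically also back off before retrying.


seats = {"FL100": 2, "FL200": 5}   # made-up flights and free-seat counts


def rebook(attempt):
    """Move one passenger from FL100 to FL200 as a single unit of work."""
    if attempt < 3:
        # Simulate losing a write conflict on the first two attempts.
        raise TransactionAborted()
    updated = dict(seats)
    updated["FL100"] += 1          # the passenger's old seat is freed
    updated["FL200"] -= 1          # a seat on the new flight is taken
    seats.update(updated)          # both changes are applied together
    return attempt


attempts_needed = run_with_retry(rebook)
print(attempts_needed, seats)
```

The point of the pattern is that the application code expresses the rebooking once, as a function that is safe to rerun, and the retry loop absorbs conflicts with other transactions.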
Bigtable provides predictable, low latency at scale
Our flight shopping systems sit at the other end of the latency/consistency spectrum. Sabre’s shopping engine generates millions of itineraries per second on behalf of travelers using mobile apps, third-party travel websites, and airline call centers. Each itinerary requires significant compute resources to calculate: We need to find which combinations of flights make sense and evaluate complex rules about their availability and pricing. Users are typically less patient while searching for a flight than they are when booking it, so we need to return a result within a second or two. But we can cache many of these shopping results to reduce our compute usage. For instance, we can decide how long to cache a result based on factors like how far away the flight’s departure is.
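A departure-based cache lifetime could look something like the following Python sketch. The tiers and durations are invented for illustration, and the direction of the rule (shorter lifetimes for flights that leave sooner) is an assumption rather than a description of our production policy; `cache_ttl` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def cache_ttl(departure, now):
    """Pick a cache lifetime from how far away the departure is (illustrative tiers)."""
    days_out = (departure - now) / timedelta(days=1)
    if days_out <= 0:
        return timedelta(0)             # already departed: do not cache
    if days_out < 3:
        return timedelta(minutes=5)     # assumed: near-term results go stale quickly
    if days_out < 30:
        return timedelta(hours=1)
    return timedelta(hours=12)          # assumed: far-out results change slowly


now = datetime(2024, 1, 1, 12, 0)
ttl_tomorrow = cache_ttl(now + timedelta(days=1), now)
ttl_next_week = cache_ttl(now + timedelta(days=10), now)
ttl_next_season = cache_ttl(now + timedelta(days=90), now)
print(ttl_tomorrow, ttl_next_week, ttl_next_season)
```

A real policy would likely weigh more factors than the departure date, but even a simple tiered rule shows how cache lifetime becomes a knob for trading freshness against compute usage.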
Bigtable is an excellent choice for this shopping cache. It’s a NoSQL database service built to handle high-throughput, low-latency applications, with more than 10 exabytes of data under management. Bigtable’s unique latency properties (such as predictability and single-digit millisecond response time even for multi-petabyte tables) enable us to serve large volumes of shopping results cost-effectively, while providing low response times to travelers.
Google Cloud supports a focus on innovation
Managed databases like Bigtable and Spanner are a significant part of Sabre’s cloud strategy. Unique tools like Key Visualizer for Bigtable and Spanner, combined with integration with other Google Cloud services like Cloud IAM, Cloud Monitoring, and now Datastream, make operating managed databases on Google Cloud much easier than operating several of their self-hosted counterparts. Because of their granular pricing models and the ease with which SREs can deploy and automate them, the managed databases we use also end up with a lower total cost of ownership.
We’re particularly excited about a few recent database-related announcements from Google Cloud. Bigtable’s SLA update gives us more concrete expectations in terms of multi-cluster, multi-region uptime. Spanner’s change to provisioning in Processing Units increases its cost efficiency when deploying in non-production environments where we may need many isolated instances, but won’t come close to the limits of a single node. In those cases, Spanner instances may now be configured in one-tenth of a node increments.
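The effect on non-production environments is easy to see with a little arithmetic. The Python sketch below assumes a fleet of 20 isolated test instances, a number made up for illustration, and uses the fact that one node corresponds to 1,000 processing units; `node_equivalents` is a hypothetical helper.

```python
PROCESSING_UNITS_PER_NODE = 1000   # 1 node corresponds to 1,000 processing units


def node_equivalents(instance_count, units_per_instance):
    """Total capacity of a fleet of instances, expressed in nodes."""
    return instance_count * units_per_instance / PROCESSING_UNITS_PER_NODE


whole_nodes = node_equivalents(20, 1000)   # every instance sized at one full node
tenth_nodes = node_equivalents(20, 100)    # every instance sized at one-tenth of a node
print(whole_nodes, tenth_nodes)
```

Under these assumptions, the same 20 isolated instances need the capacity of 2 nodes rather than 20.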
Our cloud transformation depends on having a choice of databases for different use cases, tradeoffs, and migration schedules. In addition to managed databases, we expect to use self-hosted databases, database solutions available in Cloud Marketplace, and transitional services like Cloud SQL for a few more years. In an industry as demanding as travel, accelerating our most critical applications using technology unique to Google Cloud means less time spent optimizing latency and consistency, and more time spent innovating.