Part 4 of 5
By: Dan Cohn, Sabre Labs

How it often goes

As an end user or customer support engineer, nothing is more frustrating than hearing a developer say, “but it works on my machine.” This phrase is often uttered in the context of a desktop application that, incredibly, isn’t working as advertised despite running perfectly on the developer’s own machine. After all, code execution is deterministic, right? The user must have done something wrong. Maybe it’s the hardware or operating system. 

Ironically, software developers are also on the receiving end of the “it works on my machine” refrain, often when they first join a company or a new team. Picture how this plays out: A new hire joins, learns the ropes, and yearns to become a contributing member of the team. Armed with a laptop and, if lucky, an encyclopedia of links to internal wikis, documentation, and Getting Started guides, they begin tackling their first user story or software defect. That’s when the trouble starts. The existing code won’t compile. Or it eventually compiles, but the application won’t start. And yet, somehow, it seems to work fine for everyone else on the team. 

I skipped over the onboarding prerequisites of downloading, installing, and configuring various tools and applications, likely dozens of them. Spending many long days setting up a “local environment” for coding and testing, only to find that nothing really works, can be incredibly frustrating. Instead of debugging their own code, new developers find themselves debugging the development environment. The new hire thinks: “Did I miss a step? Forget to install something? Maybe I downloaded the wrong version of one of the tools. Is the documentation out of date? This shouldn’t be so difficult…” 

There is also a scenario in which our intrepid newbie thinks everything is good to go and submits code that works on their system but somehow fails to build or, worse yet, behaves differently in the integration environment. Thus begins the process of troubleshooting, which consumes not only the new hire’s time but often that of the most experienced team members as well. 

How can we avoid these difficulties? The answer seems plain enough: Construct a common development environment, ideally one that supports multiple teams or even the whole company. There are several “traditional” ways to do this: 

  • Create a working environment on a laptop and take an image. 
  • Build a virtual machine (VM) image with everything preinstalled. 
  • Write an “install” script that loads and configures the environment. 

The trouble with these approaches is that they’re frozen in time. What happens when you need to upgrade a tool or introduce something new in the environment? How do you roll out updates to all your developers without disrupting their work? 

Contain yourself: Docker to the rescue! 

We’ve found a good alternative in Docker images. Docker is a tool for delivering software in portable packages called containers. It has been around for ten years and has the advantage of maintaining version-controlled images in a centralized repository known as a registry. Container images are composed of layers, making it possible to add or replace layers without rebuilding the entire image whenever an update is required. When we release a new version of the development environment, users download only the new layers, which are much smaller than the whole image. Docker can also download (or “pull”) multiple layers at once, which typically makes better use of network bandwidth and completes the download in less time. 
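
To make this a bit more concrete, here is a rough command-line sketch of how layered downloads play out. The registry and image names are purely illustrative, not Sabre’s actual ones.

  # Pull an illustrative dev-environment image; Docker fetches several
  # layers in parallel and skips any layers already in the local cache,
  # so moving to a new tag downloads only what changed.
  docker pull registry.example.com/dev-env:2023.05

  # List the layers that make up the image, along with the size of each.
  docker history registry.example.com/dev-env:2023.05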

Unlike with virtual desktops or VMs, there is no need to “reboot” after downloading a new version. Launching a new container is relatively fast because it makes use of an operating system kernel that’s already up and running. Containers have the advantage of running pretty much anywhere, including physical machines, virtual machines, and cloud environments. Our developers have a mixture of Windows and Mac laptops along with Linux VMs and some cloud-based virtual desktops. The beauty of a containerized environment is that it behaves the same (for the most part) on every system. 

While I’m extolling the virtues of container-based development, let me add one more, and this one is important: The continuous integration (CI) system can easily use the same container image to perform builds and execute tests. This means that code that builds “locally” is almost guaranteed to build properly in the CI environment (assuming both use the same image version). The resulting executables should behave consistently as well. 
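
As a sketch of what this parity looks like in practice, the same image tag can drive both a local build and the CI build. The registry path, mount points, and Bazel target below are illustrative assumptions, not our exact setup.

  # Run a build inside the same image the CI system uses, mounting the
  # source tree into the container.
  docker run --rm \
    -v "$PWD:/workspace" -w /workspace \
    registry.example.com/dev-env:2023.05 \
    bazel build //services/example:all

  # The CI pipeline runs the equivalent command in a container started from
  # the same image tag, so toolchains and library versions match by construction.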

As with virtual machines, another benefit of containers is that they are isolated from the host machine. Nothing you do inside the container can “brick” your laptop. If something goes horribly awry (as it tends to do from time to time), you simply restart the environment. You can even roll back to a previous version if necessary. 

How it works at Sabre 

If you’ve read the previous posts in this series, you know that we have a Git-based monorepo and use Bazel as a common build tool. The overall themes are consistency and usability for a best-in-class developer experience with the objective of increased engineering productivity. A standard developer environment with “batteries included” enables us to achieve this goal. We call it the Sabre Engineering Environment, or S2E. 

S2E is Linux-based and comes preconfigured with everything but an integrated development environment (IDE). (I’ll address the missing IDE in a moment.) We provide developers an install wizard that takes them through the steps of downloading the environment and any prerequisites (in particular, Docker Desktop for Mac or Windows). Once these are installed, they run a command-line tool that starts the environment within a terminal window. Parts of the filesystem are shared between S2E and the desktop. For example, source files are present in both places, so they are editable by desktop tools such as Visual Studio Code. 
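
Conceptually, the launcher boils down to something like the following. The image name, paths, and volume names are illustrative, not the actual S2E tooling.

  # Start an interactive environment, bind-mounting the source tree so the
  # desktop and the container see the same files, and using a named volume
  # for caches that should persist across container restarts.
  docker run -it --rm --name dev-env \
    -v "$HOME/src/sabre2:/home/dev/sabre2" \
    -v dev-env-cache:/home/dev/.cache \
    registry.example.com/s2e:latest /bin/bash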

Sabre has an internal container registry which houses the 8+ GB S2E image. The “sabre2” monorepo contains the Dockerfile and other files required to create the image (just like any other source code we maintain). An automated CI process builds and verifies the image whenever there’s a change. We release a new version about once a week. 
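
The CI job for the image itself is conceptually a build, verify, and publish sequence. The registry path, Dockerfile location, and tag scheme below are illustrative.

  # Build the environment image from the Dockerfile kept in the monorepo,
  # verify it, and publish it to the internal registry.
  docker build -f environment/Dockerfile -t registry.example.com/s2e:2023.21 .
  # ... run smoke tests against the freshly built image ...
  docker push registry.example.com/s2e:2023.21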

Layering it on 

One of the interesting challenges we face is how to minimize the impact of updates to the container image. What do I mean by this? Well, suppose we want to add a new tool such as “gcloud” for interfacing with Google Cloud services. If we add it near the beginning of the Dockerfile, it will cause all subsequent layers to be rebuilt, resulting in a huge download. If we add it at the end of the Dockerfile, the initial release will be manageable, but the new layer will be rebuilt whenever there’s a change that affects a previous layer, such as an update to one of our internal tools. Since the Google Cloud SDK is fairly large, we do not want to embed a new copy in each release of the environment. That means we probably want to install it somewhere around the middle of the Dockerfile. 

What about upgrading an existing tool? If the install step is early in the Dockerfile, the update will be quite large. Maybe we can add a step later in the file to reinstall the tool (overwriting the older version), but that will increase the overall size of the image and may cause difficulties down the road. Another option is to batch upgrades together – updating multiple tools at one time – so that we only pay the price occasionally (hopefully not that often). 

There is no perfect answer because we can’t predict the future. However, we can establish some rules of thumb: Tools that are large and/or infrequently updated belong near the top of the Dockerfile. Tools or files that change frequently belong near the bottom unless they are relatively large, in which case they belong closer to the middle. 
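
Put together, those rules of thumb suggest a Dockerfile shaped roughly like the sketch below. The base image, packages, and paths are illustrative assumptions, not the actual S2E Dockerfile; the Google Cloud SDK is installed via its public quick-install script.

  # Illustrative layer ordering, not the actual S2E Dockerfile.
  FROM ubuntu:22.04

  # Top: large, rarely updated toolchains. These layers stay cached and are
  # almost never re-downloaded by users.
  RUN apt-get update && \
      apt-get install -y build-essential openjdk-17-jdk curl ca-certificates && \
      rm -rf /var/lib/apt/lists/*

  # Middle: large tools that change occasionally, such as the Google Cloud SDK.
  # A change above forces this layer to rebuild, but that should be rare.
  RUN curl -sSL https://sdk.cloud.google.com | bash -s -- --disable-prompts

  # Bottom: small, frequently changing pieces. Internal scripts and config
  # produce small layers, so a routine release means a small download.
  COPY tools/ /opt/internal-tools/
  COPY config/ /etc/internal/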

Challenges 

Unfortunately, Docker isn’t conducive to graphical tools like IDEs. It’s possible to run a GUI-based application in a container, but it’s cumbersome and requires a Linux machine or VM. For this reason, we ask our developers to install IDEs outside of S2E. This has the advantage that developers can choose their favorite IDEs or editors. They can edit files on their desktops and then execute builds and git commits from within the container. This is achieved by volume-mounting certain directories so that they are shared between the host and Docker environments. A disadvantage of this approach is that some features of the IDE are unusable unless you install additional tools, like compilers and SDKs, on the host system. This somewhat defeats the purpose of a self-contained development environment, but as a practical matter, we’ve found it to be a good compromise. 

One way to avoid having to install more tools on the host system is by connecting the IDE’s graphical “front end” or “thin client” to a back end running inside the container. For example, Visual Studio Code supports this via its remote development feature.  

You may be thinking that performance is another challenge. It turns out that modern virtualization is reasonably performant. On non-Linux systems, though, you have to configure Docker Desktop with sufficient memory and CPU to do its job, especially when using it to build and run applications. In addition, there is a cost to sharing files between operating systems. Some tuning is necessary to ensure that volumes with high I/O activity, such as the build cache, are configured for performance. On Macs in particular, it’s helpful to configure source code directories as “cached” so that Docker optimizes them for container-based reads and host-based writes. On Windows, you must enable Docker’s WSL 2 engine if containers will be launched from the Windows Subsystem for Linux (WSL), and disable it when using Git Bash or another Windows-based terminal. 
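
For example, on a Mac the launch command shown earlier might add a mount-consistency hint for the source tree and keep the build cache in a named volume that never crosses the file-sharing boundary. As before, the names and paths are illustrative.

  # "cached" tells Docker Desktop for Mac to favor container-side reads of
  # files that the host writes; the named volume keeps high-I/O build output
  # inside the Docker VM instead of going through the file-sharing layer.
  docker run -it --rm \
    -v "$HOME/src/sabre2:/home/dev/sabre2:cached" \
    -v build-cache:/home/dev/.cache/bazel \
    registry.example.com/s2e:latest /bin/bash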

Side note: Working in a Linux container isn’t conducive to desktop app development (e.g., for Windows or Mac applications). We have a few such applications at Sabre. For now, they use existing platforms and tools rather than adopting container-based development. 

Another challenge is what I call the Kitchen Sink Problem. Developers often have their own favorite tools, and, of course, the needs of one project may differ from those of another. It isn’t practical to stuff every possible compiler, SDK, shell, and power tool into a common platform shared by everyone. It would create a large and unwieldy image. Furthermore, giving developers so many options would diminish consistency and increase platform support costs. 

Our approach has been to honor requests for tools with wide applicability or a small footprint and to push back on others. We’ve also made it easy to load additional tools or customizations at startup time. As a result, the platform is relatively manageable, even though we’ve packed in quite a lot of functionality (well over a hundred tools). 
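
A common way to support such customizations is to have the container’s startup script source a user-provided hook if one exists. The sketch below is a generic pattern with a hypothetical path, not necessarily how S2E implements it.

  # In the container's shell startup (e.g., a profile script), load any
  # user-provided customizations from a mounted home directory.
  # The path is hypothetical.
  if [ -f "$HOME/.devenv/customize.sh" ]; then
    . "$HOME/.devenv/customize.sh"
  fi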

What the future holds 

Despite the challenges, container-based development has enabled us to operate with increased velocity and to focus on building quality software for our customers. With contributions from our internal developer community, we continue adding value for everyone who uses the environment. This is easily achievable when you have a common platform and a weekly release cadence. 

Nonetheless, we are moving beyond containers and creating a next-generation environment based on the Nix package manager. Nix is a tool specifically designed to produce consistent, reliable systems for builds and deployments. It turns out to be both an alternative and a complement to what we have today. 

Having extolled the virtues of containers at length, I should explain why we are going in a new direction. The main reason is simply the pursuit of a top-notch developer experience. Nix gives us the ability to run the environment on virtual machines without the small yet consequential overhead of containers. Although containers are portable, there are configuration and performance differences from one host operating system to another, primarily due to file sharing. 

We ran into significant problems when developers began receiving MacBooks based on “Apple silicon” (with the ARM64 processor architecture). Our images are Intel x86-based and did not play nicely with the new hardware and its “qemu” emulation layer. This was a big impetus for us to explore other options. Fortunately, Docker recently added Rosetta 2 support following Apple’s release of macOS 13. This appears to have largely eliminated the difficulties we had running the environment on M1 and M2 laptops. 

With Nix we can more easily add, remove, and upgrade individual packages. We no longer need to worry about invalidating layers and producing enormous image downloads. We can install Nix inside a container and continue to support our Docker-based users as we transition them to VMs. We can continue to use containers in our Kubernetes-based build system as well. 
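
For a flavor of what that looks like, here is a minimal sketch using standard Nix commands. The package attribute names come from nixpkgs and may differ depending on the channel or pin in use.

  # Enter an ad-hoc shell with specific tools available, with no image
  # rebuild or layer download required.
  nix-shell -p git bazel_6 google-cloud-sdk

  # Or install a single tool into the user's profile.
  nix-env -iA nixpkgs.ripgrep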

Speaking of build systems, that’s the subject of our next and final post in this series. We’ll bring all the pieces together – Git, monorepos, Bazel, and containers – and look at how we built a centralized CI system supporting hundreds of users and microservices. 
