Tuesday, October 07, 2014

Bake - Continuous Integration System

Summary: I've written a continuous integration system, in Haskell, designed for large projects. It works, but probably won't scale yet.

I've just released bake, a continuous integration system - an alternative to Jenkins, Travis, Buildbot etc. Bake eliminates the problem of "broken builds": a patch is never merged into the repo before it has passed all the tests.

Bake is designed for large, productive, semi-trusted teams:

  • Large teams where there are at least several contributors working full-time on a single code base.
  • Productive teams which are regularly pushing code, many times a day.
  • Semi-trusted teams where code does not go through manual code review, but code does need to pass a test suite and perhaps some static analysis. People are assumed to be fallible.

Current state: At the moment I have a rudimentary test suite, and it seems to mostly work, but Bake has never been deployed for real. Some optional functionality doesn't work, some of the web UI is a bit crude, the algorithms probably don't scale and all console output from all tests is kept in memory forever. I consider the design and API to be complete, and the scaling issues to be easily fixable - but it's easier to fix after it becomes clear where the bottleneck is. If you are interested, take a look, and then email me.

To give a flavour, the web GUI of a running Bake system looks like:

The Design

Bake is a Haskell library that can be used to put together a continuous integration server. To run Bake you start a single server for your project, which coordinates tasks, provides an HTTP API for submitting new patches, and a web-based GUI for viewing the progress of your patches. You also run some Bake clients which run the tests on behalf of the server. While Bake is written in Haskell, most of the tests are expected to just call some system command.

There are a few aspects that make Bake unique:

  • Patches are submitted to Bake, but are not applied to the main repo until they have passed all their tests. There is no way for someone to "break the build" - at all points the repo will build on all platforms and all tests will pass.
  • Bake scales up so that even if you have 5 hours of testing and 50 commits a day it will not require 250 hours of computation per day. In order for Bake to prove that a set of patches passes a test, it does not have to test each patch individually.
  • Bake allows multiple clients to run tests, even if some tests are only able to be run on some clients, allowing both parallelisation and specialisation (testing both Windows and Linux, for example).
  • Bake can detect that tests are no longer valid, for example because they access a server that is no longer running, and report the issue without blaming the submitted patches.
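To make the scaling point concrete, here is a minimal sketch of one way to clear a batch of patches without testing each one individually: test the whole batch, and bisect only when it fails. This is a generic illustration, not Bake's actual algorithm, which this post does not describe. The names Patch and findBad are hypothetical, and the sketch assumes that a set of patches fails exactly when it contains a bad patch.

```haskell
-- | Hypothetical patch identifier, for illustration only.
type Patch = Int

-- | Given an oracle reporting whether a set of patches passes the test suite
--   when applied together, return the patches to blame and the number of
--   test-suite runs used.
--   Assumption: a set fails if and only if it contains a bad patch.
findBad :: ([Patch] -> Bool) -> [Patch] -> ([Patch], Int)
findBad passes ps
    | null ps   = ([], 0)
    | passes ps = ([], 1)       -- one run clears every patch in the batch
    | [p] <- ps = ([p], 1)      -- a single failing patch is the culprit
    | otherwise =
        let (l, r)     = splitAt (length ps `div` 2) ps
            (badL, nL) = findBad passes l
            (badR, nR) = findBad passes r
        in (badL ++ badR, 1 + nL + nR)
```

With 50 patches that all pass, findBad uses a single run of the test suite. With one bad patch among the 50, for example findBad (notElem 37) [1..50], it blames patch 37 after 13 runs rather than 50.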

An Example

The test suite provides both an example configuration and commands to drive it. Here we annotate a slightly simplified version of the example; for lists of imports see the original code.

First we define an enumeration for where we want tests to run. Our server is going to require tests on both Windows and Linux before a patch is accepted.

data Platform = Linux | Windows deriving (Show,Read,Eq)
platforms = [Linux,Windows]

Next we define the test type. A test is something that must pass before a patch is accepted.

data Action = Compile | Run Int deriving (Show,Read)

Our type is named Action. We have two distinct types of tests, compiling the code, and running the result with a particular argument. Now we need to supply some information about the tests:

allTests = [(p,t) | p <- platforms, t <- Compile : map Run [1,10,0]]

execute :: (Platform,Action) -> TestInfo (Platform,Action)
execute (p,Compile) = matchOS p $ run $ do
    cmd "ghc --make Main.hs"
execute (p,Run i) = require [(p,Compile)] $ matchOS p $ run $ do
    cmd ("." </> "Main") (show i)

We have to declare allTests, the list of all tests that must pass, and execute, which gives information about a test. Note that the test type is (Platform,Action), so a test is a platform (where to run the test) and an Action (what to run). The run function gives an IO action to run, and require specifies dependencies. We use an auxiliary matchOS to detect whether a test is running on the right platform:

#if WINDOWS
myPlatform = Windows
#else
myPlatform = Linux
#endif

matchOS :: Platform -> TestInfo t -> TestInfo t
matchOS p = suitable (return (p == myPlatform))

We use the suitable function to declare whether a test can run on a particular client. Finally, we define the main function:

main :: IO ()
main = bake $
    ovenGit "http://example.com/myrepo.git" "master" $
    ovenTest readShowStringy (return allTests) execute
    defaultOven{ovenServer=("127.0.0.1",5000)}

We define main = bake, then fill in some configuration. We first declare we are working with Git, and give a repo name and branch name. Next we declare what the tests are, passing the information about the tests. Finally we give a host/port for the server, which we can visit in a web browser or access via the HTTP API.
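As a quick check of what the configuration describes, the following standalone sketch expands allTests and round-trips each test through a string. It does not use Bake. The roundTrip helper is hypothetical, and it rests on the assumption (suggested by the name readShowStringy, not confirmed here) that tests are passed around as strings via their Show and Read instances. Eq is derived only so the results can be compared.

```haskell
data Platform = Linux | Windows deriving (Show,Read,Eq)
data Action = Compile | Run Int deriving (Show,Read,Eq)

platforms :: [Platform]
platforms = [Linux,Windows]

-- Every platform paired with Compile and the three Run arguments: 2 * 4 = 8 tests.
allTests :: [(Platform,Action)]
allTests = [(p,t) | p <- platforms, t <- Compile : map Run [1,10,0]]

-- | Hypothetical helper: serialise a test with show and parse it back with read.
--   Illustrates a Show/Read string encoding; it is not Bake's own code.
roundTrip :: (Platform,Action) -> (Platform,Action)
roundTrip = read . show
```

Here length allTests is 8, show (Windows, Run 10) is "(Windows,Run 10)", and map roundTrip allTests gives back allTests unchanged, which is the property a string-based encoding of tests needs.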

Using the Example

Now that we have defined the example, we need to start up a server and some clients using the command line for our tool. Assuming we compiled as bake, we can write bake server and bake client (we'll need to launch at least one client per OS). We can view the state by visiting http://127.0.0.1:5000 in a web browser.
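The need for one client per OS follows from how the platform check and the dependencies combine. The sketch below is a standalone model of that interaction, not Bake's scheduler: the names requires and runnable are hypothetical, and requires simply mirrors the require calls in the example configuration.

```haskell
data Platform = Linux | Windows deriving (Show,Read,Eq)
data Action = Compile | Run Int deriving (Show,Read,Eq)

type Test = (Platform,Action)

allTests :: [Test]
allTests = [(p,t) | p <- [Linux,Windows], t <- Compile : map Run [1,10,0]]

-- | Dependencies of a test, mirroring 'require' in the example:
--   running the program needs the compile step on the same platform.
requires :: Test -> [Test]
requires (p,Run _) = [(p,Compile)]
requires _         = []

-- | Tests a client on platform 'me' could start now: they target its platform,
--   have not passed yet, and all of their dependencies have passed.
runnable :: Platform -> [Test] -> [Test] -> [Test]
runnable me passed tests =
    [ t | t@(p,_) <- tests
        , p == me
        , t `notElem` passed
        , all (`elem` passed) (requires t) ]
```

In this model, runnable Linux [] allTests offers only (Linux,Compile). Once that has passed, the three Linux Run tests become available, while the Windows tests wait until a Windows client connects.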

To add a patch we can run bake addpatch --name=cb3c2a71, using the SHA1 of the commit, which will try to integrate that patch into the master branch once all the tests have passed.

9 comments:

Luca Bruno said...

Bake is also the name for a Vala build system: https://launchpad.net/bake

Anonymous said...

> Bake scales up so that even if you have 5 hours of testing and 50 commits a day it will not require 250 hours of computation per day. In order for Bake to prove that a set of patches pass a test, it does not have to test each patch individually.

Doesn’t that hurt bisectability if untested patches reach the repository?

Neil Mitchell said...

Luca: I didn't know that. I suspect Bake has been used for several build systems over time (anything *ake seems to be quite popular), so I'll make sure to always write "Bake continuous integration" and hopefully that won't confuse people. I might also add a link to make it quite clear the projects are unrelated.

nometa: Yes, it hurts bisectability a bit. But already most people only CI test on a push, not on each commit, so there are plenty of untested commits. Even with my small projects testing each commit would be infeasible. So it removes a few known good points, but hopefully it's not significantly worse in practice.

Greg Weber said...

> Patches are submitted to Bake, but are not applied to the main repo until they have passed all their tests. There is no way for someone to "break the build" - at all points the repo will build on all platforms and all tests will pass.

This is how the CI from koalitycode.com works. However, they just got bought by docker and will shut down, so your claim to uniqueness may end up holding.

Travis CI integration with pull requests on github ends up providing a similar result. Which style is better depends on your particular use case.

Neil Mitchell said...

Greg: I should probably tone down the word unique. I'll do that in the README, which I basically pasted here.

Travis CI + GitHub provides a similar result for untrusted teams, where the contributors have their code reviewed. Doing it with your own patches would be too much work. You want to automate the pressing of merge when it has passed, which Bake does.

insitu said...

Hi Neil,
Really interested in seeing and using this!

I like the idea of committing after the build is proved to pass. This is what I achieved with gerrit + Jenkins, with the additional benefit of having a code review gate, something that I found of great value for distributed (in space or time) teams for which pairing is hard. The nice thing with gerrit is that each commit triggers a build so bisectability is not hurt. My project was small enough (and I strove hard to keep it that way) that build time was not an issue.

But then you can combine gerrit + zuul (that's what mediawiki and openstack do) to 1/ parallelise builds and 2/ stop early builds that contain a failed commit.

I would not have been so categorical a few years ago but now I think having each commit built and passed before merge is an absolute must (e.g. bisectability is not an option IMHO), having wasted hours tracking down bugs in a stack of commits.

Arnaud

Neil Mitchell said...

Arnaud: I've been looking at it, and it does seem I could do incremental building on each commit, but just not run the tests on each commit - that gives you most of bisectability, but saves you the time of running the tests (which can be hours).

I totally agree that code must pass before merging, in my case because if you have a multi-developer team you need the person who wrote the bad code to have the problem of fixing the bad code, not the other people who need to get their code included.

insitu said...

IMHO, this should be left to the discretion of the user. The definition of what qualifies as a valid build is highly dependent on your particular context. But this might already be the case :-)

Neil Mitchell said...

Of course! The entire system is parameterised over state, patches and tests. There is no inbuilt knowledge of anything about what any of the tests do, and things like Git integration are added around the outside. Polymorphism basically makes it impossible to cheat and shove such knowledge in the internals.