Testcontainers

I have been working on VADL in a research group for a couple of months now. VADL is an architecture description language for generating tools for rapid prototyping. Johannes Zottele is generating a QEMU simulator, while I am working on an LLVM compiler.
A smooth developer experience is very important to me while working on it. One of my problems is that I constantly need to have multiple windows open and switch between them to verify my changes. It is even more work to make sure that none of my changes broke something that worked before. To mitigate this, I tried to move as much testing as possible into JUnit tests to codify the workflow. The Java tests do not contain much logic themselves; they create containers which do most of the heavy lifting. A Java test then either verifies that the container has terminated successfully, or it extracts the generated output and asserts that it is correct.
Since I did not want to work with the Docker API directly, we used a library called Testcontainers (version 1.20.0). I would like to tell you about some of the problems that Johannes and I encountered while testing VADL.

Cache Mounts

My initial goal was to compile LLVM in a container to verify that my changes compile.
This turned out to be very hard because I wanted to compile it at image build time, not at runtime. My plan was to use the pre-built image and run it for every test input (this did not work). A clean LLVM build takes around 30 minutes. With caching, the build time drops to a few minutes, which is OK for the CI. LLVM can be built with ccache, but ccache requires a directory to hold the cache. This is not a problem with the Docker CLI because it lets you mount directories at image build time as well. However, the Testcontainers library does not support this feature. There is another project called sccache. Its benefit over ccache is that it supports additional storage backends like Redis. Johannes had the great idea to create an additional Redis container with a volume, so the entire test setup works out of the box. The network setup turned out to be uncomfortable, but Johannes managed to get it working.

var testNetwork = Network.newNetwork();
var container = new GenericContainer<>("redis:7.4")
        .withCreateContainerCmdModifier(cmd -> {
          var mount = new Mount()
              .withType(MountType.VOLUME)
              .withSource("open-vadl-redis-cache")
              .withTarget("/data");

          Objects.requireNonNull(cmd.getHostConfig())
              .withMounts(List.of(mount));
          cmd.withName("open-vadl-test-cache");
          cmd.withAliases("redis");
        })
        // we need this custom network, because other containers must access
        // the redis cache with the given hostname/alias
        // (which is only available on custom networks)
        .withNetwork(testNetwork)
        .withNetworkAliases(hostName);

var image = SetupRedisEnv.setupEnv(new ImageFromDockerfile("tc_llvm17", false)
            .withDockerfile(Paths.get(configuration.outputPath() + "/lcb/Dockerfile"))
            .withBuildArg("TARGET", target))
        // testNetwork is the local variable declared above, not a method
        .withBuildImageCmdModifier(modifier -> modifier.withNetworkMode(testNetwork.getId()));

The image variable can then be used to run a container which uses the Redis container as its cache.
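
The snippet does not show what SetupRedisEnv.setupEnv does. The following is only a sketch of what such a helper might compute, not our actual code: the class and method names are hypothetical, and it assumes that the sccache version in the image reads its Redis location from the SCCACHE_REDIS environment variable and that Redis listens on its default port 6379.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: derives build arguments that point sccache at the Redis container. */
public final class SccacheBuildArgs {
  private SccacheBuildArgs() {}

  /** Builds the Redis URL from the network alias of the cache container. */
  public static String redisUrl(String alias, int port) {
    return "redis://" + alias + ":" + port;
  }

  /** Returns the build arguments; SCCACHE_REDIS is an assumption about the sccache setup. */
  public static Map<String, String> forRedis(String alias, int port) {
    var args = new LinkedHashMap<String, String>();
    args.put("SCCACHE_REDIS", redisUrl(alias, port));
    return args;
  }
}
```

Each entry could then be handed to the image with withBuildArg, and the Dockerfile would have to turn the argument into an environment variable before the LLVM build starts. The alias only resolves because the image build runs on the same custom network as the Redis container.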

Build logs

Another inconvenience was that the logs are not shown by default when an image is built. After some googling, we found the necessary configuration for logback.xml to make them appear. I have included it here to save you the search.

...
<logger name="tc" level="DEBUG"/>
...

The important part is the tc logger, which has to be set to DEBUG.

Copy from Container

One of the problems of the Testcontainers API is that you cannot copy a directory from a container to the host system.
Johannes discovered why: the API returns a stream of an archive. If you simply read from that stream, you get nothing for a directory, because a directory entry has no bytes. Instead, you need to advance to the next entry of the archive.

I solved it by creating a tar archive inside the container. This archive is then copied from the container, and the JUnit test extracts it on the host system.
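
To make the "next entry" point concrete, here is a minimal sketch in plain Java. It is not our test code: TarEntries and readAll are hypothetical names, and the reader only understands plain ustar headers (names up to 100 characters, octal sizes). In a real project, a library such as Apache Commons Compress would be the more sensible choice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: walks a tar stream entry by entry. */
public final class TarEntries {
  private static final int BLOCK = 512;

  private TarEntries() {}

  /** Returns entry name -> content; directories map to an empty array. */
  public static Map<String, byte[]> readAll(InputStream in) {
    var entries = new LinkedHashMap<String, byte[]>();
    var header = new byte[BLOCK];
    try {
      // Every entry starts with a 512-byte header; an all-zero block ends the archive.
      while (in.readNBytes(header, 0, BLOCK) == BLOCK && header[0] != 0) {
        String name = text(header, 0, 100);
        String octalSize = text(header, 124, 12).trim();
        int size = octalSize.isEmpty() ? 0 : Integer.parseInt(octalSize, 8);
        // A directory entry has size 0, so nothing is read for it.
        byte[] content = in.readNBytes(size);
        // File data is zero-padded to the next 512-byte boundary; consume the padding.
        in.readNBytes((BLOCK - size % BLOCK) % BLOCK);
        entries.put(name, content);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return entries;
  }

  /** Reads a NUL-terminated ASCII field from the header. */
  private static String text(byte[] header, int offset, int length) {
    int end = offset;
    while (end < offset + length && header[end] != 0) {
      end++;
    }
    return new String(header, offset, end - offset, StandardCharsets.US_ASCII);
  }
}
```

The loop shows why a single read is not enough: the directory itself contributes only a header, and the files inside it are separate entries that follow later in the stream.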

Why not use mounts?
Because they unfortunately did not work on all systems. On the CI, I did not see the created files in the mounted folder on the host system, although it worked on my machine. I did not find the cause and gave up.

Image Builds

I mentioned earlier that we wanted to build LLVM at image build time so that we can reuse the image and run the compiler for each test input. This turned out to be more difficult than expected. The reason is that the Testcontainers library does not provide an API to just build an image. Instead, the image is built when the container is run.
I resolved it by compiling all the inputs in a single container and then using a TestFactory to map the test inputs to the test outputs, which lets me see the result per test case.

@TestFactory
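List<DynamicTest> compileLlvm() throws IOException, DuplicatedPassKeyException {
  // Build the image and run the container once; it compiles all inputs.
  var image = createImage();
  var container = runContainer(image);

  // One dynamic test per input file, so every input gets its own result.
  return inputFilesFromCFile().map(input -> DynamicTest.dynamicTest(input, () -> {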
      var name = Paths.get(input).getFileName();
      var expected = new File(
          "../../open-vadl/vadl-test/main/resources/llvm/riscv/assertions/assembly/" + name + ".s");

      var errorPath = hostOutput + "/output/" + name + ".err";
      var errorFile = new File(errorPath);

      // First check if an error file exists. Note that the container always
      // creates an error file, so we also check for the size.
      if (errorFile.exists() && errorFile.length() != 0) {
        assertThat(contentOf(errorFile)).isEqualToIgnoringWhitespace(contentOf(expected));
      } else {
        var actual = new File(hostOutput + "/output/" + name + ".s");
        assertThat(contentOf(actual)).isEqualToIgnoringWhitespace(contentOf(expected));
      }
    })).toList();
}

The runContainer method copies all the inputs into the container. After the container has finished, it copies a tar file from the container to the host system. For each input, the archive contains an assembly file or an error file with some debugging information.
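
The branch inside the dynamic test can also be pulled into a small helper. This is a sketch in plain Java, not our real code: CompilerOutput, resultFile and withoutWhitespace are hypothetical names, and the whitespace handling only approximates what AssertJ's isEqualToIgnoringWhitespace does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: decides which extracted file a test case should compare. */
public final class CompilerOutput {
  private CompilerOutput() {}

  /**
   * Returns the .err file if it exists and is non-empty, otherwise the .s file.
   * The size check matters because the container always creates an error file.
   */
  public static Path resultFile(Path outputDir, String name) {
    Path error = outputDir.resolve(name + ".err");
    try {
      if (Files.exists(error) && Files.size(error) > 0) {
        return error;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return outputDir.resolve(name + ".s");
  }

  /** Removes all whitespace so that formatting differences do not fail the comparison. */
  public static String withoutWhitespace(String text) {
    return text.replaceAll("\\s+", "");
  }
}
```

With such a helper, each dynamic test would only need to load the chosen file and compare it against the expected assembly, which keeps the lambda short.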