Trial

A set of tools to supercharge teaching assistants.

March 24, 2023

  • project
  • docker
  • scripts

Automatic assignment submission retrieval

Since the students in the class use Git to collaborate and review their code on GitHub, we opted against requesting a .zip file for submission. Instead, we encourage them to adopt modern software practices and release a new version of their code using Git “tags”. It is therefore our responsibility to retrieve that release at the submission deadline. That’s why I decided to create a simple script, run as a cron job, that automatically collects every team’s repository at a given branch or tag.

With the pandemic and the continually increasing popularity of programming, we’ve observed a 2x increase in course enrollment. As of 2024, there are over 35 teams to assess. To speed this process up, the script spawns multiple parallel jobs. This way, even if some repositories exceed 100 MB in size, they are all retrieved at the same time rather than one after another.
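The parallel retrieval described above could be sketched as follows. This is a minimal illustration, not the actual script: the function names, the `repos` mapping and the worker count are assumptions. It relies on the fact that `git clone --branch` accepts a tag as well as a branch name.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def clone_command(repo_url: str, ref: str, dest: Path) -> list[str]:
    # --branch accepts a tag or a branch; --depth 1 skips the history,
    # which keeps large repositories quicker to download.
    return ["git", "clone", "--depth", "1", "--branch", ref, repo_url, str(dest)]


def fetch_all(repos, ref, out_dir, run=subprocess.run, workers=8):
    """Clone every team's repository at `ref` in parallel.

    `repos` maps a (hypothetical) team name to its repository URL.
    Returns a mapping of team name to True when the clone succeeded.
    """

    def fetch(item):
        team, url = item
        result = run(clone_command(url, ref, out_dir / team), capture_output=True)
        return team, result.returncode == 0

    # Threads are enough here: the work is done by the git subprocesses.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(fetch, repos.items()))
```

A cron entry would then only need to call `fetch_all` with the tag agreed upon for the submission. Teams whose value is `False` (for example because the tag is missing) can be followed up by hand.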

Furthermore, a new requirement for this year was to deploy a Dockerized version of the API to the GitHub Container Registry. We wanted to test this containerized version of their code, but to do so, we needed to ensure that no new version was pushed in the meantime. Hence, the tool also records each image’s SHA digest at the deadline, so that exactly that version can be downloaded and used later.
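One way to pin an image, sketched here under assumptions (the tool may well query the registry differently), is to pull it at the deadline, read its repository digest with `docker image inspect`, and later pull by that digest instead of by tag. The function names are hypothetical.

```python
import subprocess


def repo_digest(image: str, run=subprocess.run) -> str:
    """Return the `name@sha256:...` reference of an already pulled image."""
    result = run(
        ["docker", "image", "inspect", "--format", "{{index .RepoDigests 0}}", image],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def pull_pinned_command(reference: str) -> list[str]:
    # Pulling by digest always yields the same image,
    # even if the tag has been pushed again since.
    return ["docker", "pull", reference]
```

Storing the returned reference next to the cloned repository keeps the source and the image of a submission together.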

“But, our tests say it works!”

The second tool was needed to ensure that all projects are graded thoroughly, consistently and objectively. Several attempts had been made in previous years, each encountering its own set of challenges. Notably:

  • Employing highly efficient and user-friendly runtimes (Deno and Bun), which proved to be unstable.
  • Depending on a globally shared, nullable state to store, reuse and verify created entities.
  • Lack of transparency regarding the requests made (and responses received).
  • Complexity in writing tests for a new project.
  • Poor clarity of the generated JSON report.

A new version was therefore developed from scratch, with portability and scalability in mind. It also integrates some additional ideas:

  • Abstraction of report creation, allowing for various file types like Markdown or HTML.
  • Function-oriented actions and assertions to mitigate stateful issues.
  • Randomization of requests to avoid data collisions.
  • Execution of prerequisite actions for every test to achieve complete independence.
  • Simplified object validation using popular JS libraries (instead of niche Deno ones).

This allows the creation of numerous reusable blocks, making future tests easy and quick to write. Because each test is now fully independent, they can all be executed concurrently, which significantly reduces the overall execution time.
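To make these ideas concrete, here is a small sketch. The real tests are written in JavaScript; this Python version only illustrates the pattern, and every name in it (`FakeApi`, `given_user`, the `check_*` functions) is hypothetical. Each check creates its own prerequisite entity with random data and receives it as a return value instead of reading it from shared state, so all checks can run concurrently.

```python
import asyncio
import uuid


class FakeApi:
    """In-memory stand-in for a student API, used only for this illustration."""

    def __init__(self):
        self.users = {}

    async def create_user(self, name):
        await asyncio.sleep(0)  # yield control, as a real HTTP call would
        user = {"id": uuid.uuid4().hex, "name": name}
        self.users[user["id"]] = user
        return user

    async def get_user(self, user_id):
        await asyncio.sleep(0)
        return self.users.get(user_id)


def random_name():
    # Random data keeps concurrent checks from colliding on the same record.
    return f"user-{uuid.uuid4().hex[:8]}"


async def given_user(api):
    """Prerequisite action: returns what it created, no global state involved."""
    return await api.create_user(random_name())


async def check_user_can_be_fetched(api):
    user = await given_user(api)
    assert await api.get_user(user["id"]) == user


async def check_unknown_user_is_missing(api):
    assert await api.get_user("does-not-exist") is None


async def run_all(api, checks):
    """Run every check concurrently and report pass/fail per check name."""
    results = await asyncio.gather(*(c(api) for c in checks), return_exceptions=True)
    return {c.__name__: not isinstance(r, Exception) for c, r in zip(checks, results)}
```

Calling `asyncio.run(run_all(FakeApi(), [check_user_can_be_fetched, check_unknown_user_is_missing]))` returns a dictionary that a report writer could then render as Markdown or HTML.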

To each their own… environment

Now that the tests are easy to read and write, executing them is another story. Because they are written in JavaScript, the required setup is a mess: the correct Node.js version must be installed, certain dependencies rely on system libraries, and the runtime may change in the future. Each grader has to manage all of this individually. With approximately 10 graders now involved, this leads to numerous debugging sessions!

Another complexity arises from the fact that specific steps must be performed both before and after each test execution, in a specific order:

  1. Starting the database
  2. Compiling and starting the API, while providing the database credentials
  3. Executing the tests
  4. Stopping and dropping the database

These steps are always the same and also require installing and setting up yet another tool (MongoDB). It became evident that a wrapper around the tests needed to be developed. A small Python CLI now automates the process, running Docker containers with configurations generated at runtime. Why Python? Because all the graders have it installed, it integrates seamlessly with OS functionality (file system, commands, threads, etc.) and it’s incredibly easy to read and write. Additionally, it has only a single dependency (for generating YAML files from Python dictionaries), which minimizes the risk of mismatched versions.
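The four steps can be sketched as a thin wrapper around Docker Compose. This is an illustration under assumptions rather than the actual CLI: the service names, the `DATABASE_URL` variable and the function names are hypothetical, and the file is written as JSON (which is also valid YAML) so that the sketch needs no third-party package.

```python
import json
import secrets
import subprocess
from pathlib import Path


def new_password() -> str:
    # Generated at runtime, so no credentials are ever stored in a repository.
    return secrets.token_hex(16)


def build_compose(api_dir: str, db_password: str) -> dict:
    """Describe the database and the API as a plain dictionary."""
    return {
        "services": {
            "db": {
                "image": "mongo:7",
                "environment": {
                    "MONGO_INITDB_ROOT_USERNAME": "grader",
                    "MONGO_INITDB_ROOT_PASSWORD": db_password,
                },
            },
            "api": {
                "build": api_dir,
                "depends_on": ["db"],
                # Hypothetical variable name; each API may expect another one.
                "environment": {
                    "DATABASE_URL": f"mongodb://grader:{db_password}@db:27017"
                },
            },
        }
    }


def write_compose(config: dict, path: Path) -> Path:
    path.write_text(json.dumps(config, indent=2))
    return path


def run_trial(compose_file: Path, test_command: list, run=subprocess.run) -> None:
    compose = ["docker", "compose", "-f", str(compose_file)]
    try:
        run(compose + ["up", "--detach", "db"], check=True)            # 1. database
        run(compose + ["up", "--detach", "--build", "api"], check=True)  # 2. API
        run(test_command, check=True)                                   # 3. tests
    finally:
        # 4. Always stop the containers and drop the data, even on failure.
        run(compose + ["down", "--volumes"], check=True)
```

A grader would then run something like `run_trial(write_compose(build_compose("./team-01", new_password()), Path("compose.json")), ["npm", "test"])`, where the path and the test command are placeholders. The `finally` block is the important design choice: a failing test suite must never leave a database running for the next team.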

There is no trial without proof, law and order

After some reflection, I realized that these tools somewhat resemble the components of a trial. The first one gathers information and content for further inspection (the proof); the second one defines the rules that need to be followed (the law); and the third one ensures that every action is executed at the right time (the order). That’s where the name comes from :)

To prevent abuse, this code will unfortunately not be open-sourced.