
Test Data

CodeJudge supports various testing methods suited to different types of exercises. A test group consists of one or more tests. A test may consist of a number of parameters: standard input (in), command line arguments (args), expected output (out), a hint (hint), a score (score) and/or a test script (described below). To create a test group, one must create the appropriate test files, e.g. files specifying the input and the expected output.

Suppose we want to create three tests for a program adding two numbers from standard input. The tests could be "5 7" with expected output "12", "1 2" with expected output "3" and "-9 9" with expected output "0". There are two ways to structure the test files: in a single file or in many files (the first is most practical when tests are created manually, the second when they are generated with scripts). With the first approach, the above test cases can be created in a single file Tests.in containing:

/// Test
5 7
/// Out: 12
/// Test
1 2
/// Out: 3
/// Test
-9 9
/// Out: 0

The general pattern for this approach is a file named Xxxx.[parameter], where [parameter] can be any of the types described below. In this file, each test case must start with a line containing "/// Test". Tests may also be explicitly terminated by "/// EOT" (End Of Test). All lines not within a test are considered common to all tests. By default, tests created this way are named Xxxx01, Xxxx02, .... If you like, you can specify the name of a test with "/// Name: [name]" (the name must not contain whitespace). Finally, in any "Tests.[parameter1]" file one can write "/// [parameter2]: [value]" to set the contents of [parameter2] to [value] for a test (this is especially practical for hints and scores). (Note: specifying expected output is normally not necessary; see Expected Output for further details.)
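
For example, the following Tests.in fragment names a test and attaches a hint and a score to it (assuming the /// Hint and /// Score directives are capitalized like the /// Out directive above):

/// Test
/// Name: Negatives
-9 9
/// Out: 0
/// Hint: Did you consider negative numbers?
/// Score: 1
/// EOT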

Using the other approach, the test data would be saved in six files: Test01.in, Test01.out, Test02.in, Test02.out, Test03.in and Test03.out. The general pattern here is simple: each parameter is specified in its own file per test case, named [test].[parameter].

A number of examples of how tests can be set up can be found in the guide How to setup an exercise?.

Configuration

Besides the parameters for each test, a test group can have some common configuration settings. A test group can be configured either directly on the CodeJudge site when uploading or by uploading a file called testgroup.json together with the rest of the test files. The different parameters that can be configured are described below:

General Options

ExecutionMode

Enum:
All | UntilFailure | Performance

Chooses how the test group is executed. All: all tests are run regardless of their outcomes. UntilFailure: once a test fails, the rest of the tests are not run. Performance: as UntilFailure, but if a test fails due to the time limit, that test and all following tests are ignored.

Compiler Options

Target

String

This overrides the automatically detected target and works differently depending on the language.

CompilerArguments

String

Overrides the automatically built compiler arguments. $FILES will be replaced by a space separated list of filenames. $OUT will be replaced by the target name. (This feature is currently not very robust, so use with care.)
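
As an illustration, a hypothetical value for a C++ test group could be:

g++ -std=c++14 -O2 $FILES -o $OUT

where the two placeholders are substituted as described above (the default arguments CodeJudge builds are not shown here, so treat this as a sketch).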

Runner Options

RunAsScripts

Boolean

This feature is only for Matlab with test scripts. The uploaded code will be executed before the test script instead of being assumed to define functions.

UseInteractiveMode

Boolean

Input is sent to the program line by line, and the recorded output interleaves the program's output with the input (this simulates a user running the program in a console).

CppTestMode

Enum:
Precompiled | SourceBased

This feature is only for C++ with test scripts. In Precompiled mode, the user's uploaded files are first compiled independently of the test script. This means the user code cannot access anything defined in the test script. In SourceBased mode, the user code and the test script are compiled together. When possible, Precompiled mode should be preferred.

CpuTimeLimit

Integer (ms)

The CPU time limit of the program. For programs using multiple cores, the measured time is the sum of CPU time over all cores.

WallTimeLimit

Integer (ms)

We normally recommend using the CPU time limit instead. If not specified, the wall time limit is automatically set to 3 times the CPU time limit.

MemoryLimit

Integer (kB)

Must be at least 2000 kB for Java programs. The implementation of this limit is language dependent and might not work properly in all languages.

StackLimit

Integer (kB)

The implementation of this limit is language dependent and might not work properly in all languages.

MaxCores

Integer

The maximum number of cores the user's program is allowed to use. Defaults to 1. Our current graders support up to 4 cores.

Judge Options

JudgeType

Enum:
TokenBased | Custom | Exact | None

Chooses the judge type of the test group. TokenBased: the user output and the expected output are compared token by token, and all whitespace between tokens is ignored. Exact: the user output must exactly match the expected output. Custom: a custom judge to evaluate the output can be uploaded (useful if the exercise has multiple correct solutions).

RelativePrecision

Double

Only relevant if JudgeType=TokenBased.

The relative precision with which decimal numbers are compared. Leave blank to compare all decimals.

AbsolutePrecision

Double

Only relevant if JudgeType=TokenBased.

The absolute precision with which decimal numbers are compared. Leave blank to compare all decimals.
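
One plausible reading of the two precision options is the standard tolerance check sketched below in Java. CodeJudge's exact comparison rule is not documented here, so both the rule and the method name are assumptions:

public class PrecisionCheck {
    // Hypothetical sketch: two numeric tokens match if either tolerance holds
    static boolean matches(double user, double expected,
                           double absolutePrecision, double relativePrecision) {
        double diff = Math.abs(user - expected);
        return diff <= absolutePrecision
            || diff <= relativePrecision * Math.abs(expected);
    }
}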

CaseSensitive

Boolean

Only relevant if JudgeType=TokenBased.

Whether the comparison of the output to the expected output should be case sensitive.

AddDelimiters

String

Only relevant if JudgeType=TokenBased.

Extra characters can be specified as delimiters. They should be written as a single string (e.g. "()[]" to ignore parentheses and square brackets).

Delimiters

String

Only relevant if JudgeType=TokenBased.

Similarly, the exact set of delimiters can be specified; in this case the default delimiters are overwritten.

IgnoredSymbols

String

Only relevant if JudgeType=TokenBased.

These symbols are stripped from both the user output and the expected output before comparison.

UseControlTokens

Boolean

(Deprecated)

IgnoreTrailingWhitespaces

Boolean

Only relevant if JudgeType=Exact.

Ignore whitespace at the end of lines.

View Options

HideExpectedOutput

Boolean

Hide the expected output from normal users (in the UI and when downloading test data).
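
As an illustration, a testgroup.json combining several of the options above might look as follows. The exact schema is assumed here to be a flat JSON object keyed by the parameter names:

{
    "ExecutionMode": "All",
    "JudgeType": "TokenBased",
    "CaseSensitive": false,
    "CpuTimeLimit": 2000,
    "MemoryLimit": 262144
}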

Expected Output (out)

When a program is tested, its output is compared against the expected output; the expected output is therefore a vital part of a test. The way in which the output is compared to the expected output is determined by the judge being used. Expected output can either be specified in files in the test data or, more practically, be generated automatically by CodeJudge. If you want CodeJudge to generate it for you, simply upload a solution before uploading your test data.

Command Line Arguments (args)

Command line arguments, or just arguments, can be specified. See the documentation for your language to see how to access command line arguments.
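
For example, a minimal Java sketch of the addition exercise, reading the two numbers from command line arguments instead of standard input (the class name Sum is illustrative):

public class Sum {
    public static void main(String[] args) {
        // args[0] and args[1] hold the two command line arguments
        int a = Integer.parseInt(args[0]);
        int b = Integer.parseInt(args[1]);
        System.out.println(a + b);
    }
}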

Standard Input (in)

Standard input, also known as console input, is the most commonly used option besides test scripts.
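
For instance, a minimal Java solution to the addition exercise from the Test Data section reads the two numbers from standard input:

import java.util.Scanner;

public class Sum {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        // Reads two numbers, e.g. "5 7", and prints their sum, "12"
        System.out.println(in.nextInt() + in.nextInt());
    }
}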

Files in working directory (TestXX/wkdir/*)

All files and directories placed in TestXX/wkdir/ will be copied to the working directory of the user's program when it is executed. This is useful if you want the users to learn about file access.
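
For example, if a hypothetical file data.txt is placed in TestXX/wkdir/, a Java submission can open it by name, since it has been copied to the program's working directory:

import java.nio.file.Files;
import java.nio.file.Paths;

public class ReadData {
    public static void main(String[] args) throws Exception {
        // data.txt was copied from TestXX/wkdir/ to the working directory
        for (String line : Files.readAllLines(Paths.get("data.txt"))) {
            System.out.println(line);
        }
    }
}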

Test Scripts

A test script is a program written in the same language as the submission, which will be executed in combination with the submitted files. How this is done depends on the language. A test script is in many ways equivalent to a unit test. For instance, one could make an exercise where the users must implement a function average(a, b) that returns the average of a and b. In order to test it, you can upload a number of test scripts calling average(a, b) with different arguments. The easiest way to learn how to make test scripts is to look at our samples in the guide How to setup test data for my exercise?.

Java (java)

In Java, the test script must be a fully functional Java program, except that it may call methods the users are supposed to provide. That is, it must consist of a public class with a normal public static void main(String[] args) method. For instance:

public class Test01 {
    public static void main(String[] args) {
        System.out.println(Calculator.average(4, 9));
    }
}

If you have not uploaded a solution file before uploading your test data, you are also required to add Java files in the directory dummy/ that contain the classes and methods the users are supposed to implement. It is highly recommended not to use this approach, but to simply upload your solution instead. These classes/functions do not need to be functional, but they must compile. In other words, if you combine the test script file and these extra files, they must be able to compile. For the above example, we would add the file dummy/Calculator.java:

public class Calculator {
    public static double average(double a, double b) {
        return 0; // Dummy, since it must compile
    }
}

Please do not use packages in test scripts / solutions. This allows us to support users putting the classes in their own packages.

Remaining Languages

See examples.

Hints (hint)

You can add a hint to a test case, which will be shown to the user if the user fails the test (currently, hints are shown no matter how a test fails). Hints are supposed to be short, for example "Did you consider negative numbers?" or similar.

Score (score)

A test can have an associated score. This is useful for competitions and grading purposes. The score must be a single number. Higher scores are considered better. The score of a submission is the sum of all the scores of the test cases it passes.

Size (size)

A test may be given a size. This can be used for plotting running time versus size, which is useful for analyzing the asymptotic running time of a solution.

Language Support

Below is a table of all languages currently supported on CodeJudge.

Language                    Arguments   Input   Files   Test Scripts
Java 11                     YES         YES     YES     YES
Java 8                      YES         YES     YES     YES
C (gcc 8.2)                 YES         YES     YES     NO
C++ (C++14, g++ 8.2)        YES         YES     YES     YES
C++11 (g++ 8.2)             YES         YES     YES     YES
C++1z (g++ 8.2)             YES         YES     YES     YES
C# (Mono 5.14)              YES         YES     YES     NO
F# (4.1)                    YES         YES     YES     YES
Python3 (3.7)               YES         YES     YES     YES
Python2 (2.7)               YES         YES     YES     YES
Matlab (2018b) *            NO          YES     YES     YES
R (3.5)                     YES         YES     YES     YES
Bash (4.4 GNU)              YES         YES     YES     NO
Prolog (SWI-Prolog 7.6)     YES         YES     YES     YES
Rust (1.28)                 YES         YES     YES     YES
Pascal (fpc 3.0)            YES         YES     YES     NO
Coq (8.5pl3) [beta]         YES         YES     YES     NO
Elixir (1.5) [beta]         YES         YES     YES     NO
Haskell (8.2) [beta]        YES         YES     YES     NO
Go (1.9) [beta]             YES         YES     YES     NO
HDL                         YES         YES     YES     NO
Scala (2.12)                YES         YES     YES     NO
JavaScript                  YES         YES     YES     NO
TypeScript                  YES         YES     YES     NO

* Please note: Matlab support is only available if you have a special agreement with us.

Judges

A judge is the program on the grader that evaluates the output of the users' programs. For each test case, the output of a user's program is (usually) compared to the expected output file, and if they match (the matching criterion depends on the judge), the test is passed.

CodeJudge currently supports the following three judges: the Exact Judge, the Token Based Judge and the Custom Judge.

Exact Judge

The Exact Judge checks whether the output of the user's program exactly matches the expected output file, including all whitespace, newlines etc. The only exceptions are that \r characters and line breaks at the very end of the file are ignored.

This judge is especially useful for exercises where strings including whitespace should be printed (so they must match 100%). It can also be used with test scripts, since the user should not be printing the output there.

Token Based Judge

The Token Based Judge also compares the output of the user's program with the expected output, but any sequence of whitespace, including newlines, is treated as a single whitespace, so how the user chooses to separate the output does not matter.
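
For example, the Token Based Judge treats the following two outputs as equal, while the Exact Judge would not:

Expected output:    1 2 3

User output:        1    2
                    3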

This judge is useful for input/output exercises where the users have to print the output themselves and the whitespace does not matter. This judge also has a number of configuration options (for decimal comparisons, case sensitivity, etc.).

Custom Judge

If you have a more advanced exercise where none of the above judges can evaluate the programs properly, you can write your own custom judge. The judge can be written in any of our supported languages.

Uploading

To upload a custom judge, add a folder named judge/ when you upload the test files and place the source code file of the judge in this folder.

Accessing data

If the user's program runs successfully, the custom judge will be run in the same working directory with two additional files: expected (the expected output) and output (the user's output), both without file extensions, which can be read by the judge. If the test has an input file, the contents of this file can be read by the judge from standard input.

Evaluating the program

After evaluating the program, the judge can write the following lines to standard output to report the results (only the first is mandatory):

RESULT [result]
[result] must be CORRECT if the test case is accepted, otherwise WRONG. (mandatory)
TEXT [text]
An optional message to the users about this single test run. For instance, if the test was passed it could be "Correct", as for the other judges; if the test failed it could be an error message like "Saw 'a' but expected 'b'".
SCORE [score]
A score indicating how well the user passed the test case. This can be used for optimization problems.
FILE [file]
Specifies that [file] should be copied from the working directory and shown together with the results of the test in CodeJudge.

If the judge program terminates with an error, the system will mark it as a system error.
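
As an illustration, below is a minimal token-based custom judge written in Java, following the conventions above. The class name Judge and the comparison itself are illustrative, not a prescribed interface:

import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class Judge {
    public static void main(String[] args) throws Exception {
        // "expected" and "output" are placed in the working directory by CodeJudge
        List<String> expected = readTokens("expected");
        List<String> output = readTokens("output");
        if (expected.equals(output)) {
            System.out.println("RESULT CORRECT");
            System.out.println("TEXT Correct");
        } else {
            System.out.println("RESULT WRONG");
            System.out.println("TEXT Output does not match the expected tokens");
        }
    }

    // Splits a file into whitespace-separated tokens
    private static List<String> readTokens(String file) throws Exception {
        List<String> tokens = new ArrayList<>();
        try (Scanner s = new Scanner(Paths.get(file))) {
            while (s.hasNext()) {
                tokens.add(s.next());
            }
        }
        return tokens;
    }
}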

Templates/Examples

A few templates and examples of custom judges will be available here later.