
Test Data

CodeJudge supports various testing methods that suit different types of exercises. A test group consists of one or more tests. A test may consist of a number of parameters: standard input (in), command line arguments (args), answer/expected output (ans), a hint (hint), a score (score) and/or a test script (described in the following). To create a test group, one can either set it up in the web interface or create the appropriate test files when using file-based exercise management.

A number of examples of how tests can be set up can be found in the guide How to create test data for my language?.

The file path written in parentheses after each type below should be used when using file-based exercises.
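
For example, a file-based test group could contain files such as the following (an illustrative listing; Test01 and Test02 are placeholder names):

Test01.in
Test01.ans
Test02.in
Test02.args
Test02.ans
Test02.hint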

Standard Input (TestXX.in)

Standard input, also known as console input, is the most commonly used option besides test scripts.
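
For example, a Python submission could read the contents of TestXX.in from standard input like this (a minimal sketch):

import sys

# Each line of the test's standard input arrives on sys.stdin.
for line in sys.stdin:
    print(line.strip())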

Command Line Arguments (TestXX.args)

Command line arguments, or just arguments, can be specified. See the documentation for your language for how to access command line arguments.
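
For instance, a Python submission can read the arguments from TestXX.args via sys.argv:

import sys

# sys.argv[0] is the script name; the command line arguments follow.
for arg in sys.argv[1:]:
    print(arg)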

Files in working directory (wkdir/* or TestXX.wkdir/*)

Working directory files will be copied to the working directory of the user's program when it is executed. This is useful if you want the users to learn about file access. Working directory files can be added at the exercise, test group, or test level.
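
For example, if the test data contains wkdir/data.txt (the filename is only an illustration), a Python submission could read it like this:

# The file has been copied into the program's working directory,
# so it can be opened by its plain filename.
with open("data.txt") as f:
    print(f.read())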

Test Scripts (TestXX.[lang])

A test script is a program written in the same language as the submission, which will be executed in combination with the submitted files. How this is done depends on the language. A test script is in many ways equivalent to a unit test. For instance, one could make an exercise where the users must implement a function average(a, b) that returns the average of a and b. In order to test it, you can upload a number of test scripts calling average(a, b) with different arguments. The easiest way to learn how to make test scripts is to look at our samples in the guide How to create test data for my language?.

Python (TestXX.py)

Example of a Python test script file.

# Calls the function the user must implement; expected output: 6.5
print(average(4, 9))

Java (TestXX.java)

In Java, the test script must be a fully functional Java program, except that it may call methods the users are supposed to provide. That is, it must consist of a public class with a normal public static void main(String[] args) method. For instance:

public class TestXX {
    public static void main(String[] args) {
        // Calls the method the user must implement; expected output: 6.5
        System.out.println(Calculator.average(4, 9));
    }
}

Please do not use packages in test scripts/solutions. This way, we can support users putting their classes in their own packages.

Other Languages

See How to create test data for my language?.

Answer/Expected Output (TestXX.ans)

When a program is tested, its standard output is compared against the expected output; the expected output is therefore a vital part of a test. The way the output is compared to the expected output is determined by the judge being used. Expected output can either be specified in files in the test data, or, more practically, it can be generated automatically by CodeJudge. If you want CodeJudge to generate it for you, simply upload a solution before uploading your test data.

Answer files (TestXX.ans/*)

Instead of checking the standard output of a program, you can check whether it produced the right output in one or more specific files.

For instance, if you create the file TestXX.ans/programout.txt with some content, CodeJudge will verify that submissions produce the file programout.txt with the same content.
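
A matching Python submission could then produce the file like this (a minimal sketch; the content is a placeholder):

# Writes the file that CodeJudge compares against TestXX.ans/programout.txt.
with open("programout.txt", "w") as f:
    f.write("42\n")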

Hints (TestXX.hint)

You can add a hint to a test case, which will be shown to the user if they fail the test (currently hints are shown no matter how a test fails). Hints are supposed to be short, for example "Did you consider negative numbers?" or similar.

Score (TestXX.score)

A test can have an associated score. This is useful for competitions and grading purposes. The score must be a single number. Higher scores are considered better. The score of a submission is the sum of all the scores of the test cases it passes.

Group (TestXX.group)

A test can be assigned to a group. When used in combination with scores, all tests in a group must pass in order for a submission to obtain the points for the tests in that group.

Test Configuration

A test group can have some common configuration settings. When using file-based exercises, the configuration can be specified in either exercise.yml, testgroup.yml, or config.yml. The different parameters that can be configured are described below:
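
For example, a testgroup.yml could look like this (an illustrative sketch; the values are placeholders and the keys are the parameter names described below):

OnTestFailure: Continue
CpuTimeLimit: 2000
MemoryLimit: 65536
JudgeType: TokenBased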

General Options

OnTestFailure

Enum:
Continue | Break | BreakAndDiscardTimeLimit

What should happen when a test fails.

Compiler Options

Target

String

This will override the automatically detected target; it works differently depending on the language.

CompilerArguments

String

Overrides the automatically built compiler arguments. $FILES will be replaced by a space-separated list of filenames. $OUT will be replaced by the target name. (This feature is currently not very robust, so use it with care.)

Runner Options

RunAsScripts

Boolean

This feature is only for Matlab when using test scripts. The uploaded code will be executed before the test script instead of being assumed to consist of functions.

UseInteractiveMode

Boolean

Input will be sent to the program line by line, and the recorded output will be the program's output interleaved with the input. (This simulates a user using the program in a console.)
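
For example, the recorded result of an interactive run might look like this (an illustrative transcript where "4" and "9" are the input lines):

Enter the first number:
4
Enter the second number:
9
The average is 6.5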

CppTestMode

Enum:
Normal | SourceBased

This feature is only for C++ when using test scripts. In "Normal" mode, the user's uploaded files are first compiled independently of the test script. This means the user code cannot access anything defined in the test script. In "SourceBased" mode, the user code and the test script are compiled together. When possible, "Normal" mode should be preferred.

CpuTimeLimit

Integer (ms)

The CPU time limit of the program. In case of programs using multiple cores, it will be the sum of CPU time over all cores.

WallTimeLimit

Integer (ms)

We normally recommend using the CPU time limit. The wall time limit will automatically be set to 3 times the CPU time limit if not specified.

MemoryLimit

Integer (kB)

At least 2000kB for Java programs. The implementation of this limit is language dependent, and might not work properly in all languages.

StackLimit

Integer (kB)

The implementation of this limit is language dependent, and might not work properly in all languages.

MaxCores

Integer

The maximum number of cores the user's program is allowed to use. Defaults to 1. Our current graders support up to 4 cores.

Judge Options

JudgeType

Enum:
TokenBased | Custom | Exact | None

Choose the judge type of the test group. TokenBased: the user output and the expected output are compared token by token, and all whitespace between tokens is ignored. Exact: the user output must exactly match the expected output. Custom: a custom judge to evaluate the output can be uploaded (useful if the exercise has multiple correct solutions).

RelativePrecision

Double

Only relevant if JudgeType=TokenBased.

The relative precision with which decimal numbers should be compared. Leave blank to compare all decimals.

AbsolutePrecision

Double

Only relevant if JudgeType=TokenBased.

The absolute precision with which decimal numbers should be compared. Leave blank to compare all decimals.

CaseSensitive

Boolean

Only relevant if JudgeType=TokenBased.

Whether the comparison of output to expected output should be case sensitive.

AddDelimiters

String

Only relevant if JudgeType=TokenBased.

Extra characters can be specified as delimiters. They should be written as a single string (e.g. "()[]" to ignore parentheses and square brackets).

Delimiters

String

Only relevant if JudgeType=TokenBased.

Similarly, the exact set of delimiters can be specified, i.e. the default delimiters will be overridden.

IgnoredSymbols

String

Only relevant if JudgeType=TokenBased.

These symbols are stripped from the (expected) output before comparison.

UseControlTokens

Boolean

(Deprecated)

IgnoreTrailingWhitespaces

Boolean

Only relevant if JudgeType=Exact.

Ignore whitespace at the end of lines.

View Options

HideExpectedOutput

Boolean

Hide the expected output from normal users (in the UI and when downloading test data).

Merging of configurations

As the test configs can be defined at three different levels (template/exercise/test group), the configs are merged in the following order:

  1. Template
  2. Exercise
  3. Test group
  4. Template language specific
  5. Exercise language specific
  6. Test group language specific

Example: Let's say you have a template t in which the general CPU time limit is set to 1000 ms, but for Python the CPU time limit is set to 4000 ms. Now, for an exercise e, you use the template t but set the general CPU time limit to 2000 ms. If a student uploads a solution in Java, the time limit will be 2000 ms (as exercise overrides template), but if they instead upload a solution in Python, the time limit will be 4000 ms (as template language-specific overrides exercise general).
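
The following Python sketch illustrates the merge order (this is only an illustration of the override semantics, not CodeJudge's actual implementation; the dictionary layout is an assumption):

def merge_config(template, exercise, testgroup, language):
    # Later layers override earlier ones; language-specific
    # settings override general ones at every level.
    layers = [
        template.get("general", {}),
        exercise.get("general", {}),
        testgroup.get("general", {}),
        template.get(language, {}),
        exercise.get(language, {}),
        testgroup.get(language, {}),
    ]
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged

# The scenario above:
t = {"general": {"CpuTimeLimit": 1000}, "python": {"CpuTimeLimit": 4000}}
e = {"general": {"CpuTimeLimit": 2000}}
print(merge_config(t, e, {}, "java"))    # {'CpuTimeLimit': 2000}
print(merge_config(t, e, {}, "python"))  # {'CpuTimeLimit': 4000}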

Language Support

Below you see a table of all languages currently supported on CodeJudge.

Language      Arguments   Input   Files   Test Scripts
Java          YES         YES     YES     YES
C             YES         YES     YES     NO
C++           YES         YES     YES     YES
C#            YES         YES     YES     NO
F#            YES         YES     YES     YES
Python        YES         YES     YES     YES
Matlab *      NO          YES     YES     YES
R             YES         YES     YES     YES
Bash          YES         YES     YES     NO
Prolog        YES         YES     YES     YES
Rust          YES         YES     YES     YES
Pascal        YES         YES     YES     NO
Coq           YES         YES     YES     NO
Elixir        YES         YES     YES     NO
Haskell       YES         YES     YES     NO
Go            YES         YES     YES     NO
HDL           YES         YES     YES     NO
Scala         YES         YES     YES     NO
JavaScript    YES         YES     YES     NO
TypeScript    YES         YES     YES     NO

* Please note: Matlab support is only available if you have a special agreement with us.

Judges

A judge is the program on the grader that evaluates the output of the users' programs. For each test case, the output of a user's program is (usually) compared to the expected output file, and if they match (the matching criterion depends on the judge), the test is passed.

CodeJudge currently supports the following three judges: the Exact Judge, the Token Based Judge and the Custom Judge.

Exact Judge

The Exact Judge checks whether the output of the user's program exactly matches the expected output file, including all whitespace, newlines, etc. The only exceptions are that \r characters are ignored, as are line breaks at the very end of the file.

This judge is especially useful for exercises where strings including whitespace should be printed (so they must match 100%). It can also be used with test scripts, since the user should not be printing the output there.

Token Based Judge

The Token Based Judge also compares the output of the user's program with the expected output, but any sequence of whitespace, including newlines, is treated as a single whitespace, so how the user chooses to separate the output won't matter.

This judge is useful for input/output exercises where the users have to print the output themselves and the whitespace doesn't matter; for example, "1 2 3" on one line and "1", "2", "3" on three separate lines are considered equal. This judge also has a number of configuration options (for decimal comparisons, case sensitivity, etc.).

Custom Judge

If you have a more advanced exercise where none of the above judges can evaluate the programs properly, you can write your own custom judge. The judge can be written in any of our supported languages.

Accessing data

If the user's program runs successfully, the custom judge will be run in the same working directory with two additional files: expected (the expected output) and output (the user's output), both without file extensions, which can be read by the judge. If the test has an input file, the judge can read its content from standard input.

Evaluating the program

After evaluating the program, the judge can output the following lines to standard output to save the results (only the first is mandatory):

RESULT [result]
[result] must be CORRECT if the test case is accepted, otherwise WRONG. (mandatory)
TEXT [text]
An optional message to the users about this single test run. For instance, if the test was passed it could be "Correct" as for the other judges, or if the test failed it could be an error message like "Saw a but expected b".
SCORE [score]
A score can indicate how well the submission passed the test case. Can be used for optimization problems.
FILE [file]
Specifies that [file] should be copied from the working directory, and will be shown together with the results of the test in CodeJudge.
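
For example, a minimal custom judge in Python could look like this (a sketch; the token-by-token comparison is only one possible matching criterion):

# Reads the user's output and the expected output from the files
# "output" and "expected" in the working directory.
with open("output") as f:
    user_tokens = f.read().split()
with open("expected") as f:
    expected_tokens = f.read().split()

if user_tokens == expected_tokens:
    print("RESULT CORRECT")
    print("TEXT Correct")
else:
    print("RESULT WRONG")
    print("TEXT The output did not match the expected result")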

If the judge program terminates with an error, the system will mark it as an exercise error.