Local grading

tcframe allows you to grade solutions locally, on your machine.

Before grading a solution, you must have already generated the test cases:

./runner

Then, you can “grade” a solution by executing:

./runner grade [--solution=<solution-command>]

where <solution-command> is the command for executing the solution. If it is omitted, the default is ./solution.

For example, suppose you have written a problem package for a problem. A friend has also written an alternate solution to the problem and wants to check whether it agrees with yours. Let’s assume that the solution file is solution_alt.cpp. Compile it into solution_alt, place it in the problem package, and then run:

./runner grade --solution=./solution_alt

There are other flags available as well. For the complete set of flags, see the API Reference.

The verdict of each test case, the verdict of each subtask (if any), and the overall verdict will be shown, as described below.

Verdicts

The recognized verdicts, from the best to the worst, are as follows:

Accepted
The output produced by the solution is correct.
OK
The output produced by the solution is partially correct.
Wrong Answer
The output produced by the solution is incorrect. By default, the diff will be shown, truncated to the first 10 lines.
Runtime Error
The solution crashed or used memory above the limit, if specified.
Time Limit Exceeded
The solution did not stop within the time limit, if specified.
Internal Error
The custom scorer or communicator (if any) crashed or did not give a valid verdict.

Test case verdicts

A test case verdict consists of a verdict and optionally points.

The verdict of each test case will be shown. For an OK verdict, the points (given by the scorer) will also be shown.

Subtask verdicts

If the problem has subtasks, the verdict of each subtask will be shown as well. A subtask verdict is the combination of:

  • verdict: the worst of the test case verdicts in the subtask
  • points:
    • the subtask points (assigned via Points()), if all test case verdicts in the subtask are Accepted,
    • the minimum points of OK verdicts in the subtask, if at least one test case verdict is OK and the rest are Accepted, or
    • 0, otherwise.
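
The rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not tcframe's actual implementation; the names VERDICT_ORDER, worst_verdict, and subtask_verdict are made up for this example, and the subtask is assumed to contain at least one test case.

```python
from typing import List, Optional, Tuple

# Verdict names ordered from best to worst, as listed in the Verdicts section.
VERDICT_ORDER = [
    "Accepted",
    "OK",
    "Wrong Answer",
    "Runtime Error",
    "Time Limit Exceeded",
    "Internal Error",
]

# A test case verdict: (verdict name, points or None when no points are given).
TestCaseVerdict = Tuple[str, Optional[float]]


def worst_verdict(names: List[str]) -> str:
    """Return the verdict that appears latest in VERDICT_ORDER."""
    return max(names, key=VERDICT_ORDER.index)


def subtask_verdict(
    test_cases: List[TestCaseVerdict], subtask_points: float
) -> Tuple[str, float]:
    """Combine the test case verdicts of one subtask (hypothetical helper)."""
    verdict = worst_verdict([name for name, _ in test_cases])
    if verdict == "Accepted":
        # Worst verdict is Accepted, so every test case is Accepted.
        points = subtask_points
    elif verdict == "OK":
        # At least one OK and the rest Accepted: take the minimum OK points.
        points = min(p for name, p in test_cases if name == "OK")
    else:
        points = 0
    return verdict, points
```

For instance, a subtask worth 40 points whose test cases are Accepted, OK [21], and OK [30] would get the verdict OK with 21 points under this rule.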

Overall verdict

Finally, the overall verdict is as follows.

For problems without subtasks:

  • verdict: the worst test case verdict
  • points: the sum of test case points, where:
    • an Accepted verdict will be given 100 / (number of test cases) points
    • an OK verdict will be given its own points
    • any other verdict will be given 0 points

For problems with subtasks:

  • verdict: the worst subtask verdict
  • points: the sum of subtask points
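
As a rough illustration of both cases, here is a small Python sketch. It is not tcframe's code; the function names are hypothetical, and the inputs are assumed to be non-empty lists of (verdict, points) pairs.

```python
from typing import List, Optional, Tuple

# Verdict names ordered from best to worst, as listed in the Verdicts section.
VERDICT_ORDER = [
    "Accepted",
    "OK",
    "Wrong Answer",
    "Runtime Error",
    "Time Limit Exceeded",
    "Internal Error",
]


def worst_verdict(names: List[str]) -> str:
    """Return the verdict that appears latest in VERDICT_ORDER."""
    return max(names, key=VERDICT_ORDER.index)


def overall_without_subtasks(
    test_cases: List[Tuple[str, Optional[float]]]
) -> Tuple[str, float]:
    """Overall verdict when the problem has no subtasks (hypothetical helper)."""
    total = 0.0
    for name, points in test_cases:
        if name == "Accepted":
            total += 100 / len(test_cases)  # equal share of 100 points
        elif name == "OK":
            total += points  # points given by the scorer
        # any other verdict contributes 0 points
    return worst_verdict([name for name, _ in test_cases]), total


def overall_with_subtasks(subtasks: List[Tuple[str, float]]) -> Tuple[str, float]:
    """Overall verdict when the problem has subtasks (hypothetical helper)."""
    verdict = worst_verdict([name for name, _ in subtasks])
    return verdict, sum(points for _, points in subtasks)
```

With four test cases whose verdicts are Accepted, Accepted, OK [21], and Wrong Answer, the first function yields Wrong Answer with 25 + 25 + 21 + 0 = 71 points. With subtask verdicts Accepted [40], Wrong Answer [0], and Runtime Error [0], the second yields Runtime Error with 40 points.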

Sample local grading output

Here is a sample output of local grading for a problem without subtasks.

Local grading with solution command: './solution_alt'...

[ SAMPLE TEST CASES ]
  k-product_sample_1: Accepted

[ OFFICIAL TEST CASES ]
  k-product_1: Accepted
  k-product_2: Accepted
  k-product_3: OK [21]
  k-product_4: Wrong Answer
    * scorer: Diff:
(expected) [line 01]    11
(received) [line 01]    12

[ VERDICT ]
  Wrong Answer [71]

And here is a sample output for a problem with subtasks.

Local grading with solution command: './solution_alt'...

[ SAMPLE TEST CASES ]
  k-product_sample_1: Accepted

[ TEST GROUP 1 ]
  k-product_1_1: Accepted

[ TEST GROUP 2 ]
  k-product_2_1: Accepted
  k-product_2_2: Accepted
  k-product_2_3: Accepted

[ TEST GROUP 3 ]
  k-product_3_1: Accepted
  k-product_3_2: Wrong Answer
    * scorer: Diff:
(expected) [line 01]    11
(received) [line 01]    12

  k-product_3_3: Accepted

[ TEST GROUP 4 ]
  k-product_4_1: Accepted
  k-product_4_2: Accepted
  k-product_4_3: Accepted
  k-product_4_4: Accepted
  k-product_4_5: Accepted
  k-product_4_6: Runtime Error
    * Execution of solution failed:
      - Exit code: 1
      - Standard error:

[ SUBTASK VERDICTS ]
  Subtask 1: Accepted [40]
  Subtask 2: Wrong Answer [0]
  Subtask 3: Runtime Error [0]

[ VERDICT ]
  Runtime Error [40]

This local grading feature is useful for creating “unit tests” for your test cases. For each problem, you can write many solutions with different intended results. For example, solution_123.cpp should pass subtasks 1 - 3; solution_12.cpp should pass subtasks 1 and 2 but not subtask 3, etc.

Brief mode

You can pass an additional --brief argument to make the output concise. This is primarily intended to be consumed by scripts instead of human eyes.

The first line of the output contains the overall verdict in the following format:

<code> <points>

where the code mapping is:

  • AC: Accepted
  • OK: OK
  • WA: Wrong Answer
  • RTE: Runtime Error
  • TLE: Time Limit Exceeded
  • ERR: Internal Error

If the problem has subtasks, the subtask verdicts will be output in the following lines, one line per subtask verdict ordered by subtask number, in the same format as above.

With the --brief argument, the sample outputs from the previous section would become the following:

WA 71

and

RTE 40
AC 40
WA 0
RTE 0
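
Since brief mode is meant for scripts, a script could parse it along the following lines. This is a minimal sketch, assuming that the brief output is written to standard output and that the points are always numeric; the helper names parse_brief and grade_brief are hypothetical and are not part of tcframe.

```python
import subprocess
from typing import List, Tuple

# Code mapping from the list above.
CODE_TO_VERDICT = {
    "AC": "Accepted",
    "OK": "OK",
    "WA": "Wrong Answer",
    "RTE": "Runtime Error",
    "TLE": "Time Limit Exceeded",
    "ERR": "Internal Error",
}

Verdict = Tuple[str, float]


def parse_brief(output: str) -> Tuple[Verdict, List[Verdict]]:
    """Split brief output into (overall verdict, subtask verdicts)."""
    verdicts = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        code, points = line.split()
        verdicts.append((CODE_TO_VERDICT[code], float(points)))
    # The first line is the overall verdict; the rest are subtask verdicts
    # ordered by subtask number (empty if the problem has no subtasks).
    return verdicts[0], verdicts[1:]


def grade_brief(solution_command: str, runner: str = "./runner") -> Tuple[Verdict, List[Verdict]]:
    """Run local grading in brief mode and parse the result.

    Assumption: the brief output goes to standard output.
    """
    result = subprocess.run(
        [runner, "grade", f"--solution={solution_command}", "--brief"],
        capture_output=True,
        text=True,
    )
    return parse_brief(result.stdout)
```

A "unit test" for test cases could then call grade_brief("./solution_12") and assert that the first two subtask verdicts are Accepted while the third is not, where solution_12 is a hypothetical solution intended to pass only subtasks 1 and 2.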

Notes

Internally, tcframe uses ulimit to limit the time and memory used when running the solution. Unfortunately, there is no easy way to restrict memory usage on OS X, so the memory limit will always be ignored when using this feature on OS X.