= 5th ASPCOMP - Rules and Scoring =

The System competition aims at reproducing, as faithfully as possible, a 
setting in which performances of solving systems are evaluated on problem 
encodings (and instances) of different nature and complexity.


== Rules ==

The System competition is open to any general-purpose solving system able to
parse the ASP-Core-2 input format. The input language of choice for the 5th
ASP Competition is ASP-Core 2.01c (see File and Language Formats).

The competition is run over a selection of problems. Corresponding encodings
along with instance data are chosen by the Organizing Committee (see
Official Problem Suite). Participant systems will be run in a uniform
setting on each problem and instance thereof.

Problem-specific solving approaches, breaking the uniform setting of the
System competition, are not admitted. For instance, specific techniques must
not be based on predicate or variable names, whereas algorithm selection
based on, e.g., structural features is admissible.

Participants are required to describe their evaluation procedures and 
declare the set of benchmarks they support at submission time. The 
Organizing committee takes the right to analyze any submission and verify 
its congruence, possibly modifying the set of benchmarks supported by the 
system before its execution.


== Scoring ==

The scoring system adopted in each category aims at simplifying the ones
used in the 3rd and 4th ASP Competitions (which, in turn, extended previous
scoring systems). In particular, the scoring system balances the following
factors:
 - Problems shall be weighted equally, although the number of instances per
   problem may differ. Thus, scores are normalized in order to give equal 
   relevance to each problem.
 - If a system outputs an incorrect answer to some instance of a problem
   (see below for further details), this shall invalidate its score for the
   problem, even in case all other instances are correctly solved.
 - In case of optimization problems, scoring should mainly be based on
   solution quality.

In general, 100 points can be earned for each benchmark problem. The final
score of a solving system will hence consist of the sum of scores over all
problems.

Detailed information about scoring are provided in the following.


=== Scoring Details ===

For Search and Query problems, the score of a solver S on a problem P with N
instances is

    S(P) = N_S * 100 / N

where N_S is the number of instances solved within the time and memory
limits (see below for further details).

For Optimization problems, solvers are ranked by solution quality. If M is
the number of participant systems, the score of a solver S for an instance I
of a problem P with N instances is

    S(P,I) = M_S(I) * 100 / (M * N)

where M_S(I) is
 - 0 if S did neither provide a solution, nor report unsatisfiability, OR
 - the number of participant solvers that did not provide any strictly
   better solution than S, where a confirmed optimum solution is considered
   strictly better than an unconfirmed one, otherwise.

The score S(P) of a solver S for problem P is the sum of scores S(P,I) over
all N instances I of P. Note that, as with Search and Query problems, S(P)
can range from 0 to 100.


=== Global Ranking ===

The global ranking in each track, and overall, is obtained by awarding each
participant system the sum of its scores over all problems; systems are
ranked by their sums, in decreasing order. In case of equal sum of scores,
the sum of run-times, including maximum time for unsolved instances, is
taken into account as a tie-breaker in favor of the system whose run-time is
smaller.


== Detection of Incorrect Answers ==

Each benchmark domain P is equipped with a checker program C_P that takes
as input both an instance I and a corresponding witness solution A, and is
such that C_P(A,I) = "true" in case A is a valid witness for I w.r.t. P.
Suppose that a system S is faulty for an instance I of a problem P; then,
there are two possible ways to detect incorrect behavior, and subsequently
disqualify the system S:
 - S produces an answer A, and A is not a correct solution. This scenario is
   detected by checking the output of C_P(A,I).
 - S answers that the instance I is unsatisfiable, but I actually has some
   witness. In this case, it is checked whether a second system S' produced
   a solution A' for which C_P(A',I) is true.

Note that cases of general failure (e.g. "out of memory" errors or other
abrupt system failures) do not imply disqualification on a given benchmark.


=== Optimization problems ===

Concerning optimization problems, checkers produce also the cost of the
(last) witness. This latter value is considered when computing scores and
for assessing answers of systems.

If a system marks its witness as optimal for an instance of an optimization
problem, and no other system finds a better witness, this is pragmatically
assumed to be an optimal solution. In general, the cost of best witnesses
found is taken as the "imperfect optimum". If a system S marks its witness A
as optimal for an instance I of a problem P and the cost of A turns out to
be different from the imperfect optimum for I, then S is disqualified on P.


== Instance Selection process ==

For each domain, instances will be randomly selected from the sets available
from past competitions. They will be uniformly randomly chosen in order 
to ensure a reasonably fair selection over the whole domain.


== Systems making use of randomization ==

In order to guarantee reproducibility of competition outcomes, systems
making usage of randomization are required to run with an a-priori fixed
random seed.


== Dispute resolution ==

The Organizing Committee holds the right to enforce the fair respect of all
the above regulations, to settle possible disputes, and to inflict
appropriate penalties and/or disqualifications to any infringing
participant.


== Hardware, CPU and memory limits ==

The competition will run on a Debian Linux server (64bit kernel) featuring
Intel Xeon X5365 Processors with 4096KB of cache and 16GB of RAM. Time and
memory for each run of a solver on a problem instance are limited to 600
seconds of wall-clock time and 6GB RAM. Parallel solvers will be granted the
whole machine (8 processors), but memory is still limited to 6GB.