1 = 5th ASPCOMP - Rules and Scoring = 2 3 The System competition aims at reproducing, as faithfully as possible, a 4 setting in which performances of solving systems are evaluated on problem 5 encodings (and instances) of different nature and complexity. 6 7 8 == Rules == 9 10 The System competition is open to any general-purpose solving system able to 11 parse the ASP-Core-2 input format. The input language of choice for the 5th 12 ASP Competition is ASP-Core 2.01c (see File and Language Formats). 13 14 The competition is run over a selection of problems. Corresponding encodings 15 along with instance data are chosen by the Organizing Committee (see 16 Official Problem Suite). Participant systems will be run in a uniform 17 setting on each problem and instance thereof. 18 19 Problem-specific solving approaches, breaking the uniform setting of the 20 System competition, are not admitted. For instance, specific techniques must 21 not be based on predicate or variable names, whereas algorithm selection 22 based on, e.g., structural features is admissible. 23 24 Participants are required to describe their evaluation procedures and 25 declare the set of benchmarks they support at submission time. The 26 Organizing committee takes the right to analyze any submission and verify 27 its congruence, possibly modifying the set of benchmarks supported by the 28 system before its execution. 29 30 31 == Scoring == 32 33 The scoring system adopted in each category aims at simplifying the ones 34 used in the 3rd and 4th ASP Competitions (which, in turn, extended previous 35 scoring systems). In particular, the scoring system balances the following 36 factors: 37 - Problems shall be weighted equally, although the number of instances per 38 problem may differ. Thus, scores are normalized in order to give equal 39 relevance to each problem. 40 - If a system outputs an incorrect answer to some instance of a problem 41 (see below for further details), this shall invalidate its score for the 42 problem, even in case all other instances are correctly solved. 43 - In case of optimization problems, scoring should mainly be based on 44 solution quality. 45 46 In general, 100 points can be earned for each benchmark problem. The final 47 score of a solving system will hence consist of the sum of scores over all 48 problems. 49 50 Detailed information about scoring are provided in the following. 51 52 53 === Scoring Details === 54 55 For Search and Query problems, the score of a solver S on a problem P with N 56 instances is 57 58 S(P) = N_S * 100 / N 59 60 where N_S is the number of instances solved within the time and memory 61 limits (see below for further details). 62 63 For Optimization problems, solvers are ranked by solution quality. If M is 64 the number of participant systems, the score of a solver S for an instance I 65 of a problem P with N instances is 66 67 S(P,I) = M_S(I) * 100 / (M * N) 68 69 where M_S(I) is 70 - 0 if S did neither provide a solution, nor report unsatisfiability, OR 71 - the number of participant solvers that did not provide any strictly 72 better solution than S, where a confirmed optimum solution is considered 73 strictly better than an unconfirmed one, otherwise. 74 75 The score S(P) of a solver S for problem P is the sum of scores S(P,I) over 76 all N instances I of P. Note that, as with Search and Query problems, S(P) 77 can range from 0 to 100. 78 79 80 === Global Ranking === 81 82 The global ranking in each track, and overall, is obtained by awarding each 83 participant system the sum of its scores over all problems; systems are 84 ranked by their sums, in decreasing order. In case of equal sum of scores, 85 the sum of run-times, including maximum time for unsolved instances, is 86 taken into account as a tie-breaker in favor of the system whose run-time is 87 smaller. 88 89 90 == Detection of Incorrect Answers == 91 92 Each benchmark domain P is equipped with a checker program C_P that takes 93 as input both an instance I and a corresponding witness solution A, and is 94 such that C_P(A,I) = "true" in case A is a valid witness for I w.r.t. P. 95 Suppose that a system S is faulty for an instance I of a problem P; then, 96 there are two possible ways to detect incorrect behavior, and subsequently 97 disqualify the system S: 98 - S produces an answer A, and A is not a correct solution. This scenario is 99 detected by checking the output of C_P(A,I). 100 - S answers that the instance I is unsatisfiable, but I actually has some 101 witness. In this case, it is checked whether a second system S' produced 102 a solution A' for which C_P(A',I) is true. 103 104 Note that cases of general failure (e.g. "out of memory" errors or other 105 abrupt system failures) do not imply disqualification on a given benchmark. 106 107 108 === Optimization problems === 109 110 Concerning optimization problems, checkers produce also the cost of the 111 (last) witness. This latter value is considered when computing scores and 112 for assessing answers of systems. 113 114 If a system marks its witness as optimal for an instance of an optimization 115 problem, and no other system finds a better witness, this is pragmatically 116 assumed to be an optimal solution. In general, the cost of best witnesses 117 found is taken as the "imperfect optimum". If a system S marks its witness A 118 as optimal for an instance I of a problem P and the cost of A turns out to 119 be different from the imperfect optimum for I, then S is disqualified on P. 120 121 122 == Instance Selection process == 123 124 For each domain, instances will be randomly selected from the sets available 125 from past competitions. They will be uniformly randomly chosen in order 126 to ensure a reasonably fair selection over the whole domain. 127 128 129 == Systems making use of randomization == 130 131 In order to guarantee reproducibility of competition outcomes, systems 132 making usage of randomization are required to run with an a-priori fixed 133 random seed. 134 135 136 == Dispute resolution == 137 138 The Organizing Committee holds the right to enforce the fair respect of all 139 the above regulations, to settle possible disputes, and to inflict 140 appropriate penalties and/or disqualifications to any infringing 141 participant. 142 143 144 == Hardware, CPU and memory limits == 145 146 The competition will run on a Debian Linux server (64bit kernel) featuring 147 Intel Xeon X5365 Processors with 4096KB of cache and 16GB of RAM. Time and 148 memory for each run of a solver on a problem instance are limited to 600 149 seconds of wall-clock time and 6GB RAM. Parallel solvers will be granted the 150 whole machine (8 processors), but memory is still limited to 6GB.
You are not allowed to attach a file to this page.