In this tutorial, we give an
introduction to projecting, implementing and using Extended Builtins in DLV. The
tutorial does not give a full description of the usage and capabilities of
DLV
,
nor of details about builtins
themselves. For a more complete account of these, see the DLV
homepage, the DLV
online user manual, the DLV
online user tutorial
, and the DLV
External Builtins Manual. The
examples shown in this tutorial work with every recent DLV
release. Executables of the
DLV
system for various platforms as
well as needed files for building builtin libraries can be downloaded from the
DLV
homepage.
The external builtins that we want to describe
are functional constructs, which means that starting from an external builtin,
defined in the disjunctive logical program, an extern boolean function, written
in C++ code and called oracle, is invoked.
Obviously, if the user wants to
define a new builtin in a logical program she has to implement also the
correspondent oracle(s). The implementation is rather simple. In the following
we will supply detailed examples showing the oracles implementation and library
building.
The tutorial consists of the following
sections, each of them being built around a guiding example:
·
First
Example: Factorial computation
·
Second
Example: Managing strings
·
Third
Example: Crypt string
Suppose
we want the possibility to write the following rule in a dlv
program:
r1:
factorial
:- #fatt(5,120).
such
that factorial
is part of a model iff 120
is the factorial of 5.
Please
note as, in this case, the binary predicate has both terms bounded: this means
ii as corresponding pattern. For a detailed description of
what are patterns please refer to the
DLV
External Builtins
Manual.
Roughly, we
will have an i corresponding to each bound term and an o
corresponding to each unbound term. The pattern is the string obtained
with the sequence of characters.
The
actual pattern will tell which oracle
is invoked by the system when evaluating the rule. Remember that the designer
must provide an oracle per each possible pattern she wants to be exploited
writing DLV program.
Example
1.
Let’s start having a look at the C++ code describing the oracle corresponding to
the ii pattern for the #fatt
builtin:
1.
BUILTIN(fatt,ii)
{
2.
if(
argv[0].isInt() && argv[1].isInt()
)
{
// Convert the CONSTANT
data to what we do have to compute
3. int x=
argv[0].toInt();
4. int y= argv[1].toInt();
// Just perform the
computation...
5. int
fatt=1;
6. for(int i = 1; i <= x;
i++)
7.
fatt=fatt*i;
8. if( fatt ==
y)
9.
return true;
}
10.
return false;
}
Where
BUILTIN
is a macro defined in extpred.h file,
as we
will see later on. This function receives two arguments, fatt and
ii. The first argument of BUILTIN
is
the external predicate name, but without the “#” symbol, while the second
argument is the pattern
itself.
The oracle body’s is constituted of normal instructions written in C++
language.
Note:
of course, the string length representing the pattern is equal to the arity of
the predicate.
Line
2 performs a type check on the arguments; even if it is not mandatory, it is a
very good idea to perform it everytime, at the very beginning of each oracle
implementation.
Lines
3 and 4 deserve
a more detailed explanation, since particular attention must be paid to
the argv[0].toInt() and argv[1].toInt()
calls.
First
of all, argv
is an array of CONSTANT
objects, in a positional order, such that the first element stores the value of
first term in the current builtin atom; the second element stores the value of
the second term, and so on. In this example, when the oracle is invoked during
the evaluation of rule r1,
argv[0]
contains the value 5
and argv[1] contains 120.
We
can think about CONSTANT
class as an
interface to DLV, which means manipulate TERM
types without knowing inner DLV management.
The
method toInt(),
for
instance, converts an internal DLV number
to a standard C++ integer.
After
the assignment, of course, x
and y
contain 5
and 120,
respectively.
Lines 5 to 9 simple compute the factorial of x and return true if it is equal to y; line 10 causes returning false, in case x is not equal to y, or a type mismatch has occurred (see line 2).
Intuitively, the result of the oracle invocation affects the
model computation performed by DLV. In our case factorial is derived, since the body of the rule becomes true
(containing only the builtin atom); if we had written the rule:
r2: factorial:- #fatt(5,140).
factorial would have not been derived, because #fatt(5,140) is false.
Example 2. Suppose we have defined the following rules:
r1: number(5).
r2: factorial(X,Y):- number(X), #fatt(X,Y).
During the instantiation of rule r2 the term X of predicate number assumes value 5, because number(5) is a fact. So we have factorial(5,Y)and it is easy to see that the actual pattern is io. The associated oracle will be invoked in order to bind the
term Y.
Please note as the user must have defined the oracle
fatt_io(), otherwise the rule would be unsafe.
A possible implementation for the oracle could be the
following.
1. BUILTIN(fatt,io)
{
2. if( argv[0].isInt() )
{
// Convert the CONSTANT data to what we do have to compute
3.
int x = argv[0].toInt();
// Just perform the computation...
4.
int fatt = 1;
5.
for(int i = 1; i <= x; i++)
6.
fatt=fatt*i;
// Convert the result into the proper CONSTANT data
// Save the result onto the output vector
7.
argv[1] = CONSTANT( fatt );
8.
return true;
}
9. return false;
}
The first argument of the BUILTIN macro is fatt, since #fatt is the name of the builtin being defined, and the second is
io since the oracle is associated to this pattern.
Line 2 performs a type check, while, as it should be almost
familiar by now. Please notice that the type check is here performed only with
respect to the input variable; the output one will be properly set within the
oracle; at the moment results as a Null constant. Line 3 defines an integer
variable initialized through a cast from a DLV number. Lines 4 to 6 compute the value of the factorial, as
in the previous example; but in this case the second variable will contain an
“exit” value, to be “returned”. Thus, the computed value of is stored in
argv[1]; this will cause, once the oracle execution is completed,
Y to be bound to the value we are storing now. Please note,
again, as we do have to convert the value into the proper DLV type, taking advantage from a constructor of the
CONSTANT class (line 7).
Example 3. Given the program
r1:
number(120).
r2:
factorial(X,Y):- number(Y), #fatt(X,Y).
Here Y is bound while X is not: the actual pattern is oi. If a proper
oracle is defined, it could evaluate the inverse of the factorial Y, if it really is the factorial of a number. Let’s see a
possible implementation of such an oracle (it is worth remembering that builtins
have to be functional, so for the sake of correctness we will suppose that the
oracle will be invoked only for factorials greater or equal to 2 -- 1 may be the factorial of both 1 and 0).
1. BUILTIN(fatt,oi)
// NOTE: The oracle has sense only for x>=2.
{
2.
if( && argv[1].isInt() && argv[1].toInt() >= 2 )
{
// Convert the CONSTANT data to what we do have to
compute
3.
int x = argv[1].toInt();
// Just perform the computation...
4.
int fatt = 1;
5.
for(int i = 1; fatt <= x; i++)
{
6.
fatt = fatt*i;
7.
if( fatt == x )
{
8.
// Convert the result into
// the proper CONSTANT data.
9.
// Save the result onto the output vector
10.
argv[0] = CONSTANT(i);
11.
return true;
}
}
}
12.
return false;
}
Lines 1 to 3 behave as in the previous examples. Lines 4 to 7 just compute the factorial in a bottom-up way; if
fatt, raising, matches the value of argv[1], then i is the searched value, and lines 10 to 11 store it on argv[0] causing, in the DLV program, X to be bound when true is returned; otherwise, false is the result.
We have just seen three out of the four oracles referring to
the four possible patterns (ii, io, oi). Of
course, the
user may decide not to define all possible oracles.
Let’s see now how to build the library containing the
oracles.
Oracles must be written in C++ language and stored in a file
having “.C” (uppercase) extension. Oracles referring to the same builtin may be
even stored in different files; moreover, each file may contain oracles
associated to different builtins. However, for the sake of style and clarity,
maybe all oracles defining a builtin should stay together.
Within the source file(s), all the functions defining
oracles must be enclosed in the directive:
#ifdef __cplusplus
extern "C" {
#endif
…………………
#ifdef __cplusplus
}
#endif
This allows to compile the functions with C++ code rather
than standard C code. For further information we suggest to have a look at
http://www.isotton.com/howtos/C++-dlopen-mini-HOWTO.html.
Moreover, the user must necessarily include the extpred.h file, which contains the definition of the BUILTIN macro, the interface of the CONSTANT class and other stuffs useful for the compilation.
In the end, the source file written by the user should apper as
following.
//////////////////////////////////////////////////////////////////////////////
//Shared Library for the external builtin #fatt
//fatt_builtin.C
#include "extpred.h"
#ifdef __cplusplus
extern "C" {
#endif
BUILTIN(fatt,ii)
{
if( argv[0].isInt() && argv[1].isInt() )
{
// Convert the CONSTANT data
to what we do have to compute
int x= argv[0].toInt();
int y= argv[1].toInt();
//
Just perform the computation...
int fatt=1;
for(int i = 1; i <= x; i++)
fatt=fatt*i;
if( fatt == y)
return true;
}
return false;
}
BUILTIN(fatt,io)
{
if( argv[0].isInt() )
{
// Convert the CONSTANT data to what we do have to compute
int x = argv[0].toInt();
// Just perform the computation...
int fatt = 1;
for(int i = 1; i <= x; i++)
fatt=fatt*i;
// Convert the result into the proper CONSTANT data
// Save the result onto the output vector
argv[1] = CONSTANT( fatt );
return true;
}
return false;
}
BUILTIN(fatt,oi)
// NOTE: The oracle has sense only for x>=2.
{
if( argv[1].isInt() && argv[1].toInt() >= 2 )
{
// Convert the CONSTANT data to what we do have to
compute
int x = argv[1].toInt();
// Just perform the computation...
int fatt = 1;
for(int i = 1; fatt <= x; i++)
{
fatt = fatt*i;
if( fatt == x )
{
// Convert the result into
// the proper CONSTANT data.
// Save the result onto the output vector
argv[0] = CONSTANT(i);
return true;
}
}
}
return false;
}
#ifdef __cplusplus
}
#endif
Of course, the order of the oracles in the file is not
influential. The only constraint is that the user MUST define the “base” oracle:
the one corresponding to the pattern having only i’s. (ii, in this case; ii...i, in general).
The library is built from the source file. In order to help
the user, all needed commands can be performed through a script called dynamic, which we provide. In our example, let’s imagine to have
stored the three oracles we’ve written in a single file, say it fatt_builtin.C. Therefore, it is enough to launch the following
command:
$ ./dynamic fatt_builtin
omitting the extension “.C”. As a result, if no errors
arise, in the current directory, will appear the library file fatt_builtin.so. Please note it is a shared, dynamically loaded library. For
further information on this topic you can have a look at http://www.tldp.org/HOWTO/Program-Library-HOWTO/introduction.html.
In order to use the libary, and therefore, in order to
invoke the
proper oracle, the user must include the library within the DLV program.
We can exploit all the examples in a single DLV program.
Once the library fatt_builtin.so is defined we can write:.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% A DLV program with external builtins
#include <fatt_builtin>
factorial:- #fatt(5,120).
number_1(5).
factorial_1(X,Y):- number_1(X), #fatt(X,Y).
number_2(120).
factorial_2(X,Y):- number_2(Y), #fatt(X,Y).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The inclusion directive, in the DLV program, assumes that
the fatt_builtin.so library is stored in the default directory, that is
LIB. Moreover, we suppose that the LIB directory is created in the working directory.
NOTE:
CONSTANT is a
public class defined in extpred.h and, as
already said, it provides what is needed in order to manage
parameters for the C++ defined oracles and manage base-types from/to dlv/C++.
The methods provided are the following:
First method returns an integer of according
value, while the second and third methods are used to
cast from strings (encoded by dlv) to standard C++ string
(i.e. char*) types.
Below a simple example try to clarify the difference between
a DLV string and a DLV symbolic string.
For a fully detailed description of the CONSTANT
class and the methods it provides please refer to the
BUILTINS USER MANUAL.
Second Example: Managing
strings
In this example we will define a library providing builtins
for evaluating string length or extract substrings. We will define #string_length,
binary, having two oracles: one, referring to the ii pattern, verifies whether the second term is the length of
the string appearing as first term, and the other, referring to the io pattern, bounds
the second term to the length of the string appearing as first term. Then we
will define #substring, having arity 4, with two oracles: iiii which verifies
the consistency of bounded terms, and iiio which given a string, an index inside it, and a
length, generates a proper substring.
In this example we manage only quoted string without taking
care of symbols; of course this is not mandatory. The file “managing_string.C” follows.
//////////////////////////////////////////////////////////////////////////////
//managing_string.C
//Dynamic Library for the external builtin
#string_length
//#string_length(char* str, int lenght)
// length contains the number of characters of string
str
// str is a char*
#include "extpred.h"
#include "string.h"
#ifdef __cplusplus
extern "C" {
#endif
BUILTIN(string_length,ii)
{
if( argv[0].isString()&& argv[1].isInt()
)
{
// Convert the CONSTANT
data to what we do have to compute
// Convert argv[0] in an
object of string class
string
str(argv[0].toString());
int length= argv[1].toInt();
// Verify...
int l=str.size();
if( length == l)
return true;
else
return false;
}
else
return false;
}
BUILTIN(string_length,io)
{
if( argv[0].isString() )
{
// Convert the CONSTANT data to what we
do have to compute
string
str(argv[0].toString());
//Perform...
int length = str.size() ;
// Save the result onto the output
vector
argv[1]=CONSTANT(length);
return true;
}
else
return false;
}
//#substring(string str,int pos,int len,string substr)
//substr is the substring of string str starting at position
//pos and of length len. The position is counted from
zero.
BUILTIN(substring,iiii)
{
if( argv[0].isString() && argv[1].isInt()
&& argv[2].isInt() && argv[3].isString() )
{
// Convert the CONSTANT data to what we
do have to compute
string str
(argv[0].toString());
int position =
argv[1].toInt();
int length = argv[2].toInt();
string
sub_string (argv[3].toString());
int length_str =
str.length();
//catch ungestible situation and return
false
if( length <= 0 || length >
length_str || position > length_str)
return false;
else{
// Verify...
string ss(str.substr(position,length));
if(ss ==
sub_string)
return true;
else
return false;
}
}
return false;
}
BUILTIN(substring,iiio)
{
if( argv[0].isString() && argv[1].isInt() &&
argv[2].isInt() )
{
// Convert the CONSTANT
data to what we do have to compute
string
str
=
string( argv[0].toString());
int position =
argv[1].toInt();
int length =
argv[2].toInt();
int length_str =
str.length();
if( length <= 0
|| length >
length_str
|| position > length_str - length)
return false;
else{
//Perform...only if
checked
string sub_string = str.substr(position,length);
// Convert the result into
the proper CONSTANT data
// Save the result onto
the output vector
argv[3]=CONSTANT(sub_string.c_str(),true);
int l =
sub_string.length();
if( l == length)
return true;
else
return false;
}
}
return false;
}
#ifdef
__cplusplus
}
#endif
//////////////////////////////////////////////////////////////////////////////
Let’s suppose that the file is in the default folder LIB. In order to compile and obtain the shared library let’s perform
$~/LIB> ./dynamic string_length
and in the LIB folder together with some object files will appear the file
string_length.so.
Let’s see now a possibile DLV program exploiting the library. We put i tinto
the file strings.dl:, which follows.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% strings.dl
%%
#include < managing_string >
%% Given a string of length N, the suffix at
position i is the
substring
%% obtained from the original string dropping the first N characters.
%% In this example we have:
%% suffix("californication",0)
%% suffix("alifornication",1)
%% suffix("lifornication",2)
%% suffix("ifornication",3)
%% suffix("fornication",4)
%% suffix("ornication",5)
%% suffix("rnication",6)
%% suffix("nication",7)
%% suffix("ication",8)
%% suffix("cation",9)
%% suffix("ation",10)
%% suffix("tion",11)
%% suffix("ion",12)
%% suffix("on",13)
%% suffix("n",14)
#maxint=15.
string("californication").
suffix1(N):- string(Str), #string_length(Str,N).
suffix(Suff,Position):- string(Str),
#string_length(Str,N),
#int(Position),
Position >= 0, Position < N,
N = Position + Lsuffix,
#substring(Str,Position,Lsuffix,Suff).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Invokind DLV with, at the command prompt while in the current folder,
$~/> ./dl strings.dl –filter=suffix
the (unique)
answer set returned is:
{suffix("californication",0), suffix("alifornication",1),
suffix("lifornication",2), suffix("ifornication",3), suffix("fornication",4),
suffix("ornication",5), suffix("rnication",6), suffix("nication",7),
suffix("ication",8), suffix("cation",9), suffix("ation",10), suffix("tion",11),
suffix("ion",12), suffix("on",13), suffix("n",14)}
Third Example: Crypt
string
In the third example we will define a quite simple library
in order to crypt/decrypt strings. The idea is that in
#crypt(string1, string2, key)
String2
is the result of string1
ciphered on the base of key key. Our very simple algorithm ciphers the string adding to
each character a proper offset: to the first character will be added an offset
equal to key, to the second an offset equal to key+1, then key+2
to the third, and so on. For instance, if string1
is bounded to
unical
then string2 will will be ysojiu. Intuitively, the inverse is easily done in order to
decrypt. The builtin manipulates only symbolic constants containing characters
appearing into the ASCII table between a and z (i.e., 97 and 122), decimal notation. Of course oracles may
be written in order to manage quoted string; in this cas attention must be paid
to quotes.
The crypt.so
library of this example contains the three oracles:
The base oracle (iii) simply checks whether the two strings are coupled
with respect to the key. The second one (ioi) binds the second term to the the result of
ciphering the first term with respect to the key; the third (oii) binds the
first term to the deciphered string which binds the second term, again with
respect to the key which binds the third term. The source file for the
library follows.
//////////////////////////////////////////////////////////////////////////////
// crypt.C
// Dynamic Library for the external builtin #crypt
// #crypt(char*, char*, int)
#include "extpred.h"
#include <stdio.h>
#ifdef __cplusplus
extern "C" {
#endif
BUILTIN(crypt,iii)
{
if( argv[0].isSymbol() &&
argv[1].isSymbol() && argv[2].isInt())
{
// Convert
the CONSTANT data to what we do have to compute
// Convert
argv[0] in an object of string class
const char
*name = argv[0].toSymbol();
int l =
strlen( name );
// Convert
argv[2] in an object of string class
int key =
argv[2].toInt();
// The
crypted string to be valuted
char
*crypt_name= new char[ l+1 ];
int
i=0;
for( ; i
< l; i++)
crypt_name[i] = 97 + ( name[i] - 97 + key + i ) % 26 ;
crypt_name[i] =
'\0';
if(
strcmp(crypt_name, argv[1].toSymbol())==0 )
return
true;
}
return false;
}
BUILTIN(crypt,ioi)
{
if( argv[0].isSymbol() &&
argv[2].isInt())
{
const char *name =
argv[0].toSymbol();
int l = strlen( name );
int key = argv[2].toInt();
char *crypt_name=
new char[ l+1 ];
int i=0;
for( ; i < l; i++
)
crypt_name[i] =
97 + ( name[i] - 97 + key + i ) % 26 ;
crypt_name[i] =
'\0';
argv[1] = CONSTANT(
crypt_name );
return true;
}
return false;
}
BUILTIN(crypt, oii)
{
if( argv[1].isSymbol() &&
argv[2].isInt() )
{
const char *name =
argv[1].toSymbol();
int l = strlen( name );
char *decrypt_name= new char[ l+1 ];
int key =
argv[2].toInt();
int i=0;
for( ; i < l;
i++)
decrypt_name[i] = 97 + ( 26 + name[i] - 97 - key - i ) % 26;
decrypt_name[i] =
'\0';
argv[0] = CONSTANT(
decrypt_name );
return true;
}
return false;
}
#ifdef __cplusplus
}
#endif
//////////////////////////////////////////////////////////////////////////////
Let’s have a quick look at the code. Please note that, as
always suggested, each oracle starts with a type check on elements of
argv.
The first oracle (iii, the base one) simpy verifies the correctness of the tuple constituted by the bound terms. It builds the enciphered string starting from the first and the third terms (argv[2] and argv[2]) and checks whether it corresponds to the second term (argv[1]).
The second oracle (ioi) act as the firts, but the enciphered string instead
of being used for checking is put on argv[1] in order to bind the
corresponding term once the control returns to DLV.
The last oracle (oii) just performs the deciphering and put the result on
argv[0].
Now suppose that the file crypt.C is into the default LIB folder. Of course, you can freely change folder, but remember to have the files extpred.h and extpred.o also.
In order to
build the library crypt.so let’s perform, while in
LIB folder:
$ ./dynamic crypt
The LIB folder will actually contain also the file crypt.so.0. This file is needed to generate the donamic library, so dont’t dolete it.
The following is a sample DLV program which exploits the #crypt builtin.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
#include < crypt >
cocktail(mojito).
cocktail(mai_tai).
cocktail(daiquiri).
cocktail(long_island_iced_tea).
crypt(Y) :- cocktail(X), #crypt(X,Y,4).
is_crypt :- #crypt(unical,ysojiu,4).
decrypt(X) :- #crypt( X , ysojiu,4).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Here the extension of crypt(Y) contains the ciphered strings obtained from cocktail(X), is_crypt is derived if ysojiu is the ciphered result of unical with key 4, and decrypt(X) deciphers ysojiu. Running DLV the model obtained is
{crypt(qtppbx), crypt(qfofbjs), crypt(hfoxcrbt),
crypt(pttngrcwmarnytwwsoax), is_crypt, decryp(unical)}.
dlv-ex@mat.unical.it
Last modified 2004-oct-28