THE DLV EXTERNAL BUILTINS TUTORIAL (pre-release, evolving - please check home page for latest news and releases)

In this tutorial, we give an introduction to projecting, implementing and using Extended Builtins in DLV. The tutorial does not give a full description of the usage and capabilities of DLV, nor of details about builtins themselves. For a more complete account of these, see the DLV homepage, the DLV online user manual, the DLV online user tutorial, and the DLV External Builtins Manual. The examples shown in this tutorial work with every recent DLV release. Executables of the DLV system for various platforms as well as needed files for building builtin libraries can be downloaded from the DLV homepage.

Introduction

The external builtins that we want to describe are functional constructs, which means that starting from an external builtin, defined in the disjunctive logical program, an extern boolean function, written in C++ code and called oracle, is invoked.

Obviously, if the user wants to define a new builtin in a logical program she has to implement also the correspondent oracle(s). The implementation is rather simple. In the following we will supply detailed examples showing the oracles implementation and library building.

The tutorial consists of the following sections, each of them being built around a guiding example:

·                    First Example: Factorial computation

·                    Second Example: Managing strings

·                    Third Example: Crypt string

First Example: Factorial Computation

Suppose we want the possibility to write the following rule in a dlv program:

            r1:        factorial :- #fatt(5,120).

such that factorial is part of a model iff 120 is the factorial of 5.

Please note as, in this case, the binary predicate has both terms bounded: this means ii as corresponding pattern. For a detailed description of what are patterns please refer to the DLV External Builtins Manual. Roughly, we will have an i corresponding to each bound term and an o corresponding to each unbound term. The pattern is the string obtained with the sequence of characters.

The actual pattern will tell which oracle is invoked by the system when evaluating the rule. Remember that the designer must provide an oracle per each possible pattern she wants to be exploited writing DLV program.

Example 1. Let’s start having a look at the C++ code describing the oracle corresponding to the ii pattern for the #fatt builtin:

1.    BUILTIN(fatt,ii)

      {

2.          if( argv[0].isInt() && argv[1].isInt() )

            {

            // Convert the CONSTANT data to what we do have to compute

3.          int x= argv[0].toInt();

4.          int y= argv[1].toInt();   

            // Just perform the computation...

5.          int fatt=1;

6.          for(int i = 1; i <= x; i++)

7.                fatt=fatt*i;

8.          if( fatt == y)

9.                return true;

}

10.         return false;

      }

Where BUILTIN is a macro defined in extpred.h file, as we will see later on. This function receives two arguments, fatt and ii. The first argument of BUILTIN is the external predicate name, but without the “#” symbol, while the second argument is the pattern itself. The oracle body’s is constituted of  normal instructions written in C++ language.

Note: of course, the string length representing the pattern is equal to the arity of the predicate. 

Line 2 performs a type check on the arguments; even if it is not mandatory, it is a very good idea to perform it everytime, at the very beginning of each oracle implementation.

Lines 3 and 4 deserve a more detailed explanation, since particular attention must be paid to the argv[0].toInt() and argv[1].toInt() calls.

First of all, argv is an array of CONSTANT objects, in a positional order, such that the first element stores the value of first term in the current builtin atom; the second element stores the value of the second term, and so on. In this example, when the oracle is invoked during the evaluation of rule r1, argv[0] contains the value 5 and argv[1]  contains 120.

We can think about CONSTANT class as an interface to DLV, which means manipulate TERM types without knowing inner DLV management.

The method toInt(), for instance, converts an internal DLV number to a standard C++ integer.

After the assignment, of course, x and y contain 5 and 120, respectively.

Lines 5 to 9 simple compute the factorial of x and return true if it is equal to y; line 10 causes returning false, in case x is not equal to y, or a type mismatch has occurred (see line 2).

Intuitively, the result of the oracle invocation affects the model computation performed by DLV. In our case factorial is derived, since the body of the rule becomes true (containing only the builtin atom); if we had written the rule:   

            r2:        factorial:- #fatt(5,140).

factorial would have not been derived, because #fatt(5,140) is false.

Example 2. Suppose we have defined the following rules:

            r1:        number(5).

            r2:        factorial(X,Y):- number(X), #fatt(X,Y).

During the instantiation of rule r2 the term X of predicate number assumes value 5, because number(5) is a fact. So we have factorial(5,Y)and it is easy to see that the actual pattern is io. The associated oracle will be invoked in order to bind the term Y.

Please note as the user must have defined the oracle fatt_io(), otherwise the rule would be unsafe.

A possible implementation for the oracle could be the following.

1.    BUILTIN(fatt,io)

      {

2.    if( argv[0].isInt() )

            {

            // Convert the CONSTANT data to what we do have to compute

3.          int x = argv[0].toInt();

// Just perform the computation...

4.          int fatt = 1;

5.          for(int i = 1; i <= x; i++)

6.                fatt=fatt*i;

// Convert the result into the proper CONSTANT data

// Save the result onto the output vector

7.          argv[1] = CONSTANT( fatt );

8.          return true;

            }

9.    return false;

}

The first argument of the BUILTIN macro is fatt, since #fatt is the name of the builtin being defined, and the second is io since the oracle is associated to this pattern.

Line 2 performs a type check, while, as it should be almost familiar by now. Please notice that the type check is here performed only with respect to the input variable; the output one will be properly set within the oracle; at the moment results as a Null constant. Line 3 defines an integer variable initialized through a cast from a DLV number. Lines 4 to 6 compute the value of the factorial, as in the previous example; but in this case the second variable will contain an “exit” value, to be “returned”. Thus, the computed value of is stored in argv[1]; this will cause, once the oracle execution is  completed, Y to be bound to the value we are storing now. Please note, again, as we do have to convert the value into the proper DLV type, taking advantage from a constructor of the CONSTANT class (line 7).

Example 3. Given the program

           

           r1:         number(120).

           r2:         factorial(X,Y):- number(Y), #fatt(X,Y).

Here Y is bound while X is not: the actual pattern is oi. If a proper oracle is defined, it could evaluate the inverse of the factorial Y, if it really is the factorial of a number. Let’s see a possible implementation of such an oracle (it is worth remembering that builtins have to be functional, so for the sake of correctness we will suppose that the oracle will be invoked only for factorials greater or equal to 2 -- 1 may be the factorial of both 1 and 0).

1.    BUILTIN(fatt,oi)

// NOTE: The oracle has sense only for x>=2.

      {

2.          if( && argv[1].isInt() && argv[1].toInt() >= 2 )

                  {

// Convert the CONSTANT data to what we do have to compute

3.                int x = argv[1].toInt();

// Just perform the computation...

4.                int fatt = 1;

5.                for(int i = 1; fatt <= x; i++)

                        {

6.                      fatt = fatt*i;

7.                      if( fatt == x )

                              {

8.                           // Convert the result into

// the proper CONSTANT data.

9.                           // Save the result onto the output vector

10.                          argv[0] = CONSTANT(i);

11.                          return true;

                             }

                        }

                  } 

12.               return false;

            }

Lines 1 to 3 behave as in the previous examples. Lines 4 to 7 just compute the factorial in a bottom-up way; if fatt, raising, matches the value of argv[1], then i is the searched value, and lines 10 to 11 store it on argv[0] causing, in the DLV program, X to be bound when true is returned; otherwise, false is the result.

We have just seen three out of the four oracles referring to the four possible patterns (ii, io, oi). Of course, the user may decide not to define all possible oracles.

Let’s see now how to build the library containing the oracles.

Oracles must be written in C++ language and stored in a file having “.C” (uppercase) extension. Oracles referring to the same builtin may be even stored in different files; moreover, each file may contain oracles associated to different builtins. However, for the sake of style and clarity, maybe all oracles defining a builtin should stay together.

Within the source file(s), all the functions defining oracles must be enclosed in the directive:

#ifdef __cplusplus

extern "C" {

#endif

…………………

#ifdef __cplusplus

}

#endif

This allows to compile the functions with C++ code rather than standard C code. For further information we suggest to have a look at http://www.isotton.com/howtos/C++-dlopen-mini-HOWTO.html.

Moreover, the user must necessarily include the extpred.h file, which contains the definition of the BUILTIN macro, the interface of the CONSTANT class and other stuffs useful for the compilation.

In the end, the source file  written by the user should apper as following.

//////////////////////////////////////////////////////////////////////////////

//Shared Library for the external builtin #fatt

//fatt_builtin.C

#include "extpred.h"

#ifdef __cplusplus

extern "C" {

#endif

BUILTIN(fatt,ii)

{

if( argv[0].isInt() && argv[1].isInt() )

            {

            // Convert the CONSTANT data to what we do have to compute

int x= argv[0].toInt();

int y= argv[1].toInt();   

            // Just perform the computation...

int fatt=1;

for(int i = 1; i <= x; i++)

fatt=fatt*i;

if( fatt == y)

return true;

}

      return false;

}

BUILTIN(fatt,io)

{

if( argv[0].isInt() )

            {

            // Convert the CONSTANT data to what we do have to compute

int x = argv[0].toInt();

// Just perform the computation...

int fatt = 1;

for(int i = 1; i <= x; i++)

            fatt=fatt*i;

// Convert the result into the proper CONSTANT data

// Save the result onto the output vector

            argv[1] = CONSTANT( fatt );

            return true;

            }

return false;

}

BUILTIN(fatt,oi)

// NOTE: The oracle has sense only for x>=2.

{

if( argv[1].isInt() && argv[1].toInt() >= 2 )

            {

// Convert the CONSTANT data to what we do have to compute

int x = argv[1].toInt();

// Just perform the computation...

            int fatt = 1;

for(int i = 1; fatt <= x; i++)

                  {

fatt = fatt*i;

if( fatt == x )

                        {

                        // Convert the result into

// the proper CONSTANT data.

// Save the result onto the output vector

argv[0] = CONSTANT(i);

return true;

                        }

                  }

            } 

return false;

}

#ifdef __cplusplus

}

#endif

Of course, the order of the oracles in the file is not influential. The only constraint is that the user MUST define the “base” oracle: the one corresponding to the pattern having only i’s. (ii, in this case; ii...i, in general).

The library is built from the source file. In order to help the user, all needed commands can be performed through a script called dynamic, which we provide. In our example, let’s imagine to have stored the three oracles we’ve written in a single file, say it fatt_builtin.C. Therefore, it is enough to launch the following command:

$ ./dynamic fatt_builtin

omitting the extension “.C”. As a result, if no errors arise, in the current directory, will appear the library file fatt_builtin.so. Please note it is a shared, dynamically loaded library. For further information on this topic you can have a look at http://www.tldp.org/HOWTO/Program-Library-HOWTO/introduction.html.

In order to use the libary, and therefore, in order to invoke  the proper oracle, the user must include the library within the DLV program.

We can exploit all the examples in a single DLV program. Once the library fatt_builtin.so is defined we can write:.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% A DLV program with external builtins

#include <fatt_builtin>

factorial:- #fatt(5,120).

number_1(5).

factorial_1(X,Y):- number_1(X), #fatt(X,Y).

number_2(120).

factorial_2(X,Y):- number_2(Y), #fatt(X,Y).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The inclusion directive, in the DLV program, assumes that the fatt_builtin.so library is stored in the default directory, that is LIB. Moreover, we suppose that the LIB directory is created in the working directory.

However, it is possible to specify a different path from that of default. Please, see the EXTERNAL BUILTINS USER MANUAL for more information.

NOTE: CONSTANT is a public class defined in extpred.h and, as already said, it provides what is needed in order to manage parameters for the C++ defined oracles and manage base-types from/to dlv/C++. The methods provided are the following:

First method returns an integer of according value, while the second and third methods are used to cast from strings (encoded by dlv) to standard C++ string (i.e. char*) types.

Below a simple example try to clarify the difference between a DLV string and a DLV symbolic string. 

For a fully detailed description of the CONSTANT class and the methods it provides please refer to the BUILTINS USER MANUAL.

Second Example: Managing strings

In this example we will define a library providing builtins for evaluating string length or extract substrings. We will define #string_length, binary, having two oracles: one, referring to the ii pattern, verifies whether the second term is the length of the string appearing as first term, and the other, referring to the io pattern, bounds the second term to the length of the string appearing as first term. Then we will define #substring, having arity 4, with two oracles: iiii which verifies the consistency of bounded terms, and iiio which given a string, an index inside it, and a length, generates a proper substring.

In this example we manage only quoted string without taking care of symbols; of course this is not mandatory. The file managing_string.C” follows.

//////////////////////////////////////////////////////////////////////////////

//managing_string.C

//Dynamic Library for the external builtin #string_length

//#string_length(char* str, int lenght)

// length contains the number of characters of string str

// str is a char*

#include "extpred.h"

#include "string.h"

#ifdef __cplusplus

extern "C" {

#endif

BUILTIN(string_length,ii)

    {

    if( argv[0].isString()&& argv[1].isInt() )

          {

            // Convert the CONSTANT data to what we do have to compute

            // Convert argv[0] in an object of string class

        string str(argv[0].toString());

        int length= argv[1].toInt();

        // Verify...

        int l=str.size();

        if( length == l)

            return true;

        else

            return false;

        }

    else

        return false;

    }

BUILTIN(string_length,io)

    {   

    if( argv[0].isString() )

        {

        // Convert the CONSTANT data to what we do have to compute

        string str(argv[0].toString());   

        //Perform...

        int length = str.size() ;

        // Save the result onto the output vector

        argv[1]=CONSTANT(length);

   

        return true;

        }

    else

        return false;

    }

//#substring(string str,int pos,int len,string substr)

//substr is the substring of string str starting at position

//pos and of length len. The position is counted from zero.

BUILTIN(substring,iiii)

    {

    if( argv[0].isString() && argv[1].isInt() && argv[2].isInt() && argv[3].isString() )

        {

        // Convert the CONSTANT data to what we do have to compute

        string str (argv[0].toString());

        int position = argv[1].toInt();

        int length = argv[2].toInt();

        string sub_string (argv[3].toString());

        int length_str = str.length();

        //catch ungestible situation and return false

        if( length <= 0 || length > length_str || position > length_str)

            return false;

            else{

            // Verify...

            string ss(str.substr(position,length));

            if(ss == sub_string)

                return true;

            else

                return false;

            }

        }

    return false;

    }

BUILTIN(substring,iiio)

    {

      if( argv[0].isString() && argv[1].isInt() && argv[2].isInt() )

        {

            // Convert the CONSTANT data to what we do have to compute

            string  str   =   string( argv[0].toString());

            int position  =   argv[1].toInt();

            int length    =   argv[2].toInt();   

            int length_str = str.length();

            if( length <= 0   || length > length_str

|| position > length_str - length)

                  return false;

            else{

                  //Perform...only if checked

                  string sub_string = str.substr(position,length);   

                  // Convert the result into the proper CONSTANT data

                  // Save the result onto the output vector

                  argv[3]=CONSTANT(sub_string.c_str(),true);

                  int l = sub_string.length();

                  if( l == length)

                        return true;

                  else

                        return false;

                  }

            }

      return false;

      }

#ifdef __cplusplus

}

#endif

//////////////////////////////////////////////////////////////////////////////

Let’s suppose that the file is in the default folder LIB. In order to compile and obtain the shared library let’s perform

$~/LIB> ./dynamic string_length

and in the LIB folder together with some object files will appear the file string_length.so.

Let’s see now a possibile DLV program exploiting the library. We put i tinto the file strings.dl:, which follows.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %% strings.dl

%%

#include < managing_string >

%% Given a string of length N, the suffix at position i is the substring

%% obtained from the original string dropping the first N characters.

%% In this example we have:

%%    suffix("californication",0)

%%    suffix("alifornication",1)

%%    suffix("lifornication",2)

%%    suffix("ifornication",3)

%%    suffix("fornication",4)

%%    suffix("ornication",5)

%%    suffix("rnication",6)

%%    suffix("nication",7)

%%    suffix("ication",8)

%%    suffix("cation",9)

%%    suffix("ation",10)

%%    suffix("tion",11)

%%    suffix("ion",12)

%%    suffix("on",13)

%%    suffix("n",14)

#maxint=15.

string("californication").

suffix1(N):- string(Str), #string_length(Str,N).

suffix(Suff,Position):- string(Str), #string_length(Str,N),

                        #int(Position), Position >= 0, Position < N,

                        N = Position + Lsuffix,

                        #substring(Str,Position,Lsuffix,Suff).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Invokind DLV with, at the command prompt while in the current folder,

$~/> ./dl strings.dl –filter=suffix

the (unique) answer set returned is:

{suffix("californication",0), suffix("alifornication",1), suffix("lifornication",2), suffix("ifornication",3), suffix("fornication",4), suffix("ornication",5), suffix("rnication",6), suffix("nication",7), suffix("ication",8), suffix("cation",9), suffix("ation",10), suffix("tion",11), suffix("ion",12), suffix("on",13), suffix("n",14)}

Third Example: Crypt string

In the third example we will define a quite simple library in order to crypt/decrypt strings. The idea is that in

#crypt(string1, string2, key)

String2 is the result of string1 ciphered on the base of key key. Our very simple algorithm ciphers the string adding to each character a proper offset: to the first character will be added an offset equal to key, to the second an offset equal to key+1, then key+2 to the third, and so on. For instance, if string1 is bounded to  unical then string2 will will be ysojiu. Intuitively, the inverse is easily done in order to decrypt. The builtin manipulates only symbolic constants containing characters appearing into the ASCII table between a and z (i.e., 97 and 122), decimal notation. Of course oracles may be written in order to manage quoted string; in this cas attention must be paid to quotes.

The crypt.so library of this example contains the three oracles:

The base oracle (iii) simply checks whether the two strings are coupled with respect to the key. The second one (ioi) binds the second term to the the result of ciphering the first term with respect to the key; the third (oii) binds the first term to the deciphered string which binds the second term, again with respect to the key which binds the third term. The source file for the library follows.

//////////////////////////////////////////////////////////////////////////////

// crypt.C

// Dynamic Library for the external builtin #crypt

// #crypt(char*, char*, int)

#include "extpred.h"

#include <stdio.h>

#ifdef __cplusplus

extern "C" {

#endif

BUILTIN(crypt,iii)

    {

    if( argv[0].isSymbol() && argv[1].isSymbol() && argv[2].isInt())

        {

        // Convert the CONSTANT data to what we do have to compute

        // Convert argv[0] in an object of string class

        const char *name = argv[0].toSymbol();

        int l = strlen( name );

        // Convert argv[2] in an object of string class

        int key = argv[2].toInt();

        // The crypted string to be valuted

        char *crypt_name= new char[ l+1 ];

     

        int i=0;

        for( ; i < l; i++)

            crypt_name[i] = 97 + ( name[i] - 97 +  key + i ) % 26 ;

        crypt_name[i] = '\0';

       

        if( strcmp(crypt_name, argv[1].toSymbol())==0 ) 

          return true;

        }

    return false;

    }

BUILTIN(crypt,ioi)

    {

    if( argv[0].isSymbol() && argv[2].isInt())

        {

        const char *name = argv[0].toSymbol();

        int l = strlen( name );

        int key = argv[2].toInt();

        char *crypt_name= new char[ l+1 ];

     

        int i=0;

        for( ; i < l; i++ )

            crypt_name[i] =  97 + ( name[i] - 97 + key + i ) % 26 ;

     

        crypt_name[i] = '\0';

        argv[1] = CONSTANT( crypt_name );

        return true;

        }

    return false;

    }

BUILTIN(crypt, oii)

    {

    if( argv[1].isSymbol() && argv[2].isInt() )

        {

        const char *name = argv[1].toSymbol();

        int l = strlen( name );

        char *decrypt_name= new char[ l+1 ];

        int key = argv[2].toInt();

     

        int i=0;

        for( ; i < l; i++)

            decrypt_name[i] = 97 + ( 26 + name[i] - 97 - key - i ) % 26;

     

        decrypt_name[i] = '\0';

        argv[0] = CONSTANT( decrypt_name );

        return true;

      }

    return false;

    }

#ifdef __cplusplus

}

#endif

//////////////////////////////////////////////////////////////////////////////

Let’s have a quick look at the code. Please note that, as always suggested, each oracle starts with a type check on elements of argv.

The first oracle (iii, the base one) simpy verifies the correctness of the tuple constituted by the bound terms. It builds the enciphered string starting from the first and the third terms (argv[2] and argv[2]) and checks whether it corresponds to the second term (argv[1]).

The second oracle (ioi) act as the firts, but the enciphered string instead of being used for checking is put on argv[1] in order to bind the corresponding term once the control returns to DLV.

The last oracle (oii) just performs the deciphering and put the result on argv[0].

Now suppose that the file crypt.C is into the default LIB folder. Of course, you can freely change folder, but remember to have the files extpred.h and extpred.o also.

In order to build the library crypt.so let’s perform, while in LIB folder:

$ ./dynamic crypt

The LIB folder will actually contain also the file crypt.so.0. This file is needed to generate the donamic library, so dont’t dolete it.

 

The following is a sample DLV program which exploits the #crypt builtin.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

#include < crypt >

cocktail(mojito).

cocktail(mai_tai).

cocktail(daiquiri).

cocktail(long_island_iced_tea).

crypt(Y) :- cocktail(X), #crypt(X,Y,4).

is_crypt :-  #crypt(unical,ysojiu,4).

decrypt(X) :- #crypt( X , ysojiu,4).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Here the extension of crypt(Y) contains the ciphered strings obtained from cocktail(X), is_crypt is derived if ysojiu is the ciphered result of unical with key 4, and decrypt(X) deciphers ysojiu. Running DLV the model obtained is

{crypt(qtppbx), crypt(qfofbjs), crypt(hfoxcrbt), crypt(pttngrcwmarnytwwsoax), is_crypt, decryp(unical)}.


dlv-ex@mat.unical.it
Last modified 2004-oct-28