A brief introduction

to definite clause grammars

and definite clause translation grammars

A working paper prepared for the W3C XML Schema Working Group

C. M. Sperberg-McQueen

18 July 2004

This document provides a brief introduction to logic grammars, as background information for work on the representation of XSD schemas in logic grammar form.
The term logic grammar is used to denote grammars written in logic-programming systems; this paper describes both definite-clause grammars (DCGs), which are a built-in part of most Prolog systems, and definite-clause translation grammars (DCTGs), which employ a similar formalism. DCTGs closely resemble attribute grammars as described by [Knuth 1968] and later writers and it is a bit easier to handle complex or voluminous attribute value functions in DCTGs than in DCGs.
Both DCGs and DCTGs can be regarded as syntactic sugar for straight Prolog; before execution, both notations are translated into Prolog clauses in the usual notation.

1. Definite-clause grammars

This section reviews the basics of definite-clause grammars and their notation. I'll present a simple grammar in two or three versions, to illustrate the use of DCG notation.
Readers who don't find this introduction detailed enough may consult one of the Prolog books listed among the references.

1.1. Grammar E1: A trivial grammar for a fragment of English

Here is a simple DCG for a trivially small fragment of English, which I'll call E1. It is taken from example 8.1 in [König/Seiffert 1989], an introduction to Prolog for linguists:
%%% Grammar E1, a trivial fragment of English
s --> np, vp.  % A sentence (s) is a noun phrase (np) 
               % plus a verb phrase
np --> det, n. % A noun phrase is a determiner plus a noun
np --> n.      % ... or just a noun.
vp --> v, np.  % A verb phrase is a verb and its direct 
               % object, which is an np
vp --> v.      % ... or just the verb (for intransitives).
n --> [mary].  % 'mary', 'john', 'woman, 'man', 'apple' 
               % are nouns.
n --> [john].
n --> [woman].
n --> [man].
n --> [apple].
det --> [the]. % 'the' is a determiner
v --> [loves]. % 'loves', 'eats', and 'sings' are verbs.
v --> [eats].
v --> [sings].
This grammar generates sentences like “John loves Mary” and “The man sings” and “The woman eats the apple”, as well as the somewhat less plausible “The apple sings the Mary.”
When a Prolog system consults a file containing this grammar, the rules of the grammar are automatically translated into canonical Prolog clauses using difference lists:
?- listing([s,np,vp,n,det,v]).

s(A, B) :-
        np(A, C),
        vp(C, B).

np(A, B) :-
        det(A, C),
        n(C, B).
np(A, B) :-
        n(A, B).

vp(A, B) :-
        v(A, C),
        np(C, B).
vp(A, B) :-
        v(A, B).

n([mary|A], A).
n([john|A], A).
n([woman|A], A).
n([man|A], A).
n([apple|A], A).

det([the|A], A).

v([loves|A], A).
v([eats|A], A).
v([sings|A], A).

Difference lists are pairs of lists such that the second list is a suffix of the first list (i.e. the same from some point or other to the end), for example, the pair [a, b, c, d] and [c, d], which together denote the list [a, b], or (among the terminal rules above) [loves|A] and A, which together denote the list [loves]. Prolog-based parsers commonly use difference lists to represent the input; they allow the Prolog clauses which recognize structures in the input list to consume as much of the input list (the first list in the pair) as they wish, and pass the rest of the input (the second list in the pair) to the next clause. If a DCG predicate succeeds with its final two arguments unified to the difference-list pair [a, b, c, d] and [c, d], what it means is that the grammar rule recognized the list [a, b].
In the clause for s, we can see how one part of the parser picks up where another leaves off. The input list is A, and the np predicate recognizes a noun phrase at the beginning of A. The portion of the list which follows the noun phrase is assigned to C. The predicate vp then tries to recognize a verb phrase at the beginning of list C. What is left after the verb phrase is assigned, in turn, to list B, which is both ‘the part of the input list after the verb phrase’ (in the predicate vp) and (because the verb phrase can be the last item in a sentence) also ‘the part of the input after the s’ (in the predicate s).
For example, given the sentence [the, woman, eats, the, apple], the first rule for np will be called with its first argument bound to the entire input string; after it recognizes the initial noun phrase, the second argument will be bound to the part of the sentence which follows the noun phrase, namely [eats,the,apple]. The predicate vp will then recognize the verb phrase [eats,the,apple], leaving the empty list [] as the part of the input after the sentence.[1]

1.2. Calling the parser

The following fragment of a Prolog session log shows how the grammar can be used to test the grammaticality of sentences; it also shows that the grammar could use some refinement. To test whether a given sequence of tokens is an s, we call the Prolog predicate s with two difference-list arguments. If we specify the empty list as the second argument, we in effect require that the entire input sequence be a legal sentence.
?-  s([john,loves,mary],[]).

?- s([the,woman,eats,the,apple],[]).

?- s([the,man,sings],[]).

?- s([apple,eats,woman,the],[]).

?- s([apple,sings,the,mary],[]).

We can use a variable as a second argument, instead of the empty list; in this case, Prolog will tell us each prefix of the input list which can be parsed as a sentence according to our grammar:
?- s([the,woman,eats,the,apple],Remainder).

Remainder = [] ;

Remainder = [the, apple] ;

In one parse, the entire input is parsed as a sentence, and the remainder is the empty list. An alternative parse, however, is just to take [the,woman,eats] as a sentence, leaving [the,apple] as the remainder.

1.3. Grammar E2: A simple attribute grammar

An attribute grammar is a context-free grammar in which the nodes of the parse tree carry attributes (annotations in the form of name-value pairs). The values of attributes are calculated according to specifications associated with the production rules of the context-free grammar. Attribute values may propagate top-down (inherited attributes) or bottom-up (synthesized attributes) in the parse tree.
An accessible formal description of attribute grammars is given by [Alblas 1991]. Informally, attribute grammars are sometimes described as context-free grammars with parameters on the non-terminals.
We can turn the simple context-free grammar E1 into an attribute grammar by adding parameters to the productions; I will call the attribute grammar E2. In E2, one parameter is used to generate a structural description of the sentence, another to check for subject-verb agreement in grammatical number.
Thus extended with parameters, and a few more words in the lexicon, the grammar looks like this:
%%% E2: a trivial attribute grammar for a fragment of
%%% English, with a synthesized attribute for structural 
%%% description and enforcement of number agreement.

s(s(S,P)) --> np(S,Num), vp(P,Num).
np(np(D,N),Num) --> det(D,Num), n(N,Num).
np(np(N),pl) --> n(N,pl).
np(np(N),sg) --> pn(N,sg).
vp(vp(V,O),Num) --> v(V,Num), np(O,_).
vp(vp(V),Num) --> v(V,Num).
n(n(L),Num) --> [L], { lex(L,n,Num) }.
pn(pn(L),Num) --> [L], { lex(L,pn,Num) }.
v(v(L),Num) --> [L], { lex(L,v,Num) }.
det(det(L),Num) --> [L], { lex(L,det,Num) }.

When parsed with this grammar, the sentence “John loves Mary” is assigned the structure s(np(pn(john)), vp(v(loves), np(pn(mary)))) — or, with pretty-printing:
Here, the grammatical structure for each non-terminal on the left-hand side of a grammar rule is synthesized from the grammatical structures of the non-terminals on the right-hand side of the rule. So it's a synthetic attribute.
On some productions, a second parameter is used to enforce agreement in number between subject and verb and between determiner and noun. Here, Prolog's unification mechanism makes it unnecessary to distinguish between synthetic and inherited attributes. The sentence is grammatical if and only if the number (Num) of the np and that of the vp agree; whether the attribute flows bottom-up on each of these, or bottom-up on one and top-down on the other need not concern us.
In this version of the grammar, it is worth calling the reader's attention to the Prolog clauses enclosed in braces on the right-hand side of some rules, e.g. in
n(n(L),Num) --> [L], { lex(L,n,Num)}.
These Prolog clauses express constraints on the grammatical construction which are not themselves grammatical constituents. They are sometimes called ‘guards’ (e.g. by [Brown/Blair 1990]); they correspond directly to the ‘semantic conditions’ which are a constituent part of attribute grammars as described by [Alblas 1991]. With the guard, this rule can be paraphrased as saying “A noun (an n), has the structure n(L) and the grammatical number Num, if (a) L (for ‘lexical form’) is a single item in the input list [L], and (b) the relation lex contains the triple (L,n,Num).”
Such guards will be present in many of the rules we use to represent XML Schema in logic grammars.
By calling the grammar rules, we can instantiate a variable with the structural description of the sentence.
?- s(Struc,[john,loves,mary],[]).

Struc = s(np(pn(john)), vp(v(loves), np(pn(mary)))) 

?- s(S,[the,woman,eats,the,apple],[]).

S = s(np(det(the), n(woman)), vp(v(eats), np(det(the), n(apple)))) 

?- s(S,[the,man,sings],[]).

S = s(np(det(the), n(man)), vp(v(sings))) 

?- s(S,[apple,sings,the,mary],[]).

?- s(S,[the,apple,sings,mary],[]).

S = s(np(det(the), n(apple)), vp(v(sings), np(pn(mary)))) 

Note that in a ‘conventional’ use of DCGs, the user or application will call the grammar just once, at the top level, and the parser itself will generate recursive calls for the nested grammatical constructs (e.g. np and vp in our example). Note, however, that we can if we wish make calls to any level of the grammar from arbitrary Prolog code. Note also that we can include arbitrary Prolog code in the right-hand side of rules by wrapping it in braces. By doing so, we can make arbitrarily complex calculations part of the grammar rules; this is one reason that DCGs are not limited to context-free languages.

1.4. AG1: a programming-language example

In his introduction to attribute grammars ([Alblas 1991]), Henk Alblas provides two simple examples, which will be reproduced here in definite-clause form. Both examples illustrate how attributes can be used to record type information, and detect type errors, in the expression sublanguage of a programming language.
The first example, which Alblas labels AG1, infers the overall type of a numeric expression by means of synthesized (bottom-up) attributes.
An assignment consists of a variable name, an assignment symbol (‘gets’), and an expression. If the variable's type is integer, and the expression has type real, then an error should be raised.
< 1 AG1 in DCG form [File ag1.dcg] > ≡
/* A DCG version of the attribute grammar AG1 presented in
 * [Alblas 1991].

/* 1.  The grammar */
assignment --> variable(Vtype), gets, expr(Etype),
    { (Vtype = int, Etype = real)
      -> error('Type of right hand side not convertible to type of left-hand side')
      ;  true }.
Continued in <Expressions 2>, <Terms 3>, <Simple terms 4>, <Factors 5>, <Addition operators 6>, <Multiplication operators 7>, <Punctuation 8>, <Variables 9>, <Error messages 10>, <Test data 11>

An expression is a series of terms connected by additive operators. Note that the grammar in [Alblas 1991] is left-associative, but for simplicity I have recast it to be right-associative, since definite-clause grammars have trouble with left recursion.[2] The type of the sum is real if either operand is real; if both are integer, so is the expression.
< 2 Expressions [continues 1 AG1 in DCG form] > ≡
expr(T) --> term(T1), addop(_Op), expr(T2),
	{ T1 = real -> T = real ;  T = T2 }.

expr(T) --> term(T).

A term is a series of factors, joined by multiplication operators. There are three operators, with different type rules:
  • multiplication produces a real result if either operand is real, otherwise an integer result
  • division produces a real result in all cases
  • the mod operator requires integer operands and produces an integer result
< 3 Terms [continues 1 AG1 in DCG form] > ≡
term(T) --> factor(T1), mulop(Op), term(T2),
	{ (Op = mul)
	  -> (T1 = real -> T = real ; T = T2)
        ; (Op = div)
	  -> T = real
	; (Op = mod)
	  -> T = int, 
	     (T1 = real -> error('left operand should be of type integer') ; true),
	     (T2 = real -> error('right operand should be of type integer') ; true)

If there is only one factor in the series, the type of the term is the same as the type of the factor.
< 4 Simple terms [continues 1 AG1 in DCG form] > ≡
term(T) --> factor(T).

Factors are either variables, or parenthesized expressions. Their type is copied up from that of the variable or sub-expression.
< 5 Factors [continues 1 AG1 in DCG form] > ≡
factor(T) --> variable(T).

factor(T) --> lparen, expr(T), rparen.

The terminal symbols for the addition operators are the obvious ones.
< 6 Addition operators [continues 1 AG1 in DCG form] > ≡
addop(add) --> ['+'].
addop(sub) --> ['-'].

There are three surface forms for the multiplication operator, and two for division.
< 7 Multiplication operators [continues 1 AG1 in DCG form] > ≡
mulop(mul) --> ['*'].
mulop(mul) --> ['times'].
mulop(mul) --> ['×'].
mulop(div) --> ['/'].
mulop(div) --> ['÷'].
mulop(mod) --> ['mod'].

The simple punctuation symbols used in this language could easily be written directly into the productions where they occur, but we'll follow instead the practice of defining a pre-terminal symbol for each of them.
< 8 Punctuation [continues 1 AG1 in DCG form] > ≡
gets --> [':='].
lparen --> ['('].
rparen --> [')'].

Alblas's grammar does not deal with the assignment of types to variables; a type (integer or real) is associated with each variable, but the mechanism is out of band. In most languages, declarations would be used for this purpose. For testing purposes, we'll use a simplistic mechanism like the old Fortran rule that the type depends on the initial character of the variable name.
< 9 Variables [continues 1 AG1 in DCG form] > ≡
variable(int) --> [i].
variable(int) --> [j].
variable(int) --> [k].
variable(real) --> [x].
variable(real) --> [y].
variable(real) --> [z].
variable(int) --> [a].
variable(int) --> [b].
variable(int) --> [c].

Similarly, solely for testing purposes we'll define a simplistic error-message predicate.
< 10 Error messages [continues 1 AG1 in DCG form] > ≡
/* 2.  Miscellaneous utilities (for something like verisimilitude) */
error(E) :- nl, write('!! '), write_error_message(E), write(' !!'), nl.
/* if we want to fail on an error, add !, fail after the nl. */

write_error_message(E) :- atom(E), write(E).
write_error_message([H|T]) :- 

And finally, it's useful to have some test data associated with the grammar, not at all an exhaustive set of tests, but some useful sanity checks.
< 11 Test data [continues 1 AG1 in DCG form] > ≡
/* Tests */
/* to walk through these one by one, type 
 *   testdata(Test), assignment(Node,Test,[]), write(Test), nl
 * (or the equivalent) at the Prolog prompt, and then keep hitting ; to see
 * new test data.
testdata([a, :=, b]).
testdata([a, :=, y]).
testdata([z, :=, y]).
testdata([z, :=, a]).
testdata([z, :=, a, /, '(', b, +, y, ')' ]).
testdata([z, :=, a, mod, '(', b, +, y,')' ]).
testdata([z, :=, a, mod, '(', b, +, c,')' ]).
testdata([z, :=, z, mod, '(', b, +, c,')' ]).
testdata([z, :=, z, *, '(', y, +, z,')' ]).
testdata([a, :=, a, mod, '(', b, +, c,')' ]).

2. Definite clause translation grammars

For some applications, it's useful to annotate the notes of the parse tree with information relevant to the application. As was just shown, it's possible to provide such annotations using DCG notation, but it rapidly becomes cumbersome. DCTG notation was devised to handle grammatical attributes more conveniently than DCG, and to separate the semantics more effectively from the syntax [Abramson 1984]. It has also been applied to discussions of SGML semantics ([Brown/Blair 1990], [Brown/Wakayama/Blair 1992]).

2.1. Introduction to DCTG notation

Here is a simple example of DCTG notation. Consider the following context-free grammar for binary strings:
bit ::= '0'.
bit ::= '1'.
bitstring ::= '' /* nothing */.
bitstring ::= bit, bitstring.
number ::= bitstring, fraction.
fraction ::= '.', bitstring.
fraction ::= ''.
We might wish to calculate the length and the (unsigned base-two) value of the bitstring as attributes. Using a yacc-like notation that might look like this. Notice that scale is a top-down attribute and value and fractionalvalue are bottom-up attributes.
bit ::= '0' { $0.bitvalue = 0; }.
bit ::= '1' { $0.bitvalue = power(2,$0.scale); }.
bitstring ::= '' { 
  $0.value = 0; 
  $0.length = 0;
  /* scale doesn't matter here */
bitstring ::= bit, bitstring {
  $0.length = $2.length + 1;
  $1.scale = $0.scale;
  $2.scale = $0.scale - 1;
  $0.value = $1.value + $2.value;
number ::= bitstring, fraction {
  $1.scale = $1.length - 1;
  $0.value = $1.value + $2.fractionalvalue;
fraction ::= '.', bitstring {
  $2.scale = -1;
  $0.fractionalvalue = $2.value;
fraction ::= '' {
  $0.fractionalvalue = 0;
In DCTG notation, this grammar looks like this:[3]
bit ::= [0]
  <:> bitval(0,_).

bit ::= [1]
  <:> bitval(V,Scale) ::- V is **(2,Scale).

bitstring ::= []
  <:> length(0)
  &&  value(0,_).

bitstring ::= bit^^B, bitstring^^B1
  <:> length(Length) ::-
        B1 ^^ length(Length1),
        Length is Length1 + 1
  &&  value(Value,ScaleB) ::-
        B ^^ bitval(VB,ScaleB),
        S1 is ScaleB - 1,
        B1 ^^ value(V1,S1),
        Value is VB + V1.

number ::= bitstring ^^ B, fraction ^^ F
  <:> value(V) ::-
        B ^^ length(Length),
        S is Length-1,
        B ^^ value(VB,S),
        F ^^ fractional_value(VF),
        V is VB + VF.

fraction ::= ['.'], bitstring ^^ B
  <:> fractional_value(V) ::-
        S is -1,
        B ^^ value(V,S).

fraction ::= []
  <:> fractional_value(0).
As may be seen, each rule in DCTG consists of a left hand side, a right-hand side, and optionally a list of attributes. The right-hand side is separated from the left-hand side by the operator ::= and from the following list of attributes (if any) by the operator <:>. The expression on the right hand side is a sequence of non-terminal symbols, each optionally suffixed with the operator ^^ and a variable name (e.g. bitstring^^B). Each attribute is identified by a Prolog structure, e.g. value(V), whose functor is the name of the attribute and whose arguments are its values. The attribute structure may be followed by the operator ::- (designed to look a lot like the standard Prolog operator :- or ‘neck’) and a series of goals which will be satisfied when the attribute value is to be instantiated. In practice, these goals help to calculate the attribute values. Values of attributes attached to the items on the right-hand side may be referred to by the variable name associated with the item and the name of the attribute, joined by the operator ^^, as for example in B ^^ bitval(VB,ScaleB).
A partial EBNF grammar for DCTG notation[4] is:
grammar ::= (clause | directive | rule)*
rule    ::= lhs '::=' rhs ('<:>' att-spec ('&&' att-spec)*)?
lhs     ::= term
rhs     ::= term (',' term)*
attspec ::= compound-term ('::-' goal (',' goal)*)?
compound-term ::= ATOM '(' term (',' term)* ')'

clause  ::= head ':-' body

2.2. Another example: the English grammar fragment E1

Here is another simple example: the trivial fragment of English grammar given above, translated directly into DCTG notation, looks like the following. The only difference from the DCG form is that the separator between the left- and right-hand side of each production is “::=” instead of “-->”.
%%% E1 (trivial context-free grammar for a fragment of English)
%%% in DCTG notation.
s ::= np, vp.
np ::= det, n.
np ::= n.
vp ::= v, np.
vp ::= v.
n ::= [mary].
n ::= [john].
n ::= [woman].
n ::= [man].
n ::= [apple].
det ::= [the].
v ::= [loves].
v ::= [eats].
v ::= [sings].
The translation into standard Prolog is similar to that used for DCGs, but instead of adding two arguments to each non-terminal, the DCTG translation adds three. The first additional argument is a Prolog structure with the functor node, described further below. The second and third are (like the two arguments added in DCG translations) difference lists.
The node structure has three arguments:
  1. the non-terminal of the grammar rule (here s, np, vp, etc.)
  2. a list of the node structures associated with the items on the right-hand side of the grammar rule
  3. a list of grammatical attributes (in this grammar, this list will be empty)
The translation of our trivial grammar into standard Prolog is thus:
?- dctg_reconsult('ks81dctg.pl').

?- listing([s,np,vp,n,det,v]).

:- dynamic s/3.

s(node(s, [A, B], []), C, D) :-
        np(A, C, E),
        vp(B, E, D).

:- dynamic np/3.

np(node(np, [A, B], []), C, D) :-
        det(A, C, E),
        n(B, E, D).
np(node(np, [A], []), B, C) :-
        n(A, B, C).

:- dynamic vp/3.

vp(node(vp, [A, B], []), C, D) :-
        v(A, C, E),
        np(B, E, D).
vp(node(vp, [A], []), B, C) :-
        v(A, B, C).

:- dynamic n/3.

n(node(n, [[mary]], []), A, B) :-
        c(A, mary, B).
n(node(n, [[john]], []), A, B) :-
        c(A, john, B).
n(node(n, [[woman]], []), A, B) :-
        c(A, woman, B).
n(node(n, [[man]], []), A, B) :-
        c(A, man, B).
n(node(n, [[apple]], []), A, B) :-
        c(A, apple, B).

:- dynamic det/3.

det(node(det, [[the]], []), A, B) :-
        c(A, the, B).

:- dynamic v/3.

v(node(v, [[loves]], []), A, B) :-
        c(A, loves, B).
v(node(v, [[eats]], []), A, B) :-
        c(A, eats, B).
v(node(v, [[sings]], []), A, B) :-
        c(A, sings, B).

The predicate dctg_reconsult(File) is used to translate a DCTG grammar into Prolog clauses and load them; it is provided by [Abramson/Dahl/Paine 1990] and is available from a variety of sources on the net.[5]
A short terminal session should make the nature of the results a bit clearer:[6]
?- s(S,[john,loves,mary],[]), write(S).
           [node(n, [[john]], [])], 
           [node(v, [[loves]], []), 
                 [node(n, [[mary]], [])], 

S = node(s, [node(np, [node(n, [[john]], [])], []), 
node(vp, [node(v, [[loves]], []), node(np, [node(n, 
[...], [])], [])], [])], []) 

?- s(S,[the,woman,eats,the,apple],[]), write(S).
           [node(det, [[the]], []), 
            node(n, [[woman]], [])], 
           [node(v, [[eats]], []), 
                 [node(det, [[the]], []), 
                  node(n, [[apple]], [])], 

S = node(s, [node(np, [node(det, [[the]], []), node(n, 
[[woman]], [])], []), node(vp, [node(v, [[eats]], []), 
node(np, [node(det, [...], []), node(..., ..., ...)], 
[])], [])], []) 

?- s(S,[the,man,sings],[]), write(S).
           [node(det, [[the]], []), 
            node(n, [[man]], [])], 
           [node(v, [[sings]], [])], 

S = node(s, [node(np, [node(det, [[the]], []), 
node(n, [[man]], [])], []), node(vp, [node(v, 
[[sings]], [])], [])], []) 

The node structure constructed in the first added argument resembles and serves much the same purpose as the structure attribute used in E2, the attribute-grammar version of the English grammar fragment.

2.3. English grammar with attributes (E2) in DCTG notation

The fragment of English grammar E2, which was presented earlier to illustrate the use of DCGs for attribute grammars, may also be used to illustrate DCTGs.
Let's walk through the grammar. A sentence is an NP followed by a VP; we will call the NP S (‘subject’), and the VP we will call P (‘predicate’).
< 12 English grammar fragment with attributes [File ks92.dctg] > ≡
s ::= np^^S, vp^^P,
  { S^^number(Num), P^^number(Num) }
  <:> {Attributes for non-terminal s 13}

Continued in <NP rules 14>, <VP rules 19>, <Pre-terminal rules 22>, <Lexicon 23>

The goals enclosed in braces (S^^number(Num), P^^number(Num)) together express the constraint that the NP and VP must agree in number: the number attribute of the NP S and the number attribute of the VP P must unify with each other.
The non-terminal s has only one grammatical attribute; let us call it structure. When s is made up (as here) of an NP and a VP, we represent its structure by a Prolog term with s as its functor and the structure of the NP and VP as its two arguments:
< 13 Attributes for non-terminal s > ≡
  structure(s(Sstr,Pstr)) ::- 

This code is used in < English grammar fragment with attributes 12 >

A noun phrase (NP) is made up of a determiner and a noun; they must agree in number. This covers phrases like “the apple”, “the apples”, “one apple”, “some apples”. The agreement rule excludes the phrases “one apples” and “some apple”.[7]
< 14 NP rules [continues 12 English grammar fragment with attributes] > ≡
np ::= det^^D, n^^N,
  { D^^number(Num), N^^number(Num) }
  <:> {NP structure attribute (for DET+N) 15}

  &&  {NP number attribute (for DET+N) 16}

Continued in <NP rules, cont'd 17>, <NP rules, cont'd 18>

The non-terminal np has two attributes, named structure and number.
The structure of the NP is a Prolog term with the name of the non-terminal (np) as its functor and the constituents as the arguments. This is the pattern of the structure attribute on all non-terminals, and I won't comment on it again.
< 15 NP structure attribute (for DET+N) > ≡
  structure(np(Dstr,Nstr)) ::-

This code is used in < NP rules 14 >

The number attribute of the NP illustrates an important idiom: the guard in the syntactic part of the rule has already checked the number attributes of the determiner and the noun, to make sure they unify with each other; the variable Num has the value sg or pl already, and we don't need to do any more computation. We just say that whatever Num is, that's the grammatical number of the NP.
< 16 NP number attribute (for DET+N) > ≡

This code is used in < NP rules 14 >

Noun phrases can also take the form of a plural noun by itself, as in “Men love apples”.
< 17 NP rules, cont'd [continues 14 NP rules] > ≡
np ::= n^^N, { N^^number(pl) }
  <:> structure(np(Nstr)) ::-
  &&  number(pl).

The final form of NP recognized by this grammar is a singular proper noun by itself, as in “John loves Mary”.
< 18 NP rules, cont'd [continues 14 NP rules] > ≡
np ::= pn^^N, { N^^number(sg) }
  <:> structure(np(Nstr)) ::-
  &&  number(sg).

A verb phrase (VP) can include a direct object in the form of a noun phrase:
< 19 VP rules [continues 12 English grammar fragment with attributes] > ≡
vp ::= v^^V, np^^O
  <:> structure(vp(Vs,Os)) ::-
  &&  {Number attribute for VP 21}

Continued in <VP rules, cont'd 20>

or they can be just a verb with no direct object.[8]
< 20 VP rules, cont'd [continues 19 VP rules] > ≡
vp ::= v^^V
  <:> structure(vp(Vs)) ::-
  &&  {Number attribute for VP 21}

Although both the verb and the direct object have a number attribute, only that of the verb counts in determining the value of the number attribute for the VP as a whole.
< 21 Number attribute for VP > ≡
  number(Num) ::- V^^number(Num).

This code is used in < VP rules 19 > < VP rules, cont'd 20 >

Nouns, proper nouns, verbs, and determiners (the ‘pre-terminal’ categories of the grammar) all have rules of the same structure: a token in the string counts as one of these if the lexicon says it's one, and the number attribute has whatever value the lexicon gives.
< 22 Pre-terminal rules [continues 12 English grammar fragment with attributes] > ≡
n ::= [L], { lex(L,n,Num) }
  <:> structure(n(L))
  &&  number(Num).

pn ::= [L], { lex(L,pn,Num) }
  <:> structure(pn(L))
  &&  number(Num).

v ::= [L], { lex(L,v,Num) }
  <:> structure(v(L))
  &&  number(Num).

det ::= [L], { lex(L,det,Num) }
  <:> structure(det(L))
  &&  number(Num).

Finally, the lexicon. To keep things simple, the lexicon here is just a set of facts, with literal values.[9] The entry for the word the is the exception: it does not have a literal value, but the anonymous variable _ to indicate that the determiner the can be either sg or pl.
< 23 Lexicon [continues 12 English grammar fragment with attributes] > ≡

A session log illustrates the structure built by the DCTG rules:
?- s(S,[john,loves,mary],[]), write(S).
                 (number(sg)::-lex(john, pn, sg))])], 
           [ (structure(np(_G292))::-
                      (number(sg)::-lex(john, pn, sg))])^^structure(_G292)), 
                 (number(sg)::-lex(loves, v, sg))]), 
                       (number(sg)::-lex(mary, pn, sg))])], 
                     node(pn, [[mary]], 
                          (number(sg)::-lex(mary, pn, sg))])
           [ (structure(vp(_G351, _G352))::-
                node(v, [[loves]], 
                     (number(sg)::-lex(loves, v, sg))])^^structure(_G351), 
                     [node(pn, [[mary]], 
                           (number(sg)::-lex(mary, pn, sg))])], 
                         node(pn, [[mary]], 
                              (number(sg)::-lex(mary, pn, sg))])
                node(v, [[loves]], 
                     (number(sg)::-lex(loves, v, sg))])^^number(_G373))])], 
     [(structure(s(_G261, _G262))::-
              [node(pn, [[john]], 
                    (number(sg)::-lex(john, pn, sg))])], 
                  node(pn, [[john]], 
                       (number(sg)::-lex(john, pn, sg))])^^structure(_G292)), 
         node(vp, [node(v, [[loves]], 
                        (number(sg)::-lex(loves, v, sg))]), 
                        [node(pn, [[mary]], 
                              (number(sg)::-lex(mary, pn, sg))])], 
                            node(pn, [[mary]], 
                                 (number(sg)::-lex(mary, pn, sg))])
                  [(structure(vp(_G351, _G352))::-
                      node(v, [[loves]], 
                           (number(sg)::-lex(loves, v, sg))])
                      node(np, [node(pn, [[mary]], 
                                     (number(sg)::-lex(mary, pn, sg))])], 
                                   node(pn, [[mary]], 
                                        (number(sg)::-lex(mary, pn, sg))])
                      node(v, [[loves]], 
                           (number(sg)::-lex(loves, v, sg))])
In examining the structure just shown, the reader will note a great deal of apparent repetition; this results from the high incidence of structure sharing, which is not shown explicitly: in the grammar itself, the variables which have child nodes as values are used quite freely and appear both in the list of children and in the rules for calculating synthetic attributes.
Note also that the values of the attributes are not always pre-calculated: instead, the structure has the rules necessary to perform the evaluation on demand. In the example, the structure attribute for the pre-terminal categories is already completely grounded: it has no variables, but only literal values, while the structure attributes for the higher-level parts of the linguistic structure still have unbound variables.

2.4. DCTG attributes

It may be worth discussing the use of DCTG attributes a bit more and mentioning some unusual situations which may arise with them

2.4.1. Finding the value of an attribute at top level

We can find out the value of an attribute by using the ^^ operator between a node and a term matching one of its attributes. To parse a sentence and find the structure attribute of its top node, we might write
?- s(Sentence,[john,loves,mary],[]),
|    Sentence ^^ structure(Str).
This binds the variable Str to the appropriate value, namely (np(pn(john)), vp(v(loves), np(pn(mary)))).

2.4.2. Undefined attributes

If we request an attribute not present in the list of attributes, the term will fail and the variable won't be unified with anything. For example, if we forget that the sentence has only a structure attribute and attempt to bind a variable to its non-existent number attribute by writing Sentence ^^ number(Num), the result is simple:
?- s(Sentence, [john, loves, mary], []), 
|    Sentence ^^ number(Num).

2.4.3. Non-deterministic attributes

If there are two or more ways for the rule defining an attribute to succeed, then backtracking will turn up the alternate bindings in the usual way. If we have
and some attribute a is defined by the rule a(X) ::- man(X), then if we ask for the a attribute of some node N, the result is much as one might expect. In an interactive session, it might look like this:[10]
?- ... $N ^^ a(X).
X = john ;
X = bill ;
X = chuck ;

2.4.4. Attributes defined more than once

If the same attribute appears more than once in the list of attributes for a given node, the same surface behavior will occur. The definition which comes earliest in the list of attributes will be tried first, and backtracking will find alternate solutions for it, and then solutions using the later definitions.

2.4.5. Applying the ^^ operator to other data structures

It may also be useful to mention how the ^^ operator is defined. The code for ^^ allows it to be applied either to node(Non-terminal, Children, Atts) structures or to the Atts argument of such structures itself. Since the Atts argument itself is just a list of terms, some of them with ::- as the main functor and some of them without, we can create such a list and apply the ^^ operator to it whenever it is convenient. For example, we could define a fact this way:[11]
This represents the fact that a lecture course on complexity is given on Monday from 9 to 11 by David Harel in room A of the Feinberg building. We could also use a list of attributes to represent the same information:
course(complexity, [
The second approach allows us to use the ^^ operator:
?- course(complexity,CourseAtts),
|    CourseAtts^^start_time(ST),
|    CourseAtts^^end_time(ET),
|    Duration is ET - ST.
CourseAtts = [name(complexity), day(monday), 
              start_time(9), end_time(11), 
              lecturer(david, harel), 
              building(feinberg), room(a)]
ST = 9
ET = 11
Duration = 2
This usage is sometimes more convenient than using structures with a fixed arity; if nothing else, it relieves us of the burden of either remembering the exact argument sequence for complex structures of high arity or else writing a lot of simple value-extractor predicates of the form
lecturer(Course,GivenName,FamilyName) :-
It will also often be somewhat slower than structures with fixed arity, since the list structure gives the Prolog compiler very little help in finding the right rule, and extraction of attributes immediately degenerates into a linear search of the attribute list.

A. Works cited and further reading

Works cited

Abramson, Harvey. 1984. “Definite Clause Translation Grammars”. Proceedings of the 1984 International Symposium on Logic Programming, Atlantic City, New Jersey, February 6-9, 1984, pp. 233-240. (IEEE-CS 1984, ISBN 0-8186-0522-7)

Abramson, Harvey, and Veronica Dahl. 1989. Logic Grammars. Symbolic Computation AI Series. Springer-Verlag, 1989.

Abramson, Harvey, and Veronica Dahl, rev. Jocelyn Paine. 1990. DCTG: Prolog definite clause translation grammar translator. (Prolog code for translating from DCTG notation to standard Prolog. Note says syntax extended slightly by Jocelyn Paine to accept && between specifications of grammatical attributes, to minimize need for parentheses. Available from numerous AI/NLP software repositotries, including http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/prolog/code/syntax/dctg/0.html, http://www.ims.uni-stuttgart.de/ftp/pub/languages/prolog/libraries/imperial_college/dctg.tar.gz, and http://www.ifs.org.uk/~popx/prolog/dctg/.)

Alblas, Henk. 1991. “Introduction to attribute grammars”. Attribute grammars, applications and systems: International Summer School SAGA, Prague, Czechoslovakia, June 4-13, 1991, Proceedings, pp. 1-15. Berlin: Springer, 1991. Lecture Notes in Computer Science, 545.

Bratko, Ivan. 1990. Prolog programming for artificial intelligence. Second edition. Wokingham: Addison-Wesley. xxi, 597 pp.

Brown, Allen L., Jr., and Howard A. Blair. 1990. “A logic grammar foundation for document representation and layout”. In EP90: Proceedings of the International Conference on Electronic Publishing, Document Manipulation and Typography, ed. Richard Furuta. Cambridge: Cambridge University Press, 1990, pp. 47-64.

Brown, Allen L., Jr., Toshiro Wakayama, and Howard A. Blair. 1992. “A reconstruction of context-dependent document processing in SGML”. In EP92: Proceedings of Electronic Publishing, 1992, ed. C. Vanoirbeek and G. Coray. Cambridge: Cambridge University Press, 1992. Pages 1-25.

Clocksin, W. F., and C. S. Mellish. Programming in Prolog. Second edition. Berlin: Springer, 1984.

Cohen, Jacques, and Timothy J. Hickey. “Parsing and compiling using Prolog”. ACM Transactions on Programming Languages and Systems 9.2 (1987): 125-163.

Gal, Annie, Guy Lapalme, Patrick Saint-Dizier, and Harold Somers. 1991. Prolog for natural language processing. Chichester: Wiley, 1991. xiii, 306 pp.

Gazdar, Gerald, and Chris Mellish. 1989. Natural language processing in PROLOG: An introduction to computational linguistics. Wokingham: Addison-Wesley, 1989. xv, 504 pp.

Grune, Dick, and Ceriel J. H. Jacobs. 1990. Parsing techniques: a practical guide. New York, London: Ellis Horwood, 1990. Postscript of the book is available from the first author's Web site at http://www.cs.vu.nl/~dick/PTAPG.html

Knuth, D. E. 1968. “Semantics of context-free languages”. Mathematical Systems Theory 2: 127-145.

König, Esther, and Roland Seiffert. Grundkurs PROLOG für Linguisten. Tübingen: Francke, 1989. [= Uni-Taschenbücher 1525]

Pereira, Fernando C. N., and Stuart M. Shieber, Prolog and natural-language analysis. CSLI lecture notes 10. Stanford: Center for the study of language and information, 1987.

Stepney, Susan. High-integrity compilation. Prentice-Hall. Available from http://www-users.cs.york.ac.uk/~susan/bib/ss/hic/index.htm. Chapter 3 (Using Prolog) provides a terse introduction to DCTG notation and use. The rest of the book (available in Postscript form) provides a serious example of their application in a compiler.

Sterling, Leon, and Ehud Shapiro. 1994. The Art of Prolog: Advanced Programming Techniques. Cambridge, Mass.: MIT Press.

Wielemaker, Jan. 2003. “SWI-Prolog 5.2.4 Reference Manual”. http://www.swi-prolog.org

Further reading

Definite-clause grammars (DCGs) are introduced in almost any good Prolog textbook: e.g. [Clocksin/Mellish 1984], [Bratko 1990], or [Sterling/Shapiro 1994]. They are discussed at somewhat greater length in treatments of Prolog for natural-language processing, including [König/Seiffert 1989], [Gazdar/Mellish 1989], [Pereira/Shieber 1987], and [Gal et al. 1991]. Most extended discussions show how to use additional arguments to record syntactic structure or handle the semantics of the material.
Definite-clause translation grammars were not so widely discussed; the best treatment readily available on the Web appers to be [Stepney 1993]. General introductions to attribute grammars will also be useful; the discussion in [Grune/Jacobs 1990] is extremely accessible and clear.


[1] If for some reason the parser backtracks, the vp predicate will also recognize [eats] as a verb phrase, leaving [the,apple] as the unparsed residue; see below.
[2] There are methods of forcing a left-associative interpretation on the grammar, but I'll pass over them in silence here. The interested reader may consult [Cohen/Hickey 1987] (page 135) for an illustration of how to produce a left-associative syntax tree using a recursive descent parser like that for DCGs. Another well known approach is to use a left-corner parser rather than a recursive-descent parser; see the discussions in [Gal et al. 1991] (section 4.7.2) and [Pereira/Shieber 1987] (section 6.5).
[3] This grammar is supplied as a test case with the standard code for DCTG translation found in many AI repositories Abramson/Dahl/Paine 1990 transcribed (with slight modifications) by Jocelyn Paine from code supplied by Abramson and Dahl Abramson/Dahl 1989. It may possibly mirror an example in Knuth 1968, but I haven't actually ever seen that article and am not sure. (I have reformatted the code slightly to match the style used below.) Tracing the execution of this code while parsing a short bit string is instructive.
[4] This EBNF ignores some Prolog constructs like curly braces; it is therefore not exact or complete.
[5] The version I found on the net assumes that operator precedence values are in the range 0 to 255, as described in [Clocksin/Mellish 1984]; to make this program work with the many Prolog systems which use the range 0 to 1200, one must adjust the operator precedence values appropriately. A modified copy is available from the author.
[6] The attentive reader will note that SWI Prolog abbreviates complex structures when printing them at the shell; I have added a call to write in order to have the entire structure printed out. I have also manually added white space to the output of write, to make the structures easier to examine.
[7] In a more adequate grammar of English, of course, some would be recognized as compatible with singular nouns, too: “Every man loves some woman.”
[8] A more complete grammar would attribute a property to the verb to say whether it is transitive or intransitive; verbs marked intransitive cannot take a direct object, though transitive verbs can normally be used without one (“John eats the apple” but also “John eats”). This refinement is omitted here; the grammar is already long enough.
[9] In more serious grammars, strenuous efforts are made to avoid having to have separate entries for the singular and the plural of every noun. One reason for introducing a lex predicate is to insulate the definition of the pre-terminals from the lexicon.
[10] The dollar sign is used by SWI Prolog to retrieve a top-level binding for N.
[11] The example is drawn from section 2.2 of [Sterling/Shapiro 1994].