The words module provides access to the WordNet lexical database from Princeton University. The database contains a large amount of information about English words, particularly their parts of speech (noun, verb, adverb, or adjective). This information can be very useful when building Natural Language Processing systems in Plang.
Note: the words module is optional in Plang and is only available if WordNet was present on the system when Plang was built. Many GNU/Linux distributions already provide WordNet packages. Try installing wordnet and wordnet-devel (or wordnet-dev) to see if your distribution has them. If not, download and build WordNet from the sources at the link above. You will need the "devel" package installed to build Plang with WordNet.
The WordNet library and database are distributed under a BSD-style license. The Plang words module that wraps the WordNet library is distributed under the GNU Lesser General Public License, Version 3 (LGPLv3).
The only language supported by WordNet is English. It is primarily American English, but the database also contains British English spellings such as "colour" and "organise". Other languages have different parts of speech and sentence structure, so they will require separate add-on modules to handle them in Plang.
To use the Plang module, first import the words module into your application:
:- import(words).
The names of the predicates in the module are prefixed with "words::", as in words::adjective/1 and words::noun/1. The part of speech testing predicates take a single argument and succeed or fail depending upon whether the word has a specific part of speech:
:- import(words).
:- import(stdout).

dump_word(Word)
{
    stdout::write(Word);
    stdout::write(":");
    if (words::adjective(Word))
        stdout::write(" adjective");
    if (words::adverb(Word))
        stdout::write(" adverb");
    if (words::noun(Word))
        stdout::write(" noun");
    if (words::verb(Word))
        stdout::write(" verb");
    stdout::writeln();
}
The predicates in the words module can also be used in definite clause grammar (DCG) rules to help parse sentences in English:
sentence --> noun_phrase, verb_phrase.
noun_phrase --> det, words::noun.
verb_phrase --> words::verb, noun_phrase.
det --> ["the"].
det --> ["a"].
The WordNet database contains a large number of multi-word entries such as "bobby_fischer", "hand_out", "short_and_sweet", etc. These are also recognized by the part of speech testing predicates. Either a space or an underscore can be used as the word separator. It is possible to modify a set of DCG rules to recognize multi-word forms in a regular sentence as shown in the following example:
noun_phrase --> det, noun.

noun([Word|Out], Out)
{
    words::noun(Word);
}

noun([Word1,Word2|Out], Out)
{
    Word is Word1 + "_" + Word2;
    words::noun(Word);
}

noun([Word1,Word2,Word3|Out], Out)
{
    Word is Word1 + "_" + Word2 + "_" + Word3;
    words::noun(Word);
}
WordNet has a large number of queries that can be performed on words. Most produce a list of related words that are in some relationship with the word being queried. In Plang, advanced queries on the WordNet database can be performed with words::search/5, words::overview/2, and words::description/5. For example, the following code queries for the parts of the noun "hand" according to all senses of the word:
words::search("hand", noun, haspartptr, allsenses, Result);
The Result will be a list that includes words like "finger", "palm", etc. An overview of the word, similar to a dictionary entry, can be obtained with words::description/5:
words::description("hand", noun, overview, allsenses, Description);
A dictionary-like entry that lists all parts of speech and senses of a word can be retrieved with words::overview/2:
words::overview("hand", Description);
The permitted query types are based on the names given to them by WordNet:
antptr, hyperptr, hypoptr, entailptr, simptr, ismemberptr,
isstuffptr, ispartptr, hasmemberptr, hasstuffptr, haspartptr,
meronym, holonym, causeto, pplptr, seealsoptr, pertptr,
attribute, verbgroup, derivation, classification, class, syns,
freq, frames, coords, relatives, hmeronym, hholonym, wngrep,
overview, classif_category, classif_usage, classif_regional,
class_category, class_usage, class_regional, instance, instances
Some familiarity with WordNet's terminology will be required to correctly format an advanced query on the database.
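As a rough orientation, the query names follow WordNet's pointer names: hyperptr asks for hypernyms (more general terms), hypoptr for hyponyms (more specific terms), antptr for antonyms, and haspartptr for the parts of a thing. The sketch below reuses only calls that appear elsewhere in this section. The name show_relations/1 is a hypothetical helper, not part of the words module, and the contents of each list depend on the installed WordNet database. Whether words::search/5 fails or returns an empty list when a word has no such relation is not stated here, so treat the helper as illustrative only.

```plang
:- import(words).
:- import(stdout).

// Hypothetical helper: print a few relations for a noun.
show_relations(Word)
{
    // More general terms (hypernyms), across all senses.
    words::search(Word, noun, hyperptr, allsenses, General);
    stdout::writeln(General);

    // Parts of the thing named by Word.
    words::search(Word, noun, haspartptr, allsenses, Parts);
    stdout::writeln(Parts);
}
```

Calling show_relations("hand") would issue the same haspartptr query as the earlier example, preceded by a hypernym query on the same word.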
Many words in the English language are variations on other words with the addition of a suffix, for example "eat" and "eating". The WordNet database stores information on the base forms but not the suffixed forms. The word testing predicates will check for base forms, but the advanced queries will not.
Use the words::base_forms/3 predicate to explicitly fetch a list of all the base forms of a word with respect to a specific part of speech:
words::base_forms("eating", verb, BaseForms)
The words::base_forms/3 predicate will fail if the word does not have a base form. That is, it will succeed for "eating", but not for "eat". The words::base_form/3 predicate will return the first base form, or the word itself if the word does not have a base form:
words::base_form("eating", verb, Base1)   Base1 will be set to "eat"
words::base_form("eat", verb, Base2)      Base2 will be set to "eat"
words::base_form("bobby", verb, Base3)    fails - "bobby" is not a verb
words::base_form("bobby", noun, Base4)    Base4 will be set to "bobby"
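Because the advanced queries do not reduce a word to its base form, one approach is to look up the base form first and pass it on to the query. The following is a minimal sketch, assuming the predicates behave as described above. The name search_base/4 is a hypothetical helper and is not part of the words module.

```plang
:- import(words).

// Hypothetical helper: reduce Word to its base form, then run the query.
search_base(Word, PartOfSpeech, Query, Result)
{
    // Fails if Word is not known for PartOfSpeech.
    words::base_form(Word, PartOfSpeech, Base);
    words::search(Base, PartOfSpeech, Query, allsenses, Result);
}
```

With this helper, search_base("eating", verb, hyperptr, Result) would query "eat" rather than "eating". The helper fails when the base form lookup fails, as it does for "bobby" as a verb.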
words::adjective/1, words::adverb/1, words::base_form/3, words::base_forms/3, words::description/5, words::noun/1, words::overview/2, words::search/5, words::verb/1
words::adjective/1
noun_phrase --> det, words::adjective, words::noun.
noun_phrase(np(D,adj(A),n(N))) --> det(D), words::adjective(A), words::noun(N).
A will be unified with the adjective to assist with building a parse tree for the sentence.
words::adjective(animated)      succeeds
words::adjective("Animated")    succeeds
words::adjective(harpsichord)   fails
words::adjective(X)             fails
words::adjective(15)            fails
words::adverb/1
verb_phrase --> words::adverb, words::verb, noun_phrase.
verb_phrase(vp(adv(A),v(V),NP)) --> words::adverb(A), words::verb(V), noun_phrase(NP).
A will be unified with the adverb to assist with building a parse tree for the sentence.
words::adverb(fitfully)      succeeds
words::adverb("Fitfully")    succeeds
words::adverb(harpsichord)   fails
words::adverb(X)             fails
words::adverb(15)            fails
words::base_form/3
PartOfSpeech must be one of the atoms adjective, adverb, noun, or verb, indicating the part of speech to search for.
instantiation_error - Word or PartOfSpeech is a variable.
type_error(atom_or_string, Word) - Word is not an atom or string.
type_error(part_of_speech, PartOfSpeech) - PartOfSpeech is not one of the atoms adjective, adverb, noun, or verb.
words::base_form("eating", verb, Result)   succeeds
words::base_form("eating", verb, "eat")    succeeds
words::base_form("eat", verb, "eat")       succeeds
words::base_form(eat, verb, eat)           succeeds
words::base_form(bobby, verb, Result)      fails
words::base_forms/3
PartOfSpeech must be one of the atoms adjective, adverb, noun, or verb, indicating the part of speech to search for.
instantiation_error - Word or PartOfSpeech is a variable.
type_error(atom_or_string, Word) - Word is not an atom or string.
type_error(part_of_speech, PartOfSpeech) - PartOfSpeech is not one of the atoms adjective, adverb, noun, or verb.
words::base_forms("eating", verb, Result)
words::base_forms("eating", verb, ["eat"])
words::description/5
PartOfSpeech must be one of the atoms adjective, adverb, noun, or verb, indicating the part of speech to search for.
If Sense is the atom allsenses, then all senses will be queried.
instantiation_error - Word, PartOfSpeech, Query, or Sense is a variable.
type_error(variable, Result) - Result is not a variable.
type_error(atom_or_string, Word) - Word is not an atom or string.
type_error(part_of_speech, PartOfSpeech) - PartOfSpeech is not one of the atoms adjective, adverb, noun, or verb.
type_error(word_query, Query) - Query is not an atom corresponding to one of the valid WordNet query types.
type_error(word_sense, Sense) - Sense is not an integer greater than or equal to 1 or the atom allsenses.
words::description("hand", noun, overview, allsenses, Description);
stdout::writeln(Description);

The noun hand has 14 senses (first 8 from tagged texts)

1. (215) hand, manus, mitt, paw -- (the (prehensile) extremity of the superior limb; "he had the hands of a surgeon"; "he extended his mitt")
2. (5) hired hand, hand, hired man -- (a hired laborer on a farm or ranch; "the hired hand fixed the railing"; "a ranch hand")
...
words::noun/1
noun_phrase --> det, words::adjective, words::noun.
noun_phrase(np(D,adj(A),n(N))) --> det(D), words::adjective(A), words::noun(N).
N will be unified with the noun to assist with building a parse tree for the sentence.
words::noun(harpsichord)       succeeds
words::noun("Harpsichord")     succeeds
words::noun("Bobby Fischer")   succeeds
words::noun(fitfully)          fails
words::noun(X)                 fails
words::noun(15)                fails
words::overview/2
instantiation_error - Word is a variable.
type_error(variable, Result) - Result is not a variable.
words::overview("hand", Description);
stdout::writeln(Description);

The noun hand has 14 senses (first 8 from tagged texts)

1. (215) hand, manus, mitt, paw -- (the (prehensile) extremity of the superior limb; "he had the hands of a surgeon"; "he extended his mitt")
2. (5) hired hand, hand, hired man -- (a hired laborer on a farm or ranch; "the hired hand fixed the railing"; "a ranch hand")
...

The verb hand has 2 senses (first 1 from tagged texts)

1. (25) pass, hand, reach, pass on, turn over, give -- (place into the hands or custody of; "hand me the spoon, please"; "Turn the files over to me, please"; "He turned over the prisoner to his lawyers")
2. hand -- (guide or conduct or usher somewhere; "hand the elderly lady into the taxi")
words::search/5
PartOfSpeech must be one of the atoms adjective, adverb, noun, or verb, indicating the part of speech to search for.
Query may be synset, which fetches the members of the WordNet synonym set for Sense that contains Word, or an atom corresponding to one of the valid WordNet query types.
If Sense is the atom allsenses, then all senses will be queried.
instantiation_error - Word, PartOfSpeech, Query, or Sense is a variable.
type_error(variable, Result) - Result is not a variable.
type_error(atom_or_string, Word) - Word is not an atom or string.
type_error(part_of_speech, PartOfSpeech) - PartOfSpeech is not one of the atoms adjective, adverb, noun, or verb.
type_error(word_query, Query) - Query is not synset or an atom corresponding to one of the valid WordNet query types.
type_error(word_sense, Sense) - Sense is not an integer greater than or equal to 1 or the atom allsenses.
words::search("walk", verb, antptr, allsenses, List);
stdout::writeln(List);
words::search("hand", noun, synset, 1, List2);
stdout::writeln(List2);
words::verb/1
verb_phrase --> words::adverb, words::verb, noun_phrase.
verb_phrase(vp(adv(A),v(V),NP)) --> words::adverb(A), words::verb(V), noun_phrase(NP).
V will be unified with the verb to assist with building a parse tree for the sentence.
words::verb(annoy)         succeeds
words::verb("Annoy")       succeeds
words::verb("hand_out")    succeeds
words::verb(harpsichord)   fails
words::verb(X)             fails
words::verb(15)            fails