The complete list of grammar tokens is given in the table below. These tokens are all described in this section except for scope = ‹Routine›, which is postponed to the next.
'‹word›' | that literal word only
noun | any object in scope
held | object held by the actor
multi | one or more objects in scope
multiheld | one or more held objects
multiexcept | one or more in scope, except the other object
multiinside | one or more in scope, inside the other object
‹attribute› | any object in scope which has the attribute
creature | an object in scope which is animate
noun = ‹Routine› | any object in scope passing the given test
scope = ‹Routine› | an object in this definition of scope
number | a number only
‹Routine› | any text accepted by the given routine
topic | any text at all
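To recap, the parser goes through a line of grammar tokens trying to match each against some text from the player's input. Each token that matches must produce one of the following five results:
(a) a single object;
(b) a “multiple object”, that is, a set of objects;
(c) a number;
(d) a “consultation topic”, that is, a collection of words;
(e) no information at all.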
Ordinarily, a single line, though it may contain many tokens, can produce at most two substantial results ((a) to (d)), at most one of which can be multiple (b). (See the exercises below if this is a problem.) For instance, suppose the text “green apple on the table” is parsed against the grammar line:
* multi 'on' noun -> Insert
The multi token matches “green apple” (result: a single object, since although multi can match a multiple object, it doesn't have to), 'on' matches “on” (result: nothing) and the noun token matches “the table” (result: a single object again). There are two substantial results, both objects, so the action that comes out is <Insert apple table>. If the text had been “all the fruit on the table”, the multi token might have resulted in a list: perhaps of an apple, an orange and a pear. The parser would then have generated and run through three actions in turn: <Insert apple table>, then <Insert orange table> and finally <Insert pear table>, printing out the name of each item and a colon before running the action:
>put all the fruit on the table
Cox's pippin: Done.
orange: Done.
Conference pear: Done.
The library's routine InsertSub, which actually handles the action, only deals with single objects at a time, and in each case it printed “Done.”
'‹word›'
This matches only the literal word given, sometimes called
a preposition because it usually is one, and produces no resulting
information. (There can therefore be as many or as few of them on a
grammar line as desired.) It often happens that several prepositions really
mean the same thing for a given verb: for instance “in”,
“into” and “inside” are often synonymous.
As a convenient shorthand, then, you can write a series of prepositions (only) with slashes / in between, to mean “one of these words”. For example:
* noun 'in'/'into'/'inside' noun -> Insert
noun
Matches any single object “in scope”, a
term defined in the next section and which roughly means “visible
to the player at the moment”.
held
Matches any single object which is an immediate possession
of the actor. (Thus, if a key is inside a box being carried by the actor,
the box might match but the key cannot.) This is convenient for two reasons. Firstly, many actions, such as Eat or Wear, only sensibly apply to things being held. Secondly, suppose we have grammar
Verb 'eat' * held -> Eat;
and the player types “eat the banana” while
the banana is, say, in plain view on a shelf. It would be petty of the
game to refuse on the grounds that the banana is not being held.
So the parser will generate a Take action for the banana and then, if the Take action succeeds, an Eat action. Notice that the parser does not just pick up the object, but issues an action in the proper way – so if the banana had rules making it too slippery to pick up, it won't be picked up. This is called “implicit taking”, and happens only for the player, not for other actors.
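As an illustration of the slippery banana, here is a minimal sketch of an object whose own rules veto the implicit Take. The object and its message are hypothetical; only the Take action and the edible attribute are taken from the library.

```inform6
! Hypothetical object: a banana that refuses to be picked up.
Object banana "banana"
  with name 'banana',
       before [;
           ! Printing a string here returns true, which stops the Take action.
           Take: "The banana is far too slippery to pick up.";
       ],
  has  edible;
```

With the grammar Verb 'eat' * held -> Eat; in force, typing “eat the banana” while it lies on a shelf should make the parser try Take first. Because the before rule intercepts that action, the banana is never picked up and no Eat action follows.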
multi
Matches one or more objects in scope. The multi- tokens indicate that a list of one or more objects can go here. The parser works out all the things the player has asked for, sorting out plural nouns and words like “except” in the process. For instance, “all the apples” and “the duck and the drake” could match a multi token but not a noun token.
multiexcept
Matches one or more objects in scope, except that it does
not match the other single object parsed in the same grammar line. This
is provided to make commands like “put everything in the rucksack”
come out right: the “everything” is matched by all of the
player's possessions except the rucksack, which stops the parser from
generating an action to put the rucksack inside itself.
multiinside
Similarly, this matches anything inside the other single
object parsed on the same grammar line, which is good for parsing
commands like “remove everything from the cupboard”.
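The two tokens might be used on grammar lines like the following. This is only a sketch: the verb words 'stow' and 'unload' are hypothetical and assumed not to be defined elsewhere in the game's grammar, while Insert and Remove are the library's own actions.

```inform6
! Hypothetical verb words; Insert and Remove are standard library actions.
Verb 'stow'
    * multiexcept 'in'/'into'/'inside' noun -> Insert;

Verb 'unload'
    * multiinside 'from' noun -> Remove;
```

Under these assumptions, “stow everything in the rucksack” would leave the rucksack out of the list, and “unload everything from the cupboard” would consider only the cupboard's contents.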
‹attribute›
Matches any object in scope which has the given attribute. This is useful for sorting out actions according to context, and perhaps the ultimate example might be an old-fashioned “use” verb:
Verb 'use' 'employ' 'utilise'
    * edible -> Eat
    * clothing -> Wear
    ...
    * enterable -> Enter;
creature
Matches any object in scope which behaves as if living. This normally means having animate: but, as an exceptional rule, if the action on the grammar line is Ask, Answer, Tell or AskFor then having talkable is also acceptable.
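A sketch of an object that takes advantage of this exception follows. The intercom and its reply are hypothetical, and the example assumes that the library consults the life property of a talkable object in the same way as it does for an animate one.

```inform6
! Hypothetical object: not animate, but talkable, so grammar lines for
! Ask, Answer, Tell and AskFor that use the creature token can match it.
Object intercom "intercom"
  with name 'intercom',
       life [;
           Ask, Answer, Tell:
               "The intercom crackles, but nobody replies.";
       ],
  has  talkable static;
```

For any other action whose grammar line uses creature, the intercom would not match, since it lacks animate.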
noun = ‹Routine›
“Any single object in scope satisfying some condition”. When determining whether an object passes this test, the parser sets the variable noun to the object in question and calls the routine. If it returns true, the parser accepts the object, and otherwise it rejects it. For example, the following should only apply to animals kept in a cage:
[ CagedCreature;
    if (noun in wicker_cage) rtrue;
    rfalse;
];
Verb 'free' 'release'
    * noun=CagedCreature -> FreeAnimal;
Only nouns which pass the CagedCreature test are then allowed. The CagedCreature routine can appear anywhere in the source code, though it's tidier to keep it nearby.
scope = ‹Routine›
An even more powerful token, which means “an object
in scope” where scope is redefined specially. You can also choose
whether or not it can accept a multiple object. See §32.
number
Matches any decimal number from 0 upwards (though it rounds off large numbers to 10,000), and also matches the numbers “one” to “twenty” written in English. For example:
Verb 'type' * number -> TypeNum;
causes actions like <TypeNum 504> when the player types “type 504”. Note that noun is set to 504, not to an object. (Meanwhile inp1 is set to 1, indicating that this “first input” is intended as a number: if the noun had been the object which happened to have number 504, then inp1 would have been set to this object, the same as noun.) If you need more exact number parsing, without rounding off, and including negative numbers, see the exercise below.
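An action routine for TypeNum might then read the number straight out of noun. The following is only a sketch: the winning code 504 and both messages are hypothetical.

```inform6
! A sketch of the action routine for the TypeNum action.
! The code 504 and the two messages are hypothetical.
[ TypeNumSub;
    ! Here noun holds the number typed, not an object.
    if (noun == 504) "The keypad chirps and a panel slides open.";
    print_ret "You type ", noun, " but nothing happens.";
];
```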
•
EXERCISE 83
Some games, such as David M. Baggett's game ‘The Legend Lives!’ produce footnotes every now and then. Arrange matters so that these are numbered [1], [2] and so on in order of appearance, to be read by the player when “footnote 1” is typed.
▲
The entry point ParseNumber allows you to provide your own number-parsing routine, which opens up many sneaky possibilities – Roman numerals, coordinates like “J4”, very long telephone numbers and so on. This takes the form
[ ParseNumber buffer length;
    ...returning false if no match is made, or the number otherwise...
];
and examines the supposed ‘number’ held at the byte address buffer, a row of characters of the given length. If you provide a ParseNumber routine but return false from it, then the parser falls back on its usual number-parsing mechanism to see if that does any better.
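As one possible illustration, here is a deliberately small ParseNumber that understands Roman numerals built from i, v and x only. The helper RomanDigit is hypothetical, the input is assumed to arrive in lower case, and no attempt is made to reject malformed numerals such as “iiii”.

```inform6
! Value of a single lower-case Roman digit, or 0 if ch is not one.
[ RomanDigit ch;
    switch (ch) {
        'i': return 1;
        'v': return 5;
        'x': return 10;
    }
    return 0;
];

[ ParseNumber buffer length  i cur nxt total;
    for (i=0 : i<length : i++) {
        cur = RomanDigit(buffer->i);
        if (cur == 0) return false;     ! not a numeral: let the library try
        nxt = 0;
        if (i+1 < length) nxt = RomanDigit(buffer->(i+1));
        ! A smaller digit before a larger one is subtracted, as in "ix".
        if (cur < nxt) total = total - cur;
        else           total = total + cur;
    }
    return total;
];
```

Because the routine returns false on the first character it does not recognise, ordinary input such as “504” still reaches the library's usual number parsing.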
▲▲
Note that ParseNumber can't return 0 to mean the number zero, because 0 is the same as false. Probably “zero” won't be needed too often, but if it is you can always return some value like 1000 and code the verb in question to understand this as 0. (Sorry: this was a poor design decision made too long ago to change now.)
topic
This token matches as much text as possible, regardless of what it says, producing no result. As much text as possible means “until the end of the typing, or, if the next token is a preposition, until that preposition is reached”. The only way this can fail is if it finds no text at all. Otherwise, the variable consult_from is set to the number of the first word of the matched text and consult_words to the number of words. See §16 and §18 for examples of topics being used.
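To show how consult_from and consult_words can be put to work, here is a minimal sketch of an action routine that echoes back whatever the topic token matched. The verb 'mutter' and the Mutter action are hypothetical; WordAddress and WordLength are the library routines giving the position and length of a numbered word in the player's input.

```inform6
! Hypothetical verb and action showing how the matched words can be recovered.
[ MutterSub i j;
    print "You mutter ", consult_words, " word(s): ";
    for (i=consult_from : i<consult_from+consult_words : i++) {
        ! Print word number i, one character at a time.
        for (j=0 : j<WordLength(i) : j++)
            print (char) WordAddress(i)->j;
        print " ";
    }
    new_line;
];

Verb 'mutter' * topic -> Mutter;
```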
‹Routine›
The most flexible token is simply the name of a “general parsing routine”. As the name suggests, it is a routine to do some parsing which can have any outcome you choose, and many of the interesting things you can do with the parser involve writing one. A general parsing routine looks at the word stream using NextWord and wn (see §28) to make its decisions, and should return one of the following. Note that the values beginning GPR_ are constants defined by the library.
GPR_FAIL | if there is no match
GPR_MULTIPLE | if the result is a multiple object
GPR_NUMBER | if the result is a number
GPR_PREPOSITION | if there is a match but no result
GPR_REPARSE | to reparse the whole command from scratch
O | if the result is a single object O
On an unsuccessful match, returning GPR_FAIL, it doesn't matter what the final value of wn is. On a successful match it should be left pointing to the next thing after what the routine understood. Since NextWord moves wn on by one each time it is called, this happens automatically unless the routine has read too far. For example:
[ OnAtorIn;
    if (NextWord() == 'on' or 'at' or 'in') return GPR_PREPOSITION;
    return GPR_FAIL;
];
duplicates the effect of 'on'/'at'/'in', that is, it makes a token which accepts any of the words “on”, “at” or “in” as prepositions. Similarly,
[ Anything;
    while (NextWordStopped() ~= -1) ;
    return GPR_PREPOSITION;
];
accepts the entire rest of the line (even an empty text, if there are no more words on the line), ignoring it. NextWordStopped is a form of NextWord which returns the special value −1 once the original word stream has run out.
If you return GPR_NUMBER, the number which you want to be the result should be put into the library's variable parsed_number.
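A general parsing routine returning a number might look like the following sketch. The token name DozenOrGross and the words it accepts are hypothetical.

```inform6
! Hypothetical token: matches "dozen" or "gross" and yields a number.
[ DozenOrGross w;
    w = NextWord();
    if (w == 'dozen') { parsed_number = 12;  return GPR_NUMBER; }
    if (w == 'gross') { parsed_number = 144; return GPR_NUMBER; }
    return GPR_FAIL;
];
```

Used on a grammar line in place of number, this would hand the action 12 or 144 just as if the player had typed the digits.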
If you return GPR_MULTIPLE, place your chosen objects in the table multiple_object: that is, place the number of objects in multiple_object-->0 and the objects themselves in -->1, …
The value GPR_REPARSE should only be returned if you have actually altered the text you were supposed to be parsing. This is a feature used internally by the parser when it asks “Which do you mean…?” questions, and you can use it too, but be wary of loops in which the parser eternally changes and reparses the same text.
▲
To parse a token, the parser uses a routine called ParseToken. This behaves almost exactly like a general parsing routine, and returns the same range of values. For instance,
ParseToken(ELEMENTARY_TT, NUMBER_TOKEN)
parses exactly as number does: similarly for NOUN_TOKEN, HELD_TOKEN, MULTI_TOKEN, MULTIHELD_TOKEN, MULTIEXCEPT_TOKEN, MULTIINSIDE_TOKEN and CREATURE_TOKEN. The call
ParseToken(SCOPE_TT, MyRoutine)
does what scope=MyRoutine does. In fact ParseToken can parse any kind of token, but these are the only cases which are both useful enough to mention and safe enough to use. It means you can conveniently write a token which matches, say, either the word “kit” or any named set of items in scope:
[ KitOrStuff;
    if (NextWord() == 'kit') return GPR_PREPOSITION;
    wn--;
    return ParseToken(ELEMENTARY_TT, MULTI_TOKEN);
];
•
EXERCISE 84
Write a token to detect small numbers in French, “un”
to “cinq”.
•
EXERCISE 85
Write a token called Team, which matches only against the word “team” and results in a multiple object containing each member of a team of adventurers in a game.
•▲
EXERCISE 86
Write a token to detect non-negative floating-point numbers like
“21”, “5.4623”, “two point oh eight”
or “0.01”, rounding off to two decimal places.
•▲
EXERCISE 87
Write a token to match a phone number, of any length from 1 to 30 digits,
possibly broken up with spaces or hyphens (such as “01245 666
737” or “123-4567”).
•▲▲
EXERCISE 88
(Adapted from code in "timewait.h": see the references
below.) Write a token to match any description of a time of day, such
as “quarter past five”, “12:13 pm”,
“14:03”, “six fifteen” or “seven
o'clock”.
•▲
EXERCISE 89
Code a spaceship control panel with five sliding controls, each
set to a numerical value, so that the game looks like:
>look
Machine Room
There is a control panel here, with five slides, each of which can
be set to a numerical value.
>push slide one to 5
You set slide one to the value 5.
>examine the first slide
Slide one currently stands at 5.
>set four to six
You set slide four to the value 6.
•▲
EXERCISE 90
Write a general parsing routine accepting any amount of text, including
spaces, full stops and commas, between double-quotes as a single token.
•
EXERCISE 91
On the face of it, the parser only allows two parameters to an action, noun and second. Write a general parsing routine to accept a third. (This is easier than it looks: see the specification of the NounDomain library routine in §A3.)
•
EXERCISE 92
Write a token to match any legal Inform decimal, binary or hexadecimal constant (such as -321, $4a7 or $$1011001), producing the correct numerical value in all cases, while not matching any number which overflows or underflows the legal Inform range of −32,768 to 32,767.
•
EXERCISE 93
Add the ability to match the names of the built-in Inform constants true, false, nothing and NULL.
•
EXERCISE 94
Now add the ability to match character constants like '7', producing the correct character value (in this case 55, the ZSCII value for the character ‘7’).
•▲▲
EXERCISE 95
Next add the ability to match the names of attributes, such as edible, or negated attributes with a tilde in front, such as ~edible. An ordinary attribute should parse to its number, a negated one should parse to its number plus 100. (Hint: the library has a printing rule called DebugAttribute which prints the name of an attribute.)
•▲▲
EXERCISE 96
And now add the names of properties.
•
REFERENCES
Once upon a time, Andrew Clover wrote a neat library extension called
"timewait.h" for parsing times of day, and allowing
commands such as “wait until quarter to three”. L. Ross
Raszewski, Nicholas Daley and Kevin Forchione each tinkered with and
modernised this, so that there are now also "waittime.h"
and "timesys.h". Each has its merits.