HELP LEXICAL                                    Robert Duncan, June 1991

The lexical rules of PML.


    CONTENTS - (Use <ENTER> g to access required sections)

 -- Acknowledgements
 -- Identifiers (id, longid)
 -- Special Constants (scon)
 -- Reserved Words
 -- Comments


-- Acknowledgements ---------------------------------------------------

Much of this file is taken from the Definition of Standard ML referenced
in HELP * PML.


-- Identifiers (id, longid) -------------------------------------------

An identifier is either alphanumeric or symbolic.

An alphanumeric identifier  is any sequence  of letters, digits,  primes
(') or underscores  (_) starting with  a letter or  a prime; a  symbolic
identifier is any sequence of the following characters:

    ! % & $ # + - / : < = > ? @ \ ~ ` ^ | *

A long identifier is an  identifier optionally prefixed with a  sequence
of structure identifiers separated by  dots (.), e.g. the following  are
all long identifiers:

    x
    S.x
    T.S.x

A long identifier is a  single lexical item, so  there can be no  spaces
surrounding the separating dot characters.

Identifiers are grouped into 9 syntactic classes:

    Value variables         (var)       long
    Value constructors      (con)       long
    Exception constructors  (exncon)    long
    Type variables          (tyvar)
    Type constructors       (tycon)     long
    Record labels           (lab)
    Structure identifiers   (strid)     long
    Signature identifiers   (sigid)
    Functor identifiers     (funid)

Of these, those marked "long" allow long forms as described above.

The class of type variables is lexically distinct from all other classes
in that any identifier starting with a prime is a type variable, and  no
other identifier is a type variable. The syntactic class of any non-type
variable identifier can be determined by the parser.

The identifiers "*"  and "=" are  special: "*" may  not be rebound  as a
type constructor and "="  may not be rebound  at all. This removes  some
potential ambiguities from the syntax.

The class of record labels also includes the numeric labels

    1, 2, 3, ...


-- Special Constants (scon) -------------------------------------------

A special constant  may be  an integer  constant, a  real constant  or a
string constant.

An integer  constant is  any non-empty  sequence of  digits,  optionally
preceded by a negation symbol (~).

Examples:

    ~1234  ~1  0  1  1234


A real constant is an integer  constant, possibly followed by a  decimal
point (.) and a  non-empty sequence of digits,  possibly followed by  an
exponent symbol (E) and an integer constant. Either the decimal point or
the exponent symbol must  be present, so that  no integer constant  is a
real constant.

Examples:

    ~1.0  3.141593  1E6  1.88E~56


A string constant is  a sequence of zero  or more printable  characters,
escape sequences or spaces written  inside double quotes ("). An  escape
sequence is  introduced by  the escape  character (\)  and stands  for a
single character. Legal escape sequences are:

    \n          new-line character
    \t          tab character
    \^c         control character for any appropriate character c
    \ddd        the character with 3-digit ASCII code ddd (0 - 255)
    \"          "
    \\          \

In  addition,  any  sequence  of  white-space  characters  (space,  tab,
newline,  carriage-return   and   form-feed)  written   between   escape
characters is ignored; this allows  strings to be written over  multiple
lines. Any other use of the escape character is incorrect.

Examples:

    ""  "hello world\n"  "bell = \^G"  "this string extends\
                                       \over two lines"


-- Reserved Words -----------------------------------------------------

The following is a complete list of the reserved words of Standard ML:

    abstype and andalso as case do datatype else end eqtype exception fn
    fun functor handle if in include infix infixr let local nonfix of op
    open orelse raise rec sharing sig signature struct structure then
    type val with withtype while

    ( ) [ ] { } , : ; ... _ | = => -> #

In addition, Poplog ML reserves the following words:

    abstraction external pervasive


-- Comments -----------------------------------------------------------

In Standard ML, a comment is a character sequence written inside comment
brackets (* *) in which comment brackets are properly nested. Poplog  ML
also allows  end-of-line  comments, introduced  by  the symbol  ;;;  and
terminated by the next newline.


--- C.all/pml/help/lexical
--- Copyright University of Sussex 1991. All rights reserved. ----------
