This section gives an informal account of the lexical conventions used in writing Better Scheme programs.
Upper and lower case forms of a letter are always remembered (except within variable references) but never distinguished. For example, Foo is the same identifier as FOO, and #\xff is the same character as #\XFF.
Most identifiers allowed by other programming languages are also acceptable in Better Scheme. The precise rules for forming identifiers vary among implementations of Better Scheme, but in all implementations a sequence of letters, digits, and "extended alphabetic characters" is an identifier. In addition, +, -, and ... are identifiers. Here are some examples of identifiers:
lambda
q
list->vector
soup
+
V17a
<=?
a34kTMNs
42
5ducks
the-word-recursion-has-many-meanings
Extended alphabetic characters may be used within identifiers as if they were letters. The following are extended alphabetic characters:
! $ % & * + - . / : < = > ? @ ^ _ ~ |
Identifiers have three interpretations within Better Scheme programs:
Whitespace characters are spaces, tabs and newlines (implementations typically provide additional whitespace characters such as page break). Whitespace is used for improved readability and is necessary to separate tokens from each other, a token being an indivisible lexical unit such as an identifier or number, but is otherwise insignificant. Whitespace may occur between any two tokens, but not within a token. Whitespace may also occur inside a string, where it is significant.
A semicolon ";" indicates the start of a comment. The comment continues to the end of the line on which the semicolon appears. Comments are invisible to Scheme, but the end of the line is visible as whitespace. This prevents a comment from appearing in the middle of an identifier or number.
;; The fact function computes the factorial
;; of a non-negative integer.
(define fact
  (lambda (n)
    (if (= n 0)
        1        ;Base case - return 1
        (* n (fact (- n 1))))))
A literal is an identifier or other syntax which directly represents a Better Scheme entity, rather than a variable which refers to a location containing some entity. When literals are encountered in the program text, the value they represent is substituted for them. When one of those values is displayed, it is generally represented as the literal corresponding to it.
The two boolean literals are written #t and #f; however, all standard conditional functions treat all entities as true (including null) except for the literal #f. The phrase "a true value" (or sometimes just "true") means any entity treated as true by the conditional functions, and the phrase "a false value" (or "false") means any entity treated as false by the conditional functions.
True is a macro which evaluates to its first argument. The second argument is ignored and not evaluated. The two arguments may be curried.
literal macro: (#f <true-expression> <false-expression>)
False is a macro which evaluates to its second argument. The first argument is ignored and not evaluated. The two arguments may be curried.
A list is a sequence of expressions enclosed by parentheses and separated by spaces. The standard semantics of Scheme evaluate lists as invocations (see section 4.3 Invocation). A list can be included as a literal by quoting it (see section 4.2 Literal Expressions).
literal: (<expression1> <expression> ... . <tail-expression>)
An improper list, that is, a list whose tail is not a list, may be formed by use of the cons dot. The cons dot is placed before the last element of the list and separated from any surrounding identifiers by whitespace, lest it be taken as part of an identifier. The last element is then taken as the tail of the list. The value after the cons dot can itself be a list, so the expression "'(1 . (2 . (3 . ())))" evaluates to (1 2 3).
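The cons-dot notation can be sketched as follows. This is only an illustration: it assumes that Better Scheme provides the usual car, cdr, and equal? procedures with their standard Scheme meanings, and the names dotted-pair and chained are hypothetical.

```scheme
;; Assumption: car, cdr and equal? behave as in standard Scheme.
;; The names dotted-pair and chained are hypothetical.

;; An improper list: the tail after the cons dot is 2, which is not a list.
(define dotted-pair '(1 . 2))

;; Every tail here is itself a list, so this denotes the proper list (1 2 3).
(define chained '(1 . (2 . (3 . ()))))

(display (car dotted-pair))           ; the element before the cons dot
(newline)
(display (cdr dotted-pair))           ; the tail that followed the cons dot
(newline)
(display (equal? chained '(1 2 3)))   ; compares the two written forms
(newline)
```

Note the whitespace on both sides of each cons dot; without it the dot would be read as part of a neighbouring token.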
A pound sign followed by a backslash and a character is the character literal for that character. This is case sensitive.
literal: #\xFF
A pound sign followed by a backslash, the letter x, and two to four hexadecimal digits is the character literal for the corresponding ASCII or Unicode character. The three-digit form is taken to have an implicit leading zero to represent a Unicode character.
literal: #\space
The character literal for the space character.
literal: #\tab
The character literal for the tab character.
literal: #\newline
The character literal for the newline character.
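A short sketch of the character literal forms described above. It assumes that the standard char=? and char->integer procedures are available and that character codes follow ASCII; the variable names are hypothetical.

```scheme
;; Assumption: char=? and char->integer behave as in standard Scheme,
;; and character codes follow ASCII. All names here are hypothetical.

(define lower-a #\a)
(define upper-a #\A)
(define hex-a   #\x41)   ; hexadecimal 41 is the code for A in ASCII

;; The plain character form is case sensitive, so these two differ.
(display (char=? lower-a upper-a))
(newline)

;; The hexadecimal form names the same character as the plain form.
(display (char=? hex-a upper-a))
(newline)

;; The case of the hexadecimal digits does not matter.
(display (char=? #\x4a #\x4A))
(newline)

;; The named literal for the space character.
(display (char->integer #\space))
(newline)
```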
A string is a sequence of characters enclosed in double quotes. Escape sequences allow string literals to contain non-printing characters or double quotes. Within a string literal the following (case-sensitive) escape sequences can be used:
\n | Newline
\r | Carriage Return
\0 | Null
\t | Tab
\\ | Backslash
\" | Double Quote
\xFF | ASCII literal expressed in hexadecimal
\uFFFF | Unicode literal expressed in hexadecimal
A backslash followed by any other sequence of characters is not treated as an escape sequence.
Note: the hexadecimal digits in the ASCII and Unicode escape codes are case-insensitive.
Note: the characters 'f' and 'b' are reserved for future escape sequences. It is strongly recommended that backslashes appearing before these characters be escaped.
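The escape sequences can be illustrated with a few string literals. This sketch assumes that the standard string-length procedure is available; the variable names are hypothetical.

```scheme
;; Assumption: string-length behaves as in standard Scheme.
;; All names here are hypothetical.

(define tabbed "a\tb")             ; three characters: a, tab, b
(define quoted "say \"hi\"")       ; the inner double quotes are escaped
(define backslashed "C:\\files")   ; the backslash before f is escaped,
                                   ; as recommended for reserved characters

(display (string-length tabbed))
(newline)
(display (string-length quoted))
(newline)
(display (string-length backslashed))
(newline)
```

Each escape sequence contributes a single character to the string, so the lengths count the tab, the quotes, and the backslash once each.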
A vector is a sequence of expressions enclosed by square brackets and separated by spaces. The standard semantics of Scheme evaluate vectors as invocations (see section 4.3 Invocation). A vector can be included as a literal by quoting it (see section 4.2 Literal Expressions).
The syntax of the written representations for numbers is described formally in section 7.1.1 Lexical structure. Note that case is not significant in numerical constants.
A number may be written in binary, octal, decimal, or hexadecimal by the use of a radix prefix. The radix prefixes are '#b' (binary), '#o' (octal), '#d' (decimal), and '#x' (hexadecimal). With no radix prefix, a number is assumed to be expressed in decimal.
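As an illustration, the same number can be written with each radix prefix. The names below are hypothetical, and the sketch assumes the standard = procedure accepts several arguments.

```scheme
;; Assumption: = behaves as in standard Scheme.
;; All names here are hypothetical.

(define from-binary  #b1010)   ; binary
(define from-octal   #o12)     ; octal
(define from-decimal #d10)     ; decimal, stated explicitly
(define from-hex     #xA)      ; hexadecimal
(define no-prefix    10)       ; no prefix, so decimal is assumed

;; All five constants denote the same number.
(display (= from-binary from-octal from-decimal from-hex no-prefix))
(newline)
```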
A numerical constant may be specified to be either exact or inexact by a prefix. The prefixes are '#e' for exact, and '#i' for inexact. An exactness prefix may appear before or after any radix prefix that is used. If the written representation of a number has no exactness prefix, the constant may be either inexact or exact. It is inexact if it contains a decimal point or an exponent, otherwise it is exact.
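The exactness rules can be sketched as follows, assuming the standard exact? and inexact? predicates are available. The variable names are hypothetical.

```scheme
;; Assumption: exact? and inexact? behave as in standard Scheme.
;; All names here are hypothetical.

(define plain-integer  42)       ; no point or exponent: exact
(define with-point     42.0)     ; decimal point: inexact
(define with-exponent  42e0)     ; exponent: inexact
(define forced-exact   #e42.0)   ; #e overrides the decimal point
(define forced-inexact #i42)     ; #i overrides the default for integers

;; The exactness prefix may come before or after the radix prefix.
(define radix-first     #x#e2A)
(define exactness-first #e#x2A)

(display (exact? plain-integer))
(newline)
(display (inexact? with-point))
(newline)
(display (inexact? with-exponent))
(newline)
(display (exact? forced-exact))
(newline)
(display (inexact? forced-inexact))
(newline)
(display (= radix-first exactness-first))
(newline)
```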
In systems with inexact numbers of varying precisions it may be useful to specify the precision of a constant. For this purpose, numerical constants may be written with an exponent marker that indicates the desired precision of the inexact representation. The letters 's', 'f', 'd', and 'l' specify the use of short, single, double, and long precision, respectively. (When fewer than four internal inexact representations exist, the four size specifications are mapped onto those available. For example, an implementation with two internal representations may map short and single together and long and double together.) In addition, the exponent marker 'e' specifies the default precision for the implementation. The default precision has at least as much precision as double, but implementations may wish to allow this default to be set by the user.
3.14159265358979F0  =>  3.141593          ;Round to single
0.6L0               =>  .600000000000000  ;Extend to long
Just as null is a value representing the absence of a string pair, void is a value representing the absence of a function.
literal macro: (#void <expression> ...)
Void is a macro which evaluates to itself and does not evaluate its arguments.
Beyond whitespace, comments, identifiers, and literals, there are a few other lexical considerations in Better Scheme.
The evaluative operator (see section 4.4 Evaluative Operator) is the comma. It may be placed in front of any expression to form a new expression which is the evaluation of the first. Since expressions in many contexts are evaluated anyway, this can cause a "double evaluation."
An expression may be quoted by placing a single quote before it, thereby forming a literal expression (see section 4.2 Literal Expressions).
The following characters are reserved for possible future extension to the language.
{ } |