pure_LISP reference manual
This document describes pure_LISP language, implemented on an modified
SECD machine.
Use also the Tutorial - it provides some info
about installation and first steps in the package.
1. Data types, syntax, functions.
Data types in pure_LISP:
Terminal data types:
- nil - represents empty list
- integer - represents 32-bit integer value (the codes
of SECD machine are also represented by
integers).
- atom - used for variable/function names. Atom names
are used also as short strings.
Composite data types:
- node - used to represent pair (a.b) where a and b are
some data structures.
List (x1 x2 ... xn) is represented as (x1.(x2.(...(xn.nil)...)).
- non-functional objects - used for I/O and communications.
They are variants of the node (subtypes of node type):
- pipe - for interprocess communications.
- stream - represents I/O streams of lexemes/characters.
- connwait - for TCP connection in process of establishing
the connection.
Integer 0 is used for true logical value, while 1
represents false. Atom nil is the same as () and represents
empty list.
Syntax is as simple as possible:
The only special symbols are ( ) . SPACE. CR and LF are treated
as SPACE.
Lexeme is a sequence of non-special symbols.
Example:
Hello 234 -128 a b nil #ld_0_1 are proper lexemes.
Expressions are:
- terminal expressions - simple lexemes
- node expressions - if e1 and e2 are expressions, (e1.e2) is
an expression.
- lists - (e1 ... en) is a list of the expressions e1, e2 ...
en.
Functions in pure_LISP:
Both data and programs in pure_LISP are represented by the expressions
!
The program is simply a function call. Each function call have the form:
(function arg1 arg2 ... argn)
A function may be defined by the compiler (main function) or by the user.
The main function lambda defines new user functions. The process of
program execution in LISP is called evaluation.
Evaluations in pure_LISP are lazy (non-strict or call-by-need designations
are used). Shortly in lazy evaluation model the evaluation process return
formulas for more complex expressions, instead of values. The actual evaluation
is done only if access to some details of the complex expression is needed.
This model allows programming with infinite data structures and defining
user functions, receiving semi-correct parameters, for example:
(letrec ...
(or lambda(e1 e2)
(if e1 0 e2))
)
In strict evaluation model usage of above-mentioned user defined
function or will cause an error in expression (or (eq l nil)
(eq (car l) 0)) if l is an empty list, while in lazy
evaluations e2 will not be evaluated at all.
The rules for lazy evaluations are defined in the compiler.lib source.
Shortly, evaluation of the parameters of all user functions are delayed,
the parameters of main function cons are delayed as well.
First parameter of if and all parameters of all main primitives
except cons are forced. The evaluation of the body of user function
is forced. let and letrec are treated as user functions.
2. Main functions (functions, defined by the compiler).
Primitive functions:
Arithmetic functions:
- (add i j) evaluates sum i+j
- (sub i j) evaluates difference i-j
- (mul i j) evaluates product i*j
- (div i j) evaluates (i div j)
- (rem i j) evaluates remainder (i mod j)
- (inc i) evaluates i+1
- (dec i) evaluates i-1
Predicate functions:
- (eq e1 e2) returns 0 (true) if e1 is the same
as e2
- (leq i j) returns 0 if i<=j
- (lexleq a1 a2) returns 0 if atom a1 is lexicaly
less or equal to atom a2
- (node exp) returns 0 if exp is a node (a non empty
list / pair)
- (null exp) returns 0 if exp is nil (an empty list)
- (integer exp) returns 0 if exp is an integer
- (atom exp) returns 0 if exp is an atom
List/atom functions:
- (cons e1 e2) returns new node (e1.e2).
If e2 is a list (E2 ... EN), the result may be treated as a new
list (e1 E2 ... EN).
- (car e) returns left part of a node (head element
of a list).
- (cdr e) returns right part of a node (tail of
a list).
- (chr i) returns an atom, which name is one character
simbol with ASCII code i.
- (concat a1 a2) returns atom which name is a concatenation
of the names of atoms a1 and a2.
Other primitives:
- (fail e) kill current process.
- (stat e) returns e. Side effect: it prints statistic
about SECD machine in secd.log file.
Special case (stat 1) halts the SECD machine interpreter. I use this
form in speed test scripts, located in subdirectory test.
- (fopen fmod fn) opens the file named by atom fn
and returns stream object.
If fmod is 0, file is opened for input. If fmod is 1, the file
is erased and opened for output.
If fn is nil, the stdin/stdout is associated with the resulting
stream.
- (fclose f) Side effect: it closes the file, represented
by stream f. If f represents TCP connection,
fclose lives the socket, associated with it (in this case you must close
2 streams - (car socket) and (cdr socket)).
- (eof f) detects end_of_file state for stream f.
If f is a normal input file, eof returns true if
there is no more bytes to be read. If f represents a TCP connection,
eof returns true if the connection is broken or closed by the
remote end.
- (connect ip p) tries to establish new TCP connection
to host with IP address ip at port p and returns connwait object.
- (listen maxcon p) starts new TCP listener at port p.
maxcon is the maximum connections allowed for this
new server. listen returns connwait object.
All primitive functions have their correspoding code in SECD architecture. They all have 1 or 2 parameters.
Except cons, all are compiled to force evaluation of their parameters.
Cons delay the evaluation of the parameters.
The other group of main functions may be called special functions (for
control, i/o, process control and other purpouses). They all have specific
semantics.
Special functions:
- (if cond eT eF)
if is a function for conditional evaluation.
First the expression cond is evaluated. If cond is 0, the expression
eT will be returned as a result of if, else eF will be returned.
- (case i e0 ... ek eF)
case is a more complex variant of if.
First the expression i is evaluated. If i
is in range from 0 to k, the expression ei will be returned
as a result of case, else eF will be returned.
- (quote e)
This function returns the parameter e without evaluating
it.
It is used in all cases where you want to treate e as a constant,
not as a formula.
- (lambda args exp) returns a function definition.
args is a list of formal parameters for this
function, exp describes how to evaluate it.
The functions are evaluated in their lexical scope.
- (let exp (V1.e1) (V2.e2) ... (VN.eN))
let is a shortening for the expression ((lambda (V1
V2 ... VN) exp) e1 e2 ... eN)
If LS is the lexical scope for let, first the expressions e1, e2
... eN are delayed and binded with corresponding variables V1, V2, ...
VN.
At last, the expression exp is evaluated in the new scope ((V1
V2 ... VN).LS).
- (letrec exp (V1.e1) (V2.e2) ... (VN.eN))
letrec is similar to let with the difference that e1, e2 ... eN
are delayed in the new scope ((V1 V2 ... VN).LS).
This allow definitions of groups of self-refferencing recursive
functions.
Such recursion is the main control mechanism in LISP programming !
- (do V e1 e2)
In pure functional programming there is no way to determine what
is the order of evaluation of two expressions.
Input/output operations are non-functional by nature and the order
of evaluations is important for them.
do is a function, organising sequence of actions.
It forces the evaluation of e1, then binds the returned
value with variable V and at last evaluates e2
in the scope ((V).LS) where LS is the lexical scope of do.
- (seq e1 e2)
seq is similar to do but is simpler and faster.
It forces the evaluation of e1, then evaluates e2.
Value of e1 can't be used in e2. seq
is useful in cases where only side effect of e1 is significant.
- (send msg io) send message msg to
input/output object io (pipe or stream).
If io is a pipe, msg may be of any type. If it is a stream, msg
must be of terminal type (atom, integer or nil). In this case msg is converted
to lexeme and is written to output stream.
Send may change the running process. Current process is saved in
some pending queue and other process is restored (msg handler or other
process). For details see the Implementation page.
- (recv io) is opposite to send. It receives a
mesage from io and return it as a result.
Additional allowed io type is connwait. In this case the result
is an established new connection - TCP socket, represented by node of
the form (instream.outstream). If connwait object is generated by the
connect function, just 1 socket may be received. If connwait is generated
by listen, many sockets may be received.
If there is no ready msg, current process is saved and other process
is restored.
- (sendc i stream)
While send writes lexemes to stream, sendc writes single character
with ascii code i.
- (recvc stream)
Opposite to sendc. Return ascii code of the received character.
- (fork e1 e2)
New process is created. Current process will evaluate the exspression
e1. The new process will evaluate the expression e2.
- (inLib LibName exp)
This function allows compilation in previously defined lexical
scopes.
LibName.lib is the file name of some library, having syntax like
this:
(letrec user_program (V1.e1) ... (VN.eN))
Substitution is made in the text of the library - exp replaces
atom user_program and resulting expression is compiled.
- (env)
This function returns the node (LS.E) where LS is the lexical scope
of the function and E is the environment (closure) at the moment of evaluation.
env is used for dinamic linking in miniOS, conjucted
with pure_LISP.
In current version of miniOS env is used to generate variable
self, representing RTL lexical scope and the corresponding
closure.
Functions, defined by the compiler are accesible from all programs.
If your program will work alone (not in miniOS shell) it must use only
these functions and functions, defined in the program !
3. Run Time Library (RTL) functions and variables.
If you use miniOS, some additional functions and variables are accesible.
They are defined in run time library and are writen in pure_LISP.
RTL consists of many sources - system.lib, btree.lib, sysfun.lst,
errors.lst, compiler.lib and miniOS.lsp.
Variables input and output are defined in
miniOS to represent standard input and output stream for current shell account.
Variable syspipe is used to organise breakable compilation,
allowing syntax error handling by the compiler.
It is used in the same maner for runtime error handling.
Functions in system.lib:
- (member e l) returns 0 (true) if e is a member
of list l.
- (posn e l) returns position where e is occured
in list l.
- (mapcar f l) applies one-parameter function f
to all elements of list l and returns resulting list.
- (sysmsg msg) writes msg on system
console and continues the session.
- (f2name p s) p and s must be atoms. returns concatenation
p|.|s
Example: (f2name (quote test) (quote lisp)) --> atom with
name 'test.lisp'
- (read f) Reads expression from stream f.
- (read_list f) Internaly used by read.
- (ReadSFrom fn) Reads expression from file named
fn.
- (printf e f st) Prints expression e
to stream f with stylet st.
If st is nil, e is written without any formating.
If st is 'lisp' e is ordered to be more readable as a lisp program.
If st is 'secd' e is ordered to be more readable as a SECD code.
- (print e st) is same as (printf e output st).
- (WriteSTo e st fn) Writes expression e to file
named fn with style st.
- (force e) Forces evaluation for all parts of e.
pure_LISP evaluations are lazy and this function guarantee that
expression e will be fully evaluated.
The only place I use this function is when SECD code is generated.
It is usable only in cases where e doesn't contain infinite parts !
File sysfun.lst contains a list of records, representing some info
about the compiler functions.
The fields in the file have the following semantics:
- name is the name of the function
- type is 0 for primitives and 1 for special functions
- pcnt is nil if function has variable count of args
or is the number of its formal parameters.
- code is SECD code for primitive functions or compiler
index in case statement for special functions.
File errors.lst contains a list of sentences, giving some info
about the run time errors.
Functions in btree.lib are part of the compiler, but some of them
are useful utilities:
- (btree l) creates binary tree from list l,
containing nodes of the form (a.v) where a is
an atom and v is a value, associated with a.
- (subst l p) replaces elements of l
in place of * in pattern list p and returns the result.
- (bsearch bt p) searchs in binary tree bt
for value associated with the atom p. If p is
not found, nil is returned.
- (rev0 a l) reverse elements of list l
into cumulative list a and returns the result.
- (count l i) returns i increased with
count of the elements of list l. If l is not
a list, nil is returned.
Functions in compiler.lib:
- (comp e n c) is the main compiler function.
e is expression source, n is
lexical scope in which e will be compiled, and c
is cumulative variable for resulting SECD code.
Returns SECD code for evaluation of the expression e.
- c_tree is a binary tree, created from the file sysfun.lst.
- There are more than 15 additional functions in compiler.lib,
but all they are recursively used by comp and I do not suggest
you to use them.
Functions in miniOS.lsp:
- self is variable with value (LS.E), where LS represents
lexical scope and E represent closure of the RTL itself.
- (comp_lsp fn) compiles file fn.lsp to file fn.sec
in lexical scope of the RTL.
- (comp_lex fn s) compiles file fn.lsp to file fn.sec
in lexical scope s.
- (comp_self) compiles all miniOS files in their proper
scopes.
It is beter to make backup copy of all miniOS files before using
this function.
- (comp_all) compiles all system software, including miniOS,
writen on pure_LISP and microkernel, writen in strict_LISP.
- (eval e) compiles expression e in RTL scope and
evaluates the resulting SECD code.
- (exec fn) reads SECD code from file fn.sec and
evaluates it in RTL closure.
- (sh_loop) is the main shell loop. First the check is made
for closed connection. Then sh_loop saves the current state of the
shell process, forking in 2 subprocesses. The first waits from syspipe
the result of the simple shell step. If this result is not nil, this is indication
for error and the info for this error is printed, then sh_loop is started
again. The second subprocess sends to syspipe the result of
(sh_step) function (evaluation of the simple shell step), then
dies. This mechanism allow breakable evaluation of simple shell step and error
handling.
- (sh_step) organises the shell step actions - writes the
prompt, reads from input stream the user expression, compile
and evaluates it and prints the result to output stream.
- (show_err state) prints info for run time errors.
(c) Skelet, Sep 2002 in terms of GNU GPL
Mail me at home or work