mirror of
https://github.com/janet-lang/janet
synced 2025-01-25 14:46:52 +00:00
Move doc to wiki.
parent
68895e27d4
commit
c2199646be
35
README.md
@ -3,12 +3,13 @@
|
|||||||
[![Build Status](https://travis-ci.org/bakpakin/dst.svg?branch=master)](https://travis-ci.org/bakpakin/dst)
|
[![Build Status](https://travis-ci.org/bakpakin/dst.svg?branch=master)](https://travis-ci.org/bakpakin/dst)
|
||||||
[![Appveyor Status](https://ci.appveyor.com/api/projects/status/32r7s2skrgm9ubva?svg=true)](https://ci.appveyor.com/project/bakpakin/dst)
|
[![Appveyor Status](https://ci.appveyor.com/api/projects/status/32r7s2skrgm9ubva?svg=true)](https://ci.appveyor.com/project/bakpakin/dst)
|
||||||
|
|
||||||
Dst is a functional and imperative programming language and bytecode interpreter. The syntax
|
Dst is a functional and imperative programming language and bytecode interpreter. It is a
|
||||||
resembles lisp (and the language does inherit a lot from lisp), but lists are replaced
|
modern lisp, but lists are replaced
|
||||||
by other data structures with better utility and performance (arrays, tables, structs, tuples).
|
by other data structures with better utility and performance (arrays, tables, structs, tuples).
|
||||||
The language can also easily bridge to native code, and supports abstract datatypes
|
The language can also easily bridge to native code written in C, and supports abstract datatypes
|
||||||
for interfacing with C. It also supports metaprogramming with macros.
|
for interfacing with C. It also supports metaprogramming with macros, and bytecode assembly for the
|
||||||
The bytecode vm is a register based vm loosely inspired by the LuaJIT bytecode format.
|
dst abstract machine. The bytecode vm is a register based vm loosely inspired by the LuaJIT
|
||||||
|
bytecode format, but simpler and safer (bytecode can be verified by the assembler).
|
||||||
|
|
||||||
There is a repl for trying out the language, as well as the ability
|
There is a repl for trying out the language, as well as the ability
|
||||||
to run script files. This client program is separate from the core runtime, so
|
to run script files. This client program is separate from the core runtime, so
|
||||||
@ -22,6 +23,9 @@ There is not much in the way of documentation yet because it is still a "persona
|
|||||||
I don't want to freeze features prematurely. You can look in the examples directory, the test directory,
|
I don't want to freeze features prematurely. You can look in the examples directory, the test directory,
|
||||||
or the file `src/compiler/boot.dst` to get a sense of what dst code looks like.
|
or the file `src/compiler/boot.dst` to get a sense of what dst code looks like.
|
||||||
|
|
||||||
|
For syntax highlighting, there is some preliminary vim syntax highlighting in [dst.vim](https://github.com/bakpakin/dst.vim).
|
||||||
|
Generic lisp syntax highlighting should provide good results, however.
|
||||||
|
|
||||||
## Features
|
## Features
|
||||||
|
|
||||||
* First class closures
|
* First class closures
|
||||||
@ -30,7 +34,7 @@ or the file `src/compiler/boot.dst` to get a sense of what dst code looks like.
|
|||||||
* Mutable and immutable arrays (array/tuple)
|
* Mutable and immutable arrays (array/tuple)
|
||||||
* Mutable and immutable hashtables (table/struct)
|
* Mutable and immutable hashtables (table/struct)
|
||||||
* Mutable and immutable strings (buffer/string)
|
* Mutable and immutable strings (buffer/string)
|
||||||
* Lisp Macros
|
* Lisp Macros (Code is Data, Data is Code)
|
||||||
* Byte code interpreter with an assembly interface, as well as bytecode verification
|
* Byte code interpreter with an assembly interface, as well as bytecode verification
|
||||||
* Proper tail calls.
|
* Proper tail calls.
|
||||||
* Direct interop with C via abstract types and C functions
|
* Direct interop with C via abstract types and C functions
|
||||||
@ -38,11 +42,17 @@ or the file `src/compiler/boot.dst` to get a sense of what dst code looks like.
|
|||||||
* Lexical scoping
|
* Lexical scoping
|
||||||
* Imperative Programming as well as functional
|
* Imperative Programming as well as functional
|
||||||
* REPL
|
* REPL
|
||||||
|
* Interactive Environment
|
||||||
|
|
||||||
|
## Documentation
|
||||||
|
|
||||||
|
API documentation and design documents can be found in the
|
||||||
|
[wiki](https://github.com/bakpakin/dst/wiki).
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
|
|
||||||
A repl is launched when the binary is invoked with no arguments. Pass the -h flag
|
A repl is launched when the binary is invoked with no arguments. Pass the -h flag
|
||||||
to display the usage information.
|
to display the usage information. Individual scripts can be run with `./dst myscript.dst`.
|
||||||
|
|
||||||
```
|
```
|
||||||
$ ./dst
|
$ ./dst
|
||||||
@ -62,12 +72,6 @@ Options are:
|
|||||||
$
|
$
|
||||||
```
|
```
|
||||||
|
|
||||||
## Documentation
|
|
||||||
|
|
||||||
API documentation and design documents will be added to the `doc` folder as they are written.
|
|
||||||
As of March 2018, specifications are sparse because dst is evolving. Check the doc folder for
|
|
||||||
an introduction of Dst as well as an overview of the bytecode format.
|
|
||||||
|
|
||||||
## Compiling and Running
|
## Compiling and Running
|
||||||
|
|
||||||
Dst can be built with Make or CMake.
|
Dst can be built with Make or CMake.
|
||||||
@ -104,8 +108,3 @@ make run
|
|||||||
## Examples
|
## Examples
|
||||||
|
|
||||||
See the examples directory for some example dst code.
|
See the examples directory for some example dst code.
|
||||||
|
|
||||||
## Editor
|
|
||||||
|
|
||||||
There is some preliminary vim syntax highlighting in [dst.vim](https://github.com/bakpakin/dst.vim).
|
|
||||||
Generic lisp syntax highlighting should provide good results, however.
|
|
||||||
|
236
doc/bytecode.md
@ -1,236 +0,0 @@
|
|||||||
# Dst Bytecode Reference
|
|
||||||
|
|
||||||
This document outlines the Dst bytecode format, and core ideas in the runtime
|
|
||||||
that are closely related to the bytecode. It should enable the reader
|
|
||||||
to write dst assembly code and hopefully understand the dst internals better.
|
|
||||||
It will also talk about the C abstractions used to implement some of these ideas.
|
|
||||||
Some experience with basic computer organization is helpful for understanding
|
|
||||||
the model of computation.
|
|
||||||
|
|
||||||
## The Stack = The Fiber
|
|
||||||
|
|
||||||
A Dst Fiber is the type used to represent multiple concurrent processes
|
|
||||||
in dst. It is basically a wrapper around the idea of a stack. The stack is
|
|
||||||
divided into a number of stack frames (`DstStackFrame *` in C), each of which
|
|
||||||
contains information such as the function that created the stack frame,
|
|
||||||
the program counter for the stack frame, a pointer to the previous frame,
|
|
||||||
and the size of the frame. Each stack frame also is paired with a number
|
|
||||||
of registers.
|
|
||||||
|
|
||||||
```
|
|
||||||
X: Slot
|
|
||||||
|
|
||||||
X
|
|
||||||
X - Stack Top, for next function call.
|
|
||||||
-----
|
|
||||||
Frame next
|
|
||||||
-----
|
|
||||||
X
|
|
||||||
X
|
|
||||||
X
|
|
||||||
X
|
|
||||||
X
|
|
||||||
X
|
|
||||||
X - Stack 0
|
|
||||||
-----
|
|
||||||
Frame 0
|
|
||||||
-----
|
|
||||||
X
|
|
||||||
X
|
|
||||||
X - Stack -1
|
|
||||||
-----
|
|
||||||
Frame -1
|
|
||||||
-----
|
|
||||||
X
|
|
||||||
X
|
|
||||||
X
|
|
||||||
X
|
|
||||||
X - Stack -2
|
|
||||||
-----
|
|
||||||
Frame -2
|
|
||||||
-----
|
|
||||||
...
|
|
||||||
...
|
|
||||||
...
|
|
||||||
-----
|
|
||||||
Bottom of stack
|
|
||||||
```
|
|
||||||
|
|
||||||
Fibers also have an incomplete stack frame for the next function call on top
|
|
||||||
of their stacks. Making a function call involves pushing arguments to this
|
|
||||||
temporary stack, and then invoking either the CALL or TCALL instructions.
|
|
||||||
Arguments for the next function call are pushed via the PUSH, PUSH2, PUSH3, and
|
|
||||||
PUSHA instructions. The stack of a fiber will grow as large as needed, although by
|
|
||||||
default dst will limit the maximum size of a fiber's stack.
|
|
||||||
The maximum stack size can be modified on a per fiber basis.
|
|
||||||
|
|
||||||
The slots in the stack are exposed as virtual registers to instructions. They
|
|
||||||
can hold any Dst value.
|
|
||||||
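The push-then-call flow described above can be pictured with a toy model. This is only a sketch under loud assumptions: `ToyFiber`, `ToyFrame` and the `toy_*` functions are hypothetical names, slots hold plain `int`s instead of Dst values, and the function, program counter and previous-frame pointer that a real `DstStackFrame` carries are left out. It is not the layout used in the dst sources.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SLOTS 64
#define TOY_MAX_FRAMES 8

/* One frame: where its registers start and how many it owns. */
typedef struct {
    size_t base;
    size_t size;
} ToyFrame;

typedef struct {
    int slots[TOY_MAX_SLOTS];        /* stand-ins for Dst values */
    ToyFrame frames[TOY_MAX_FRAMES];
    size_t frame_count;
    size_t top;                      /* next free slot for a pushed argument */
} ToyFiber;

/* First slot above the current frame, where pending arguments begin. */
static size_t toy_args_start(const ToyFiber *f) {
    if (f->frame_count == 0) return 0;
    const ToyFrame *cur = &f->frames[f->frame_count - 1];
    return cur->base + cur->size;
}

/* Like PUSH: append one argument to the incomplete frame on top. */
static void toy_push(ToyFiber *f, int value) {
    assert(f->top < TOY_MAX_SLOTS);
    f->slots[f->top++] = value;
}

/* Like CALL: the pending arguments become the first registers of a new
   frame that has `size` registers in total. Returns the argument count. */
static size_t toy_call(ToyFiber *f, size_t size) {
    size_t base = toy_args_start(f);
    size_t nargs = f->top - base;
    assert(f->frame_count < TOY_MAX_FRAMES);
    assert(size >= nargs && base + size <= TOY_MAX_SLOTS);
    for (size_t i = nargs; i < size; i++) {
        f->slots[base + i] = 0;      /* clear the non-argument registers */
    }
    f->frames[f->frame_count].base = base;
    f->frames[f->frame_count].size = size;
    f->frame_count++;
    f->top = base + size;
    return nargs;
}

/* Returning drops the current frame; the caller's frame is current again. */
static void toy_return(ToyFiber *f) {
    assert(f->frame_count > 0);
    f->frame_count--;
    f->top = toy_args_start(f);
}
```

Starting from a zeroed `ToyFiber`, pushing two values and calling `toy_call(&f, 4)` yields a frame whose registers 0 and 1 hold the arguments, which mirrors how pushed values become the callee's first slots.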
|
|
||||||
## Closures
|
|
||||||
|
|
||||||
All functions in dst are closures; they combine some bytecode instructions
|
|
||||||
with 0 or more environments. In the C source, a closure (hereafter treated the same as
|
|
||||||
a function) is represented by the type `DstFunction *`. The bytecode instruction
|
|
||||||
part of the function is represented by `DstFuncDef *`, and a function environment
|
|
||||||
is represented with `DstFuncEnv *`.
|
|
||||||
|
|
||||||
In the function definition part of a function (the 'bytecode' part, `DstFuncDef *`),
|
|
||||||
we also store various metadata about the function which is useful for debugging,
|
|
||||||
as well as constants referenced by the function.
|
|
||||||
|
|
||||||
## C Functions
|
|
||||||
|
|
||||||
Dst uses C functions to bridge to native code. A C function
|
|
||||||
(`DstCFunction *` in C) is a C function pointer that can be called like
|
|
||||||
a normal dst closure. From the perspective of the bytecode instruction set, there is no difference
|
|
||||||
in invoking a C function and invoking a normal dst function.
|
|
||||||
|
|
||||||
## Bytecode Format
|
|
||||||
|
|
||||||
Dst bytecode presents an interface to a virtual machine with a large number
|
|
||||||
of identical registers that can hold any Dst value (`Dst *` in C). Most instructions
|
|
||||||
have a destination register, and 1 or 2 source registers. Registers are simply
|
|
||||||
named with non-negative integers.
|
|
||||||
|
|
||||||
Each instruction is a 32 bit integer, meaning that the instruction set is a constant
|
|
||||||
width RISC instruction set like MIPS. The opcode of each instruction is the least significant
|
|
||||||
byte of the instruction. The highest bit of
|
|
||||||
this leading byte is reserved for debugging purposes, so there are 128 possible opcodes encodable
|
|
||||||
with this scheme. Not all of these possible opcodes are defined; undefined opcodes will trap the interpreter
|
|
||||||
and emit a debug signal. Note that this means an unknown opcode is still valid bytecode; it will
|
|
||||||
just put the interpreter into a debug state when executed.
|
|
||||||
|
|
||||||
```
|
|
||||||
X - Payload bits
|
|
||||||
O - Opcode bits
|
|
||||||
|
|
||||||
4 3 2 1
|
|
||||||
+----+----+----+----+
|
|
||||||
| XX | XX | XX | OO |
|
|
||||||
+----+----+----+----+
|
|
||||||
```
|
|
||||||
|
|
||||||
8 bits for the opcode leaves 24 bits for the payload, which may or may not be utilized.
|
|
||||||
There are a few instruction variants that divide these payload bits.
|
|
||||||
|
|
||||||
* 0 arg - Used for noops, returning nil, or other instructions that take no
|
|
||||||
arguments. The payload is essentially ignored.
|
|
||||||
* 1 arg - All payload bits correspond to a single value, usually a signed or unsigned integer.
|
|
||||||
Used for instructions of 1 argument, like returning a value, yielding a value to the parent fiber,
|
|
||||||
or doing a (relative) jump.
|
|
||||||
* 2 arg - Payload is split into byte 2 and bytes 3 and 4.
|
|
||||||
The first argument is the 8 bit value from byte 2, and the second argument is the 16 bit value
|
|
||||||
from bytes 3 and 4 (`instruction >> 16`). Used for instructions of two arguments, like move, normal
|
|
||||||
function calls, conditionals, etc.
|
|
||||||
* 3 arg - Bytes 2, 3, and 4 each correspond to an 8 bit argument.
|
|
||||||
Used for arithmetic operations, emitting a signal, etc.
|
|
||||||
|
|
||||||
These instruction variants can be further refined based on the semantics of the arguments.
|
|
||||||
Some instructions may treat an argument as a slot index, while other instructions
|
|
||||||
will treat the argument as a signed integer literal, an index for a constant, an index
|
|
||||||
for an environment, or an unsigned integer.
|
|
||||||
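The payload layouts above can be sketched as a handful of decoding helpers. The function names are hypothetical and do not come from the dst sources. The opcode mask of `0x7F` follows from the reserved debug bit described earlier, and two's complement sign extension for the signed 24 bit case is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Low 7 bits of byte 1 are the opcode; bit 7 is the reserved debug bit. */
static uint32_t instr_opcode(uint32_t instr) { return instr & 0x7Fu; }
static int instr_debug_flag(uint32_t instr) { return (int)((instr >> 7) & 1u); }

/* 1 arg: the whole 24 bit payload, read as unsigned or as signed
   (assumed two's complement). */
static uint32_t instr_arg24_unsigned(uint32_t instr) { return instr >> 8; }
static int32_t instr_arg24_signed(uint32_t instr) {
    uint32_t raw = instr >> 8;
    return (raw & 0x800000u) ? (int32_t)raw - 0x1000000 : (int32_t)raw;
}

/* 2 arg: an 8 bit value from byte 2 and a 16 bit value from bytes 3 and 4. */
static uint32_t instr_arg8(uint32_t instr) { return (instr >> 8) & 0xFFu; }
static uint32_t instr_arg16(uint32_t instr) { return instr >> 16; }

/* 3 arg: bytes 2, 3 and 4 each carry one 8 bit argument. */
typedef struct { uint8_t a, b, c; } ThreeArgs;
static ThreeArgs instr_args3(uint32_t instr) {
    ThreeArgs args;
    args.a = (uint8_t)((instr >> 8) & 0xFFu);
    args.b = (uint8_t)((instr >> 16) & 0xFFu);
    args.c = (uint8_t)(instr >> 24);
    return args;
}

/* Inverse of instr_args3, handy for building test instructions. */
static uint32_t instr_encode3(uint32_t opcode, uint8_t a, uint8_t b, uint8_t c) {
    assert(opcode < 128);
    return opcode | ((uint32_t)a << 8) | ((uint32_t)b << 16) | ((uint32_t)c << 24);
}
```

For instance, `instr_encode3(5, 1, 2, 3)` gives `0x03020105`, and decoding that word with `instr_opcode` and `instr_args3` recovers 5 and the arguments 1, 2, 3.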
|
|
||||||
## Instruction Reference
|
|
||||||
|
|
||||||
A listing of all opcode values can be found in src/include/dst/dstopcodes.h. The dst assembly
|
|
||||||
short names can be found in src/assembler/asm.c. In this document, we will refer to the instructions
|
|
||||||
by their short names as presented to the assembler rather than their numerical values.
|
|
||||||
|
|
||||||
Each instruction is also listed with a signature, which gives the arguments the instruction
|
|
||||||
expects. There are a handful of instruction signatures, which combine the arity and type
|
|
||||||
of the instruction. The assembler does not
|
|
||||||
do any typechecking per closure, but does prevent jumping to invalid instructions and
|
|
||||||
failure to return or error.
|
|
||||||
|
|
||||||
### Notation
|
|
||||||
|
|
||||||
* The $ prefix indicates that an instruction parameter is acting as a virtual register (slot).
|
|
||||||
If a parameter does not have the $ prefix in the description, it is acting as some kind
|
|
||||||
of literal (usually an unsigned integer for indexes, and a signed integer for literal integers).
|
|
||||||
|
|
||||||
* Some operators in the description have the suffix 'i' or 'r'. These indicate
|
|
||||||
that these operators correspond to integers or real numbers only, respectively. All
|
|
||||||
bitwise operators and bit shifts only work with integers.
|
|
||||||
|
|
||||||
* The `>>>` indicates unsigned right shift, as in Java. Because all integers in dst are
|
|
||||||
signed, we differentiate the two kinds of right bit shift.
|
|
||||||
|
|
||||||
* The 'im' suffix in the instruction name is short for immediate. The 'i' suffix is short for integer,
|
|
||||||
and the 'r' suffix is short for real.
|
|
||||||
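The difference between `>>` and `>>>` on 32 bit signed integers can be illustrated in C. This is a sketch with hypothetical helper names. It avoids relying on C's implementation-defined right shift of negative values, and it simply asserts that the shift amount is below 32, since the text above does not say how dst treats larger amounts.

```c
#include <assert.h>
#include <stdint.h>

/* `>>`: signed (arithmetic) shift, the sign bit is copied into the
   vacated high bits. */
static int32_t shift_right_signed(int32_t x, unsigned n) {
    assert(n < 32);
    return x < 0 ? ~(~x >> n) : x >> n;
}

/* `>>>`: unsigned (logical) shift, zeros fill the vacated high bits. */
static int32_t shift_right_unsigned(int32_t x, unsigned n) {
    assert(n < 32);
    if (n == 0) return x;            /* nothing moves, keep the value as is */
    return (int32_t)((uint32_t)x >> n);
}
```

With these helpers, shifting -8 right by 1 gives -4 with the signed variant and 2147483644 with the unsigned one, while non-negative inputs give the same result either way.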
|
|
||||||
### Reference Table
|
|
||||||
|
|
||||||
| Instruction | Signature | Description |
|
|
||||||
| ----------- | --------------------------- | --------------------------------- |
|
|
||||||
| `add` | `(add dest lhs rhs)` | $dest = $lhs + $rhs |
|
|
||||||
| `addi` | `(addi dest lhs rhs)` | $dest = $lhs +i $rhs |
|
|
||||||
| `addim` | `(addim dest lhs im)` | $dest = $lhs +i im |
|
|
||||||
| `addr` | `(addr dest lhs rhs)` | $dest = $lhs +r $rhs |
|
|
||||||
| `band` | `(band dest lhs rhs)` | $dest = $lhs & $rhs |
|
|
||||||
| `bnot` | `(bnot dest operand)` | $dest = ~$operand |
|
|
||||||
| `bor` | `(bor dest lhs rhs)` | $dest = $lhs \| $rhs |
|
|
||||||
| `bxor` | `(bxor dest lhs rhs)` | $dest = $lhs ^ $rhs |
|
|
||||||
| `call` | `(call dest callee)` | $dest = call($callee) |
|
|
||||||
| `clo` | `(clo dest index)` | $dest = closure(defs[$index]) |
|
|
||||||
| `cmp` | `(cmp dest lhs rhs)` | $dest = dst\_compare($lhs, $rhs) |
|
|
||||||
| `debug` | `(debug)` | Suspend current fiber |
|
|
||||||
| `div` | `(div dest lhs rhs)` | $dest = $lhs / $rhs |
|
|
||||||
| `divi` | `(divi dest lhs rhs)` | $dest = $lhs /i $rhs |
|
|
||||||
| `divim` | `(divim dest lhs im)` | $dest = $lhs /i im |
|
|
||||||
| `divr` | `(divr dest lhs rhs)` | $dest = $lhs /r $rhs |
|
|
||||||
| `eq` | `(eq dest lhs rhs)` | $dest = $lhs == $rhs |
|
|
||||||
| `eqi` | `(eqi dest lhs rhs)` | $dest = $lhs ==i $rhs |
|
|
||||||
| `eqim` | `(eqim dest lhs im)` | $dest = $lhs ==i im |
|
|
||||||
| `eqr` | `(eqr dest lhs rhs)` | $dest = $lhs ==r $rhs |
|
|
||||||
| `err` | `(err message)` | Throw error $message. |
|
|
||||||
| `get` | `(get dest ds key)` | $dest = $ds[$key] |
|
|
||||||
| `geti` | `(geti dest ds index)` | $dest = $ds[index] |
|
|
||||||
| `gt` | `(gt dest lhs rhs)` | $dest = $lhs > $rhs |
|
|
||||||
| `gti` | `(gti dest lhs rhs)` | $dest = $lhs \>i $rhs |
|
|
||||||
| `gtim` | `(gtim dest lhs im)` | $dest = $lhs \>i im |
|
|
||||||
| `gtr` | `(gtr dest lhs rhs)` | $dest = $lhs \>r $rhs |
|
|
||||||
| `gter` | `(gter dest lhs rhs)` | $dest = $lhs >=r $rhs |
|
|
||||||
| `jmp` | `(jmp label)` | pc = label, pc += offset |
|
|
||||||
| `jmpif` | `(jmpif cond label)` | if $cond pc = label else pc++ |
|
|
||||||
| `jmpno` | `(jmpno cond label)` | if $cond pc++ else pc = label |
|
|
||||||
| `ldc` | `(ldc dest index)` | $dest = constants[index] |
|
|
||||||
| `ldf` | `(ldf dest)` | $dest = false |
|
|
||||||
| `ldi` | `(ldi dest integer)` | $dest = integer |
|
|
||||||
| `ldn` | `(ldn dest)` | $dest = nil |
|
|
||||||
| `lds` | `(lds dest)` | $dest = current closure (self) |
|
|
||||||
| `ldt` | `(ldt dest)` | $dest = true |
|
|
||||||
| `ldu` | `(ldu dest env index)` | $dest = envs[env][index] |
|
|
||||||
| `lt` | `(lt dest lhs rhs)` | $dest = $lhs < $rhs |
|
|
||||||
| `lti` | `(lti dest lhs rhs)` | $dest = $lhs \<i $rhs |
|
|
||||||
| `ltim` | `(ltim dest lhs im)` | $dest = $lhs \<i im |
|
|
||||||
| `ltr` | `(ltr dest lhs rhs)` | $dest = $lhs \<r $rhs |
|
|
||||||
| `lter` | `(lter dest lhs rhs)` | $dest = $lhs <=r $rhs |
|
|
||||||
| `movf` | `(movf src dest)` | $dest = $src |
|
|
||||||
| `movn` | `(movn dest src)` | $dest = $src |
|
|
||||||
| `mul` | `(mul dest lhs rhs)` | $dest = $lhs * $rhs |
|
|
||||||
| `muli` | `(muli dest lhs rhs)` | $dest = $lhs \*i $rhs |
|
|
||||||
| `mulim` | `(mulim dest lhs im)` | $dest = $lhs \*i im |
|
|
||||||
| `mulr` | `(mulr dest lhs rhs)` | $dest = $lhs \*r $rhs |
|
|
||||||
| `noop` | `(noop)` | Does nothing. |
|
|
||||||
| `push` | `(push val)` | Push $val as arg |
|
|
||||||
| `push2` | `(push2 val1 val2)` | Push $val1, $val2 as args |
|
|
||||||
| `push3` | `(push3 val1 val2 val3)` | Push $val1, $val2, $val3, as args |
|
|
||||||
| `pusha` | `(pusha array)` | Push values in $array as args |
|
|
||||||
| `put` | `(put ds key val)` | $ds[$key] = $val |
|
|
||||||
| `puti` | `(puti ds index val)` | $ds[index] = $val |
|
|
||||||
| `res` | `(res dest fiber val)` | $dest = resume $fiber with $val |
|
|
||||||
| `ret` | `(ret val)` | Return $val |
|
|
||||||
| `retn` | `(retn)` | Return nil |
|
|
||||||
| `setu` | `(setu env index val)` | envs[env][index] = $val |
|
|
||||||
| `sig` | `(sig dest value sigtype)` | $dest = emit $value as sigtype |
|
|
||||||
| `sl` | `(sl dest lhs rhs)` | $dest = $lhs << $rhs |
|
|
||||||
| `slim` | `(slim dest lhs shamt)` | $dest = $lhs << shamt |
|
|
||||||
| `sr` | `(sr dest lhs rhs)` | $dest = $lhs >> $rhs |
|
|
||||||
| `srim` | `(srim dest lhs shamt)` | $dest = $lhs >> shamt |
|
|
||||||
| `sru` | `(sru dest lhs rhs)` | $dest = $lhs >>> $rhs |
|
|
||||||
| `sruim` | `(sruim dest lhs shamt)` | $dest = $lhs >>> shamt |
|
|
||||||
| `sub` | `(sub dest lhs rhs)` | $dest = $lhs - $rhs |
|
|
||||||
| `tcall` | `(tcall callee)` | Return call($callee) |
|
|
||||||
| `tchck` | `(tchck slot types)` | Assert $slot matches types |
|
|
||||||
|
|
455
doc/intro.md
@ -1,455 +0,0 @@
|
|||||||
# Dst Language Introduction
|
|
||||||
|
|
||||||
Dst is a dynamic, lightweight programming language with strong functional
|
|
||||||
capabilities as well as support for imperative programming. It is meant to be used
|
|
||||||
for short lived scripts as well as for building real programs. It can also
|
|
||||||
be extended with native code (C modules) for better performance and interfacing with
|
|
||||||
existing software. Dst takes ideas from Lua, Scheme, Racket, Clojure, Smalltalk, Erlang, and
|
|
||||||
a whole bunch of other dynamic languages.
|
|
||||||
|
|
||||||
# Hello, world!
|
|
||||||
|
|
||||||
Following tradition, a simple Dst program will simply print "Hello, world!".
|
|
||||||
|
|
||||||
```
|
|
||||||
(print "Hello, world!")
|
|
||||||
```
|
|
||||||
|
|
||||||
Put the code above in a file called `hello.dst`, and run `./dst hello.dst`.
|
|
||||||
The words "Hello, world!" should be printed to the console, and then the program
|
|
||||||
should immediately exit. You now have a working dst program!
|
|
||||||
|
|
||||||
Alternatively, run the program `./dst` without any arguments to enter a REPL,
|
|
||||||
or read eval print loop. This is a mode where Dst functions like a calculator,
|
|
||||||
reading some input from stdin, evaluating it, and printing out the result, all
|
|
||||||
in an infinite loop. This is a useful mode for exploring or prototyping in Dst.
|
|
||||||
|
|
||||||
This is about the simplest program one can write, and consists of precisely
|
|
||||||
three elements. The first element is the `print` symbol. This is a function
|
|
||||||
that simply prints its arguments to standard out. The second element is the
|
|
||||||
string literal "Hello, world!", which is the one and only argument to the
|
|
||||||
print function. Lastly, the print symbol and the string literal are wrapped
|
|
||||||
in parentheses, forming a tuple. In Dst, parentheses and brackets are interchangeable;
|
|
||||||
brackets are used mostly when the resulting tuple is not a function call. The tuple
|
|
||||||
above indicates that the function `print` is to be called with one argument, `"Hello, world!"`.
|
|
||||||
|
|
||||||
Like all lisps, all operations in Dst are in prefix notation; the name of the
|
|
||||||
operator is the first value in the tuple, and the arguments passed to it are
|
|
||||||
in the rest of the tuple.
|
|
||||||
|
|
||||||
# A bit more - Arithmetic
|
|
||||||
|
|
||||||
Any programming language will have some way to do arithmetic. Dst is no exception,
|
|
||||||
and supports the basic arithmetic operators.
|
|
||||||
|
|
||||||
```
|
|
||||||
# Prints 13
|
|
||||||
# (1 + (2*2) + (10/5) + 3 + 4 + (5 - 6))
|
|
||||||
(print (+ 1 (* 2 2) (/ 10 5) 3 4 (- 5 6)))
|
|
||||||
```
|
|
||||||
|
|
||||||
Just like the print function, all arithmetic operators are entered in
|
|
||||||
prefix notation. Dst also supports the modulo operator, or `%`, which returns
|
|
||||||
the remainder of integer division. For example, `(% 10 3)` is 1, and `(% 10.5 3)` is
|
|
||||||
1.5. The lines that begin with `#` are comments.
|
|
||||||
|
|
||||||
Dst actually has two "flavors" of numbers; integers and real numbers. Integers are any
|
|
||||||
integer value between -2,147,483,648 and 2,147,483,647 (32 bit signed integer).
|
|
||||||
Reals are real numbers, and are represented by IEEE-754 double precision floating point
|
|
||||||
numbers. That means that they can represent any number an integer can represent, as well as
|
|
||||||
fractions to very high precision.
|
|
||||||
|
|
||||||
Although real numbers can represent any value an integer can, try to distinguish between
|
|
||||||
real numbers and integers in your program. If you are using a number to index into a structure,
|
|
||||||
you probably want integers. Otherwise, you may want to use reals (this is only a rule of thumb).
|
|
||||||
|
|
||||||
Arithmetic operators will convert integers to real numbers if needed, but real numbers
|
|
||||||
will not be converted to integers, as not all real numbers can be safely converted to integers.
|
|
||||||
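The one-way promotion rule can be sketched with a small tagged union in C. The `Num` type and the `num_*` helpers are hypothetical and are not how the dst runtime represents numbers. Integer overflow is not covered by the text above, so this sketch just asserts that the integer sum fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { NUM_INTEGER, NUM_REAL } NumKind;

typedef struct {
    NumKind kind;
    union {
        int32_t integer;
        double real;
    } as;
} Num;

static Num num_integer(int32_t v) {
    Num n;
    n.kind = NUM_INTEGER;
    n.as.integer = v;
    return n;
}

static Num num_real(double v) {
    Num n;
    n.kind = NUM_REAL;
    n.as.real = v;
    return n;
}

/* Widening an integer to a real is always safe. */
static double num_to_real(Num n) {
    return n.kind == NUM_INTEGER ? (double)n.as.integer : n.as.real;
}

/* Integer + integer stays an integer; any real operand makes the result
   a real. A real is never narrowed back to an integer. */
static Num num_add(Num a, Num b) {
    if (a.kind == NUM_INTEGER && b.kind == NUM_INTEGER) {
        int64_t sum = (int64_t)a.as.integer + (int64_t)b.as.integer;
        assert(sum >= INT32_MIN && sum <= INT32_MAX);
        return num_integer((int32_t)sum);
    }
    return num_real(num_to_real(a) + num_to_real(b));
}
```

Adding the integers 1 and 2 yields the integer 3, while adding the integer 1 to the real 0.5 yields the real 1.5. Adding two reals such as 2.0 and 3.0 yields the real 5.0, not an integer.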
|
|
||||||
## Numeric literals
|
|
||||||
|
|
||||||
Numeric literals can be written in many ways. Numbers can be written in base 10, with
|
|
||||||
underscores used to separate digits into groups. A decimal point can be used for floating
|
|
||||||
point numbers. Numbers can also be written in other bases by prefixing the number with the desired
|
|
||||||
base and the character 'r'. For example, 16 can be written as `16`, `1_6`, `16r10`, `4r100`, or `0x10`. The
|
|
||||||
`0x` prefix can be used for hexadecimal as it is so common. The radix must itself be written in base 10, and
|
|
||||||
can be any integer from 2 to 36. For any radix above 10, use the letters as digits (not case sensitive).
|
|
||||||
|
|
||||||
Numbers can also be in scientific notation such as `3e10`. A custom radix can be used as well
|
|
||||||
for scientific notation numbers (the exponent will share the radix). For numbers in scientific
|
|
||||||
notation with a radix besides 10, use the `&` symbol to indicate the exponent rather than `e`.
|
|
||||||
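To make the integer literal rules concrete, here is a rough C sketch that reads the spellings described above: plain base 10, underscores as digit separators, the `NrDIGITS` radix form, and the `0x` prefix. The function name is hypothetical and this is not the dst parser. It ignores decimal points and scientific notation, and it leans on `strtol`, so it is more permissive than a real reader would be.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 and stores the value in *out on success, 0 on malformed input. */
static int parse_int_literal(const char *text, long *out) {
    char digits[64];
    size_t n = 0;
    long radix = 10;
    const char *p = text;
    const char *r;
    char *end;

    assert(text != NULL && out != NULL);
    r = strchr(text, 'r');
    if (text[0] == '0' && text[1] == 'x') {
        radix = 16;                  /* common shorthand for 16r... */
        p = text + 2;
    } else if (r != NULL) {
        /* The radix itself is written in base 10 and must be 2 to 36. */
        radix = strtol(text, &end, 10);
        if (end != r || radix < 2 || radix > 36) return 0;
        p = r + 1;
    }
    /* Copy the digits, dropping underscore separators. */
    for (; *p != '\0'; p++) {
        if (*p == '_') continue;
        if (n + 1 >= sizeof digits) return 0;
        digits[n++] = *p;
    }
    digits[n] = '\0';
    if (n == 0) return 0;
    /* strtol treats letters as digits above 9, case insensitively. */
    *out = strtol(digits, &end, (int)radix);
    return *end == '\0';
}
```

All five spellings of sixteen from the text (`16`, `1_6`, `16r10`, `4r100`, `0x10`) parse to 16 with this helper, and a radix outside 2 to 36, such as `1r0`, is rejected.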
|
|
||||||
## Arithmetic Functions
|
|
||||||
|
|
||||||
Besides the 5 main arithmetic functions, dst also supports a number of math functions
|
|
||||||
taken from the C library `<math.h>`, as well as bitwise operators that behave like they
|
|
||||||
do in C or Java.
|
|
||||||
|
|
||||||
# Strings, Keywords and Symbols
|
|
||||||
|
|
||||||
Dst supports several varieties of types that can be used as labels for things in
|
|
||||||
your program. The most useful type for this purpose is the keyword type. A keyword
|
|
||||||
begins with a colon, and then contains 0 or more alphanumeric or a few other common
|
|
||||||
characters. For example, `:hello`, `:my-name`, `:=`, and `:ABC123_-*&^%$` are all keywords.
|
|
||||||
Keywords are actually just special cases of symbols, which are similar but don't start with
|
|
||||||
a colon. The difference between symbols and keywords is that keywords evaluate to themselves, while
|
|
||||||
symbols evaluate to whatever they are bound to. To have a symbol evaluate to itself, it must be
|
|
||||||
quoted.
|
|
||||||
|
|
||||||
```lisp
|
|
||||||
# Evaluates to :monday
|
|
||||||
:monday
|
|
||||||
|
|
||||||
# Will throw a compile error as monday is not defined
|
|
||||||
monday
|
|
||||||
|
|
||||||
# Quote it - evaluates to the symbol monday
|
|
||||||
'monday
|
|
||||||
|
|
||||||
# Or first define monday
|
|
||||||
(def monday "It is monday")
|
|
||||||
|
|
||||||
# Now the evaluation should work - monday evaluates to "It is monday"
|
|
||||||
monday
|
|
||||||
```
|
|
||||||
|
|
||||||
The most common thing to do with a keyword is to check it for equality or use it as a key into
|
|
||||||
a table or struct. Note that symbols, keywords and strings are all immutable. Besides making your
|
|
||||||
code easier to reason about, it allows for many optimizations involving these types.
|
|
||||||
|
|
||||||
```lisp
|
|
||||||
# Prints true
|
|
||||||
(= :hello :hello)
|
|
||||||
|
|
||||||
# Prints false, everything in dst is case sensitive
|
|
||||||
(= :hello :HeLlO)
|
|
||||||
|
|
||||||
# Look up into a table - evaluates to 25
|
|
||||||
(get {
|
|
||||||
:name "John"
|
|
||||||
:age 25
|
|
||||||
:occupation "plumber"
|
|
||||||
} :age)
|
|
||||||
```
|
|
||||||
|
|
||||||
Strings can be used similarly to keywords, but their primary usage is for defining either text
|
|
||||||
or arbitrary sequences of bytes. Strings (and symbols) in dst are what is sometimes known as
|
|
||||||
"8-bit clean"; they can hold any number of bytes, and are completely unaware of things like character
|
|
||||||
encodings. This is completely compatible with ASCII and UTF-8, two of the most common character
|
|
||||||
encodings. By being encoding agnostic, dst strings can be very simple, fast, and useful for
|
|
||||||
other uses besides holding text.
|
|
||||||
|
|
||||||
Literal text can be entered inside quotes, as we have seen above.
|
|
||||||
|
|
||||||
```
|
|
||||||
"Hello, this is a string."
|
|
||||||
|
|
||||||
# We can also add escape characters for newlines, double quotes, backslash, tabs, etc.
|
|
||||||
"Hello\nThis is on line two\n\tThis is indented\n"
|
|
||||||
|
|
||||||
# For long strings where you don't want to type a lot of escape characters,
|
|
||||||
# you can use 1 or more backticks (`\``) to delimit a string.
|
|
||||||
# To close this string, simply repeat the opening sequence of backticks
|
|
||||||
``
|
|
||||||
This is a string.
|
|
||||||
Line 2
|
|
||||||
Indented
|
|
||||||
"We can just type quotes here", and backslashes \ no problem.
|
|
||||||
``
|
|
||||||
```
|
|
||||||
|
|
||||||
# Functions
|
|
||||||
|
|
||||||
Dst is a functional language - that means that one of the basic building blocks of your
|
|
||||||
program will be defining functions (the other is using data structures). Because dst
|
|
||||||
is a Lisp, functions are values just like numbers or strings - they can be passed around and
|
|
||||||
created as needed.
|
|
||||||
|
|
||||||
Functions can be defined with the `defn` macro, like so:
|
|
||||||
|
|
||||||
```lisp
|
|
||||||
(defn triangle-area [base height]
|
|
||||||
(print "calculating area of a triangle...")
|
|
||||||
(* base height 0.5))
|
|
||||||
```
|
|
||||||
|
|
||||||
A function defined with `defn` has a number of parts. First, it has its name, `triangle-area`. This
|
|
||||||
is just a symbol used to access the function later. Next is the list of parameters this function takes,
|
|
||||||
in this case two parameters named base and height. Lastly, a function made with defn has
|
|
||||||
a number of body statements, which get executed each time the function is called. The last
|
|
||||||
form in the body is what the function evaluates to, or returns.
|
|
||||||
|
|
||||||
Once a function like the above one is defined, the programmer can use the `triangle-area`
|
|
||||||
function just like any other, say `print` or `+`.
|
|
||||||
|
|
||||||
```lisp
|
|
||||||
# Prints "calculating area of a triangle..." and then "25"
|
|
||||||
(print (triangle-area 5 10))
|
|
||||||
```
|
|
||||||
|
|
||||||
Note that when nesting function calls in other function calls like above (a call to triangle-area is
|
|
||||||
nested inside a call to print), the inner function calls are evaluated first. Also, arguments to
|
|
||||||
a function call are evaluated in order, from first argument to last argument.
|
|
||||||
|
|
||||||
Because functions are first-class values like numbers or strings, they can be passed
|
|
||||||
as arguments to other functions as well:
|
|
||||||
|
|
||||||
```
|
|
||||||
(print triangle-area)
|
|
||||||
```
|
|
||||||
|
|
||||||
This prints the location in memory of the function `triangle-area`. This idea can be used
|
|
||||||
to build some powerful constructs purely out of functions, or closures as they are known
|
|
||||||
in many contexts.
|
|
||||||
|
|
||||||
Functions don't need to have names. The `fn` keyword can be used to introduce function
|
|
||||||
literals without binding them to a symbol.
|
|
||||||
|
|
||||||
```
|
|
||||||
# Evaluates to 40
|
|
||||||
((fn [x y] (+ x x y)) 10 20)
|
|
||||||
```
|
|
||||||
|
|
||||||
The above expression first creates an anonymous function that adds twice
|
|
||||||
the first argument to the second, and then calls that function with arguments 10 and 20.
|
|
||||||
This will return (10 + 10 + 20) = 40.
|
|
||||||
|
|
||||||
There is a common macro `defn` that can be used for creating functions and immediately binding
|
|
||||||
them to a name. `defn` works as expected at both the top level and inside another form. There is also
|
|
||||||
the corresponding
|
|
||||||
|
|
||||||
```lisp
|
|
||||||
(defn myfun [x y]
|
|
||||||
(+ x x y))
|
|
||||||
|
|
||||||
# You can think of defn as a shorthand for def and fn together
|
|
||||||
(def myfun-same (fn [x y]
|
|
||||||
(+ x x y)))
|
|
||||||
|
|
||||||
(myfun 3 4) # -> 10
|
|
||||||
```
|
|
||||||
|
|
||||||
Dst has many macros provided for you (and you can write your own).
|
|
||||||
Macros are just functions that take your source code
|
|
||||||
and transform it into some other source code, usually automating some repetitive pattern for you.
|
|
||||||
|
|
||||||
# Defs and Vars

Values can be bound to symbols for later use using the keyword `def`. Using undefined
symbols will raise an error.

```
(def a 100)
(def b (+ 1 a))
(def c (+ b b))
(def d (- c 100))
```

Bindings created with def have lexical scoping. Also, bindings created with def are immutable; they
cannot be changed after definition. For mutable bindings, like variables in other programming
languages, use the `var` keyword. The assignment special form `:=` can then be used to update
a var.

```
(var myvar 1)
(print myvar)
(:= myvar 10)
(print myvar)
```

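
As a small sketch of a var used as a running total (the name `total` is chosen only for illustration, and nothing beyond `var`, `:=`, `+`, and `print` from the text above is used):

```lisp
# total is a hypothetical name for this example
(var total 0)
(:= total (+ total 10)) # total is now 10
(:= total (+ total 5))  # total is now 15
(print total) # prints 15
```
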
In the global scope, you can use the `:private` option on a def or var to prevent it from
being exported to code that imports your current module. You can also add documentation to
a binding by passing a string to the def or var command.

```lisp
(def mydef :private "This will have private scope. My doc here." 123)
(var myvar "docstring here" 321)
```

## Scopes

Defs and vars (collectively known as bindings) live inside what is called a scope. A scope is
simply where the bindings are valid. If a binding is referenced outside of its scope, the compiler
will throw an error. Scopes are useful for organizing your bindings and by extension your programs.
There are two main ways to create a scope in Dst.

The first is to use the `do` special form. `do` executes a series of statements in a scope
and evaluates to the last statement. Bindings created inside the form do not escape outside
of its scope.

```lisp
(def a :outera)

(do
  (def a 1)
  (def b 2)
  (def c 3)
  (+ a b c)) # -> 6

a # -> :outera
b # -> compile error: "unknown symbol \"b\""
c # -> compile error: "unknown symbol \"c\""
```

Any attempt to reference the bindings from the do form after it has finished
executing will fail. Also notice how defining `a` inside the do form did not
overwrite the original definition of `a` for the global scope.

The second way to create a scope is to create a closure.
The `fn` special form also introduces a scope just like
the `do` special form.

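
A minimal sketch of a function body acting as a scope. The name `add-ten` is made up for this example, and it assumes that `def` inside a function body behaves the same way as `def` inside `do`, so the global `a` should be left untouched:

```lisp
(def a :outera)

# add-ten is a hypothetical function used only for illustration
(defn add-ten [x]
  (def a 10) # local to the function body
  (+ a x))

(add-ten 5) # -> 15
a # -> :outera
```
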
There is another built-in macro, `let`, that does multiple defs at once, and then introduces a scope.
`let` is a wrapper around a combination of defs and dos, and is the most "functional" way of
creating bindings.

```lisp
(let [a 1
      b 2
      c 3]
  (+ a b c)) # -> 6
```

The above is equivalent to the example using `do` and `def`.
This is the preferable form in most cases,
but using do with multiple defs is fine as well.

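
Since `let` is described as a wrapper around defs inside a `do`, a later binding should be able to refer to an earlier one. The sketch below relies on that assumption (the text does not state it outright) and shows the same computation in both forms:

```lisp
# Assumes let evaluates its bindings in order, like a series of defs
(let [a 1
      b (+ a 1)
      c (+ a b)]
  (+ a b c)) # -> 6 if bindings are evaluated in order

# The same thing written with do and def
(do
  (def a 1)
  (def b (+ a 1))
  (def c (+ a b))
  (+ a b c)) # -> 6
```
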
# Data Structures

Once you have a handle on functions and the primitive value types, you may be wondering how
to work with collections of things. Dst has a small number of core data structure types
that are very versatile. Tables, Structs, Arrays, Tuples, Strings, and Buffers are the 6 main
built-in data structure types. These data structures can be arranged in a useful table describing
their relationship to each other.

|            | Mutable | Immutable       |
| ---------- | ------- | --------------- |
| Indexed    | Array   | Tuple           |
| Dictionary | Table   | Struct          |
| Byteseq    | Buffer  | String (Symbol) |

Indexed types are linear lists of elements that can be accessed in constant time with an integer index.
Indexed types are backed by a single chunk of memory for fast access, and are indexed from 0 as in C.
Dictionary types associate keys with values. The difference between dictionaries and indexed types
is that dictionaries are not limited to integer keys. They are backed by a hashtable and also offer
constant time lookup (and insertion for the mutable case).
Finally, the 'byteseq' abstraction is any type that contains a sequence of bytes. A byteseq associates
integer keys (the indices) with integer values between 0 and 255 (the byte values). In this way,
they behave much like Arrays and Tuples. However, one cannot put non-integer values into a byteseq.

```lisp
(def mytuple (tuple 1 2 3))

(def myarray @(1 2 3))
(def myarray (array 1 2 3))

(def mystruct {
  :key "value"
  :key2 "another"
  1 2
  4 3})

(def another-struct
  (struct :a 1 :b 2))

(def my-table @{
  :a :b
  :c :d
  :A :qwerty})
(def another-table
  (table 1 2 3 4))

(def my-buffer @"thisismutable")
(def my-buffer2 @\====\
This is also mutable ":)"
\====\)
```

To read the values in a data structure, use the `get` function. The first parameter is the data structure
itself, and the second parameter is the key.

```lisp
(get @{:a 1} :a) # -> 1
(get {:a 1} :a) # -> 1
(get @[:a :b :c] 2) # -> :c
(get (tuple "a" "b" "c") 1) # -> "b"
(get @"hello, world" 1) # -> 101
(get "hello, world" 0) # -> 104
```

To update a mutable data structure, use the `put` function. It takes 3 arguments: the data structure,
the key, and the value. It returns the data structure. The allowed types of keys and values
depend on what data structure is passed in.

```lisp
(put @[] 100 :a)
(put @{} :key "value")
(put @"" 100 92)
```

Note that for Arrays and Buffers, putting an index that is outside the length of the data structure
will extend the data structure and fill it with nils in the case of the Array,
or 0s in the case of the Buffer.

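
The extension behavior described above can be sketched as follows. The names `arr` and `buf` are made up for the example, and the commented results assume that `put` fills the gap exactly as the note says:

```lisp
# arr and buf are hypothetical names for this example
(def arr @[])
(put arr 2 :a) # index 2 is past the end of the empty array

(get arr 0) # -> nil (filled in when the array was extended)
(get arr 2) # -> :a

(def buf @"")
(put buf 1 104) # 104 is the byte value of "h"

(get buf 0) # -> 0 (filled in when the buffer was extended)
(get buf 1) # -> 104
```
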
The last generic function for all data structures is the `length` function. This returns the number of
values in a data structure (the number of keys in a dictionary type).

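
A short sketch of `length` applied to each kind of data structure, using only the literals and constructors shown earlier in this section:

```lisp
(length @[1 2 3])    # -> 3
(length (tuple 1 2)) # -> 2
(length {:a 1 :b 2}) # -> 2 (number of keys)
(length @{})         # -> 0
(length "hello")     # -> 5 (number of bytes)
(length @"")         # -> 0
```
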
# Flow Control

:)

# Combinators

:)

# Modules

:)

# The Core Library

Dst has a built-in core library of over 200 functions and macros at the time of writing.
While some of these functions may be refactored into separate modules, it is useful to get to know
the core to avoid rewriting provided functions.

For any given function, use the `doc` macro to view the documentation for it in the repl.

```lisp
(doc defn) # -> Prints the documentation for "defn"
```

To see a list of all global functions in the repl, type the command

```lisp
(getproto *env*)
```

This will print out every built-in global binding
(it will not show your global bindings). To print all
of your global bindings, just use `*env*`, which is a var
that is bound to the current environment.

# Prototypes

:)

# Fibers

:)

# Macros

:)

# IO

:)

# The Parser Library

:)

# The Assembler

:)

# Interfacing with C

:)