So, you enjoyed some of the features of PicoPoe, but want a more structured programming language? Then Lune is for you. It is very similar to PicoPoe, but it has loops, ifs, and so on. You may even define your own structured statements, if you like. Let's take a closer look.
Monotype text in bold is used for reserved words, and should not be used for new identifiers. Monotype text in italic is meant to be replaced by actual identifiers or other code.
Source code files have an extension of .lune.
There are two kinds of comments: single-line and multi-line. The first runs from a semicolon until the end of the line:
; single line comment
The second kind is text delimited by braces:
{ multiple line comment
{ nested comment }
the rest of the first comment }
An identifier is a sequence of characters that begins with either an underscore or a letter (Unicode uppercase or lowercase) and is followed by more underscores or letters, or digits. Identifiers follow camel case notation. Variables and functions begin with a lowercase letter. Constants begin with the character 'k'. Types begin with uppercase letters.
Source files begin with the imports:
import file-name, ...
import library-name.module-name, ...
import library-name.module-name.file-name, ...
...
Imports let us refer to the entities declared in other source files without writing their full names. They also tell the compiler about those entities and their features, which helps prevent errors. file-name is written without an extension.
Next in the source code file come one or more of protocols, wrappers, unions, or structs. Protocols are like this:
protocol protocol-name(argument:type, ...)
operator-declaration
array-declaration
block-declaration
...
The arguments are used for what other languages call generics and are optional. See below for block, array, and operator declarations (note that these are just declarations, not implementations). For instance:
protocol Map(Key:struct Comparable, Data:struct Object)
    block addElement(k:Key) withContents(e:Data):Void
    block removeElement(k:Key):Void
    block getElement(k:Key):Data
    block elementExists(k:Key):Bool
Note the use of struct Comparable and struct Object in the generics arguments to the protocol. This use says that Key and Data are actually types, not data. If we said only Comparable and Object, without struct, Key and Data would be data, not types. We may then say:
data myObject:Map(String, UWord)
myObject.removeElement("myElement")
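For readers more at home in a mainstream language, the Map protocol can be approximated in Python. This is only a sketch under assumptions: the class DictMap and the snake_case method names are made up for the illustration, and the type variables are left unbounded, so the Comparable and Object constraints are not modelled.

```python
from typing import Dict, Generic, Protocol, TypeVar

Key = TypeVar("Key")    # stands in for Key:struct Comparable
Data = TypeVar("Data")  # stands in for Data:struct Object


class Map(Protocol[Key, Data]):
    """Declarations only, like the Lune protocol: no implementations."""

    def add_element(self, k: Key, e: Data) -> None: ...
    def remove_element(self, k: Key) -> None: ...
    def get_element(self, k: Key) -> Data: ...
    def element_exists(self, k: Key) -> bool: ...


class DictMap(Generic[Key, Data]):
    """Hypothetical implementation backed by a Python dict."""

    def __init__(self) -> None:
        self._items: Dict[Key, Data] = {}

    def add_element(self, k: Key, e: Data) -> None:
        self._items[k] = e

    def remove_element(self, k: Key) -> None:
        # Removing a missing key is silently ignored in this sketch.
        self._items.pop(k, None)

    def get_element(self, k: Key) -> Data:
        return self._items[k]

    def element_exists(self, k: Key) -> bool:
        return k in self._items


# Rough counterpart of: data myObject:Map(String, UWord)
my_object: Map[str, int] = DictMap()
my_object.add_element("myElement", 42)
my_object.remove_element("myElement")
```

As in Lune, the type arguments (here str and int) are types, not data, and the protocol itself carries no code.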
Wrappers are just (extended) typedefs. This is the syntax:
wrapper wrapper-name(argument:type, ...):underlying-type
errors-declaration
literal-implementation
operator-implementation
array-implementation
block-implementation
...
Again, arguments are for generics and optional. underlying-type is the old type name. wrapper-name is the new type name. For example, we may simply say:
wrapper Handle:**Object
and, from then on, Handle and **Object will be synonyms. On the other hand, we have the option of completely redefining the interface of the underlying type and adding our own methods, operators, and more to it. We just cannot add new data members to a wrapper, so we have to work within the underlying type's restrictions.
Unions are like this:
union union-name(argument:type, ...)
data-declaration
union-declaration
struct-declaration
...
union-name is optional, as are arguments. If union-name is absent, an anonymous union is being declared, and the identifiers inside the union must be different from the other identifiers where the union is declared, to avoid naming conflicts.
Structs inside unions and inside other structs are declared like this:
struct struct-name(argument:type, ...)
data-declaration
union-declaration
struct-declaration
...
Here, struct-name is optional too, as are the arguments. Here is an example:
union Registers
    struct
        bytes[4] eax
        bytes[4] ebx
        bytes[4] ecx
        bytes[4] edx
    struct
        bytes[2] reserved
        bytes[2] ax
        bytes[2] reserved
        bytes[2] bx
        bytes[2] reserved
        bytes[2] cx
        bytes[2] reserved
        bytes[2] dx
    struct
        bytes[2] reserved
        byte ah
        byte al
        bytes[2] reserved
        byte bh
        byte bl
        bytes[2] reserved
        byte ch
        byte cl
        bytes[2] reserved
        byte dh
        byte dl
Note the use of the keyword reserved (reserved fields cannot be accessed) and of anonymous structs, similar to unions. Elsewhere we can say:
data r:Registers
r.eax <- r.bx × (r.cl / 2)
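To see how the three anonymous structs overlay the same storage, here is a small Python sketch that models the union as one 16-byte buffer. It assumes the most significant byte is stored first, which is what the field order above suggests (the reserved bytes come before ax, and ah comes before al); the value 0x12345678 is arbitrary.

```python
import struct

# One 16-byte buffer shared by all three views of the union.
storage = bytearray(16)

# First struct: eax occupies bytes 0..3 (assumed most significant byte first).
struct.pack_into(">I", storage, 0, 0x12345678)

# Second struct: bytes 0..1 are reserved, ax occupies bytes 2..3.
ax = struct.unpack_from(">H", storage, 2)[0]

# Third struct: bytes 0..1 are reserved, ah is byte 2, al is byte 3.
ah = storage[2]
al = storage[3]

print(hex(ax), hex(ah), hex(al))
```

Writing through one view changes what the other views read, which is the whole point of the union.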
Finally come the structs declared outside of other structs or unions. These are the classes, although they still use the keyword struct, like this:
struct struct-name(argument:type, ...):category-name
errors-declaration
equ-declaration
data-declaration
union-declaration
struct-declaration
literal-implementation
operator-implementation
array-implementation
block-implementation
initer-implementation
...
category-name may be omitted (in which case we don't write the colon), or it may be the super class name, a protocol name, Private, or a general identifier beginning with an uppercase letter. Usually, a class is just a sequence of these struct declarations, each grouped under its own category. If the category name is omitted, the class inherits from no super class. If it's omitted or is the super class name, data may be added in this category. If it's a protocol name, we must implement in the category the methods and operators of that protocol. If the category name is the reserved keyword Private, the methods implemented in that category are not accessible to other classes. A class may have only one category of each name, except Private, which may appear more than once in a class definition. Here follows an example.
struct Rectangle:GeometricFigure
    data x:Real, y:Real, w:Real, h:Real
    block initWith(x:Real) and(y:Real) and(w:Real) and(h:Real):Rectangle
        self.x <- x
        self.y <- y
        self.w <- w
        self.h <- h
        return(self)
    block area:Real
        return(w × h)
    block makeSquare(s:Real):Void
        w h <- s
struct Rectangle:ChangeOrigin
    block moveToOrigin:Void
        x y <- 0.0
    block centerOnOrigin:Void
        x <- -w / 2.0
        y <- -h / 2.0
Here we see that class Rectangle is being defined across two categories so far, namely GeometricFigure (its super class), and ChangeOrigin, a protocol with two methods declared elsewhere in the program and implemented here. We could have added more protocol categories, or even Private or general categories, but not more super categories or empty categories. Exactly one super category or empty category must be present, and it must be the first category of the class.
There are only four primitive data types, and they're all related:
bit
bits[unsigned-word]
byte
bytes[unsigned-word]
If we say:
data s:bit
data v:bits[7]
we end up with data occupying one full byte in memory, that is, bits and bytes are packed together, not spread across bytes as in other programming languages.
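The arithmetic is simply 1 + 7 = 8 bits, that is, one byte. A minimal Python sketch of such packing follows. The function name is hypothetical, and placing s in the most significant bit is an assumption made for the illustration; the text above does not say which end of the byte s lands on.

```python
def pack_bit_and_bits7(s: int, v: int) -> bytes:
    """Pack a 1-bit value and a 7-bit value into a single byte."""
    if s not in (0, 1):
        raise ValueError("s must fit in 1 bit")
    if not 0 <= v < 128:
        raise ValueError("v must fit in 7 bits")
    # Assumed layout: s in the top bit, v in the remaining seven bits.
    return bytes([(s << 7) | v])


packed = pack_bit_and_bits7(1, 0b0000101)
print(len(packed), hex(packed[0]))
```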
If we wish to pass blocks around, we may do so with the block type:
(argument-type, ...)->(return-type, ...)
This is the signature of a block, indicating both its argument types as well as its return types. It may be used anywhere we expect a block name to be passed around.
Errors are special kinds of enumerations, defined like this:
errors errors-name
error1
error2
error3 error4
...
error3 and error4 are synonyms. Errors can only be used with the statements error() and iferror(), explained below. Here is an example of a list of errors:
errors FileError
ReadAfterEOF
DiskFull
BadHandle
FileAlreadyOpen
Constants are declared in one of several ways. The easiest case is with just one constant:
static equ constant <- expression
This works if the constant is of the same type as the class. If not, we may use:
static equ constant1:type1 <- expression1, constant2:type2 <- expression2, ...
Any combination of these two cases is possible. static is optional and it says whether the constant belongs to the class or is different per instance. Enumerations are like this:
equ enum-name
name1
name2 name3
name4 <- unsigned-word
name5
...
The identifier enum-name is optional, and, if absent, care must be taken to avoid naming conflicts. name1 is 0 (zero), name2 and name3 are on the same line and therefore synonyms (both equal to 1), name4 is initialized with a number, name5 is that number plus 1, and so on.
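The numbering rule can be sketched in Python. The function and the row format below are made up for the illustration: each row is a list of names sharing a line, plus an optional explicit value, and 10 is an arbitrary choice for the unsigned-word given to name4.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[List[str], Optional[int]]


def number_enum(rows: List[Row]) -> Dict[str, int]:
    """Assign values following the rule described above."""
    values: Dict[str, int] = {}
    next_value = 0
    for names, explicit in rows:
        if explicit is not None:
            next_value = explicit      # name4 <- unsigned-word
        for name in names:
            values[name] = next_value  # names on one line are synonyms
        next_value += 1
    return values


layout: List[Row] = [
    (["name1"], None),
    (["name2", "name3"], None),
    (["name4"], 10),
    (["name5"], None),
]
enum_values = number_enum(layout)
print(enum_values)
```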
Data declarations are like this, for example:
static getter setter data data1:type1 <- expression1, data2:type2 <- expression2, ...
static, getter, and setter are optional. static says the data belongs to the class, not to its instances. getter allows the data to be read from outside of the class, like this:
x <- myObj.data
setter allows the data to be written to outside of its class:
myObj.data <- x
These are similar to the public/private mechanisms of other languages. The initializer expressions expression1, expression2, ... are also optional. A type is:
*type-name(generics)[array-size]
* indicates the data is in fact a pointer. We may have pointers to pointers to pointers... Generics were seen above, with the Map example. Arrays, if present, may be multidimensional, like this:
data myArray:UWord[10][20][30]
A literal is declared so:
literal regular-expression
statement
...
Here is an excerpt of an example:
struct Bool:Object
data value:bit
literal false
value <- 0
literal true
value <- 1
Operators come in several flavours:
prefixop symbol:return-type
statement
...
suffixop symbol:return-type
statement
...
linfixop symbol(argument:argument-type):return-type
statement
...
rinfixop symbol(argument:argument-type):return-type
statement
...
There are prefix operators, like ¬myVar, suffix operators like myVar--, and infix operators (left and right associative) like myVar1 < myVar2. symbol is any Unicode mathematical symbol, or combination (without whitespace), with a few exceptions. Operators may be overloaded. In case of ambiguities, the compiler should select longest match first, followed by order of imports, followed by order of implementation. The order of implementation gives us the operator precedence, from highest to lowest.
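The first tie-break rule, longest match, can be sketched in Python as follows. The function and the operator list are hypothetical, and the later rules (order of imports, order of implementation) are not modelled.

```python
from typing import Iterable


def longest_operator(text: str, pos: int, operators: Iterable[str]) -> str:
    """Return the longest known operator starting at text[pos], or ''."""
    best = ""
    for op in operators:
        if text.startswith(op, pos) and len(op) > len(best):
            best = op
    return best


# With both - and -- defined, myVar-- is read as one -- and not as two -.
chosen = longest_operator("myVar--", 5, ["-", "--"])
print(chosen)
```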
Array accesses are simply blocks with two special names, like the following:
block at(index:index-type):return-type
statement
...
block at(index:index-type) put(value:value-type):return-type
statement
...
We have already seen examples of blocks. They're just pieces of code, like this:
static block name1(argument1:argument1-type) name2+(argument2:argument2-type) ...:return-type, ...
statement
...
Note the similarity with Objective-C. Blocks may be overloaded. They may or may not be static. They may return more than one expression. Note the + after name2. This means one or more occurrences of name2(...) may appear in a call to this block, like this:
block myBlock(argX:Real) and+(argY:Real):Void
...
myObj.myBlock(x) and(y1) and(y2) and(y3)
The name of this block is myBlock()and+(). It cannot coexist, in a class, with myBlock()and() nor with myBlock+()and(). At most one + may be present in a block declaration, and it may appear anywhere, not just in the last part of a block name. To access the parameters in this example, we may use:
args.count
args[unsigned-word]
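A rough Python analogue of myBlock()and+() uses a variable-length parameter list. This is a sketch under one assumption worth flagging: here args holds only the repeated and() arguments, not argX, which the description above does not settle.

```python
from typing import List, Tuple


def my_block(arg_x: float, *args: float) -> Tuple[int, List[float]]:
    """Stand-in for myBlock()and+(): one argX, then one or more and() values."""
    if len(args) < 1:
        raise TypeError("and+() requires at least one occurrence")
    count = len(args)                           # plays the role of args.count
    values = [args[i] for i in range(count)]    # plays the role of args[i]
    return count, values


# Counterpart of: myObj.myBlock(x) and(y1) and(y2) and(y3)
count, values = my_block(1.0, 2.0, 3.0, 4.0)
print(count, values)
```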
An initer is simply code that gets called at class load time, to initialize static data or run other code at that time. Initers are usually the last blocks to be implemented in the class.
initer
statement
...
The simplest statement is probably
nop
which means no operation. Next comes a data declaration. There are two types of data declarations inside blocks. The first is similar to the data declaration we saw above, but has no getter or setter modifiers. It may have a static modifier. The other data declaration statement is like this:
|argument:type, ...|
This kind of statement is used like this:
myNumber.repeat
    |myWord:UWord|
    print(myWord)
repeat is a statement declared in the class UWord that expects a Block as an argument, and exports a single parameter caught with the |...| statement. In this example, repeat executes the block as many times as the contents of myNumber, from value 0 (zero) to value myNumber-1, and in each execution of the block, myWord contains this value. Data declarations must appear before all other statements in a block, first those in a |...| statement, and then those in normal data statements. Next are the assignments (we already saw a few):
data1 data2 ... <- expression
And the associated returns:
return(expression)
return(expression1) and+(expression2)
For example, the code:
block myFuncX(x:Real) andY(y:Real):Real, Real
    return(x / y) and(x \ y)
quo rem <- myFuncX(10.0) andY(5.0)
places the quotient of dividing 10.0 by 5.0 in quo and the remainder of that division in rem. If a block returns no values, a return may come all by itself:
return
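The two-value return of myFuncX above maps naturally onto a Python tuple. In this sketch the function name is adapted to Python conventions, and Python's % operator stands in for Lune's \ (remainder) operator.

```python
from typing import Tuple


def my_func_x_and_y(x: float, y: float) -> Tuple[float, float]:
    """Return the quotient and the remainder, like return(x / y) and(x \\ y)."""
    return x / y, x % y


# Counterpart of: quo rem <- myFuncX(10.0) andY(5.0)
quo, rem = my_func_x_and_y(10.0, 5.0)
print(quo, rem)
```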
A statement may also be an expression (see below). Next come the labels and the gotos:
statement1
...
@label
statement2
...
Labels are identifiers, and are preceded by the character @, indented at the same level as the (sub-)block where they appear. Gotos are like this:
goto(label)
The switch statement of C is also present:
case(expression)
if(expression1a) or+(expression1b) do
statement1
...
...
ifnone
statement2
...
If expression matches expression1a or expression1b, then do statement1. If not, repeat the test with the next if. If no matches were found, then do statement2. To fall through to the next case, append
continue
after the case's last statement. To exit from a case, or other control structure, write
break
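The matching and fall-through rules of case can be modelled in Python. This is a sketch under assumptions: each arm is a made-up triple (values to match, action, whether the arm ends with continue), and falling through from the last arm simply stops rather than running the ifnone part, which the description above leaves open.

```python
from typing import Callable, List, Sequence, Tuple

# (values to match, action to run, True if the arm ends with continue)
Arm = Tuple[Sequence[object], Callable[[], str], bool]


def run_case(value: object, arms: List[Arm], if_none: Callable[[], str]) -> List[str]:
    """Run the first matching arm, falling through while arms end in continue."""
    for start, (candidates, _, _) in enumerate(arms):
        if value in candidates:
            results = []
            for _, action, falls_through in arms[start:]:
                results.append(action())
                if not falls_through:
                    break
            return results
    return [if_none()]  # no arm matched: the ifnone part


arms: List[Arm] = [
    ((1, 2), lambda: "small", True),    # if(1) or(2) do ... continue
    ((3,), lambda: "three", False),     # if(3) do ...
    ((4,), lambda: "four", False),      # if(4) do ...
]
print(run_case(1, arms, lambda: "none"))
```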
Now, error handling. To throw an error, say:
error(error-type.error-name)
To catch an error, say:
iferror(error-type.error-name) or+(error-type) do
statement1
...
...
purge
statement2
...
These are usually the last statements in a block. Note that we may catch a single error, or a whole family of errors with these statements. Finally, there's the assembler statement:
statement1
...
asm(cpu)
    asm-statement1
    ...
@asm-label
    asm-statement2
    ...
statement2
...
Notice how the @asm-label is indented at the same level as the asm() statement.
Any expression may be enclosed in parentheses. It may be data, optionally preceded by an object (the same applies to the forms that follow):
object.data
It may be a constant:
object.equ
enum-name.equ
Or a block call:
object.block1(expression1) block2(expression2) ...
If one of the arguments of the block is a Block, the call goes like this:
object.block1
    statement
    ...
block2(expression2) ...
That is, the contents are indented. For example:
myBool.ifTrue
    statement-if-true
    ...
ifFalse
    statement-if-false
    ...
It may be a literal:
object.literal
It may be a struct or a union field:
struct.data
union.data
It may be the contents pointed to by a variable (we may have pointers to pointers to pointers...):
*data
It may be a meta access:
data@size
data@name
data@addr
data@type
Or an array access:
data[unsigned-word]
It may involve an operator:
prefix-op expression
expression suffix-op
expression infix-op expression
It may be a cast:
(expression, type)
It may be a primitive call:
pmtv(operation)
pmtv(operation) arg+(expression)
It may be the ternary operator of other languages:
boolean ? expression-if-true : expression-if-false
It may be:
self
super
Or it may be the following structure used to initialize arrays, dictionaries or maps, or other complex structures:
[expression1a | expression1b | ..., expression2a | expression2b | ..., ...]
Finally, it may be the array:
args
used when a block is declared with a variable number of arguments (with the + marker described earlier). We may say:
args.count
args[unsigned-word]
There is just one compiler directive, for conditional compilation:
#if condition
statement1
...
#elsif condition
statement2
...
#else
statement3
...
#fi
condition may be complex, using the logical operators ¬, ∧, and ∨, parentheses, and symbols defined in the call to the compiler.
Copyright © 2021 Rui Cuco. All rights reserved.
All trademarks mentioned in these pages belong to their respective owners.