Mech Language Specification

Introduction

This document specifies the default syntax of the Mech programming system, which is designed for developing reactive systems such as robots, games, and user interfaces. The syntax specified here is one of many possible syntaxes and interfaces for Mech, but it may be thought of as the default textual representation of Mech.

This specification starts by defining the most atomic elements of the language, and then builds up to more complex structures.

This document is for:

  • Language designers who want to implement a parser for Mech.

  • Tool developers who want to build tools that work with Mech code.

  • Developers who want to understand the syntax and structure of Mech programs.

Notation

The grammar is specified using an extended variant of Extended Backus-Naur Form (EBNF):

╭────────┬──────────┬────────────────────────────╮
│ Symbol │ Meaning  │ Semantics                  │
├────────┼──────────┼────────────────────────────┤
│  "abc" │ terminal │ string literal "abc"       │
│ p1, p2 │ sequence │ p1 followed by p2          │
│ p1 | p2│ choice   │ p1 or p2                   │
│ [p, q] │ list     │ list of p delimited by q   │
│   *p   │ repeat 0 │ p for 0 or more times      │
│   +p   │ repeat 1 │ p for 1 or more times      │
│   ?p   │ optional │ p for 0 or 1 time          │
│   >p   │ peek     │ p; do not consume input    │
│   ¬p   │ not      │ does not match p           │
│  (...) │ group    │ increase precedence        │
╰────────┴──────────┴────────────────────────────╯

The grammar of the grammar notation itself:

grammar := + rule ;
rule := identifier
, ":="
, expression
, ";"
;
expression := term , * ( "|" , term ) ;
term := factor , * ( "," , factor ) ;
definition := identifier ;
repeat0 := "*" , factor ;
repeat1 := "+" , factor ;
optional := "?" , factor ;
peek := ">" , factor ;
not := "¬" , factor ;
list := "["
, "]"
;
group := "(" , expression , ")" ;
terminal := quote , + any-token , quote ;
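
The operators in the notation table map naturally onto parser combinators. The Python sketch below is only an illustration of how each operator behaves; it is not the Mech parser, and every function name in it is hypothetical. It assumes that choice is ordered (the first matching alternative wins) and that a list requires at least one item, neither of which is stated by the table. A parser here is a function that takes the input text and a position and returns the new position, or None on failure.

```python
# Illustrative sketch of the notation's operators as parser combinators.
# A parser is a function (text, pos) -> new position, or None on failure.
# All names are hypothetical; this is not the Mech implementation.

def terminal(s):                 # "abc"
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*parsers):               # p1, p2
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):            # p1 | p2 (assumed ordered: first match wins)
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse

def repeat0(p):                  # *p
    def parse(text, pos):
        while True:
            end = p(text, pos)
            if end is None or end == pos:   # stop on failure or no progress
                return pos
            pos = end
    return parse

def repeat1(p):                  # +p
    return seq(p, repeat0(p))

def optional(p):                 # ?p
    def parse(text, pos):
        end = p(text, pos)
        return pos if end is None else end
    return parse

def peek(p):                     # >p (must match, consumes nothing)
    def parse(text, pos):
        return pos if p(text, pos) is not None else None
    return parse

def not_(p):                     # ¬p (must not match, consumes nothing)
    def parse(text, pos):
        return pos if p(text, pos) is None else None
    return parse

def list_of(p, q):               # [p, q] (assumed: one or more p separated by q)
    return seq(p, repeat0(seq(q, p)))

# Usage: the binary-literal rule and a comma-separated list of digits.
digit = choice(*[terminal(c) for c in "0123456789"])
binary_literal = seq(terminal("0b"), repeat1(choice(terminal("0"), terminal("1"))))
digit_list = list_of(digit, terminal(","))

print(binary_literal("0b1010", 0))   # 6 (all six characters consumed)
print(digit_list("1,2,3", 0))        # 5
```

Returning only a position keeps the sketch short; a practical parser would also build a syntax tree and report error locations.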

Source Code Representation

File Format

Mech source code can be stored in files with either the .mec or .🤖 extension.

Source code is encoded with UTF-8, which allows for Unicode support directly in the code. This choice has several benefits:

  • Makes code more accessible to non-English speakers

  • Enables domain experts to use notation from their fields directly in code

  • Allows for more expressive and intuitive naming conventions

  • Supports mathematical notation that closely resembles standard written forms

Literate Programming and Mechdown

Literate programming, introduced by Donald Knuth, presents programs as structured documents where explanations take precedence over code. In Mech, we support this concept through a format called Mechdown, a superset of Markdown with extensions to support embedded Mech syntax. This document is formatted with Mechdown to demonstrate how it is used.

For more information, see the Mechdown section.

Whitespace

Whitespace in Mech is used to separate tokens and is generally ignored. This includes spaces, tabs, and newlines. However, whitespace can be significant in certain contexts:

  • In lists, whitespace is used to separate items.

  • In matrix and table definitions, whitespace delimits columns and rows.

  • In formulas, whitespace is required around operators.

Semicolons and commas are treated as whitespace in most cases, except in these contexts:

  • Semicolons can be used to separate statements so they can be written inline.

  • Semicolons can be used in a matrix to separate rows, so they can be written inline.

For more information, see the Whitespace design document.
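
As a rough illustration of the whitespace rules for matrices, the Python sketch below splits the body of a matrix literal into rows and columns. It assumes, following the rules above, that newlines and semicolons separate rows and that spaces, tabs, and commas separate columns. The function name `split_matrix_body` is hypothetical, elements are left as raw strings, and elements that themselves contain whitespace (such as strings) are not handled.

```python
import re

def split_matrix_body(body: str) -> list[list[str]]:
    """Split the text between '[' and ']' into rows of element strings."""
    rows = []
    # Newlines and semicolons both end a row.
    for line in re.split(r"[;\n]", body):
        # Spaces, tabs, and commas all separate columns.
        cells = [c for c in re.split(r"[ \t,]+", line) if c]
        if cells:                       # skip blank rows
            rows.append(cells)
    return rows

print(split_matrix_body("1 2 3; 4 5 6"))     # [['1', '2', '3'], ['4', '5', '6']]
print(split_matrix_body("\n1, 2\n3, 4\n"))   # [['1', '2'], ['3', '4']]
```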

Lexical Elements

Tokens

Tokens are the smallest units of meaning in Mech. They include letters, digits, punctuation, and special symbols.

Some Unicode characters are reserved for box drawing and are therefore excluded from valid identifiers.

alpha := "a" .. "z" | "A" .. "Z" ;
digit := "0" .. "9" ;
emoji := + emoji-grapheme ;
word := + alpha ;
digit1 := + digit ;
digit0 := * digit ;
bin-digit := "0" | "1" ;
hex-digit := digit | "a" .. "f" | "A" .. "F" ;
oct-digit := "0" .. "7" ;
number := digit1 ;

Identifiers

identifier := ( alpha | emoji ) , * ( alpha ) ;

Identifiers start with a letter or an emoji (most UTF-8 encoded emoji are accepted) and can contain alphanumeric characters, most emoji, and the characters /, *, +, -, and ^.

Examples:

Hello-Word
io/stdout
Δx^2
🤖
A*

Keywords

There are only two keywords in Mech:

true
false

Combined with Mech's Unicode support, this allows users to write code in their native language without the need to learn English keywords.

Operators and Punctuation

Punctuation:

. ! ? , : ; ' "

Symbols:

& | @ / # = \ ~ + - * ^ _

Grouping symbols:

( ) < > { } [ ]

Operators:

Assign       := = += -= *= /= ^=
Arithmetic   + - * / ^ %
Split        >- -<
Matrix       ** · ⨯ \ ' 
Logic        | & xor ⊕ ⊻ ! ¬
Set          ∪ ∩ ∖ ∁ ⊆ ⊇ ⊊ ⊋ ∈ ∉ 
Range        .. ..=
Condition    != ¬= ≠ == > < >= <= ≤ ≥
Transition   => -> ~>
Guard        | │ ├ └

Comments

comment-sigil := "--" | "//" ;

Examples:

-- Single line comment.
// Also a single line comment.

Literals

Integers

number := real-number , ? "i" | ? ( "+" , real-number , "i" ) ;
integer-literal := + digit ;
decimal-literal := "0d" , + digit ;
hexadecimal-literal := "0x" , + hex-digit ;
octal-literal := "0o" , + oct-digit ;
binary-literal := "0b" , + bin-digit ;

An integer literal is a sequence of digits representing a whole number. Mech supports decimal, binary, octal, and hexadecimal integer literals, distinguished by prefix:

  • Decimal: A sequence of digits with no prefix or with the 0d prefix (e.g., 42, 123456, 0d42).

  • Binary: Prefixed with 0b, containing only 0 and 1 (e.g., 0b1010).

  • Octal: Prefixed with 0o, containing digits 0-7 (e.g., 0o755).

  • Hexadecimal: Prefixed with 0x, containing digits 0-9 and a-f or A-F (e.g., 0x1A3F).

Examples:

42
0b1010
0o755
0x1A3F
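
The prefix rules above can be sketched as a small conversion routine. This Python example is an illustration only: the name `parse_integer_literal` and the decision to raise `ValueError` on malformed input are assumptions, not part of Mech.

```python
import string

# Prefix -> numeric base, following the literal grammar above.
PREFIX_BASES = {"0b": 2, "0o": 8, "0d": 10, "0x": 16}

# Digit characters allowed in each base.
DIGITS = {2: "01", 8: "01234567", 10: string.digits, 16: string.hexdigits}

def parse_integer_literal(text: str) -> int:
    """Convert an integer literal such as 42, 0b1010, 0o755, or 0x1A3F to an int."""
    prefix = text[:2]
    if prefix in PREFIX_BASES:
        base, digits = PREFIX_BASES[prefix], text[2:]
    else:
        base, digits = 10, text          # no prefix means decimal
    # Check digits explicitly: int() alone would also accept signs,
    # underscores, and surrounding whitespace.
    if not digits or any(ch not in DIGITS[base] for ch in digits):
        raise ValueError(f"invalid integer literal: {text!r}")
    return int(digits, base)

for literal in ["42", "0b1010", "0o755", "0x1A3F"]:
    print(literal, "=", parse_integer_literal(literal))
# 42 = 42
# 0b1010 = 10
# 0o755 = 493
# 0x1A3F = 6719
```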

Floats

scientific-literal := ( float-literal | integer-literal )
, ( "e" | "E" )
, ? plus
, ? dash
;
float-literal := ? "."
, ? "."
;

Examples:

3.14
0.001
2.5e+10

Strings

string := quote , * ( ¬ quote , text ) , quote ;

Examples:

"Hello, World!"
"characters like " and \ are escaped with, \ e.g. \""

Boolean

true-symbol := "✓" ;
false-symbol := "✗" ;
english-true-literal := "true" ;
english-false-literal := "false" ;

Examples:

true
false

Atoms

atom := "`" , identifier ;

Examples:

`A
`MyAtom
`MyAtom123
`🐦

Empty

empty := + underscore ;

Examples:

_

Kinds

kind-annotation := "<" , kind , ">" ;
kind-empty := + "_" ;
kind-atom := "`" , identifier ;
kind-map := "{"
, ":"
, "}"
;
kind-fxn := "("
, ? [ kind , "," ]
, ")"
, "="
, "("
, ? [ kind , "," ]
, ")"
;
kind-brace := "{"
, [ kind , "," ]
, "}"
, ? ":"
, ? [ literal , "," ]
;
kind-bracket := "["
, [ kind , "," ]
, "]"
, ? ":"
, ? [ literal , "," ]
;
kind-tuple := "(" , [ kind , "," ] , ")" ;
kind-scalar := identifier ;

Data Structures

Data structures in Mech can be broadly classified into two categories: ordered collections that allow duplicate elements, and unordered collections that do not allow duplicate elements.

Ordered elements, duplicates allowed

  • Vector (Nx1)

  • Row Vector (1xN)

  • Matrix (N-D)

  • Tuple

Unordered elements, no duplicates

  • Record

  • Table

  • Set

  • Map

Each data structure has its own semantics, which will be described in this section.

Matrix

A matrix is a numbered sequence of elements of a single kind, arranged in rows and columns. The number of elements is called the length of the matrix and is never negative. The number of rows and columns is called the shape of the matrix. The shape is always a pair of non-negative integers.

The shape is part of the matrix's kind; it must evaluate to a non-negative constant representable by a tuple of kind (index,index). The shape of matrix A can be discovered using the built-in function matrix/shape(). The elements can be addressed by indices (1,1) through matrix/shape(A).

Matrix kinds are always two-dimensional, so row vectors and column vectors are also represented as matrices with a single row or column, respectively.

Examples:

-- 3x3 Matrix
[
1 4 7
2 5 8
3 6 9
]
-- 3x2 Matrix
[
1 2
3 4
5 6
]
-- 1x3 Row Vector
[
1 2 3
]
-- 3x1 Column Vector
[
1
2
3
]
-- 4x1 Column Vector
[
1
2
3
4
]

Fancy matrix syntax is supported so that formatted output from the Mech REPL can be used as program source or input.

┏           ┓
┃ 1   2   3 ┃
┃ 4   5   6 ┃
┗           ┛

The elements of a matrix are indexed in the following ways:

  • 1D - by their position in the matrix, starting from 1, in column-major order: beginning at the top-left corner and proceeding down each column, then to the next column on the right. The last element of the matrix, called the end of the matrix, is the bottom-right element.

  • 2D - by the row and column index, starting from 1 for each.

Negative indices indicate counting from the end of the matrix. For example, -1 is the last element, -2 is the second to last element, and so on.
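
The relationship between 1D and 2D indexing can be sketched in Python. The function below is an illustration under stated assumptions: its name is hypothetical, it treats index 0 and out-of-range indices as errors (the text above does not say how Mech handles them), and it represents a matrix as a list of rows purely for demonstration.

```python
def linear_to_row_col(k: int, rows: int, cols: int) -> tuple[int, int]:
    """Map a 1-based linear index to a 1-based (row, column) pair, column-major."""
    n = rows * cols
    if k < 0:
        k = n + 1 + k            # -1 -> n (the end), -2 -> n - 1, ...
    if not 1 <= k <= n:
        raise IndexError(f"index out of range for a {rows}x{cols} matrix")
    # Walk down each column before moving right to the next one.
    col, row = divmod(k - 1, rows)
    return row + 1, col + 1

# A 2x3 matrix stored as a list of rows.
m = [[1, 2, 3],
     [4, 5, 6]]

for k in [1, 2, 3, -1, -2]:
    r, c = linear_to_row_col(k, rows=2, cols=3)
    print(k, "->", (r, c), "element", m[r - 1][c - 1])
# 1 -> (1, 1) element 1
# 2 -> (2, 1) element 4
# 3 -> (1, 2) element 2
# -1 -> (2, 3) element 6
# -2 -> (1, 3) element 3
```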

Set

Examples:

{ 1 , 2 , 3 }

Tuple

tuple := "(" , ? [ expression , "," ] , ")" ;
tuple-struct := atom
, "("
, ")"
;

Examples:

( )
( 1 )
( 1 , 1 , 3 )
( 1 , ( 2 , 3 ) )
( 1 , true , "Hello" )

Record

Examples:

{ x : 1 , y : "a" , z : [
1
2
3
]
}
{ x : 1 , y : "a" , z : [
1 2 3
]
}
{ a : { b : 1 , c : "hi" } , b : [
1 2 3
]
}

Expressions

Arithmetic

add := "+" ;
subtract := "-" ;
multiply := "*" ;
divide := "/" ;
exponent := "^" ;
remainder := "%" ;

Matrix

solve := "\\" ;
dot-product := "·" ;
cross-product := "⨯" ;
matrix-multiply := "**" ;
transpose := "'" ;

Comparison

not-equal := "!=" | "¬=" | "≠" ;
equal := "==" ;
greater := ">" ;
less := "<" ;
greater-equal := ">=" | "≥" ;
less-equal := "<=" | "≤" ;

Logical

or := "|" ;
and := "&" ;
not := "!" | "¬" ;
exclusive-or := "xor" | "⊕" | "⊻" ;

Set

union := "∪" ;
intersection := "∩" ;
difference := "∖" ;
complement := "∁" | "'" ;
subset := "⊆" ;
superset := "⊇" ;
proper-subset := "⊊" ;
proper-superset := "⊋" ;
element-of := "∈" ;
not-element-of := "∉" ;
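
For readers unfamiliar with the symbols, the Python sketch below pairs each binary set operator with its conventional mathematical meaning, using Python's built-in sets. It assumes Mech follows the standard definitions, which this section does not state explicitly. The table name `SET_OPERATORS` is hypothetical, and complement is omitted because it requires a universe set.

```python
import operator

# Operator symbol -> conventional meaning, expressed on Python sets.
SET_OPERATORS = {
    "∪": operator.or_,                  # union
    "∩": operator.and_,                 # intersection
    "∖": operator.sub,                  # difference
    "⊆": operator.le,                   # subset
    "⊇": operator.ge,                   # superset
    "⊊": operator.lt,                   # proper subset
    "⊋": operator.gt,                   # proper superset
    "∈": lambda x, s: x in s,           # element of
    "∉": lambda x, s: x not in s,       # not element of
}

a, b = {1, 2, 3}, {3, 4}
print(SET_OPERATORS["∪"](a, b))    # {1, 2, 3, 4}
print(SET_OPERATORS["∩"](a, b))    # {3}
print(SET_OPERATORS["∖"](a, b))    # {1, 2}
print(SET_OPERATORS["⊊"](a, a))    # False: a set is not a proper subset of itself
print(SET_OPERATORS["∈"](1, a))    # True
```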

Range

rangeinclusive := "..=" ;
rangeexclusive := ".." ;
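
The difference between the two range operators can be illustrated with a short Python sketch. It assumes, based only on the rule names above, that `a..b` stops before `b` while `a..=b` includes it, and that ranges step by 1 over integers; both function names are hypothetical.

```python
def range_exclusive(start: int, end: int) -> list[int]:
    """Assumed meaning of start..end: the end value is excluded."""
    return list(range(start, end))

def range_inclusive(start: int, end: int) -> list[int]:
    """Assumed meaning of start..=end: the end value is included."""
    return list(range(start, end + 1))

print(range_exclusive(1, 5))   # [1, 2, 3, 4]
print(range_inclusive(1, 5))   # [1, 2, 3, 4, 5]
```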

Indexing

Slicing

bracket-subscript := "[" , [ ( select-all | range-subscript | formula-subscript ) , "," ] , "]" ;
brace-subscript := "{" , [ ( select-all | formula-subscript ) , "," ] , "}" ;
formula-subscript := formula ;
range-subscript := range-expression ;
select-all := ":" ;

Dot Index

dot-subscript := "." , identifier ;
dot-subscript-int := "." , integer-literal ;

Swizzle

swizzle-subscript := "."
, ","
, [ identifier , "," ]
;

Statements

Variable Define

define-operator := ":=" ;

Variable Assign

assign-operator := "=" ;

Op-Assign

add-assign-operator := "+=" ;
sub-assign-operator := "-=" ;
mul-assign-operator := "*=" ;
div-assign-operator := "/=" ;
exp-assign-operator := "^=" ;

Enum Define

enum-variant-kind := "(" , kind-annotation , ")" ;

Functions

Function Define

function-out-args := "(" , [ function-arg , list-separator ] , ")" ;
function-out-arg := function-arg ;
argument-list := "(" , ? [ ( call-arg-with-binding | call-arg ) , "," ] , ")" ;

State Machines

Operators

output-operator := "=>" ;
transition-operator := "->" ;
async-transition-operator := "~>" ;
guard-operator := "|"
| "│"
| "├"
| "└"
;

Specification

fsm-specification := "#"
, "("
, ? [ var , "," ]
, ")"
, "."
;
fsm-tuple-struct := grave
, "("
, [ fsm-pattern , "," ]
, ")"
;
fsm-state-definition-variables := "(" , ? [ var , list-separator ] , ")" ;
fsm-instance := "#" , identifier , ? fsm-args ;
fsm-args := "(" , ? [ ( call-arg-with-binding | call-arg ) , list-separator ] , ")" ;

Mech Programs

Programs

program := ? title , body ;
body := + section ;

Mechdown

Markdown

title := + text
, + "="
, * ( space | tab )
;
subtitle := + digit-token
, "."
, + text
, + "-"
, * ( space | tab )
, * ( space | tab )
;
number-subtitle := * ( space | tab )
, "("
, ")"
, + ( space | tab )
, + text
, * ( space | tab )
;
alpha-subtitle := * ( space | tab )
, "("
, ")"
, + ( space | tab )
, + text
, * ( space | tab )
;
paragraph-element := + ( ¬ define-operator , text ) ;
unordered-list := + list-item , ? new-line , * whitespace ;

Mech Extensions

mech-code := mech-code-alt , ( "\n" | ";" | comment ) ;