execline: language design and grammar

`execline` principles

Here are some basic Unix facts:

Unix programs are started with the execve() system call, which takes 3 arguments: the command name (which we won't discuss here because it's redundant in most cases), the command line argv, which specifies the program name and its arguments, and the environment envp.

The argv structure makes it easy to read some arguments at the beginning of argv, perform some action, then execve() into the rest of argv. For instance, the nice command works that way:
```
 nice -10 echo blah 
```
will read nice and -10 from the argv, change the process' nice value, then exec into the command echo blah. This is called chain loading by some people, and Bernstein chaining by others.
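The chain-loading pattern can be sketched in a few lines of Python. This is an illustration only, not how the real nice is implemented: `parse_nice_argv` and `chain_load` are hypothetical names, and only the legacy `-N` increment form used in the example above is handled.

```python
import os


def parse_nice_argv(argv):
    """Split ['nice', '-10', 'echo', 'blah'] into (10, ['echo', 'blah']).

    Only the legacy '-N' increment form is handled (an assumption made
    to keep the sketch short).
    """
    if len(argv) < 3 or not argv[1].startswith("-") or not argv[1][1:].isdigit():
        raise ValueError("usage: nice -N command [args...]")
    return int(argv[1][1:]), argv[2:]


def chain_load(argv):
    """Consume our own arguments, act, then exec into the rest of argv."""
    increment, rest = parse_nice_argv(argv)
    os.nice(increment)        # change this process' nice value
    os.execvp(rest[0], rest)  # replace this process; never returns on success
```

Calling `chain_load(["nice", "-10", "echo", "blah"])` would adjust the priority of the current process and then become echo blah; no parent stays behind waiting for it.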
The purpose of the environment is to preserve some state across execve() calls. This state is usually small: most programs keep their information in the filesystem.

A script is basically a text file whose meaning is a sequence of actions, i.e. calls to Unix programs, with some control over the execution flow. You need a program to interpret your script. Traditionally, this program is /bin/sh: scripts are written in the shell language.

The shell reads and interprets the script command after command. That means it must preserve a state, and stay in memory while the script is running.

Standard shells have lots of built-in features and commands, so they are big. Spawning (i.e. fork()ing then exec()ing) a shell script takes time, because the shell program itself must be initialized. For simple programs like nice -10 echo blah, a shell is overkill: we only need a way to make an argv from the "nice -10 echo blah" string, and execve() into that argv.

Unix systems have a size limit for argv+envp, but it is high. POSIX states that this limit must be at least 4 KB, and most simple scripts are smaller than that. Modern systems have a much higher limit: for instance, it is 64 KB on FreeBSD-4.6, and 128 KB on Linux.
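To get a feel for how far a small script is from that limit, here is a rough Python sketch. The helper names `argv_envp_size` and `fits_in_arg_max` are hypothetical, and the byte count is only an estimate: it counts each string plus its terminating NUL, whereas a kernel may also count pointers and padding.

```python
import os


def argv_envp_size(argv, env):
    """Approximate bytes used by the argv and envp strings.

    Each string is counted with its terminating NUL byte. Kernels may
    also count pointers and padding, so treat this as a lower bound.
    """
    arg_bytes = sum(len(a.encode()) + 1 for a in argv)
    env_bytes = sum(len(f"{k}={v}".encode()) + 1 for k, v in env.items())
    return arg_bytes + env_bytes


def fits_in_arg_max(argv, env):
    """Compare the estimate with the limit reported by the system."""
    limit = os.sysconf("SC_ARG_MAX")  # may be -1 if the limit is indeterminate
    return limit < 0 or argv_envp_size(argv, env) <= limit
```

For the running example, `argv_envp_size(["nice", "-10", "echo", "blah"], {})` gives 19 bytes, far below even the 4 KB POSIX minimum.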

Knowing that, and wanting lightweight and efficient scripts, I wondered: "Why should the interpreter stay in memory while the script is executing? Why not parse the script once and for all, put it all into one argv, and just execute into that argv, relying on external commands (which will be called from within the script) to control the execution flow?"

execline was born.

execline is the first script language to rely entirely on chain loading. An execline script is a single argv, made of a chain of programs designed to perform their action then exec() into the next one.
The execlineb command is a launcher: it reads and parses a text file, converting it to an argv, then executes into that argv. It does nothing more.
Straightforward scripts like nice -10 echo blah will be run just as they are, without the shell overhead. Here is what the script could look like:
```
#!/command/execlineb -P
nice -10
echo blah
```
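The launcher idea (parse once, build one argv, exec into it) can be sketched as follows. This is not the execlineb implementation: `script_to_argv` and `launch` are hypothetical names, the lexing is delegated to Python's `shlex`, which follows shell-like quoting rather than execline's own rules, and blocks are not handled at all.

```python
import os
import shlex


def script_to_argv(text):
    """Turn script text into a flat argv.

    shlex treats '#' as a comment starter, which also drops the '#!'
    line. This approximates, but is not, the real execlineb lexer.
    """
    return shlex.split(text, comments=True)


def launch(path):
    """Parse the file once, then exec into the resulting argv."""
    with open(path) as f:
        argv = script_to_argv(f.read())
    if not argv:
        raise SystemExit(0)   # empty script: behave like the true command
    os.execvp(argv[0], argv)  # the launcher is gone after this call
```

With the script above as input, `script_to_argv` returns `["nice", "-10", "echo", "blah"]`; after the exec, nothing of the launcher remains in memory.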
More complex scripts will include calls to other execline commands, which are meant to provide some control over the process state and execution flow from inside an argv.

Grammar of an execline script

An execline script can be parsed as follows:

 <instruction> = <> | external options <arglist> <instruction> | builtin options <arglist> <blocklist> <instruction>
 <arglist> = <> | arg <arglist>
 <blocklist> = <> | <block> <blocklist>
 <block> = { <arglist> } | { <instrlist> }
 <instrlist> = <> | <instruction> <instrlist>

(This grammar is ambiguous, but much simpler to understand than the unambiguous ones.)

An execline script is valid if it reduces to an instruction.
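As a partial illustration of the grammar, the Python sketch below groups the words of a script into nested lists, one per { ... } block. It is a hypothetical helper (`parse_blocks`), not part of execline: it checks only that braces are balanced, which is necessary but not sufficient for validity, and it does not distinguish builtins from external commands or arglist blocks from instruction blocks.

```python
def parse_blocks(words):
    """Group words into nested lists, one list per { ... } block.

    Raises ValueError on unbalanced braces. Braces are assumed to be
    standalone words, as in the examples in this document.
    """
    root = []
    stack = [root]  # innermost open block is stack[-1]
    for word in words:
        if word == "{":
            block = []
            stack[-1].append(block)
            stack.append(block)
        elif word == "}":
            if len(stack) == 1:
                raise ValueError("unmatched '}'")
            stack.pop()
        else:
            stack[-1].append(word)
    if len(stack) != 1:
        raise ValueError("unmatched '{'")
    return root
```

For example, `parse_blocks("foreground { sleep 1 } echo blah".split())` yields `["foreground", ["sleep", "1"], "echo", "blah"]`.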

The empty instruction is the same as the true command: when an execline component must exec into the empty instruction, it exits 0.

Basically, every non-empty instruction, be it "builtin" - an execline command - or "external" - a program such as echo or cp - takes a number of arguments, the arglist, then executes into a (possibly empty) instruction.

Some builtins are special because they also take a non-empty blocklist after their arglist. For instance, the foreground command takes an empty arglist and one block:
```
 #!/command/execlineb -P
 foreground { sleep 1 } echo blah
```
is a valid execlineb script. The foreground command runs the sleep 1 block as a child process, waits for it to finish, then execs into the remaining echo blah instruction.
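The control flow of a command like foreground can be sketched in Python. This is a simplified model under stated assumptions, not the real implementation: the function receives its block and the remaining instruction as two already separated lists (how they are encoded inside a single argv is not shown here), and anything else the real command may do is ignored.

```python
import os
import sys


def foreground(block, rest):
    """Run `block` in a child, wait for it, then exec into `rest`."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(block[0], block)  # child becomes the block's command
        finally:
            os._exit(127)               # reached only if the exec failed
    os.waitpid(pid, 0)                  # parent waits for the block to finish
    if not rest:
        sys.exit(0)                     # empty instruction behaves like true
    os.execvp(rest[0], rest)            # parent becomes the remaining instruction
```

Only the block needs a fork; the process that ran foreground does not stay around afterwards, since it turns into the next instruction itself.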

execline features

execline commands can perform some transformations on their argv, to emulate some aspects of a shell. Here are descriptions of these features:
