Value transformation
You can apply three kinds of transformations to a value that is to be substituted for a variable: crunching, chomping and splitting. They always occur in that order.
Delimiters
The transformations work around delimiters. Delimiters are the semantic bounds of the "words" in your value. You can use any character (except the null character, which you cannot use in execline scripts) as a delimiter, by giving a string consisting of all the delimiters you want as the argument to the -d option used by substitution commands. By default, the string " \n\r\t" is used, which means that the default delimiters are spaces, newlines, carriage returns and tabs.
(The forstdin command is a small exception: by default, it only recognizes newlines as delimiters.)
Crunching
You can tell the substitution command to merge sets of consecutive delimiters into a single delimiter. For instance, you can replace three consecutive spaces, or a space followed by 4 tab characters, with a single space. This is called crunching, and it is done by giving the -C switch to the substitution command. The remaining delimiter will always be the first in the sequence. Crunching is off by default; you can also turn it off explicitly by giving the -c switch.
Crunching is mainly useful when also splitting.
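The crunching rule above can be modelled in a few lines of Python. This is only a sketch of the behaviour as described in this section, not execline's actual implementation; the function name crunch and the constant DEFAULT_DELIMS are hypothetical names chosen for the illustration.

```python
# Default delimiter string documented above: space, newline, carriage return, tab.
DEFAULT_DELIMS = " \n\r\t"

def crunch(value, delims=DEFAULT_DELIMS):
    """Merge each run of consecutive delimiters into its first delimiter.

    Hypothetical model of the -C switch, written from the description
    in this section rather than from execline's source code.
    """
    out = []
    in_run = False  # True while we are inside a run of delimiters
    for ch in value:
        if ch in delims:
            if not in_run:
                out.append(ch)  # keep only the first delimiter of the run
            in_run = True
        else:
            out.append(ch)
            in_run = False
    return "".join(out)

print(repr(crunch("a   b")))         # 'a b'
print(repr(crunch("a \t\t\t\tb")))   # 'a b'
print(repr(crunch("a\t  b")))        # 'a\tb'
```

The third call shows that the surviving delimiter is the first one of the run (here a tab), not necessarily a space.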
Chomping
Sometimes you don't want the last delimiter in a value. Chomping deletes the last character of a value if it is a delimiter. It is requested by giving the -n switch to the substitution command. You can turn it off by giving the -N switch. It is off by default unless mentioned in the documentation page of specific binaries. Note that chomping always happens after crunching, which means you can use crunching+chomping to ignore, for instance, a set of trailing spaces.
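As a rough illustration, chomping as described above amounts to the following Python sketch. The name chomp and the constant DEFAULT_DELIMS are hypothetical; this models the documented behaviour of the -n switch and is not taken from execline's source.

```python
# Default delimiter string documented above: space, newline, carriage return, tab.
DEFAULT_DELIMS = " \n\r\t"

def chomp(value, delims=DEFAULT_DELIMS):
    """Delete the last character of value if, and only if, it is a delimiter.

    Hypothetical model of the -n switch, based on the description above.
    """
    if value and value[-1] in delims:
        return value[:-1]
    return value

print(repr(chomp("foo\n")))   # 'foo'
print(repr(chomp("foo")))     # 'foo'
print(repr(chomp("foo   ")))  # 'foo  '
```

The last call removes only one of the three trailing spaces, which is why crunching first is needed if you want to ignore a whole set of trailing delimiters.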
Splitting
In a shell, when you write
$ A='foo bar' ; echo $A
the echo command is given two arguments, foo and bar. The $A value has been split, and the space between foo and bar acted as a delimiter.
If you want to avoid splitting, you must write something like
$ A='foo bar' ; echo "$A"
The double quotes "protect" the spaces. Unfortunately, it's easy to forget them and perform unwanted splits during script execution: countless bugs happen because of the shell's splitting behaviour.
execline provides a splitting facility, with several advantages over the shell's:
- Splitting is off by default, which means that substitutions are performed as is, without interpreting the characters in the value. In execline, splitting has to be explicitly requested by specifying the -s option to commands that perform substitution.
- Positional parameters are never split, so that execline scripts can handle arguments the way the user intended. To split $1, for instance, you have to ask for it specifically:
#!/command/execlineb -S1
define -sd" " ARG1S $1
blah $ARG1S
and $ARG1S will be split using the space character as only delimiter.
- Any character can be a delimiter.
How it works
- A substitution command can request that the substitution value be split, via the -s switch.
- The splitting function parses the value, looking for delimiters. It fills up a structure, marking the split points, and the number n of words the value is to be split into.
- A word is a sequence of characters in the value terminated by a delimiter. The delimiter is not included in the word.
- If the value begins with x delimiters, the word list will begin with x empty words.
- The last sequence of characters in the value will be recognized as a word even if it is not terminated by a delimiter. The one exception is when you have requested chomping and there was no delimiter at the end of the value before the chomp operation: in that case, the last sequence will not appear at all.
- The substitution rewrites the argv. A non-split value will be written as one word in the argv; a split value will be written as n separate words.
- Substitution of split values is performed recursively.
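The word-building rules listed above, including the interaction with chomping, can be sketched in Python as follows. This is a model written from the description in this section only; the function name split_value and its parameters are hypothetical, and it does not cover crunching, the argv rewriting, or recursive substitution.

```python
# Default delimiter string documented above: space, newline, carriage return, tab.
DEFAULT_DELIMS = " \n\r\t"

def split_value(value, delims=DEFAULT_DELIMS, chomp=False):
    """Chomp (if requested), then split value into a list of words.

    Hypothetical model of the rules above, not execline's actual code.
    """
    chomped = False
    if chomp and value and value[-1] in delims:
        value = value[:-1]          # chomping removes one trailing delimiter
        chomped = True

    words = []
    current = []
    for ch in value:
        if ch in delims:
            # A delimiter terminates the current word (possibly an empty one).
            words.append("".join(current))
            current = []
        else:
            current.append(ch)

    if current:
        # Unterminated last sequence: kept, unless chomping was requested
        # and the value had no trailing delimiter to chomp.
        if not chomp or chomped:
            words.append("".join(current))
    return words

print(split_value("foo bar"))                # ['foo', 'bar']
print(split_value("  a b"))                  # ['', '', 'a', 'b']
print(split_value("foo bar\n", chomp=True))  # ['foo', 'bar']
print(split_value("foo bar", chomp=True))    # ['foo']
```

The last call illustrates the exception: with chomping requested and no trailing delimiter in the value, the final unterminated sequence is dropped.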
Decoding netstrings
Netstrings are a way to reliably encode strings containing arbitrary characters. execline takes advantage of this to offer a completely safe splitting mechanism. If a substitution command is given an empty delimiter string (by use of the -d "" option), the splitting function will try to interpret the value as a sequence of netstrings, every netstring representing a word. For instance, in the following command line:
$ define -s -d "" A '1:a,2:bb,0:,7:xyz 123,1: ,' echo '$A'
the echo command will be given five arguments:
- the "a" string
- the "bb" string
- the empty string
- the "xyz 123" string
- the " " string (a single space)
However, if the value is not a valid sequence of netstrings, the substitution command will die with an error message.
The dollarat command, for instance, can produce a sequence of netstrings (encoding all the arguments given to an execline script), meant to be decoded by a substitution command with the -d "" option.
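To make the netstring format concrete, here is a minimal Python sketch of a decoder for a sequence of netstrings, each of the form length, colon, data, comma. The function name decode_netstrings is hypothetical and the code is not execline's implementation; it works on bytes because netstring lengths count bytes, and it raises ValueError on malformed input, mirroring the fact that the substitution command dies on an invalid sequence.

```python
def decode_netstrings(value):
    """Decode a bytes value made of consecutive netstrings into a list of words.

    Hypothetical helper for illustration; raises ValueError if the value
    is not a valid sequence of netstrings.
    """
    words = []
    i = 0
    while i < len(value):
        colon = value.find(b":", i)
        # The length prefix must be a non-empty run of ASCII digits.
        if colon == -1 or not value[i:colon].isdigit():
            raise ValueError("invalid netstring length at offset %d" % i)
        length = int(value[i:colon])
        start = colon + 1
        end = start + length
        # The data must be followed by a comma.
        if value[end:end + 1] != b",":
            raise ValueError("missing ',' terminator at offset %d" % end)
        words.append(value[start:end])
        i = end + 1
    return words

print(decode_netstrings(b"1:a,2:bb,0:,7:xyz 123,1: ,"))
# [b'a', b'bb', b'', b'xyz 123', b' ']
```

The printed list has five words, matching the five arguments listed for the echo example above.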
