L43Rmap

map input lines with a mini language

Documentation:

These documentation code snippets are compiled into RSpec files with the speculate_about gem.

Overview:

Input lines are read and transformed to output lines by applying a simple program, expressed in simple patterns.

These patterns are chunks of code or verbatim text that are described below.

Here is a quick example:

    printf 'Hello World\nBye Universe\n' | rmap '1st: %1 , 2nd: %2 , lnb: %n(+ 1)'

would print

    1st: Hello, 2nd: World, lnb: 1
    1st: Bye, 2nd: Universe, lnb: 2

N.B. the first space after a field is ignored in the output.
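
For comparison, here is a plain Ruby sketch of what the pattern above computes. It is not how rmap is implemented; the helper name `render_lines` is made up for this illustration, and the zero-based line counter mirrors `%n(+ 1)`.

```ruby
# Hypothetical helper: plain Ruby equivalent of
#   rmap '1st: %1 , 2nd: %2 , lnb: %n(+ 1)'
def render_lines(lines)
  lines.each_with_index.map do |line, number|
    fields = line.split(" ")   # space separated fields, as for %1 and %2
    "1st: #{fields[0]}, 2nd: #{fields[1]}, lnb: #{number + 1}"
  end
end

puts render_lines(["Hello World", "Bye Universe"])
```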

We can identify three kinds of chunks here:

  • Verbatim text: "1st: ", ", ", "2nd:", ...
  • Fields: "%1 ", "%2 " and "%n"
  • S-Expressions (known as sexp): "(+ 1)"

Here is another example of a pattern and how it would be parsed, along with some semantic explanations (in pseudo code):

  parse('% %%%(%e Hello(lpad 2 " ")') =>
    [
      Field(:line),   # represents the whole input line, à la $0 in awk
      Verb("%"),      # escaped field leader, rendered verbatim
      Verb("("),      # idem
      Field(:empty),  # rendered as empty string
      Verb(" Hello"), # rendered verbatim
      Sexp([:lpad, 2, " "])
    ]
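
The parse result above can be reproduced with a small tokenizer. The following is a minimal sketch and not rmap's actual parser: the `Field`, `Verb` and `Sexp` structs and the methods `parse` and `parse_atoms` are hypothetical names borrowed from the pseudo code, only the chunk kinds used in this example are recognised, s-expressions must be flat, and the rule about ignoring the first space after a field is not modelled.

```ruby
require "strscan"

# Hypothetical chunk types, named after the pseudo code above.
Field = Struct.new(:name)
Verb  = Struct.new(:text)
Sexp  = Struct.new(:form)

# Parse the inside of a flat s-expression, e.g. 'lpad 2 " "' => [:lpad, 2, " "].
# Nested forms and escaped quotes are deliberately not handled in this sketch.
def parse_atoms(source)
  scanner = StringScanner.new(source)
  atoms = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next
    elsif scanner.scan(/"([^"]*)"/)
      atoms << scanner[1]               # string literal without its quotes
    elsif scanner.scan(/-?\d+/)
      atoms << scanner.matched.to_i     # integer literal
    elsif scanner.scan(/[^\s"]+/)
      atoms << scanner.matched.to_sym   # anything else becomes a symbol
    else
      raise ArgumentError, "cannot parse #{scanner.rest.inspect}"
    end
  end
  atoms
end

# Split a pattern into Field, Verb and Sexp chunks.
def parse(pattern)
  scanner = StringScanner.new(pattern)
  chunks = []
  until scanner.eos?
    if scanner.scan(/% |%0/)
      chunks << Field.new(:line)
    elsif scanner.scan(/%e/)
      chunks << Field.new(:empty)
    elsif scanner.scan(/%([%(])/)
      chunks << Verb.new(scanner[1])    # escaped "%" or "("
    elsif scanner.scan(/\(([^()]*)\)/)
      chunks << Sexp.new(parse_atoms(scanner[1]))
    elsif scanner.scan(/[^%(]+/)
      chunks << Verb.new(scanner.matched)
    else
      raise ArgumentError, "unsupported chunk at #{scanner.rest.inspect}"
    end
  end
  chunks
end

parse('% %%%(%e Hello(lpad 2 " ")').each { |chunk| p chunk }
```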

List of Fields

  "% " | "%0"    # whole line
  "%"/-?\d+/     # nth element of the line split by spaces, 1 is first, -1 is last
  "%e"           # ""
  "%n" | "%N"    # number of input | output line (0 for first line)
  "%sx"          # constant timestamp in seconds and hex (N.B. timestamp of runtime initialization, thus the same for all lines)
  "%SX"          # dynamic timestamp in seconds and hex (taken when the input line is interpreted, so it may differ between lines)
  "%s" | "%S"    # constant | dynamic timestamp in seconds base 36
  "%m" | "%M"    # constant | dynamic timestamp in milliseconds base 36
  "%mx" | "%Mx"  # constant | dynamic hex timestamp in milliseconds
  "%%" | "%("    # verbatim character following the '%'
  "%r" | "%R"    # stop processing the input line if status register is truthy | falsy (*)
  "%x" | "%X"    # stop processing if the status register is truthy | falsy, but still print the output register to stdout
  "%a" | "%A"    # abandon processing if the status register is truthy | falsy, DOES NOT print the output register to stdout (†)
  "%i" | "%I"    # set status register from top of runtime stack ("%I" negates it) (‡)

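The numbered fields and the timestamp encodings listed above can be illustrated in plain Ruby. This is a sketch under assumptions, not rmap's code: the helpers `field` and `stamp` are hypothetical, returning an empty string for a missing field is an assumption, and the exact formatting rmap uses for timestamps (for example letter case) is not confirmed here.

```ruby
# Hypothetical helper: nth space separated field of a line.
# Positive n counts from the front (1 is first), negative n from the back (-1 is last).
# Returning "" for a missing field is an assumption of this sketch.
def field(line, n)
  return line if n.zero?            # "%0" stands for the whole line
  parts = line.split(" ")           # awk-style split, leading whitespace is ignored
  index = n.positive? ? n - 1 : n
  parts[index] || ""
end

# Hypothetical helper: render a Time as seconds or milliseconds in a given base.
def stamp(time, millis: false, base: 36)
  value = millis ? time.to_i * 1000 + time.usec / 1000 : time.to_i
  value.to_s(base)
end

started_at = Time.now               # a "constant" timestamp is captured once

puts field(" 42 times", 1)
puts field("Hello World", -1)
puts stamp(started_at, base: 16)    # seconds in hex, in the spirit of "%sx"
puts stamp(Time.now, millis: true)  # a "dynamic" timestamp is taken per line
```
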
Remarks:

  • (*) Empty strings are considered falsy, which makes it easy to suppress empty lines in the output:

        # use grep -v '^$' instead of this explanatory example ;)
        rmap '%i%r' # status register is input line, if false %r skips processing, input line is not pushed to output register
    

    Later on we will see a use case for this, though, because rmap :D|:dump dumps the runtime registers in JSON format to $stderr after processing the last input line. We could therefore use the value of "%N", available in the :out_count register, for postprocessing.

  • (†) The runtime registers are still dumped if :D|:dump was provided.

  • (‡) If %i or %I is followed by an s-exp, the s-exp is only executed if the status register is truthy:

        rmap '%3%i(lpad 2 "")' # avoids lpadding an empty field 
        rmap '%3%I("missing")' # provides a default value for the field
    

    Again, this is a short example to demonstrate the semantics, but the idiomatic way to do this would be

        rmap '%3(or "missing")' # or...
        rmap '%3(|| :missing)'  # or combinations of the above
    

    For more details concerning types in s-exps and their rendering see S-Expressions.
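
The truthiness rule from the remarks can be sketched in plain Ruby. This only illustrates the semantics and is not rmap's runtime: the helper names are hypothetical, and treating `nil` and `false` as falsy alongside the empty string is an assumption based on Ruby's own rules.

```ruby
# Assumed truthiness rule: nil, false and the empty string are falsy.
def truthy?(value)
  !(value.nil? || value == false || value == "")
end

# Keep only lines whose status register would be truthy,
# which has the same effect as `grep -v '^$'`.
def drop_empty(lines)
  lines.select { |line| truthy?(line) }
end

# Fall back to a default when a field is falsy, in the spirit of (or "missing").
def or_default(value, default)
  truthy?(value) ? value : default
end

p drop_empty(["alpha", "", "beta"])
p or_default("", "missing")
```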

S-Expressions

S-expressions are just Lisp forms whose head must be a defined function. Depending on the function, it takes its first arguments from the runtime and fills the rest from the compile-time arguments. lpad is a perfect example: it takes its first argument from the top of the runtime stack, applies itself to that value and the remaining arguments, and pushes the result onto the runtime stack.

Here is a description of how that works for the pattern %1(lpad 3 "0") with the following input line in the input register: " 42 times"

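| runtime state    | runtime stack        | remarks                               |
|------------------|----------------------|---------------------------------------|
| START OF LINE    | " 42 times"          |                                       |
| compiled("%1")   | ["42", " 42 times"]  | compiled("%1") is a method of Runtime |
| compiled("lpad") | ["042", " 42 times"] | some methods remove the top of stack  |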
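
The stack evolution in the table can be replayed with a few lines of Ruby. This is a sketch only: `StackSketch` and its methods are hypothetical stand-ins for the compiled runtime methods, the top of the stack is kept at index 0 to match the table, and using `String#rjust` for lpad is an assumption about its behaviour.

```ruby
# Hypothetical stand-in for the runtime; only the stack is modelled.
class StackSketch
  attr_reader :stack

  def initialize(line)
    @stack = [line]                      # the input line sits at the bottom
  end

  # Stand-in for compiled("%1"): push the nth space separated field.
  def push_field(n)
    @stack.unshift(@stack.last.split(" ")[n - 1].to_s)
    self
  end

  # Stand-in for compiled("lpad"): replace the top of stack with its padded form.
  def lpad(width, filler)
    top = @stack.shift
    @stack.unshift(top.rjust(width, filler))
    self
  end
end

runtime = StackSketch.new(" 42 times")
p runtime.stack
p runtime.push_field(1).stack
p runtime.lpad(3, "0").stack
```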

Actually every chunk will get compiled to a method of the runtime, although some optimisations occur, e.g. subsequent verbatim chunks are combined into one method

Here are all defined methods that can be used as an s-expression head

...

Implementation:

  1. Parser

     Detailed Description Parser

  2. Compiler

  3. Runtime


<!-- SPDX-License-Identifier: AGPL-3.0-or-later -->