tl;dr the redo build system

This is a continuation from my previous post:

A build system for those who hate build systems

I really like using `redo` as my build script for all my non-work projects. It's simple, flexible and works well.

redo by djb

A quick refresher. `redo` is a build system that allows you to create individual scripts that each generate an output file. You write one script per kind of output, and the build system knows which scripts to run based on what files are needed.

Target type via script name

The target output is defined by the file name of the Do File. If you want to create a markdown file, the Do File's extension is `.md.do`. If the target has no extension, the Do File ends in just `.do`.

The Do File's name defines the target's name. If the target is `report.md` the Do File is `report.md.do`. But if you want a generic rule for a given type, naming the file with `default` allows all targets of the given type to use the same script. `default.md.do` can create `report.md` as well as `letter.md`.

$ find . -name '*.do'
imgs/default.png.do
default.md.do
report.docx.do

You can tell the build system `redo report.docx` and it will look through the dependencies, build all required `png` files using the first rule and all the required markdown files with the second, and then combine them all into a Word Doc named `report.docx` using the last. If there is a big directory structure, Do Files can be placed wherever they need to be.

There is a helper util which lists what order the build system would look to find a matching rule. The first in that list to exist (top to bottom) is the one run to create that output.

$ redo-whichdo img/logo.png
img/logo.png.do
img/default.png.do
img/default.do
default.png.do
default.do
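
To make the search order concrete, here is a rough shell sketch that prints the same kind of candidate list. It is not how any redo implementation actually does it; `candidates` is a made-up helper that only mimics the listing above: the specific name first, then `default` rules with shorter and shorter extensions, walking up one directory at a time.

```shell
#!/bin/sh
# candidates TARGET: print the Do Files that would be tried, in order.
# Hypothetical helper for illustration only; it is not part of redo.
candidates() {
  target=$1
  base=${target##*/}                 # file name without directories
  case $target in
    */*) prefix=${target%/*}/ ;;     # directory part, with trailing slash
    *)   prefix= ;;
  esac

  # The exact-name rule is only tried next to the target itself.
  printf '%s\n' "${prefix}${base}.do"

  while :; do
    # default.<ext>.do, dropping one leading name component at a time.
    ext=$base
    while :; do
      case $ext in
        *.*) ext=${ext#*.}; printf '%s\n' "${prefix}default.${ext}.do" ;;
        *)   break ;;
      esac
    done
    printf '%s\n' "${prefix}default.do"

    # Stop at the top directory, otherwise move up one level.
    [ -n "$prefix" ] || break
    prefix=${prefix%/}
    case $prefix in
      */*) prefix=${prefix%/*}/ ;;
      *)   prefix= ;;
    esac
  done
}
```

Calling `candidates img/logo.png` prints the same five paths as the `redo-whichdo` listing above. The real tool is still the one to trust, since implementations may differ in details.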

Do File structure

The Do Files themselves can be written in any language you want. They just need to be able to accept 3 arguments and call other `redo` utilities.

The Do File is executed with the following 3 arguments:

- Arg 1: The target file name (e.g. report.docx)

- Arg 2: The target file without the extension (e.g. report)

- Arg 3: The temporary file to be used as the output

The Do File executes expecting the target output to end up in the temp file for arg 3. This happens in one of three ways:

- stdout from the Do File (`echo Test`)

- The code in the Do File writes to arg 3 (`./my-app -o $3`)

- The code creates an output and then moves it to arg 3 (`./other-app ; mv results.data $3`)
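
The three cases above can be sketched as a tiny runner. This is a simplified illustration under my own assumptions, not the code of any real redo implementation: `run_do` and the temp file names are made up, and real tools also track dependencies and do more error checking.

```shell
#!/bin/sh
# run_do DOFILE TARGET BASENAME
# Hypothetical, simplified runner: it only shows how output reaches the target.
run_do() {
  dofile=$1
  target=$2
  base=$3
  tmp="$target.tmp.$$"     # handed to the Do File as arg 3
  out="$target.out.$$"     # captures whatever the Do File prints on stdout

  if sh -e "$dofile" "$target" "$base" "$tmp" > "$out"; then
    if [ -e "$tmp" ]; then
      mv "$tmp" "$target"  # the script wrote, or moved a file, to arg 3
    elif [ -s "$out" ]; then
      mv "$out" "$target"  # the script printed the result on stdout
    fi
    rm -f "$tmp" "$out"
    return 0
  else
    rm -f "$tmp" "$out"    # failed build: the old target is left untouched
    return 1
  fi
}
```

Because the target is only replaced after the script succeeds, a failed or interrupted build never leaves a half-written output behind.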

Argument 2 is useful if you're trying to do a conversion from one type to another. For example, you may want to convert a markdown file to a Word Doc:

#!/bin/sh
# default.docx.do

redo-ifchange $2.md

pandoc -f markdown -t docx -o "$3" "$2.md"

The 4th line `redo-ifchange $2.md` declares the source file as a dependency: it is brought up to date first, and this script only needs to run again if the source has changed since the last time it was run. Since we are using a default file, we can use the 2nd argument to check for a specific file. For example `redo report.docx` would depend on `report.md`.

The last line does the actual work, writing the output to arg 3. This is needed so that redo can do checksumming. Since it only wants to rebuild things that need rebuilding, the utility needs to know what the output is.

A few other special files can exist. `all.do` will be run if you just call `redo`. `clean.do` can be called by doing `redo clean`.

Example Results

For my treasurer's report in the previous post I had a setup as follows:

$ find . -type f
./data/2018-12.csv
    ...
./data/2017-01.csv
./default.docx.do
./default.md.do
./graph.r
./default.png.do
./template.md.in
./clean.do

Each month I'd create a new CSV file containing two columns: KEY and VALUE. I'd export an Excel spreadsheet into a CSV file and add it to the directory. `template.md.in` contained the document format with KEY markers which were then replaced by the data for the given month. I could pick a date and get a report for it:

$ redo 2018-12.docx
redo  2018-12.docx
redo    2018-12.md
redo      2018-12.png

From the output we can see that the `docx` file depends on an `md` file which depends on a `png`.

#!/bin/sh
# default.md.do

redo-ifchange template.md.in data/$2.csv $2.png

cp template.md.in "$3"
TMP_F=$(mktemp md-XXXXXX.tmp)

# Each CSV row is KEY,VALUE; replace the %KEY% marker with VALUE.
while IFS=, read -r KEY VALUE
do
  # awk cannot edit in place, so write to a temp file and move it back.
  awk -v k="%$KEY%" -v v="$VALUE" '{sub(k, v); print}' "$3" > "$TMP_F"
  mv "$TMP_F" "$3"
done < "data/$2.csv"

On the `redo-ifchange` line we see dependencies on the template, the data and the image. If I change the template, the data for the template or the graph, this document is rebuilt as well. It is not the prettiest of code: it needs a temporary file because the shell cannot safely redirect output into a file it is still reading from. But it works.
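
For comparison, here is a sketch of the same substitution done in a single `awk` pass, which avoids the temporary file. It is an alternative I am assuming would fit, not what the original script does: `fill_template` is a made-up name, it replaces every occurrence of a marker rather than the first one per line, it treats keys as plain text instead of regular expressions, and it expects a non-empty two-column CSV.

```shell
#!/bin/sh
# fill_template TEMPLATE CSV: print TEMPLATE with each %KEY% replaced by VALUE.
# Hypothetical helper; assumes CSV rows look like KEY,VALUE.
fill_template() {
  awk -F, '
    NR == FNR { vals[$1] = $2; next }    # first file: load the CSV
    {
      line = $0
      for (k in vals) {
        marker = "%" k "%"
        rest = line
        done = ""
        # index/substr keep keys and values from being read as regexes.
        while ((i = index(rest, marker)) > 0) {
          done = done substr(rest, 1, i - 1) vals[k]
          rest = substr(rest, i + length(marker))
        }
        line = done rest
      }
      print line
    }
  ' "$2" "$1"
}
```

Inside `default.md.do` this could be used as `fill_template template.md.in "data/$2.csv"`, letting redo capture stdout as the target.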

#!/bin/sh
# default.docx.do

redo-ifchange $2.md

pandoc -f markdown -t docx -o "$3" "$2.md"

It is pretty straightforward what is going on here.

#!/bin/sh
# default.png.do

redo-ifchange graph.r

Rscript graph.r -- $2

mv output.png $3

The script took the date argument and generated a graph based on all data available up to that date. The output file name was fixed (`output.png`), so the file needed to be moved once created.

Final thoughts

What was nice about this solution is that each part is independent of the others. I can call `redo 2018-11.png` and just the graph is made, or I can reproduce a doc from 2016 by calling `redo 2016-01.docx`. If my template or my CSV data changes, asking for a given output rebuilds only what is needed.
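
Since every report is its own target, an `all.do` could build the whole set in one go. The sketch below is hypothetical and was not part of my setup: `report_targets` is a made-up helper that assumes the `data/YYYY-MM.csv` layout shown earlier and turns each CSV name into the matching `docx` target.

```shell
#!/bin/sh
# report_targets: print one .docx target per CSV file found in data/.
# Hypothetical helper for an all.do; assumes the data/*.csv layout above.
report_targets() {
  for csv in data/*.csv; do
    [ -e "$csv" ] || continue          # no CSV files: print nothing
    name=${csv##*/}                    # strip the data/ directory
    printf '%s\n' "${name%.csv}.docx"  # 2018-12.csv -> 2018-12.docx
  done
}
```

In `all.do` the last line would then be something like `redo-ifchange $(report_targets)`, so a plain `redo` rebuilds only the reports whose inputs changed.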

There are a few other redo utilities that do special things. For the most part on simple projects we just need to know if a dependency has been changed. There are a number of implementations out there. Here are a few of the ones I've tried.

apenwarr's Python implementation

Nils Dagsson Moskopp's shell implementation

Gyepi Sam's Go implementation

-- CC-BY-4.0 jecxjo 2023-01-21

$ published: 2023-01-21 21:30 $
