make

make (specifically GNU make) rebuilds only the outputs whose inputs have changed, judged by comparing the modification times of the “source” and “destination” files. Source and destination are defined by rules in the file named Makefile in the current working directory. Other build tools include Google's bazel and tup. 1)

Why

[2016 Frontiers Neuroinform] Using Make for Reproducible and Parallel Neuroimaging Workflow and Quality-Assurance

If you see a Makefile (especially in a top-level script directory), reading it will likely show how the important files are made and what each step depends on. It is documentation.

make is available nearly everywhere and is still a foundational tool in many software projects. It's useful to know and less niche than e.g. tup.

Usage

# output: dependencies
# 	command-to-generate-output 
# 	# MUST BE TAB INDENTED. spaces will result in error
 
ages.txt: dob_visit.txt
	./visit_age.bash dob_visit.txt > ages.txt

A file named Makefile contains the recipes/rules to create outputs from inputs. In the above example, the rule to make ages.txt depends on the file dob_visit.txt. When dob_visit.txt has been modified more recently than ages.txt (or ages.txt does not exist yet), visit_age.bash is run. Nothing runs if ages.txt is newer than dob_visit.txt. Make only regenerates ages.txt when there are new DOBs+visits.
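
The timestamp check described above can be approximated in plain shell. This is only a sketch of the idea: needs_rebuild is a hypothetical helper, not something make provides, and real make also handles chains of rules, missing prerequisites, and phony targets.

```shell
# needs_rebuild TARGET PREREQ...
# Hypothetical helper (not part of make): succeeds when TARGET is missing
# or when any PREREQ has a newer modification time than TARGET.
needs_rebuild() {
  local target="$1"; shift
  [ -e "$target" ] || return 0
  local dep
  for dep in "$@"; do
    [ "$dep" -nt "$target" ] && return 0
  done
  return 1
}

# Demo in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
touch -t 202001010000 "$demo_dir/dob_visit.txt"   # older input
touch -t 202001020000 "$demo_dir/ages.txt"        # newer output

if needs_rebuild "$demo_dir/ages.txt" "$demo_dir/dob_visit.txt"; then
  echo "ages.txt needs rebuilding"
else
  echo "ages.txt is up to date"
fi

touch "$demo_dir/dob_visit.txt"                   # input changes "now"

if needs_rebuild "$demo_dir/ages.txt" "$demo_dir/dob_visit.txt"; then
  echo "ages.txt needs rebuilding"
else
  echo "ages.txt is up to date"
fi
```

The first check reports that ages.txt is up to date; after dob_visit.txt is touched, the second reports that it needs rebuilding. That is the same decision make takes before running ./visit_age.bash.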

LNCD Tools tweaks

LNCD Tools provides some functions to create “sentinel” files, which are useful for deciding whether a pipeline with lots of file outputs should be rerun. 2)

all_BIDS.txt: $(wildcard MRData/*/DICOM/)
	./00_dcm2bids                       # makes many files like bids/*/*.nii.gz
	mkls all_BIDS.txt "bids/*/*.nii.gz" # list of all files into all_BIDS.txt

mkls only updates all_BIDS.txt when there is a new file. As long as all_BIDS.txt is newer than every DICOM/ directory, the rule does not need to run.
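
The sentinel idea can be sketched in a few lines of shell. This is not the LNCD Tools implementation: update_listing is a hypothetical stand-in, and it assumes the sentinel should be rewritten whenever the list of matching files differs at all (files added or removed). The real mkls may behave differently.

```shell
# update_listing SENTINEL GLOB
# Hypothetical stand-in for the idea behind mkls (not the real tool):
# write the list of files matching GLOB to SENTINEL, but only touch
# SENTINEL when that list has changed.
update_listing() {
  local sentinel="$1" pattern="$2" tmp
  tmp=$(mktemp)
  # $pattern is left unquoted on purpose so the shell expands the glob.
  ls -d $pattern 2>/dev/null | sort > "$tmp"
  if [ -e "$sentinel" ] && cmp -s "$tmp" "$sentinel"; then
    : # same listing: keep the sentinel's old modification time
  else
    cat "$tmp" > "$sentinel"   # new or changed listing: sentinel becomes newer
  fi
  rm -f "$tmp"
}
```

Because the sentinel's modification time only moves when the listing changes, any rule that depends on the sentinel stays up to date until the set of files actually changes.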

mkstat and mkmissing -1 $step1_glob -2 $step2_glob are variations on this idea.

1)
Also scons, cons, and waf; cf. language-specific tools like ant, gradle, lein, rake, cargo, dune, …
2)
This is a poor approximation of hashing and caching à la bazel or ninja.