Monday, September 5, 2022

[SOLVED] What is the difference between using process substitution vs. a pipe?

Issue

I came across an example of using the tee utility in the tee info page:

wget -O - http://example.com/dvd.iso | tee >(sha1sum > dvd.sha1) > dvd.iso

I looked up the >(...) syntax and found something called "process substitution". From what I understand, it makes a process look like a file that another process could write/append its output to. (Please correct me if I'm wrong on that point.)

How is this different from a pipe (|)? I see a pipe is being used in the above example. Is it just a precedence issue, or is there some other difference?


Solution

In this particular example there's no benefit, as the line could equally well have been written like this:

wget -O - http://example.com/dvd.iso | tee dvd.iso | sha1sum > dvd.sha1
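
To check the equivalence without downloading anything, here is a minimal bash sketch that runs both forms on local data. The produce function is a hypothetical stand-in for wget, and the file names under the temporary directory are assumptions made for illustration.

```bash
workdir=$(mktemp -d)

# Hypothetical stand-in for `wget -O - URL`: any command that writes to stdout.
produce() { printf 'pretend this is an ISO image\n'; }

# Form 1: tee copies the stream to sha1sum through a process substitution.
produce | tee >(sha1sum > "$workdir/a.sha1") > "$workdir/a.iso"

# Form 2: tee writes the file and passes the stream down a plain pipeline.
produce | tee "$workdir/b.iso" | sha1sum > "$workdir/b.sha1"

# bash does not wait for the >(...) process, so poll briefly for its output.
for _ in {1..50}; do
  [ -s "$workdir/a.sha1" ] && break
  sleep 0.1
done

cmp "$workdir/a.sha1" "$workdir/b.sha1" && echo "checksums match"
```

One practical difference does show up here: the pipeline form is finished when the command returns, while the process substitution may still be running, which is why the sketch polls before comparing.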

The differences start to appear when you need to pipe to/from multiple programs, because these can't be expressed purely with |. Feel free to try:

# Calculate 2+ checksums while also writing the file
wget -O - http://example.com/dvd.iso | tee >(sha1sum > dvd.sha1) >(md5sum > dvd.md5) > dvd.iso

# Accept input from two 'sort' processes at the same time
comm -12 <(sort file1) <(sort file2)
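
The two-input case can be tried locally with a small bash sketch like the one below. The file contents and the temporary directory are made up for the example; only the comm line comes from the answer.

```bash
workdir=$(mktemp -d)
printf '%s\n' pear apple fig > "$workdir/file1"
printf '%s\n' fig kiwi apple > "$workdir/file2"

# A process substitution expands to a file name that the command can open.
# The exact name varies by system (often something under /dev/fd).
echo "comm is handed paths such as:" <(true) <(true)

# comm needs two sorted inputs; each <(...) supplies one of them.
common=$(comm -12 <(sort "$workdir/file1") <(sort "$workdir/file2"))
printf '%s\n' "$common"
```

The last command prints apple and fig, the lines present in both files. A single pipe could feed only one of the two inputs, since comm has just one stdin.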

They're also useful in certain cases where you for any reason can't or don't want to use pipelines:

# Start logging all error messages to a file as well as the terminal
# Pipes don't work because bash doesn't support them in this context
exec 2> >(tee log.txt)
ls doesntexist
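
A self-contained way to try the logging example is sketched below. It wraps the exec in a subshell so the redirection does not affect the rest of the session; the log path and the extra "done" marker line are assumptions added so the script can tell when tee has caught up.

```bash
workdir=$(mktemp -d)
log="$workdir/log.txt"

# Run in a subshell so the redirection ends with this demonstration.
(
  # From here on, stderr of this subshell is fed to tee, which writes
  # one copy to the log file and one copy to its own stdout.
  exec 2> >(tee "$log")
  ls "$workdir/doesntexist" || true   # expected to fail; the error goes to stderr
  echo "done" >&2                      # marker line, captured the same way
)

# tee runs asynchronously, so poll until the marker has reached the log.
for _ in {1..50}; do
  grep -q '^done$' "$log" 2>/dev/null && break
  sleep 0.1
done

cat "$log"
```

Note that tee writes its copy to stdout, so the messages no longer arrive on stderr. If that matters, the variant exec 2> >(tee log.txt >&2) sends them back to the original stderr.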

# Sum a column of numbers
# Pipes don't work because they create a subshell
sum=0
while IFS= read -r num; do (( sum+=num )); done < <(curl http://example.com/list.txt)
echo "$sum"
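
The subshell problem is easy to see side by side. This sketch assumes bash with default settings (in particular, the lastpipe option is off) and uses a hypothetical numbers function in place of the curl call.

```bash
# Hypothetical stand-in for `curl http://example.com/list.txt`.
numbers() { printf '%s\n' 10 20 12; }

# Pipeline version: the while loop runs in a subshell, so its sum is lost.
sum_pipe=0
numbers | while IFS= read -r num; do (( sum_pipe += num )); done

# Process substitution version: the loop runs in the current shell.
sum_procsub=0
while IFS= read -r num; do (( sum_procsub += num )); done < <(numbers)

echo "pipe: $sum_pipe, process substitution: $sum_procsub"
# prints: pipe: 0, process substitution: 42
```

In both versions the loop computes 42, but only the second one does so in the shell that later reads the variable.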

# apt-get something with a generated config file
# Pipes don't work because we want stdin available for user input
apt-get install -c <(sed -e "s/%USER%/$USER/g" template.conf) mysql-server
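
The same idea can be tried without apt-get. In the sketch below, fake_install is a hypothetical function standing in for a program that takes a config file argument and also asks the user a question on stdin; the template text, the user name alice, and the "yes" reply are all made up for the example.

```bash
# Hypothetical stand-in for a tool like `apt-get -c FILE ...`.
fake_install() {
  local config_file=$1 settings answer
  settings=$(cat "$config_file")   # config is read through the given path
  IFS= read -r answer              # stdin is still free for the user's reply
  printf 'config=%s answer=%s\n' "$settings" "$answer"
}

# The generated config arrives as a file name; the here-string plays the
# role of the user typing "yes" at the prompt.
result=$(fake_install <(printf 'user=%%USER%%\n' | sed -e "s/%USER%/alice/g") <<< "yes")
echo "$result"
# prints: config=user=alice answer=yes
```

Had the generated config been piped into the command instead, it would have occupied stdin and left no channel for the reply.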


Answered By - that other guy
Answer Checked By - Senaida (WPSolving Volunteer)