Tuesday, January 30, 2024

[SOLVED] Bash recursive function with for loop

Issue

In a root folder, I have a subfolder structure containing some mp3-files. I want to copy all subfolders and the mp3-files to an sd-card, which can be read by an mp3-player. Simply copying the entire root folder to the card messes up the mp3-file order. This problem is known and documented on the internet. One solution is to use rsync, which does not work for me. The other is to copy the files one after the other, and with that the order is fine for my player. So I wrote a bash script to walk through my subfolder structure, create the same subfolders on the sd-card and copy the mp3-files into them. Since my subfolder structure is kind of random, I decided to write a recursive function to be called in each subfolder like this:

ScanKo () 
{
  cd "$1"
  for i in *
  do
    if [[ -d "$i" ]]
    then
      mkdir -p "$2/$i"
      ScanKo "$1/$i" "$2/$i"
    elif [[ "${i##*.}" = "mp3" ]]
    then
      cp "$1/$i" "$2/$i"
    fi
  done
}

ScanKo "/home/monkey/source_root/" "/media/stick/destiny_root/"

The function ScanKo gets two parameters: the source and destination root folders. I cd into the source folder and use a for loop to scan through everything. If I hit a subfolder, I create it at the destination (the sd-card) and call the function again with the two subfolders as the parameters. If I hit an mp3-file, I copy it to the destination. It only works in theory. It seems that bash loses its context as soon as a child function call returns to the parent that called it.

I resolved this by implementing the same approach with a recursive process. The corresponding bash script, named "sdMachen", looks like this:

cd "$1"
for i in *
do
  if [[ -d "$i" ]]
  then
    mkdir -p "$2/$i"
    bash "$3/${0##*/}" "$1/$i" "$2/$i" "$3"
  elif [[ "${i##*.}" = "mp3" ]]
  then
    cp "$1/$i" "$2/$i"
  fi
done

I call the script on the command line with three parameters:

bash sdMachen "/home/monkey/source_root/" "/media/stick/destiny_root/" "$(pwd)"

I need the third parameter just so I can get back to the location where sdMachen is located. The part

"$3/${0##*/}"

is needed just so the right bash script is found each time. Basically it is equivalent to

/folder-where-script-is/sdMachen
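
For readers unfamiliar with the ${0##*/} expansion used above, here is a small sketch of what it does. The variable names are made up for illustration, and script_path stands in for a possible value of $0:

```shell
#!/bin/bash

# Hypothetical value of $0 when the script is started with its full path.
script_path=/folder-where-script-is/sdMachen

# ${var##*/} removes the longest prefix matching "*/", leaving the file name.
script_name=${script_path##*/}

# ${var%/*} removes the shortest suffix matching "/*", leaving the folder.
script_dir=${script_path%/*}

echo "$script_name"
echo "$script_dir"
```

When the script is started as "bash sdMachen", $0 contains no slash, so ${0##*/} leaves it unchanged.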

It works with the recursive process - but why does it not work with the recursive bash function?


Solution

There are many pitfalls associated with DIY recursive filesystem traversal in shell code. Apart from the cd issue that you encountered, other common problems include the fact that * doesn't expand to all of the entries in a directory by default (names starting with a dot are skipped), and symbolic links can cause infinite loops in code that doesn't handle them carefully. Fortunately there are established mechanisms for doing recursive traversals safely. The globstar mechanism in Bash is one of them. It was introduced in Bash 4.0 but was vulnerable to crashes caused by circular symlinks until Bash 4.3.
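
As for the cd issue itself: every level of the recursion runs in the same shell, so the cd in a nested ScanKo call also changes the directory for the caller. When the nested call returns, the caller's loop is still sitting in the subfolder, and [[ -d "$i" ]] tests the remaining names relative to the wrong directory. The separate-process version works because each child bash has its own working directory. A minimal sketch of a function that avoids the problem by never calling cd could look like this. The name scan_copy and its local variables are made up for illustration, and skipping symlinked directories is an extra precaution that is not in the original function:

```shell
#!/bin/bash

# scan_copy SRC DEST (hypothetical name)
# Recreates the subfolders of SRC under DEST and copies the *.mp3 files,
# one after the other, without ever changing the working directory.
scan_copy() {
  local src=$1 dest=$2 entry name
  for entry in "$src"/*; do
    name=${entry##*/}                      # last path component
    if [[ -d $entry && ! -L $entry ]]; then
      # Real directory (not a symlink): recreate it and descend.
      mkdir -p -- "$dest/$name"
      scan_copy "$entry" "$dest/$name"
    elif [[ -f $entry && $name == *.mp3 ]]; then
      cp -- "$entry" "$dest/$name"
    fi
  done
}

# Example call (paths taken from the question):
# scan_copy "/home/monkey/source_root" "/media/stick/destiny_root"
```

Declaring the variables with local keeps each recursion level from overwriting the values of its caller. Another option would be to keep the cd but define the function with a subshell body, ScanKo () ( ... ), so that each call gets its own working directory.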

The traditional way to do recursive traversal in shell code is to use find. This Shellcheck-clean code demonstrates one way to use find to solve your problem:

#! /bin/bash -p

srcroot=$1
destroot=$2

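# "${srcroot/#-/.\/-}" puts './' in front of a leading '-' so that find does
# not mistake the path for an option. '%P\0' (GNU find) prints each path
# relative to the starting point, NUL-terminated, so any file name is safe.
find "${srcroot/#-/.\/-}" -type f -name '*.mp3' -printf '%P\0'  \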
    |   while IFS= read -r -d '' mp3path; do
            srcpath=$srcroot/$mp3path
            destpath=$destroot/$mp3path
            destdir=${destpath%/*}
            [[ -d $destdir ]] || mkdir -p -v -- "$destdir"
            cp -v -- "$srcpath" "$destpath"
        done
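
The globstar mechanism mentioned above could be used in a similar way. The following is only a sketch, assuming Bash 4.3 or later. The function name copy_mp3_tree and its variable names are made up for illustration:

```shell
#!/bin/bash

# globstar makes ** match any number of directory levels (including none);
# nullglob makes the pattern expand to nothing when no file matches.
shopt -s globstar nullglob

# copy_mp3_tree SRCROOT DESTROOT (hypothetical name)
copy_mp3_tree() {
  local srcroot=${1%/} destroot=${2%/} srcpath relpath destpath
  for srcpath in "$srcroot"/**/*.mp3; do
    [[ -f $srcpath ]] || continue          # skip directories named *.mp3
    relpath=${srcpath#"$srcroot"/}         # path relative to the source root
    destpath=$destroot/$relpath
    mkdir -p -- "${destpath%/*}"           # create the subfolder if needed
    cp -- "$srcpath" "$destpath"
  done
}

# Example call (paths taken from the question):
# copy_mp3_tree "/home/monkey/source_root/" "/media/stick/destiny_root/"
```

Unlike the original function, this only creates subfolders that contain at least one mp3 file somewhere below them. Hidden files and folders are skipped unless the dotglob option is also set.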


Answered By - pjh
Answer Checked By - David Goodson (WPSolving Volunteer)