I needed to walk a directory tree and run git fetch in every directory that contains a .git repository. Finding all directories that contain .git is quite easy with the find command.

find "$root_directory" -regex '^.*/\.git$' -printf '%h\n' | sort -d

This is nice, but it also prints unnecessary data: not only the leaf directories but also all the partial paths leading to them.

The original project structure is something like:

My Project Structure
-projects
    |-project-1
    |   |-api
    |   |   -.git
    |   |   -file-1
    |   |   -dir-1
    |   |-gui
    |   |   -.git
    |   |-util
    |       |-files
    |           -.git
    |-project-2
        |-api
            -.git

When the find command is executed, it gives the following result:

Find Command Result
 projects/project-1
 projects/project-1/api
 projects/project-1/gui
 projects/project-1/util
 projects/project-1/util/files
 projects/project-2
 ... etc.

Awk

So how do we print just the leaf paths? One option is to use (G)awk.

The AWKward way

find "$root_directory" -regex '^.*/\.git$' -printf '%h\n' | sort -r | awk 'a!~"^"$0 {a=$0; print}' | sort

The awk command prints only the lines that are not a prefix of the most recently printed line. For this to work, the directories must be sorted so that longer paths come before their prefixes, which is what sort -r does. Here is how it works:

projects/project-2
projects/project-1/util/files 1
projects/project-1/util 2
projects/project-1/gui 3
  1. The path is not a prefix of the previous one, so it is printed to stdout and becomes the control string for the next path.
  2. The path is a prefix of the control string, so it is omitted; the next line is still compared to projects/project-1/util/files.
  3. The path is not a prefix of projects/project-1/util/files, so it is printed, and so on.
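
The filter above compares raw strings and treats each path as a regular expression, so a leaf such as projects/project-1 would also be dropped if projects/project-10 were in the list. Below is a minimal sketch of a variant that compares whole path components instead and does not depend on the sort order. The function name leaf_paths is hypothetical, and the sketch assumes one path per line with no trailing slashes.

```shell
#!/bin/sh
# leaf_paths: hypothetical helper name, not from the original one-liner.
# Reads one path per line on stdin and prints the paths that are not
# a parent directory of any other path in the input.
leaf_paths() {
    awk '
        {
            seen[$0] = 1                  # remember every input path
            p = $0
            # strip one trailing component at a time and mark each
            # remaining prefix as a parent directory
            while (sub("/[^/]*$", "", p) && p != "")
                parent[p] = 1
        }
        END {
            for (path in seen)
                if (!(path in parent))
                    print path
        }
    ' | LC_ALL=C sort
}

# The four sample paths from the walkthrough above
printf '%s\n' \
    projects/project-2 \
    projects/project-1/util/files \
    projects/project-1/util \
    projects/project-1/gui |
leaf_paths
```

For the sample input this prints projects/project-1/gui, projects/project-1/util/files and projects/project-2, the same three paths the sort -r pipeline keeps.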

Use a Loop

Let’s pipe the output of find into a while loop.

find . -depth -type d | while IFS= read -r dir; do [[ $prev != "$dir"/* ]] && echo "$dir"; prev="$dir"; done
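
The loop relies on -depth listing a directory after its contents: if the previously listed directory does not lie inside the current one, the current one has no subdirectories. Here is a minimal sketch of the same idea as a reusable function. The name leaf_dirs is hypothetical, and a case statement stands in for [[ ]] so that it also runs in a plain POSIX shell.

```shell
#!/bin/sh
# leaf_dirs: hypothetical helper name, not from the original post.
# Prints every directory at or below the given root (default: .) that
# has no subdirectories. Assumes the root is given without a trailing
# slash and that directory names contain no newlines.
leaf_dirs() {
    find "${1:-.}" -depth -type d | {
        prev=
        while IFS= read -r dir; do
            case $prev in
                "$dir"/*) ;;                  # prev lies inside dir: not a leaf
                *) printf '%s\n' "$dir" ;;    # nothing listed below dir: a leaf
            esac
            prev=$dir
        done
    }
}
```

Calling leaf_dirs "$root_directory" lists all leaf directories below the root, whether or not they belong to a repository, just like the one-liner above.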