---
title: How to sort multiple photo directories
date: 2020-05-15
---

Too many backups
----------------

I am afraid of losing data, so I have a bad habit of making hasty backups whenever I change machines. Since comparing a directory of photos with what I have already archived is always laborious, I usually back up the whole directory and deal with it later.

After a few months (sometimes years), I end up with many backup directories. Some images are duplicated across these directories, with various sizes and names. This makes archiving even more difficult.

I recently had to do it, and I wrote a few tiny utilities that made the task much easier. Let's dig in!

Proper naming
-------------

The first thing to do is to give each image a proper name. I chose the format `year-month-day_hour-minutes-seconds.ext`, e.g. `2020-05-15_14h30m05s.jpg`.

Usually this data is available in the photo's [Exif](https://en.wikipedia.org/wiki/Exif) metadata. You can extract it with [exiftool](https://exiftool.org/). For instance, you can create a small script named `rename-exif-date`:

```sh
#!/bin/sh
# Rename after the date and time of the shot: 2020-12-25_20h03m12s.jpg
# %%-c adds a copy number on name collisions, %%le is the lowercased extension.
# The source tag was lost from the original listing; DateTimeOriginal is an assumption.
exiftool -d '%Y-%m-%d_%Hh%Mm%Ss%%-c.%%le' '-filename<DateTimeOriginal' "$@"
```

Then run it on every file:

`> fd -t f -x rename-exif-date {}`

That will rename all your files.

Detect thumbnails
-----------------

If you have thumbnails cluttering your directories, you can detect and remove them with [feh](https://feh.finalrewind.org/). To remove images smaller than 300x300, create a script named `remove-small-images`:

```sh
#!/bin/sh
# $1 is the maximum size, e.g. 300x300
feh --recursive --list --max-dimension "$1" --action 'rm %F'
```

`> remove-small-images 300x300`

Detect duplicates
-----------------

Now you should have clean directories, but they may still be full of duplicates. The tool I recommend for detecting (and removing) them is [dupeGuru](https://dupeguru.voltaicideas.net/). It can detect perfect duplicates (byte-for-byte identical files), or similar images, using a similarity threshold based on the content.
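
For a quick look at the perfect duplicates before opening dupeGuru, a checksum pass is often enough. The function below is only a minimal sketch, not part of dupeGuru: the name `list_exact_duplicates` is made up for this example, and it assumes GNU coreutils (`sha256sum`, and the `-w` and `--all-repeated` options of `uniq`). It only lists files and never deletes anything.

```sh
#!/bin/sh
# list_exact_duplicates DIR
# Print groups of byte-identical files found under DIR (default: current directory).
# Hypothetical helper; assumes GNU find, sha256sum, sort and uniq.
list_exact_duplicates() {
    # One "checksum  path" line per file; sorting puts equal checksums next to each other.
    find "${1:-.}" -type f -exec sha256sum {} + |
        sort |
        # Compare only the 64 hex characters of the checksum and print every
        # line whose checksum is repeated, with a blank line between groups.
        uniq -w 64 --all-repeated=separate
}
```

`> list_exact_duplicates .`

Each group in the output is one set of identical files. Resized or re-encoded copies have different checksums, so they will not show up here; that is where a content-based similarity threshold is needed.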