The server directory contains files uploaded to the site by users. Often, users mistakenly download the same file many times. Names of these files are obtained, for example, such as: ustav.pdf, ustav_0.pdf, ustav_1.pdf There is no time to figure out which of these files is real.
I would like to go through the script so that all such files, if they are of the same length, are replaced by a symbolic link to one of them.
Tell me, please, how to write such a script?
find . -type f -print0 | xargs -0 md5sum - | sortfind . -type f -print0 | xargs -0 md5sum - | sortfind . -type f -print0 | xargs -0 md5sum - | sort. Files that have the same amount md5, most likely the same. You can save the md5 amounts to a file, go through uniq with it, find duplicates, and then think about something. Maybe there are 3-4 duplicate files. - KoVadim