X-Git-Url: https://fleuret.org/cgi-bin/gitweb/gitweb.cgi?a=blobdiff_plain;f=finddup.1;h=e262386a1ec035946a50828f69cf6582b3320429;hb=ab7b6e26f35ac1dfc88d9bf1e09dd289a30ea782;hp=9cc21b4f13e9f2c5536f0a89423ab9c37bc0a240;hpb=a61c9478f31b957e0d4007df9feddd6f0139ccf8;p=finddup.git

diff --git a/finddup.1 b/finddup.1
index 9cc21b4..e262386 100644
--- a/finddup.1
+++ b/finddup.1
@@ -61,18 +61,17 @@ show the real path of the files
 .TP
 \fB-i\fR, \fB--same-inodes-are-different\fR
 files with same inode are considered as different
-.TP
-\fB-m\fR, \fB--md5\fR
-use MD5 hashing
 
 .SH "BUGS"
 
 None known, probably many. Valgrind does not complain though.
 
-The MD5 hashing often hurts more than it helps, hence it is off by
-default. The only case when it should really be useful is when you
-have plenty of different files of same size, which does not happen
-often.
+The current algorithm is dumb, that is it does not use any hashing of
+the file content. I tried md5 on the whole file, which is not
+satisfactory because files are often never read entirely hence the md5
+can not be properly computed. I also tried XOR of the first 4, 16 and
+256 bytes with rejection as soon as one does not match. Did not help
+either.
 
 .SH "WISH LIST"
 