Wednesday, May 31, 2017

Counting duplicates in a file.

I've been looking at lightweight hashes, and in doing so I needed a way to look for collisions. My test target is the Solaris 2.6 kernel file: it comes from a 32-bit RISC processor, so it is made up of 32-bit words. First, convert the kernel into 32-bit words in ASCII hex, one word per line.
xxd -c4 -g4 -p unix
I then run scripts that produce unix.hash, and from there I need to look for collisions.
sort unix.hash | uniq -c | grep -v '^ *1 ' | sort -nr 
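
The pipeline above sorts the hashes, prefixes each distinct value with its count (uniq -c), drops the values that occur only once, and lists the rest with the most frequent first. The same count can be done in one pass with awk. This is only a sketch: the function name count_collisions is made up for illustration, and it assumes the input has one hash value per line, as unix.hash is assumed to.

```shell
# Hypothetical helper (name is illustrative, not from the original post).
# Reads one hash per line on stdin and prints "count value" for every
# value seen more than once, highest count first.
count_collisions() {
    awk '{ seen[$0]++ }
         END { for (h in seen) if (seen[h] > 1) print seen[h], h }' |
        sort -k1,1nr -k2,2
}
```

It would be run as count_collisions < unix.hash. The awk version keeps every distinct value in memory, which is fine for a kernel-sized file. The sort-based pipeline may be the better choice for inputs too large to hold in memory.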
