Presumably, the idea is to count sufficiently similar images as the same image when doing the deduplication, so traditional hashing doesn't help here.


Maybe they were thinking of fingerprinting, e.g., by downsampling. It's dual in some sense to hashing: a small change to some data should produce a large change in its hash, but only a small change in its fingerprint.


Hashing is the correct term here; it's just a perceptual hash rather than a cryptographic one: https://en.wikipedia.org/wiki/Perceptual_hashing
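
To make the distinction concrete, here is a rough sketch of one simple perceptual hash (an "average hash": downsample, then threshold each cell against the mean), compared with SHA-256. This is only an illustration, not necessarily what the deduplication in question does. The names average_hash and hamming, the 8x8 grid size, and the synthetic gradient image are all assumptions made up for the example.

```python
import hashlib

def average_hash(pixels, size=8):
    """Hypothetical helper: 64-bit average hash of a grayscale image.

    `pixels` is a list of rows of ints in 0..255, at least `size` pixels
    in each dimension. The image is downsampled to size x size by block
    averaging, then each cell becomes one bit: 1 if above the mean.
    """
    h, w = len(pixels), len(pixels[0])
    cells = []
    for by in range(size):
        for bx in range(size):
            y0, y1 = by * h // size, (by + 1) * h // size
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for c in cells:
        bits = (bits << 1) | (1 if c > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def sha256_hex(pixels):
    """Cryptographic hash of the raw pixel bytes, for comparison."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()

# Synthetic 32x32 gradient standing in for a real image (assumption).
original = [[x * 4 + y * 3 for x in range(32)] for y in range(32)]

# Near-duplicate: a single pixel nudged by one brightness level.
tweaked = [row[:] for row in original]
tweaked[5][7] += 1

# Clearly different image: the inverted gradient.
inverted = [[255 - p for p in row] for row in original]

near_distance = hamming(average_hash(original), average_hash(tweaked))
far_distance = hamming(average_hash(original), average_hash(inverted))
same_sha = sha256_hex(original) == sha256_hex(tweaked)

print("near-duplicate distance:", near_distance)
print("inverted distance:", far_distance)
print("SHA-256 equal for near-duplicate:", same_sha)
```

For deduplication you would then treat two images as the same when the Hamming distance between their perceptual hashes falls below some threshold, which has to be tuned for the data; an exact-match lookup on a cryptographic hash only catches byte-identical files.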



