Goats yell like humans, and we seem to hear those sounds as human voices even though they're not. It's probably a bad example, but maybe we "recognize" human voices even when none are there, similar to how we see faces in things that aren't really faces (e.g. the face on the moon).
But anyway, I'm playing devil's advocate with myself. I agree that's definitely a human voice on the intercom.