That's an interesting idea from a UX standpoint. How do you see the audio fingerprint being generated? Would a user "record" it, or would it be algorithmically derived from the textual fingerprint, the way the various icon and color-coding approaches do it?