A Scholarly History of Identity Resolution

Okay, a bit of a grandiose title for a review paper.

Identity Resolution -- resolving records and data about a person -- has a surprisingly long history for a topic which comes up in Computer Science curricula. It suffers from two problems. The first is a ridiculous diversity of terminology (record linkage, data matching, list washing, coreference resolution, entity disambiguation, duplicate detection, deduplication, reference reconciliation, object [re]identification, and data integration), which has limited the cohesion of approaches. The second is a mass of associated terminology and literature on specific sub-tasks of interest: authorship attribution for written text and facial recognition for image data.

My aim would be to build up a comprehensive picture of the body of research, recording key attributes of the papers in a database and writing up summaries of their contents. I already have a reasonably sized seed corpus to work with from background reading for my other work, and, conversely, a deeper understanding of the literature would help with my thesis.