Java named entity recognition library for person name "parts"


My current project needs to improve the data quality of our customers' details.

One issue we have is that our customers' names have separate data capture input fields for first name, middle names and surname, and in many cases each part of the name has been entered incorrectly.

We need to clean the data we hold.

This data quality issue affects us whenever we contact our customers in correspondence: because we do not know which parts are the first name, middle names and surname, we offend customers by using an inappropriate salutation.

We need a named entity recognition library that can not only detect person names, but also detect the first name, middle names and surname within them.

What makes this data quality task harder is that we have 100 million customers, and our customer base is worldwide, so we need to be able to identify first, middle and surnames (e.g. given name, patronymic, etc.) and the different order of the parts. We do know our customers' nationality.

Does any named entity recognition exist that is specific to person name parts?

I realise a "perfect" solution is impossible, but I am sure we can improve on the data quality we have.

I only mentioned first, middle and surnames because that is the name structure I am familiar with. I understand that the following examples show what we are facing:

In many parts of the world, parts of names are derived from titles, locations, genealogical information, caste, religious references, and so on. Here are a few examples:

The Indian name Kogaddu Birappa Timappa Nair follows the order villageName-fathersName-givenName-lastName.

The Rajasthani name Aditya Pratap Singh Chauhan is composed of givenName-fathersName-surname-casteName.

In another part of India, the name Madurai Mani Iyer represents townName-givenName-casteName.

The Arabic Abu Karim Muhammad al-Jamil ibn Nidal ibn Abdulaziz al-Filistini translates as "father of Karim, Muhammad (given name), the beautiful, son of Nidal, son of Abdulaziz, the Palestinian". Karim is Muhammad's first-born son.
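To make the name orders quoted above concrete, here is a minimal Java sketch that pairs each whitespace-separated token of a name with a role from a positional template. The class `NameTemplateLabeler`, its `label` method and the role strings are hypothetical names invented for this illustration; they are not part of any named entity recognition library. The sketch assumes the correct template for a customer is already known (for example, chosen from the recorded nationality), that role names within a template are unique, and that every name part is a single token. Real data breaks all three assumptions, so treat this as an illustration of the problem rather than a solution.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration only; not an NER library API.
class NameTemplateLabeler {

    // Pairs each whitespace-separated token with the role at the same
    // position in the template. Returns an empty Optional when the number
    // of tokens differs from the number of roles, so that mismatches can
    // be routed to manual review instead of being guessed at.
    static Optional<Map<String, String>> label(String fullName, List<String> template) {
        String trimmed = fullName.trim();
        String[] tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        if (tokens.length != template.size()) {
            return Optional.empty();
        }
        // LinkedHashMap keeps the parts in their original order.
        Map<String, String> parts = new LinkedHashMap<>();
        for (int i = 0; i < tokens.length; i++) {
            parts.put(template.get(i), tokens[i]);
        }
        return Optional.of(parts);
    }

    public static void main(String[] args) {
        // Template taken from the first example above (assumed known in advance).
        List<String> template =
                Arrays.asList("villageName", "fathersName", "givenName", "lastName");
        System.out.println(label("Kogaddu Birappa Timappa Nair", template));
    }
}
```

Note that the Arabic example does not fit a fixed positional template at all (particles such as "ibn" and "al-" carry the structure), which is one reason a simple lookup by nationality is unlikely to be enough on its own.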

There is a simple, universal solution that companies seem surprisingly unwilling to apply:

Include a salutation if, and only if, the communication comes from a human being who is preparing that communication for the recipient. In that case, part of paying attention to the recipient is writing the correct salutation, taking into account the recipient's culture.

If you are computer-generating the communication using names from a database, be honest about what you are doing. Show the name as supplied on whatever form it came from. Do not attempt to use it to construct a formal salutation. Do not change it in any way. Communications that are computer-generated but try to pretend to individual attention are silly, even if they are not sufficiently incorrect to cause actual annoyance.

