Enter a name and we'll parse it into separate fields like GivenName, MiddleInitial, Surname, or Corporation.
Don't know Python? We've set up an easy-to-use API so you can parse on your own stack. Get started »
Don't know how to code?
Parse up to 500 names using our bulk parser »
probablepeople is a Python library for parsing unstructured Western name strings into components, using conditional random fields.
The parser can also handle couples and company names, since they're often mixed in with person names in real-world datasets.
> pip install probablepeople
Pass a name string to the probablepeople.tag() function, and it will return a tuple containing an OrderedDict of tagged name parts and a string with the name type.
>>> import probablepeople
>>> probablepeople.tag("Mr George 'Gob' Bluth II")
(OrderedDict([
    ('PrefixMarital', 'Mr'),
    ('GivenName', 'George'),
    ('Nickname', "'Gob'"),
    ('Surname', 'Bluth'),
    ('SuffixGenerational', 'II')]),
 'Person')
>>> probablepeople.tag('Lucille & George Bluth')
(OrderedDict([
    ('GivenName', 'Lucille'),
    ('And', '&'),
    ('SecondGivenName', 'George'),
    ('Surname', 'Bluth')]),
 'Household')
>>> probablepeople.tag('Sitwell Housing Inc')
(OrderedDict([
    ('CorporationName', 'Sitwell Housing'),
    ('CorporationLegalType', 'Inc')]),
 'Corporation')
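
Because tag() returns a two-item tuple, a common next step is to unpack it and branch on the name type. The following is a minimal sketch of that pattern, not part of probablepeople itself: display_name, person_result, and company_result are hypothetical names, and the sample tuples are copied from the output shown above rather than produced by a live call, so the snippet runs without the library installed.

```python
from collections import OrderedDict


def display_name(tag_result):
    """Build a short display string from a (parts, name_type) tuple.

    Hypothetical helper: it expects a tuple shaped like the tag()
    output shown above and is not part of probablepeople.
    """
    parts, name_type = tag_result
    if name_type == "Corporation":
        wanted = ("CorporationName", "CorporationLegalType")
    else:
        # Person and Household results both carry GivenName and Surname here.
        wanted = ("GivenName", "Surname")
    # Skip labels the parser did not emit for this particular string.
    return " ".join(parts[label] for label in wanted if label in parts)


# Sample data copied from the example output above (no library call is made).
person_result = (
    OrderedDict([
        ("PrefixMarital", "Mr"),
        ("GivenName", "George"),
        ("Nickname", "'Gob'"),
        ("Surname", "Bluth"),
        ("SuffixGenerational", "II"),
    ]),
    "Person",
)
company_result = (
    OrderedDict([
        ("CorporationName", "Sitwell Housing"),
        ("CorporationLegalType", "Inc"),
    ]),
    "Corporation",
)

person_display = display_name(person_result)
company_display = display_name(company_result)
```

With the library installed, the same helper should accept the return value of probablepeople.tag(name) directly in place of the hard-coded samples.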