Merge commit '2af495d056616cc0f757a055114b56df2e0d5d84' as 'projects/bad-nlp/name-database'

2023-03-20 18:03:18 -06:00
669 changed files with 423076 additions and 0 deletions

@@ -0,0 +1,19 @@
# 2000 US Census Surname Database
[From http://www.census.gov/genealogy/www/freqnames2k.html](http://www.census.gov/genealogy/www/freqnames2k.html)
## Files
- app\_c.csv: lists every surname recorded at least 100 times in the 2000 census
## Record Fields
- rank: the rank of the surname by frequency in the census (1 is the most common)
- count: the number of people counted in the census with the surname
- prop100k: the number of people with the surname per 100,000 people counted
- cum_prop100k: the running total of prop100k for the surname and every higher-ranked surname before it
- pctwhite: percentage of people with this surname who are White
- pctblack: percentage of people with this surname who are Black
- pctapi: percentage of people with this surname who are Asian or Pacific Islander
- pctaian: percentage of people with this surname who are American Indian or Alaska Native
- pct2prace: percentage of people with this surname who are of more than one race
- pcthispanic: percentage of people with this surname who are Hispanic
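
## Example

The relationship between count, prop100k, and cum_prop100k described above can be sketched in Python. This is a minimal illustration, not part of the original dataset tooling. The `name` column, the sample rows (made-up values, not real census figures), the helper names `per_100k` and `check_cumulative`, and the rounding tolerance are all assumptions made for the example.

```python
import csv
import io

# Made-up rows for illustration only; these are NOT real census values.
# The "name" column is an assumption; the README lists only the other fields.
SAMPLE = """name,rank,count,prop100k,cum_prop100k
ALPHA,1,500,50.0,50.0
BETA,2,300,30.0,80.0
GAMMA,3,200,20.0,100.0
"""


def per_100k(count, total_people):
    """Number of people with a surname per 100,000 people counted."""
    return count * 100_000 / total_people


def check_cumulative(rows, tol=0.05):
    """Return True if cum_prop100k equals the running sum of prop100k.

    `tol` is a guessed allowance for rounding in the published figures.
    """
    running = 0.0
    for row in sorted(rows, key=lambda r: int(r["rank"])):
        running += float(row["prop100k"])
        if abs(running - float(row["cum_prop100k"])) > tol:
            return False
    return True


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(check_cumulative(rows))
print(per_100k(500, 1_000_000))
```

To run the same check against the real file, replace `io.StringIO(SAMPLE)` with an open handle to app_c.csv, assuming its header row uses the field names listed above.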