Experimental Setup Validation

Let's verify that this approach yields meaningful information. To do that, we can compare its output with a more objective source: the Social Security Administration's database of names given to children in the United States. I selected the 3,630 names that were given to more than 300 babies in at least one year. I then ran the RoBERTa setup described above on each one: testing "I heard [MASK] name was Aaden", then "Aaliyah", and so on.
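To make the probing step concrete, here is a minimal sketch of how one name might be scored. It rests on assumptions that are not stated in this section: that the masked slot is filled with "his" or "her", and that "RoBERTa % Female" is the share of that his/her probability mass going to "her". The names `percent_female`, `score_name`, and `fill_mask` are hypothetical, and the model call is passed in as a function so the sketch runs without downloading anything.

```python
from typing import Callable, Dict, List

# Prompt from the text; RoBERTa's tokenizer spells the mask token "<mask>".
TEMPLATE = "I heard {mask} name was {name}"


def percent_female(p_her: float, p_his: float) -> float:
    """Share of the his/her probability mass assigned to "her", in percent.

    Assumption: this normalization is a guess at how the score is defined.
    """
    total = p_her + p_his
    if total == 0:
        raise ValueError("both pronoun probabilities are zero")
    return 100.0 * p_her / total


def score_name(
    name: str,
    fill_mask: Callable[[str], List[Dict]],
    mask_token: str = "<mask>",
) -> float:
    """Probe one name and return the estimated percentage female.

    `fill_mask` takes a sentence containing `mask_token` and returns a list
    of dicts with "token_str" and "score" keys.
    """
    sentence = TEMPLATE.format(mask=mask_token, name=name)
    scores = {"her": 0.0, "his": 0.0}
    for candidate in fill_mask(sentence):
        # Token strings may carry a leading space, so normalize before matching.
        token = candidate["token_str"].strip().lower()
        if token in scores:
            scores[token] += candidate["score"]
    return percent_female(scores["her"], scores["his"])


# One possible wiring with Hugging Face transformers (an assumption, not run
# here because it downloads a model; check the pipeline docs before relying on it):
#   from transformers import pipeline
#   unmasker = pipeline("fill-mask", model="roberta-base")
#   fill = lambda s: unmasker(s, targets=[" her", " his"])
#   score_name("Aaliyah", fill, mask_token=unmasker.tokenizer.mask_token)
```

Keeping the model behind a plain callable also makes it easy to swap in a different prompt or language model and compare the resulting scores.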

To check the setup, we can compare the probability RoBERTa assigns with the actual historical ratio. The correlation between the two is 0.90, indicating that RoBERTa's probabilities really do correspond to some underlying intuition about the gender split of different names.[1]

[1] This is the main way I ranked different prompts and language models, incidentally.
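As an illustration of the comparison, the sketch below computes a correlation coefficient over the ten rows shown in this section. It assumes the Pearson coefficient, which the text does not specify, and `pearson_r` is a hypothetical helper. Ten rows are only a small excerpt, so the result is not the figure for the full set of 3,630 names.

```python
from math import sqrt
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# "% Girls" and "RoBERTa % Female" for Aaden through Abdiel, in table order.
historical = [0.1, 99.9, 0, 0.72, 100, 99.8, 98.51, 99.96, 99.7, 0]
roberta = [5.14, 97.4, 5.87, 2.13, 97.31, 32.94, 85.02, 78.63, 95.87, 5.28]

r_sample = pearson_r(historical, roberta)
print(f"Pearson r on the 10-row excerpt: {r_sample:.3f}")
```

Running the same function once per prompt or per model gives a single number for each candidate, which is one way to rank them.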
Name       Girls    Boys     Total    % Girls   % Girls 2021   RoBERTa % Female
Aaden      5        5013     5018     0.1       0              5.14
Aaliyah    98342    101      98443    99.9      100            97.4
Aarav      0        6613     6613     0         0              5.87
Aaron      4353     596930   601283   0.72      0.33           2.13
Abagail    5798     0        5798     100       100            97.31
Abbey      17406    35       17441    99.8      100            32.94
Abbie      21794    330      22124    98.51     100            85.02
Abbigail   11942    5        11947    99.96     100            78.63
Abby       59990    181      60171    99.7      100            95.87
Abdiel     0        6145     6145     0         0              5.28
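The count columns in the table can be derived from raw name records. The sketch below assumes each record is a (name, sex, count) triple with sex coded "F" or "M", which is how I understand the SSA files to be laid out. `summarize` is a hypothetical helper, and the sample records simply reuse two rows' all-time totals from the table instead of real per-year data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize(records: Iterable[Tuple[str, str, int]]) -> Dict[str, Dict[str, float]]:
    """Aggregate (name, sex, count) records into per-name totals and % girls."""
    counts = defaultdict(lambda: {"F": 0, "M": 0})
    for name, sex, count in records:
        if sex not in ("F", "M"):
            raise ValueError(f"unexpected sex code: {sex!r}")
        counts[name][sex] += count

    summary = {}
    for name, by_sex in counts.items():
        total = by_sex["F"] + by_sex["M"]
        if total == 0:
            continue  # no births recorded, so the percentage is undefined
        summary[name] = {
            "girls": by_sex["F"],
            "boys": by_sex["M"],
            "total": total,
            "pct_girls": round(100.0 * by_sex["F"] / total, 2),
        }
    return summary


# Illustrative records built from the table's totals, not real per-year rows.
sample = [
    ("Aaden", "F", 5),
    ("Aaden", "M", 5013),
    ("Aaron", "F", 4353),
    ("Aaron", "M", 596930),
]
table = summarize(sample)
print(table["Aaden"])
print(table["Aaron"])
```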
