Soundex Coding
Soundex codes a name by its phonetic sound rather than by its spelling.
This is a great help, since it avoids most problems caused by misspellings
or alternate spellings.
For example, Scherman, Schurman, Sherman, Shireman, and Shurman
are all indexed together under NARA Soundex code "S655".
A surname Soundex index is not arranged alphabetically; entries are
filed by their letter-and-number code.
If several surnames have the same code, their index cards are
arranged alphabetically by given name.
Example: S655 Arthur, S655 Betsy, S655 Charles.
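The filing order described above can be sketched in a few lines of Python. The card list and the function name filing_order are hypothetical and serve only as an illustration; the sketch assumes each card already carries its Soundex code.

```python
# Hypothetical index cards: (Soundex code, surname, given name).
cards = [
    ("S655", "Sherman", "Charles"),
    ("S530", "Smith", "Anna"),
    ("S655", "Schurman", "Arthur"),
    ("S655", "Shireman", "Betsy"),
]

def filing_order(cards):
    """Sort by Soundex code first, then alphabetically by given name."""
    return sorted(cards, key=lambda card: (card[0], card[2]))

for code, surname, given in filing_order(cards):
    print(code, given, surname)
```

Note that the surname spelling plays no part in the ordering: within S655, Arthur Schurman is filed before Betsy Shireman and Charles Sherman.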
To convert names to Soundex codes, use JewishGen's
JOS Calculator, or encode a name manually
using the following instructions and charts.
I. Russell (NARA) Soundex Coding
The Russell Soundex system is used by many indexes at the
U.S. National Archives and Records Administration, including
indexes to Census Records, Passenger Lists, and Naturalization
Records.
In the 1930s, the Work Projects Administration (WPA) compiled
Soundex indexes to the 1880, 1900, 1910 (partial), 1920, and 1930 (partial)
censuses (for details, see the Census
section of the JewishGen FAQ). The census information was copied
onto file cards, which were Soundex-coded and filed by state.
NARA Soundex coding rules:
- Each code consists of a letter followed by three numerals.
Examples: L123, S160.
- The first letter of a surname is not coded;
it is retained as the initial letter.
- A, E, I, O, U, Y, W, and H are not coded.
- Double letters are coded as one letter (as in Lloyd).
- Prefixes to surnames like
"van", "Von", "Di", "de", "le", "D", "dela" or "du"
are sometimes disregarded in coding.
- Code the following letters to three digits,
using 0 at the end if needed.
Letter | Code |
B P F V | 1 |
C S K G J Q X Z | 2 |
D T | 3 |
L | 4 |
M N | 5 |
R | 6 |
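As an illustration, the NARA rules and code chart above can be sketched in Python. This is a minimal sketch, not an official NARA implementation, and the function name russell_soundex is hypothetical. It makes two assumptions beyond the rules as stated: any adjacent letters with the same code (not only doubled letters) are coded once, including a letter that shares its code with the retained first letter, and H and W do not separate such letters. The first assumption is what places Scherman under S655. Surname prefixes are not stripped.

```python
import string

# Letter-to-digit chart from the NARA rules above.
CODES = {}
for letters, digit in (("BPFV", "1"), ("CSKGJQXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit

def russell_soundex(surname):
    """Return a letter plus three digits, padded with zeros."""
    letters = [c for c in surname.upper() if c in string.ascii_uppercase]
    if not letters:
        return ""
    first = letters[0]              # retained, not coded
    digits = []
    previous = CODES.get(first, "")
    for letter in letters[1:]:
        if letter in "HW":          # assumption: H and W do not break a run
            continue
        code = CODES.get(letter, "")    # vowels and Y have no code
        if code and code != previous:   # same-code neighbours are coded once
            digits.append(code)
        previous = code
    return (first + "".join(digits) + "000")[:4]

for name in ("Scherman", "Schurman", "Sherman", "Shireman", "Shurman", "Lloyd"):
    print(name, russell_soundex(name))
```

Under these assumptions the five spelling variants from the introduction all come out as S655, and Lloyd, with its doubled first letter, comes out as L300.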
For additional Russell Soundex information, see The Source:
A Guidebook of American Genealogy, by Arlene Eakle and Johni Cerny.
II. Daitch-Mokotoff Soundex Coding
The Daitch-Mokotoff Soundex System was created by Randy Daitch and
Gary Mokotoff of the Jewish Genealogical Society (New York).
They concluded that the system developed by Robert Russell in 1918,
which is still used today by the U.S. National Archives and Records
Administration (NARA), does not apply well to many Slavic and Yiddish surnames.
Daitch-Mokotoff Soundex also includes refinements that are independent
of ethnic considerations.
The rules for converting surnames into D-M code numbers are
listed below, followed by the coding chart.
- Names are coded to six digits, each digit representing a sound
listed in the coding chart (below).
- When a name lacks enough coded sounds for six digits, use zeros
to fill to six digits. GOLDEN, which has only four coded
sounds [G-L-D-N], is coded as 583600.
- The letters A, E, I, O, U, J, and Y are always coded at the
beginning of a name as in Alpert 087930.
In any other situation, they are ignored except when two of them
form a pair and the pair comes before a vowel, as in
Breuer 791900 but not Freud.
- The letter H is coded at the beginning of a name, as in Haber
579000, or preceding a vowel, as in Manheim 665600,
otherwise it is not coded.
- When adjacent sounds can combine to form a larger sound, they are
given the code number of the larger sound.
For example, Mintz is not coded MIN-T-Z but MIN-TZ 664000.
- When adjacent letters have the same code number, they are coded
as one sound, as in TOPF, which is not coded TO-P-F 377000 but
TO-PF 370000. Exceptions to this rule are the letter
combinations MN and NM, whose letters are coded separately,
as in Kleinman, which is coded 586660 not 586600.
- When a surname consists of more than one word, it is coded as if
one word, such as "Ben Aron", which is treated as "Benaron".
- Several letters and letter combinations pose the problem that they
may be pronounced in one of two ways.
The letter and letter combinations CH, CK, C, J, and RS (or RZ)
are assigned two possible code numbers.
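When a name contains one of these two-way combinations, it has more than one D-M code. A minimal Python sketch of how the alternatives multiply is shown here; the function name all_codes and its input format are hypothetical. It assumes the name has already been broken into coded sounds, with each sound given as a list of its possible digits, and it does not apply the rule about adjacent identical codes.

```python
from itertools import product

def all_codes(sound_options, length=6):
    """Combine every choice of digits and pad each result with zeros."""
    codes = set()
    for choice in product(*sound_options):
        digits = "".join(choice)
        codes.add((digits + "0" * length)[:length])
    return sorted(codes)

# AUERBACH: the coded sounds are A (0), R (9), B (7), and CH,
# which is tried as KH (5) and as TCH (4).
auerbach = [["0"], ["9"], ["7"], ["5", "4"]]
print(all_codes(auerbach))  # ['097400', '097500']
```

A search on such a name would then look under each of the resulting codes.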
The Daitch-Mokotoff Soundex Coding Chart
NC = not coded
Letter | Alternate Spelling | Start of a name | Before a vowel | Any other situation |
AI | AJ, AY | 0 | 1 | NC |
AU | | 0 | 7 | NC |
Ą | (Polish a-ogonek) | NC | NC | 6 or NC |
A | | 0 | NC | NC |
B | | 7 | 7 | 7 |
CHS | | 5 | 54 | 54 |
CH | Try KH (5) and TCH (4) |
CK | Try K (5) and TSK (45) |
CZ | CS, CSZ, CZS | 4 | 4 | 4 |
C | Try K (5) and TZ (4) |
DRZ | DRS | 4 | 4 | 4 |
DS | DSH, DSZ | 4 | 4 | 4 |
DZ | DZH, DZS | 4 | 4 | 4 |
D | DT | 3 | 3 | 3 |
EI | EJ, EY | 0 | 1 | NC |
EU | | 1 | 1 | NC |
Ę | (Polish e-ogonek) | NC | NC | 6 or NC |
E | | 0 | NC | NC |
FB | | 7 | 7 | 7 |
F | | 7 | 7 | 7 |
G | | 5 | 5 | 5 |
H | | 5 | 5 | NC |
IA | IE, IO, IU | 1 | NC | NC |
I | | 0 | NC | NC |
J | Try Y (1) and DZH (4) |
KS | | 5 | 54 | 54 |
KH | | 5 | 5 | 5 |
K | | 5 | 5 | 5 |
L | | 8 | 8 | 8 |
MN | | | 66 | 66 |
M | | 6 | 6 | 6 |
NM | | | 66 | 66 |
N | | 6 | 6 | 6 |
OI | OJ, OY | 0 | 1 | NC |
O | | 0 | NC | NC |
P | PF, PH | 7 | 7 | 7 |
Q | | 5 | 5 | 5 |
RZ, RS | Try RTZ (94) and ZH (4) |
R | | 9 | 9 | 9 |
SCHTSCH | SCHTSH, SCHTCH | 2 | 4 | 4 |
SCH | | 4 | 4 | 4 |
SHTCH | SHCH, SHTSH | 2 | 4 | 4 |
SHT | SCHT, SCHD | 2 | 43 | 43 |
SH | | 4 | 4 | 4 |
STCH | STSCH, SC | 2 | 4 | 4 |
STRZ | STRS, STSH | 2 | 4 | 4 |
ST | | 2 | 43 | 43 |
SZCZ | SZCS | 2 | 4 | 4 |
SZT | SHD, SZD, SD | 2 | 43 | 43 |
SZ | | 4 | 4 | 4 |
S | | 4 | 4 | 4 |
TCH | TTCH, TTSCH | 4 | 4 | 4 |
TH | | 3 | 3 | 3 |
TRZ | TRS | 4 | 4 | 4 |
TSCH | TSH | 4 | 4 | 4 |
TS | TTS, TTSZ, TC | 4 | 4 | 4 |
TZ | TTZ, TZS, TSZ | 4 | 4 | 4 |
Ţ | (Romanian t-cedilla) | 3 or 4 | 3 or 4 | 3 or 4 |
T | | 3 | 3 | 3 |
UI | UJ, UY | 0 | 1 | NC |
U | UE | 0 | NC | NC |
V | | 7 | 7 | 7 |
W | | 7 | 7 | 7 |
X | | 5 | 54 | 54 |
Y | | 1 | NC | NC |
ZDZ | ZDZH, ZHDZH | 2 | 4 | 4 |
ZD | ZHD | 2 | 43 | 43 |
ZH | ZS, ZSCH, ZSH | 4 | 4 | 4 |
Z | | 4 | 4 | 4 |
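The rules and chart can be sketched as a small Python encoder. This is only an illustrative sketch, not the JOS Calculator's implementation, and the names ROWS, CHART, and dm_soundex are hypothetical. It makes several assumptions: the longest matching spelling is taken at each position; only A, E, I, O, U count as "a vowel" for the middle column; MN and NM are given 66 at the start of a name, where the chart is blank; the spellings with two possible codes (CH, CK, C, J, RZ, RS) are rejected rather than expanded; and Ą, Ę, and Ţ are not supported.

```python
# Chart rows grouped by identical codes:
# spellings | start of a name | before a vowel | any other situation
ROWS = """
AI AJ AY EI EJ EY OI OJ OY UI UJ UY | 0 | 1 | NC
AU | 0 | 7 | NC
A E I O U UE | 0 | NC | NC
EU | 1 | 1 | NC
IA IE IO IU Y | 1 | NC | NC
B FB F P PF PH V W | 7 | 7 | 7
CHS KS X | 5 | 54 | 54
G KH K Q | 5 | 5 | 5
H | 5 | 5 | NC
CZ CS CSZ CZS DRZ DRS DS DSH DSZ DZ DZH DZS | 4 | 4 | 4
SCH SH SZ S TCH TTCH TTSCH TRZ TRS TSCH TSH | 4 | 4 | 4
TS TTS TTSZ TC TZ TTZ TZS TSZ ZH ZS ZSCH ZSH Z | 4 | 4 | 4
D DT TH T | 3 | 3 | 3
L | 8 | 8 | 8
M N | 6 | 6 | 6
MN NM | 66 | 66 | 66
R | 9 | 9 | 9
SCHTSCH SCHTSH SCHTCH SHTCH SHCH SHTSH STCH STSCH SC | 2 | 4 | 4
STRZ STRS STSH SZCZ SZCS ZDZ ZDZH ZHDZH | 2 | 4 | 4
SHT SCHT SCHD ST SZT SHD SZD SD ZD ZHD | 2 | 43 | 43
"""
TWO_CODES = {"CH", "CK", "C", "J", "RZ", "RS"}  # two possible codes; not handled
VOWELS = "AEIOU"  # assumption: letters that count as "a vowel"

CHART = {spelling: None for spelling in TWO_CODES}
for row in ROWS.strip().splitlines():
    spellings, *codes = (cell.strip() for cell in row.split("|"))
    for spelling in spellings.split():
        CHART[spelling] = tuple(codes)
LONGEST = max(len(spelling) for spelling in CHART)

def dm_soundex(name):
    """Return a six-digit code; spaces and punctuation are ignored."""
    letters = "".join(ch for ch in name.upper() if ch.isalpha())
    digits, last, i = "", None, 0
    while i < len(letters):
        # Take the longest spelling in the chart that matches at position i.
        for size in range(LONGEST, 0, -1):
            token = letters[i:i + size]
            if token in CHART:
                break
        else:
            raise ValueError(f"{letters[i]!r} is not in this partial chart")
        if CHART[token] is None:
            raise ValueError(f"{token!r} has two possible codes; not handled here")
        following = letters[i + len(token):i + len(token) + 1]
        column = 0 if i == 0 else 1 if following and following in VOWELS else 2
        code = CHART[token][column]
        if code != "NC" and code != last:   # adjacent equal codes count once
            digits += code
        last = code
        i += len(token)
    return (digits + "000000")[:6]

for name in ("Golden", "Alpert", "Breuer", "Manheim", "Topf", "Kleinman"):
    print(name, dm_soundex(name))
```

Under these assumptions the sketch reproduces the codes quoted in the rules (for example, GOLDEN 583600 and Kleinman 586660). A name such as Auerbach raises an error because CH has two possible codes.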
Examples of Daitch-Mokotoff Soundex Coding:
AUERBACH = 097500
A | UE | R | B | A | CH | |
0 | NC | 9 | 7 | NC | 5 | Pad |
0 | | 9 | 7 | | 5 | 00 |

OHRBACH = 097500
O | H | R | B | A | CH | |
0 | NC | 9 | 7 | NC | 5 | Pad |
0 | | 9 | 7 | | 5 | 00 |

LIPSHITZ = 874400
L | I | P | SH | I | TZ | |
8 | NC | 7 | 4 | NC | 4 | Pad |
8 | | 7 | 4 | | 4 | 00 |

LIPPSZYC = 874400
L | I | P | P | SZ | Y | C | |
8 | NC | 7 | NC | 4 | NC | 4 | Pad |
8 | | 7 | | 4 | | 4 | 00 |

LEWINSKY = 876450
L | E | W | I | N | S | K | Y | |
8 | NC | 7 | NC | 6 | 4 | 5 | NC | Pad |
8 | | 7 | | 6 | 4 | 5 | | 0 |

LEVINSKI = 876450
L | E | V | I | N | S | K | I | |
8 | NC | 7 | NC | 6 | 4 | 5 | NC | Pad |
8 | | 7 | | 6 | 4 | 5 | | 0 |

SZLAMAWICZ = 486740
SZ | L | A | M | A | W | I | CZ | |
4 | 8 | NC | 6 | NC | 7 | NC | 4 | Pad |
4 | 8 | | 6 | | 7 | | 4 | 0 |

SHLAMOVITZ = 486740
SH | L | A | M | O | V | I | TZ | |
4 | 8 | NC | 6 | NC | 7 | NC | 4 | Pad |
4 | 8 | | 6 | | 7 | | 4 | 0 |
For additional Daitch-Mokotoff soundex information, see
Where Once We Walked by Gary Mokotoff and Sallyann Amdur Sack
(Avotaynu, 2002), pages 567-569; or Gary Mokotoff's article
"Soundexing and Genealogy",
on the Avotaynu website.
To convert names to their corresponding Soundex codes,
use JewishGen's JOS Calculator.