JewishGen's Holocaust Database —
Instructions for Volunteers
Press Release, July 12, 2000: "Yad Vashem, the Holocaust Martyrs’
and Heroes’ Remembrance Authority, which holds the largest repository
in the world of names of Holocaust victims, and JewishGen, Inc., an
organization which represents the leading Internet site for those
researching their Jewish heritage have signed a data sharing agreement...
Yad Vashem is to work with JewishGen, Inc., who will be providing a
network of volunteers from special interest and research groups worldwide,
to index and digitize names from about 10,000 lists..."
As a team of volunteers, we are all working towards making the names from these
Lists accessible via JewishGen. Together our goal is to provide error-free
transcriptions of these Lists so that others may benefit. These instructions have
been established to help meet that goal. Included here are explanations of
the project roles, process, and
rules for both transcription and
validation.
Initially the process may be a bit cumbersome because of our current inability to
receive materials electronically from Yad Vashem. This adds a dependency on snailmail
to the process, which is something we are working to alleviate in the future.
After our volunteers return the transcribed and validated files, they will be
forwarded on to Yad Vashem for some additional validation. Once re-validated, the
files will be returned to JewishGen so that they can be loaded into searchable
databases.
The files used as input on this project are the "property" of Yad Vashem which has,
under the terms of its data sharing agreement with JewishGen, allowed JewishGen
volunteers to transcribe the data and subsequently make the data available in a
searchable database on the JewishGen website. Under no circumstances shall this data
be given in whole or in part to other volunteers on the project or to any other
persons. Volunteer transcribers and validators are prohibited from sharing their data
with another person unless directed to do so by the Group Leader or Project Technical
Coordinator. Under no circumstances should the completed spreadsheet be provided by
any group leader to the data entry volunteers or to any other persons. The completed
file is to be forwarded only to Nolan Altman, Technical Coordinator for the Yad Vashem
Project. Signing the Volunteer Agreement ensures that volunteers understand and agree to
both the terms of the project and the confidential nature of the data, and how it
must therefore be treated.
All volunteers must submit a JewishGen
Volunteer Agreement before being provided with project materials. Please specify
Yad Vashem Project on the form when asked for your assignment. The signed
agreement should then be sent via snailmail to your GROUP LEADER. Once all agreements
for a Group are collected and snailmailed to the project Technical Coordinator, then
work on the List can begin. Note that if you have previously submitted an agreement
but indicated assignments other than the Yad Vashem Project, then it is important that
you submit this new agreement to represent your participation specifically in this
project.
GROUP LEADERs, TRANSCRIBERs and VALIDATORs: Feel free to contact
Nolan Altman, this project's
Technical Coordinator, with any project related questions you may have.
Be sure to include the List identifier on all correspondence - it is the
unique identifier for each list and is in the format of JGxxxx.
Each List being indexed will have a single GROUP LEADER and
one or more TRANSCRIBERs and one or more VALIDATORs. Our work
may at times be difficult because of the quality of
the copies that we are using as input - that is why validation
is such an important part of our process. Thus in an effort to
strengthen the validation process, the VALIDATOR should never
be the same person as the TRANSCRIBER for any single section.
-
A GROUP LEADER will be responsible for:
- Finding volunteer TRANSCRIBERs and VALIDATORs
- Collecting volunteer agreements from all TRANSCRIBERs and VALIDATORs
- Managing the transcription of the List
- Managing the validation of the List
- Providing monthly status to the Technical Coordinator
- Enabling ease of communication across the group
- Compiling all transcribed materials
- Packaging the materials for return
-
Several TRANSCRIBERs will be responsible for transcribing
their assigned sections of the List while:
- Strictly adhering to the template provided for transcription
- Using the rules provided to ensure accurate transcriptions
- Providing monthly status to the GROUP LEADER
- Providing all materials needed by the section VALIDATOR
-
Several VALIDATORs will be responsible for validating
their assigned sections of the List while:
- Using the rules provided to ensure accurate transcriptions
- Providing monthly status to the GROUP LEADER
- Returning all materials to the GROUP LEADER
-
The Lists will be provided to the GROUP LEADERs. (Initially this will
be via snailmail, until such time as the lists can be provided electronically.)
-
After the GROUP LEADER submits their JewishGen Volunteer Agreement, the list-specific-EXCEL-template
will also be electronically provided to the GROUP LEADER. Each TRANSCRIBER will be required
to use the template that has been customized for their specific List.
-
The GROUP LEADER will:
- Divide the List into equally manageable sections, where each
section contains x number of pages
- Seek volunteers who will each be responsible for transcribing
individual sections of the List
- Seek volunteers who will each be responsible for validating
individual sections of the List
The number of volunteers and size of the List will probably
determine how many sections are created from each List. It may
be helpful to identify each section with a letter and make
assignments in that manner. For example: Pages 1-10 may comprise
section A of the List which will be transcribed by person x, and
validated by person y.
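The sectioning described above can be sketched in a few lines of Python. This is only an illustration, not part of the project tooling: the function names divide_into_sections and same_person_sections are hypothetical, and the section size is whatever the GROUP LEADER chooses.

```python
from string import ascii_uppercase


def divide_into_sections(total_pages, pages_per_section):
    """Split pages 1..total_pages into lettered sections (A, B, C, ...).

    Returns a dict mapping each section letter to (first_page, last_page).
    The last section may be shorter than the others.
    """
    if total_pages < 1 or pages_per_section < 1:
        raise ValueError("page counts must be positive")
    sections = {}
    starts = range(1, total_pages + 1, pages_per_section)
    for index, first in enumerate(starts):
        if index >= len(ascii_uppercase):
            raise ValueError("more than 26 sections; use larger sections")
        last = min(first + pages_per_section - 1, total_pages)
        sections[ascii_uppercase[index]] = (first, last)
    return sections


def same_person_sections(assignments):
    """Return the sections whose TRANSCRIBER and VALIDATOR are the same person.

    assignments maps a section letter to (transcriber, validator).
    An empty result means the separation rule is respected.
    """
    return [s for s, (t, v) in assignments.items() if t == v]


sections = divide_into_sections(25, 10)
conflicts = same_person_sections({"A": ("x", "y"), "B": ("z", "z")})
```

Here a 25-page List split into 10-page sections gives sections A, B and C, and section B is flagged because one person holds both roles.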
- The GROUP LEADER will direct the TRANSCRIBERs and the VALIDATORs to
these instructions. In return, each will send via snailmail, a signed
copy of the JewishGen Volunteer Agreement to the GROUP LEADER.
All questions, comments, and/or concerns with respect to this project and its processes
are certainly welcomed, but should initially be brought to the attention of the GROUP LEADER,
and then as necessary to the Project Technical Coordinator. It would be inappropriate to
raise concerns on these topics via any special interest or other public discussion group.
-
The GROUP LEADER will send to the TRANSCRIBERs:
- Assigned section(s) of the List (initially via snailmail)
- List-specific-EXCEL-template
- Name, e-mail (and initially snailmail) address of the section VALIDATOR
The GROUP LEADER will send to the VALIDATORs:
- Identification of assigned sections and the names of associated TRANSCRIBERs
- Name, e-mail (and initially snailmail) addresses of the GROUP LEADER
-
Each TRANSCRIBER will use the list-specific-EXCEL-template to transcribe
the information for their section(s) of the List, while following the transcription
rules contained in these instructions. If additional rules apply for any specific List
then documentation will be provided within the template.
-
All TRANSCRIBERs will provide an e-mail update to their Group Leader before the
25th of the month to document how many pages of their section have been transcribed,
in order to track progress.
-
When the transcription of their sections is completed, the TRANSCRIBERs will send
(initially via snailmail) the original assigned section, and e-mail the
softcopy of their transcribed sections to the assigned VALIDATOR for that section.
-
Each VALIDATOR will ensure the correctness of each section of transcribed data by
comparing the original documents to the transcribed files, while following the
validation rules contained in these instructions.
-
All VALIDATORs will provide an e-mail update to their Group Leader before the 25th
of the month to document how many pages of their section have been validated, in order to
track progress.
-
When the validation of their sections is completed, the VALIDATORs will send
(initially via snailmail) the original assigned section, and e-mail the
softcopy of their sections to the List's GROUP LEADER.
-
The GROUP LEADER will provide an e-mail update to the TECHNICAL COORDINATOR before
the 1st of the month to document how many volunteers have contributed in their effort,
how many pages of their List have been transcribed and how many validated. Be sure to include
the List identifier that is in the format of "JGxxxx", on all correspondence.
-
When the GROUP LEADER receives all validated sections, they will:
- Compile all sections together into one file, which will have only one set of headers,
and will contain names in the same order as the original List
- Ensure that the number of names in the original List corresponds to the number of
names transcribed
- E-mail the softcopy file back to the Technical Coordinator
- Send the hardcopy file via snailmail back to the Technical Coordinator
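As a rough sketch of the compile step, assuming each validated section has been read into a list of rows whose first row is the template header (the function names compile_sections and count_matches are hypothetical):

```python
def compile_sections(sections):
    """Join validated sections into one table with a single header row.

    sections is a list of sections, already in the order of the original
    List; each section is a list of rows and starts with the header row.
    """
    if not sections:
        return []
    header = sections[0][0]
    compiled = [header]
    for rows in sections:
        if not rows or rows[0] != header:
            raise ValueError("a section is empty or its header was changed")
        compiled.extend(rows[1:])  # keep the data rows, drop the repeated header
    return compiled


def count_matches(compiled, names_in_original):
    """True when the number of data rows equals the count in the original List."""
    return len(compiled) - 1 == names_in_original


section_a = [["Tracking", "Surname"], ["JG0137", "LEWIT"], ["JG0137", "LEWIN"]]
section_b = [["Tracking", "Surname"], ["JG0137", "SMITH"]]
merged = compile_sections([section_a, section_b])
```

The header check also catches a section in which the column headers were changed, which the template rules forbid. The column names in the sample rows are made up for the example.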
-
All volunteers are now experienced and are ready to work on another List!!
-
EXCEL is to be used for transcription while using the list-specific-EXCEL-template
provided, which has been customized to the List.
-
IMPORTANT NOTE: We have discovered some incompatibilities using EXCEL 2000. As a workaround
be sure to save all files that will be shared with other people, with the filetype described as
Microsoft Excel 97-2000 & 5.0/95 Workbook. This format seems to be more compatible across various products. Otherwise
files may not be easily emailed, viewed or edited among transcribing groups.
-
For this project many but not all of the characters that we need to use are represented by the
Arial CE font, which includes the characters of the Central European languages. The
list-specific-EXCEL-templates have all been defined to use the default font of Arial CE. It is
important that this not be changed.
If the original List includes special or accented characters, then be sure to follow the
guidelines for special characters.
In addition, at the specific request of Yad Vashem and only for lists written in German: for any
letter with an umlaut (two dots above), enter the letter without the umlaut and then add the
letter "e" after it. For example, the name Schönberg in a list would be transcribed as
Schoenberg, by removing the umlaut from the letter "o" and adding the letter "e" after it.
If there are no special or accented characters in the List being transcribed,
then this information does not apply. No actions relative to fonts or special keystrokes are required.
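For German lists, the umlaut rule can be illustrated with a short Python sketch. The function name expand_umlauts is hypothetical, and the choice to add an upper-case "E" inside an all-capitals word (as in a surname) is an assumption; the instructions only give a lower-case example.

```python
import unicodedata

# Umlauted letters and the plain letter that replaces each one.
UMLAUT_BASE = {"ä": "a", "ö": "o", "ü": "u", "Ä": "A", "Ö": "O", "Ü": "U"}


def expand_umlauts(text):
    """Replace each umlauted vowel with the plain vowel followed by 'e'."""
    # Combine "o" + combining dots into the single character "ö" first.
    text = unicodedata.normalize("NFC", text)
    out = []
    for i, ch in enumerate(text):
        base = UMLAUT_BASE.get(ch)
        if base is None:
            out.append(ch)
            continue
        nxt = text[i + 1:i + 2]
        prev = text[i - 1:i] if i > 0 else ""
        # Assumption: use "E" only when the neighbours show an all-caps word.
        in_caps = ch.isupper() and (
            nxt.isupper() or (not nxt.isalpha() and prev.isupper())
        )
        out.append(base + ("E" if in_caps else "e"))
    return "".join(out)


example = expand_umlauts("Schönberg")
```

The sketch deliberately leaves every other special character alone, since the rule above covers umlauts only.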
-
Do not change the column headers in the list-specific-EXCEL-template
-
A spreadsheet file can contain multiple worksheets. Each can be selected via a tab within a
single file. Our list-specific-EXCEL-templates will contain three worksheets. The worksheet
titled INSTRUCTIONS should be read first, as it includes any rules that are specific to your List.
The guidelines in this document apply to all Lists; where they differ, the rules
on the INSTRUCTIONS worksheet within each template take precedence over this document.
The DATA ENTRY worksheet is where the transcription itself is entered. The FONTS
worksheet needs to be updated by the transcriber
to indicate which special characters were used in the transcription of the list. This will
ensure that, when the lists are eventually loaded into databases, it is clear whether any
special characters were used that are not contained in Arial CE, as they will need to be changed
appropriately on the upload. Without this, the data may be loaded incorrectly.
-
First column of each row of data in the template shall contain a tracking number
that is associated with the original List. This value should be repeated
in the first column of every row of the spreadsheet (For example: JG0137)
- Data from the original List should be entered as it appears in that list, without
interpretation unless otherwise instructed. This means that the data should always be
entered as it appears on the page while using the same spellings. For
example, do not change Haim to Chaim, or Bialstok to Bialystok. Also enter the data in the
same order as it appears on the original List because this will ease the validation process
being performed by both the volunteer validator and by Yad Vashem.
-
Ensure surnames are entered in all capital letters. Capitalize only the first
letter for given names, town names, or months.
Ensure that honorary titles are placed in a separate field and are not
included in the surname, first name or given name fields.
For example DR JOHN SMITH would be entered as SMITH in the surname field, JOHN in the
given name field, and DR in the honorary title
field.
Hungarian women's names are created from the name of the husband, with the suffix "ne".
For those, the surname field should contain the husband's name without the suffix.
For example - the name Farkash Alexne, would be entered into the template as FARKASH for surname,
nothing for the field titled given name and Alex in the field for husband's first name.
If there is a 2nd name listed, then it is probably the woman's maiden name and first name.
If this is the case, then this information should be entered into the template in the fields titled
given name and maiden name.
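The simple two-word Hungarian case can be sketched as follows. The function name split_hungarian_married_name is hypothetical, and the sketch assumes the entry is written surname first with the suffix "ne" (or accented "né") attached to the husband's first name, as in the example above. Anything else, including entries with a 2nd name, is returned as None so that a person decides.

```python
def split_hungarian_married_name(entry):
    """Split an entry such as 'Farkash Alexne' into template fields.

    Returns None when the entry does not have the simple two-word form,
    so that longer or unusual entries are handled by hand.
    """
    parts = entry.split()
    if len(parts) != 2:
        return None
    surname, husband = parts
    # Require the suffix plus at least one letter of the husband's name.
    if len(husband) <= 2 or not husband.lower().endswith(("ne", "né")):
        return None
    return {
        "surname": surname.upper(),              # surnames in all capitals
        "given name": "",                        # not known from this form
        "husband's first name": husband[:-2],    # drop the two-letter suffix
    }


fields = split_hungarian_married_name("Farkash Alexne")
```

A first name that merely happens to end in "ne" would be split wrongly by this check, so the result is a suggestion for the TRANSCRIBER to confirm against the original, not a replacement for reading it.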
- Names of places should be entered as they appear with no additional information
(For example: Paris would be entered as Paris and NOT as Paris, France)
- Transcribe town names exactly as they appear; do not attempt to modernize
or translate the locality name. Another column may be added later, which will
contain the modern native town name.
-
Dates are to be entered as follows:
- Documents are all assumed to use DAY/MONTH/YEAR format
- Ensure the year is entered as 4 digits
(For example: 02/10/44 would be entered as 02/10/1944)
- Ensure the day and month are entered as 2 digits
(For example: 3/8/44 would be entered as 03/08/1944)
- Ensure that dates documented with month spelled out, are entered numerically
(For example: 6 July 1944 would be entered as 06/07/1944 and NOT 07/06/1944)
IMPORTANT NOTE:
It seems that there are inconsistencies
with respect to how EXCEL handles dates in the various levels of the product. Also
EXCEL has been known to change dd/mm to mm/dd, so attention to detail here is very important.
To avoid these problems - dates entered must be entered as text fields.
This means that any date field must start with a single quote. This is a sign to EXCEL
to treat the data as text. Make sure that it is a single quote, and not a double one,
and do not type a matching end quote. After the data is entered and once the enter key is hit,
the quote should disappear and the date should appear correctly.
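The date rules can be expressed as a small Python sketch. It is illustrative only: the name normalize_date is hypothetical, month names are recognized in English only, and the century parameter defaults to "19", which is an assumption that will be wrong for two-digit years belonging to the 1800s.

```python
import re

MONTHS = {
    "january": 1, "february": 2, "march": 3, "april": 4,
    "may": 5, "june": 6, "july": 7, "august": 8,
    "september": 9, "october": 10, "november": 11, "december": 12,
}


def normalize_date(raw, century="19"):
    """Return the date as DD/MM/YYYY text, reading the input as day first."""
    raw = raw.strip()
    numeric = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{2}|\d{4})", raw)
    spelled = re.fullmatch(r"(\d{1,2})\s+([A-Za-z]+)\s+(\d{2}|\d{4})", raw)
    if numeric:
        day, month, year = int(numeric[1]), int(numeric[2]), numeric[3]
    elif spelled and spelled[2].lower() in MONTHS:
        day, month, year = int(spelled[1]), MONTHS[spelled[2].lower()], spelled[3]
    else:
        raise ValueError(f"unrecognized date: {raw!r}")
    if len(year) == 2:
        year = century + year  # assumption: see the note on the century above
    if not (1 <= day <= 31 and 1 <= month <= 12):
        raise ValueError(f"day or month out of range: {raw!r}")
    return f"{day:02d}/{month:02d}/{year}"


example = normalize_date("6 July 1944")
```

The result is a text string, which matches the requirement that dates be stored as text so that the spreadsheet cannot swap the day and month.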
- Any person or individual piece of information about a person that appears scratched out, crossed out or otherwise removed from the original List, needs to be
identified to Yad Vashem. A comment should be added to the field titled transcription team
comments in order to indicate which field or column of data appears to have some original
data removed. For example, if an entire line is crossed out, then the transcription team
comments field should indicate "person crossed off list", and all other fields of data
for that person should appear blank. As another example: if the data in the place
of birth field is crossed out and a different town name is written and therefore was used
when transcribing the list, then the transcription team comments field should indicate "original place of birth
crossed out and replaced".
-
If any information from the original List is illegible, then simply enter "?" in the
appropriate field. If you can almost make it out but you are not 100% sure then you
can enter several possible surnames in the surname field, separated by '/' and indicating
the uncertainty by marking them with '?' also. For example if the surname is either
LEWIT or LEWIN then enter the surname as 'LEWIT ? / LEWIN ?'. Do not enter more than three
possible values for any field. Note that if after the final validation is performed, the data cannot be further
clarified, then it will be entered into the JewishGen searchable databases as is.
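The notation for illegible or uncertain values can be sketched as below. The function name format_uncertain is hypothetical, and marking a single uncertain reading as "LEWIT ?" is an assumption extended from the two-candidate example above.

```python
def format_uncertain(candidates):
    """Build the field value for an illegible or uncertain entry.

    No candidates  -> "?"
    One to three   -> each marked with "?" and separated by " / "
    More than three is rejected, following the limit in the rules.
    """
    cleaned = [c.strip() for c in candidates if c.strip()]
    if not cleaned:
        return "?"
    if len(cleaned) > 3:
        raise ValueError("do not enter more than three possible values")
    return " / ".join(f"{c} ?" for c in cleaned)


example = format_uncertain(["LEWIT", "LEWIN"])
```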
-
Each template will include a field titled transcription team comments. This field is intended
as a means of communication from the TRANSCRIBER to the VALIDATOR or to Yad Vashem,
who will also be validating each list. Information here might refer to a question that arose relative to a specific line in the original List.
-
Review the Transcription Rules, and take note of any additional rules that were
included in the list-specific-EXCEL-template.
-
TAKE NOTE: Since the date format being entered may be contrary to that in day to day
use by the TRANSCRIBERs, the VALIDATOR should pay particular attention to any fields that
contain dates.
-
Proofread each and every line of the assigned section and if corrections are necessary, then
change the data entered into the spreadsheet as appropriate:
- Ensure that all rules have been followed
- Validate that data has been correctly transcribed
- Ensure that the number of names in the original section is the same as the number
transcribed
- Ensure the 2nd portion of the date field entered (the yy portion of xx/yy/2000) does not
exceed the value of 12
- For fields that contain a question mark (?) or alternative spellings - do your best
to determine what is contained in the original list, and be sure that is what has been
transcribed. After you have validated the field - and maybe even changed the data entered
by the TRANSCRIBER - if you are 100% confident that the transcription matches the original
document then remove the ? or alternative spellings as appropriate. If there is any doubt
that the field has been correctly transcribed then leave the alternatives and/or "?" in place.
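Two of the checks above lend themselves to a quick automated pass before the line-by-line proofreading. The sketch below is an aid only and does not replace comparison with the original documents; the names check_date_field and still_uncertain are hypothetical.

```python
import re


def check_date_field(value):
    """Return a list of problems found in one date field (empty if none)."""
    if value in ("", "?"):
        return []  # blank or illegible: nothing to check mechanically
    match = re.fullmatch(r"(\d{2})/(\d{2})/(\d{4})", value)
    if not match:
        return ["not in DD/MM/YYYY form"]
    problems = []
    if not 1 <= int(match[2]) <= 12:
        problems.append("month is not between 01 and 12; day and month may be swapped")
    if not 1 <= int(match[1]) <= 31:
        problems.append("day is not between 01 and 31")
    return problems


def still_uncertain(row):
    """Return the names of fields in a row that still contain a '?'.

    row maps field names to their transcribed text.
    """
    return [name for name, text in row.items() if "?" in text]


date_problems = check_date_field("07/13/1944")
flagged = still_uncertain({"surname": "LEWIT ? / LEWIN ?", "given name": "Haim"})
```

A date such as 07/13/1944 is reported because its second portion exceeds 12, which usually means the day and month were entered in the wrong order.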
A small subset of the transcription rules has been
provided courtesy of Robinn Magid, author of the JRI-Poland Step-by-Step Guide.
Jewish Records Indexing - Poland, Inc. is an independent non-profit
U.S. tax-exempt organization.
Last Update: Nov 1, 2003 WSB