Autopsy Plutonium Database Codebook

This codebook accompanies the Autopsy Plutonium database (file labeled Autopsy_Pu), originally published by J.F. McInroy, et al., Health Physics, vol. 37 (July 1979). It gives information about variables, units, codes and recording inconsistencies.

Study: "Re-Evaluation of Plutonium in Autopsy Tissue"

McInroy, et al., Health Physics, July 1979

This codebook accompanies the first release of the Autopsy Plutonium database ("Autopsy_Pu" on the file label), as originally published in "Plutonium in Autopsy Tissue: A Revision and Updating of Data Reported in LA-4875"  by J.F. McInroy, E.E. Campbell, W.D. Moss, G.L. Tietjen, B.C. Eutsler and H.A. Boyd in volume 37 of the peer-reviewed journal Health Physics in July of 1979.

Warning!! This codebook gives the user valuable information about the meaning of the variables, such as the units in which the information was gathered, the codes used for "below minimum reportable levels", and inconsistencies in coding practices for such variables as cause_of_death and table/area. Without reading this codebook closely and following its advice, a user of this database could inadvertently make serious errors and come up with highly misleading results.

There are four types of variables: the ID variable, "CASENO", unique for each autopsy;  demographic variables; tissue assay variables; and database navigation variables.


Table 1.  The ID Variable, CASENO

variable  type  format  explanation
 CASENO      subject's case number


Table 2.  The Demographic Variables

variable   type format   explanation and units
 occup character   $24.  subject occupation*
 city  character  $24.  city of residence*
 state  character    $2.  state of residence,* 2-letter abbreviation

 nation  character  $16.  nation of residence*
 cause_of_death  character  $16.  coroner description*
 HEW_CODE  character    $7.  precursor to the ICD code
 sex  character    $1.  gender: M-male, F-female, N-not known
 age  numeric      3.  age in years*
 years_reside  numeric      3.   years residing in city subject died in*
 year_death  numeric      4.  year of death
 weight  numeric      3.   weight in kilograms*
* Fields frequently left blank       * All variables subject to knowledge of coroner

Note:  There have been a few changes to the Autopsy_Pu data to improve standardization and integrity of analysis.

The variable "nation" has been added to accommodate one Canadian.  To our knowledge, this is the only non-US resident, but we are following up on a couple of other incomplete entries.  We intend to add US everywhere in the nation field after we have finished researching these other entries.

The cause-of-death variable is the best example of why some data need to be adjusted.  Health investigators often work from a hypothesis regarding a disease, for example, heart disease.  An investigator interested in myocardial infarction, for instance, will find that it appears in the original data in numerous forms, including:

Table 3.  Various Abbreviations for Myocardial Infarction, HEW_CODES 420.0 and 420.1

Myocardial infarct  Myocard infarct  Myocardial infarction
 Cardio infarct  Cardio infarctio  Myo infarc

Then there are the pulmonary and cerebral infarcts, and all their variations.  The investigator needs all the myocardial infarcts collapsed into one group, all the cardio infarcts into another, and likewise the pulmonary and the cerebral infarcts, unless a subtype is specified.  We spelled out myocardial, cardial and infarction, which reduced the number of groups splintered by these common abbreviations.  We are still reviewing this grouping and will make further revisions as they prove judicious.  We take care not to generalize any death into a category that obscures a detail included in the original categorization.
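The kind of spelling-out described above can be sketched in Python. This is only an illustration, not the procedure actually used to build the database: the function name normalize_cause and the two regular-expression rules are assumptions, and the rules cover only the abbreviations listed in Table 3.

```python
import re

# Hypothetical rules, written for illustration only.
# Each rule spells out one family of abbreviations seen in Table 3.
_RULES = [
    # "Myo", "Myocard", "Myocardial" -> "Myocardial" (does not touch "Myocarditis")
    (re.compile(r"\bmyo(card(ial)?)?\b\.?", re.IGNORECASE), "Myocardial"),
    # "infarc", "infarct", "infarctio", "infarction" -> "infarction"
    (re.compile(r"\binfarc\w*\.?", re.IGNORECASE), "infarction"),
]


def normalize_cause(raw: str) -> str:
    """Spell out common abbreviations in a cause_of_death entry."""
    text = " ".join(raw.split())  # collapse stray whitespace
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text


for entry in ["Myocardial infarct", "Myocard infarct", "Myo infarc", "Cardio infarctio"]:
    print(entry, "->", normalize_cause(entry))
```

Note that the "Cardio" entries are left in their own group, in keeping with the rule of not merging categories that the original data kept apart.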

The variable HEW_CODE was originally entered in a less standardized form.  This code was a precursor to today's ICD-9 (International Classification of Diseases, version 9) code.  It comes in three parts, X123.789.  The first part is a letter designation, present in only some of the codes, which refers to different groupings of diseases.  In the Autopsy_Pu database, only one letter code occurs, and it designates a trauma injury.  The 123 part is the main numeric part.  We refer to it as the NMO part of the ICD Code.  The "N" place groups all diseases into 10 very large associations, and the remaining "M" and "O" places subdivide diseases within those associations.

The digits after the decimal narrow down the fairly specific NMO disease number, locating it to a specific tissue or pathogen.  Because so much information can be packed into a disease code, a professional standard for representing codes has become accepted:  a whole-number code always ends in a decimal and a zero.  In the original, some codes ended in a decimal and no zero, and some ended with no decimal.  These have all been adjusted to the standard.  In many cases the cause of death was not known.  In these cases, the field is filled with N/A.
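The adjustment to the standard form can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to prepare the database: the function name standardize_hew_code is hypothetical, the main numeric part is assumed to be exactly three digits, and any entry that does not fit the X123.789 pattern is returned unchanged.

```python
import re

# Assumed shape: optional letter, three-digit NMO part, optional decimal part.
_CODE = re.compile(r"^([A-Za-z]?)(\d{3})(?:\.(\d*))?$")


def standardize_hew_code(raw: str) -> str:
    """Return the code ending in a decimal and a digit, or "N/A" if unknown."""
    text = raw.strip()
    if not text or text.upper() == "N/A":
        return "N/A"
    match = _CODE.match(text)
    if match is None:
        return text  # unrecognized form: leave it for manual review
    letter, nmo, decimals = match.groups()
    # Whole-number codes ("420" or "420.") get ".0"; others keep their decimals.
    return f"{letter.upper()}{nmo}.{decimals or '0'}"


for code in ["420", "420.", "420.1", ""]:
    print(repr(code), "->", standardize_hew_code(code))
```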

Age. Three subjects are reported as having the age of 0 in the original dataset:  cases 8-150, 19-010, and 19-032. These ages should be taken with a degree of skepticism in light of these cases' occupations, weights (in kilograms) and organ sample weights (in grams). For instance, review some of their information:

Table 4.  Three Subjects Whose Age Equals Zero

It is my educated guess that none of them is an infant, and that "age = 0" really means that their age was unknown, in spite of what the data say. Those are some mighty heavy livers and lungs for babies.
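A user who shares this reading of the data may want to treat these three ages as unknown before analysis. A minimal Python sketch follows; the function name clean_age is hypothetical, and the choice to recode only the three flagged cases (rather than every age of 0) is an assumption made for the example.

```python
from typing import Optional

# Case numbers the codebook flags as reported with age 0.
SUSPECT_AGE_CASES = {"8-150", "19-010", "19-032"}


def clean_age(caseno: str, age: Optional[int]) -> Optional[int]:
    """Return None (unknown) when a flagged case carries an age of 0."""
    if age == 0 and caseno in SUSPECT_AGE_CASES:
        return None
    return age


print(clean_age("8-150", 0))    # flagged case: treated as unknown
print(clean_age("19-010", 0))   # flagged case: treated as unknown
```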


The tissue assay variables follow the same pattern for each organ or tissue.  First comes the name of the organ or tissue, which doubles as a comment field presenting the sample status (for example, whether it was "lost in analysis" or the "analysis is not available").  Then come the numeric assay variables:  the wet weight and volume of the sample, the volume actually analyzed, and the radioactivity measures ("activity" for short).  The activity measures are the activity of the volume analyzed and the activity standardized to one standard organ.  There is also an occasional measurement of the standard deviation of the analysis, which we record when available.

Table 5.  Organ Tissue Assay Variables


 Variable Name  Contents and Units of Measurements
 Organ  "Yes" if data present;  comments about missing samples if not
 Organ_wetweight  Wet weight of whole sample in grams
 Organ_volume  Volume of sample in cubic centimeters
 Organ_volanalyzed  Sample volume analyzed in cubic centimeters
 Organ_activityper  Activity per volume analyzed in disintegrations/minute
 Organ_stddev  Standard deviation of analysis
 Organ_actstdorg  Activity per standard organ in disintegrations/minute


The complete list of variables and their characteristics is given in the SAS output labeled "Autopsy_Pu_Data_Contents" following this Codebook.

Variables which are names of organs and tissues, such as "Liver", get a "Y" for "yes" if assay values are available. Entries such as "Lost in analysis", "Not Received", "Analysis Not Available", etc., are entered in the organ name field. When there are values, they are entered in the fields whose variable names begin with the organ name, such as "organname_wetweight".
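The naming pattern of Table 5 can be expressed as a small Python helper. This is a sketch only: the function names are hypothetical, the exact capitalization of the variable names in the data file is an assumption, and both "Y" and "Yes" are accepted because this codebook mentions both forms.

```python
# Suffixes taken from Table 5, in the order listed there.
ASSAY_SUFFIXES = (
    "wetweight", "volume", "volanalyzed", "activityper", "stddev", "actstdorg",
)


def assay_columns(organ: str) -> list:
    """Build the assay variable names for one organ, e.g. Liver_wetweight."""
    return [f"{organ}_{suffix}" for suffix in ASSAY_SUFFIXES]


def has_assay(status: str) -> bool:
    """True if the organ name field says assay values are present."""
    return status.strip().upper() in {"Y", "YES"}


print(assay_columns("Liver"))
print(has_assay("Y"), has_assay("Lost in analysis"))
```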


Addition of leading zeros.  In the assay values "organ_activityper" and "organ_actstdorg", many values smaller than 1.0 were given without a zero preceding the decimal point.  This is now considered a nonstandard way to render these values, as very small print, hard-to-read fonts and n-th generation copies can make decimal points illegible.  A leading zero is now considered essential for sub-unitary values;  thus, we have presented all values less than one with a leading zero.

Please Note Numeric Codes for Missing ("999")!

"999" is a numeric code for missing in the variable "organ_actstdorg".  Exclude this value before making calculations.  Left in the data, it will grossly inflate your dose calculations.
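The exclusion can be sketched in Python as below. The function names are hypothetical and the sample values are invented for illustration; they are not taken from the database.

```python
MISSING_CODE = 999  # numeric code for missing in organ_actstdorg


def valid_activities(values):
    """Drop missing entries (None or the 999 code) from a list of activities."""
    return [v for v in values if v is not None and v != MISSING_CODE]


def mean_activity(values):
    """Mean of the non-missing activities, or None if nothing remains."""
    kept = valid_activities(values)
    if not kept:
        return None
    return sum(kept) / len(kept)


sample = [0.5, 999, 1.5]           # illustrative values only
print(sum(sample) / len(sample))   # naive mean, grossly inflated by 999
print(mean_activity(sample))       # mean with the missing code excluded
```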


In the original dataset, there were also variables for activity of the analyzed sample and activity of a kilogram.  The activity of the standard organ is the value used for dose calculations, so we chose to use that value.  It is derived from the activity of the analyzed sample.  The activity of a kilogram is also an arithmetic result of the sample activity, so we elected not to include that variable.  Researchers may take the entered variables (weight of the sample, volume of the sample, volume analyzed and activity per volume analyzed) and obtain their own activity per kilogram if they need it.  If there is a perceived need in the user community for these additional variables, we will take up a collection and add them.
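One way a researcher might derive activity per kilogram from the entered variables is sketched below. The formula is an assumption, not the authors' documented method: it treats organ_activityper as the activity measured in the analyzed volume, scales it up to the whole sample by the ratio of sample volume to volume analyzed, and divides by the wet weight in kilograms. Check it against the original article before relying on it. The function name and the sample numbers are hypothetical.

```python
def activity_per_kg(wetweight_g, volume_cc, volanalyzed_cc, activityper_dpm):
    """Estimate activity per kilogram of wet tissue (disintegrations/minute/kg).

    Assumes the analyzed volume is a representative fraction of the sample.
    """
    if wetweight_g <= 0 or volanalyzed_cc <= 0:
        raise ValueError("wet weight and volume analyzed must be positive")
    # Scale the measured activity up to the whole sample.
    whole_sample_dpm = activityper_dpm * (volume_cc / volanalyzed_cc)
    # Convert grams to kilograms and normalize.
    return whole_sample_dpm / (wetweight_g / 1000.0)


# Illustrative numbers only: 500 g sample, 100 cc total, 10 cc analyzed.
print(activity_per_kg(500.0, 100.0, 10.0, 0.2))
```

Remember to exclude missing-value codes from the inputs before applying any such calculation.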

Table 6.  Database Navigation Variables

 Variable Name
 Table Number
 Primary Sampling Area
 Journal Page Number


The following table gives the dominant values for the tables of the Autopsy_Pu database.

Table 7.  Table-to-Geographic Area Navigation Lookup Table


 Table  Area
 A-1  Los Alamos City
 A-2  Greater New Mexico and Others
 A-3  Colorado
 A-4  New York City
 A-5  Pennsylvania
 A-6  Georgia and South Carolina
 A-7  Illinois


In reality, Table A-2 is a catch-all for miscellaneous cases, having subjects from one Canadian province (Nova Scotia) and 9 states other than New Mexico, including Texas, California, North Carolina, Michigan, Nebraska and more.
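For users working in Python, Table 7 can be carried as a simple lookup. The names TABLE_AREA and area_for_table are hypothetical. The area returned is only the dominant one for that table, so individual cases (especially in A-2) should be checked against their own city and state fields.

```python
# Dominant geographic area for each table, copied from Table 7.
TABLE_AREA = {
    "A-1": "Los Alamos City",
    "A-2": "Greater New Mexico and Others",
    "A-3": "Colorado",
    "A-4": "New York City",
    "A-5": "Pennsylvania",
    "A-6": "Georgia and South Carolina",
    "A-7": "Illinois",
}


def area_for_table(table: str):
    """Return the dominant area for a table number, or None if unrecognized."""
    return TABLE_AREA.get(table.strip().upper())


print(area_for_table("A-2"))
```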

The final navigation variable is journal page number. Each case has its page number referenced so the researcher can go back to the article and inspect the original entry.


This database will be enhanced with counties of the subjects' resident cities, distances to the nearest test site or nuclear weapons factory, and necessary corrections as they are brought to our attention.  To the best of our knowledge now, the data in this database are correct and reliable for analyses, so long as the caveats in this codebook are understood, followed and relied upon.

If you have any questions, comments or suggestions, please direct them to Rita Fellers at the University of North Carolina at Chapel Hill, 919-619-8091.
