Inventory

This section contains data about the origin, identity, location, and various other traits of the tissue and nucleic acid samples in the users' inventory. This includes samples currently residing in the users' inventory, as well as older samples that may have previously been in use but have since been sent to others, consumed, discarded, or lost. Because of this, the data in this section serve as both a historical record of all samples that have ever been in the users' possession and an active record of the samples that are currently in the users' possession.

Note

The text in this section uses the terms "nucleic acid" and "nucleic acid sample" interchangeably^[115]. At the time of this writing, the system does not attempt to record details at the molecular level, so the reader can be assured that comments about the location, source, etc. of a specific "nucleic acid" should be interpreted as referring to a sample and not a specific molecule.

Note

A careful reader may notice many similarities between the way data about tissue samples and nucleic acid samples are recorded, and may wonder why the two data sets aren't unified into a single table. When the TISSUE_DATA and NUCACID_DATA tables were initially designed they were not so similar, and there were some design-based reasons to keep them separate. As of Babase 5.6 that is no longer true, and with some adjustments those two tables surely could now be combined into a single table. In retrospect, we should have designed them this way from the beginning. Changing them now would require a non-trivial amount of work, so in separate tables they remain.

LOCATIONS

This table contains one row for every location that may be used to store tissue or nucleic acid samples.

Samples may be stored in varied locations with different organizations/research groups ("institutions"). The Institution column is included to allow easy segregation of locations across these varying locales.

The name of each distinct location is recorded in the Location column. Different organizations have their own conventions about how to organize and name storage locations, so this code may be a very descriptive and specific space ("Shelf 1, Rack 2, Box 3, Position D") or something more general ("PINK BOX").

Each Institution-Location pair must be unique.

To allow the use of nondescriptive general Location values but retain the ability to enforce uniqueness of specific ones, the boolean column Is_Unique is included. When Is_Unique is TRUE, the row's LocId may occur at most once across both the NUCACID_DATA.LocId and TISSUE_DATA.LocId columns (once total, not once per table). When FALSE, the LocId may be used any number of times in either table.
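
The Is_Unique rule can be illustrated with a short Python sketch. This is not the system's own implementation; the function name and the data shapes (a dict mapping each LocId to its Is_Unique value, and plain lists of the LocId values used in each table) are assumptions made for the example.

```python
from collections import Counter

def find_is_unique_violations(is_unique_by_locid, nucacid_locids, tissue_locids):
    """Return the LocIds flagged Is_Unique that are used more than once.

    Usage is counted across both tables together, so one use in
    NUCACID_DATA plus one use in TISSUE_DATA is already a violation.
    """
    usage = Counter(nucacid_locids) + Counter(tissue_locids)
    return sorted(
        locid
        for locid, count in usage.items()
        if is_unique_by_locid.get(locid, False) and count > 1
    )

# Hypothetical data: LocId 1 is a specific position (Is_Unique TRUE),
# LocId 2 is a general container such as "PINK BOX" (Is_Unique FALSE).
locations = {1: True, 2: False}
print(find_is_unique_violations(locations, [1, 2, 2], [1, 2]))
```

In this example LocId 1 is reported because it is used once in each table, while LocId 2 may be reused freely.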

Column Descriptions

LocId (Location Identifier)

A unique identifier for the location. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Institution

The INSTITUTIONS.Institution indicating the organization or research group at which this row's Location exists.

This column may not be NULL.

Location

A textual column naming this location.

This column may not be NULL.

Is_Unique

A boolean indicating whether or not this location at this institution is unique.

This column defaults to TRUE.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NUCACID_CONC_DATA (NUCleic ACID CONCentration DATA)

This table contains one row for every quantification of a nucleic acid sample's concentration. All concentrations are recorded in picograms per microliter (pg/μL).

A nucleic acid sample cannot be quantified before it was created, before the source tissue sample was collected, or before the tissue sample's donor entered the study population (if applicable); the Conc_Date cannot be before the related NUCACID_DATA.Creation_Date, TISSUE_DATA.Collection_Date, or BIOGRAPH.Entrydate. These dates already have a required sequence (Entrydate <= Collection_Date <= Creation_Date <= Conc_Date), so in many cases it would be sufficient for the system to require only that Conc_Date is on or after the Creation_Date. However, any of these date columns can be NULL, so for the sake of completeness the system separately checks that Conc_Date is on or after each of them.
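
The reason for checking each date separately can be seen in a small Python sketch. It is an illustration only: the function name is hypothetical, None stands in for NULL, and skipping NULL dates is an assumption about how the comparison behaves.

```python
from datetime import date

def conc_date_problems(conc_date, creation_date=None, collection_date=None, entrydate=None):
    """List the earlier dates that a Conc_Date wrongly precedes.

    None stands in for NULL. A NULL date cannot be compared, so it is
    skipped, which is why every date is checked on its own.
    """
    if conc_date is None:
        return []
    earlier = {
        "Creation_Date": creation_date,
        "Collection_Date": collection_date,
        "Entrydate": entrydate,
    }
    return [
        name
        for name, value in earlier.items()
        if value is not None and conc_date < value
    ]

# Creation_Date is unknown, so only the Collection_Date check can catch this.
print(conc_date_problems(date(2020, 2, 1), None, date(2020, 3, 1), date(2015, 1, 1)))
```

If only Creation_Date were checked, the quantification above would pass unnoticed because its Creation_Date is NULL.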

Some quantification methods may use a different unit of concentration than that used in this table. Nanograms per microliter (ng/μL) is especially common. Such concentrations must be converted to pg/μL before they are added to this table.

Tip

Use the NUCACID_CONCS view instead of this table. It includes an additional column that indicates concentration in ng/μL, and also allows the insertion of quantifications in ng/μL. The conversion between ng/μL and pg/μL is thus performed by the system and not the user.

Warning

Do not assume that the number of significant figures employed in the Pg_ul column is the "true" number of significant figures for this quantification. This table records concentrations from a variety of quantification methods with varying levels of accuracy and stores them all in a single column that records all data to the nearest 0.1 pg/μL^[116]. When new data are added, this column pays no attention to the number of provided significant figures and may indicate more than were actually used at the time of quantification. See the example below.

Example 3.2. (Mis)Use of Significant Figures in NUCACID_CONC_DATA

The concentration of a new DNA sample is determined to be 10.0 ng/μL, which has 3 significant figures. When recorded in NUCACID_CONC_DATA, this concentration will be recorded in Pg_ul as 10000.0 pg/μL, with 6 significant figures. A user should not assume that this quantification was originally performed with 6 significant figures' accuracy.
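
The arithmetic in this example can be reproduced with Python's decimal module. This is a sketch only: the function name is hypothetical, and the rounding mode applied when storing to the nearest 0.1 pg/μL is an assumption (the default of Decimal.quantize is used here).

```python
from decimal import Decimal

PG_PER_NG = Decimal(1000)

def ng_to_pg(ng_per_ul):
    """Convert a concentration in ng/uL to pg/uL, stored to the nearest 0.1.

    The input is passed as text so that the digits supplied by the
    user are preserved exactly rather than going through a float.
    """
    return (Decimal(ng_per_ul) * PG_PER_NG).quantize(Decimal("0.1"))

# "10.0" has three significant figures and "10" has two,
# but both are stored with the same six digits.
print(ng_to_pg("10.0"))
print(ng_to_pg("10"))
```

Both inputs are stored identically, so the stored value alone cannot reveal how precise the original quantification was.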

Column Descriptions

NACId (Nucleic Acid Concentration Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

NAId (Nucleic Acid Identifier)

The NUCACID_DATA.NAId of the quantified sample.

This column may not be NULL.

Conc_Method

The NUCACID_CONC_METHODS.Conc_Method used to quantify this concentration.

This column may not be NULL.

Conc_Date (Concentration Date)

The date that this concentration was quantified.

This column may be NULL, when the date is unknown.

Pg_ul (Picograms per microliter)

The concentration of the sample according to this quantification, in picograms per microliter (pg/μL).

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NUCACID_CREATORS (NUCleic ACID CREATORS)

This table contains one row for every person involved with the creation of a specific nucleic acid sample. When a nucleic acid sample has multiple creators, each of them is recorded here in a separate row.

Most nucleic acid samples are created via "extraction". This table favors using "creation" rather than "extraction", for reasons explained in the discussion of the NUCACID_DATA table.

Each NAId-Creator combination must be unique; a sample cannot have the same creator more than once.

Tip

Use the NUCACIDS view to insert data into this table. It provides a simple way to determine the appropriate NAId value to use, and for a human data enterer to provide multiple creators in a single row.

Column Descriptions

NACrId (NUCACID_CREATORS Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

NAId (Nucleic Acid Identifier)

The NUCACID_DATA.NAId of the related nucleic acid sample.

This column may not be NULL.

Creator

The LAB_PERSONNEL.Initials of this creator.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NUCACID_DATA (General information about NUCleic ACID samples)

This table contains one row for every nucleic acid sample that is or ever has been in the inventory. With a few exceptions, each nucleic acid sample's "source" tissue sample is recorded in the TId column, and is attributed to a specific individual in the UIId column.

Tip

Always use the NUCACIDS view in place of this table. It contains additional related columns which may be of interest.

When the nucleic acid sample is attributed to a specific individual and its source tissue is provided, the individual indicated in this table must be that of the source tissue. That is, when the UIId and TId are both non-NULL, the UIId in this table must be equal to the TId's TISSUE_DATA.UIId.

This table records a nucleic acid sample's current location using the LocId column. Values in this column constrain and are constrained by values in the TISSUE_DATA.LocId column, and may or may not be unique, as discussed in the LOCATIONS table.

The Name_on_Tube column indicates whatever "name" or other identifying information is recorded on the tube. Because of labeling errors or misidentification in the field, this value may not indicate the true identity of the individual(s) from whom this sample came.

Two columns in this table record information related to the sample's creation: Creation_Date and Creation_Method. The people who created the sample are recorded in the related NUCACID_CREATORS table. In laboratory vernacular, the term "extraction" is usually favored over "creation" for most nucleic acid sample types. However, some samples are not "extracted" and are instead generated via a laboratory procedure (e.g. reverse transcription, dilution, PCR amplification, etc.). Because of this, the generic term "creation" is used here.

A sample's Creation_Date cannot be before the source tissue's Collection_Date, nor before the source individual's Entrydate, if any. It may often be redundant to verify that Creation_Date is on or after both dates, but this redundancy is intended, as discussed above.

This table attempts to keep an ongoing record of a sample's current volume in the Actual_Vol_ul column. It is left to the user to judge this column's accuracy, which depends greatly on 1) how diligently the lab personnel keep the data manager(s) informed of changes, and 2) the amount of time that has passed since this volume was determined^[117]. To assist users in making these judgments, the date that the Actual_Vol_ul was last updated is recorded in the Actual_Vol_Date column. A sample's current volume cannot be recorded without also recording this date; both of the Actual_Vol_ul and Actual_Vol_Date columns must be NULL or both non-NULL.

A sample cannot have its current volume determined before the sample was created; the Actual_Vol_Date must be on or after the sample's Creation_Date.

It is unlikely, though not impossible, that a sample's volume might increase after its creation. The system will report a warning when a sample's Actual_Vol_ul is greater than its Initial_Vol_ul.
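
The volume rules above distinguish between errors, which reject the data, and a warning, which does not. A minimal Python sketch of that distinction follows; the function name and message texts are hypothetical, None stands in for NULL, and skipping the date comparison when Creation_Date is NULL is an assumption.

```python
from datetime import date

def check_actual_volume(initial_vol, actual_vol, actual_vol_date, creation_date):
    """Return (errors, warnings) for one sample's volume columns."""
    errors, warnings = [], []
    # Both columns must be NULL together or non-NULL together.
    if (actual_vol is None) != (actual_vol_date is None):
        errors.append("Actual_Vol_ul and Actual_Vol_Date must both be NULL or both be non-NULL")
    # The volume cannot have been determined before the sample existed.
    if (
        actual_vol_date is not None
        and creation_date is not None
        and actual_vol_date < creation_date
    ):
        errors.append("Actual_Vol_Date is before Creation_Date")
    # An increase in volume is unusual but allowed, so it only warns.
    if initial_vol is not None and actual_vol is not None and actual_vol > initial_vol:
        warnings.append("Actual_Vol_ul is greater than Initial_Vol_ul")
    return errors, warnings

print(check_actual_volume(50.0, 60.0, date(2020, 2, 1), date(2020, 1, 1)))
```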

A nucleic acid sample might have been created from one or more other nucleic acid samples. When this occurs, the nucleic acid sources are recorded in the NUCACID_SOURCES table.

When a nucleic acid sample has more than one nucleic acid source, the sample might contain samples from more than one source tissue, and/or more than one individual. In either case, the TId and/or UIId columns may not be applicable to the sample and will need to be validated using alternative rules. Two boolean columns — Multi_Indivs and Multi_TIds — are used to tell the system how to validate those columns.

The Multi_Indivs column is used to indicate whether this sample's sources are from multiple individuals, regardless of the number of sources the sample has. When TRUE, the UIId must be NULL. When FALSE, the UIId must not be NULL.

The Multi_TIds column is used to indicate whether this sample's sources are originally from multiple tissue samples, or TId's. As above, this is mostly independent from the number of individuals attributed to the sample. When TRUE, the TId must be NULL. When FALSE, the TId must not be NULL.
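
The way these two flags govern the UIId and TId columns can be summarized in a short Python sketch. The function name and message texts are hypothetical, and None stands in for NULL.

```python
def check_multi_flags(uiid, tid, multi_indivs=False, multi_tids=False):
    """Return the rule violations for one NUCACID_DATA row."""
    errors = []
    # Each flag decides whether its column must be NULL or must be non-NULL.
    for flag_name, flag, column, value in (
        ("Multi_Indivs", multi_indivs, "UIId", uiid),
        ("Multi_TIds", multi_tids, "TId", tid),
    ):
        if flag and value is not None:
            errors.append(f"{column} must be NULL when {flag_name} is TRUE")
        elif not flag and value is None:
            errors.append(f"{column} must not be NULL when {flag_name} is FALSE")
    return errors

# One individual, but pooled from several of that individual's tissue samples.
print(check_multi_flags(uiid=7, tid=None, multi_tids=True))
```

The example row is valid: the two flags are evaluated independently, so a sample can have multiple source tissues while still being attributed to a single individual.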

Column Descriptions

NAId (Nucleic Acid Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

TId (Tissue Identifier of Source)

The TISSUE_DATA.TId of the tissue sample from which this nucleic acid sample originated.

This column may be NULL, indicating that the sample has multiple TId sources.

UIId (Unique Individual Identifier)

The UNIQUE_INDIVS.UIId of the individual to whom this nucleic acid sample is attributed.

This column may be NULL, indicating that this sample is attributed to more than one individual.

LocId (Identifier for the sample's current location)

The LOCATIONS.LocId indicating the current locale and location of the nucleic acid sample.

This column may not be NULL.

Name_on_Tube

The name of the source individual, according to the label on the tube.

This column may be NULL, when there is no identifying information on the tube. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

NucAcid_Type (Nucleic Acid sample Type)

The NUCACID_TYPES.NucAcid_Type of this nucleic acid sample.

This column may not be NULL.

Creation_Date

The date that this nucleic acid sample was created. When the process to generate a sample lasts more than one day, this is the date that the procedure was completed.

This column may be NULL, when the creation date is unknown.

Creation_Method

The NUCACID_CREATION_METHODS.Creation_Method describing how this nucleic acid sample was created.

This column may not be NULL.

Initial_Vol_ul (Initial Volume in μL)

The sample's volume, in microliters, when it was first created.

This column may be NULL, when the initial volume is unknown.

Actual_Vol_ul (Actual Volume, in μL)

The sample's volume, in microliters, as of the Actual_Vol_Date.

This column may be NULL, when users have not updated the sample's "current" volume or when the sample has not yet been used.

Actual_Vol_Date (Date of the recorded Actual Volume)

The date that the Actual_Vol_ul was determined.

This column may be NULL, when users have not updated the sample's "current" volume or when the sample has not yet been used.

Multi_Indivs

A boolean, indicating whether or not this sample is attributed to more than one individual.

This column may not be NULL. It is presumed that most nucleic acid samples will be attributed to a single individual, so this column's default value is FALSE.

Multi_TIds

A boolean, indicating whether or not this sample has more than one source tissue.

This column may not be NULL. It is presumed that most nucleic acid samples will have a single tissue source, so this column's default value is FALSE.

Notes

Comments or miscellaneous information about this nucleic acid sample.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NUCACID_LOCAL_IDS (LOCAL IDentifierS for NUCleic ACID samples)

This table contains one row for every name or ID used only at a specific institution (an ID that is "local" to that institution) to describe a particular nucleic acid.

Identity of samples is maintained by the system as much as possible, but when working with samples in the laboratory this is often inconvenient or impractical. Different groups and institutions often have their own systems for giving unique names to their samples, and while these names may be useful and meaningful for humans, they are mostly unhelpful from the database's perspective. They're vulnerable to typos, and can be very confusing when a sample is shared between institutions. However, these "local names" remain important for the people who are actually using these samples, so these identifiers are recorded in this table, one per nucleic acid sample, per institution.

Every combination of NAId and Institution must be unique; an NAId cannot go by more than one local name at the same Institution.

Every combination of Institution and LocalId must be unique; the same local name cannot be used at a single Institution more than once.
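
The effect of these two uniqueness rules can be tried out with a small Python sketch. An in-memory SQLite table stands in for the real table; the lower-case table name, the column types, the constraint syntax, and the sample values are all assumptions made for the illustration and are not taken from the system's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE nucacid_local_ids (
        naid        INTEGER NOT NULL,
        institution TEXT    NOT NULL,
        localid     TEXT    NOT NULL,
        UNIQUE (naid, institution),     -- one local name per sample per institution
        UNIQUE (institution, localid)   -- one sample per local name per institution
    )
    """
)

def try_insert(naid, institution, localid):
    """Insert one row; return True on success, False if a uniqueness rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO nucacid_local_ids VALUES (?, ?, ?)",
                (naid, institution, localid),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

After try_insert(1, "LAB_A", "DNA-001") succeeds, a second name for sample 1 at LAB_A is rejected, as is the name DNA-001 for a different sample at LAB_A. The same sample and the same name remain available at another institution.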

Column Descriptions

NAId (Nucleic Acid Identifier)

The NUCACID_DATA.NAId of the nucleic acid sample.

This column may not be NULL.

Institution

The INSTITUTIONS.Institution indicating the organization or research group at which this NAId's name is used.

This column may not be NULL.

LocalId (Local Identifier)

The local name used for this NAId at this Institution.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NUCACID_SOURCES

This table contains one row for every nucleic acid sample having another nucleic acid as its source.

Tip

Always use the NUCACID_SOURCES_EXT view in place of this table. It has additional columns which may be of interest.

Often, nucleic acid samples are created through some "extraction" process in which the nucleic acids are purified from a tissue sample (e.g. a blood draw, a buccal swab, etc.). However, there are also numerous different methods by which nucleic acid samples may instead be created from another nucleic acid sample (e.g. PCR^[118], reverse transcription, dilution, etc.). In addition to recording the identity of the source nucleic acid, this table includes the Relationship column, which indicates the nature of the connection between the row's nucleic acid and its source nucleic acid. This relationship may be simple enough to explain in a single word (e.g. "DILUTION"), or complex enough to require a lengthy explanation. To allow this flexibility, Relationship is not constrained to a set of legal values in a support table.

A nucleic acid sample cannot indicate itself as its source; the NAId and Source_NAId cannot be equal.

A nucleic acid cannot have been created before its source; the related Creation_Date of this NAId must be on or after the Source_NAId's related Creation_Date.

When a sample has a single source tissue, its source nucleic acid samples must also be from that tissue and only that tissue. That is, when a sample's NUCACID_DATA.TId is not NULL, all of its source nucleic acid samples must also have that same TId.

When a sample is attributed to a single individual, its source nucleic acid samples must also be from that individual and only that individual. That is, when a sample's NUCACID_DATA.UIId is not NULL, all of its source nucleic acid samples must also have that same UIId.

A single sample can have multiple sources, but each of those multiple sources must be unique. Every NAId-Source_NAId pair must be unique.
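
The per-row rules above (all except the uniqueness of the NAId-Source_NAId pair) can be sketched in Python as follows. This is an illustration, not the system's implementation: the function name, the dict-based rows, and the message texts are hypothetical, None stands in for NULL, and skipping the date comparison when either Creation_Date is NULL is an assumption.

```python
from datetime import date

def check_nucacid_source(sample, source):
    """Return the rule violations for recording `source` as a source of `sample`.

    Both arguments are dicts with the keys NAId, Creation_Date, TId, and UIId.
    """
    errors = []
    if sample["NAId"] == source["NAId"]:
        errors.append("a sample cannot be its own source")
    s_date, src_date = sample["Creation_Date"], source["Creation_Date"]
    if s_date is not None and src_date is not None and s_date < src_date:
        errors.append("sample was created before its source")
    # The TId and UIId rules apply only when the sample has a single source
    # tissue or a single individual, i.e. when its own value is non-NULL.
    for column in ("TId", "UIId"):
        if sample[column] is not None and source[column] != sample[column]:
            errors.append(f"source {column} differs from the sample's {column}")
    return errors

# A sample pooled from several tissues (TId is NULL) of individual 7.
pooled = {"NAId": 10, "Creation_Date": date(2021, 5, 1), "TId": None, "UIId": 7}
source = {"NAId": 3, "Creation_Date": date(2021, 4, 1), "TId": 55, "UIId": 7}
print(check_nucacid_source(pooled, source))
```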

Column Descriptions

NAId (Nucleic Acid Identifier)

The NUCACID_DATA.NAId of the nucleic acid that has another nucleic acid as its source.

This column may not be NULL.

Source_NAId (Nucleic Acid Identifier of Source)

The NUCACID_DATA.NAId of the source nucleic acid.

This column may not be NULL.

Relationship

A textual description of how this nucleic acid and its source are connected.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

POPULATIONS

This table contains one row for every population under observation, and/or from which tissue or nucleic acid samples have been collected.

In this context, the term "population" refers to a particular species at a specific location. "The baboons in the Amboseli basin in Kenya", for example, are a population. "All baboons", or "all wildlife in the Amboseli basin", are not.

In the common vernacular, a population is often referred to only by the name of its site, e.g. "Gombe" when referring to the Gombe chimpanzees. Because of this, the Pop_Name and Site columns may seem redundant, but when setting vernacular aside it should be obvious that these two columns contain objectively different information. In practice, users may elect to enter the same value in both of these columns, but the two columns remain independent of each other.

Special Values

PopId 1 has special meaning to the system. Data integrity rules for the UNIQUE_INDIVS table presume that the population with this PopId is the population whose individuals are recorded in BIOGRAPH. No other code should be created to refer to that population.

Column Descriptions

PopId (Population Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Pop_Name (Population Name)

The name of the population.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Species_Sci_Name (Scientific Name of the Species)

The scientific name of this population's species.

This column may be NULL, when unknown or not applicable. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Species_Common_Name (Common Name of the Species)

The common name of this population's species.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Wild_Captive

A code indicating whether the population is wild or captive. The legal values are shown below.

POPULATIONS.Wild_Captive Values

W: Wild.
C: Captive.
U: Unknown.
NA: Not applicable.

This column may not be NULL.

Site

The location of the population.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Notes

Comments or miscellaneous information about this population.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TISSUE_DATA (General information about TISSUE samples)

This table contains one row for every tissue sample that is or ever has been in the inventory.

Tip

Always use the TISSUES view in place of this table. It contains additional related columns which may be of interest.

This table records a tissue sample's current location using the LocId column. Values in this column constrain and are constrained by values in the NUCACID_DATA.LocId column, and may or may not be unique, as discussed in the LOCATIONS table.

If a sample was collected from an individual in BIOGRAPH — if the related UNIQUE_INDIVS.UIId has a PopId of 1 — the sample's Collection_Date must be on or after that individual's Entrydate. Depending on the sample's Tissue_Type, the Collection_Date may also be constrained by the individual's Statdate. See TISSUE_TYPES for more information.

Note

The Collection_Date and Collection_Time columns are misnomers. Some tissue samples were not collected; rather, their origination might better be described as having been "generated", "aliquotted", "pooled", etc.

The system will return a warning if a sample's Collection_Date is after the individual's Statdate, but only when the sample's Tissue_Type indicates that the Collection_Date is not constrained by the individual's Statdate. That is, when the related TISSUE_TYPES.Max_After_Statdate is NULL.

From time to time, field observers may mistakenly record the wrong collection date on a tube. To help identify when this has occurred, the system uses the CENSUS table to confirm whether the Collection_Date is a date that the individual was actually observed^[119]. The result of that confirmation is indicated in the Collection_Date_Status column.

Note

When a sample's Collection_Date is not a Date on which the individual was recorded present in CENSUS, the Collection_Date is not necessarily "wrong". There are numerous circumstances in which a sample may have been collected without a census being performed.

Tip

Do not assume that the date written on a sample's label will always match the Collection_Date. When data managers determine that the date written on a label is erroneous, they may be able to determine the true date and update the Collection_Date as needed.

A tissue sample might have been created from one or more other tissue samples. When this occurs, the tissue's sources are recorded in the TISSUE_SOURCES table.

When a tissue sample has more than one tissue source, the sample might contain samples from more than one individual. In that case, some individual-specific columns may not be applicable to the sample and will need to be validated using alternative rules. The boolean Multi_Indivs column is used to tell the system how to validate those columns. When TRUE, this indicates that the tissue's sources are from multiple individuals, and therefore:

UIId must be NULL.
Misid_Status must be NULL.
Collection_Date_Status is not calculated and is set to NULL.

When Multi_Indivs is FALSE, this indicates that the sample is only associated with a single individual, and therefore:

UIId cannot be NULL.
Misid_Status cannot be NULL.
Collection_Date_Status is calculated and will not be NULL.

The system will return a warning if a sample is indicated as having multiple individuals, but does not have multiple tissue sources from different individuals. That is, if a tissue has a TRUE Multi_Indivs but does not have at least two tissue sources in TISSUE_SOURCES from at least two different UIIds.
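
The Multi_Indivs rules for the UIId and Misid_Status columns, together with the warning just described, can be sketched in Python. The function name, message texts, and the "X" placeholder for a Misid_Status value are hypothetical, and None stands in for NULL. Ignoring sources whose own UIId is NULL when counting distinct individuals is an assumption.

```python
def check_tissue_multi_indivs(multi_indivs, uiid, misid_status, source_uiids=()):
    """Return (errors, warnings) for one TISSUE_DATA row.

    source_uiids holds the UIId of each of this sample's sources
    in TISSUE_SOURCES.
    """
    errors, warnings = [], []
    if multi_indivs:
        if uiid is not None:
            errors.append("UIId must be NULL")
        if misid_status is not None:
            errors.append("Misid_Status must be NULL")
        # Two different UIIds imply at least two sources.
        distinct = {u for u in source_uiids if u is not None}
        if len(distinct) < 2:
            warnings.append("Multi_Indivs is TRUE without sources from two different UIIds")
    else:
        if uiid is None:
            errors.append("UIId cannot be NULL")
        if misid_status is None:
            errors.append("Misid_Status cannot be NULL")
    return errors, warnings

# Flagged as multi-individual, but both sources come from individual 7.
print(check_tissue_multi_indivs(True, None, None, source_uiids=[7, 7]))
```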

Column Descriptions

TId (Tissue Identifier)

A unique identifier for the tissue sample. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

UIId (Unique Individual Identifier)

The UNIQUE_INDIVS.UIId of the individual from whom this tissue sample was collected.

This column may be NULL, as described above.

LocId (Identifier for the sample's current location)

The LOCATIONS.LocId indicating the current locale and location of the sample.

This column may not be NULL.

Name_on_Tube

The name of the individual from whom this tissue sample was collected, according to the label on the tube.

This column may be NULL, when there is no identifying information on the tube. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Collection_Date

The date the sample was collected or originated.

This column may be NULL, when the date is unknown.

Collection_Time

The time the sample was collected or originated.

This column may be NULL, when the time is unknown.

Tissue_Type

The TISSUE_TYPES.Tissue_Type of this tissue sample.

This column may not be NULL.

Storage_Medium

The STORAGE_MEDIA.Storage_Medium in which the sample is stored.

This column may not be NULL.

Misid_Status (Misidentification Status)

The MISID_STATUSES.Misid_Status of this tissue sample.

This column may be NULL, as described above.

Collection_Date_Status

A code indicating whether this row's Collection_Date is or isn't plausible according to available CENSUS data. The legal values are:

Valid TISSUE_DATA.Collection_Date_Status Values

Code	Description
0	This individual is part of the main population and has a non-"absent" CENSUS row on this Collection_Date, OR this individual is not part of the main population and we have no basis to question the accuracy of this Collection_Date.
1	This Collection_Date is NULL, OR this individual is part of the main population and either i) has no CENSUS rows on this Collection_Date or ii) has only "absent" censuses on this Collection_Date.

This column is automatically maintained by the database and will only be NULL when Multi_Indivs is TRUE. Attempts to manually populate or update this column are silently ignored.
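
The logic behind these codes can be sketched in Python. This is an illustration only: the function name and arguments are hypothetical, None stands in for NULL, and representing each CENSUS row as a boolean "absent" flag is an assumption, since the way CENSUS records absence is not described here.

```python
from datetime import date

def collection_date_status(collection_date, in_main_population,
                           census_absent_flags, multi_indivs=False):
    """Return 0, 1, or None (NULL), following the table of legal values.

    census_absent_flags holds one boolean per CENSUS row for this
    individual on the Collection_Date: True for an "absent" census,
    False otherwise.
    """
    if multi_indivs:
        return None  # not calculated for samples from multiple individuals
    if collection_date is None:
        return 1
    if not in_main_population:
        return 0
    # Main population: plausible only if some census that day is not "absent".
    return 0 if any(not absent for absent in census_absent_flags) else 1

# Present in one census and absent in another on the same day.
print(collection_date_status(date(2019, 7, 4), True, [True, False]))
```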

Multi_Indivs

A boolean, indicating whether or not this sample is attributed to multiple individuals. The value of this column affects how other columns are validated, as described above.

This column may not be NULL. It is presumed that most samples will not be attributed to multiple individuals, so the default value for this column is FALSE.

Notes

Comments or miscellaneous information about this tissue sample.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TISSUE_LOCAL_IDS (LOCAL IDentifierS for TISSUE samples)

This table contains one row for every name or ID used only at a specific institution (an ID that is "local" to that institution) to describe a particular tissue sample.

For more details about the reason for this table and the difference between a "local" name/identifier and an ID generated by the database, see the discussion for the NUCACID_LOCAL_IDS table.

Every combination of TId and Institution must be unique; a TId cannot go by more than one name at the same Institution.

Every combination of Institution and LocalId must be unique; the same local name cannot be used at a single Institution to describe more than one sample.

Column Descriptions

TId (Tissue Identifier)

The TISSUE_DATA.TId of the tissue sample.

This column may not be NULL.

Institution

The INSTITUTIONS.Institution indicating the locale in which this TId's name is used.

This column may not be NULL.

LocalId

The local name used for this TId at this Institution.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TISSUE_SOURCES

This table contains one row for every tissue sample having another tissue as its source.

Tip

Always use the TISSUE_SOURCES_EXT view in place of this table. It has additional columns which may be of interest.

In addition to recording the identity of the source tissue, this table includes the Relationship column, which indicates the nature of the connection between the row's tissue and its source tissue. This relationship may be simple enough to explain in a single word (e.g. "ALIQUOT"), or complex enough to require a lengthy explanation. To allow this flexibility, Relationship is not constrained to a set of legal values in a support table.

A tissue sample cannot indicate itself as its source; the TId and Source_TId cannot be equal.

A tissue sample cannot have been collected before its source; the related Collection_Date of this TId must be on or after the Source_TId's related Collection_Date.

Depending on the details of the Relationship, a tissue sample and its source may or may not be from a different individual. The system does not require that the related UIId of the TId and that of the Source_TId be equal. However, the system will return a warning when they are not equal.

A single sample can have multiple sources, but each of those multiple sources must be unique. Every TId-Source_TId pair must be unique.

Usually, a sample should not have sources from different individuals if it is not indicated as coming from multiple individuals. It is theoretically possible, so the system will return a warning when this occurs. That is, when a sample whose TISSUE_DATA.Multi_Indivs is FALSE has multiple sources in this table with different UIId's or has any sources with a TRUE Multi_Indivs.
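
The per-row rules for a tissue source mix hard errors with a warning, which the following Python sketch tries to make explicit. The function name, the dict-based rows, and the message texts are hypothetical, and None stands in for NULL. Skipping comparisons that involve a NULL date or a NULL UIId is an assumption.

```python
from datetime import date

def check_tissue_source(sample, source):
    """Return (errors, warnings) for recording `source` as a source of `sample`.

    Both arguments are dicts with the keys TId, Collection_Date, and UIId.
    """
    errors, warnings = [], []
    if sample["TId"] == source["TId"]:
        errors.append("a sample cannot be its own source")
    s_date, src_date = sample["Collection_Date"], source["Collection_Date"]
    if s_date is not None and src_date is not None and s_date < src_date:
        errors.append("sample was collected before its source")
    # Different individuals are allowed, but worth a second look.
    s_uiid, src_uiid = sample["UIId"], source["UIId"]
    if s_uiid is not None and src_uiid is not None and s_uiid != src_uiid:
        warnings.append("sample and source have different UIIds")
    return errors, warnings

aliquot = {"TId": 20, "Collection_Date": date(2018, 6, 2), "UIId": 7}
parent = {"TId": 12, "Collection_Date": date(2018, 6, 1), "UIId": 8}
print(check_tissue_source(aliquot, parent))
```

In this example the row is accepted, since nothing in the errors list is reported, but the mismatch between the two UIIds produces a warning.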

Column Descriptions

TId (Tissue Identifier)

The TISSUE_DATA.TId of the tissue sample that has another tissue sample as its source.

This column may not be NULL.

Source_TId (Tissue Identifier of Source)

The TISSUE_DATA.TId of the source tissue sample.

This column may not be NULL.

Relationship

A textual description of how this tissue sample and its source are connected.

This column may not be NULL. It also may not be empty: it must contain at least one non-whitespace character.
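As a quick illustration of this rule, the Python sketch below accepts a value only when it is present and contains at least one non-whitespace character. The function name has_visible_text is hypothetical, and Python's notion of whitespace is assumed here; the database's own definition may differ in edge cases.

```python
from typing import Optional


def has_visible_text(value: Optional[str]) -> bool:
    """True when value is not None and has at least one non-whitespace character."""
    return value is not None and value.strip() != ""


# A one-word Relationship such as "ALIQUOT" passes; blank-only text does not.
examples = {
    "ALIQUOT": has_visible_text("ALIQUOT"),
    "empty string": has_visible_text(""),
    "spaces and tabs": has_visible_text("  \t "),
    "NULL": has_visible_text(None),
}
```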

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

UNIQUE_INDIVS (All UNIQUE INDIVidualS)

This table contains one row for every individual under observation, and every individual from whom tissue or nucleic acid samples have been collected.

In contrast to BIOGRAPH, which records the identities of every individual in the main study population^[120], this table also records the identities of all the individuals in other populations from whom there are tissue or nucleic acid samples recorded in the inventory. All individuals in BIOGRAPH are also included in this table, whether or not tissue or nucleic acid samples exist in the inventory. This presents a problem: there are two tables that separately track the identities of all individuals in the main population. To address this, the triggers have been written to ensure that BIOGRAPH retains primary authority over all individuals in the main population.

Management of individuals in the main population is done by BIOGRAPH (see its discussion for more information), so the ability to perform inserts/updates/deletes in this table for those individuals is heavily constrained, as follows:

Inserting rows for individuals in the main population is only allowed for the unknown individual or for individuals in BIOGRAPH who have not yet been added to this table^[121].
The unknown individual's row can only be updated or deleted by an administrator.
Deleting rows for individuals in the main population is only allowed for individuals who are no longer in BIOGRAPH^[122].
Updating rows for individuals in the main population is only allowed when changing only the Notes column.
Any individual's PopId cannot be updated to add or remove the individual from the main population.
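The constraints listed above can be read as three small decision functions, one each for inserts, deletes, and updates. The Python sketch below is one possible reading, not the actual trigger code: the function names, the dictionary layout for rows, and the is_admin flag are all hypothetical. It assumes that the main population is PopId 1, that the unknown individual has IndivId UNKNOWN, that rows outside the main population are unaffected by these particular rules, and that the Notes-only restriction on updates also applies to an administrator editing the unknown individual's row.

```python
from typing import Dict, Set

MAIN_POPID = 1       # the population recorded in BIOGRAPH
UNKNOWN = "UNKNOWN"  # IndivId of the unknown individual


def may_insert(individ: str, popid: int,
               biograph_ids: Set[str], existing_main_ids: Set[str]) -> bool:
    """Manual insert: main-population rows need a BIOGRAPH match or UNKNOWN."""
    if popid != MAIN_POPID:
        return True  # these rules only constrain the main population
    if individ in existing_main_ids:
        return False  # already present in this table
    return individ == UNKNOWN or individ in biograph_ids


def may_delete(individ: str, popid: int,
               biograph_ids: Set[str], is_admin: bool) -> bool:
    """Manual delete: only for individuals who are no longer in BIOGRAPH."""
    if popid != MAIN_POPID:
        return True
    if individ == UNKNOWN:
        return is_admin  # the unknown individual's row is admin-only
    return individ not in biograph_ids


def may_update(old: Dict, new: Dict, is_admin: bool) -> bool:
    """Update: rows are dicts with 'IndivId', 'PopId', and 'Notes' keys."""
    # A PopId change may never move a row into or out of the main population.
    if (old["PopId"] == MAIN_POPID) != (new["PopId"] == MAIN_POPID):
        return False
    if old["PopId"] != MAIN_POPID:
        return True
    if old["IndivId"] == UNKNOWN and not is_admin:
        return False
    # Within the main population only Notes may differ.
    return old["IndivId"] == new["IndivId"]
```

Under this reading, editing the Notes of a main-population individual is permitted for any user, while renaming that individual or moving it to another population is refused.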

Tip

Do not manually insert or delete rows in this table for individuals in BIOGRAPH. Perform those actions in BIOGRAPH, and the action will automatically be performed in this table, as well. Manual inserts and deletes in this table should only be done for individuals who are not in BIOGRAPH.

The IndivId column is used to record the individual's name or similar ID. Study projects and research institutions each have their own rules of nomenclature for their individuals, so this might be a lengthy name, an abbreviation, a series of numbers, or some mix of these. This value is not unique; the same identifier may be used more than once across different populations. However, per PopId, each IndivId must be unique; a population cannot use the same identifier more than once.
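The per-population uniqueness rule corresponds to a uniqueness constraint over the pair of PopId and IndivId. The sketch below demonstrates the idea with Python's built-in sqlite3 module and an in-memory table. It is a simplified stand-in, not the Babase definition: the table is reduced to four columns, the add_individual helper is hypothetical, and the identifier "A12" and PopIds 2 and 3 are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE unique_indivs (
        UIId    INTEGER PRIMARY KEY AUTOINCREMENT,
        IndivId TEXT    NOT NULL,
        PopId   INTEGER NOT NULL,
        Notes   TEXT,
        UNIQUE (PopId, IndivId)  -- an identifier may repeat only across populations
    )
""")


def add_individual(individ: str, popid: int) -> bool:
    """Insert a row; return False if the uniqueness rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO unique_indivs (IndivId, PopId) VALUES (?, ?)",
                (individ, popid),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    add_individual("A12", 2),  # first use of "A12" in population 2
    add_individual("A12", 3),  # same identifier, different population
    add_individual("A12", 2),  # repeated within population 2
]
```

The first two inserts succeed and the third is rejected, leaving two rows in the table.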

Special Values

PopId 1 is the population recorded in BIOGRAPH, so any row with this PopId (with a few exceptions, discussed below) must use the individual's Bioid as its IndivId.

IndivId UNKNOWN indicates the unknown individual, and is allowed to have PopId 1 and not be a Bioid.

Column Descriptions

UIId (Unique Individual Identifier)

A unique identifier for the individual. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

IndivId (Individual Identifier)

The name/identifier for this individual.

This column may not be NULL. It also may not be empty: it must contain at least one non-whitespace character.

PopId (Population Identifier)

The POPULATIONS.PopId of the individual's population.

This column may not be NULL.

Notes

Comments or miscellaneous information about this individual.

This column may be NULL. When it is not NULL, it may not be empty: it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

^[115]Also "tissue" and "tissue sample", but those two terms aren't terribly different anyway.

^[116]This is expected to be the highest plausible accuracy to ever be used for the concentrations stored in this table. This can easily be expanded if needed.

^[117]Even in the coldest of cold storage, frozen samples will slowly evaporate over time. A 100-μL sample that is frozen and stored for 5 years is unlikely to still be the full 100 μL at the end of that time.

^[118]It is presumed that any reader who cares enough about nucleic acid samples to read this documentation is already familiar with the polymerase chain reaction. We will not attempt to explain it here.

^[119]Admittedly, this approach is imperfect and likely underestimates the true prevalence of the problem. The date written on a sample may not be the true date it was collected but may still be a date on which the individual was censused. Unfortunately, there is little else that the system can do to recognize when this occurs.

^[120]That is, the population whose data are recorded throughout the many tables in Babase.

^[121]Related rows in this table are automatically inserted when rows are inserted into BIOGRAPH, so manual insertion of these rows is effectively not allowed.

^[122]Similar to inserts, related rows in this table are automatically deleted when rows are deleted from BIOGRAPH, so manual deletion of these rows is effectively not allowed.
