Babase Chado Extensions

Babase has extended Chado. The table-related changes are documented in this section.

ANALYSIS (Analyses)

The ANALYSIS table contains one row per computational analysis.

The extension of the ANALYSIS table within Babase

The ANALYSIS table has been extended by Babase with the addition of the Type_Id column and a requirement that ANALYSIS.Name be unique. ANALYSIS is therefore documented in this section with the new tables Babase has added to Chado.
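The two changes described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than the production database, and the column types, the reduced column list, the stubbed CVTERM table, and the sample values (rnaseq_2021, example_pipeline) are all assumptions made for illustration. It is not the actual Babase DDL.

```python
import sqlite3

# Hypothetical, simplified stand-in for the extended ANALYSIS table.
# Column types and the cvterm stub are assumptions, not Babase DDL.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE analysis (
    analysis_id    INTEGER PRIMARY KEY,
    name           TEXT UNIQUE,       -- Babase requirement: names are unique
    program        TEXT NOT NULL,
    programversion TEXT NOT NULL,
    type_id        INTEGER REFERENCES cvterm (cvterm_id)  -- Babase addition
);
""")
conn.execute("INSERT INTO cvterm (cvterm_id, name) VALUES (1, 'gene expression')")
conn.execute(
    "INSERT INTO analysis (name, program, programversion, type_id) "
    "VALUES ('rnaseq_2021', 'example_pipeline', '1.0', 1)"
)

# A second analysis with the same name violates the uniqueness requirement.
try:
    conn.execute(
        "INSERT INTO analysis (name, program, programversion, type_id) "
        "VALUES ('rnaseq_2021', 'example_pipeline', '1.1', 1)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

In this sketch the second insert fails, so a name can be relied upon to identify exactly one analysis.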

Note

In Babase, computational analyses are the result of high-throughput genetic sequencing. There are two broad classes of results: SNV sites of individual baboons, and functional genomic analyses of individual baboons.[10] SNV analysis yields annotations of the genome, namely the locations of SNV sites. The SNV sites are in turn related to genetic samples and hence to individual baboons. Functional genomic analysis is also related to features on the genome but does not necessarily, in and of itself, result in new feature annotations. Both classes of analysis produce metrics per individual baboon per feature. Functional genomic analysis tends to produce a single metric per feature per baboon. SNV site analysis produces 3 metrics per feature per baboon, one each for homozygous reference, heterozygous, and homozygous alternate. SNV analysis also produces an additional metric per SNV site.

Computational analyses can be related to one another to capture the many steps and combinations of analytical results that may go into a final analysis. (See ANALYSIS_RELATIONSHIP.)

Analysis_Id

Unique numeric identifier of the ANALYSIS row. This column may not be NULL.

Name

A short name for the analysis. In Babase this must be unique.

Description

Textual description of the analysis.

Program

The program used to run the analysis. This column may not be NULL.

Note

This column is not validated. It may be better to use the ANALYSISPROP table, supplying the program version as the value, or even the ANALYSIS_CVTERM table, each version of each program having a separate controlled vocabulary term. Further, a Babase analysis may be the result of more than one program.

A decision is required as to how to use this column in Babase, especially as it may not be NULL.

Programversion

The version of the program used to run the analysis. This column may not be NULL. (But see the note regarding Program above.)

Algorithm

The algorithm used in the analysis. (But see the note regarding Program above.)

Sourcename

Where the data came from that was used in the analysis.

Note

Since Babase analyses tend to use samples collected from individual baboons, this column is probably uninteresting and is unlikely to be used.

Sourceversion

Version associated with Sourcename.

SourceURI

The URL associated with Sourcename.

Note

While this column is potentially useful, the ANALYSIS_DBXREF table provides a more powerful alternative.

Timeexecuted

The date and time the analysis was executed. This column may not be NULL.

Note

Because Babase analyses may be complex and require many steps that take place over time, this column is less useful. Nonetheless, it is probably a good idea to put a value in this column to record when the analysis was completed.

This column, being an actual timestamp datatype, is readily searchable, sortable, and so forth. Other dates related to the analysis may be optionally recorded in the ANALYSISPROP table.

Type_Id

The primary type of the analysis. Ancillary types may be recorded in ANALYSIS_CVTERM.

Note

In Babase the primary types would be things like gene expression or DNA methylation.

ANALYSIS_DBXREF (Analysis to External Database Object Cross-References)

ANALYSIS_DBXREF contains one row per external database object related to an analysis. Possible external database objects are inputs into or outputs of the analysis.

Note

In the case of Babase, ANALYSIS_DBXREF is used to relate the FASTQ files which went into the analysis to the analysis.

The combination of Analysis_Id and DBXref_Id must be unique.

ANALYSIS_DBXREF rows are automatically deleted when the related ANALYSIS row is deleted. ANALYSIS_DBXREF rows are automatically deleted when the related DBXREF row is deleted.

Analysis_DBXref_Id

A unique number identifying the row. This column may not be NULL.

Analysis_Id

Identifier of the ANALYSIS row with which the external database object is related. An ANALYSIS.Analysis_Id value. This column may not be NULL.

DBXref_Id (External Database Object Identifier)

Identifier of the external database object related to the ANALYSIS row. A DBXREF.DBXref_Id value. This column may not be NULL.

Is_Current

A Boolean. When TRUE, the relationship between the analysis and the external database object is a current, official relationship. When FALSE, the relationship is outdated. This column may not be NULL. The default value for this column is TRUE.
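The uniqueness rule, the automatic deletion, and the Is_Current default described for ANALYSIS_DBXREF can be sketched as follows. This is an illustrative approximation using Python's sqlite3 module. The column types, the stubbed ANALYSIS and DBXREF tables, and the sample accession values are assumptions, not the Babase schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces cascades only when enabled
conn.executescript("""
-- Stubbed parent tables; the real tables have more columns.
CREATE TABLE analysis (analysis_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE dbxref   (dbxref_id   INTEGER PRIMARY KEY, accession TEXT NOT NULL);
CREATE TABLE analysis_dbxref (
    analysis_dbxref_id INTEGER PRIMARY KEY,
    analysis_id INTEGER NOT NULL
        REFERENCES analysis (analysis_id) ON DELETE CASCADE,
    dbxref_id   INTEGER NOT NULL
        REFERENCES dbxref (dbxref_id) ON DELETE CASCADE,
    is_current  BOOLEAN NOT NULL DEFAULT 1,   -- TRUE by default
    UNIQUE (analysis_id, dbxref_id)
);
INSERT INTO analysis VALUES (1, 'rnaseq_2021');
-- Hypothetical accessions standing in for FASTQ file references.
INSERT INTO dbxref VALUES (10, 'sample_A.fastq'), (11, 'sample_B.fastq');
INSERT INTO analysis_dbxref (analysis_id, dbxref_id) VALUES (1, 10), (1, 11);
""")

# Rows inserted without an explicit is_current pick up the default.
is_current_values = [
    row[0] for row in conn.execute("SELECT is_current FROM analysis_dbxref")
]

# Deleting the analysis removes its cross-reference rows as well.
conn.execute("DELETE FROM analysis WHERE analysis_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM analysis_dbxref").fetchone()[0]
```

The DBXREF rows themselves survive the deletion; only the linking rows are removed.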

ANALYSIS_CVTERM (Analysis Typing/Tagging)

ANALYSIS_CVTERM contains one row per analysis per each ancillary type or tag used to classify the analysis. As usual in Chado, the CVTERM table is used to supply the vocabulary (or vocabularies) used in classification.

Note

In Babase ancillary types could be things like preliminary result or final result.

The combination of Analysis_Id and CVTerm_Id must be unique.

ANALYSIS_CVTERM rows are automatically deleted when the related ANALYSIS row is deleted. ANALYSIS_CVTERM rows are automatically deleted when the related CVTERM row is deleted.

Analysis_CVTerm_Id

A unique number identifying the row. This column may not be NULL.

Analysis_Id

The analysis assigned an ancillary type or tag. An ANALYSIS.Analysis_Id value. This column may not be NULL.

CVTerm_Id (Controlled Vocabulary Term Identifier)

The ancillary type or tag applied to the analysis. A CVTERM.CVTerm_Id value. This column may not be NULL.

Is_Not

A Boolean value which serves to negate the type or tag (to negate the controlled vocabulary term). This column may not be NULL.
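As a sketch of how Is_Not affects queries, the example below finds the analyses that carry a tag and skips those where the tag is negated. It uses Python's sqlite3 module with stubbed tables. The analysis names (run_a, run_b) and column types are assumptions, and the term "final result" is taken from the note above only as sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stubbed tables; types and reduced column lists are assumptions.
CREATE TABLE analysis (analysis_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE cvterm   (cvterm_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE analysis_cvterm (
    analysis_cvterm_id INTEGER PRIMARY KEY,
    analysis_id INTEGER NOT NULL REFERENCES analysis (analysis_id),
    cvterm_id   INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
    is_not      BOOLEAN NOT NULL,
    UNIQUE (analysis_id, cvterm_id)
);
INSERT INTO analysis VALUES (1, 'run_a'), (2, 'run_b');
INSERT INTO cvterm VALUES (1, 'final result');
-- run_a IS a final result; run_b is explicitly NOT a final result.
INSERT INTO analysis_cvterm (analysis_id, cvterm_id, is_not)
VALUES (1, 1, 0), (2, 1, 1);
""")

# Only rows whose tag is not negated count as carrying the tag.
final_names = [
    row[0]
    for row in conn.execute("""
        SELECT a.name
        FROM analysis a
        JOIN analysis_cvterm ac ON ac.analysis_id = a.analysis_id
        JOIN cvterm t           ON t.cvterm_id = ac.cvterm_id
        WHERE t.name = 'final result' AND ac.is_not = 0
        ORDER BY a.name
    """)
]
```

A query that omits the Is_Not condition would also return run_b, which is the opposite of what the row records.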

ANALYSIS_RELATIONSHIP (Analysis Inter-Relationships)

ANALYSIS_RELATIONSHIP contains one row per subject/verb/object or parent/child relationship between 2 analyses. This table allows organization of analyses into directed acyclic graphs to record things like which preliminary analysis went into a final analysis, or even which set of analytic results is a subset of some larger result. It is one of the many relationship tables in Chado.

The combination of Subject_Id, Object_Id, Type_Id, and Rank must be unique.

ANALYSIS_RELATIONSHIP rows are automatically deleted when either the related Subject_Id ANALYSIS row is deleted or when the related Object_Id ANALYSIS row is deleted. ANALYSIS_RELATIONSHIP rows are automatically deleted when the related CVTERM row is deleted.

Analysis_Relationship_Id

A unique number identifying the row. This column may not be NULL.

Subject_Id

The subject of the subject-predicate-object sentence. An ANALYSIS.Analysis_Id value. This column may not be NULL.

Object_Id

The object of the subject-predicate-object sentence. An ANALYSIS.Analysis_Id value. This column may not be NULL.

Type_Id

The relationship between the subject analysis and the object analysis. A CVTERM.CVTerm_Id value that has a non-0 Is_Relationshiptype value. This column may not be NULL.

Value (Additional Notes)

Additional textual notes on the relationship between the analyses.

Rank

A number which gives the value an ordinal position among the other ANALYSIS_RELATIONSHIP values, per Subject_Id/Object_Id pairing, related to the CVTERM row. This column may not be NULL.

The default Rank value is 0.

For more on Rank values see The Chado Rank Column.
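Because ANALYSIS_RELATIONSHIP rows form a graph, finding every analysis that contributed to a final analysis requires a recursive query. The following is a minimal sketch using Python's sqlite3 module. The analysis names, the numeric Type_Id value 100, and the direction convention (the subject is derived from the object) are assumptions chosen for illustration; the actual convention depends on the controlled vocabulary term in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE analysis (analysis_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE analysis_relationship (
    analysis_relationship_id INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES analysis (analysis_id),
    object_id  INTEGER NOT NULL REFERENCES analysis (analysis_id),
    type_id    INTEGER NOT NULL,      -- would reference CVTERM; stubbed here
    value      TEXT,
    rank       INTEGER NOT NULL DEFAULT 0,
    UNIQUE (subject_id, object_id, type_id, rank)
);
INSERT INTO analysis VALUES
    (1, 'alignment'), (2, 'variant_calls'), (3, 'final_snv_set');
-- Assumed convention: subject is derived from object (hypothetical type 100).
INSERT INTO analysis_relationship (subject_id, object_id, type_id)
VALUES (2, 1, 100), (3, 2, 100);
""")


def upstream_analyses(conn, analysis_id):
    """Return the names of all analyses the given analysis was derived from."""
    rows = conn.execute(
        """
        WITH RECURSIVE upstream(analysis_id) AS (
            SELECT object_id FROM analysis_relationship WHERE subject_id = ?
            UNION  -- UNION removes duplicates, so shared ancestors appear once
            SELECT r.object_id
            FROM analysis_relationship r
            JOIN upstream u ON r.subject_id = u.analysis_id
        )
        SELECT a.name
        FROM upstream u
        JOIN analysis a ON a.analysis_id = u.analysis_id
        ORDER BY a.analysis_id
        """,
        (analysis_id,),
    )
    return [row[0] for row in rows]


inputs_of_final = upstream_analyses(conn, 3)
```

Here inputs_of_final holds both the direct parent and the more distant ancestor of the final analysis.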

ANALYSIS_ND_EXPERIMENT (Analysis to Natural Diversity Experiment Result Set Relationships)

ANALYSIS_ND_EXPERIMENT relates analyses to natural diversity experiment result sets in, potentially, a many-to-many fashion. The table contains one row for every pairing of analysis to natural diversity experiment result set.

Caution

Care should be taken when using both this table and the Chado genetics module. Use of both can result in multiple links between the ND_EXPERIMENT and FEATURE tables, by way of different join paths. The Chado genetics module links the FEATURE and ND_EXPERIMENT tables by way of the GENOTYPE table. The Chado Companalysis module links FEATURE to ND_EXPERIMENT, using this table, by way of the ANALYSIS table. It is therefore possible to introduce inconsistent relationships into the database should the 2 different join paths end at different FEATURE rows when this is inappropriate.

Tip

Using ANALYSIS_ND_EXPERIMENT it is possible to construct many-to-many relationships between ANALYSIS rows and ND_EXPERIMENT rows, but care should be taken since it is unlikely that such many-to-many relationships are desirable. It is probably more appropriate to construct one-to-many relationships between ANALYSIS and ND_EXPERIMENT rows, or the reverse, to construct many-to-one relationships between ANALYSIS and ND_EXPERIMENT rows.

Note

When doing analyses in Babase, whether of SNV sites or functional genomic analysis, an ND_EXPERIMENT row represents the result set of an analysis of a genomic feature. Therefore the typical Babase use case has multiple ND_EXPERIMENT rows per ANALYSIS row.

Analysis_ND_Experiment_Id

A unique number identifying the pairing of ANALYSIS row to ND_EXPERIMENT row. This column may not be NULL.

Analysis_Id

The analysis related to the natural diversity experiment result set. An ANALYSIS.Analysis_Id value. This column may not be NULL.

ND_Experiment_Id

The natural diversity experiment result set related to the analysis. An ND_EXPERIMENT.ND_Experiment_Id value. This column may not be NULL.

Type_Id

The type of relationship between the analysis and the natural diversity experiment result set -- a CVTERM.CVTerm_Id value. This column may not be NULL.
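The Tip above cautions against unintended many-to-many relationships. One way to check for them, sketched here with Python's sqlite3 module, is to look for ND_EXPERIMENT rows that are linked to more than one analysis. The table is reduced to its key columns, and the identifier values are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced to key columns; foreign keys omitted for brevity (an assumption).
CREATE TABLE analysis_nd_experiment (
    analysis_nd_experiment_id INTEGER PRIMARY KEY,
    analysis_id      INTEGER NOT NULL,
    nd_experiment_id INTEGER NOT NULL,
    type_id          INTEGER NOT NULL
);
-- Result set 101 is linked to two analyses; 102 and 103 to one each.
INSERT INTO analysis_nd_experiment (analysis_id, nd_experiment_id, type_id)
VALUES (1, 101, 7), (1, 102, 7), (2, 103, 7), (2, 101, 7);
""")

# Result sets shared between analyses, with the number of analyses involved.
shared = conn.execute("""
    SELECT nd_experiment_id, COUNT(DISTINCT analysis_id) AS n_analyses
    FROM analysis_nd_experiment
    GROUP BY nd_experiment_id
    HAVING COUNT(DISTINCT analysis_id) > 1
    ORDER BY nd_experiment_id
""").fetchall()
```

An empty result means every result set belongs to a single analysis, which matches the one-to-many arrangement the Note describes as typical for Babase.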

ND_EXPERIMENT_FEATURE (Natural Diversity Experiment Result Set to Feature Relationships)

The ND_EXPERIMENT_FEATURE table contains one row per every relationship between a natural diversity result set and a feature -- that is, one row per pairing of ND_EXPERIMENT row with a FEATURE row.

ND_EXPERIMENT_FEATURE relates features to natural diversity experiment result sets in, potentially, a many-to-many fashion.

Tip

Using ND_EXPERIMENT_FEATURE it is possible to construct many-to-many relationships between FEATURE rows and ND_EXPERIMENT rows, but care should be taken since it is unlikely that such many-to-many relationships are desirable, at least not per analysis[11]. It is probably more appropriate to, per analysis, construct one-to-many relationships between FEATURE and ND_EXPERIMENT rows, or the reverse, to construct many-to-one relationships between FEATURE and ND_EXPERIMENT rows. In some cases it is appropriate to construct one-to-one relationships between FEATURE and ND_EXPERIMENT rows.

The combination of ND_Experiment_Id and Feature_Id must be unique.

ND_EXPERIMENT_FEATURE rows are automatically deleted whenever any related ND_EXPERIMENT row is deleted. ND_EXPERIMENT_FEATURE rows are automatically deleted whenever any related FEATURE row is deleted. ND_EXPERIMENT_FEATURE rows are automatically deleted whenever any related CVTERM row is deleted.

ND_Experiment_Feature_Id

A unique number identifying the pairing of FEATURE row to ND_EXPERIMENT row. This column may not be NULL.

Feature_Id

The feature, typically an analyzed site in some input material, related to the natural diversity experiment result set. A FEATURE.Feature_Id value. This column may not be NULL.

ND_Experiment_Id

The natural diversity experiment result set related to the feature. An ND_EXPERIMENT.ND_Experiment_Id value. This column may not be NULL.
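Taken together, ANALYSIS_ND_EXPERIMENT and ND_EXPERIMENT_FEATURE give a join path from an analysis to the features it produced results for. The sketch below, using Python's sqlite3 module, walks that path. All tables are stubs with assumed column types, and the names (snv_calls, snv_site_1, snv_site_2) and identifier values are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Stubbed tables; the real Chado tables have more columns.
CREATE TABLE analysis      (analysis_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE feature       (feature_id INTEGER PRIMARY KEY, uniquename TEXT NOT NULL);
CREATE TABLE nd_experiment (nd_experiment_id INTEGER PRIMARY KEY);
CREATE TABLE analysis_nd_experiment (
    analysis_nd_experiment_id INTEGER PRIMARY KEY,
    analysis_id      INTEGER NOT NULL REFERENCES analysis (analysis_id),
    nd_experiment_id INTEGER NOT NULL REFERENCES nd_experiment (nd_experiment_id),
    type_id          INTEGER NOT NULL
);
CREATE TABLE nd_experiment_feature (
    nd_experiment_feature_id INTEGER PRIMARY KEY,
    nd_experiment_id INTEGER NOT NULL
        REFERENCES nd_experiment (nd_experiment_id) ON DELETE CASCADE,
    feature_id       INTEGER NOT NULL
        REFERENCES feature (feature_id) ON DELETE CASCADE,
    UNIQUE (nd_experiment_id, feature_id)
);
INSERT INTO analysis VALUES (1, 'snv_calls');
INSERT INTO feature VALUES (1, 'snv_site_1'), (2, 'snv_site_2');
INSERT INTO nd_experiment VALUES (101), (102);
-- One result set per feature, both belonging to the same analysis.
INSERT INTO analysis_nd_experiment (analysis_id, nd_experiment_id, type_id)
VALUES (1, 101, 7), (1, 102, 7);
INSERT INTO nd_experiment_feature (nd_experiment_id, feature_id)
VALUES (101, 1), (102, 2);
""")

# Features, and their result sets, reachable from one analysis by name.
features_for_analysis = conn.execute("""
    SELECT f.uniquename, nef.nd_experiment_id
    FROM analysis a
    JOIN analysis_nd_experiment ane ON ane.analysis_id = a.analysis_id
    JOIN nd_experiment_feature nef  ON nef.nd_experiment_id = ane.nd_experiment_id
    JOIN feature f                  ON f.feature_id = nef.feature_id
    WHERE a.name = ?
    ORDER BY f.feature_id
""", ("snv_calls",)).fetchall()
```

In this sample data each feature has exactly one result set within the analysis, which corresponds to the one-to-one case mentioned in the Tip above.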



[10] This is not to say that the individual baboon is the unit of analysis. Rather, the analyses yield results which pertain to individual baboons.

[11] Where an analysis is an analysis of an experiment; in effect, where an analysis is an actual experiment.


Page generated: 2021-09-27T09:55:49-04:00.