Babase:

Technical Specifications for the Amboseli Baboon Project Data Management System

Jeanne Altmann, PhD.

Susan C. Alberts, PhD.

Jacob B. Gordon

Leah Gerber

ER Diagram conversion to Dia 

Leah Gerber

ER Diagram layout 

Karl O. Pinc

ER Diagram layout 

Anne Ndeti Hubbard

ER Diagram layout 

Anne Ndeti Hubbard

DocBook formatting 

Karl O. Pinc

DocBook formatting 

Document generated: 2024-08-22 14:16:10.

Copyright Notices

Copyright (C) 2005-2023 Karl O. Pinc, Jeanne Altmann, Susan Alberts, Leah Gerber, Jake Gordon, The Meme Factory, Inc.

Except as otherwise noted permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled GNU Free Documentation License.

Copyright (C) 1996-2011 The PostgreSQL Global Development Group

The appendix titled Database Transactions Explained is Copyright (C) 1996-2011 by the PostgreSQL Global Development Group, distributed under the terms of the license of the University of California below.

Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph and the following two paragraphs appear in all copies.

IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS-IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.

March 2, 2005


Acknowledgments

We gratefully acknowledge the support of the National Science Foundation for the supporting the collection of the majority of the data stored in the database; in the past decade in particular we acknowledge support from IBN 9985910, IBN 0322613, IBN 0322781, BCS 0323553, BCS 0323596, DEB 0846286, DEB 0846532 and DEB 0919200. We are also very grateful for support from the National Institute of Aging (R01AG034513-01 and P01AG031719) and the Princeton Center for the Demography of Aging (P30AG024361). We also thank the Chicago Zoological Society, the Max Planck Institute for Demographic Research, the L.S.B. Leakey Foundation and the National Geographic Society for support at various times over the years. In addition, we thank the National Institute of Aging (R03-AG045459-01) for supporting recent work extending the database to incorporate genetic and genomic data.

Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation, the National Institute of Aging, the Princeton Center for the Demography of Aging, the Chicago Zoological Society, the Max Planck Institute for Demographic Research, the L.S.B. Leakey Foundation, the National Geographic Society, or any other organization which has supplied support for this work.

Table of Contents

1. Introduction
This Document
Conventions Used In This Document
A Guide for the Reader
System Design
To Start Babase
Other Resources
2. Babase System Architecture
Databases
The babase Database
The babase_copy Database
The babase_test Database
Users, Groups and Database Permissions
The babase_readers group
The babase_editors group
Schemas
The babase schema
The babase_something_views schemas
The babase_history schema
The babase_pending schema
The sandbox schema
The devel schema
The per-user schemas
Table Overview
The Sys_Period Column
Entity-Relationship Diagrams
Views
Special Values
Indexes
The Babase Program Code
3. Baboon Data: Primary Source Material
Group Membership and Life Events
ALTERNATE_SNAMES (Alternate Short Names)
BEHAVE_GAPS (Gaps in Behavior Observations)
BIOGRAPH (Baboon Biographical Data)
CENSUS (Group Membership)
CONSORTDATES (First Consortship Dates)
DEMOG (Demography Notes)
DISPERSEDATES (Dispersal Dates)
GROUPS (Groups)
MATUREDATES (Sexual Maturity Dates)
RANKDATES (Adult Rank Attainment Dates)
Physical Traits
WP_AFFECTEDPARTS (Body PARTS AFFECTED by Wounds/Pathologies)
WP_DETAILS (Wound/Pathology DETAILS)
WP_HEALUPDATES (Wound/Pathology HEAL UPDATES)
WP_OBSERVERS (Wound/Pathology OBSERVERS)
WP_REPORTS (Wound/Pathology REPORTS)
Sexual Cycles
CYCGAPS (Gaps in Female Cycle Observations)
CYCLES (Female Sexual Cycles)
CYCPOINTS (Female Sexual Cycle Events)
PREGS (Pregnancies)
REPRO_NOTES (Textual NOTES about REPROduction)
SEXSKINS (Sexskin Turgesence Measurements)
Social and Multiparty Interactions
ALLMISCS (Ad-libitum sample data)
CONSORTS (multiparty disputes over CONSORTshipS)
FPOINTS (Point data on Females)
INTERACT_DATA (Interactions)
MPIS (Multiparty InteractionS)
MPI_DATA (Multiparty dyadic Interactions)
MPI_PARTS (Multiparty Interaction PARTicipantS)
PARTS (Participants in interactions)
POINT_DATA (Point observation data)
NEIGHBORS (point observation data on Neighbors)
SAMPLES (all-occurrences Samples)
Darting
ANESTHS (Extra Sedation Administered During Darting)
BODYTEMPS (Darting Body Temperature Measurements)
CHESTS (Darting Chest Circumference Measurements)
CROWNRUMPS (Darting Crown-to-Rump Measurements)
DART_SAMPLES (Darting Tissue Sample Records)
DARTINGS (Baboon Darting Events)
DPHYS (Darting Physiological Measurements)
HUMERUSES (Darting Humerus Length Measurements)
PCVS (Darting Blood Measurements)
TEETH (Darting Tooth Data)
TESTES_ARC (Darting Testes circumference Data)
TESTES_DIAM (Darting Testes Diameter Data)
TICKS (Darting Tick and Parasite Data)
ULNAS (Darting Ulna Length Measurements)
VAGINAL_PHS (Darting Vaginal pH Measurements)
Inventory
LOCATIONS
NUCACID_CONC_DATA (NUCleic ACID CONCentration DATA)
NUCACID_CREATORS (NUCleic ACID CREATORS)
NUCACID_DATA (General information about NUCleic ACID samples)
NUCACID_LOCAL_IDS (LOCAL IDentifierS for NUCleic ACID samples)
NUCACID_SOURCES
POPULATIONS
TISSUE_DATA (General information about TISSUE samples)
TISSUE_LOCAL_IDS (LOCAL IDentifierS for TISSUE samples)
TISSUE_SOURCES
UNIQUE_INDIVS (All UNIQUE INDIVidualS)
SWERB Data (Group-level Geolocation Data)
AERIALS (Aerial photos)
GPS_UNITS (Individual GPS Devices)
QUAD_DATA (map Quadrants)
SWERB_BES (Begin/Ends: Uninterrupted bouts of group-level observation)
SWERB_DATA (Group Level GPS Point Samples)
SWERB_DEPARTS_DATA (Observation team departures from camp)
SWERB_DEPARTS_GPS (SWERB GPS Departure data)
SWERB_GWS (SWERB Grove and Waterholes)
SWERB_GW_LOC_DATA (SWERB Grove/Waterhole Location Data)
SWERB_LOC_DATA (LOCation-Specific DATA from GPS Points)
SWERB_LOC_GPS (Secondary Data for LOCations in GPS points)
SWERB_OBSERVERS
TREES
Weather Data
RAINGAUGES (Rain Measurements)
RGSETUPS (Rain Gauge Setups)
TEMPMINS (Minimum Temperature Measurements)
TEMPMAXS (Maximum Temperature Measurements)
DIGITAL_WEATHER (Digitally Collected Weather Data)
WREADINGS (Weather Readings)
4. Baboon Data: Analyzed
Darting
FLOW_CYTOMETRY
WBC_COUNTS (White Blood Cell Counts)
Group Membership and Life Events
DAD_DATA (Paternity analysis results)
MEMBERS (Day-by-day Group Membership)
RANKS (Rankings Within Groups)
RESIDENCIES (Group Residency, in bouts)
Physical Traits
HORMONE_KITS (KITS used to assay HORMONE concentration)
HORMONE_PREP_DATA (Lab PREParations performed on hormone samples)
HORMONE_PREP_SERIES (SERIES of Lab PREParations performed on hormone samples)
HORMONE_RESULT_DATA (RESULTs of HORMONE concentration assays)
HORMONE_SAMPLE_DATA (Tissue SAMPLEs used for HORMONE analysis)
HYBRIDGENE_ANALYSES
HYBRIDGENE_SCORES
SWERB Data
SWERB_LOC_DATA_CONFIDENCES
Interpolation
Interpolation's 3 Fundamentals
Interpolation Visualized
The Interpolation Rules
Expectations and Implications
The Social Group Residency Rules
Which Group?
The 15/29 test
The Residency Algorithm
The Sexual Cycle Day-By-Day Tables
CYCGAPDAYS (Day-by-day Periods of No Observation)
CYCSTATS (Female Fertility Cycle States)
MDINTERVALS (Mdate to Ddate Intervals)
MMINTERVALS (Mdate to Mdate Intervals)
REPSTATS (Female Reproductive States)
Sexual Cycle Determination
Automatic Sequencing
Automatic Mdate Generation
5. Support Tables
General Support Tables
BODYPARTS
LAB_PERSONNEL
OBSERVERS (Data Collection Staff)
OBSERVER_ROLES
UNKSNAMES (problem in identifying focal's neighbor or a lone male)
Group Membership and Life Events
BSTATUSES (Birth Accuracy Indicators)
CONFIDENCES (death cause (nature and agent), dispersal, and matgrp Confidence levels)
DAD_SOFTWARE
DCAUSES (Causes of Death)
DEMOG_REFERENCES (Demography Note References)
DEATHNATURES (Natures of Death Causes)
ENTRYTYPES (Categories of Entry to Study Population)
GAP_END_STATUSES (Explanations for Behavior Gap Ends)
MSTATUSES (Maturity Marker Statuses)
DAD_DATA_COMPLETENESS (Completeness Scores in Paternity Analyses)
DAD_DATA_MISMATCHES (Types of Genetic Mismatches)
RNKTYPES (Ranking Categories)
STATUSES (Indicators of Record and Baboon Vividity)
Physical Traits
HORMONE_IDS
HORMONE_PREP_PROCEDURES
HYBRIDGENE_SOFTWARE
MARKERS
WP_HEALSTATUSES
WP_REPORTSTATES
WP_WOUNDPATHCODES
Social And Multiparty Interactions
ACTIVITIES
ACTS (Interaction Types)
DATA_STRUCTURES (Data structures produced by Psion devices)
CONTEXT_TYPES (multiparty Interaction Context Categories)
FOODCODES (Food item Codes)
FOODTYPES (Food Types)
KIDCONTACTS (spatial relationship between mother and infant)
MPIACTS (Multiparty Interaction Types)
NCODES (Neighbor classifications)
PARTUNKS (problem identifying a multiparty interaction participant)
POSTURES
PROGRAMIDS (Program used on the device)
SAMPLES_COLLECTION_SYSTEMS
SETUPIDS (Setup files used in a data collection program)
STYPES (Focal Sample Types)
STYPES_ACTIVITIES (Activity values that are used with each SType)
STYPES_NCODES (Ncodes that are used with each SType)
STYPES_POSTURES (Postures that are used with each SType)
SUCKLES (infant suckling activity)
Sexual Cycles and The Sexual Cycle Day-By-Day Tables
PCSCOLORS (ParaCallosal Skin Colors)
Darting
DART_SAMPLE_CATS (Darting Sample Categories)
DART_SAMPLE_TYPES (Sample Types)
DRUGS (darting anesthetics)
LYMPHSTATES (Lymph node conditions)
PARASITES (Parasites and their indicators)
TCONDITIONS (Tooth Conditions)
TICKSTATUSES (parasite count classifications)
TOOTHCODES (kinds of teeth)
TOOTHSITES (Locations of deciduous or adult teeth)
TSTATES (State of Tooth existence)
Inventory
INSTITUTIONS
MISID_STATUSES (MISIDentification STATUSES)
NUCACID_CONC_METHODS (NUCleic ACID CONCentration quantification METHODS)
NUCACID_CREATION_METHODS (NUCleic ACID CREATION METHODS)
NUCACID_TYPES (NUCleic ACID TYPES)
STORAGE_MEDIA
TISSUE_TYPES
SWERB Data
ADCODES (SWERB Ascent and Descent relationships)
PLACE_TYPES (codes for various landscape features)
PREDATORS (codes for observed predators)
SWERB_LOC_CONFIDENCES (SWERB Location Confidence Values)
SWERB_LOC_STATUSES (SWERB Location Statuses)
SWERB_TIME_SOURCES (SWERB Time Sources)
SWERB_XYSOURCES (SWERB Time Sources)
Weather Data
WEATHER_SOFTWARES (Programs used for digital weather data reporting)
WSTATIONS (Weather Stations)
6. The Babase Views
Group Membership and Life Events
CENSUS_DEMOG (CENSUS extended with DEMOG information)
CENSUS_DEMOG_SORTED (CENSUS_DEMOG, Sorted)
CYCPOINTS_CYCLES (CYCPOINTS extended with CYCLES information)
CYCPOINTS_CYCLES_SORTED (CYCPOINTS_CYCLES, Sorted)
DEMOG_CENSUS (DEMOG, showing CENSUS information)
DEMOG_CENSUS_SORTED (DEMOG_CENSUS, Sorted)
GROUPS_HISTORY
PARENTS
POTENTIAL_DADS (Potential Dads)
PROPORTIONAL_RANKS (RANKS extended with calculated PROPORTIONAL ranks)
Physical Traits
ESTROGENS
GLUCOCORTICOIDS
HORMONE_PREPS
HORMONE_RESULTS
HORMONE_SAMPLES
PROGESTERONES
TESTOSTERONES
THYROID_HORMONES
WOUNDSPATHOLOGIES (All Wound/Pathology Data, Together)
WP_DETAILS_AFFECTEDPARTS (WP_DETAILS, extended with WP_AFFECTEDPARTS)
WP_HEALS (WP_HEALUPDATES, extended)
WP_REPORTS_OBSERVERS (WP_REPORTS, extended with WP_OBSERVERS)
Sexual Cycles
CYCLES_SEXSKINS (CYCLES extended with SEXSKINS information)
CYCLES_SEXSKINS_SORTED (CYCLES_SEXSKINS, Sorted)
MATERNITIES (completed reproductive events)
MTD_CYCLES (CYCLES and Mdate, Tdate, and Ddate CYCPOINTS data)
SEXSKINS_CYCLES (CYCLES extended with SEXSKINS information)
SEXSKINS_CYCLES_SORTED (SEXSKINS_CYCLES, Sorted)
SEXSKINS_REPRO_NOTES (SEXSKINS extended with REPRO_NOTES)
Social and Multiparty Interactions
ACTOR_ACTEES (Complete social interactions, INTERACT_DATA extended twice with PARTS)
INTERACT (INTERACT_DATA, with enhanced dates and times)
INTERACT_SORTED
MPI_EVENTS (Dyadic social interactions that comprise multiparty interaction collections, MPIS joined with MPI_DATA extended twice with MPI_PARTS)
POINTS (POINT_DATA, with enhanced times)
POINTS_SORTED (POINTS, Sorted)
SAMPLES_GOFF (SAMPLES, with the Group OF the Focal)
Darting
ANESTH_STATS (darting additional Anesthetic Statistics)
BODYTEMP_STATS (darting Body Temperature Statistics)
CHEST_STATS (darting Chest circumference Statistics)
CROWNRUMP_STATS (darting Crown-to-Rump Statistics)
DSAMPLES (darting sample records with columns for each sample type)
DENT_CODES (darting Dentition records with columns for each Toothcode)
DENT_SITES (darting Dentition records with columns for each Toothsite)
HUMERUS_STATS (darting Humerus length Statistics)
PCV_STATS (darting PCV Statistics)
TESTES_ARC_STATS (darting Testes circumference Statistics)
TESTES_DIAM_STATS (darting Testes Diameter Statistics)
ULNA_STATS (darting Ulna length Statistics)
VAGINAL_PH_STATS (darting Vaginal pH Statistics)
Inventory
LOCATIONS_FREE (LOCATIONS available for storage)
NUCACID_CONCS (NUCACID_CONC_DATA, extended)
NUCACIDS (NUCACID_DATA, extended)
NUCACIDS_W_CONC (NUCleic ACIDS With CONCentration data)
TISSUES
TISSUES_HORMONES
SWERB Data (Group-level Geolocation Data)
QUADS (map Quadrants)
SWERB (Group level gps point samples)
SWERB_DATA_XY (The SWERB_DATA table with separate X and Y coordinates)
SWERB_DEPARTS (SWERB observation team Departures from camp)
SWERB_GW_LOCS (SWERB Grove and Waterhole Locations)
SWERB_GW_LOC_DATA_XY (The SWERB_GW_LOC_DATA table with separate X and Y coordinates)
SWERB_LOC_GPS_XY (The SWERB_LOC_GPS table with separate X and Y coordinates)
SWERB_LOCS (placement of a group at a landscape feature)
SWERB_UPLOAD (facility for uploading data into SWERB)
Weather Data
MIN_MAXS (Manually collected minimum and maximum temperature and rain data)
MIN_MAXS_SORTED (MIN_MAXS, Sorted)
Views Which Add Gid To Tables
The BIRTH_GRP View
The ENTRYDATE_GRP View
The STATDATE_GRP View
The CONSORTDATES_GRP View
The CYCGAPDAYS_GRP View
The CYCGAPS_GRP View
The CYCSTATS_GRP View
The DARTINGS_GRP View
The DISPERSEDATES_GRP View
The MATUREDATES_GRP View
The MDINTERVALS_GRP View
The MMINTERVALS_GRP View
The RANKDATES_GRP View
The REPSTATS_GRP View
7. Data Entry
Data Entry Overview
Automatically Generated IDs
8. Babase Programs
Data Maintenance Programs
SWERB_UPLOAD: View to upload into SWERB
Upcen: Update CENSUS table
Upmpi: Upload Multiparty Interactions
Updart: Upload Darting Data
Uptick: Load darting parasite data
Psionload: Load Psion point/sample data
Upload: Upload Into Any Table or View
Useful Programs and Functions
Functions
Logout: Logout From Babase Custom Programs
Wwwdiff: World Wide Web based Difference program
Overview of Data Analysis Procedures
A. Manipulating Date and Time Values
B. Querying-All-Occurrences-Interactions
C. Alteration Of Sexual Cycle Ids (Cid)
D. Babase Revision History
Changes in Babase 5.x
Babase 5.0
Babase 5.1
Babase 5.2
Babase 5.3
Babase 5.4
Babase 5.5
Changes in Babase 4.x
Babase 4.0
Babase 4.1
Babase 4.2
Babase 4.3
Babase 4.4
Babase 4.4.1
Babase 4.5
Babase 4.6
Changes in Babase 3.x
Babase 3.0
Non-numbered backward in-compatible changes
Changes after Babase 2.0
Changes to Babase between 1.0 and 2.0
GROUPS
BIOGRAPH
Interpolation and MEMBERS
The Sexual Cycle Information
JPSAMPS and FPSAMPS (and POINT_DATA and FPOINTS)
Time Representation
The All-Occurrences Focal Point Data
Support Tables
The Addition of Views
E. DocBook, Styling and other issues
F. Restrictions: Things Not To Do
G. Database Transactions Explained
H. The Warning Sub-System
Introduction to the Warning Sub-System
An Overview of the Warning Sub-System Data Structures
The Warning Sub-System Main Tables
INTEGRITY_QUERIES
INTEGRITY_WARNINGS (Warning Sub-System Results)
Warning Sub-System Support Tables
IQTYPES (Integrity Query Types)
WARNING_REMARKS (Remarks Regards Warning Results)
The Warning Sub-System Functions (Activating The Warning Sub-System)
run_integrity_queriesexecute one or more of the queries stored in the INTEGRITY_QUERIES table
I. Temporal Tables and babase_history
Introduction to Temporal Tables
How it Works
Temporal Tables in Action
Querying the history
Specific Examples

List of Figures

2.1. Key to the Babase Entity Relationship Diagrams
2.2. Babase Group Membership Entity Relationship Diagram
2.3. Babase Life Events Entity Relationship Diagram
2.4. Babase Sexual Cycle Entity Relationship Diagram
2.5. Babase Sexual Cycle Day-To-Day Tables Entity Relationship Diagram
2.6. Babase Social Interactions Entity Relationship Diagram
2.7. Babase Multiparty Interactions Entity Relationship Diagram
2.8. Babase Darting Logistics and Morphology Entity and Relationship Diagram
2.9. Babase Darting Physiology Entity and Relationship Diagram
2.10. Babase Darting Samples Entity and Relationship Diagram
2.11. Babase Darting Teeth and Ticks Entity and Relationship Diagram
2.12. Babase Inventory Entity Relationship Diagram
2.13. Babase Physical Traits Hormone Data Entity Relationship Diagram
2.14. Babase Physical Traits Genetic Hybrid Score Data Entity Relationship Diagram
2.15. Babase Physical Traits Wounds and Pathologies Data Entity Relationship Diagram
2.16. Babase SWERB Core Tables Entity Relationship Diagram
2.17. Babase SWERB Grove/Waterhole Location Tables Entity Relationship Diagram
2.18. Babase Manual Weather Data Entity Relationship Diagram
2.19. Babase Digital Weather Data Entity Relationship Diagram
4.1. An Individual is Censused Present and Absent
4.2. Interpolating From Presences and Absences
4.3. Interpolating Group Membership
4.4. Computing Interp Values
4.5. The 14 Day Interpolation Limit
4.6. Interpolation at Birth
4.7. Alive and Present When Last Censused
4.8. Alive and Absent in Last Census
4.9. Interpolation to Statdate When Dead
4.10. Midpoint Days
4.11. The Midpoint Rule Adjusts Intervals
4.12. An Individual is Censused in 2 Groups
4.13. Interpolating Each Group Separately
4.14. A Closer Look at Intervals
4.15. Presence and Absence Interpolated Separately
4.16. Combining Presence and Absence Intervals
4.17. Group Membership Given Multiple Groups
4.18. Pre-Analyzed Data Truncates Interpolation Intervals
4.19. Pre-Analyzed Data Interrupts Interpolation
4.20. An example 15/29 test
4.21. 29-day windows at the end of a residency period
6.1. Query Defining the CENSUS_DEMOG View
6.2. Entity Relationship Diagram of the CENSUS_DEMOG View
6.3. Query Defining the CENSUS_DEMOG_SORTED View
6.4. Entity Relationship Diagram of the CENSUS_DEMOG_SORTED View
6.5. Query Defining the CYCPOINTS_CYCLES View
6.6. Entity Relationship Diagram of the CYCPOINTS_CYCLES View
6.7. Query Defining the CYCPOINTS_CYCLES_SORTED View
6.8. Entity Relationship Diagram of the CYCPOINTS_CYCLES_SORTED View
6.9. Query Defining the DEMOG_CENSUS View
6.10. Entity Relationship Diagram of the DEMOG_CENSUS View
6.11. Query Defining the DEMOG_CENSUS_SORTED View
6.12. Entity Relationship Diagram of the DEMOG_CENSUS_SORTED View
6.13. Query Defining the GROUPS_HISTORY View
6.14. Entity Relationship Diagram of the GROUPS_HISTORY View
6.15. Query Defining the PARENTS View
6.16. Entity Relationship Diagram of the PARENTS View
6.17. Query Defining the POTENTIAL_DADS View
6.18. Entity Relationship Diagram of the foundation of the POTENTIAL_DADS View
6.19. Entity Relationship Diagram of that portion of the POTENTIAL_DADS View which places the mother and potential father in the same group during the fertile period
6.20. Entity Relationship Diagram of that portion of the POTENTIAL_DADS View having easily computed columns
6.21. Entity Relationship Diagram of that portion of the POTENTIAL_DADS View involving social interactions
6.22. Query Defining the PROPORTIONAL_RANKS View
6.23. Entity Relationship Diagram of the PROPORTIONAL_RANKS View
6.24. Query Defining the ESTROGENS View
6.25. Entity Relationship Diagram of the ESTROGENS View
6.26. Query Defining the GLUCOCORTICOIDS View
6.27. Entity Relationship Diagram of the GLUCOCORTICOIDS View
6.28. Query Defining the HORMONE_PREPS View
6.29. Entity Relationship Diagram of the HORMONE_PREPS View
6.30. Query Defining the HORMONE_RESULTS View
6.31. Entity Relationship Diagram of the HORMONE_RESULTS View
6.32. Query Defining the HORMONE_SAMPLES View
6.33. Entity Relationship Diagram of the HORMONE_SAMPLES View
6.34. Query Defining the PROGESTERONES View
6.35. Entity Relationship Diagram of the PROGESTERONES View
6.36. Query Defining the TESTOSTERONES View
6.37. Entity Relationship Diagram of the TESTOSTERONES View
6.38. Query Defining the THYROID_HORMONES View
6.39. Entity Relationship Diagram of the THYROID_HORMONES View
6.40. Query Defining the WOUNDSPATHOLOGIES View
6.41. Entity Relationship Diagram of the WOUNDSPATHOLOGIES View
6.42. Query Defining the WP_DETAILS_AFFECTEDPARTS View
6.43. Entity Relationship Diagram of the WP_DETAILS_AFFECTEDPARTS View
6.44. Query Defining the WP_HEALS View
6.45. Entity Relationship Diagram of the WP_HEALS View, Overall
6.46. Entity Relationship Diagram of the WP_HEALS View for rows with an update to a wound/pathology report
6.47. Entity Relationship Diagram of the WP_HEALS View for rows with an update to a wound/pathology cluster
6.48. Entity Relationship Diagram of the WP_HEALS View for rows with an update to an affected body part
6.49. Query Defining the WP_REPORTS_OBSERVERS View
6.50. Entity Relationship Diagram of the WP_REPORTS_OBSERVERS View
6.51. Query Defining the CYCLES_SEXSKINS View
6.52. Entity Relationship Diagram of the CYCLES_SEXSKINS View
6.53. Query Defining the CYCLES_SEXSKINS_SORTED View
6.54. Entity Relationship Diagram of the CYCLES_SEXSKINS_SORTED View
6.55. Query Defining the MATERNITIES View
6.56. Entity Relationship Diagram of the MATERNITIES View
6.57. Query Defining the MTD_CYCLES View
6.58. Entity Relationship Diagram of the MTD_CYCLES View
6.59. Query Defining the SEXSKINS_CYCLES View
6.60. Entity Relationship Diagram of the SEXSKINS_CYCLES View
6.61. Query Defining the SEXSKINS_CYCLES_SORTED View
6.62. Entity Relationship Diagram of the SEXSKINS_CYCLES_SORTED View
6.63. Query Defining the SEXSKINS_REPRO_NOTES View
6.64. Entity Relationship Diagram of the SEXSKINS_REPRO_NOTES View
6.65. Query Defining the ACTOR_ACTEES View
6.66. Entity Relationship Diagram of the ACTOR_ACTEES View
6.67. Query Defining the INTERACT View
6.68. Entity Relationship Diagram of the INTERACT View
6.69. Query Defining the INTERACT_SORTED View
6.70. Entity Relationship Diagram of the INTERACT_SORTED View
6.71. Query Defining the MPI_EVENTS View
6.72. Entity Relationship Diagram of the MPI_EVENTS View
6.73. Query Defining the POINTS View
6.74. Entity Relationship Diagram of the POINTS View
6.75. Query Defining the POINTS_SORTED View
6.76. Entity Relationship Diagram of the POINTS_SORTED View
6.77. Query Defining the SAMPLES_GOFF View
6.78. Entity Relationship Diagram of the SAMPLES_GOFF View
6.79. Query Defining the ANESTH_STATS View
6.80. Entity Relationship Diagram of the ANESTH_STATS View
6.81. Query Defining the BODYTEMP_STATS View
6.82. Entity Relationship Diagram of the BODYTEMP_STATS View
6.83. Query Defining the CHEST_STATS View
6.84. Entity Relationship Diagram of the CHEST_STATS View
6.85. Query Defining the CROWNRUMP_STATS View
6.86. Entity Relationship Diagram of the CROWNRUMP_STATS View
6.87. Query Defining the DSAMPLES View
6.88. Query Defining the DENT_CODES View
6.89. Entity Relationship Diagram of the DENT_CODES View
6.90. Query Defining the DENT_SITES View
6.91. Entity Relationship Diagram of the DENT_SITES View
6.92. Query Defining the HUMERUS_STATS View
6.93. Entity Relationship Diagram of the HUMERUS_STATS View
6.94. Query Defining the PCV_STATS View
6.95. Entity Relationship Diagram of the PCV_STATS View
6.96. Query Defining the TESTES_ARC_STATS View
6.97. Entity Relationship Diagram of the TESTES_ARC_STATS View
6.98. Query Defining the TESTES_DIAM_STATS View
6.99. Entity Relationship Diagram of the TESTES_DIAM_STATS View
6.100. Query Defining the ULNA_STATS View
6.101. Entity Relationship Diagram of the ULNA_STATS View
6.102. Query Defining the VAGINAL_PH_STATS View
6.103. Entity Relationship Diagram of the VAGINAL_PH_STATS View
6.104. Query Defining the LOCATIONS_FREE View
6.105. Entity Relationship Diagram of the LOCATIONS_FREE View
6.106. Query Defining the NUCACID_CONCS View
6.107. Entity Relationship Diagram of the NUCACID_CONCS View
6.108. Query Defining the NUCACIDS View
6.109. Entity Relationship Diagram of the NUCACIDS View
6.110. Query Defining the NUCACIDS_W_CONC View
6.111. Entity Relationship Diagram of the NUCACIDS_W_CONC View
6.112. Query Defining the TISSUES View
6.113. Entity Relationship Diagram of the TISSUES View
6.114. Query Defining the TISSUES_HORMONES View
6.115. Entity Relationship Diagram of the TISSUES_HORMONES View
6.116. Query Defining the QUADS View
6.117. Entity Relationship Diagram of the QUADS View
6.118. Query Defining the SWERB View
6.119. Entity Relationship Diagram of the SWERB View
6.120. Query Defining the SWERB_DATA_XY View
6.121. Entity Relationship Diagram of the SWERB_DATA_XY View
6.122. Query Defining the SWERB_DEPARTS View
6.123. Entity Relationship Diagram of the SWERB_DEPARTS View
6.124. Query Defining the SWERB_GW_LOCS View
6.125. Entity Relationship Diagram of the SWERB_GW_LOCS View
6.126. Query Defining the SWERB_GW_LOC_DATA_XY View
6.127. Entity Relationship Diagram of the SWERB_GW_LOC_DATA_XY View
6.128. Query Defining the SWERB_LOC_GPS_XY View
6.129. Entity Relationship Diagram of the SWERB_LOC_GPS_XY View
6.130. Query Defining the SWERB_LOCS View
6.131. Entity Relationship Diagram of the SWERB_LOCS View
6.132. Query Defining the SWERB_UPLOAD View
6.133. Entity Relationship Diagram of the SWERB_UPLOAD View
6.134. Query Defining the MIN_MAXS View
6.135. Entity Relationship Diagram of the MIN_MAXS View
6.136. Query Defining the MIN_MAXS_SORTED View
6.137. Entity Relationship Diagram of the MIN_MAXS_SORTED View
6.138. Query Defining the BIRTH_GRP View
6.139. Entity Relationship Diagram of the BIRTH_GRP View
6.140. Query Defining the ENTRYDATE_GRP View
6.141. Entity Relationship Diagram of the ENTRYDATE_GRP View
6.142. Query Defining the STATDATE_GRP View
6.143. Entity Relationship Diagram of the STATDATE_GRP View
6.144. Query Defining the CONSORTDATES_GRP View
6.145. Entity Relationship Diagram of the CONSORTDATES_GRP View
6.146. Query Defining the CYCGAPDAYS_GRP View
6.147. Entity Relationship Diagram of the CYCGAPDAYS_GRP View
6.148. Query Defining the CYCGAPS_GRP View
6.149. Entity Relationship Diagram of the CYCGAPS_GRP View
6.150. Query Defining the CYCSTATS_GRP View
6.151. Entity Relationship Diagram of the CYCSTATS_GRP View
6.152. Query Defining the DARTINGS_GRP View
6.153. Entity Relationship Diagram of the DARTINGS_GRP View
6.154. Query Defining the DISPERSEDATES_GRP View
6.155. Entity Relationship Diagram of the DISPERSEDATES_GRP View
6.156. Query Defining the MATUREDATES_GRP View
6.157. Entity Relationship Diagram of the MATUREDATES_GRP View
6.158. Query Defining the MDINTERVALS_GRP View
6.159. Entity Relationship Diagram of the MDINTERVALS_GRP View
6.160. Query Defining the MMINTERVALS_GRP View
6.161. Entity Relationship Diagram of the MMINTERVALS_GRP View
6.162. Query Defining the RANKDATES_GRP View
6.163. Entity Relationship Diagram of the RANKDATES_GRP View
6.164. Query Defining the REPSTATS_GRP View
6.165. Entity Relationship Diagram of the REPSTATS_GRP View
H.1. Warning Sub-System Entity Relationship Diagram

List of Tables

2.2. The Main Babase Tables
2.3. The Babase Support Tables
2.4. The Babase Views
2.5. The table_GRP Views
6.1. Columns in the CENSUS_DEMOG View
6.2. Columns in the CENSUS_DEMOG_SORTED View
6.3. Columns in the CYCPOINTS_CYCLES View
6.4. Columns in the CYCPOINTS_CYCLES_SORTED View
6.5. Columns in the DEMOG_CENSUS View
6.6. Columns in the DEMOG_CENSUS_SORTED View
6.7. Columns in the GROUPS_HISTORY View
6.8. Columns in the PARENTS View
6.9. Columns in the POTENTIAL_DADS View
6.10. Columns in the PROPORTIONAL_RANKS View
6.11. Columns in the ESTROGENS View
6.12. Columns in the GLUCOCORTICOIDS View
6.13. Columns in the HORMONE_PREPS View
6.14. Columns in the HORMONE_RESULTS View
6.15. Columns in the HORMONE_SAMPLES View
6.16. Columns in the PROGESTERONES View
6.17. Columns in the TESTOSTERONES View
6.18. Columns in the THYROID_HORMONES View
6.19. Columns in the WOUNDSPATHOLOGIES View
6.20. Columns in the WP_DETAILS_AFFECTEDPARTS View
6.21. Columns in the WP_HEALS View
6.22. Columns in the WP_REPORTS_OBSERVERS View
6.23. Columns in the CYCLES_SEXSKINS View
6.24. Columns in the CYCLES_SEXSKINS_SORTED View
6.25. Columns in the MATERNITIES View
6.26. Columns in the MTD_CYCLES View
6.27. Columns in the SEXSKINS_CYCLES View
6.28. Columns in the SEXSKINS_CYCLES_SORTED View
6.29. Columns in the SEXSKINS_REPRO_NOTES View
6.30. Columns in the ACTOR_ACTEES View
6.31. Columns in the INTERACT View
6.32. Columns in the INTERACT_SORTED View
6.33. Columns in the MPI_EVENTS View
6.34. Columns in the POINTS View
6.35. Columns in the POINTS_SORTED View
6.36. Columns in the SAMPLES_GOFF View
6.37. Columns in the ANESTH_STATS View
6.38. Columns in the BODYTEMP_STATS View
6.39. Columns in the CHEST_STATS View
6.40. Columns in the CROWNRUMP_STATS View
6.41. Columns in the DSAMPLES View
6.42. Columns in the DENT_CODES View
6.43. Columns in the DENT_SITES View
6.44. Columns in the HUMERUS_STATS View
6.45. Columns in the PCV_STATS View
6.46. Columns in the TESTES_ARC_STATS View
6.47. Columns in the TESTES_DIAM_STATS View
6.48. Columns in the ULNA_STATS View
6.49. Columns in the VAGINAL_PH_STATS View
6.50. Columns in the LOCATIONS_FREE View
6.51. Columns in the NUCACID_CONCS View
6.52. Columns in the NUCACIDS View
6.53. Columns in the NUCACIDS_W_CONC View
6.54. Columns in the TISSUES View
6.55. Columns in the TISSUES_HORMONES View
6.56. Columns in the QUADS View
6.57. Columns in the SWERB View
6.58. Columns in the SWERB_DATA_XY View
6.59. Columns in the SWERB_DEPARTS View
6.60. Columns in the SWERB_GW_LOCS View
6.61. Columns in the SWERB_GW_LOC_DATA_XY View
6.62. Columns in the SWERB_LOC_GPS_XY View
6.63. Columns in the SWERB_LOCS View
6.64. Columns in the SWERB_UPLOAD View
6.65. Columns in the MIN_MAXS View
6.66. Columns in the MIN_MAXS_SORTED View
8.1. The Babase SQL Functions
8.2. Data Analysis Procedures
H.1. The Warning Sub-System Tables
H.2. The Warning Sub-System Support Tables

List of Examples

1.1. A note
1.2. A caution
1.3. A warning
1.4. Text denoted important
1.5. A tip
2.1. Creating table foo in the sandbox schema
2.2. Granting permission to table foo in the sandbox schema
2.3. Creating table foo in user mylogin's schema
3.1. Crossovers during a fusion
3.2. (Mis)Use of Significant Figures in NUCACID_CONC_DATA
4.1. An Agonism Matrix
4.2. Kits with no Correction, and NULL Correction
4.3. A familiar "series" of events
4.4. Determining the GrpOfResidency when absent
4.5. Resident in a nonexistent group
A.1. Using the Postgresql date_trunc() function to set seconds to zero
A.2. Using the Babase date_mod() function to return the minutes and seconds.
A.3. Using the Postgresql to_char() function to convert times to HH:MM text
B.1. Finding all the all-occurrences interactions
C.1. Splitting a sexual cycle in two
H.1. Inserting a query into INTEGRITY_QUERIES using dollar quoting
H.2. Executing all INTEGRITY_QUERIES
H.3. Executing a single INTEGRITY_QUERIES.Query
H.4. Executing INTEGRITY_QUERIES of the bdate type
I.1. Querying "as of" a date
I.2. Querying a table's history "as of" a date

Chapter 1. Introduction

This Document

This document describes the Babase baboon data management system. This includes a description of the tables, the intended use of all related programs and directories, the design of the system, and procedures for maintaining the data management system itself. This document does not include the procedures actually used to enter data into the system, or the details of how to operate the systems programs. Nor does it include any instructions on the operation or administration of the computer itself. Further information on the topics not covered in this document can be found in the Protocol for Data Management: Amboseli Baboon Project document.

The Protocol for Data Management: Amboseli Baboon Project document is an important adjunct to the Babase system, but it is not considered part of the system itself because it describes the use of the system but not the capabilities of the system. It is important to maintain the distinction between use and capabilities so that when an enhancement is needed, it is clear whether the desired result can be obtained by altering the way the system is used, or whether the system itself needs to be modified. It is also important to provide different types of documentation to those who operate the system from those who manage and maintain the system because each of these two groups do not need to know all the details of the others' work.

Any deviation from the standards described in this document should be discussed with the project directors and may God have mercy on your souls.

Conventions Used In This Document

This document follows a number of conventions, most of them typographic but some of them stylistic. Some output formats, particularly plain text, have limited typographic capabilities so the various forms of typographic markup are not always distinguishable, either from each other or from the surrounding text.

Each table in Babase is documented in a section of its own, beginning with a description of the table as a whole and continuing with sub-sections for each column in the table. Of particular importance is the sentence that describes what a row in the given table represents. These are summarized in the textual tables given in the Table Overview section.

Interrelationships between the columns of a table, or between tables, is documented at the beginning of the table's section, not in the sub-sections documenting the columns themselves. Although relationships between 2 tables concern both of the tables the description of each such relationship appears only once in this document, in the overall description of one of the of the two tables concerned. On occasion there may be be brief mention elsewhere.

All TABLE NAMES are written in UPPER CASE. Column Names are in lower case with Initial Capitals. SOMETABLE.Somecolumn is shorthand for the Somecolumn column of the SOMETABLE table. The use of a period to separate the table from the column name is the convention used by SQL to eliminate ambiguity regarding which table a column belongs to. When a column name includes an acronym the acronym is capitalized, as is the first letter of the next word when the acronym begins the column name. For example, PCSColor.

Actual database values are typographically distinguished from the surrounding text, as in the following sentence: The Sname (short name) of the baboon Pebbles is PEB.

When this document defines a word, uses it for the first time, or otherwise wishes to refer to a word or phrase as a thing in and of itself, the word or phrase is typographically distinguished as follows: The word census has several meanings within this document.

Text that has special meaning to computer systems is typographically distinguished as follows: The SQL SELECT statement is the standard method for retrieving data from relational databases.

Emphasized text is typographically distinguished as follows: Always backup your data.

When the words must or cannot or the phrases must not or may not are used, the system will not allow a contrary condition. For example: "Sname must be a unique data value" or "A user with read-only permissions may not change data values." Babase will immediately raise an error when a dis-allowed change is attempted and the change will not take effect.

When the words should or ought are used the system does not enforce the condition. It may or may not report a violation of the condition. An example: "The sexual cycle event referred to in the pregnancy table's Conceive column should date the conception that began the pregnancy." In this case the system has no way of knowing when the pregnancy began and so no way of validating the date.

When the phrase the system will report is used there is some mechanism for reporting a an unusual but not dis-allowed condition. Unlike prohibited conditions, unusual conditions are not generally reported at the time the condition is created.[1]

The documentation is written with a tendency to emphasize Special Values. So, for example, not alive is often written instead of dead because Babase has a special value that means alive but the system is not aware of a particular code that means dead. The result is an occasional double negative.

Significant but often slightly off-topic paragraphs are set off from the surrounding material as a note, shown in Example 1.1.

Example 1.1. A note

Note

Written material has no voice that can be raised, but attention can be drawn with typographical conventions.


When the reader should take care, particularly when the system might do something unexpected in a given circumstance, this is noted in a caution. Example 1.2 shows how a caution is set off from the surrounding text.

Example 1.2. A caution

Babase will reject your change if you try to do something that is not allowed, like giving a male an onset of turgesence date.

Caution

When the rejected change is one of a number of changes bundled into a transaction none of the changes will make it into the database.


When a mis-use of the system will lead to incorrect results, particularly when such results are not obvious, this document contains a warning. Example 1.3 show how warnings are set off from the surrounding text.

Example 1.3. A warning

Warning

Babase cannot detect when an Sname is mis-typed, so it is possible to inadvertently assign a female's sexual cycle to the wrong female.


To otherwise draw the readers attention to material some text is marked important. Example 1.4 show how important text is set off from the surrounding material.

Example 1.4. Text denoted important

Babase has a number of components, many of them, like the SQL web interface, are third party tools, not written by the Babase developers.

Important

When the third party tools are upgraded their look may change but the features they provide should remain. As Babase is composed of Free Software the Babase project always has the option of customizing any of its third party tools and can contribute its improvements back to the program's developers for inclusion into future releases.


Suggestions as to how to use Babase are noted in tips, as are remarks on how data are presently entered in Babase or recorded in the field. Example 1.5 show how a tip is set off from the regular document text.

Example 1.5. A tip

Tip

Lick all the chocolate off your fingers before beginning data entry.


Often, the tips are the result of best practice developed from considered experience and so document how Babase is used at the time of this writing. However, as best practice continues to develop and field protocols change, the Protocol for Data Management: Amboseli Baboon Project and the Amboseli Baboon Research Project Monitoring Guide should always be consulted. Those documents have precedence over the tips presented herein should there be conflicting advice.

Supplemental and cross-referential material is presented in footnotes.

A Guide for the Reader

Anyone who is changing or adding programs to the system should read this entire document. Chapter 3: “Baboon Data: Primary Source Material is particularly important for all those using the system. Chapter 2: “Babase System Architecture provides the introduction to Babase. It explains fundamental concepts without which Babase cannot be understood, although some portions can be skipped; the sections “The Babase Program Code” and “Indexes” are primarily of interest to programmers and the section “Special Values” is for the data maintainers. Everyone will want to pay special attention to the “Entity-Relationship Diagrams” section. These diagrams can also be found in PDF form in The Babase Pocket Reference, where they may be easier on the eye. The section “Data Maintenance Programs” of chapter 8: “Babase Programs is of little interest to those who only want to retrieve information from the system. Portions of the “Useful Programs and Functions” section of the same chapter is of interest to the more sophisticated user. Note that some functions may be hidden in Next links, depending on the format chosen when reading this document. Data maintainers should be sure to understand chapter 5: “Support Tables. Those who are only retrieving data from Babase need not read chapter 7: “Data Entry.

System Design

The Babase system is designed to facilitate the retrieval, storage, and maintenance of the Amboseli Baboon Project data. Data integrity is foremost. Analytical power, ease of use, and low cost are secondary goals. The system consists of tables to store and organize the data, software supporting data validation and derivative data generation, stand-alone programs used to facilitate the entry and maintenance of the data, a minimal tool set supporting the maintenance of the Babase system software itself, and documentation. data are retrieved from Babase using the SQL language, the standard[2] language used to query relational databases. SQL is declarational as opposed to procedural; from a single SQL query (a single statement) the database determines how to best retrieve the data requested, no matter the number of tables or criteria required. SQL provides a single, powerful, interface for ad-hoc data retrieval and manipulation. Generic software provides the bulk of the user interface[3], traditionally the most complex and costly software component.[4] Consequently there are few stand-alone programs written specifically for Babase. The overall philosophy of the systems implementation is to keep the software as easy to maintain as possible while assuring data integrity. To this end, the system is comprised of as many generic components as possible and the design requires custom programming for only the most crucial features.

Babase puts as much intelligence as possible into the database itself, including automatic data validation and complex automatic analysis and storage of the derived data.[5] Babase extends its sometimes complex and rather abstract database structures with alternative, more familiar and user-friendly, means of accessing the underlying data[6]. These constructs are, in so far as is possible, made indistinguishable from the underlying data when querying and updating the database. Babase often generates derivative data for more ready analysis. This is, for the most part, transparent to the user. The end-user is insulated from implementation details, the number of interfaces (primarily SQL) the user must learn is minimized, and the user is free to work with the data structures that embody the conceptual model best suited to the task at hand.[7]

Data input is an example of how Babase incorporates generic programs. The prototypical way to import data into Babase is in bulk, via a plain text file having columns delimited by the tab character. These are easily produced by almost any spreadsheet program; it is expected that most data imported into Babase will be typed into a spreadsheet and then exported to tab-delimited text for upload.[8] The use of generic interfaces reduces cost, and minimizing the number of novel interfaces frees the end-user to concentrate on the task at hand.

Babase is designed to be accessed over the Internet, primarily via the web. Although there are exceptions[9] the majority of Babase is accessed via a W3C compliant web browser. Individually assigned usernames and passwords are used, along with encryption, to secure the database content. The Babase Wiki provides content for an the structure of the project's web site. Another example of Babase leveraging a generic program, the wiki allows project members collaborate, share information, and build the project's web site without programmer intervention.

Babase is built upon standards[10] and popular, widely deployed, Open Source and Free Software. This means, among other things, that the tools used to build and run Babase are very likely available to anyone free of cost, and that the skill-sets required for the system maintenance of and, to some extent, use of Babase are readily learned[11] and unlikely to become obsolete[12].[13] The Babase source code itself is Free Software[14] and may be downloaded by the public.[15]

Note

The database design attempts 5th normal form, no redundant data, no empty data elements allowed, etc. What we've actually wound up with is about 3rd normal form.

To Start Babase

The Babase system is accessed over the web. Any web browser may be used to view the data using the phpPgAdmin generic database interface. More advanced usage of the website will likely require a web browser that conforms to the international standards for the web defined by the World Wide Web Consortium , otherwise known as the W3C ,as we have put forth no particular effort to accommodate non-standards conforming browsers. The browser must support CSS2 style sheets and XHTML 1.0. Note that at the time of this writing Microsoft Internet Explorer does not provide adequate style sheet support. Other browsers that do have such support include Mozilla , Mozilla Firefox ,Apple's Safari ,and Opera. The W3.org site maintains a list of browsers supporting style sheets.

Babase's URL (web address) is https://papio.biology.duke.edu/ . Be sure to type the s in https . This secures your web connection.

You must access most of the Babase web site using a secure communications protocol ( HTTPS ) that encrypts all communication to foil eavesdroppers and checks the identity of the web site itself. The Babase project has signed its own security certificate, the certificate that ensures you are talking with the website you think you are.[16] Our certificate expires annually and is re-generated.

Your browser probably will not trust that our website is who it says it is and so will very likely object when you first access the Babase web site, and annually thereafter. You may tell your browser to accept our certificate permanently.

Other Resources

Resources related to Babase include:

Babase users are encouraged to ask questions, both on the Babase mailing list and on the mailing lists setup for questions on the software that Babase is made of.



[1] Immediate reporting of some unusual conditions could be added to Babase at a later date.

[2] More or less. The last actual SQL standard was issued a very long time ago. None the less SQL is pervasive and, although specific SQL statements may not always be, the skill set involved in SQL use is quite portable.

[3] There are many PostgreSQL user interfaces available, although at the time of this writing only 2, phpPgAdmin and psql, are installed on the Babase database server. Many of these front-ends must be installed on the local workstation. These may require that the Babase VPN be running before initiating a connection to the database. Some of available front-ends may be found via the PostgreSQL FAQ question regarding graphical user interfaces for PostgreSQL.

[4] It's those pesky unpredictable users. Computer software would be a lot easier to write if it weren't for users always messing things up and then insisting on knowing what happened.

[5] A process which, admittedly, sometimes conflicts with the notion of easily maintaining the software. On the other hand when done right this approach does wonders for data integrity.

[7] These features also free the user from software interface lock-in. The database may be accessed and maintained with the software of choice. Data integrity, in both raw and derived data, is assured. Significantly, these features are those that allow Babase to leverage generic programs, using them for the bulk of its user interface as opposed to building a custom, Babase specific, interface.

[8] Of course, because Babase has no designated front-end and so much data validation takes place inside the database itself, any program able to talk with PostgreSQL, the database engine Babase uses, can be used to import data into the database. So there are no real limits on how data must be structured for import into Babase.

[9] There are 2 Unix shell programs that provide peripheral utility; both do tasks that can be done with other tools but are handy to have automated. The use of these programs are documented on the Babase Wiki. Comprehensive documentation of these programs should probably be added to this document.

The Unix Shell Programs

babase-copy-babase-schema

Copies the entire content of the babase schema from one database to another.

babase-user-add

Adds a postgresql user, granting the permission to use Babase

There is also the ranker program, which runs on the local workstation and uses the Internet to communicate with the database. Developed separately from the rest of Babase, neither the source code management of nor the documentation for the ranker program is particularly well integrated into Babase.

[10] Actual standards, not de facto ones.

[11] Because open standards and the documentation for Open Source and Free Software programs are available, without cost; and because the inherently transparent and public nature of open standards, Open Source and Free Software leads not only to a wealth of good instructional material freely available on the Internet but also rounds out the basic requirements of a complete learning environment by ensuring that the software itself is available to everyone.

[12] Because once software is released and distributed under a Free or Open Source license it cannot be locked away and made unavailable, and because open standards are rarely changed in a backwards-incompatible way.

[13] Consequently the skills are rather widely available. The difficult part, as always, is finding the all of the relevant skills at once. For more on this see The Babase Program Code section.

[14] Presently licensed under the GPL Version 3 or later.

[15] Babase database content is not available to the public.

[16] We do this rather than paying one of the regular certification authorities to validate our identity. These certification authorities appear to validate the identity of their customers by virtue of little more than having successfully been paid.

Chapter 2. Babase System Architecture

Databases

Databases are collections of information, all of which can be queried and otherwise manipulated alone or in aggregation with all other database content.[17] Babase contains three databases.

The babase Database

The babase database contains the real information. All research takes place in this database.

The babase_copy Database

The babase_copy database contains a copy of the babase database. It is a place to try out dangerous things that might break the babase database.

The babase_test Database

The babase_test database contains a few bits of made up information. It is a place to try out random things and a place where the babase developers can work on alterations and enhancements.

Users, Groups and Database Permissions

Each user is given a login and a password they must use to gain access to the database. It is good form to change your password occasionally.[18]

The database can grant specific users various levels of access to specific tables, although such access is not common as it is difficult to administer and maintain such a fine grained degree of control. For further information see the PostgreSQL documentation on Database Users and Privileges.

Rather than maintain database access privileges on a per-user basis it is more convenient to place users in groups and then grant these groups different levels of database access.

Babase contains the following groups:

The babase_readers group

The members of this group have read access to Babase data and cannot add, delete, or otherwise alter any of the data.

The babase_editors group

The members of this group have unlimited rights to the Babase data. They may add data, delete data, or alter existing data. They may not, however, alter the structure of the babase database or change the rules to which the data are required to conform. Thus, they may not add or delete tables, alter triggers, or write or replace stored procedures.

Schemas

Schemas partition databases. Tables, procedures, triggers, and so forth are all kept in schemas. Schemas are like sub-databases within a database. The salient difference between schemas and databases is that a single SQL statement can refer to objects in the different schemas of the parent database, but cannot refer to objects in other databases -- tables within a database can be related, but tables in different databases cannot. Babase uses schemas to partition each database into areas where users have a greater or lesser degree of freedom to make changes. For further information on schemas see the schema documentation for PostgreSQL.

Each database is divided into the same schemas. That is, each schema described below exists within each of the databases described here.

The system looks at the different schemas for objects, for example table names appearing in SQL queries, in the order in which the schemas are listed below. If the table does not appear in the first schema it looks in the second, and so forth. As soon as a table is found with the name given, that table is used and the search stops.

To explicitly reference an object in a specific schema, place the name of the schema in front of the object, separating the two with a period (e.g. schemaname.tablename).

The babase schema

The babase schema holds the official Babase tables. Everything in the babase schema is documented and supported.

In this schema the babase_readers and babase_editors have the access described above.

The babase_something_views schemas

Babase contains a number of schemas that exist to simplify things for those interested only in particular portions of Babase. These schemas contain nothing but views that reference other parts of Babase, the parts that are especially relevant and useful to those interested only in one of the broad categories of Babase data. These schemas and their corresponding categories are:

The categories of Babase data and their schemas
SchemaCategory
babase_cycles_viewsSexual Cycles
babase_darting_viewsDarting
babase_demog_viewsGroup Membership and Life Events
babase_physical_traits_viewsPhysical Traits
babase_social_viewsSocial and Multiparty Interactions
babase_support_viewsSupport Tables
babase_swerb_viewsSWERB Data (Group-level Geolocation Data)
babase_weather_viewsWeather Data
babase_group_viewsViews Which Add Gid To Tables

These schemas provide an overview of the major areas of Babase. They should be especially useful to those starting out with Babase or those interested only in particular portions of Babase data.

The views in these schemas may only be queried. Any updating of Babase data must be done in the babase schema.

Note

Some of Babase's tables and views appear in more than one of these schemas, some in none.

Warning

Do not create any views that reference the views in these schemas. Reference the babase schema instead. Any views created that reference anything in these category schemas will be destroyed on occasion as Babase is modified.

The babase_history schema

The babase_history schema contains a table for each temporal table in the babase schema. The tables in this schema store the "old" versions of data from those temporal tables, allowing the ability to query for earlier versions of the data. See the Temporal Tables and babase_history appendix for more details.

The name of each table in this schema should be a concatenation of 1) the name of the related babase schema table, and 2) "_HISTORY". For example, a table in the babase schema called SOMETABLE would have a table in the babase_history schema called SOMETABLE_HISTORY.

Group permissions in the babase_history schema

Members of the babase_readers and babase_editors groups both have the same permissions in the babase_history schema: they have read access to the data but cannot perform INSERT, UPDATE, or DELETE commands in any tables[19], nor can they add new tables to the schema. Only administrators are allowed to perform these actions.

The babase_pending schema

The babase_pending schema holds tables pending planned integration into Babase. The tables in this schema are intended to be used with the official Babase tables but, unlike the official Babase tables, there is no automated validation process and the table structure has not been thoroughly reviewed. The tables in babase_pending are to be used but their content and structure may change when officially incorporated into Babase.

Documentation on the content of the babase_pending schema may be found on the babase_pending page of the Babase Wiki.

The difference between this schema and the sandbox schema is in the permissions granted.

babase_readers permissions in the babase_pending schema

Members of the babase_readers group have the same permissions they do in the babase schema, they have read access to the data but cannot add, delete or modify it. However, unlike in the babase schema, individual users may be granted the right to add, delete, or change data on a table-by-table basis.

babase_editors permissions in the babase_pending schema

Members of the babase_editors group have the permissions they normally have in the babase schema, they may add, delete or modify all data in the schema's tables.

The sandbox schema

The sandbox schema holds tables that are used together with the official Babase tables but have not yet made it into the Babase project. They will not be documented in the Babase documentation.

The groups have the following permissions:

babase_readers permissions in the sandbox schema

The babase_readers have all the permissions in the sandbox schema that the babase_editors have in the babase schema. They may add, delete, or modify any information in the schema but may not alter the structure of the schema by adding or removing tables, procedures, triggers, or anything else.

babase_editors permissions in the sandbox schema

The babase_editors have all the permissions of the babase_readers, plus they may add or delete tables, stored procedures, or any other sort of object necessary to control the structure of the data.

Because of the schema search order the schema name must be used to qualify anything created in the sandbox schema. E.g.

Example 2.1. Creating table foo in the sandbox schema


CREATE TABLE sandbox.foo (somecolumn INTEGER);
              


PostgreSQL, the database underlying Babase, is secure by default. This means that any tables or other database objects cannot be accessed by anyone but their creator without permission of the creator. Babase_editors who create tables in the sandbox schema should use the GRANT statement to grant access to Babase's other users.

This is done as follows:

Example 2.2. Granting permission to table foo in the sandbox schema


GRANT ALL ON sandbox.foo TO GROUP babase_editors;
GRANT SELECT ON sandbox.foo TO GROUP babase_readers;

              


There is one other issue. Only the creator of a table can change its structure -- to add another column, change the table name, etc. And only the creator can destroy (DROP) the table.

The devel schema

The devel schema holds tables undergoing integration into Babase. Normally it is empty, but during the design and development of new tables it may contain the tables being developed.

The tables in this schema do not necessarily contain valid or finalized data and so are not expected to be used for other than developmental purposes.

Permissions are granted in the devel schema on the same basis as the granting of permissions in the babase schema.

The difference between this schema and the sandbox schema is that the development tools support the creation and modification of the tables in the devel schema, which facilitates the movement of tables from the devel schema into the babase schema.

The per-user schemas

Each user has her own schema, a schema named with the user's login. Users have permissions to do anything they want in their own schemas, and no permissions whatsoever to anybody else's schema. A user's schema is private.

Caution

Users are not encouraged to grant others permissions to the tables in their schema, as shown in the Section: “The sandbox schema” above. A user's schema is deleted when she leaves Babase. All shared tables belong in the sandbox schema where they can be maintained without regard to personnel changes.

Because of the schema search order the schema name must be used to qualify anything created in the user's schema. E.g.

Example 2.3. Creating table foo in user mylogin's schema


CREATE TABLE mylogin.foo (somecolumn INTEGER);
            


Table Overview

The data in Babase are stored in tables. Tables can be visualized as grids, with rows and columns. Each row represents a single real-world thing or event, an entity, e.g. a baboon. Each cell in the row contains a single unit of information, e.g. a birth date, a name, and a sex. The row holds the entirety of the information belonging to the entity as an isolated thing, e.g. baboon database entities consist of a birth date, a name, and a sex. Each column contains one and only one kind of information, e.g. birth date.

Table 2.1 is an example of a database table that might be used to represent baboons, one baboon per row. Notice that each cell contains one and exactly one unit of information.

Table 2.1. A Simple Database Table

BirthNameSex
May 23, 1707AliceFemale
February 12, 1809BobMale
July 22, 1822CarolFemale

Anyone working with Babase will require a familiarity with the database's tables. An understanding of the entity each row represents is critical when working with a table. The tables below provide short definitions of the entities each The babase schema table holds in its rows.

Some of the tables in Babase exist to define a vocabulary. These are the support tables. For lack of a better term, the remainder of the tables are labeled main tables in Table 2.2.

Warning

Tables which have names ending in _DATA should not be used, there is always a view of the data in these tables that may be used in their place. Tables ending in _DATA may change in future Babase minor releases, breaking queries and programs which use the table. Use of the corresponding views will ensure compatibility with future Babase releases.

Table 2.2. The Main Babase Tables

Group Membership and Life Events
TableOne row for each
ALTERNATE_SNAMESrescinded sname
BIOGRAPHanimal, including fetuses
CENSUSday each individual is (or is not) observed in a group
CONSORTDATESmale who has a known first consortship
DEMOGmention of an individual's presence in a group within a field textual note
DISPERSEDATESmale who has left his maternal study group
GROUPSgroup (including solitary males)
MATUREDATESindividual who is sexually mature
RANKDATESindividual[a] who has attained adult rank
 
Analyzed: Group Membership and Life Events
TableOne row for each
DAD_DATAoffspring having a paternity analysis
MEMBERSday each individual is alive
RANKSmonth each individual is ranked in each group
RESIDENCIESbout of each individual's residency
Physical Traits
TableOne row for each
WP_AFFECTEDPARTSbody part affected by a specific wound/pathology
WP_DETAILSwound or pathology cluster indicated on a report
WP_HEALUPDATESupdate on progress of wound/pathology healing
WP_REPORTSwound/pathology report
 
Analyzed: Physical Traits
TableOne row for each
HORMONE_KITSkit or protocol used to assay hormone concentration
HORMONE_PREP_DATAlaboratory preparation performed on a sample in the specified series
HORMONE_PREP_SERIESseries of preparations and assays performed on a sample
HORMONE_RESULT_DATAassay for hormone concentration in a sample
HORMONE_SAMPLE_DATAtissue sample used in hormone analysis
HYBRIDGENE_ANALYSESanalysis of genetic hybrid scores
HYBRIDGENE_SCORESgenetic hybrid score for an individual from an analysis
 
Sexual Cycles
TableOne row for each
CYCGAPSfemale for each initiation or cessation of a continuous period of observation
CYCLESfemale's cycle (complete or not)
CYCPOINTSMdate (menses), Tdate (turgesence onset), or Ddate (deturgesence onset) date of each female
PREGStime a female becomes pregnant
SEXSKINSsexskin measurement of each female
 
The Sexual Cycle Day-By-Day Tables
TableOne row for each
CYCGAPDAYSfemale for each day within a period during which there is not continuous observation
CYCSTATSday each female is cycling -- by M, T and Ddates
MDINTERVALSday each female is cycling and is between M and Ddates
MMINTERVALSday each female is cycling -- by Mdates
REPSTATSday each female has a known reproductive state
 
Social and Multiparty Interactions
TableOne row for each
ALLMISCSfree form all-occurrences datum
CONSORTSmultiparty dispute over a consortship
FPOINTSpoint observation of a mature female
INTERACT_DATAinteraction between individuals
MPIScollection of multiparty interactions
MPI_DATAsingle dyadic interaction of a multiparty interaction collection
MPI_PARTSparticipant in a dyadic interaction of a multiparty interaction collection
PARTSparticipant in each interaction
POINT_DATAindividual point observation
NEIGHBORSneighbor recorded in each point sample
SAMPLESfocal sample
 
Darting
TableOne row for each
ANESTHStime additional sedation is administered to a darted individual
BODYTEMPSbody temperature measurement taken of a darted individual
CHESTSchest circumference measurement made of a darted individual
CROWNRUMPScrown to rump measurement made of a darted individual
DART_SAMPLESsample type collected at each darting
DARTINGSdarting of an animal when data was collected
DPHYSdarting event during which physiological measurements were taken
HUMERUSEShumerous length measurement made of a darted individual
PCVSpacked cell volume measurement taken from a darted individual
TEETHpossible tooth site within the mouth on which data was collected for every darting event during which dentition data was collected
TESTES_ARCevery testicle width/length measurement recorded, as measured along a portion of the circumference
TESTES_DIAMevery testicle width/length measurement recorded, as measured along the diameter
TICKSdarting event during which data on ticks and other parasites were recorded
ULNASulna length measurement made of a darted individual
VAGINAL_PHSvaginal pH measurement made of a darted individual
 
Analyzed: Darting
TableOne row for each
FLOW_CYTOMETRYflow cytometric analysis of a blood sample collected during a darting
WBC_COUNTScount from a blood smear collected during a darting
 
Inventory
TableOne row for each
LOCATIONSLocation that can be used to store tissue and nucleic acid samples
NUCACID_CONC_DATAQuantification of a nucleic acid sample's concentration
NUCACID_DATANucleic acid sample that is or ever has been in the inventory
NUCACID_LOCAL_IDSName/ID used to identify a nucleic acid sample at a particular institution
NUCACID_SOURCESNucleic acid sample that has another nucleic acid sample as its source
POPULATIONSStudy population under observation or from which tissue or nucleic acid samples have been collected
TISSUE_DATATissue sample that is or ever has been in the inventory
TISSUE_LOCAL_IDSName/ID used to identify a tissue sample at a particular institution
TISSUE_SOURCESTissue sample that has another tissue sample as its source
UNIQUE_INDIVSIndividual under observation or from whom tissue or nucleic acid samples have been collected
 
SWERB Data (Group-level Geolocation Data)
TableOne row for each
AERIALSaerial photo used for map quadrant specification
GPS_UNITSGPS device
QUAD_DATASWERB map quadrant
SWERB_BESuninterrupted bout of group-level observation
SWERB_DATAevent related to group-level geolocation
SWERB_DEPARTS_DATAdeparture from camp of a observation team which collected SWERB data
SWERB_GWSgeolocated physical object (grove or waterhole)
SWERB_GW_LOC_DATArecorded location of a geolocated physical object (grove or waterhole)
SWERB_LOC_DATAobservation of a group at a time at a geolocated physical object
SWERB_LOC_DATA_CONFIDENCESanalyzed observation of a location
SWERB_LOC_GPSobservation of a group at a time at a geolocated physical object made using gps units and a protocol that requires 2 waypoint readings
SWERB_OBSERVERSdeparture from camp of an observer who drove or collected SWERB data
 
Weather Data
TableOne row for each
RAINGAUGESrain gauge reading
RGSETUPSrain gauge installation
TEMPMAXSmaximum temperature reading
TEMPMINSminimum temperature reading
DIGITAL_WEATHERdigital weather reading reported from an electronic weather collection device
WREADINGSmanually collected meteorological data collection event
 

[a] At this time of this writing only males have data entered into RANKDATES.


The significant aspects of the the support tables are: the Id column -- the name of the column holding the vocabulary term, which columns of which tables use the vocabulary, and what sort of vocabulary the table defines. Table 2.3 summarizes this information.

Note

The Id columns throughout Babase do not allow values that are NULL, or which are textual but contain no characters, or which consist solely of spaces.

Table 2.3. The Babase Support Tables

General Support Tables
TableId ColumnRelated Column(s) One entry for every possible choice of...
BODYPARTSBodypartTICKS.Bodypart, BODYPARTS.Bodyregion, WP_AFFECTEDPARTS.Bodypartpart of the body
LAB_PERSONNELInitialsHYBRIDGENE_ANALYSES.Analyzed_By, NUCACID_CREATORS.Creator, WBC_COUNTS.Counted_Byperson who generates data, usually in a lab setting
OBSERVERSInitialsSAMPLES.Observer, WREADINGS.WRperson, RGSETUPS.RGSPerson, CROWNRUMPS.CRobserver, CHESTS.Chobserver, ULNAS.Ulobserver, HUMERUSES.Huobserver, SWERB_OBSERVERS.Observerperson who record observational data
OBSERVER_ROLESInitialsOBSERVERS.Role, OBSERVERS.SWERB_Observer_Role, OBSERVERS.SWERB_Driver_Role, SWERB_OBSERVERS.Roleway in which a person can be involved in the data collection process
UNKSNAMESUnksnameNEIGHBORS.Unksname and the SWERB_UPLOAD viewproblem in identifying neighbor of focal during point sampling or in identifying a lone male in a SWERB other group observation
 
Group Membership and Life Events
TableId ColumnRelated Column(s) One entry for every possible choice of...
BSTATUSESBstatusBIOGRAPH.Bstatusbirthday estimation accuracy
CONFIDENCESConfidenceBIOGRAPH.DcauseNatureConfidence, BIOGRAPH.DcauseAgentConfidence, DISPERSEDATES.Dispconfidence, BIOGRAPH.Matgrpconfidencedegree of certitude in nature of death, agent of death, disperse date assignment, or maternal group assignment
DAD_SOFTWARESoftwareDAD_DATA.Softwaresoftware package used to perform genetic paternity analysis
DCAUSESDcauseBIOGRAPH.Dcausecause of death
DEATHNATURESNatureDCAUSES.Naturereason for death
DEMOG_REFERENCESReferenceDEMOG.Referencedata source for demography notes
MSTATUSESMstatusMATUREDATES.Matured, RANKDATES.Rankedmaturity marker date estimation process
DAD_DATA_COMPLETENESSCompletenessDAD_DATA.Completenesscategory of analysis completeness
DAD_DATA_MISMATCHESMismatchDAD_DATA.Consensus_Mismatchcategory of genetic mismatch
RNKTYPESRnktypeRANKS.Rnktyperank ordering assigned to subject and month
STATUSESStatusBIOGRAPH.Statusbaboon alive at last observation
 
Physical Traits
TableId ColumnRelated Column(s) One entry for every possible choice of...
HORMONE_IDSHormoneHORMONE_KITS.Hormonehormone that may be extracted and assayed for
HORMONE_PREP_PROCEDURESIdHORMONE_PREP_DATA.Procedureprocedure that may be performed in preparation for a hormone assay
HYBRIDGENE_SOFTWARESoftwareHYBRIDGENE_ANALYSES.Softwaresoftware used for genetic hybrid score analysis
MARKERSMarkerHYBRIDGENE_ANALYSES.Markertype of genetic marker used for genetic hybrid score analysis
WP_HEALSTATUSESHealstatusWP_HEALUPDATES.HealStatushealing progress used in healing updates
WP_REPORTSTATESReportStateWP_REPORTS.ReportStatestatus of wound/pathology report
WP_WOUNDPATHCODESWoundPathCodeWP_DETAILS.WoundPathCodewound or pathology
 
Social and Multiparty Interactions
TableId ColumnRelated Column(s) One entry for every possible choice of...
ACTIVITIESActivityPOINT_DATA.Activityactivity classification
ACTSActINTERACT_DATA.Actinteraction classification
DATA_STRUCTURESData_StructureSETUPIDS.Data_Structureversion of data structure produced by the data collection devices
CONTEXT_TYPESContext_typeMPIS.Context_typecontext in which a multiparty interaction occurs
FOODCODESFoodcodePOINT_DATA.Foodcodename of a food item
FOODTYPESFtypeFOODCODES.Ftypefood category
KIDCONTACTSKidcontactFPOINTS.Kidcontactspatial relationship between mother and infant
MPIACTSMpiactMPI_DATA.MPIActmultiparty interaction classification
NCODESNcodeNEIGHBORS.Ncodeneighbor classification
PARTUNKSUnksnameMPI_PARTS.Unksnameproblem in identifying participant in a multiparty interaction
POSTURESPosturePOINT_DATA.Posturedesignated posture
PROGRAMIDSProgramidSAMPLES.Programidversion of each program used on the devices to collect focal sampling data
SAMPLES_COLLECTION_SYSTEMSCollection_SystemSAMPLES.Collection_Systemdevice or "system" used in the field for collecting focal sampling data
SETUPIDSSetupidSAMPLES.Setupidsetupfile used on the devices to collect focal sampling data
STYPESSTypeSAMPLES.STypeprotocol for focal sampling data collection
STYPES_ACTIVITIESSType-Activity pairSAMPLES.SType, ACTIVITIES.Activityactivity classification allowed to be used in each focal sampling protocol
STYPES_NCODESSType-Ncode pairSAMPLES.SType, NCODES.Ncodeneighbor classification allowed to be used in each focal sampling protocol
STYPES_POSTURESSType-Posture pairSAMPLES.SType, POSTURES.Postureposture classification allowed to be used in each focal sampling protocol
SUCKLESSuckleFPOINTS.Kidsuckleinfant suckling activity
 
Sexual Cycles and The Sexual Cycle Day-By-Day Tables
TableId ColumnRelated Column(s) One entry for every possible choice of...
PCSCOLORSColorSEXSKINS.Colorparacallosal skin coloration
 
Darting
TableId ColumnRelated Column(s) One entry for every possible choice of...
DART_SAMPLE_CATSDs_catDART_SAMPLE_CATS.DS_Catcategory of darting sample type
DART_SAMPLE_TYPESDS_TypeDART_SAMPLE_TYPES.DS_Typetype of sample collected during dartings
DRUGSDrugDRUGS.Druganesthetic drug
LYMPHSTATESLymphstateDPHYS.Ringnode, DPHYS.Lingnode, DPHYS.Raxnode, DPHYS.Laxnode, DPHYS.Lsubmandnode, DPHYS.Rsubmandnodelymph node condition
PARASITESPARASITETICKS.Tickkindparasite species, species developmental stage, or kind of parasite sign counted
TCONDITIONSTconditionTEETH.Tconditionphysical condition of a tooth
TICKSTATUSESTickstatusTICKS.Tickstatusparasite count outcome category
TOOTHCODESToothTEETH.Toothadult or deciduous tooth
TOOTHSITESToothsiteTOOTHCODES.Toothsitedental site within the mouth
TSTATESTstateTEETH.Tstatetooth presence
 
Inventory
TableId ColumnRelated Column(s) One entry for every possible ...
INSTITUTIONSInstitutionLOCATIONS.Institution, NUCACID_LOCAL_IDS.Institution, TISSUE_LOCAL_IDS.Institutionpossible locale where tissue and nucleic acid samples can be stored or used
MISID_STATUSESMisid_StatusTISSUE_DATA.Misid_Statuslevel of confidence in the identity of a tissue sample
NUCACID_CONC_METHODSConc_MethodNUCACID_CONC_DATA.Conc_Methodmethod used for quantifying nucleic acid concentrations
NUCACID_CREATION_METHODSCreation_MethodNUCACID_DATA.Creation_Methodmethod used for creating nucleic acid samples
NUCACID_TYPESNucAcid_TypeNUCACID_DATA.NucAcid_Typetype of nucleic acid sample
STORAGE_MEDIAStorage_MediumTISSUE_DATA.Storage_Mediummedium used for storage/archiving of tissue samples
TISSUE_TYPESTissue_TypeTISSUE_DATA.Tissue_Typetype of tissue sample
 
SWERB Data (Group-level Geolocation Data)
TableId ColumnRelated Column(s) One entry for every possible ...
ADCODESADCodeSWERB_LOC_DATA.ADcoderelationship between baboon groups and sleeping groves.
SWERB_LOC_CONFIDENCESConfSWERB_LOC_DATA_CONFIDENCES.Confidenceconfidence score used when analyzing the accuracy of a recorded observation of a location.
SWERB_LOC_STATUSESConfSWERB_LOC_DATA.Loc_Statusstatus for a recorded observation of a location.
SWERB_TIME_SOURCESSourceSWERB_BES.Bsource, SWERB_BES.Esourcedata source used to estimate beginning and ending of observation bouts
SWERB_XYSOURCES (SWERB Time Sources)SourceSWERB_GW_LOC_DATA.XYSource data source used to obtain XY coordinates
 
Weather Data
TableId ColumnRelated Column(s) One entry for every possible choice of...
WEATHER_SOFTWARESWSoftwareDIGITAL_WEATHER.WSoftwaresoftware used to retrieve data from an electronic weather collection instrument
WSTATIONSWstationWREADINGS.Wstationmeteorological data collection location or device

The Sys_Period Column

Beginning with Babase 5.0, nearly every table in Babase has a column called "Sys_Period", which shows the range of time when the data in a row is considered "valid". When a row in a table in the babase schema is updated or deleted, the "old" version is no longer "valid" and is saved in a corresponding table in the babase_history schema.

Note

All data in the babase schema are valid, simply by virtue of their being in that schema. Users should not let this discussion of validity mislead them into undue suspicion of the accuracy of the data.

Updates to this column should only be performed automatically by the system, when data are inserted, updated, or deleted. Manual updates to this column are only allowed when done by an admin[20].

In the babase schema, the lower bound of the Sys_Period column indicates when the row was last updated, when the row was inserted to the table, or when the Sys_Period column was added to the table, whichever is most recent. The upper bound of the Sys_Period column for tables in that schema will always be NULL, meaning "no end" (yet).

In the babase_history schema, each row represents an old "version" of the row. In these tables, the lower bound of the Sys_Period column is the timestamp of the INSERT or UPDATE that created that version of the row, or the date and time that the Sys_Period column was added to the original table, whichever is most recent. The upper bound is the timestamp of the INSERT, UPDATE, or DELETE that rendered the row no longer "valid".

In all tables, this column is a timestamp range (with time zone), with inclusive lower bound and exclusive upper bound. The lower bound cannot be NULL, and defaults to the current_timestamp when the row is inserted/updated.

Entity-Relationship Diagrams

Most tables have have an id, or key, column that contains a number unique to that row within its table. The id can be used, in perpetuity, to refer to its related row and distinguish it from all the other rows of the table. Ids are arbitrary, although for convenience they are often sequentially generated integers. The name of the column is not always Id, although it sometimes is.

A relationship is established between the rows of two tables when an id value from one table appears as data in the other. The relationship notion is made most clear by way of diagrams and examples. If the next paragraph is unclear, don't worry. Have a look at the Babase diagrams below by way of example and see if that does not clear things up. The relationship concept is at the heart of relational databases and, while the underlying idea is rather simple, it took many years to develop relational database concepts[21] so don't expect a full understanding immediately.

When an id value of a row in one table appears as data in a second table, the data in the second table can be used to retrieve the identified row from the first table.[22] When an id value of a row in the first table appears as data only once in the second table, the two tables are said to have a one-to-one relationship. One row in the first table relates to one (or possibly zero) row(s) in the second table. When a row's id value can appear in more than one row of a second table, the two tables are said to have a one-to-many relationship. One row of the first table can be related to many rows in the second table. One-to-many relationships are more common than one-to-one relationships. The relationship between the various Babase tables can be visualized in entity relationship diagrams, as shown here. In this diagram each table (entity) is a box, and each box contains a list of the table's columns. The lines between the boxes represent the relationships between the tables.

Note

If you have trouble viewing the diagrams in your browser, you may wish to view them in PDF format. The diagrams are available in The Babase Pocket Reference (approx. 4.8MB) in PDF form.

Figure 2.1. Key to the Babase Entity Relationship Diagrams

If we could we would display the diagram key here.


Figure 2.2. Babase Group Membership Entity Relationship Diagram

If we could we would display a diagram here depicting censusing and group membership.


Figure 2.3. Babase Life Events Entity Relationship Diagram

If we could we would display here a diagram depicting maturity markers and ranking.


Figure 2.4. Babase Sexual Cycle Entity Relationship Diagram

If we could we would display a diagram here depicting female sexual cycle information.


Figure 2.5. Babase Sexual Cycle Day-To-Day Tables Entity Relationship Diagram

If we could we would display a diagram here depicting female sexual cycle day-to-day tables.


Figure 2.6. Babase Social Interactions Entity Relationship Diagram

If we could we would display a diagram here depicting social interactions and focal point samples.


Figure 2.7. Babase Multiparty Interactions Entity Relationship Diagram

If we could we would display a diagram here depicting multiparty interactions.


Figure 2.8. Babase Darting Logistics and Morphology Entity and Relationship Diagram

If we could we would display a diagram here depicting darting logistics and morphology.


Figure 2.9. Babase Darting Physiology Entity and Relationship Diagram

If we could we would display a diagram here depicting darting logistics and morphology.


Figure 2.10. Babase Darting Samples Entity and Relationship Diagram

If we could we would display a diagram here depicting darting logistics and morphology.


Figure 2.11. Babase Darting Teeth and Ticks Entity and Relationship Diagram

If we could we would display a diagram here depicting darting logistics and morphology.


Figure 2.12. Babase Inventory Entity Relationship Diagram

If we could we would display a diagram here depicting the Babase Inventory tables.


Figure 2.13. Babase Physical Traits Hormone Data Entity Relationship Diagram

If we could we would display a diagram here depicting the Babase Physical Traits Hormone Data tables.


Figure 2.14. Babase Physical Traits Genetic Hybrid Score Data Entity Relationship Diagram

If we could we would display a diagram here depicting the Babase Physical Traits Genetic Hybrid Score Data tables.


Figure 2.15. Babase Physical Traits Wounds and Pathologies Data Entity Relationship Diagram

If we could we would display a diagram here depicting the Babase Physical Traits Wounds and Pathologies Data tables.


Figure 2.16. Babase SWERB Core Tables Entity Relationship Diagram

If we could we would display a diagram here depicting the SWERB core tables.


Figure 2.17. Babase SWERB Grove/Waterhole Location Tables Entity Relationship Diagram

If we could we would display a diagram here depicting the SWERB Grove/Waterhole Location tables.


Figure 2.18. Babase Manual Weather Data Entity Relationship Diagram

If we could we would display here a diagram depicting the manual weather data tables.


Figure 2.19. Babase Digital Weather Data Entity Relationship Diagram

If we could we would display here a diagram depicting the digital weather data tables.


Views

Views provide an alternative to direct reference of Babase tables. Views appear to be tables, but are really pre-composed queries into the underlying Babase tables. Views can be used almost anywhere in Babase in place of a table, specifically, they can be queried just like tables. An SQL query can freely intermix the use of tables and views.

Important

Babase uses views to hide implementation details, details that may change as Babase develops. Tables that have names ending in _DATA should not be used, there is always a view of the data in these tables that may be used in their place. Tables ending in _DATA may change in future Babase minor releases, breaking queries and programs that use the table. Use of the corresponding views will ensure compatibility with future Babase releases.

Views make it easy to reuse complex or commonly used queries, or portions of queries. They allow a database designed around the capabilities of the computer to be interacted with in a fashion that makes sense to people. Although the views do not appear in the entity relationship diagrams that document the underlying database, and so are omitted from the high level overview these diagrams provide, most Babase users will greatly benefit if they take the time to understand how the views fit into the overall database and will usually find it easier to work with the views than with the underlying tables.

Table 2.4. The Babase Views

Group Membership and Life Events
ViewOne row for eachPurposeTables/Views used
CENSUS_DEMOGCENSUS rowMaintenance of CENSUS rows that are extended with DEMOG information.CENSUS, DEMOG
CENSUS_DEMOG_SORTEDCENSUS rowMaintenance of CENSUS_DEMOG rows in a pre-sorted fashion.CENSUS, DEMOG
CYCPOINTS_CYCLESCYCPOINTS rowMaintenance of CYCPOINTS rows that are extended with CYCLES information.CYCLES, CYCPOINTS
CYCPOINTS_CYCLES_SORTEDCYCPOINTS rowThe CYCPOINTS_CYCLES view sorted by CYCLES.Sname, by CYCPOINTS.Date.CYCLES, CYCPOINTS
DEMOG_CENSUSDEMOG rowMaintenance of DEMOG rows.CENSUS, DEMOG
DEMOG_CENSUS_SORTEDCENSUS rowMaintenance of DEMOG_CENSUS rows in a pre-sorted fashion.CENSUS, DEMOG
GROUPS_HISTORYGROUPS rowDepiction of GROUPS rows in a more human-readable format.GROUPS
PARENTSBIOGRAPH row for which there is either a row in MATERNITIES with a record of the individual's mother or there is a row in DAD_DATA with a record of the individual's father -- with a non-NULLDad_consensus.Easy access to parental information.BIOGRAPH, MATERNITIES, DAD_DATA, MEMBERS
POTENTIAL_DADS(completed) female reproductive event for every male more than 2192 days old (approximately 6 years) present in the mother's group during her fertile periodResearch into paternity, especially the selection of potential fathers for further genetic testing.MATERNITIES, MEMBERS (multiple times), ACTOR_ACTEES (multiple times), BIOGRAPH, RANKDATES, MATUREDATES
PROPORTIONAL_RANKSRANKS rowAutomatic calculation of proportional ranks from the ordinal ranks in RANKS.RANKS
 
Physical Traits
ViewOne row for eachPurposeTables/Views used
ESTROGENSHORMONE_RESULT_DATA row with an estrogen kitEasy access to estrogen data.BIOGRAPH, HORMONE_KITS, HORMONE_PREP_DATA, HORMONE_PREP_SERIES, HORMONE_RESULT_DATA, HORMONE_SAMPLE_DATA, TISSUE_DATA, UNIQUE_INDIVS
GLUCOCORTICOIDSHORMONE_RESULT_DATA row with a glucocorticoid kit.Easy access to glucocorticoid data.BIOGRAPH, HORMONE_KITS, HORMONE_PREP_DATA, HORMONE_PREP_SERIES, HORMONE_RESULT_DATA, HORMONE_SAMPLE_DATA, TISSUE_DATA, UNIQUE_INDIVS
HORMONE_PREPSHORMONE_PREP_DATA rowPresents HORMONE_PREP_DATA with identifying information from TISSUE_DATA and BIOGRAPH. Also useful for maintaining data in HORMONE_PREP_DATA.BIOGRAPH, HORMONE_PREP_DATA, HORMONE_PREP_SERIES, HORMONE_SAMPLE_DATA, TISSUE_DATA, UNIQUE_INDIVS
HORMONE_RESULTSHORMONE_RESULT_DATA rowPresents HORMONE_RESULT_DATA with identifying information from TISSUE_DATA and BIOGRAPH. Also useful for maintaining data in HORMONE_RESULT_DATA.BIOGRAPH, HORMONE_KITS, HORMONE_PREP_SERIES, HORMONE_RESULT_DATA, HORMONE_SAMPLE_DATA, TISSUE_DATA, UNIQUE_INDIVS
HORMONE_SAMPLESHORMONE_SAMPLE_DATA rowPresents HORMONE_SAMPLE_DATA with identifying information from TISSUE_DATA and BIOGRAPH. Also useful for maintaining data in HORMONE_SAMPLE_DATA.BIOGRAPH, HORMONE_SAMPLE_DATA, TISSUE_DATA, UNIQUE_INDIVS
PROGESTERONESHORMONE_RESULT_DATA row with a progesterone kit.Easy access to progesterone data.BIOGRAPH, HORMONE_KITS, HORMONE_PREP_DATA, HORMONE_PREP_SERIES, HORMONE_RESULT_DATA, HORMONE_SAMPLE_DATA, TISSUE_DATA, UNIQUE_INDIVS
TESTOSTERONESHORMONE_RESULT_DATA row with a testosterone kit.Easy access to testosterone data.BIOGRAPH, HORMONE_KITS, HORMONE_PREP_DATA, HORMONE_PREP_SERIES, HORMONE_RESULT_DATA, HORMONE_SAMPLE_DATA, TISSUE_DATA, UNIQUE_INDIVS
THYROID_HORMONESHORMONE_RESULT_DATA row with a thyroid hormone kit.Easy access to thyroid hormone data.BIOGRAPH, HORMONE_KITS, HORMONE_PREP_DATA, HORMONE_PREP_SERIES, HORMONE_RESULT_DATA, HORMONE_SAMPLE_DATA, TISSUE_DATA, UNIQUE_INDIVS
WOUNDSPATHOLOGIESWP_AFFECTEDPARTS rowQuerying of wounds/pathologies data (without heal updates).WP_REPORTS, WP_OBSERVERS, WP_DETAILS, WP_AFFECTEDPARTS, BODYPARTS
WP_DETAILS_AFFECTEDPARTSWP_AFFECTEDPARTS rowUpload of WP_DETAILS and WP_AFFECTEDPARTS rows.WP_DETAILS, WP_AFFECTEDPARTS, BODYPARTS
WP_HEALSWP_HEALUPDATES rowUpload and viewing of WP_HEALUPDATES rows.WP_REPORTS, WP_DETAILS, WP_AFFECTEDPARTS, BODYPARTS, WP_HEALUPDATES
WP_REPORTS_OBSERVERSWP_REPORTS rowUpload of WP_REPORTS and WP_OBSERVERS rows.WP_REPORTS, WP_OBSERVERS
Sexual Cycles
ViewOne row for eachPurposeTables/Views used
CYCLES_SEXSKINSCYCLES rowMaintenance of SEXSKINS rows.CYCLES, SEXSKINS
CYCLES_SEXSKINS_SORTEDCYCLES rowThe CYCLES_SEXSKINS view sorted by CYCLES.Sname, by SEXSKINS.Date.CYCLES, SEXSKINS
MATERNITIESbirth or fetal lossSummarizes (completed) reproductive events.BIOGRAPH, PREGS, CYCPOINTS, CYCLES
MTD_CYCLESCYCLES rowPresents CYCLES together with Mdate, Tdate, and Ddate CYCPOINTS information for a view of an "entire" sexual cycle as a single row.CYCLES, CYCPOINTS
SEXSKINS_CYCLESSEXSKINS rowMaintenance of SEXSKINS rows.CYCLES, SEXSKINS
SEXSKINS_CYCLES_SORTEDSEXSKINS rowThe SEXSKINS_CYCLES view sorted by CYCLES.Sname, by SEXSKINS.Date.CYCLES, SEXSKINS
SEXSKINS_REPRO_NOTESSEXSKINS row, or REPRO_NOTES rowMaintenance of SEXSKINS rows.CYCLES, REPRO_NOTES, SEXSKINS
 
Social and Multiparty Interactions
ViewOne row for eachPurposeTables/Views used
ACTOR_ACTEESINTERACT rowMaintenance of social interaction data, INTERACT rows and POINTS. A view optimized for highest performance when working with these tables. Analysis of social interaction data.INTERACT, PARTS
INTERACTINTERACT_DATA rowPresents INTERACT_DATA with additional date and time columns that transform the underlying date and time columns in useful and interesting ways.INTERACT_DATA
INTERACT_SORTEDINTERACT_DATA rowPresents the INTERACT view sorted in a fashion expected to ease maintenance.INTERACT_DATA
MPI_EVENTSMPI_DATA rowAnalysis and correction of multiparty interaction data.MPI_DATA, MPI_PARTS, MPIACTS
POINTSPOINT_DATA rowPresents POINT_DATA with the Ptime column transformed into a column that may be useful and interesting.POINT_DATA
POINTS_SORTEDPOINTS rowPresents POINTS sorted by Sid, and within that by Ptime.POINTS
SAMPLES_GOFFSAMPLES rowPresents SAMPLES with an additional column Grp_of_focal, which has the group of the focal at the time of sampling.SAMPLES
 
Darting
ViewOne row for eachPurposeTables/Views used
ANESTH_STATSunique ANESTHS.Dartid value -- for each darting during which additional anesthetic was administeredAnalysis and eyeballing of data involving additional administration of anesthetic when darting.ANESTHS
BODYTEMP_STATSunique BODYTEMPS.Dartid value -- for each darting having body temperature measurementsAnalysis and eyeballing of darting body temperature measurements.BODYTEMPS
CHEST_STATSunique CHESTS.Dartid value -- for each darting having chest circumference measurementsAnalysis and eyeballing of darting chest circumference measurements.CHESTS
CROWNRUMP_STATSunique CROWNRUMPS.Dartid value -- for each darting having crown-to-rump measurementsAnalysis and eyeballing of darting crown-to-rump measurements.CROWNRUMPS
DSAMPLESunique DARTINGS.Dartid value -- for each dartingVisualization of all samples collected per darting.DARTINGS, MEMBERS, DART_SAMPLES
DENT_CODESunique TEETH.Dartid value -- for each darting with recorded tooth informationPerusal and maintenance of TEETH rows by kind of tooth.TEETH
DENT_SITESunique TEETH.Dartid value -- for each darting with recorded tooth informationPerusal of TEETH rows by position in the mouth.TEETH, TOOTHCODES
HUMERUS_STATSunique HUMERUSES.Dartid value -- for each darting having humerus length measurementsAnalysis and eyeballing of darting humerus length measurements.HUMERUSES
PCV_STATSunique PCVS.Dartid value -- for each darting having PCV measurementsAnalysis and eyeballing of darting PCV measurements.PCVS
TESTES_ARC_STATSunique TESTES_ARC.Dartid value -- for each darting having at least one measurement of testes length or width circumferenceAnalysis of testes length and width measurements taken during darting.TESTES_ARC
TESTES_DIAM_STATSunique TESTES_DIAM.Dartid value -- for each darting having at least one measurement of testes length or width diameterAnalysis of testes length and width measurements taken during darting.TESTES_DIAM
ULNA_STATSunique ULNAS.Dartid value -- for each darting having ulna length measurementsAnalysis and eyeballing of darting ulna length measurements.ULNAS
VAGINAL_PH_STATSunique VAGINAL_PHS.Dartid value -- for each darting having vaginal pH measurementsAnalysis and eyeballing of darting vaginal pH measurements.VAGINAL_PHS
 
Inventory
ViewOne row for eachPurposeTables/Views used
LOCATIONS_FREELOCATIONS row that isn't used in NUCACID_DATA or in TISSUE_DATAQuerying of available ("free") locations for storing new samplesLOCATIONS, NUCACID_DATA, TISSUE_DATA
NUCACID_CONCSNUCACID_CONC_DATA rowConverting and standardizing units of nucleic acid concentrationNUCACID_CONC_DATA, NUCACID_CONC_METHODS, NUCACID_LOCAL_IDS
NUCACIDSNUCACID_DATA rowShowing data about nucleic acids in a human-readable formatNUCACID_DATA, TISSUE_DATA, UNIQUE_INDIVS, BIOGRAPH, NUCACID_LOCAL_IDS, NUCACID_SOURCES
NUCACIDS_W_CONCNUCACID_DATA rowShowing data about nucleic acids in a human-readable format, including concentrations from the most-recent quantificationsNUCACID_DATA, NUCACID_CONC_DATA, TISSUE_DATA, UNIQUE_INDIVS, BIOGRAPH, NUCACID_LOCAL_IDS, NUCACID_SOURCES
TISSUESTISSUE_DATA rowShowing data about tissue samples in a human-readable formatTISSUE_DATA, UNIQUE_INDIVS, BIOGRAPH, TISSUE_LOCAL_IDS
TISSUES_HORMONESTISSUE_DATA rowProviding an expanded set of information about tissue samples used for hormone analysis. Also useful for simultaneous upload of data to TISSUE_DATA and HORMONE_SAMPLE_DATATISSUE_DATA, UNIQUE_INDIVS, BIOGRAPH, TISSUE_LOCAL_IDS, HORMONE_SAMPLE_DATA
 
SWERB Data (Group-level Geolocation Data)
ViewOne row for eachPurposeTables/Views used
QUADSQUAD_DATA rowQuerying of X, Y coodinates from and maintenance of QUAD_DATA rows.QUAD_DATA
SWERBSWERB_DATA row -- for every SWERB event, departure from camp excludedCollects SWERB related information spread among several tables and separates geolocation points into X and Y coordinates.SWERB_DATA, QUADS, SWERB_BES, SWERB_DEPARTS_DATA, SWERB_DEPARTS_GPS
SWERB_DATA_XYSWERB_DATA row -- for every SWERB event, departure from camp excludedSeparates SWERB_DATA geolocation points into X and Y coordinates for ease of maintenance.SWERB_DATA
SWERB_DEPARTSSWERB_DEPARTS_DATArow -- for every departure from camp of every observation team, for those observation teams which have collected SWERB dataCollects departure related information spread among several tables and separates geolocation points into X and Y coordinates.SWERB_DEPARTS_DATA, SWERB_DEPARTS_GPS
SWERB_GW_LOCSSWERB_GW_LOC_DATA row -- for every geolocation of an object, of a grove or waterholeCollects SWERB grove and waterhole location information spread between tables and separates geolocation points into X and Y coordinates.SWERB_GW_LOC_DATA, QUADS
SWERB_GW_LOC_DATA_XYSWERB_GW_LOC_DATA row -- for every geolocation of an object, of a grove or waterholeSeparates SWERB_GW_LOC_DATA geolocation points into X and Y coordinates for ease of maintenance.SWERB_GW_LOC_DATA
SWERB_LOC_GPS_XYSWERB_LOC_GPS row -- for every time a group is observed at a geolocated physical object, usually a grove or waterhole, and 2 GPS waypoints are required to by the protocol to collect the dataSeparates SWERB_LOC_GPS geolocation points into X and Y coordinates for ease of maintenance.SWERB_LOC_DATA, ADCODES
SWERB_LOCSSWERB_LOC_DATA row -- for every time a group is observed at a geolocated physical object, usually a grove or waterholePresents the relationship between the groups and physical features of the landscape in a more comprehensive manner for simpler querying.SWERB_LOC_DATA, ADCODES
SWERB_UPLOADrow uploaded into SWERBThis view returns no rows, it is used only to upload data into the swerb portion of Babase.SWERB_DEPARTS_DATA, SWERB_DEPARTS_GPS, SWERB_BES, SWERB_DATA, SWERB_LOC_DATA
 
Weather Data
ViewOne row for eachPurposeTables/Views used
MIN_MAXSWREADINGS rowAnalysis and correlation of manually collected weather data.WREADINGS TEMPMINS TEMPMAXS RAINGAUGES
MIN_MAXS_SORTEDWREADINGS rowThe MIN_MAXS view sorted for convienience.WREADINGS TEMPMINS TEMPMAXS RAINGAUGES

In addition to the above views there are a number of views which produce the group of a referenced individual as of a pertinent date. These views are all named after the table from which they are derived, with the addition of the suffixed _GRP. They are nearly identical to the table from which they derive, differing only by the addition of a column named Grp. The views which produce an individual's group are listed in the following table.


Special Values

To as great an extent as possible Babase utilizes a controlled vocabulary within the system's data store. Again, as far as is possible, this vocabulary may be tailored by adding or deleting codes to tables that define the vocabulary used elsewhere.[23]

At times, the Babase system recognizes that particular codes have special meanings, for example, the BIOGRAPH table's F (female) Sex code or the 0 (alive) Status code. The meaning of these codes is fixed into the logic of the system. As examples, an individual must be female to be allowed to have a menstruation, or, the individual must be alive if a sexual cycle event is to post-date the individual's Statdate. Some of these codes, like sex, are not defined in tables, they are hardcoded into the system. Others are defined in support or other tables. Because these codes have intrinsic meaning, they cannot be removed from the Babase system nor should their presence in the data be used to code a different meaning from that which the code presently has. For example, the meaning of STATUSES code value 0 should not be changed to mean death due to meteorite impact because the system's programs would then allow dead individuals to have sexual cycles. Each of the special values that the system requires retain particular meaning is listed in the Special Values section of the table's documentation. For further information on the meaning of the special values, see the description of the data table(s) that contain the code values. Should the meaning of one of these special values need to be changed, the logic in the Babase programs should be adjusted to reflect the change.

Babase prevents ordinary users from altering rows that contain special values in an attempt to prevent mis-configuration of the system. Only users with permissions to modify a table's triggers may alter the table's special values. This is not a panacea. To return to the example above, not only does the system expect a STATUSES code of 0 to mean alive, it also expects 0 to be the only code on STATUSES that means alive. If another STATUSES code is created to indicate a more specific sort of alive-ness, unless re-programmed the system will consider all individuals given that code to be dead, not alive. A careful review of the documentation should be undertaken before modifying the content of tables that instantiate special values.

Indexes

Indexes are a feature of databases, a feature which greatly speeds data retrieval. In return there is a small cost in the time it takes to change table content, and cost in disk space used. Databases generally require indexes to perform efficently. It is a good idea to index the tables each user has in their personal schema.

There is no documentation on the indexes used in Babase. In general, there is an index for each way the tables are commonly referenced. For example, if records are often looked up on the basis of date, there will be an index on the date. As a practical guide, there is an index on each of the columns at the endpoint of a relational line in the above entity-relationship diagrams, as well as an index on every date column with the exception of the CYCPOINTS table's Edate and Ldate columns. Almost all indexes are b-tree indexes.

The Babase Program Code

Babase uses common and widespread Unix development tools and techniques[24] to minimize a new developer's learning curve. This is a vain hope. Babase is complex and contains a lot of moving parts.

The remainder of this section describes conventions and procedures that those working with the Babase source code are expected to follow. It is of interest primarily to those who work with, or are considering working with, the code. It is not a comprehensive list, guidance should be taken from the existing code.

Anything and everything that is part of Babase should be checked into the project's revision control system.

All data values used in the code should be abstracted, either via m4 or PHP defines, using names that begin with bb_.

Minimize hardcoding. The use of data values in the code should be minimized. By keeping the number of hardcoded values to a minimum, the values used within the system can be altered through procedural changes alone, expensive programming can be avoided, and the flexibility of the system is increased.[25]

All database extension, triggers, functions, etc. should be written in PL/PgSQL, supplemented by m4.

All stand-alone programs should be accessible via the web. They should be written in PHP and styled with CSS2. The web pages they produce should be XHTML 1.0 compliant and should pass W3C validation at http://validator.w3.org/. Style sheets should pass the CSS validator at http://jigsaw.w3.org/css-validator/. Programs that access the database should obtain their PostgreSQL login credentials from the user, preferably using the existing PHP library code.

Each database user must be assigned unique login credentials to the PostgreSQL database. Each user is responsible for the security of his own login credentials and should never use login credentials that are not her own. All code should support this paradigm.

Every file should begin with a statement of copyright.

Each program, function, or procedure should have documented: its input arguments; its return value; any side effects including changes to pass-by-reference arguments, changes to the screen, changes to the database cursors, etc.

Clarity in your code is more important than efficiency. If the code is not clear, it is less likely to work and more likely to have bugs introduced upon maintenance. There is no point in getting a wrong answer quickly.

See the README files in the source tree's directories for information on how the source code is organized.



[17] As security restrictions permit, of course.

[18] That way if you unknowingly revealed your password to the terrorists last weekend when you were drunk, by the time everybody sobers up the password will have been changed and the amount of damage done is limited.

[19] There is one exception to this rule. Members of the babase_editors group actually may insert data to these tables, but only when it is done automatically as part of an UPDATE or DELETE in a The babase schema table.

[20] Manual updates probably shouldn't be allowed either, but we need to allow automatic updates resulting from legitimate data changes made by babase editors. To allow this, the rule is that only admins are allowed to update this column at all, and the "versioning" function is always run as an admin.

[21] Don't try this at home! Trained Professionals Only! Etc. ;-)

[22] And the reverse is true. The id of a row in the first table can be used to find the row in the second table that holds it.

[23] Examples may be readily found in the Chapter: “Support Tables.

[24] Usually.

[25] This is very important but the reasons behind it are not obvious, coding values into the programs means creating office procedures that cannot be altered without a programmer. For example, encoding the value of the unknown group into the system would make it impossible to create different unknown groups for animals disappearing from different groups, or different unknown groups for animals disappearing in varying states of health, or whatever.

Chapter 3. Baboon Data: Primary Source Material

Table of Contents

Group Membership and Life Events
ALTERNATE_SNAMES (Alternate Short Names)
BEHAVE_GAPS (Gaps in Behavior Observations)
BIOGRAPH (Baboon Biographical Data)
CENSUS (Group Membership)
CONSORTDATES (First Consortship Dates)
DEMOG (Demography Notes)
DISPERSEDATES (Dispersal Dates)
GROUPS (Groups)
MATUREDATES (Sexual Maturity Dates)
RANKDATES (Adult Rank Attainment Dates)
Physical Traits
WP_AFFECTEDPARTS (Body PARTS AFFECTED by Wounds/Pathologies)
WP_DETAILS (Wound/Pathology DETAILS)
WP_HEALUPDATES (Wound/Pathology HEAL UPDATES)
WP_OBSERVERS (Wound/Pathology OBSERVERS)
WP_REPORTS (Wound/Pathology REPORTS)
Sexual Cycles
CYCGAPS (Gaps in Female Cycle Observations)
CYCLES (Female Sexual Cycles)
CYCPOINTS (Female Sexual Cycle Events)
PREGS (Pregnancies)
REPRO_NOTES (Textual NOTES about REPROduction)
SEXSKINS (Sexskin Turgesence Measurements)
Social and Multiparty Interactions
ALLMISCS (Ad-libitum sample data)
CONSORTS (multiparty disputes over CONSORTshipS)
FPOINTS (Point data on Females)
INTERACT_DATA (Interactions)
MPIS (Multiparty InteractionS)
MPI_DATA (Multiparty dyadic Interactions)
MPI_PARTS (Multiparty Interaction PARTicipantS)
PARTS (Participants in interactions)
POINT_DATA (Point observation data)
NEIGHBORS (point observation data on Neighbors)
SAMPLES (all-occurrences Samples)
Darting
ANESTHS (Extra Sedation Administered During Darting)
BODYTEMPS (Darting Body Temperature Measurements)
CHESTS (Darting Chest Circumference Measurements)
CROWNRUMPS (Darting Crown-to-Rump Measurements)
DART_SAMPLES (Darting Tissue Sample Records)
DARTINGS (Baboon Darting Events)
DPHYS (Darting Physiological Measurements)
HUMERUSES (Darting Humerus Length Measurements)
PCVS (Darting Blood Measurements)
TEETH (Darting Tooth Data)
TESTES_ARC (Darting Testes circumference Data)
TESTES_DIAM (Darting Testes Diameter Data)
TICKS (Darting Tick and Parasite Data)
ULNAS (Darting Ulna Length Measurements)
VAGINAL_PHS (Darting Vaginal pH Measurements)
Inventory
LOCATIONS
NUCACID_CONC_DATA (NUCleic ACID CONCentration DATA)
NUCACID_CREATORS (NUCleic ACID CREATORS)
NUCACID_DATA (General information about NUCleic ACID samples)
NUCACID_LOCAL_IDS (LOCAL IDentifierS for NUCleic ACID samples)
NUCACID_SOURCES
POPULATIONS
TISSUE_DATA (General information about TISSUE samples)
TISSUE_LOCAL_IDS (LOCAL IDentifierS for TISSUE samples)
TISSUE_SOURCES
UNIQUE_INDIVS (All UNIQUE INDIVidualS)
SWERB Data (Group-level Geolocation Data)
AERIALS (Aerial photos)
GPS_UNITS (Individual GPS Devices)
QUAD_DATA (map Quadrants)
SWERB_BES (Begin/Ends: Uninterrupted bouts of group-level observation)
SWERB_DATA (Group Level GPS Point Samples)
SWERB_DEPARTS_DATA (Observation team departures from camp)
SWERB_DEPARTS_GPS (SWERB GPS Departure data)
SWERB_GWS (SWERB Grove and Waterholes)
SWERB_GW_LOC_DATA (SWERB Grove/Waterhole Location Data)
SWERB_LOC_DATA (LOCation-Specific DATA from GPS Points)
SWERB_LOC_GPS (Secondary Data for LOCations in GPS points)
SWERB_OBSERVERS
TREES
Weather Data
RAINGAUGES (Rain Measurements)
RGSETUPS (Rain Gauge Setups)
TEMPMINS (Minimum Temperature Measurements)
TEMPMAXS (Maximum Temperature Measurements)
DIGITAL_WEATHER (Digitally Collected Weather Data)
WREADINGS (Weather Readings)

These tables contain the permanent records of baboon-related data. For the most part this data are as collected in the field, although presumably the field staff is not perfect and there will be some errors that are corrected before data entry into Babase. Some columns, and more rarely entire rows, do contain derived data. Some of the derived data, such as pregnancy parity, is manually maintained, other derived data, such as sexual cycle sequence numbers or menses dates computed from onset of turgesence, is maintained by the system. The documentation clearly indicates which data are collected in the field, which data are derivative, and how derived data values are constructed.[26]

Group Membership and Life Events

ALTERNATE_SNAMES (Alternate Short Names)

This table records cases where short names (Snames) were assigned to individuals and then the choice of name was rescinded. It contains one row for every rescinded Sname, linking the rescinded value to the Sname presently assigned to the individual.

A new row may not be inserted into BIOGRAPH with an Sname value that is an Alternate_Sname value. However, in order to accommodate cases of switched identities, ALTERNATE_SNAME rows may have Alternate_Sname values which appear in the BIOGRAPH.Sname column.

The Sname value must differ from the Alternate_Sname value.

Sname

A three-letter code (an id) that uniquely identifies a particular animal (an Sname) in BIOGRAPH. This code can be used to retrieve information from BIOGRAPH or other places where the animal's three-letter code appears.

This column may not be NULL and may not be 998.

Alternate_Sname (Alternate Short Name)

An Sname once associated with the individual identified in the Sname column. This column may not be empty, it must contain exactly 3 characters, it may not contain lower case letters, and it may not contain the space character. This column may not be NULL.

Name_Alternate (Alternate Name)

The name associated with the alternate sname. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

This column may be NULL.

Notes

Notes regarding the existence of the alternate Sname. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

BEHAVE_GAPS (Gaps in Behavior Observations)

This table lists and explains "behavior gaps": periods of time during which behavioral data (e.g. interactions, focal sampling) for an indicated group are (or are suspected to be) sparse, lacking, or simply lower than normal, for a known reason. The "known reason" is an important element of these gaps; periods of time where data collection happens to dip below the norm for unknown reasons are not included in this table.

Data from gap periods are not any less "valid" than data from any other times. However, when aggregating and analyzing data, the sparseness of data in a given period may affect the final results. The purpose of this table is to point out such periods and allow users to decide for themselves how to deal with them.

Reasons for gaps vary widely, so they are noted in a text column rather than with a support table of possible "gap reasons". This makes querying for reasons unwieldy, but this is by design; the table is intended to be used as a guide for thoughtful consideration[27] of time periods where gaps in observation may be affecting analyses.

When discussed in this table, a "gap" does not necessarily mean a complete absence of data for the indicated period. It may merely refer to periods where collected data is sparser than usual. Also, a gap does not necessarily indicate that all data types are uniformly sparse. It may be that the gap only applies to a single type of data. Users should pay attention to the Gap_End_Status and Notes columns for details about which data types are affected.

Identification of a gap is done by a data manager. The system is not involved with this process, and does not handle data from gap periods differently than data from any other time periods. Those kinds of judgments are left for the user to make.

A group may have overlapping behavior gaps; it's possible for more than one factor to affect observation of a group at the same time.

A gap's Gap_End must be after its Gap_Start, or NULL. The Gap_End can only be NULL if the group's GROUPS.Cease_To_Exist is NULL. This allows for recording of ongoing, not-yet-completed gaps.

A gap's Gap_End and Gap_End_Status must both be NULL or both be non-NULL.

Column Descriptions

BGId (Behave_Gaps Identifier)

A unique integer identifying the BEHAVE_GAPS row.

This column is automatically maintained by the database and must not be NULL.

Grp (Group)

The Gid of the group affected by this gap.

This column must contain a Gid value of a row on the GROUPS table. This column may not be NULL.

Gap_Start (Start Date of the Gap)

The date on which the gap began. This date must be between the group's GROUPS.Start and GROUPS.Cease_To_Exist, inclusive.

This column may not be NULL.

Gap_End (End Date of the Gap)

The date on which the gap ended. This date must be between the group's GROUPS. Start and GROUPS.Cease_To_Exist, inclusive.

This column may be NULL, see above.

Gap_End_Status

The reason for, or status of, the gap's end. The legal values for this column are defined by the GAP_END_STATUSES support table.

This column may be NULL, see above.

Notes (Explanatory Notes)

Text notes about the gap, especially information about the gap's cause.

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

BIOGRAPH (Baboon Biographical Data)

This table records the basic biographical data on baboons. It contains one row for each baboon, including still births and fetal deaths (collectively, fetal losses), on which data have been collected. In all cases the Statdate value must not be less than the Birth value. Live animals, those with a Status of 0, must have a recorded cause of death of not applicable, a Dcause of 0. Live animals that have no associated CENSUS rows (absences excepted) must have a Statdate equal to their Birth date. Animals with no recorded cause of death, a Dcause of 0, must have not applicable as the degree of confidence in both the nature and agent of death; their DcauseNatureConfidence and DcauseAgentConfidence must both be 0.

The system will generate an error when it finds a birth date that is later than the the team's last contact with the mother -- when the Birth date is later than the mother's Statdate.[28]

All individuals with an Sname, i.e. those that aren't fetal losses, must have a Name and will have rows in MEMBERS. Individuals with an Sname may not have their Sname removed (set to NULL).

Caution

The Psionload program treats an Sname value of 998 in a special fashion. 998 may not be used as an Sname value. See the Psionload documentation below for details.

Those rows that record data on fetal losses must maintain the following relations between their data values: the Sname, Name, Entrydate, and Entrytype values must be NULL; the Statdate must be the same as the birth date (Birth); and the Status must not be 0 (alive). Because fetal losses have no Sname they cannot have corresponding CENSUS rows and there will not be any record of their group membership in MEMBERS.

Entrydate and Entrytype can only be NULL for fetal losses--when their Sname is also NULL. Otherwise, they cannot be NULL and Entrydate must be between the individual's Birth and Statdate values, inclusive. When Entrytype is B (Birth), the Entrydate must be the individual's Birth. When Entrytype is any other value, Entrydate cannot equal Birth.

The Statdate of live individuals is derived from the CENSUS table. An actual census does not have to be taken. Any observation of an individual in a group that results in a row being added to CENSUS is sufficient, except that Absences don't count. When there are no non-absent censuses and the individual is alive, then the Statdate is the Entrydate. This column is automatically updated when CENSUS is updated to ensure that these conditions remain true. When the individual is not alive the Statdate is the date of death.

Caution

Living individuals, unlike dead ones, can have MEMBERS rows created by the interpolation procedure that locate the individual in a group on a date later than the individual's Statdate. For further information see: Interpolation at the Statdate.

In a like fashion, living individuals, unlike dead ones, can have CYCPOINTS rows created by automatic Mdate generation on a date later than the individual's Statdate. For further information see: Automatic Mdate Generation.

Caution

Male Dispersed dates may be after the Statdate when the individual is alive and there are subsequent censuses of the group from which the individual dispersed.

Caution

When dates are encoded as intervals to account for uncertainty in the data, as with the CYCPOINTS Edate and Ldate columns, the latter end of the interval may post-date the Statdate.

Aside from the preceding caveats, Babase does not allow data to be related with an individual when the date of the data postdates the individual's Statdate. Therefore Statdate provides a convenient way of determining the end of the time interval during which there are data on an individual, a way that is independent of whether the individual is alive or dead.

An individual's Dcause represents a specific Nature and Agent of death. When considering the associated DcauseNatureConfidence and DcauseAgentConfidence values, it is important to remember that a Dcause should be interpreted as "if Nature, then Agent". It is tempting to assume that this means that the DcauseAgentConfidence cannot be higher than the DcauseNatureConfidence, but this is not so. The DcauseAgentConfidence is assigned contingent on the associated Nature being true, so it is possible for the DcauseAgentConfidence to be higher than the DcauseNatureConfidence. For this reason, the system has no rules validating the DcauseAgentConfidence based on the DcauseNatureConfidence, nor vice versa.

Confidence in the accuracy of the estimated birth date is categorized in the Bstatus column. The estimated range of possible birth dates might not be as symmetrical around the Birth date as is implied in BSTATUSES, so the specific boundaries of this range are recorded in the EarliestBirth and LatestBirth columns.

The EarliestBirth and LatestBirth columns cannot be NULL, unless the Bstatus is 9.0 ("unknown"), in which case both EarliestBirth and LatestBirth must be NULL.

The EarliestBirth must be on or before the individual's Birth, which must be on or before the individual's LatestBirth. LatestBirth must be on or before the individual's Statdate, but only for individuals with non-absent rows in CENSUS.[29]

The LatestBirth must be on or before the Entrydate, unless the individual's Entrytype is B (Birth). As mentioned above, when Entrytype is B, the Entrydate must equal the Birth date. In these cases, if there is any uncertainty about when the individual's "true" birth date is, the LatestBirth might legitimately be after the Birth date and therefore after the Entrydate. The LatestBirth should never be long after the Entrydate[30], so even in these cases there are boundaries placed on LatestBirth. When Entrytype is B (Birth), the LatestBirth cannot be more than 29 days[31] after the Entrydate unless either or both of them is NULL.

The system will return a warning if the length of time between EarliestBirth and LatestBirth is more than Bstatus years[32]. Similarly, the system will return a warning if the EarliestBirth is more than (0.5 × Bstatus) years before the Birth date, and another if the LatestBirth is more than (0.5 × Bstatus) years after the Birth date.

It's possible for an individual's Bstatus to be "too high", based on the length of time between EarliestBirth and LatestBirth. That is, a high Bstatus could mistakenly be used for individuals whose EarliestBirth - LatestBirth range is significantly less than Bstatus years. The system will return a warning if the length of time between the EarliestBirth and LatestBirth is less than or equal to the length of time indicated by a smaller BSTATUSES.Bstatus value.

When inserting or updating data in this table, the system can use the row's Bstatus to automatically populate the EarliestBirth and LatestBirth columns, if desired. When the Bstatus is not 9.0 ("unknown"):

While the UNIQUE_INDIVS table contains a larger list of all individuals across multiple populations, this table is the primary authority for the "main" population. When rows are inserted or deleted in this table, related rows are automatically inserted or deleted in UNIQUE_INDIVS with IndivId = the Bioid, and PopId = 1. Individuals in the main population cannot be added to UNIQUE_INDIVS before being added to this table.

Column Descriptions

Bioid (Biograph IDentifier)

A unique integer identifying the BIOGRAPH row.

Babase rarely uses this identifier; it exists for the convenience of application programs and for distinguishing individuals without Snames (fetal losses) from each other and from other individuals.[33]

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Sname (Short Name)

The short name of the individual. This is an exactly three character long name abbreviation which is used to identify the individual and so must be a unique data value. It may not contain lower case letters or spaces.

Tip

The Sname is usually, but not always, the first 3 characters of the Name.

This value appears in many other places in the system and so should not be changed without changing all the other places in the database where the abbreviation appears; really, once established, the only reason to change this column is because the short name had already been used. [34] Because this is unlikely, Babase does not allow the Sname to be changed. The Sname is always composed of capital letters and may not contain a space.[35] This column should only be NULL if the row represents a fetal loss.

Name

The name of the individual. This is a textual column used for descriptive purposes. This value must be unique when a comparison is done in a case insensitive fashion. This column should only be NULL if the row records a fetal loss. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

The Pid value, from the PREGS table, of the individual's mother's pregnancy that ended in the birth[36]of the individual. This column may be NULL. A NULL value indicates there is no record of the individual's mother.

Caution

More than one individual may have the same Pid, as long as they were products of the same pregnancy. This occurs when twins are born into the study population.

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sex

The sex of the individual. The legal values are:

Valid Sex Values
CodeDescription
Mthe individual is male
Fthe individual is female
Uthe individual is of unknown sex

This column may not be NULL.

Birth

The date the pregnancy ends. For live births and for individuals whose maternity is unknown, this is their estimated birth date. Otherwise, this is the date of the fetal loss. (A pregnancy that ends with the mother's death is considered as a spontaneous abortion (fetal loss) for this purpose.)

This column may not be NULL.

The BSTATUSES.Bstatus categorizing the quality of the birth date estimate.

This column may not be NULL.

The maternal group of the individual, the Gid of the group into which the individual was born.

This column must contain a Gid value of a row on the GROUPS table. This column may not be NULL.

Tip

If the maternal group is not known, the maternal group should be recorded as the unknown group.

Matgrpconfidence (confidence in maternal group assignment)

The degree of confidence in the assignment of the Matgrp value. The legal values for this column are defined by the CONFIDENCES support table.

This column may not be NULL.

Entrydate

The date the individual entered the study population.

Note

Because of Interpolation, it may seem like this column could be maintained automatically. However, the opacity of "non-interpolating" rows in CENSUS and the related historical analyses prevent accurate automatic determination of the entry date for many individuals. For more information, see CENSUS.Status and Interpolation, Data are not Re-Analyzed.

This column can be NULL, only if the row represents a fetal loss.

Entrytype

The way the individual entered the study population. The legal values for this column are defined by the ENTRYTYPES table.

This column can be NULL, only if the row represents a fetal loss.

Statdate

The status date of the individual. When the individual is alive, this is the latest date on which the animal was censused and found in a group.

This column may not be NULL.

The state of the individual's life at the Statdate. The legal values for this column are defined by the STATUSES support table.

This column may not be NULL.

The cause of death or circumstances associated with death. The legal values for this column are defined by the DCAUSES support table.

This column may not be NULL.

DcauseNatureConfidence (Confidence in Nature of Death)

The degree of confidence in the nature of the individual's death or circumstances associated with the individual's death (their DCAUSES.Nature). The legal values for this column are defined by the CONFIDENCES support table.

This column may not be NULL.

DcauseAgentConfidence (Confidence in Agent of Death)

The degree of confidence in the agent of the individual's death or circumstances associated with the individual's death (their DCAUSES.Agent). The legal values for this column are defined by the CONFIDENCES support table.

This column may not be NULL.

Alt_Snames (Alternate Short Names Exists)

A boolean value indicating whether or not there exist rows on the ALTERNATE_SNAMES table related to the individual's Sname. This value is true if and only if there exists a row on ALTERNATE_SNAMES with an Sname value which is the individual's sname or there exists an ALTERNATE_SNAMES row with a Alternate_Sname value which is the individual's sname.

The value in this column is automatically maintained and will never be NULL.

EarliestBirth (Earliest estimated Birth date)

The earliest estimated birth date for this individual.

The values in this column may be calculated automatically, as discussed above.

This column may be NULL, but only when the accuracy of the birth estimate is unknown (when Bstatus is 9.0).

LatestBirth (Latest estimated Birth date)

The latest estimated birth date for this individual.

The values in this column may be calculated automatically, as discussed above.

This column may be NULL, but only when the accuracy of the birth estimate is unknown (when Bstatus is 9.0).

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

CENSUS (Group Membership)

The population census table. Aside from the BIOGRAPH.Matgrp column, this table is the origin of all information regarding group membership. This table holds all the field census data and any information regarding group membership that is recorded in the field demography notes. It contains one row per animal per group per day censused. There is an additional row per individual per demography note for those days when there is a demography note regarding the individual and group but no census of the group. (See DEMOG.)

Tip

One way to have Babase record that an individual is alone is to first create a row in GROUPS meaning alone, and then to assign individuals who are alone to this group. The alone-ness of an individual can then be tracked in the same fashion as group membership, although the Babase user does then need to be aware that the members of the alone group are not actually proximate to one another.

The system will report individuals who are first censused in a group other than their maternal group (BIOGRAPH.Matgrp). The exceptions to this are when the maternal group is the unknown group or that first census row records an absence.

The system will report individuals with a BIOGRAPH.Sname that do not have any related (non-absent) CENSUS rows.

The Date must be between the Grp's related GROUPS.Start and Cease_To_Exist, inclusive, with one exception. Rows indicating absences — rows whose Status is A — may occur outside of the date range for a group's lifetime. These may sometimes be needed during fission/fusion periods to manually prevent an individual from being interpolated into a group that no longer exists or doesn't yet exist. However, a need for such absences is rare, so the system will report a warning for any "absent" censuses before the Grp's Start or after its Cease_To_Exist, exclusive.

The system will report a warning when CENSUS rows have a Status of C or D and a Date before the individual's LatestBirth, and another warning if before the individual's Entrydate.

As noted in the MEMBERS documentation, Babase does not allow an individual to be in more than one group on a given day.

Ideally, the original field census data sheets could be recovered from CENSUS, but there are several situations where that is not possible:

First, a datum is lost when an individual is actually censused in two groups on the same day because of movement between groups and the timing of the censuses.[37] In this situation a decision should be made as to which group CENSUS should record the individual's presence on that day. A demography note should then be added to DEMOG, with text that notes the individual's presence in the second group. This results, technically, in all of the information from both censuses, or other location information, being entered into the database. However, it should be remembered that, because the information regarding the second census is in textual form, it is not readily available to automated tools.

Second, it may be necessary during group fissions and fusions to record a different Grp than what was actually recorded because it is usally not clear in real time that a fission/fusion has begun. There is necessarily a lag between when a change can be seen retroactively and when the field notebooks are actually updated to reflect the existence of the newly-formed group(s). For fusions it is important to construct group membership in Babase carefully, for the sake of maintaining group residency. If an individual is a resident of one parent group and is censused in another, the residency algorithm recognizes the other parent group as an entirely different group. That is, it does not recognize that the groups will soon be related. To prevent a loss of residency due to an apparent group change, censuses in the other parent group(s) should be recorded with the daughter group as the Grp whenever at least some of both parent groups are together.

Example 3.1. Crossovers during a fusion

"Bruce" has been a resident of the "Gotham" group for years. "Clark", meanwhile, is a resident of the "Metropolis" group, and "Diana" is the alpha female (so, definitely resident) of the "Amazons" group. On 01 June, the three will permanently fuse together and form the "JLA" group, after first being seen together on 01 May. (JLA's Start is 01 May, Permanent is 01 June) Throughout May, census records show Bruce making short visits to Metropolis and to the Amazons. Knowing that the groups are in a fusion period, whenever Bruce is with Metropolis or the Amazons, he and all members of the group he is with should be recorded as being in the JLA. Similarly on dates later in the month when Bruce and his close associates Robin and Alfred — along with Clark and sometimes his sister Kara — were with the Amazons, all members of the Amazons and their friends from Gotham and Metropolis should be recorded in the JLA.

In January of the same year, Clark made a brief visit to Gotham. That was before the fusion began in May, so that visit's Grp need not be changed in any way[38].


Third, some CENSUS rows are derived from analyses of historical data and employ MEMBERS-style rows where group members generally have a row on every date of a given month that they were present, rather than just those dates when censuses were performed. See the Status column for details.

Caution

Be careful when changing these data. When CENSUS data are inserted, deleted, or updated, the MEMBERS table and BIOGRAPH.Statdate column are automatically updated via Interpolation. Also, remember that rank will almost certainly change should group membership change.

Cenid

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row. Cenid links CENSUS to DEMOG.

This column may not be NULL.

Date

The date of the census, or the date of the demography note (when Status is D).

Note

The date value must not be more than a year later than the present moment. This rule prevents accidental data entry errors from creating so many rows in MEMBERS that all available disk space is used.

This column may not be NULL.

The individual whose location is being recorded. The three-letter code that uniquely identifies an individual in BIOGRAPH. There will always be a row in BIOGRAPH for the individual identified here.

This column may not be NULL.

The group where the individual is located. This is a Gid value from GROUPS. This column should contain the most specific sub-grouping available -- subject to the constraints of the data entry protocol, of course. Aggregation into larger groupings is accomplished by retrieving the associated MEMBERS.Supergroup of the individual on the date of census.

This column may not be NULL.

Note

Usage exception: For dates between 21 Mar 1990 and 29 Feb 1992, inclusive, the group recorded for the sub-groups of Alto's group do not necessarily reflect the actual groupings of the animals on a particular day, but are instead indications of the group-splitting process. See the Protocol for Data Management: Amboseli Baboon Project document for further explanation.

Status

A one letter code indicating the source of the location information. Status is the source of MEMBERS.Origin data. The current codes are as follows: C (census), A (absent), D (demography), and M or N (manual). Other values derived from analysis of historical data include: S, E, F, B, G, T, L, and R.

The CENSUS.Status Codes

C

(census) The animal was found in the group on a field census sheet: from the census datasheets. (There may or may not be a corresponding demography note on DEMOG as well.)

Tip

A C Status is marked on the field census data sheet as an X.

A

(absent) The animal was not found in the group on a field census sheet. Note that while an individual should not be recorded present in more than one group on the same day, s/he may be absent from several groups on any given day.

Tip

An A Status is marked on the field census data sheet as an 0.

D

(demography) The animal was noted, in the field notebooks or elsewhere, to be in a group but was not marked present in a field census of a study group on that day.[39] There should be a DEMOG row associated with the CENSUS row. The individual may or may not have been marked absent on the same group's field census for the day.[40]

Tip

A D Status is marked on the field census data sheet as an 0, when there exists a corresponding place on the census data sheet.

Warning

The system will allow CENSUS rows with a Status of D to be entered without there being a corresponding DEMOG row in existence.[41] However it is expected that these rows exist only long enough to allow entry of a related DEMOG row. The system will report CENSUS rows with a Status of D that have no related DEMOG row.

M

(manual, interpolated) This code provides a way to manually supplement what is in the CENSUS table when there is no other way to get the data in. Babase considers this code to be the same as the C code.

N

(manual, not interpolated) This code provides an alternative way to manually supplement what is in the CENSUS table when there is no other way to get the data in. This code does not interpolate, it is presumed to be the result of some analysis.

S

(Susan's data) The data comes from the old DISPERSE database where the record had both a Datein and a Dateout.

E

(ending date) The data comes from the old DISPERSE database where the record had a Datein but not a Dateout.

F

(final date) The data comes from the old DISPERSE database where there is a Dateout and the last recorded location is before the Statdate.

B

(birth date) The data comes from the old DISPERSE database where the record had a Dateout but not a Datein.

T

(total) The data comes from the old DISPERSE database where the record had neither a Datein nor a Dateout.

G

(gap) The data are a record of the animal in the unknown group when the animal appeared in the old DISPERSE database but where there was a gap between times of recorded location.

L

(lineage) The group is from the Matgrp on the old CYCTOT database, either because the animal did not appear in the DISPERSE database, or because the first location for the animal in the old DISPERSE database had a Datein and this Datein was after the birth date of the animal.

R

(result of Alto's breakup) The datum is S, E, F, B, G, T, or L datum that has had locations which were changed from 1.0 to the group in which the animal was censused on 15/4/92. This change left all R rows as part of a contiguous series of days during which the animals are located in the Alto's sub-group as censused on 15/4/92, and the time-adjacent locations were not 1.0.

This column may not be NULL.

Cen

Cen is whether or not the CENSUS row represents an entry on a field census data sheet. TRUE means the CENSUS row exists because of an entry on a census data sheet, FALSE means there was no census done and the CENSUS row exists to support a demography note, manual notation of absence, etc. Cen should only be TRUE when Status is C, A, or D.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

CONSORTDATES (First Consortship Dates)

This table records the dates of first consortship for males; this is a maturational milestone in males that we have analyzed in several contexts. It contains one and only one row for every individual for which there is a recorded first consortship. Individuals who have not yet consorted, or individuals that have consorted but whose first consortship date is not known, do not appear in the table.

Tip

Currently it only contains values for males; females may be added if desired.

Tip

All dates are exact, no BY dates are entered as we do for MATUREDATES and RANKDATES, so there is no Status column.

When there is a row in this table there must be a sexual maturity date in MATUREDATES, and the consortship date must be later than the sexual maturity date. The Consorted date cannot be before the individual's Entrydate, nor after the individual's Statdate. The individual must be at least 5 years of age on his Consorted date. The system will report a warning if the individual is 12 or more years of age on his Consorted date.

Sname

A three-letter code (an id) that uniquely identifies a particular animal (an Sname) in BIOGRAPH. This code can be used to retrieve information from BIOGRAPH or other places where the animal's three-letter code appears. This column may not be NULL.

Consorted

The date the individual had its first consortship. This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

DEMOG (Demography Notes)

This table holds the text that records group membership information not written on the regular field census sheets, especially that from the field demography notes. DEMOG provides a means of notating CENSUS rows, and thus facilitates management of additional free form CENSUS rows, rows that do not directly correspond with the field census sheets.[42] Thus, in conjunction with these corresponding CENSUS rows, the DEMOG rows capture group membership information that otherwise would not appear in the CENSUS table.

DEMOG contains one and only one row for every individual for every date for every group where the individual was noted present in free form textual field notes or other miscellaneous sources. The DEMOG row holds textual information. There is always exactly one corresponding CENSUS row, which holds the corresponding group membership information in the usual coded and structured form. (Note that only some CENSUS rows will have DEMOG rows; CENSUS rows that originate entirely in the regular censuses of groups will not, in general, have an associated DEMOG row). A single field note referring to more than one individual must appear in DEMOG as two (or more) separate rows, one row per individual. Multiple field notes pertaining to a single individual on a single date must be combined into one piece of text and entered in a single DEMOG row. (See the Protocol for Data Management: Amboseli Baboon Project for structure of the demography data as entered by the operator.)

Adding or removing DEMOG rows automatically updates the CENSUS.Status column of the corresponding CENSUS row.

Tip

Use the DEMOG_CENSUS view to upload datasets into this table. Use CENSUS_DEMOG view to maintain this table by hand.

Caution

The data integrity rules require that when a demography note is entered the CENSUS row be created before the related DEMOG row.

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row. Cenid links CENSUS to DEMOG.

This column may not be NULL.

Reference

A code that identifies the written field notebook or other source where the demography note can be found.

The legal values for this column are defined by the DEMOG_REFERENCES support table, see below. This column may not be NULL.

Comment

The demography note text pertaining to the CENSUS row with the given Cenid.

This column may be NULL.[43]This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

DISPERSEDATES (Dispersal Dates)

This table records dates of dispersal for males (females do not disperse and do not appear in this table). It contains one and only one row for every male who has a known date of dispersed from the study groups. Males who have not yet dispersed do not have a row in this table. Only males can have rows on this table.

Tip

All dates are exact, no BY dates are entered as we do for MATUREDATES and RANKDATES, so there is no Status column.

The system will report a warning when there is a row in this table and there is no sexual maturity date in MATUREDATES. The Dispersed date must be on or after the individual's Entrydate. The Dispersed date cannot be after the individual's Statdate when the individual is not alive (when BIOGRAPH.Status is not 0). When the individual is alive the Dispersed date may only be after the Statdate when the individual has been censused absent (CENSUS.Status is A) in the group[44] and the Dispersed date is not after the earliest such post-Statdate census date.

The system will returning a warning when the Dispersed date is before his LatestBirth.

Sname

A three-letter code (an id) that uniquely identifies a particular animal (an Sname) in BIOGRAPH. This code can be used to retrieve information from BIOGRAPH or other places where the animal's three-letter code appears. This column may not be NULL.

Dispersed

The date the individual (male) left its maternal group. This column may not be NULL.

Dispconfidence (Dispersal date confidence)

The degree of confidence in the assignment of dispersal date or rationale behind the assignment of the dispersal date. The legal values for this column are defined by the CONFIDENCES support table.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

GROUPS (Groups)

This table contains one row for every group on which there is some recorded information. This includes not only the study groups and non-study groups, but also temporary daughter groups and the special group Unknown[45](See the Protocol for Data Management: Amboseli Baboon Project for when to use this special group.) When a daughter group becomes a regular group (after a fission or fusion is complete), the new group should be given a Permanent date to indicate that it is now a permanent group (Permanent is not NULL). Any old daughter groups that did not become permanent should be left in GROUPS to support the daughter grouping membership history.

Tip

This table serves primarily as a tool for the system for data validation. To see its contents in a more human-readable format, use the GROUPS_HISTORY view.

Every reference to a group elsewhere in the Babase system corresponds to a Gid of one of the records in this table. Temporary groups (those with Permanent of NULL) must have a non-NULL> From_group value. Permanent groups must not have a Permanent value that is earlier than their Start value. Permanent groups may or may not have a NULL From_group value.

Note that there is no particular reason to remove from GROUPS those daughter groups that exist for only a short time during group fission. Those sorts of groups can remain temporary forever.

Tip

The MEMBERS.Supergroup column may be used to determine the supergroup of an individual on any given date.

Neither a GROUPS row's From_group value nor its To_group value may be the same as its Gid value.

A group's Permanent and From_group cannot both be NULL. But both can be non-NULL.

The Cease_To_Exist value must be NULL or greater than the Start value. The Study_Grp value must be NULL or must not be less than the Start value. When the Cease_To_Exist and the Study_Grp value are both non-NULL the Study_Grp value must not be after the Cease_To_Exist value.

The Cease_To_Exist value must also be greater than or equal to all daughter groups' Start values.

The Last_Reg_Census value must be NULL or greater than the Start value. It also must be less than or equal to the group's Cease_To_Exist date, unless the Cease_To_Exist is also NULL. And Last_Reg_Census must be NULL or Study_Grp must be NULL or Last_Reg_Census must be on or after the Study_Grp date. The Last_Reg_Census must be NULL when Study_Grp is NULL.

The Cease_To_Exist must be the day preceeding the Permanent date of any daughter groups, unless the daughter group's Permanent is NULL. An important consequence is that all of a group's permanent daughter groups must have the same Permanent date.

A group that is a fusion product cannot have a fission parent -- the From_group must be NULL when the group is the result of group fusion, i.e., when the group's Gid appears in the To_group column of another group. [46]

Caution

The system enforces the rules of the 3 previous paragraphs "on-commit". In a transaction ending with a ROLLBACK, any changes to this table will not be validated against these rules. This means it is possible for an invalid change to appear error-free if executed in a rolled-back transaction. Committed transactions (and commands executed outside of transactions) perform this check as expected.

The One_letter_code value must be unique within the time period from the group's Start date through the group's Cease_To_Exist date, inclusive of endpoints.

Individuals cannot be placed into rows in the CENSUS table before the Start date of the group, or cannot be censused in the group at all if the value of the Start column is NULL. Individuals cannot be placed into rows of the CENSUS table after the Cease_To_Exist value of the group. Note that both these restrictions apply to all CENSUS rows, even those that indicate the individual is absent from the group.

Gaps in observation of a group cannot be added to the BEHAVE_GAPS table if the Gap_Start or Gap_End are before the Start date of the group. Similarly, gaps cannot be added to BEHAVE_GAPS if the Gap_Start or Gap_End are after the Cease_To_Exist date.

Warning

Some gaps in BEHAVE_GAPS may have a Gap_Start date that is equal to the group's Start or Permanent date, implying that the gap started because of the opening of observation of the group.[47] Gaps may also have a BEHAVE_GAPS.Gap_End date equal to the group's Last_Reg_Census or Cease_To_Exist date, implying that the gap ended because of the group's end.[48] If the Start, Permanent, Last_Reg_Census, or Cease_To_Exist column is updated, then these implications will no longer be true. The system makes no attempt to judge whether these implications really are true or just coincidence, so data managers must exercise this judgment. When changing any of these dates in GROUPS, be sure to check for rows in BEHAVE_GAPS with Gap_Start or Gap_End dates that also should be updated, and correct them as needed.

Special Values

Group 9.0, Unknown, has a special meaning. Individuals are placed in this group by Interpolation when their whereabouts are unknown. Also, a SWERB_DATA.Seen_grp value of 9.0 in rows with an Event value of O indicates an exceptional circumstance where Seen_grp is allowed to equal the related SWERB_BES.Focal_grp value. Another group code for unknown whereabouts should not be created.

The 10.0 group has the special meaning of lone animal. The SWERB_UPLOAD view uses this value as the SWERB_DATA.Seen_grp when a lone animal is sighted. Another group code for lone animals should not be created.

The 99.0 group has the special meaning of predator sighting. The SWERB_UPLOAD view uses this value as the SWERB_DATA.Seen_grp when a predator is sighted. Another group code for predator sightings should not be created.

Column Descriptions

Gid

A positive numeric value with six digits (4 decimal places) that identifies the group. Each Gid must be unique. This column may not be NULL.

Name

The spelled out name of the group. This column must be unique, and unique insensitive of case. This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

From_group

The Gid of the group from which this group split off, if the group is a fission product. This column may be NULL.

To_group

The Gid of the group formed when the daughter of the group is the result of group fusion.

This column may be NULL to indicate there is no daughter group or the daughter groups are fission products.

Permanent

This column contains the date the group became a permanent, regular group, or contains NULL if it has not and is a temporary daughter group. For groups that were created as a result of fissions or fusions this column represents the end date of the fission/fusion period. For groups that were already intact when observation began this column represents the first day of observation on that group.

Note

Permanent affects whether or not an individual can be censused only in a daughter group and still be ranked in the parent supergroup. See RANKS and MEMBERS.Supergroup for further information.

Start

The date the group came into existence (or the earliest date it must have existed in the case of those groups existent before they were monitored.) The value of this column may be NULL to indicate the group exists but is not monitored.

If any parent group has the daughter group as its To_group then the start date is also the date the fusion started.[49]

Cease_To_Exist

The date on which the group is deemed to have permanently dissolved into fission products or merged into a fusion product. This column may be NULL for groups still under observation, groups that have not yet dissolved/merged, and groups whose dissolution/merge occurred while not under regular observation.

Last_Reg_Census (Last Regular Census)

The date of the last regular census done on the group for study groups that were dropped or ceased to exist because of fission/fusion. This column may be NULL if the group hasn't been dropped or was never a study group.

Three_letter_code

A 3 character, and exactly 3 character, code that uniquely identifies the group. The characters must all be upper case. This code is used by the Psion data collection devices and in SWERB observations taken using handheld GPS units and exists solely as a cross reference from those devices to the regular Babase group Gids. This column may be NULL if the group is never monitored using the Psion devices or SWERB GPS devices.

One_letter_code

A 1 character, and exactly 1 character, code that uniquely identifies the group within the time period of the groups existence. The character must all be upper case. This code is used to cross reference SWERB waypoint data to the regular Babase group Gids. This column may be NULL.

Study_Grp

The date the group first became an "official" study group[50] or NULL if the group was never a study group.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

MATUREDATES (Sexual Maturity Dates)

This table records sexual maturity dates, the dates of menarche or testicular enlargement. It contains one and only one row for every animal who matured in a study group or who lived in a study group as a sexually mature individual, and it may occasionally contain a row for a male who was known to mature but who did not live in a study group. Individuals who have not yet matured do not have a row in this table. All sexually mature individuals should have a row in this table. Entry into sexual maturity is not always an obvious or definite event[51], especially for males, so the Matured may be recorded as the first of the month in which the individual entered maturity.

There are restrictions on when an individual may become mature. The age of an individual at sexual maturity (Matured) must be at least 1016 days. This is about 2.7 years of age. The system will issue a warning when the sexual maturity occurs on or before the 3rd birthday. Individuals with a Mstatus of O (On) must be mature before 2922 days of age (8 years). The system will issue a warning when the sexual maturity occurs on or after the 7th birthday. An individual's sexual maturity date must be on or before his Statdate.

Some maturity dates are based on irregular observations of individuals before the long-term study began, or before the individuals entered an "official" study group. Either way, these individuals' Matured dates may be long before their Entrydate. Because of this, the system will allow but issue a warning when the month of the maturity date is earlier than the month of the individual's entry into the study population (their Entrydate).

For females, when Mstatus is O (On) Matured must be the first T date recorded in the female's sexual cycling data in the CYCPOINTS table. When Mstatus is not O Matured may not be after the first Tdate.

Caution

Changing a female's first Tdate can automatically change the female's Matured date. See CYCPOINTS.

Sname

A three-letter code (an id) that uniquely identifies a particular animal (an Sname) in BIOGRAPH. This code can be used to retrieve information from BIOGRAPH or other places where the animal's three-letter code appears. This column may not be NULL.

Matured

This is the date of menarche for females and the date of testicular enlargement for males, when either of these dates are known. Otherwise, this is the date by which the individual is considered to be sexually mature. See the Protocol for Data Management: Amboseli Baboon Project for more information regarding the dates used when the transition to maturity was not observed.[52] This column may not be NULL.

Mstatus (Sexual Maturity Status)

The status of the maturity date, that is, its precision, accuracy, quality, or other pertinent characteristics when it comes to the use of the value. The legal values for this column are defined by the MSTATUSES support table, see below. This column may not be NULL.

Tip

This column records whether the animal became mature ON a given (known) date, or BY a given (known) date. If a date is designated as an ON date[53] then we are saying that we know the animal attained that marker ON that date.[54] If a date is designated as a "BY" date the animal was adult or subadult BY that date but we do not know when the individual attained it. This scheme allows easy identification of which animals are infants or juveniles on any given day and which are not.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

RANKDATES (Adult Rank Attainment Dates)

This table records dates individuals first attained adult rank. It allows one and only one row for every individual who has attained adult rank. Individuals who have not yet obtained adult rank do not have a row in this table.

The system will report a warning when an individual has a rank (in RANKS) before their Ranked date that is higher (where 1 is highest) than another individual who has already attained adult rank.

Tip

RANKDATES currently contains only data for males but data for females may be added.

When there is a row in this table there must be a sexual maturity date in MATUREDATES. When MATUREDATES.Mstatus is O (On) then the rank attainment date must be later than the sexual maturity date. Otherwise, the rank attainment date must not be before the sexual maturity date. The Ranked date cannot be after the individual's Statdate. All individuals must be 5 or more years of age on their rank attainment date. Individuals with a Rstatus of O (On) must be less than 12 years of age on their rank attainment date. The system will report a warning for any males over 8.5 (exclusive) that have not yet attained adult rank.

It is possible that an individual will be known to have attained rank in a non-study group before they entered the study population (their Entrydate). Because of this, the system will allow but issue a warning if an individual's Ranked is before the first of the month of his Entrydate.

Sname

A three-letter code (an id) that uniquely identifies a particular animal (an Sname) in BIOGRAPH. This code can be used to retrieve information from BIOGRAPH or other places where the animal's three-letter code appears. This column may not be NULL.

Ranked

The date the individual first attained a rank among adults. The date must fall on the first of the month. This column may not be NULL.

Rstatus

The status of the rank date, that is, its precision, accuracy, quality, or other pertinent characteristics when it comes to the use of the value. The legal values for this column are defined by the MSTATUSES support table. This column may not be NULL.

Tip

The legal values for this column are O (for ON) and B (for BY), as with Mstatus in the MATUREDATES table above.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

Physical Traits

This section contains data about specific physical traits of the individuals.

In general, the data in this section tends to be collected "globally". That is, the data are collected for all living individuals in the study population (or as close to "all" as is possible). While many of the Darting tables could also easily be considered "physical traits", those data are only collected during dartings and therefore are not "global" but instead only available for individuals in that fraction of the population that has been darted.

WP_AFFECTEDPARTS (Body PARTS AFFECTED by Wounds/Pathologies)

Records which body parts were affected in each related wound or pathology Cluster, and the quantity of these wounds/pathologies affecting the specific part when that quantity is known. This table contains one row for each recorded body part, per associated wound/pathology (from WP_DETAILS) cluster. For example, if a report indicates two clusters affecting body part A and another cluster affecting body part B, this will be recorded in three rows in this table: two for body part A and one for body part B.

Each WPDId-Bodypart pair must be unique; a wound/pathology cluster can be associated with a particular body part only once.

The Quantity_Affecting_Part column records the quantity of individual wounds/pathologies in the related cluster that are affecting this row's body part. When this quantity is unknown or unclear from the report, or when the related wound/pathology is not obviously countable (e.g. "fatigue"), this column should be NULL. When a single wound/pathology affects more than one body part, this wound/pathology will be counted more than once: the Quantity_Affecting_Part column should be 1 for each of the affected parts' separate rows. For example, if there was a long slash/laceration extending from the arm to the trunk, this would be recorded with a Quantity_Affecting_Part of 1 in both the "arm" row and the "trunk" row, effectively counting a single wound twice.

Warning

Remember, the Quantity_Affecting_Part column indicates the number of wounds/pathologies that were affecting the specified body part. When aggregating data across multiple rows (e.g. sum, average, etc.), remember that individual wounds/pathologies affecting multiple body parts will be counted in more than one row of this table. Using this column to count the number[55]of discrete, independent wounds/pathologies may overestimate the true number.

Tip

When adding or updating data in this table, use the WP_DETAILS_AFFECTEDPARTS view. It is includes related columns from BODYPARTS to facilitate easy entry of Bodypart values, and from WP_DETAILS to determine the appropriate WPDId.

Tip

Use the WP_HEALS or WOUNDSPATHOLOGIES views to select data from this table. These views include related data from the other wound/pathology tables, respectively with and without related healing updates.

Column Descriptions

WPAId (WP_AffectedParts Identifier)

A unique identifier for the row. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

WPDId (WP_Details Identifier)

The WP_DETAILS.WPDId of the wound/pathology associated with this body part.

This column may not be NULL.

Bodypart

The BODYPARTS.Bpid for this body part.

This column may not be NULL.

Quantity_Affecting_Part

A positive integer indicating how many wounds/pathologies of the related type are affecting this body part.

This column may be NULL, when the quantity is unknown, unclear, or uncountable.

When not NULL, this column cannot exceed 9[56].

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

WP_DETAILS (Wound/Pathology DETAILS)

This table contains one row for each cluster of wounds or pathologies that are indicated in a report.

Similar to our use of "report" in WP_REPORTS, a "cluster" of wounds/pathologies is mostly just a data management term: useful for bookkeeping but lacking biological relevance. For our uses, a "cluster" is a group of one or more co-occuring wounds/pathologies of the same type (same WoundPathCode). They are "co-occurring" in that these wounds/pathologies were observed to appear on the same date and were likely not acquired independently. The decision to divide multiple wounds/pathologies into separate one-wound/pathology clusters or to group them into clusters of multiple wounds/pathologies is mostly made by the data manager.

In many cases, how exactly to cluster a set of wounds/pathologies is not a decision, but a necessity. When multiple wounds/pathologies of the same type are indicated on a report, there may be particular MaxDimension, ImpairsLocomotion, and/or InfectionSigns values that apply to some but not all of the wounds/pathologies (e.g. "This one slash is impairing locomotion, those other three are not"). In these cases, it is necessary to divide the multiple wounds/pathologies into separate clusters.

Clusters are numbered in the Cluster column, which must be unique per WPRId.

Some WoundPathCode values may inherently imply that the ImpairsLocomotion or InfectionSigns column(s) be a particular value. For example, if an individual is limping, by definition this pathology is impairing the individual's locomotion so the ImpairsLocomotion value should always be Y for that pathology. Because of this possibility, some validation of the ImpairsLocomotion and InfectionSigns columns is controlled by values in the WP_WOUNDPATHCODES table and its ImpairsLocomotion and InfectionSigns columns. See the WP_WOUNDPATHCODES documentation for more details.

The system will return a warning for any WP_DETAILS rows that do not have at least one related row in WP_AFFECTEDPARTS.

Tip

When adding or updating data in this table, use the WP_DETAILS_AFFECTEDPARTS view. It facilitates inserting or updating data with the related WId instead of the WPRId.

Tip

Use the WP_HEALS or WOUNDSPATHOLOGIES views to select data from this table. These views include related data from the other wound/pathology tables, respectively with and without related healing updates.

Column Descriptions

WPDId (WP_Details Identifier)

A unique identifier for this row. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

WPRId (WP_Reports Identifier

The WP_REPORTS.WPRId of the report in which this wound/pathology cluster was recorded.

This column may not be NULL.

WoundPathCode (Wound/Pathology Code)

The WP_WOUNDPATHCODES.WoundPathCode for this wound/pathology cluster.

This column may not be NULL.

Cluster (Cluster identifier)

A positive integer identifying this cluster of wounds/pathologies.

This column may not be NULL.

MaxDimension

The estimated maximum dimension of this cluster's wound or wounds (e.g. length, depth, etc., as applicable), in centimeters.

This column may be NULL, when a dimension is not recorded or not applicable.

ImpairsLocomotion

A character, indicating if this cluster's wound/pathology impairs the individual's locomotion. Legal values are Y, N, or U, meaning "Yes", "No", and "Unknown" (or Uncertain, or Unspecified), respectively.

This column may not be NULL.

InfectionSigns

A character, indicating if signs of infection (e.g. oozing, stiffness, redness) were observed. Legal values are Y, N, or U, meaning "Yes", "No", and "Unknown" (or Uncertain, or Unspecified), respectively.

This column may not be NULL.

Notes

Comments or descriptive notes about this wound/pathology.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

WP_HEALUPDATES (Wound/Pathology HEAL UPDATES)

This table contains one row for each instance where an observer provides an update on a report. These updates discuss how the wounds/pathologies have healed (or possibly how they haven't healed), so these updates are generally referred to as "heal updates".

Heal updates may be very specific, referring to a particular body part (a WP_AFFECTEDPARTS row). They may be a bit more vague and refer only to a particular wound/pathology (a WP_DETAILS row). They may even be so vague that the identity of the report (a WP_REPORTS row) being updated is the only "known" datum. To flexibly accommodate this variation, this table includes the WPRId, WPDId, and WPAId columns. These columns allow the recording of the report being updated, the particular wound/pathology from that report, or the body part affected by that wound/pathology, respectively. In each row of this table, one of these columns (WPRId, WPDId, and WPAId) must not be NULL, and the others must be NULL.

The Date in this table must be on or after the associated report's Date.

A heal update may indicate that an individual is missing, or presumed dead. For this reason, the Date may be after the individual's Statdate. However, the system will send a warning when the Date is more than 90 days after the individual's Statdate.

When wounds/pathologies are especially severe or life-changing, heal updates may continue for years after the related Date. However, these are rare. The system will return a warning when a Date is more than 365 days after its related Date.

Tip

Use the WP_HEALS view instead of this table. It presents the data in a format more hospitable for humans to read, and performs the somewhat-tricky task of joining the different ID columns to their respective wound/pathology tables.

Column Descriptions

WPHId (WP_HealUpdates Identifier)

A unique identifier for the row. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

WPRId (WP_Reports Identifier)

The WP_REPORTS.WPRId of the related wound/pathology report, if a more specific indicator is not known.

WPDId (WP_Details Identifier)

The WPDId of the related wound/pathology cluster, if known and if the related body part being updated is not known.

WPAId (WP_AffectedParts Identifier)

The WPAId of the related body part, if known.

Date

The date of this heal update.

This column may not be NULL.

HealStatus

The WP_HEALSTATUSES.HealStatus indicating how well the related wound(s)/pathology(ies) have healed.

This column may not be NULL.

Notes

Textual notes about the healing (or lack thereof) in this update.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

WP_OBSERVERS (Wound/Pathology OBSERVERS)

This table records the observers of the wound/pathology reports, one row for each observer. When a report has multiple observers, each of them is recorded in this table in a separate row.

Each WPRId-Observer combination must be unique; a report cannot have the same observer more than once.

Tip

Use the WP_REPORTS_OBSERVERS view to insert data into this table. It provides a simple way to determine the appropriate WPRId value to use, and for a human data enterer to provide multiple observers in a single row.

Tip

Use the WP_HEALS or WOUNDSPATHOLOGIES views to select data from this table. These views include related data from the other wound/pathology tables, respectively with and without related healing updates.

Column Descriptions

WPOId (WP_Observers Identifier)

A unique identifier for the row. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

WPRId (WP_Reports Identifier)

The WP_REPORTS.WPRId of the related report.

This column may not be NULL.

Observer

The OBSERVERS.Initials of the observer.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

WP_REPORTS (Wound/Pathology REPORTS)

Records each distinct report of wounds/pathologies for an individual. When a wound or pathology is first seen, field observers usually report it with a specialized form that helps systematically report various pieces of pertinent data. These data may include but are not limited to which kind(s) of wound/pathology was observed, which body part(s) was affected, wound size (if applicable), and updates on later dates describing how well the individual has healed. These reports may describe something small like a single scrape or limp, or something larger like a set of bleeding wounds on several body parts. This table contains one row for each of these wound/pathology "reports".

It is difficult to provide a precise definition of a "report" in this sense. The aforementioned specialized forms are not always used, so a "report" does not always refer to these forms. Some wounds may be especially serious or life-altering, and pathologies may be chronic or reoccuring, meaning that some wounds/pathologies may recur throughout life in several reports. Because of this, each "report" is not necessarily a distinct "instance" of wound/pathology. Frankly, the distinction between "reports" made in this table is mostly artificial, useful for bookkeeping but lacking biological relevance. In general, each "report" is a discrete observation of wounds/pathologies for an individual on a specific date. A "report" may be an elaborate form, a brief note, or something in between.

Each combination of Sname, Date, and non-NULL Time must be unique; an individual can have multiple reports on the same date, but not at the same time.

The Date must be between the individual's Entrydate and Statdate, inclusive; the individual must be alive and in the study population when the report was created. The system will return a warning if the Date is before theindividual's LatestBirth.

The Grp indicates the group written on the form by the observer. For a variety of reasons (e.g. immigrations, group fissions/fusions), the Grp column may be different from the individual's Grp on this Date. Because of this, validation of the Grp column is limited: the Date must be on or after the group's Start date. The system will return a warning when a report occurs after its Grp has ceased to exist; that is, the system will return a warning when the report's Date is after the group's Cease_To_Exist.

The system will return a warning for any WP_REPORTS rows that do not have at least one related row in WP_DETAILS.

Tip

When adding new data to this table, use the WP_REPORTS_OBSERVERS view. It simplifies the process of adding multiple observers to WP_OBSERVERS.

Tip

Use the WP_HEALS or WOUNDSPATHOLOGIES views to select data from this table. These views include related data from the other wound/pathology tables, respectively with and without related healing updates.

Column Descriptions

WPRId (WP_Reports Identifier)

A unique identifier for this report. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

WId (Wound/pathology report Identifier, non-automated)

A unique identifier for this report, generated by data management.

This column may not be NULL.

Date

The date that the wounds/pathologies in this report were first observed.

This column may not be NULL.

Time

The time that the wounds/pathologies in this report were first observed, if known.

This column may be NULL, when the time is unknown.

Sname

The BIOGRAPH.Sname of the individual whose wounds/pathologies are described in this report.

This column may not be NULL.

Grp (Group)

The GROUPS.Gid of the group in which the individual was located when the wounds/pathologies were recorded, according to the observer(s).

This column may not be NULL.

ObserverComments

Comments or descriptive notes about the wounds/pathologies from the observers on initial observation.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

ReportState

The WP_REPORTSTATES.ReportState of this report.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

Sexual Cycles

CYCGAPS (Gaps in Female Cycle Observations)

Records of the initiation and cessation of continuous periods of observation during which all of a female's cycling events are presumed, for the purpose of analysis, to have been observed. This table contains one row for each female for each initiation or cessation of a continuous period of observation.

A female is considered to be under continuous observation when all of her sexual cycle transition events -- Mdates, Tdates, and Ddates -- are observed or clearly implied by observational data.[57] When CYCGAPS contains a record of observation cessation this is an indication that some of a female's sexual cycle events have gone unrecorded. For this reason when the interval enclosed by a Mdate, Tdate, Ddate sequence contains CYCGAPS rows indicating interruption of observation, the sexual cycle transition dates to either side of the interruption must be in different sexual cycles. For further information on this and other ways CYCGAPS interacts with the rest of Babase, see the documentation on the CYCLES, CYCPOINTS, PREGS, and SEXSKINS tables.

The presumption is that females are under continuous observation -- females with no CYCGAPS are presumed to be under continuous observation. Consequently a female's earliest CYCGAPS Code must be E (End), denoting the end of a period of observation.

A female may not have two start of observation (Code S) without an intervening end of observation (Code E), or vice versa. Otherwise there would be starts without ends or ends without starts. Single day observation rows ("points", Code P) may only occur between an end of observation/start of observation pair of rows. There must be a 1-day interval between a female's CYCGAPS rows, with the single exception that an end of observation may be dated the day after a start of observation. Otherwise the same pattern of observation could be recorded using fewer rows.

Rows with a Code value of S (Start) or P (single Point), that mark the beginning of observational periods or that represent isolated single days of observation, must have a value in the State column. All other rows, those with a code of E (End) that represent the end of an observational period, must have no value (NULL) in the State column. When a State value is present, it must correspond to the sexual cycle transition information on CYCPOINTS. For further information regarding required correspondences between CYCGAPS and CYCPOINTS, and how changes in CYCPOINTS can automatically change CYCGAPS with a Code of S, see the CYCPOINTS documentation below.

Note

To simplify updates to this table, all of the above conditions are validated on transaction commit.

Warning

Any changes to the Date or Code — including UPDATE and all INSERT and DELETE commands — cause cascading updates to the CYCGAPDAYS table upon transaction commit. However, the validation for several other tables — especially CYCPOINTS — depends on the accuracy of CYCGAPDAYS. As a result, transactions involving simultaneous updates to both CYCGAPS and CYCPOINTS may result in spurious data, because validation on the latter may not be reliable. Therefore, when making changes for a given individual to both CYCGAPS and CYCPOINTS, don't do them in the same transaction. Specifically, CYCGAPS inserts, updates, or deletes should be performed in a transaction where no other tables are affected[58][59].

Only females may have CYCGAPS rows.

This table is used in the construction of the sexual cycle day-by-day tables. It also affects the determination of which sexual cycle events (CYCPOINTS) are part of a single sexual cycle (CYCLES), the construction of automatic Mdates, and the validation of sexual cycles with respect to pregnancies.

Caution

The State value is ignored in all a female's CYCGAPS rows with Dates on or before the female's Matured, excepting the row with the latest date, as the sexual cycle day-by-day tables contain no rows before the date of sexual maturity.

The combination of Sname and Date is unique.

All rows must be while the individual is alive. That is, the Date must be on or after the individual's Birth and on or before her Statdate.

Column Descriptions

Gapid

A number that uniquely identifies each row.

Sname

The short name of the female. This column should contain the Sname of a female in BIOGRAPH. This column may not be NULL.

To simplify the database code, this value may not be changed.

Code

What kind of endpoint the date records. Legal values are:

The CYCGAPS.Code Values
CodeMnemonicDefinition
SStartthe date is the start of a period of observation
EEndthe date is the end of a period of observation
PPointthe date is an isolated observation that belongs with no other observations, it is both a start and an end of an observational period

Date

The date upon which observations began or ended. Observations were made on the given date.

State (NULL allowed)

The state of the female's sexual cycle on the given date. Valid values are:

The CYCGAPS.State Values
CodeMnemonicDefinition
Mmensesfollicular -- Mdate (inclusive) to Tdate (exclusive)
Sswellingfollicular -- Tdate (inclusive) to 5 days prior to Ddate (exclusive)
Oovulating5 days prior to Ddate (inclusive) to Ddate (exclusive)
Ddeturgesenceluteal -- Ddate (inclusive) to Mdate (exclusive)
PpregnantDdate (exclusive) to birth (exclusive)
Llactatingbirth (inclusive) to Tdate (exclusive)

Must not be NULL when Code is S or P, must be NULL when code is E. See discussion in the table description above.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

CYCLES (Female Sexual Cycles)

This table records information on the sexual cycle of the females, one row per female per cycle.

Tip

The user may find it easiest to understand the function of this table by considering the CYCPOINTS_CYCLES view, which joins the CYCLES and CYCPOINTS tables.

Rows in this table depend upon rows in CYCPOINTS: Babase automatically manages the creation and destruction of CYCLES on the basis of the the sexual cycle transition events recorded in CYCPOINTS. The fundamental sexual cycle is a Mdate, Tdate, Ddate sequence. The Babase system automatically creates one row in CYCLES for every Mdate, Tdate, Ddate series in CYCPOINTS. Similarly, if an Mdate, Tdate, Ddate series is removed from CYCPOINTS, the corresponding row is removed from CYCLES. However, the rules Babase uses when automatically creating, destroying, or updating CYCLES are complicated by menarche, death, and gaps in observation.

In some cases there are turgescences of small size and short duration that typically occur during a pregnancy, prior to maturity, after a span of time spent in postpartum amenorrhea, or during a time of stress. These brief turgescent periods are not recorded as cycles because they were deemed too brief or small to be biologically functional.

Tip

If it ever becomes desirable to know when these brief turgescences occurred, this information should be recoverable from the SEXSKINS table, where the actual size of a female's turgescence is recorded.

Caution

CYCLES is special in that some of its data are automatically maintained by the system. The columns Seq and Series are updated automatically. For further information see the documentation that follows, and each column's documentation.

Tip

CYCLES rows should always have related CYCPOINTS rows[60], but as a practical matter it is necessary to create the CYCLES row before creating the related CYCPOINTS rows. This requires noting the Cid of the new cycles row so that it can be referenced in the new CYCPOINTS rows. Rather than do this by hand the CYCPOINTS_CYCLES view can be used. This allows a Sname to be specified with each new CYCPOINTS row and leaves it up to the system to either find or create an appropriate CYCLES row.

The system will report as an error those rows on CYCLES with no related CYCPOINTS rows[61]. CYCLES with no related CYCPOINTS must have a NULL Seq.

The aggregation of CYCPOINTS rows into cycles is automatically managed by Babase. The determination is based on the order in time of a female's CYCPOINTS rows and the information on gaps in observation present in CYCGAPS. The transition events recorded in CYCPOINTS are collected into sexual cycles, each cycle having (at most) an onset of menses date (Mdate), an onset of turgesence date (Tdate), and an onset of deturgesence date (Ddate), appearing in the order given here when ordered by date, and with none of the female's other Mdate, Tdate, or Ddate CYCPOINTS rows on the interval. Some sexual cycles may lack one or more of the transition events. This may occur for biological reasons — there must not be a resumption of menses date (Mdate) in an individual's first adolescent cycle, nor in the firct cycle after a pregnancy — or simply because there are no data available to identify the date(s). In the latter case, CYCGAPS should be updated with a record of the gap in observation and the respective row is omitted from CYCPOINTS.

Part of Babase's automatic management of cycles is the management of cycle sequence numbers. Babase assigns a sequence number (Seq) to each of a female's cycles, beginning with 1 at menarche and counting up. As a consequence of the numbering scheme, the sexual cycle with a sequence (Seq) of 1 must not have an onset of menses date (Mdate).

Gaps in periods of continuous observation (CYCGAPS) impact Babase's determination of what constitutes a cycle. The presence of a gap in observation forces a change in cycle. (However, gaps in observation, missing cycles, do not cause gaps in the sequence numbering.) The introduction or removal of a gap, or for that matter the addition or removal of new CYCPOINTS rows, can result in the split of an existing cycle into two -- the creation of a new CYCLES row --, or the merging of two previously distinct cycles into one -- the destruction of an existing CYCLES row. When this occurs the later CYCPOINTS rows retain their Cid, it is the earlier CYCPOINTS rows that change their Cid and move between cycles.[62][63][64]

The sexual cycles themselves are aggregated into periods of continuous observation, termed series, indicated by the assignment of a Series number to each CYCLES row. The aggregation of a female's sexual cycles into a series is also automatically managed by Babase, based on the information in CYCGAPS. Although series are computed based on CYCGAPS, the series value aggregates and numbers sexual Mdates, Tdates, and Ddates, not periods of observation. A consequence is that some periods of observation may not have an associated Series number. Some observational periods may occur before the female's sexual maturity date or before any recorded sexual cycle transition events (CYCPOINTS). An individual's first period of continuous observation containing Mdates, Tdates, or Ddates has a Series of 1, the second a Series of 2, etc.

Aggregating a female's CYCLES rows into a series indicates that the collection of data points is believed to be complete, no unobserved or unrecorded sexual cycle transitions (CYCPOINTS rows) occurred during the time spanned by the series. This allows the Series to be used as the basis for an analysis of sexual cycle transition intervals.

Tip

Those CYCLES with a Series of 1 for those females that have an O (On) Mstatus have Seq values that equal the ordinal numbering of the female's actual cycles, her first ever cycle having a Seq of 1, her second a Seq of 2, etc. All other CYCLES rows have Seq values that are useful for ordering each female's cycles but not for comparison between females.

Caution

Because a gap in observation always triggers a change in cycle, and because cycles must be complete, i.e. must contain a Mdate, a Tdate, and a Ddate, if there is no gap in observation it is impossible to have a single cycle missing nothing but a Tdate, i.e. it is impossible to have a cycle with a Mdate and a Ddate but no Tdate. If necessary, an estimated Tdate may be entered to work around this limitation.[65]

The system reports an error when the combination of Sname and Seq is not unique.[66]

Column Descriptions

Cid (sexual Cycle IDentifier)

A numeric identifier identifying each sexual cycle. It is unique across all cycles of all females.

This column need not be manually specified when the row is created.

The value of this column may not be altered after a row is created.

This column must not be NULL.

Sname

The short name of the female. This column must contain the Sname of a female in BIOGRAPH.

The value of this column may not be altered after a row is created.

This column must not be NULL.

Seq (Sequence)

The first sexual cycle of a female has a Seq value of 1, the second a value of 2, etc. The system will report an error if the Seq does not begin with 1 or is not contiguous. This column does not need to be manually maintained.

Caution

There are no gaps in the sequence numbers assigned to a female. Even when records of cycles are missing, the first recorded cycle after the missing period has a sequence one greater than the last recorded cycle before the missing period.

If the user does specify a value for this column the system may recompute and replace the supplied value at any time.

This column may be NULL when the row is first inserted, so that the system can set the value correctly when CYCPOINTS are subsequently inserted, but it may not be changed from a non-NULL value to NULL.

Series

Number indicating with which series of continuous observation the transition event belongs. Events that are isolated observations have a series of their own. As with Seq, the Series are per-female. Each female begins with a Series of 1 and is incremented with each interruption in regular observation. For further information see the description of the CYCLES table above.

The system will report an error if the Series does not begin with 1 or if the Series does not progress in a contiguous fashion. This column does not need to be manually maintained.

If the user does specify a value for this column the system may recompute and replace the supplied value at any time.

This column may be NULL when the row is first inserted, so that the system can set the value correctly when CYCPOINTS are subsequently inserted, but it may not be changed from a non-NULL value to NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

CYCPOINTS (Female Sexual Cycle Events)

This table records information on the sexual cycle of the females, one row per female per event.

Tip

The user may find it easiest to understand the function of this table by considering the CYCPOINTS_CYCLES view, which joins the CYCLES and CYCPOINTS tables.

The usual events that mark the transitions of a female baboon's sexual cycles are onset of menses (Mdate), onset of turgesence (Tdate), and onset of deturgesence (Ddate). These different transition event dates are distinguished by Code values of M, T, and D respectively. In addition to these usual observations of transition states, CYCPOINTS contains one other kind of row, estimations of when unobserved sexual cycle transitions occurred; notably the automatically calculated onset of menses dates but also unobserved onset of deturgesenceses (Ddates) related to pregnancy conception events[67].

The unusual events that impact female cycling records, notably death and the cessation or initiation of long term observation, are recorded in other tables.

Note

The interval between conception and birth (or fetal death) is the length the pregnancy, by definition, and CYCPOINTS is only place in Babase where conceptions are recorded. For this reason CYCPOINTS includes rows for the Ddate events that begin every pregnancy, including those that record estimated, unobserved, Ddates. It may be that all that is known about a cycle is that a Ddate must have occurred because a pregnancy resulted.

Although Babase requires pregnancies to have a conception Ddate, and consequently there may be pregnancies for which an estimated (Source of E) Ddate must be entered, there is nothing preventing the user from creating estimated CYCPOINTS rows for the other Codes.

Caution

CYCPOINTS is special in that some of its data are automatically maintained by the system. The Cid and Source columns can be updated by automatic processes. For further information see the documentation of the CYCLES table and each column's documentation.

The presence of a Ddate row can trigger the automatic generation of a Mdate 13 days later. For further information see the section on Automatic Mdate Generation.

Only Mdates are automatically assigned, and only Mdates may have a Source of A (Automatic). Mdates may be manually given a Source of A, although this may well not be a good idea as the Automatic Mdate Generation process may remove the A row at any point. It is even less of a good idea because automatic Mdates are not validated, so it is quite simple to enter an invalid automatic Mdate.

During a period of continuous observation — a series — sexual cycle transition events (CYCPOINTS) should not be missing, except that Mdates cannot be assigned in the case of the first adolescent cycle (at maturity) or at the start of a Resume cycle. An individual's Mdates, Tdates and Ddates should all appear in Mdate-Tdate-Ddate order. The system will report an error if this is not the case.[68] In consequence the combination of Cid and Code must be unique.[69]

Usually a female does not have multiple CYCPOINTS rows for a given date, although there is an exception. A female's onset of menses date (Mdate) may be the same as her onset of turgesence (Tdate) date. Otherwise, none of a female's CYCPOINTS rows may share a date.

Babase allows each sexual cycle transition event to be associated with 3 dates, the date of record (Date), the earliest possible date (Edate), and the latest possible date (Ldate). The earliest (Edate) and latest (Ldate) possible dates may be NULL. The earliest possible date (Edate) may not be later than the date of record (Date), and the latest possible date (Ldate) may not earlier than the date of record (Date). A female's earliest Tdate may, and likely will, have an earliest possible date (Edate) assigned that is before onset of menarche.

A number of constraints on CYCPOINTS involve the females' sexual maturity dates (MATUREDATES.Matured). When an individual's sexual maturity date is determined by observation, MATUREDATES.Mstatus is O (On), her earliest Tdate must be equal to her sexual maturity date.

Warning

When a female's MATUREDATES.Mstatus is O (On) her MATUREDATES.Matured is automatically set to her earliest Tdate. Any error in the Tdate value will be reflected in the maturity date. This is not true of females with MATUREDATES.Mstatuses that are not O. These maturity dates must be manually maintained.

No date-of-records may occur before a female's maturation date. All of an individual's date-of-record (Date) and late (Ldate) sexual cycle transition date values must be on or after the individual's onset of menarche date (MATUREDATES.Matured). All of an individual's early dates (Edate), Bdates of record (Date), and the first Tdate date-of-record (Date), sexual cycle transition dates must be after the individual's Birth date.

Females with CYCPOINTS rows must have a sexual maturity date. The system will report mature females with no CYCPOINTS rows on or after her maturity date (MATUREDATES.Matured).

All early date (Edate) and date-of-record (Date) values must be on or before the individual's Statdate.

Caution

Even when an individual is dead, late (Ldate) dates may be after the Statdate. This is because death is rarely observed; although the Statdate contains a single date, the uncertainty surrounding the date of death is reflected in the sexual cycle event Ldate.

There are gaps in observation. If the first cycling event in a series -- the first Mdate, Tdate, or Ddate -- falls on the day observation resumes then things are pretty simple. The state of sexual cycling at the time observation resumes, CYCGAPS.State, must correspond with the event. For a menses CYCGAPS.State is M and so forth. The situation is slightly complicated by the swelling-follicular and ovulating states. The details are this: If the first CYCPOINTS row in the series falls on the first day of the series, the CYCGAPS.State must be M (Menses, follicular) when the CYCPOINTS.Code is M (onset of Menses); CYCGAPS.State must be D (Deturgesence) when the CYCPOINTS.Code is D (onset of Deturgesence); CYCGAPS.State must be S (Swelling, follicular) when the CYCPOINTS.Code is T (onset of Turgesence) and the subsequent Ddate in the series is more than 5 days after the Tdate or there is no subsequent Ddate; and CYCGAPS.State must be O (Ovulating) when the CYCPOINTS.Code is T (onset of Turgesence) and the subsequent Ddate in the series is not more than 5 days after the Tdate.

If the above is not the case, i.e. the first cycling event in the series falls on the day observation resumes and CYCPOINTS.Code is M but the CYCGAPS.State is not, then the State of the CYCGAPS row is automatically changed to enforce correspondence between CYCGAPS and CYCPOINTS.

But what if observation starts and then later the first Mdate, Tdate, or Ddate is observed? What happens (to CYCSTATS) between the start of observation and the first event? That's what CYCGAPS.State is supposed to address and it needs to be set appropriately. This cannot always be done automatically either, although usually it can.

If the first CYCPOINTS row in the series does not fall on the first day of the series, the CYCGAPS.State must be D (Deturgesence) when the first CYCPOINTS.Code is M (onset of Menses); the CYCGAPS.State must be S (Swelling, follicular) when the CYCPOINTS.Code is D (onset of Deturgesence) and the CYCPOINTS.Date is more than 5 days after the CYCGAPS.Date; and the CYCGAPS.State must be O (Ovulating) when the CYCPOINTS.Code is D (onset of Deturgesence) and the CYCPOINTS.Date is not more than 5 days after the CYCGAPS.Date.

In these cases, as before, the State of the CYCGAPS row is automatically changed to enforce correspondence between CYCGAPS and CYCPOINTS.

The final set of possibilities have to do with Tdates, which are complicated because they occur at menarche and after pregnancies, as well as after menses. The system will report an error if the first CYCPOINTS row in a series does not fall on the first day of the series and the first CYCPOINTS row is a Tdate and the CYCGAPS.State is something other than M (Menses), P (Pregnant), or L (Lactating). Because there are 3 possibilities in this case, the CYCGAPS.State value is not automatically assigned.

Note

All of the validation and possible updating of the CYCGAPS.State is performed on transaction commit.

Warning

Because deleting CYCPOINTS changes a female's cycling state -- a representation of which Babase keeps in the sexual cycle day-by-day tables -- but not the interval of time during which she was under observation (CYCGAPS), removing Mdates, Tdates, or Ddates from CYCPOINTS at the beginning of a series can, possibly, leave the beginning of the series either in an incorrect state or the correct state for an overly long period of time. This can be equally true when the dates of the first CYCPOINTS in a series are changed. Removing all the CYCPOINTS Mdate, Tdate, and Ddate rows from a series will leave the entire observational period in the State specified by the CYCGAPS row that denotes the start of the observational period. This may or may not be correct, especially when the CYCGAPS.State was automatically changed due to the insertion or deletion of CYCPOINTS rows.

When deleting all sexual cycle transition CYCPOINTS rows from an observational period it is best to delete them all in a single transaction, or to delete later rows before earlier rows. Deleting CYCPOINTS rows from the beginning of the observational period changes the CYCGAPS.State value marking the start of the observational period.

CYCPOINTS rows must not fall in an interval of no observation, excepting estimated (Source is E) Ddates (Code D) that are also conception events. (See PREGS.Conceive.) None of the different kinds of date values -- early (Edate), date-of-record (Date), or late (Ldate) -- of the individual's CYCPOINTS rows may be in an interval during which the individual is not under observation -- may fall on a date on which the individual has a row in CYCGAPDAYS. The system will allow but report as an error CYCPOINTS rows with a Source of E and a Code of D that are not referenced in PREGS.Conceive.[70]

Caution

CYCPOINTS and CYCLES are intimately related. Be sure to read and understand the CYCLES documentation.

Once a row is created it must remain associated with the same female -- any re-assignment of Cid must retain the association between the CYCPOINTS row and the old Cid's female.

Note

There are plans afoot to automatically fill in the early and late dates. The early dates would include the day after the immediately prior census date, the late date would be the day of the immediately following census date. There must also be a mechanism for manually overriding the automatic dates.

Warning

When making changes to data for individuals with observation gaps, avoid updating this table in a transaction that also makes changes to CYCGAPS. See above for more information.

Column Descriptions

Cpid (sexual Cycle data Points IDentifier)

A numeric identifier unique to each row. This is used to reference the sexual cycle transition elsewhere in the database. This column may not be NULL.

This column need not be manually assigned when the row is created. It may not be changed.

Cid (sexual Cycle IDentifier)

A numeric identifier identifying each sexual cycle. It is unique across all cycles of all females, but shared by all CYCPOINTS rows comprising a cycle -- a Mdate, Tdate, Ddate sequence -- of a female.[71]

This column need not be manually specified when the row is created using the CYCPOINTS_CYCLES view. If it is not specified, the system will determine with which cycle the row should be associated and assign the correct Cid. Should the system find that the sexual cycle transition date belongs in a new cycle, it will make and assign a new Cid.[72] If the column is specified the system does the same work, but when it is appropriate to create a new cycle the supplied value is used.

As the system does the same amount of work whether or not the user specifies a value, the only utility in specifying a value is to manually assign a specific Cid to a new sexual cycle which Babase would otherwise automatically create.

Tip

When sexual cycle transition dates are incorrectly aggregated into sexual cycles, i.e. when the Cid is wrong, it is probably because the record of when the female was under observation — the data on the CYCGAPS table — is incorrect. Correcting CYCGAPS may correct the problem.

Caution

The system automatically assigns, or re-assigns, Cid values as CYCPOINTS and, especially, CYCGAPS rows are inserted, deleted, and altered to keep the database in a state consistent with the definition of a sexual cycle. For this reason any particular Cid is not guaranteed to forever identify a particular Sname/Date/Code. Cpids may be used for this purpose, or the data itself. For further information see the CYCLES documentation.

Supplying a NULL value causes the system to recompute the correct value and use it in place of the NULL.

Date

The date-of-record of the transition event. See the Protocol for Data Management: Amboseli Baboon Project for information regarding the determination of this date from the field data. This column may not be NULL.

Edate (Early Date)

Earliest possible date of the transition event. This column may be NULL when there is no need to record a range of date values.

Ldate (Ldate Date)

Latest possible date of the transition event. This column may be NULL when there is no need to record a range of date values.

Source

Code indicating from whence the data were derived. D (Data -- the default) for observed data. A (Auto) for automatically inserted rows (see Automatic Mdate Generation). E (Estimated) for estimated values not to be used in other computations, such as estimated D dates entered to relate mothers and pregnancies.

This column may not be changed after the row is created.

Code

The type of sexual cycle transition:

The CYCPOINTS.Code Values
CodeDescription
Monset of Menses, a sexual cycle transition event
Tonset of Turgesence, a sexual cycle transition event
Donset of Deturgesence, a sexual cycle transition event

This column may not be changed.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

PREGS (Pregnancies)

This table contains one row for each recorded pregnancy. A pregnancy is defined to be an event occurring to a mother; a single pregnancy could result in more than one fetus. The only time there will not be a related BIOGRAPH row for the zygote(s) is when the pregnancy is still in progress[73], otherwise there will always be a BIOGRAPH row that records the progeny of the pregnancy.

The progeny may not be born before being conceived -- the conception date (Ddate via Conceive) of the pregnancy must not be later than the birth date value (Birth) of the associated BIOGRAPH row, the child. The mother may not resume cycling until after birth -- the birth date value of the associated BIOGRAPH row must not be later than the resumption of cycling date values (Resume).

The sequence of a female's pregnancies when ordered by parity must correspond with the sequence when ordered by conception date.

The sequence number (CYCLES.Seq obtained via CYCPOINTS.Cid) of the sexual cycle event immediately following pregnancy (Resume) must always be exactly one more than the sequence number of the sexual cycle event associated with conception (Conceive). Only one pregnancy is allowed per conception event -- each Conceive value differs from all the others. These rules ensure that the resumption date follows the conception date and that there is no overlap of pregnancy time periods, from conception date to birth date or, if known, resumption of sexual cycling date, among the pregnancies associated with a particular female.[74] The female associated with the conception sexual cycle event (Conceive) must be the same as the female associated with the sexual cycle event immediately following pregnancy (Resume).

There must not be a resumption of menses date (Mdate) in the sexual cycle (CYCPOINTS.Cid) of the Resume cycle.

The pregnancy must terminate in a birth or fetal loss before the female resumes cycling; the only exception is cessation of observation as described below. The Resume column must be NULL until there is a row in BIOGRAPH with a Pid referring to the pregnancy.

Note

Note that the check for pregnancy termination, as well as the parity sequence checks, are not performed until the database transaction is committed. This allows a pregnancy discovered after subsequent pregnancies are already on-record to be added to the database by making multiple changes within a single database transaction. Inserting the new PREGS row, inserting a BIOGRAPH row for the progeny, and then updating the PREGS.Resume of the new pregnancy within a single transaction allows the referential integrity rules to be satisfied when the transaction commits.

Caution

Babase keeps a record of the reproductive state of mature females in the sexual cycle day-by-day tables. If these tables are to be correct Babase must know when each pregnancy ends (see BIOGRAPH.Birth), and when cycling resumes. When there is no record of the end of a pregnancy or resumption of cycling Babase must know whether this is due to cessation of observation or just cessation of data entry.

Babase cannot detect when the user has failed to enter rows in CYCGAPS when observation of a pregnant female has ceased. However, it will report errors and unusual conditions it can detect.

The system will report a warning: when an ongoing pregnancy exceeds 191 days -- when there are more than 191 days between the conception date (PREGS.Conceive) and the Statdate, and there are no progeny recorded for the pregnancy (in BIOGRAPH.Birth), and when there are no gaps in observation (see CYCGAPS) during the 191 day interval; when it appears that a conception date should be estimated but it is not -- when there is no Tdate in the conception cycle but the conception Ddate[75] is not estimated, and there is no gap in observation between the conception date and all of the female's prior CYCPOINTS rows.

The system will report an error: when a female has sexual cycles while a pregnancy is ongoing[76] -- when the female has Tdate CYCPOINTS rows that post-date her pregnancy's Conceive date but pre-date gaps in observation, and the pregnancy has no (NULL) Resume.[77] A female must not have any CYCPOINTS rows that postdate a pregnancy with a NULL Resume, unless the first CYCPOINTS row is a Tdate or unless they postdate a gap in observation following the pregnancy.

Warning

The Resume column is automatically updated by Babase. so long as there is no gap in observation (See CYCGAPS) between the conception date and the Tdate that resumes cycling. It is set to the Tdate immediately following the conception date. The system will report an error if there is a gap in the observation of sexual cycle events (CYCPOINTS and the Resume column is not NULL.[78]

Tip

The temporary creation of a gap in observation (CYCGAPS) allows a conception-birth-resumption sequence to be inserted into a pre-existing series of sexual cycle events (CYCPOINTS).

Column Descriptions

Pid

The contents of this column uniquely identifies the pregnancy record. The Pid must be the mother's Sname followed by the probable parity. Because the Pid is only used to identify the record, it is not necessary to change the Pid just because the parity of the pregnancy is found to have changed. Once a unique Pid is established, it may not be changed. When retrieving data from this table the safe approach is to assume nothing about the contents of this column except that it will uniquely identify a pregnancy.

Note

The preferred way to obtain the bearer of the pregnancy is to find the female associated with the ovulation by joining PREGS.Conceive with CYCPOINTS.Cpid to find CYCPOINTS.Cid, join that with CYCLES.Cid to find CYCLES.Sname, and then use that value to find the mother's BIOGRAPH row.[79][80]

Warning

The Parity column must always be used to obtain a meaningful parity value. As Pids cannot change, should a pregnancy be missed and correction only entered into Babase after the entry of a subsequent pregnancy, the female's subsequent Pid will forever contain an incorrect parity.[81]

Parity

The cardinality of the pregnancy. 1 for a female's first pregnancy, 2 for a female's second pregnancy, and so forth. There must not be gaps in the pregnancies, sequenced by Parity, of any female. When the first pregnancy is known, the Parity sequence begins with 1. When the first pregnancy is not known, the Parity sequence begins with 101.

The parity of a female's first pregnancy must be specified. This tells the system whether the parity sequence begins with 1 or 101. The system will automatically generate the parity of subsequent pregnancies, when the user does not supply a parity. When the user does specify a parity the system compares the supplied value with the value it computes for the column and and raises an error if the two do not match. As a special exception the parity is allowed to be in the 100s rather than the 1s, although the parity must remain sequential and without gaps when only the 10s and 1's place of the female's pregnancy parities are considered. E.g. the parity sequence may be either 1, 2, 3 or 1, 2, 103 but not 1, 2, 104. The 1 in the 100ths place signals that there has been a period of no observation[82] and a pregnancy may have been missed. When a pregnancy's parity is changed from the 1's (or 10's) to the 100s Babase will update the parity of subsequent pregnancies so that they are also in the 100s. Babase will only allow a change from the 100s to the 1s (or 10s) of the smallest of a female's pregnancy parities that are larger than 100 -- the first pregnancy after a period of no observation. In this case Babase will not change the parity of subsequent pregnancies; this must be done manually, from smallest to largest. Babase will not allow a change from the 100s to the 1s (or 10s) of a female's pregnancy parities that are larger than the smallest parity larger than 100.

Supplying a NULL value for the Parity causes the system to recompute the correct value, a value one larger than the parity of the previous pregnancy, and use it in place of the NULL.

Conceive

The information related to the Ddate event that initiated the pregnancy. This is the Cpid of a CYCPOINTS row of the mother. The related CYCPOINTS row should record the date of conception and must record a Ddate.

This column must contain a unique datum.

Tip

When the date of conception is estimated because there is no sexual cycle data, the conception date recorded should be 178 days before the recorded birthday.

This column must not be NULL.

Resume (NULL allowed)

The resumption of cycling event (Tdate) of the first cycle following the pregnancy. This is the Cpid of a row in CYCPOINTS, which must record a Tdate. This column may be NULL in those cases when resumption of cycle information is not known. When this column is not NULL, it should contain a unique datum.

This column may be automatically updated. (See the description of the PREGS table above.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

REPRO_NOTES (Textual NOTES about REPROduction)

This table records textual notes made by field observers about a female's reproductive status. It contains one row for every date on which such a note was recorded, per female.

Most of the data related to a female's reproduction is recorded systematically by observers and stored in the other tables in this section. In addition to those data, observers occasionally record miscellaneous notes or comments related to a female's reproductive state. Those notes are recorded in this table.

This table only records notes about female reproduction; the Sname must be female in BIOGRAPH.

All notes made about a female on a single day are recorded in a single row; every Sname-Date pair must be unique.

Reproductive notes can only be recorded while the female is alive and under observation; the Date must be between the female's Entrydate and Statdate, inclusive.

It is rare but possible for a note to be recorded before a female reaches sexual maturity. The system will return a warning for rows that are before a female's Matured date, or for rows with females who do not appear in MATUREDATES at all.

Usually, if an observer took the time to write a note about a female, then they also will have recorded the size and/or color of her paracollosal skin. The system will return a warning if a female does not have a row in SEXSKINS whose Date matches the note's Date.

Tip

The SEXSKINS_REPRO_NOTES view is useful for simultaneous uploading of data to this table and to SEXSKINS.

Column Descriptions

RNId (Reproductive Note Identifier)

A unique identifier for the note. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Sname

The BIOGRAPH.Sname of the related individual.

This column may not be NULL.

Date

The date that this note was recorded.

This column may not be NULL.

Note

The text of the note.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SEXSKINS (Sexskin Turgesence Measurements)

This table records information on the females' sexskins, including size and/or color. It contains one row for every recorded observation of each female's sexskin.

Babase requires sexskin measurements be associated with sexual cycles (CYCLES) in accordance with the rules described in the Sexual Cycle Determination section.

Caution

Because sexskin measurements must be related to a female's sexual cycle (a CYCLES row), her Mdate, Tdate, and Ddate sexual cycle events (her CYCPOINTS rows) must be updated before sexskin information may be entered.

Tip

Use the CYCLES_SEXSKINS, SEXSKINS_CYCLES, or SEXSKINS_REPRO_NOTES views to maintain this table.

Note

The checks that compare all the sexskins of a particular cycle raise their errors immediately when the error is a result of changes made directly to the SEXSKINS table. But, should an error condition be created as a result of automatic shifting of sexskins between cycles due to changes to the sexual cycle dates (See CYCPOINTS) the errors are not immediately reported.

Tdates normally occur at some point during the transition from sexskin Size 0 to Size 1, but can occur during the transition from sexskin Size 0 to Size 5. Measurements larger than 5 cannot come on or before the Tdate of the cycle. The system will generate a warning when there is sexskin measurement larger than 1 before the Tdate. The Tdate of a cycle must be after the dates of all the cycle's sexskin measurements of zero that precede the earliest 1 or greater measurement occurring in the cycle.

A Ddate occurs when the sexskin begins to deturgesce. The Ddate of a cycle must be after the last measurement before the largest measurement of the cycle.[83] The system will report a warning when Ddates occur after sexskin turgesence has begun to subside -- Ddates after the first measurement following the largest sexskin measurement(s) of the cycle.

Sexskin turgesence normally begins after menses, so sexskin measurements (the Size) before the related cycle's Mdate cannot be larger than 0. When the Size is greater than 0 and there is no Mdate in the sexual cycle to which the SEXSKINS row is assigned, the system will generate an error unless the sexual cycle's Tdate falls on the individual's MATUREDATES.Matured date and the maturity date is an ON date[84], or the cycle is the first after a pregnancy (the Cid is a PREGS.Resume value), or the cycle's first CYCPOINTS row after a (CYCGAPS) gap is 30 or fewer days after that gap's end date. In the latter case the system will generate a warning. The sexskin measurement on the Mdate cannot be larger than 1, unless the Mdate is also a Tdate in which case the measurement cannot be larger than 5. The system will generate a warning when the sexskin measurement on the Mdate is larger than 0.

In constrast with the Size column, the Color column has no rules governing which values are allowed during different stages of a cycle.

Sexskin rows associated with one cycle must not be contemporaneous with Mdates, Tdates, Ddates, or sexskin turgesence observations related to a different cycle. All of the SEXSKINS Date values associated with a particular cycle must be later than the Mdate, Tdate, and Ddate of the previous cycle and earlier than the Mdate, Tdate, and Ddate of the succeeding cycle. There must not be any overlap of the cycles' sexskin measurement dates, over the time period from a cycle's earliest sexskin measurement date to its latest, between the sexskin measurement dates of a female's different cycles.

Sexskin observations cannot occur during gaps in observation. That is, each row's Date cannot be during any of the individual's gap periods in CYCGAPS. However, there is an exception: sexskin observations are allowed on the date of "point" observations in CYCGAPS.

The combination of Sname, from the associated CYCLES row, and Date must be unique.

The combination of Date and Cid must be unique.

Usually the observer records both the size and the color on a date, but occasionally they might only one record one and not the other. Because of this, the system allows either of the Size and Color columns to be NULL, but will also return a warning in this case. It is an error if both of those columns are NULL.

Column Descriptions

Sxid (Sexskins IDentifier)

A unique integer which identifies the SEXSKINS row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Cid (Cycle IDentifier)

The CYCLES identifier associated with the sexskin measurement. This is a Cid from the CYCPOINTS table. This column can be used to retrieve the Sname of the female that was measured as well as all other data collected on the cycle.

This column is automatically assigned by the system. Although some (arbitrary) cycle must be associated with the SEXSKINS row upon insert in order to relate the row to a female, the system always uses the Sexual Cycle Determination rules to re-assign the row to the appropriate cycle.

This column may not be NULL.

Date

The date of the observation. This date must be after the individuals Birth date. The date must not be after the individual's Statdate. This column may not be NULL.

Size

This column contains a number indicating the size of the sexskin in a metric with units that are integers, with the exception that 0.5 value is allowed, ranging from 0 through 20, inclusive.

This column may be NULL, but only when the Color is not NULL.

Color (ParaCallosal Skin color)

A PCSCOLORS.Color code indicating the observed paracallosal skin color.

This column may be NULL, but only when the Size is not NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

Social and Multiparty Interactions

ALLMISCS (Ad-libitum sample data)

One row for every unstructured data collection event recorded during all-occurrences protocols. The ALLMISCS row containing data collected during a particular sample is related to the SAMPLES row representing the sample. Samples do not have a fixed number of related rows on ALLMISCS, any particular sample may have one, none, or many. Further information may be found on SAMPLES.

A variety of ad-libitum data may be collected during sample data collection. Some of these ad-libitum data can be placed in the INTERACT_DATA and POINT_DATA tables, in which case ALLMISCS is not involved. The data that does not conform to the design of INTERACT_DATA and POINT_DATA is kept in the ALLMISC table.

Note

Consortships recorded as ad-libitum data during focal point sampling are not stored on INTERACT_DATA because INTERACT_DATA requires that consortships have a starting and an ending time and data collected during focal point sampling is without duration. Such consortship data are stored as an ALLMISCS row. Babase presumes that all consortships are recorded systematically during the day on paper and entered into Babase and so it is not necessary to attempt to place ad-libitum consortship data recorded during focal sampling into INTERACT_DATA. . Consortship data are collected during focal samples in order to note whether focal animals are engaged in consortships during a particular sample, and not to record the consortship per se.

Note

Mounts involving the focal individual during all-occurrences sampling are recorded both in the focal sample data and on the paper field ad-libitum records. Consequently, to avoid duplicates in INTERACT_DATA, Babase stores the mounts recorded in the focal data in the ALLMISCS table, but not the INTERACT_DATA table. Mounts in the ALLMISCS table are therefore redundant and may be ignored.

Warning

Babase does the same thing with ejaculations recorded in the focal data as it does with mounts: it records them in ALLMISCS rather than INTERACT_DATA. However, the protocol says nothing about ejaculations occurring during all-occurrences sampling. Anyone researching ejaculations will need to investigate this further.

For further information regarding the information collected see the Amboseli Baboon Research Project Monitoring Guide. For further information regarding which ad-libitum data winds up in ALLMISCS see the Protocol for Data Management: Amboseli Baboon Project. For further information on the structure of the ad-libitum text that is eventually stored in ALLMISCS, see the documentation for the focal sampling data collection program, or see the Amboseli Baboon Research Project Monitoring Guide if the focal sampling data were handwritten.

The combination of Sid and Time must be unique.

Almid (Allmiscs IDentifier)

A unique integer which identifies the ALLMISCS row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Sid (Sample Identifier)

The focal point data set in which the data were collected. (See SAMPLES.Sid.)

Atime (time)

The time the ad-libitum data were taken. This column stores the time using a data type having a precision of one second but the precision and accuracy of the data values are dependent upon the focal data collection system's timekeeping, the operator, and the protocol and is surely not one second. Consult the Amboseli Baboon Research Project Monitoring Guide.

The time may not be before 05:00 and may not be after 19:00.

Txt (unstructured Text)

The unstructured ad-libitum information collected.

At present the text in this column actually does have some structure[85] but appears in ALLMISCS because Babase contains no other place suitable for the storage of the data. The text begins with a one letter code followed by a comma. The allowed one letter codes and their meaning are:

C

Consortship. This is redundant information. Because consortships happen over time these consortships should always also be independently recorded and therefore independently entered into INTERACT_DATA and PARTS.

U

Unknown. This was once reserved for meta-information -- the field data collection team's comments on the process of data collection -- but its meaning has since become confused with the O code.

O

Other. Other information about the baboons or their environment. Its meaning has become confused with the U code.

For further information see the Amboseli Baboon Research Project Monitoring Guide.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

CONSORTS (multiparty disputes over CONSORTshipS)

One row for every MPIS row (multiparty interaction) involving a consortship. This table extends the MPIS table to include information about consortships.[86]

Mpiid (Multiparty Interaction IDentifier)

A unique integer which identifies the MPIS row -- the multiparty interaction.

Because the CONSORTS table extends the MPIS table, the two tables have a one-to-one relationship, this value also uniquely identifies the CONSORTS row.

The value of this column may not be changed.

Female

The disputed female. A BIOGRAPH.Sname of a female.

This column may be NULL when the consorted female is unrecorded.

Had

The male who consorted with the female prior to the multiparty interaction. A BIOGRAPH.Sname of a male.

This column may not be NULL.

Got

The male who consorted with the female after the multiparty interaction. A BIOGRAPH.Sname of a male.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

FPOINTS (Point data on Females)

This table contains data from focal sampling points during which the observer recorded information about the focal individual's infant, one row for each focal sampling point. Any focal sampling protocol that includes recording this kind of information is almost certainly going to require that the focal individual be a female, hence the "F" in this table's name.

Note

Despite its name, this table does not require that a focal individual be of any particular Sex. Requirements like those are set and enforced by the STYPES.Sex column.

Whether or not a focal sample is allowed to have data in this table is determined by the sample's SAMPLES.SType and that SType's related STYPES.Has_FPoints value. See the STYPES table for more information.

Each FPOINTS row is connected to a POINT_DATA row via the Pntid column. That is, each row in this table must have exactly one row in POINT_DATA with the same Pntid. The system will report a warning for those POINT_DATA rows that belong to a sample whose SType's related Has_FPoints is TRUE but which do not have a related FPOINTS row. While every FPOINTS row must have a related row in POINT_DATA, not every POINT_DATA row has a related FPOINTS row.

Tip

Because every FPOINTS row must have a related POINT_DATA row, when entering a point the POINT_DATA row must be entered before the FPOINTS row.

Pntid (Point Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row. Pntid links FPOINTS with POINT_DATA in a one-to-one manner.

This column may not be NULL.

Kidcontact (female/infant position)

The position of the infant with respect to the focal female. The legal values for this column are defined by the KIDCONTACTS support table.

This column may not be NULL.

Kidsuckle (Suckling activity)

The suckling activity of the infant. The legal values for this column are defined by the SUCKLES support table.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

INTERACT_DATA (Interactions)

This table contains a row for every recorded interaction between animals, including all-occurrence data taken during focal point samples but excluding multiparty interactions (MPI_DATA). Each row records when the interaction occurred. Further information on the interaction is stored elsewhere, notably PARTS. Each interaction in INTERACT_DATA is represented as though it occurs between two ordered individuals designated actor and actee -- thus resulting in two rows in the PARTS table.

Warning

The INTERACT view should always be used in place of this table. (See Views for the rationale.) INTERACT is an extension of this table which may be useful. It is identical to INTERACT_DATA but is extended with alternate representations of dates and times.

Tip

The ACTOR_ACTEES view provides a way to view interactions as a single rows.[87]

The actual Date of an interaction is usually known. However, in some cases only the year and month of an interaction were recorded without specifying the day. The specificity (or lack thereof) of the Date is indicated by the boolean Exact_Date column. When Exact_Date is FALSE, this indicates that the year and month of the Date are known, but not the day. In these cases, the Date must be the first day of the month.

The Date of the interaction is constrained by various related dates of its participants, as follows:

  • The Date cannot be before a participant's Entrydate, with one exception. When Exact_Date is FALSE the Date can be before the participant's Entrydate, but the month and year of the Date cannot be before those of the participant's Entrydate.

  • The Date cannot be after a participant's Statdate.

  • A female may not participate in a mount, consortship, or ejaculation interaction before menarche (MATUREDATES.Matured). When Exact_Date is FALSE the Date may be before her Matured date, but the month and year of the Date cannot be before those of her Matured date.

  • A male may not participate in a mount, consortship, or ejaculation interaction before 4 years of age[88]. When Exact_Date is FALSE the Date may be before he reaches that age, but the month and year of the Date cannot be before the month and year in which he reaches that age.

  • The system will return a warning when the Date is before the LatestBirth of either participant in the interaction. When Exact_Date is FALSE, the Date may be before the LatestBirth but the month and year of the Date cannot be before the month and year of the LatestBirth.

Many rules surrounding INTERACT_DATA's values are closely tied to the project's data collection protocols. There are two sorts of data collected on behavioral interactions: all-occurrences data and ad-libitum data. All occurrences data are collected only during focal animal samples. They are data on all the occurrences of a particular behavior or interaction during a given time interval and/or involving a participating focal individual.[89] All occurrences data will always have an INTERACT_DATA.Sid that is not NULL. Ad-libitum data are data that are collected opportunistically at the will of the observer; we do not assume that ad lib data capture all the occurrences of a given behavior. Ad-libitum data, which generally are not collected as part of focal animal samples, usually have a NULL Sid value (only those collected during a focal animal sample have a non-NULL Sid). Some sorts of interactions are only collected during focal sampling and not as ad libitum data outside of focal samples. Approach (ACTS.Class = P), and request to groom (ACTS.Class = R) are these interactions; they are only collected during all-occurrences sampling and must have a non-NULL Sid. Although consortship and mount[90] data are collected as all-occurrences data during focal point samples, these data are also collected, simultaneously and in more detail, in ad libitum notes. Consequently, they appear in Babase as ad libitum data in INTERACT_DATA, not as all occurrences data, and consortships (ACTS.Class = C), mounts (ACTS.Class = M), and ejaculation (ACTS.Class = E) rows always have a NULL Sid.

Tip

An individual's all-occurrences interactions can be distinguished from ad-libitum data by using the Sid column to reference SAMPLES to see if the individual is the focal of an all-occurrences sample. An example is presented in Appendix B.

Note

INTERACT_DATA rows having a related SAMPLES row, having a non-NULL Sid[91], will automatically have an Observer value equal to the value in the related SAMPLES.Observer column -- the system automatically synchronizes observer values between related INTERACT_DATA and SAMPLES rows. Such automatically assigned values cannot be changed. To change the observer the SAMPLES.Observer column must be changed.

Caution

Care must be taken when breaking a relationship between INTERACT_DATA and SAMPLES, when setting INTERACT_DATA.Sid to NULL. The automatically assigned INTERACT_DATA.Observer value may no longer be correct and so may require manual adjustment.

An INTERACT_DATA row with a NULL Sid and a non-NULL Observer cannot be updated with a non-NULL Sid unless the Observer value is also set to NULL -- manually assigning an observer to an ad-lib interaction precludes relating the interaction to a focal point sampling period. Setting Observer to NULL when changing Sid to a non-NULL value causes the system to automatically assign the correct value to Observer -- causes the system to automatically synchronize observers.[92] Likewise, an INTERACT_DATA row with a non-NULL Sid cannot be inserted unless the Observer value is either NULL or matches that of the related SAMPLES.Observer value -- new focal sample interactions must be consistent with respect to the observers recorded in the INTERACT_DATA and SAMPLES tables. When an INTERACT_DATA row with a non-NULL Sid and a NULL Observer value is inserted then the Observer value is automatically updated with the related SAMPLES.Observer value -- again, the observer associated with the interaction is automatically brought into sync with the focal sample.

INTERACT_DATA encodes interaction time and duration by storing the start and stop times of the interaction. The columns Start and Stop are used for this purpose. Consortships may have a NULL in either the Start or the Stop time when the respective value is unknown, otherwise the Start time must precede the Stop time. Ad-libitum sample agonism and grooming interactions (ACTS.Class values of A and G respectively) must have a NULL in both the Start and Stop columns. All-occurrences agonism, grooming, approach (ACTS.Class = P), and request to groom (ACTS.Class = R) interactions must have non-NULL Start times that equal Stop times. Start always equals Stop for mounts (ACTS.Class = M) and ejaculations (ACTS.Class = E).

The columns of this table that contain times, Start and Stop, are stored using a data type that has a precision of 1 second. The Amboseli Baboon Research Project Monitoring Guide must be consulted regarding the precision and accuracy of these data. It is expected that ad-libitum datum is entered with a 1 minute precision.[93] Consequently the seconds portion of the time values must always be 0 when Sid is NULL. All-occurrences interaction data (Sid is not NULL) do contain seconds.[94]

When more than one observer is with a group at the same time, they are responsible for making sure that each interaction is only recorded in only one notebook, not duplicated across multiple observers' notebooks (see the Amboseli Baboon Research Project Monitoring Guide for more details). For this reason, it should be emphasized that the Observer column only indicates who recorded this row's interaction (when known), not who actually saw it.

The system will report a warning for interactions which occur between individuals who are not in the same group on the date of the interaction.

Column Descriptions

Iid

A positive integer that uniquely identifies the interaction. This number is assigned by the system. This column must not be NULL.

Sid (all-occurrences Sample IDentifier)

The origin of the data. When the interaction data were collected during all-occurrences sampling this column holds a SAMPLES.Sid identifying the all-occurrences sample during which the data were collected, otherwise this column is NULL.

Act (kind of interaction)

A code indicating the kind of interaction. The ACTS support table defines the legal values for this column.

Note

Although Act contains ACTS.Act values, it is often the broader ACTS.Class classification that is of interest.

This column may not be NULL.

Date

The date on which the interaction took place. This column may not be NULL.

Start (interaction Starting time)

The time the interaction began or, in the case of all-occurrences data, the time the interaction was recorded in the field.

The data type of this column has a 1 second precision. The precision and accuracy of the data itself is dependent upon the protocol and the operator and is almost surely not 1 second. Consult the Amboseli Baboon Research Project Monitoring Guide.

The time may not be before 05:00 and may not be after 20:00.

This column may be NULL.

Stop (interaction ending time)

The time the interaction stopped or, in the case of all-occurrences data, the time the interaction was recorded in the field.

The data type of this column has a 1 second precision. The precision and accuracy of the data itself is dependent upon the protocol and the operator and is almost surely not 1 second. Consult the Amboseli Baboon Research Project Monitoring Guide.

The time may not be before 05:00 and may not be after 20:00.

This column may be NULL.

Observer

The OBSERVERS.Initials of the person who recorded this interaction.

This column may be NULL.

Handwritten

A boolean indicating whether or not the observer recorded the interaction by hand[95] . This value is TRUE if yes, FALSE if no.

This column may not be NULL.

Exact_Date

A boolean indicating whether or not the Date is the specific date of the interaction.

This column defaults to TRUE, and cannot be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

MPIS (Multiparty InteractionS)

One row for each collection of multiparty interactions.

Multiparty interactions are recorded as an ordered series of dyadic interactions. Each complete series has a single MPIS row in the database.

Note

This is a separate data set from the dyadic interactions recorded in INTERACT_DATA and related tables. Interactions appearing there do not appear in the multiparty interaction data, or vice versa.

The date of the multiparty interaction must be between the Entrydate and Statdate, inclusive, of all the participants. The system will return a warning for each participant whose LatestBirth is after the date of the interaction.

The two participants in the dyadic interactions must be different individuals, the two MPI_PARTS.Snames must be different.

The Context column must be NULL when the Context_type value is N, no context.

The Context_type column must be C (Consortship) and the Context column must be NULL when a related CONSORTS row exists. The system will generate a warning when the Context_type column is C and there is no related CONSORTS row.

Mpiid (Multiparty Interaction IDentifier)

A unique integer which identifies the MPIS row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Date

The date the interactions occurred.

Context_type

Multiparty interactions may be categorized by the context in which they occur. This column identifies the context of the multiparty interaction.

The legal values of this column are defined by the CONTEXT_TYPES support table. This column may not be NULL.

Context (Unstructured text)

Unstructured text describing the context in which the multiparty interaction occurred.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

MPI_DATA (Multiparty dyadic Interactions)

Multiparty interactions are recorded as collections of individual dyadic interactions. This table contains one row for every dyadic interaction of a multiparty interaction collection. Each interaction is represented as though it occurs between two ordered individuals designated actor and actee -- these individuals are recorded in the MPI_PARTS table. The dyadic interactions within the collection are time-wise sequenced. Two rows may have the same sequence number (Seq), indicating that the two interactions occurred simultaneously.

Tip

The MPI_EVENTS view provides a convenient way to view multiparty interactions as single rows.

Caution

Babase records little in the way of causality among the various interactions collected together under the multiparty interaction collection umbrella. At the time of this writing the data protocols require that the initial interaction is a kind of agonism or a kind of help request, so that can be considered causal of the remaining interactions. However there is nothing, other than time-wise sequencing, linking particular requests for help with aid supplied. As a result it is impossible, in the general case, to associate help supplied with help requested. For example, an individual may request help twice, from two different individuals, and then receive help from an third individual. The columns recording the results of help requests (Helped and Active) must therefore be used with caution, as must any attempt to correlate the specifics of help given with help requested.

Multiparty interactions which occur simultaneously must have the same MPIAct values.

The system will generate a warning when more than two MPI_DATA rows, sharing a Mpiid, have the same Seq value -- when there are more than two dyadic interactions occurring simultaneously.

The first interaction of a multiparty interaction (those with a Seq of 1) must be an agonism or a request for help, the MPIAct value must be that of an MPIACTS row having a Kind value of A or R.

The first interaction of a multiparty interaction collection is expected to be a single dyadic interaction unless otherwise allowed by the MPIACTS table -- the first interaction of a multiparty interaction collection may only occur simultaneously with another interaction, the two dyadic interactions both having a Seq of 1, when all of these initial interactions have MPIAct values that relate the rows to MPIACTS rows having TRUE Multi_first values.

The Helped and Active columns are meaningful when the MPI_DATA row records a request for help.[96] These columns must be NULL when the MPI_DATA row does not record a request for help, otherwise they must not be NULL. The system will generate a warning when the Helped column indicates that no help was given but there are subsequent interactions which record help being given (where the MPIAct values have H MPIACTS.Kind values) to the individual who requested help. The system will generate a warning when Active is TRUE and there are no subsequent AH interactions where the help-requestee is the recipient of help in the same multiparty interaction collection. The system will generate a warning when Helped is true and Active is FALSE and there are no subsequent PH interactions where the help-requestee is the recipient of help in the same multiparty interaction collection.

Mpidid (Multiparty Interaction Data IDentifier)

A unique integer which identifies the MPI_DATA row, and thereby the interaction the row records.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Mpiid (Multiparty Interaction IDentifier)

A number identifying the multiparty interaction collection (MPIS) of which the MPI_DATA interaction is a member.

This column cannot be changed and must not be NULL.

MPIAct (Multiparty Interaction Act code)

This column records the kind of interaction which took place. The legal values for this column are defined by the MPIACTS support table.

This column may not be NULL.

Seq (Sequence)

The first interaction of each multiparty interaction collection has a Seq value of 1, the second a value of 2, etc. The system will report an error if the Seq does not begin with 1 or is not contiguous.

Note

The Seq values need not be unique, per Mpiid. Duplicate sequence numbers are used to indicate simultaneous interactions, as would happen if, e.g., 2 individuals aggressed against 1.

This column may not be NULL.

Helped

This column indicates whether help was given, by the individual from whom help was requested, in response to a request for help. Helped must be FALSE when help was requested from an unknown individual.[97]This column contains meaningful information only for those MPI_DATA rows which record requests for help. (See above.)

This column is TRUE when help was given and FALSE when no help was forthcoming.

This column may be NULL.

Active

This column indicates whether help given was active or passive. It contains meaningful information only for those MPI_DATA rows which record requests for help. (See above.)

This column is TRUE when the help supplied was active and FALSE when either the help supplied was passive or when no help was supplied. This column is NULL when the MPIAct value represents an action other than a request for help.

Caution

When looking for help requests that received passive help always check the Helped value to be sure that help was actually received.

This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

MPI_PARTS (Multiparty Interaction PARTicipantS)

This table contains records of participants in the interactions which make up a multiparty interaction collection (MPIS). Each interaction is represented as though it occurs between two individuals designated actor and actee. Interactions between multiple individuals are broken down into interactions between pairs according to rules described in the protocols. Therefore, this table should contain two rows for every record of an interaction (for every row in MPI_DATA), one row to record the actor, and one to record the actee. Rules for classifying individuals as actor or actee are documented below in the description of the Role column.

Tip

The MPI_EVENTS view provides a convenient way to view multiparty interactions as single rows.

Warning

Every MPI_DATA row should be related to exactly two MPI_PARTS rows, otherwise it is an error. However, the system allows this condition to exist. It is presumed that such an error condition will exist for only as long as it takes to enter a complete set of data. The system will report those cases where there are not exactly two MPI_PARTS rows for every MPI_DATA row.

Caution

The data integrity rules require that the MPI_DATA row be entered before the 2 MPI_PARTS rows.

Either the Sname or the Unksname column must be NULL, but not both.

The actor and the actee of an interaction, when specified as Snames, must not be the same individual.

Mpipid (Multiparty Interaction Participant IDentifier)

A unique integer which identifies the MPI_PARTS row, and thereby the participant in the interaction the row records.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Mpidid (Multiparty Interaction Data IDentifier)

Multiparty interaction identifier. This column holds the Mpiid value of the row on the MPI_DATA table containing further information on the interaction in which the animal is a participant. It can be used to retrieve the other information recorded on the multiparty interaction. There must be a row in MPI_DATA with an Mpiid of this value. This column cannot be changed and may not be NULL.

Sname

A three-letter code (an id) that uniquely identifies a particular animal (an Sname) in BIOGRAPH. This code can be used to retrieve information, such as the maternal group of the animal, from BIOGRAPH or other places where the animal's three-letter code appears.

This column must not be NULL when the participating individual is precisely identified and NULL otherwise.

Unksname (Unknown neighbor code)

The nature of the problem when one of the participants in the interaction cannot be precisely identified. The legal values of this column are defined by the PARTUNKS support table.

This column must be NULL when the participating individual is precisely identified and not NULL otherwise.

Role

This column designates whether the row records the actor or the actee of the interaction. The two possible values are:

The MPI_PARTS.Role Values
CodeMnemonicDefinition
RActorThe actor is usually the one performing the act. For the agonism data, the individual that is the winner (does not perform a submissive behavior) is the actor. For help requests, the individual that is requesting the help is the actor. For help supplied, the individual supplying the help is the actor. For grooming data, the individual that is grooming is the actor. And so forth.
EActeeThe actee is usually the one that is the recipient of another animal's attentions. For the agonism data, the individual that is the loser (performing a submissive behavior) is the actee. For help requests, the individual of whom help is requested is the actor. For help supplied, the individual to whom the help is supplied is the actor. For grooming data, the individual that is groomed is the actee. And so forth.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

PARTS (Participants in interactions)

This table contains records of the participants in observed interactions between animals. Each row in the table records a participant. Each interaction is represented as though it occurs between two individuals designated actor and actee. Interactions between multiple individuals are broken down into interactions between pairs according to rules described in the protocols. Therefore, this table should contain two rows for every record of an interaction (for every row in INTERACT_DATA), one row to record the actor, and one to record the actee. Rules for classifying individuals as actor or actee are documented below in the description of the Role column.

Caution

Every INTERACT_DATA row must be related to exactly 2 PARTS rows, excepting those INTERACT_DATA rows that are associated with ad-lib focal point sampling -- those that have non-NULL Sid values. Ad-lib interactions collected during focal point sampling are allowed to have only one participant, but only when that participant is the focal individual. So that data can be entered the system allows these error conditions to exist while a transaction is in progress. These conditions are validated on transaction commit.

Caution

The data integrity rules require that the INTERACT_DATA row be entered before the 2 PARTS rows.

Tip

The utility in the PARTS table, as opposed to having single rows for interactions as the ACTOR_ACTEES view does, is in writing database queries that search for interaction participants. It is easy to use PARTS to search for a participant without knowing whether the participant is the actor or the actee. The same is not true of the ACTOR_ACTEES view.

Note

It is easy to produce the ACTOR_ACTEES view from INTERACT_DATA and PARTS, but the reverse would not be true. This is why the underlying database representation is as it is and not the reverse.

The actor and the actee of an interaction must not be the same individual.

Partid (Parts IDentifier)

A unique integer which identifies the PARTS row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Sname

A three-letter code (an id) that uniquely identifies a particular animal (an Sname) in BIOGRAPH. This code can be used to retrieve information, such as the maternal group of the animal, from BIOGRAPH or other places where the animal's three-letter code appears. This column may not be NULL.

Role

This column designates whether the row records the actor or the actee of the interaction. The two possible values are:

The PARTS.Role Values
CodeMnemonicDefinition
RActorThe actor is usually the one performing the act. For grooming data, the individual that is grooming is the actor. For the agonism data, the individual that is the winner (does not perform a submissive behavior) is the actor. For mounts, consortships, and ejaculations, the male is the actor.
EActeeThe actee is usually the one that is the recipient of another animal's attentions. For grooming data, the individual that is groomed is the actee. For the agonism data, the individual that is the loser (performing a submissive behavior) is the actee. For mounts, consortships, and ejaculations, the female is recorded as actee.

This column may not be NULL.

Iid (Interaction identifier)

Interaction identifier. This column holds the Iid value of the row on the INTERACT_DATA table containing further information on the interaction in which the animal is a participant. It can be used to retrieve the other information recorded on the interaction. There must be a row in INTERACT_DATA with an Iid of this value. This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

POINT_DATA (Point observation data)

One row for every point observation collected on a focal individual during a sampling interval. When, for whatever reason, there are no point data collected on the focal individual at the turn of the minute, there is no row on POINT_DATA. The position of the points within the sample, Min value, may therefore contain gaps -- missing numbers. The missing numbers are points taken when the focal animal is out of sight or the point was missed for whatever reason. Babase represents the observational period during which a sample is collected as a SAMPLES row.

Warning

Always use the POINTS view in place of this table (see Views for the rationale.) It contains additional computed columns which may be of interest and is guaranteed to remain consistent in future Babase releases.

A POINT_DATA row must contain a Foodcode when the Activity column indicates the focal is feeding, otherwise Foodcode must be NULL.

Consistency is enforced with respect to time taken to collect the sample and the number of point observations. The Min value must not be larger than the Mins of the corresponding sample.

Validation of the Activity and Posture columns partially depends on the row's related SAMPLES.SType. The STYPES_ACTIVITIES and STYPES_POSTURES tables define which SType values can be used with which Activity and Posture values, respectively.

Changing the Sid risks data integrity issues that are not easily prevented with simple data checks, especially with the calculating of Minsis. Because of this, the Sid can only be changed by an administrator or superuser.

Pntid (Point observation Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and is used in other tables to refer to particular points.

This column may not be NULL.

Sid (Sample Identifier)

The SAMPLES.Sid of the focal sample during which this point was collected.

This column may not be NULL.

Min (Point number within the sample)

The ordinal number of the point within the sample. The first point in the sample has a Point value of 1, the second a Point value of 2, etc. Note that these numbers need not be contiguous since some points are lost during data collection. (See above.)

This column may not be NULL.

Ptime (Point observation Time)

The time the point was recorded. This column stores the time using a data type having a precision of one second. The precision and accuracy of the data values are dependent upon the focal data collection system's timekeeping, the operator, and the protocol and is surely not one second. Consult the Amboseli Baboon Research Project Monitoring Guide.[98]

Warning

It is unlikely that the researcher is interested in this data because, as of January 2006, the field protocols require no particular relationship between the time of the point and the time the observer records the data.

The time may not be before 05:00 and may not be after 19:00.

This column may not be NULL.

Activity

The ACTIVITIES.Activity of the individual when the point was taken.

Some values from ACTIVITIES may be restricted, based on the sampling protocol. See STYPES_ACTIVITIES for more information.

This column may not be NULL.

Posture

The POSTURES.Posture of the individual when the point was taken.

Some values from POSTURES may be restricted, based on the sampling protocol. See STYPES_POSTURES for more information.

This column may not be NULL.

Foodcode (May be NULL)

Food item eaten when the point was taken, if any. NULL when no food items are eaten. The legal values for this column are determined by the FOODCODES support table.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NEIGHBORS (point observation data on Neighbors)

The neighbors of the focal individual are recorded during point sampling. NEIGHBORS contains one row for every neighbor recorded during a point data collection event (minute).

When no neighbor is observed for a particular neighbor type (Ncode), no new rows are added to this table. This is different from how an "unknown" neighbor is recorded, as discussed below.

A focal individual's neighbors are not always recognizable or for some other reason do not always have a row in BIOGRAPH. For this reason NEIGHBORS contains two different columns used to identify the neighbor, Sname and Unksname. The first for recording known neighboring individuals and the second for recording unknown neighboring individuals. One and only one of these columns must contain a value, the other column must then contain NULL.[99]

The system will report a warning when the neighbor is not in the same group as the focal individual.

The neighbor must be alive and in the study population on the day of the sample (SAMPLES.Date, as discovered via POINT_DATA.Sid) -- the day of the sample may not be before the neighbor's Entrydate, and may not be after the neighbor's Statdate.[100] This means that the demographic information for a particular time interval must be entered into Babase before the sample data for that interval.

The system will report a warning when the related Date is before a neighbor's LatestBirth.

Each point observation (Pntid value) may have at most one NEIGHBORS row of a given neighbor classification (Ncode value.) The combination of Pntid and Ncode must be unique.

The NCODES table places restrictions on which individuals can be neighbors. One effect of this is to limit the order in which NEIGHBORS may be added to and deleted from Babase.

The sample's focal individual (SAMPLES.Sname, as discovered via POINT_DATA.Sid) may not be her own neighbor.

The combination of Pntid and Sname must be unique.

Validation of the Ncode column partially depends on the row's related SAMPLES.SType. The STYPES_NCODES table defines which SType values can be used with which Ncode values.

Nghid (Neighbors IDentifier)

A unique integer which identifies the NEIGHBORS row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Pntid (Point Identifier)

The POINT_DATA.Pntid of the point in which this neighbor was recorded. Further information related to the entire sample must be found by using POINT_DATA.Sid, the sample identifier.

This column may not be NULL.

Sname

The BIOGRAPH.Sname of the neighbor.

This column must be NULL when the neighbor is an unknown individual or otherwise not in BIOGRAPH, i.e. when the Unksname is not NULL.

Ncode (code classifying the kind of neighbor)

The NCODES.Ncode describing the kind of neighbor represented in the row.

Some values from NCODES may be restricted, based on the sampling protocol. See STYPES_NCODES for more information.

This column may not be NULL.

Unksname (Unknown neighbor code)

The UNKSNAMES.Unksnamenature code recorded when the neighbor cannot be precisely identified[101].

This column must be NULL when the neighboring individual is precisely identified, i.e. when the Sname is not NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SAMPLES (all-occurrences Samples)

One row for every continuous period of time during which data are collected at regular intervals on a specific focal individual. Although the field protocols center around collecting data primarily stored in the POINT_DATA, FPOINTS, and NEIGHBORS tables, other information — normally collected ad-libitum during data collection — may be collected as well and are also associated with the specific sample. Further, a sample is allowed to contain no (animal) information.[102] Each SAMPLES row contains the information pertaining to all the data collected during the sample.

The date of the sample must not be before the focal individual's Entrydate, nor after the focal individual's Statdate. Therefore the demographic data pertaining to any particular time period must be entered into Babase before the sample data collected during that time period.

The system will return a warning when the Date is before the focal individual's LatestBirth.

The number of point observations occurring during the sampling interval (Minsis) must be less than or equal to the total number of minutes elapsed (Mins) during the sampling interval.[103]

Other data integrity checks may be performed on a SAMPLES row — and on related rows in POINT_DATA, FPOINTS, and NEIGHBORS — depending on the data collection protocol used in the focal sample. Each sample's protocol is indicated by its SType, and the details of these other data integrity checks are defined in the STYPES table.

The system will report a warning when the group (Grp) of the focal individual, as recorded on SAMPLES, is not the same as the group MEMBERS records for the focal individual on the date of data collection.

One of the participants in all interactions collected during the sample (see INTERACT_DATA.Sid and PARTS) must be the focal individual.

Focal sampling protocols usually designate how many minutes should elapse in each sample, but for various reasons samples collected in the field may last for fewer than the expected number of minutes. Regardless of the expected number of elapsed minutes in a sample, the actual number and the number of those minutes in which a focal "point" was collected are recorded in the Mins and Minsis columns, respectively. Both of these integer columns cannot be less than zero. Their maximum allowed value depends on the row's SType and related STYPES.Max_Points.

The data collected during a focal sample are complex. To assist the observer with recording it all, these data are often though not always collected with an electronic device — e.g. a handheld phone/tablet — and specialized data collection software. This table uses three columns to record details about the hardware and software — or lack thereof — used for data collection: Collection_System, Programid, and Setupid. The Collection_System indicates the hardware used (e.g. "Samsung Tablet B", "Psion unit 6", "Pen and paper"). The Programid indicates the software that the hardware used, and the Setupid indicates any special configuration file(s) that the software used.

Tip

If a focal sampling arrangement has no particular need for one of these columns — e.g. samples recorded with a pen and paper likely won't need Programid nor Setupid — do not set that column to NULL. Collection_System isn't allowed to be NULL, and both Programid and Setupid should only be NULL when their true values are unknown. Instead, add a row to the column's respective support table that essentially means "N/A" and use that value in this table.

Column Descriptions

Sid (Sample Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and is used in other tables to refer to a particular sample.

This column cannot be changed and must not be NULL.

Date

The date of the focal sample[104].

This column may not be NULL.

Stime (time sampling began)

The time the sampling began.

This column may not be NULL.

Grp (Group observed)

The GROUPS.Gid of the focal individual's group, recorded at the time of data collection by the observer.

This column may not be NULL.

Sname

The BIOGRAPH.Sname of the focal individual.

This column may not be NULL.

SType (Sample Type)

The STYPES.SType of the data collection protocol used in this focal sample.

This column may not be NULL.

Mins (Minutes elapsed)

The total number of minutes which actually elapsed while the sample was collected.

This column may not be NULL.

Minsis (Minutes In Sight)

The actual number of point observations (once per minute) recorded during the sample.

Babase maintains this value automatically by counting the number of POINT_DATA rows associated with the sample. If this value is manually set, Babase compares the supplied value with the value it computes and issues an error if the two do not match.

This column may not be NULL and must be less than or equal to this row's Mins.

Observer

The OBSERVERS.Initials of the person who collected the sample.

This column may not be NULL.

Collection_System

The SAMPLES_COLLECTION_SYSTEMS.Collection_System indicating how the sample's data were collected.

This column may not be NULL.

Programid

The PROGRAMIDS.Programid of the software ("program") used on this row's device to collect this sample's data.

This column may be NULL, indicating that this information is unknown.

Setupid

The SETUPIDS.Setupid representing the configuration ("setup") file(s) used by this row's software to collect the data in this sample.

This column may be NULL, indicating that this information is unknown.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

Darting

ANESTHS (Extra Sedation Administered During Darting)

ANESTHS contains one row for each time additional sedation is administered to a darted individual. If no additional sedation was administered then this table should not contain rows related to the darting.

Anesthetic cannot be administered to the same individual more than once at any given time -- the combination of Dartid and Antime must be unique.

Anesthetic cannot be administered before the individual is darted -- the Antime value cannot be before the related DARTINGS.Darttime time.

Anesthetic cannot be administered after the individual recovers from the previous dose -- the Antime value cannot be later than 2 hours after the later of the DARTINGS.Darttime time or the previous administration of additional sedation.

Tip

The ANESTH_STATS view aggregates the multiple administrations of anesthetic given during a darting and so provides a convenient way to analyze ANESTHS rows.[105]

Anesthid (Extra Sedation Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and is used in other tables to refer to a particular administration of extra sedation.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which extra sedation was administered -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

Drug (Anesthetic Administered)

Anesthetic administered to extend sedation. The legal values for this column are defined by the DRUGS support table.

This column may not be NULL.

Antime (Time when additional anesthetic was administered)

The time additional sedation was administered to the darted individual.

The time zone is Nairobi local time.

The precision of this column is 1 minute -- seconds and fractions thereof must be 0.

This column may be NULL when there is no record of what time additional sedation was administered.

Anamount (Anesthetic Amount)

The amount of anesthetic administered, in CCs.

The maximum allowed is 1.0CC. The minimum is 0. The precision allowed and accuracy are .01CC.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

BODYTEMPS (Darting Body Temperature Measurements)

BODYTEMPS contains one row for each body temperature measurement taken of a darted individual.

The temperature cannot be measured before the individual is darted or before the individual is picked up -- the Bttime value cannot be before either the related DARTINGS.Pickuptime time or[106] the Darttime time. The temperature cannot be taken after the individual has recovered from sedation - the Bttime value, when non-NULL, cannot be later than 2 hours after the later of the DARTINGS.Darttime time or the last administration of additional sedation, if any, as recorded in the ANESTHS table. A non-NULL Bttime value implies that there must be a known time of anesthetic administration -- either DARTINGS.Darttime or ANESTHS.Antime must be non-NULL.

Btid (Packed Cell Volume Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular body temperature measurement.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the body temperature measurement was taken -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

Btemp (Body Temperature)

The measured temperature in degrees Celsius to a precision of 1/10th of a degree. The minimum allowed value is 25 degrees and the maximum 45 degrees.

This column may not be NULL.

Bttime (Time of Body Temperature measurement)

The time the body temperature of the darted individual was taken.

The time zone is Nairobi local time.

The precision of this column is 1 minute -- seconds and fractions thereof must be 0.

This column may be NULL when there is no record of when the body temperature measurement was taken.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

CHESTS (Darting Chest Circumference Measurements)

CHESTS contains a row for each chest circumference measurement made of a darted individual.

Chid (Chest circumference measurement Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular chest circumference measurement.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the chest circumference measurement was taken -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

Chcircum (Chest circumference measurement)

The chest circumference measurement, in centimeters, with a precision of 1/10th of a centimeter. The minimum value allowed is 25 centimeters. The maximum value allowed is 99.9 centimeters.

Caution

The value contained in this column may have been adjusted for systematic observational bias. See the Chunadjusted column for more information.

This column may not be NULL.

Chunadjusted (Unadjusted Chest circumference measurement)

Some measurements were subject to systemic bias when taken. When this is known to have occurred the original, biased measurements are recorded in this column. When there is no known bias this column is NULL.

When non-NULL this column contains the original chest circumference measurement, in centimeters, with a precision of 1/10th of a centimeter. The minimum value allowed is 25 centimeters. The maximum value allowed is 99.9 centimeters.

Chseq (Chest circumference measurement Sequence)

A sequence number indicating the order in which the measurements were taken. The first chest circumference measurement taken during a darting has a Chseq value of 1, the second a value of 2, etc.

The system automatically re-computes Chseq values to ensure that they are contiguous and begin with 1. See the Automatic Sequencing section for further information.

Chobserver (Chest circumference measurer)

Initials of the person who performed the measurement. The legal values of this column are defined by the OBSERVERS support table.

This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

CROWNRUMPS (Darting Crown-to-Rump Measurements)

CROWNRUMPS contains a row for each crown-to-rump measurement made of a darted individual.

Tip

The CROWNRUMP_STATS view aggregates the multiple crown-to-rump measurements taken during a darting and so provides a convenient way to analyze CROWNRUMPS rows.

CRid (Crown-to-Rump measurement Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular crown-to-rump measurement.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the crown-to-rump measurement was taken -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

CRlength (Crown-to-Rump measurement)

The crown-to-rump measurement, in centimeters, with a precision of 1/10th of a centimeter. The minimum value allowed is 10 centimeters. The maximum value allowed is 99.9 centimeters.

This column may not be NULL.

CRseq (Crown-to-Rump measurement Sequence)

A sequence number indicating the order in which the measurements were taken.The first crown-to-rump measurement taken during a darting has a CRseq value of 1, the second a value of 2, etc.

The system automatically re-computes CRseq values to ensure that they are contiguous and begin with 1. See the Automatic Sequencing section for further information.

CRobserver (Crown-to-rump measurer)

Initials of the person who performed the measurement. The legal values of this column are defined by the OBSERVERS support table.

This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

DART_SAMPLES (Darting Tissue Sample Records)

DART_SAMPLES contains one row for every sample type collected in each darting.

The combination of Dartid and DS_Type must be unique.

Tip

The DSAMPLES view also shows these data, one line per Dartid. For some users, this may be a more desirable way to look at these data.

Column Descriptions

DS_Id (Darting Sample collection Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to all the samples of a particular DS_Type collected during a single darting.

This column cannot be changed. This column may not be NULL.

Dartid (Darting Identifier)

The darting event during which the indicated samples were collected -- a DARTINGS.Dartid value.

This column cannot be changed. This column may not be NULL.

DS_Type (Darting Sample Type Identifier)

The DART_SAMPLE_TYPES.DS_Type of this sample.

This column cannot be changed. This column may not be NULL.

Num

The number of samples collected of the type given in the DS_Type column.

This column may not be NULL, must be greater than zero, and must be between the DART_SAMPLES.DS_Type's corresponding DART_SAMPLE_TYPES.Minimum and DART_SAMPLE_TYPES.Maximum values, inclusive.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

DARTINGS (Baboon Darting Events)

DARTINGS contains one row for every darting of an animal when data was collected.

The combination of Sname and Date must be unique.

The individual must be alive and in the study population when darted -- the Date must be between the individual's Entrydate and Statdate, inclusive. The system will return a warning when the Date is before the individual's LatestBirth.

The system will report a warning for females darted on or after 2006-01-01 for which there is no related DART_SAMPLES row that indicates a vaginal swab collection.

The Downtime value cannot be before the Darttime value and cannot be more than 1 hour after the Darttime value.

The Pickuptime value cannot be before the Downtime value and cannot be more than 90 minutes after the Downtime value. It also[107] cannot be before Darttime and cannot be more than 90 minutes after Darttime. The system will report a warning if the Pickuptime is more than 30 minutes after the Downtime.

Dartid (Darting Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and is used in other tables to refer to a particular darting event.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Sname

A three-letter code (an id) that uniquely identifies the darted animal (an Sname) in BIOGRAPH. This code can be used to retrieve information from BIOGRAPH or other places where the animal's three-letter code appears. This column may not be NULL.

Date (Darting Date)

The date the individual was darted.

This column may not be NULL.

Darttime (Darting Time)

The time the individual was darted -- when the dart was fired. The time zone is Nairobi local time.

The time may not be before 05:00 and may not be after 20:00.

The precision of this column is 1 minute -- seconds and fractions thereof must be 0.

This column may be NULL when the time of darting is unknown.

Downtime (Time the darted individual went down)

The time the darted individual succumbed to the anesthetic. The time zone is Nairobi local time.

The precision of this column is 1 minute -- seconds and fractions thereof must be 0.

This column may be NULL when the downtime is not known.

Pickuptime (Time the darted individual was picked up by the team)

The time that the darting team picked up the anesthetized individual.

The precision of this column is 1 minute -- seconds and fractions thereof must be 0.

This column may be NULL when the pickup time is not known.

Drug (Dart Anesthetic)

Anesthetic administered by the dart. The legal values for this column are defined by the DRUGS support table.

This column may not be NULL.

Mass (Mass of the darted individual)

Mass of the darted individual, in kilograms. The precision of this column is 1/10th of a kilogram. The minimum value allowed is 1Kg. The maximum value allowed is 40Kg.

The system will report a warning when this column is NULL.[108]

Logisticnotes (Notes on Logistics)

Notes regarding the logistics of the darting. Comments about collars, anesthetic, etc. Consult the Amboseli Baboon Research Project Monitoring Guide for further guidance as to usage.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Dartcomments (General comments about the darting)

Comments about the animal's condition, darting circumstances, etc. during darting. Consult the Amboseli Baboon Research Project Monitoring Guide for further guidance as to usage.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

CRnotes (Crown-to-Rump measurement notes)

Notes on the crown-to-rump measurements taken, if any.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Chnotes (Chest circumference measurement Notes)

Notes on the chest circumference measurements taken, if any.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Ulnotes (Ulna length measurement Notes)

Notes on the ulna length measurements taken, if any.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Hunotes (Humerus length measurement Notes)

Notes on the humerus length measurements taken, if any.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Dphysnotes (Darting Physiological measurement Notes)

Ad libitum notes taken on the physiological features of the darted individual, if any.

This column may be NULL.[109]. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

PCVnotes (PCV measurement Notes)

Notes on the PCV measurements taken, if any.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Bodytempnotes (Body Temperature Notes)

Notes on the body temperature readings taken, if any.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Dsamplenotes (Darting Sample Notes)

Notes that accompany any of the different samples recorded in the DART_SAMPLES table, if any.

This column may be NULL.[110]. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Teethnotes (Notes on the Teeth)

Notes on the teeth, if any observations on the teeth were made.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Caninenotes (Notes on the Canines)

Notes on the canines, if any observations on the teeth were made.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Testesnotes (Testes measurement Notes)

Notes on the testes measurements taken, if any.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Ticknotes (Notes on the Parasite counts)

Notes on the parasite counts done, if any.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

DPHYS (Darting Physiological Measurements)

DPHYS contains one row for each darting event during which physiological measurements were taken.

Additional physiological measurements are recorded in the PCVS and BODYTEMPS tables.

Dphysid (Darting Physiological measurements Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular set of physiological measurements taken during a darting.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the set of physiological measurements were taken -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

Pulse

The pulse of the individual in beats per minute. The pulse must be greater than 0.

This column may be NULL.

Respiration

The respiration rate of the individual measured in counts per minute. The respiration rate must be greater than 0.

This column may be NULL.

Ringnode (state of Right Inguinal lymph Node)

The state of the right inguinal lymph node. The legal values of this column are defined by the LYMPHSTATES support table.

This column may be NULL.

Lingnode (state of Left Inguinal lymph Node)

The state of the left inguinal lymph node. The legal values of this column are defined by the LYMPHSTATES support table.

This column may be NULL.

Ringnode (state of Right Axillary lymph Node)

The state of the right axillary lymph node. The legal values of this column are defined by the LYMPHSTATES support table.

This column may be NULL.

Ringnode (state of Left Axillary lymph Node)

The state of the left axillary lymph node. The legal values of this column are defined by the LYMPHSTATES support table.

This column may be NULL.

Ringnode (state of Right Submandibular lymph Node)

The state of the right submandibular lymph node. The legal values of this column are defined by the LYMPHSTATES support table.

This column may be NULL.

Ringnode (state of Left Submandibular lymph Node)

The state of the left submandibular lymph node. The legal values of this column are defined by the LYMPHSTATES support table.

This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

HUMERUSES (Darting Humerus Length Measurements)

HUMERUSES contains a row for each humerus length measurement made of a darted individual.

Huid (humerus length measurement Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular humerus length measurement.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the humerus length measurement was taken -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

Hulength (Humerus Length measurement)

The humerus length measurement, in centimeters, with a precision of 1/10th of a centimeter. The minimum value allowed is 10 centimeters. The maximum value allowed is 35 centimeters.

Caution

The value contained in this column may have been adjusted for systematic observational bias. See the Huunadjusted column for more information.

This column may not be NULL.

Huunadjusted (Unadjusted Humerus length measurement)

Some measurements were subject to systemic bias when taken. When this is known to have occurred the original, biased measurements are recorded in this column. When there is no known bias this column is NULL.

When non-NULL this column contains the original humerus length measurement, in centimeters, with a precision of 1/10th of a centimeter. The minimum value allowed is 10 centimeters. The maximum value allowed is 35 centimeters.

Huseq (Humerus measurement Sequence)

A sequence number indicating the order in which the measurements were taken. The first humerus length measurement taken during a darting has a Huseq value of 1, the second a value of 2, etc.

The system automatically re-computes Huseq values to ensure that they are contiguous and begin with 1. See the Automatic Sequencing section for further information.

Huobserver (Humerus length measurer)

Initials of the person who performed the measurement. The legal values of this column are defined by the OBSERVERS support table.

This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

PCVS (Darting Blood Measurements)

PCVS contains one row for each PCV (packed cell volume) measurement taken from a darted individual.

PCVid (Packed Cell Volume Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular PCV measurement.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the PCV measurement was taken -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

PCV (Packed Cell Volume)

The packed cell volume measurement. This is a percentage and must be between 1 and 99, inclusive.

This column may not be NULL.

PCVseq (PCV measurement Sequence)

A sequence number indicating the order in which the PCV measurements were taken. The first PCV measurement taken during a darting has a PCVseq value of 1, the second a value of 2, etc.

The system automatically re-computes PCVseq values to ensure that they are contiguous and begin with 1. See the Automatic Sequencing section for further information.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TEETH (Darting Tooth Data)

TEETH contains one row for every possible tooth site within the mouth on which data was collected for every darting event during which dentition data was collected. There may not be data on each tooth or tooth site. The absence of a row in this table says nothing about the presence or absence of a particular tooth at the time of darting.

When the tooth is missing, the Tstate is M, the Tcondition value must be NULL. When the tooth is not missing Teeth-Tcondition must be non-NULL.

There may be only one tooth in any given tooth site within the mouth, at any one time -- for any given darting there may be at most one row in TEETH for each tooth site (TOOTHSITES).

Note

While rows in this table record tooth presence/absence and condition in separate columns, these data might not be recorded that way in the field. In dartings from 2006-onward, the tooth's presence/absence is recorded in the same place that indicates a tooth has the "erupting" condition. Between this and the fact that it can be difficult for observers to discriminate between partially- and fully-erupted teeth, a tooth that in fact was still erupting might only be recorded as "present". Thus, erupting teeth might appear in this table without a Tcondition indicating it. Teeth that were recorded as "erupting" can safely be assumed to truly be erupting, however.

In other words: in dartings since 2006 (inclusive), there are likely some cases where an erupting tooth was mistakenly recorded only as 'present', and there is no way to identify when this has occurred.

Warning

When inserting a row into TEETH a NULL Tstate value has special meaning. Inserted rows with a NULL Tstate value are silently ignored; no such rows are ever inserted.[111]

The Tstate column cannot be changed to a NULL.

Note

The DENT_CODES view may be used to maintain the TEETH table. This view may also be useful when querying. It returns a single row with individual columns for every kind of tooth.

The DENT_SITES view provides a way to query TEETH, returning a single row with individual columns for each position in the mouth.

Teethid (Teeth row Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular tooth (or tooth site when a tooth is missing).

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the tooth examinations were made -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

Tooth (Tooth examined)

The tooth, or tooth site if the tooth is missing. The legal values of this column are defined by the TOOTHCODES support table.

This column may not be NULL.

Tstate (Tooth existential State)

The degree to which the tooth exists. The legal values of this column are defined by the TSTATES support table.

This column will never contain a NULL. See the warning above for more information.

Tcondition (Tooth Condition)

A code rating the physical condition of the tooth. The legal values of this column are defined by the TCONDITIONS support table.

This column may be NULL. See TEETH above.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TESTES_ARC (Darting Testes circumference Data)

TESTES_ARC contains one row for every darting event for every recorded measurement of testicle width and length circumference.

Caution

The TESTES_ARC table contains testes measurements of a portion of the testicle circumference. The TESTES_DIAM table contains testes measurements of the diameter. The two tables are otherwise identical in that they have the same structure and have corresponding validation rules.

Note

The pairing of the width and length measurements within this table exists to make data storage convenient; no special relationship is implied regarding the order in which the measurements were taken. For example, if there are 3 length measurements taken during a darting and 2 width measurements the width and length measurements may have been taken in either of the following orders, as well as other possible orders not listed here: length1, length2, length3, width1, width2 or length1, width1, length2, width2, length3. In other words the value of the Seq column describes the order in which the length measurements were taken and the order in which width measurements were taken but says nothing about the interspersing of length and width measurements.[112]

Either the width or the length must be specified -- both Testwidth and Testlength cannot be NULL in the same row.

There can only be one measurement taken per darting per testicle per measurement sequence number -- Testseq must be unique per Dartid per Testside.

Once a Testwidth value is NULL all the rows (for the same darting) with higher Testseq values must also have a NULL Testwidth value. The same is true of the Testlength column.[113]

An individual must be male to have a row in this table.

The system will report a warning when individuals have testes length measurements less than 15mm or have testes width measurements less than 10mm.

Testesid (Testes measurements Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular testes measurement.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the testes measurements were taken -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

Testside (Testicle measured)

The testicle measured. The legal values are:

Valid Testside Values
CodeDescription
Lthe left testicle
Rthe right testicle

This column may not be NULL.

Testlength (Testes Length measurement)

The testes length measurement, in millimeters, with a precision of 1/10th of a millimeter. The minimum value allowed is 15 millimeters. The maximum value allowed is 140 millimeters.

This column may not be NULL.

Testwidth (Testes Width measurement)

The testes width measurement, in millimeters, with a precision of 1/10th of a millimeter. The minimum value allowed is 10 millimeters. The maximum value allowed is 95 millimeters.

This column may not be NULL.

Testseq (Testes measurement Sequence)

A sequence number indicating the order in which the measurements were taken. The first measurement, of each testicle, taken during a darting has a Testseq value of 1, the second a value of 2, etc.

The system automatically re-computes Testseq values to ensure that they are contiguous and begin with 1. Note that the TESTES_ARC rows are sequenced within Dartid within Testside whereas the other darting tables are sequenced only within Dartid. See the Automatic Sequencing section for further information.

TESTES_DIAM (Darting Testes Diameter Data)

TESTES_DIAM contains one row for every darting event for every recorded measurement of testicle width and length diameter.

Caution

The TESTES_ARC table contains testes measurements of a portion of the testicle circumference. The TESTES_DIAM table contains testes measurements of the diameter. The two tables are otherwise identical in that they have the same structure and have corresponding validation rules.

Note

The pairing of the width and length measurements within this table exists to make data storage convenient; no special relationship is implied regarding the order in which the measurements were taken. For example, if there are 3 length measurements taken during a darting and 2 width measurements the width and length measurements may have been taken in either of the following orders, as well as other possible orders not listed here: length1, length2, length3, width1, width2 or length1, width1, length2, width2, length3. In other words the value of the Seq column describes the order in which the length measurements were taken and the order in which width measurements were taken but says nothing about the interspersing of length and width measurements.[114]

Either the width or the length must be specified -- both Testwidth and Testlength cannot be NULL in the same row.

There can only be one measurement taken per darting per testicle per measurement sequence number -- Testseq must be unique per Dartid per Testside.

Once a Testwidth value is NULL all the rows (for the same darting) with higher Testseq values must also have a NULL Testwidth value. The same is true of the Testlength column.[115]

An individual must be male to have a row in this table.

The system will report a warning when individuals have testes length measurements less than 40mm or have testes width measurements less than 25mm.

Testesid (Testes measurements Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular testes measurement.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the testes measurements were taken -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

Testside (Testicle measured)

The testicle measured. The legal values are:

Valid Testside Values
CodeDescription
Lthe left testicle
Rthe right testicle

This column may not be NULL.

Testlength (Testes Length measurement)

The testes length measurement, in millimeters, with a precision of 1/10th of a millimeter. The minimum value allowed is 15 millimeters. The maximum value allowed is 75 millimeters.

This column may not be NULL.

Testwidth (Testes Width measurement)

The testes width measurement, in millimeters, with a precision of 1/10th of a millimeter. The minimum value allowed is 10 millimeters. The maximum value allowed is 51 millimeters.

This column may not be NULL.

Testseq (Testes measurement Sequence)

A sequence number indicating the order in which the measurements were taken. The first measurement, of each testicle, taken during a darting has a Testseq value of 1, the second a value of 2, etc.

The system automatically re-computes Testseq values to ensure that they are contiguous and begin with 1. Note that the TESTES_DIAM rows are sequenced within Dartid within Testside whereas the other darting tables are sequenced only within Dartid. See the Automatic Sequencing section for further information.

TICKS (Darting Tick and Parasite Data)

TICKS contains one row for every darting event during which data on ticks and other parasites were recorded.

When a specific number could not be arrived at because there was a large number of parasites or there was some other reason why the count could not be taken, Tickcount should be left NULL.

The value of the Tickstatus column is constrained based on the Tickcount value. For further information see the documentation of the TICKSTATUSES support table and the meaning of the table's Special Values.

The combination of Dartid, Bodypart, and Tickkind must be unique.

Tickid (Tick and other parasite count Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular tick count.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the tick count was made -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

Bodypart

The part of the body examined for ticks or other parasites. The legal values of this column are defined by the BODYPARTS support table.

This column may not be NULL.

Tickkind (Kind of Tick or other parasite)

The kind of tick or other parasite, or kind of parasite and it's developmental stage, or kind of parasite indicator counted. The legal values of this column are defined by the PARASITES support table.

This column may not be NULL.

Tickcount (Count of ticks or other parasites and their signs)

The recorded count of ticks, ticks in the indicated developmental stage, other parasites, or parasite signs. The minimum value allowed is 0, the maximum is 250.

This column may be NULL when there were too many parasites to count or the count was not taken for some other reason.

Tickstatus

A status value indicating whether and what sort of tick count was taken. The legal values of this column are from the Tickstatus column of the TICKSTATUSES table. See the documentation of the TICKSTATUSES support table for more information regarding what values may be used under which conditions.

This column may not be NULL.

Tickbpnotes (Body Part Notes)

Notes on the parasite infestation of the indicated body part.

Caution

Notes pertaining to parasites but not specific to the particular body part examined belong in DARTINGS.Ticknotes.

This column may contain NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

ULNAS (Darting Ulna Length Measurements)

ULNAS contains a row for each ulna length measurement made of a darted individual.

Ulid (Ulna length measurement Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular ulna length measurement.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The darting event during which the ulna length measurement was taken -- a DARTINGS.Dartid value. This column cannot be changed and may not be NULL.

Ullength (Ulna Length measurement)

The ulna length measurement, in centimeters, with a precision of 1/10th of a centimeter. The minimum value allowed is 10 centimeters. The maximum value allowed is 35 centimeters.

Caution

The value contained in this column may have been adjusted for systematic observational bias. See the Ulunadjusted column for more information.

This column may not be NULL.

Ulunadjusted (Unadjusted Ulna length measurement)

Some measurements were subject to systemic bias when taken. When this is known to have occurred the original, biased measurements are recorded in this column. When there is no known bias this column is NULL.

When non-NULL this column contains the original ulna length measurement, in centimeters, with a precision of 1/10th of a centimeter. The minimum value allowed is 10 centimeters. The maximum value allowed is 10 centimeters.

Ulseq (Ulna length measurement Sequence)

A sequence number indicating the order in which the measurements were taken.The first ulna length measurement taken during a darting has a Ulseq value of 1, the second a value of 2, etc.

The system automatically re-computes Ulseq values to ensure that they are contiguous and begin with 1. See the Automatic Sequencing section for further information.

Ulobserver (ulna length measurer)

Initials of the person who performed the measurement. The legal values of this column are defined by the OBSERVERS support table.

This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

VAGINAL_PHS (Darting Vaginal pH Measurements)

VAGINAL_PHS contains a row for each vaginal pH measurement taken on a darted female.

Tip

The VAGINAL_PH_STATS view aggregates the multiple vaginal pH measurements taken during a darting and so provides a convenient way to analyze VAGINAL_PHS rows.

VPId (Vaginal pH measurement Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular vaginal pH measurement.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Dartid (Darting Identifier)

The DARTINGS.Dartid of the darting during which this vaginal pH measurement was taken.

This column cannot be changed and must not be NULL.

PH (Vaginal pH measurement)

The vaginal pH measurement, precise to the nearest 0.5. This must be a number between 4.0 and 10.0.

This column may not be NULL.

VPseq (Vaginal pH measurement Sequence)

A sequence number indicating the order in which the measurements were taken. The first vaginal pH measurement taken during a darting has a VPseq value of 1, the second a value of 2, etc.

The system automatically re-computes VPseq values to ensure that they are contiguous and begin with 1. See the Automatic Sequencing section for further information.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

Inventory

This section contains data about the origin, identity, location, and various other traits about the tissue and nucleic acid samples in the users' inventory. This includes samples currently residing in the users' inventory, as well as older samples that may have previously been in use but have since been sent to others, consumed, discarded, or lost. Because of this, the data in this section serve as both a historical record of all samples that have ever been in the users' possession and an active record of the samples that are currently in the users' possession.

Note

The text in this section uses the terms "nucleic acid" and "nucleic acid sample" interchangeably[116]. At the time of this writing, the system does not attempt to record details at the molecular level, so the reader can be assured that comments about the location, source, etc. of a specific "nucleic acid" should be interpreted as referring to a sample and not a specific molecule.

LOCATIONS

This table contains one row for every location that may be used to store tissue or nucleic acid samples.

Samples may be stored in varied locations with different organizations/research groups ("institutions"). The Institution column is included to allow easy segregation of locations across these varying locales.

The name of each distinct location is recorded in the Location column. Different organizations have their own conventions about how to organize and name storage locations, so this code may be a very descriptive and specific space ("Shelf 1, Rack 2, Box 3, Position D") or something more general ("PINK BOX").

Each Institution-Location pair must be unique.

To allow the use of nondescriptive general Location values but retain the ability to enforce uniqueness of specific ones, the boolean column Is_Unique is included. When Is_Unique is TRUE, the row's LocId may occur at most once across both the NUCACID_DATA.LocId and TISSUE_DATA.LocId columns (once total, not once per table). When FALSE, the LocId may be used any number of times in either table.

Column Descriptions

LocId (Location Identifier

A unique identifier for the location. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Institution

The INSTITUTIONS.Institution indicating the organization or research group at which this row's Location exists.

This column may not be NULL.

Location

A textual column naming this location.

This column may not be NULL.

Is_Unique

A boolean indicating whether or not this location at this institution is unique.

This column defaults to TRUE.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NUCACID_CONC_DATA (NUCleic ACID CONCentration DATA)

This table contains one row for every quantification of a nucleic acid sample's concentration. All concentrations are recorded in picograms per microliter (pg/μL).

A nucleic acid sample cannot be quantified before it was created, before the source tissue sample was collected, nor before the tissue sample's donor entered the study population (if applicable); the Conc_Date cannot be before the related NUCACID_DATA.Creation_Date, TISSUE_DATA.Collection_Date, nor the related BIOGRAPH.Entrydate. These dates already have a required sequence to them — Entrydate <= Collection_Date <= Creation_Date <= Conc_Date — so in many cases it may be sufficient for the system to only require that Conc_Date is after the Creation_Date. However, any of these date columns can be NULL, so for the sake of completeness the system separately checks that Conc_Date is greater than each of them.

Some quantification methods may use a different unit of concentration than that used in this table. Nanograms per microliter (ng/μL) is especially common. Such concentrations must be converted to pg/μL before they are added to this table.

Tip

Use the NUCACID_CONCS view instead of this table. It includes an additional column that indicates concentration in ng/μL, and also allows the insertion of quantifications in ng/μL. The conversion to ng/μL is thus performed by the system and not the user.

Warning

Do not assume that the number of significant figures employed in the Pg_ul column is the "true" number of significant figures for this quantification. This table records concentrations from a variety of quantification methods with varying levels of accuracy and stores them all in a single column that records all data to the nearest 0.1 pg/μL[117]. When new data are added, this column pays no attention to the number of provided significant figures and may indicate more than were actually used at the time of quantification. See the example below.

Example 3.2. (Mis)Use of Significant Figures in NUCACID_CONC_DATA

The concentration of a new DNA sample is determined to be 10.0 ng/μL, which has 3 significant figures. When recorded in NUCACID_CONC_DATA, this concentration will be recorded in Pg_ul as 10000.0 pg/μL, with 6 significant figures. A user should not assume that this quantification was originally performed with 6 significant figures' accuracy.


Column Descriptions

NACId (Nucleic Acid Concentration Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

NAId (Nucleic Acid Identifier)

The NUCACID_DATA.NAId of the quantified sample.

This column may not be NULL.

Conc_Method

The NUCACID_CONC_METHODS.Conc_Method used to quantify this concentration.

This column may not be NULL.

Conc_Date (Concentration Date)

The date that this concentration was quantified.

This column may be NULL, when the date is unknown.

Pg_ul (Picograms per microliter)

The concentration of the sample according to this quantification, in picograms per microliter (pg/μL).

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NUCACID_CREATORS (NUCleic ACID CREATORS)

This table contains one row for every person involved with the creation of a specific nucleic acid sample. When a nucleic acid sample has multiple creators, each of them is recorded here in a separate row.

Most nucleic acid samples are created via "extraction". This table favors using "creation" rather than "extraction", for reasons explained in the discussion of the NUCACID_DATA table.

Each NAId-Creator combination must be unique; a sample cannot have the same creator more than once.

Tip

Use the NUCACIDS view to insert data into this table. It provides a simple way to determine the appropriate NAId value to use, and for a human data enterer to provide multiple creators in a single row.

Column Descriptions

NACrId (NUCACID_CREATORS Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

NAID (Nucleic Acid Identifier)

The NUCACID_DATA.NAId of the related nucleic acid sample.

This column may not be NULL.

Creator

The LAB_PERSONNEL.Initials of this creator.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NUCACID_DATA (General information about NUCleic ACID samples)

This table contains one row for every nucleic acid sample that is or ever has been in the inventory. Each nucleic acid sample is associated with a "source" tissue sample, which is indicated in the TId column.

Tip

Always use the NUCACIDS view in place of this table. It contains additional related columns which may be of interest.

This table records a nucleid acid sample's current location using the LocId column. Values in this column constrain and are constrained by values in the TISSUE_DATA.LocId column, and may or may not be unique, as discussed in the LOCATIONS table.

The Name_on_Tube column indicates whatever "name" or other identifying information is recorded on the tube. Because of labeling errors or misidentification in the field, this value may not indicate the true identity of the individual from whom this sample came.

Tip

To see the "true" identity of this individual, see the related line in the TISSUE_DATA table. This information is also provided in the NUCACIDS view.

Two columns in this table record information related to the sample's creation: Creation_Date and Creation_Method. Also the related table, NUCACID_CREATORS. In laboratory vernacular, the term "extraction" is usually favored over "creation" for most nucleic acid sample types. However, some samples are not "extracted" and are instead generated via a laboratory procedure (e.g. reverse transcription, dilution, PCR amplification, etc.). Because of this, the generic term "creation" is used here.

A sample's Creation_Date cannot be before the source tissue's Collection_Date, nor before the source individual's Entrydate, if any. It may often be redundant to verify that Creation_Date is on or after both dates, but this redundancy is intended, as discussed above.

This table attempts to keep an ongoing record of a sample's current volume in the Actual_Vol_ul column. It is left to the user to judge this column's accuracy, which depends greatly on 1) how diligently the lab personnel keep the data manager(s) informed of changes, and 2) the amount of time that has passed since this volume was determined[118]. To assist users in making these judgments, the date that the Actual_Vol_ul was last updated is recorded in the Actual_Vol_Date column. A sample's current volume cannot be recorded without also recording this date; both of the Actual_Vol_ul and Actual_Vol_Date columns must be NULL or both non-NULL.

A sample cannot have its current volume determined before the sample was created; the Actual_Vol_Date must be on or after the sample's Creation_Date.

It is unlikely, though not impossible, that a sample's volume might increase after its creation. The system will report a warning when a sample's Actual_Vol_ul is greater than its Initial_Vol_ul.

Column Descriptions

NAId (Nucleic Acid Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

TId (Tissue Identifier of Source)

The TISSUE_DATA.TId of the tissue sample from which this nucleic acid sample originated.

This column may not be NULL.

LocId (Identifier for the sample's current location)

The LOCATIONS.LocId indicating the current locale and location of the nucleic acid sample.

This column may not be NULL.

Name_on_Tube

The name of the source individual, according to the label on the tube.

This column may be NULL, when there is no identifying information on the tube. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

NucAcid_Type (Nucleic Acid sample Type)

The NUCACID_TYPES.NucAcid_Type of this nucleic acid sample.

This column may not be NULL.

Creation_Date

The date that this nucleic acid sample was created. When the process to generate a sample lasts more than one day, this is the date that the procedure was completed.

This column may be NULL, when the creation date is unknown.

Creation_Method

The NUCACID_CREATION_METHODS.Creation_Method describing how this nucleic acid sample was created.

This column may not be NULL.

Initial_Vol_ul (Initial Volume in μL)

The sample's volume, in microliters, when it was first created.

This column may be NULL, when the initial volume is unknown.

Actual_Vol_ul (Actual Volume, in μL)

The sample's volume, in microliters, as of the Actual_Vol_Date.

This column may be NULL, when users have not updated the sample's "current" volume or when the sample has not yet been used.

Actual_Vol_Date (Date of the recorded Actual Volume)

The date that the Actual_Vol_ul was determined.

This column may be NULL, when users have not updated the sample's "current" volume or when the sample has not yet been used.

Notes

Comments or miscellaneous information about this nucleic acid sample.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NUCACID_LOCAL_IDS (LOCAL IDentifierS for NUCleic ACID samples)

This table contains one row for every name or ID used only at a specific institution (an ID that is "local" to that institution) to describe a particular nucleic acid.

Identity of samples is maintained by the system as much as possible, but when working with samples in the laboratory this is often inconvenient or impractical. Different groups and institutions often have their own systems for giving unique names to their samples, and while these names may be useful and meaningful for humans, they are mostly unhelpful from the database's perspective. They're vulnerable to typos, and can be very confusing when a sample is shared between institutions. However, these "local names" remain important for the people who are actually using these samples, so these identifiers are recorded in this table, one per nucleic acid sample, per institution.

Every combination of NAId and Institution must be unique; an NAId cannot go by more than one local name at the same Institution.

Every combination of Institution and LocalId must be unique; the same local name cannot be used at a single Institution more than once.

Column Descriptions

NAId (Nucleic Acid Identifier)

The NUCACID_DATA.NAId of the nucleic acid sample.

This column may not be NULL.

Institution

The INSTITUTIONS.Institution indicating the organization or research group at which this NaId's name is used.

This column may not be NULL.

LocalId (Local Identifier)

The local name used for this NAId at this Institution.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

NUCACID_SOURCES

This table contains one row for every nucleic acid sample having another nucleic acid as its source.

Often, nucleic acid samples are created through some "extraction" process in which the nucleic acids are purified from a tissue sample (e.g. a blood draw, a buccal swab, etc.) However, there are also numerous different methods by which nucleic acid samples may instead be created from another nucleic acid sample (e.g PCR[119], reverse transcription, dilution, etc.). In addition to recording the identity of the source nucleic acid, this table includes the Relationship column, which indicates the nature of the connection between the row's nucleic acid and its source nucleic acid. This relationship may be simple enough to explain in a single word (e.g. "DILUTION"), or complex enough to require a lengthy explanation. To allow this flexibility, Relationship is not constrained to a set of legal values in a support table.

A nucleic acid sample cannot indicate itself as its source; the NAId and Source_NAId cannot be equal.

A nucleic acid sample cannot have more than one other sample as its source; this table's NAId column is unique.

A nucleic acid cannot have been created before its source; the related Creation_Date of this NAId must be on or after the Source_NAId's related Creation_Date.

Although a nucleic acid sample may have been generated from another nucleic acid sample, there will always be a single tissue sample from which both the nucleic acid samples originated; both samples' related NUCACID_DATA.TId's must be equal.

Column Descriptions

NAId (Nucleic Acid Identifier)

The NUCACID_DATA.NAId of the nucleic acid that has another nucleic acid as its source.

This column may not be NULL.

Source_NAId (Nucleic Acid Identifier of Source)

The NUCACID_DATA.NAId of the source nucleic acid.

This column may not be NULL.

Relationship

A textual description of how this nucleic acid and its source are connected.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

POPULATIONS

This table contains one row for every population under observation, and/or from which tissue or nucleic acid samples have been collected.

In this context, the term "population" refers to a particular species at a specific location. "The baboons in the Amboseli basin in Kenya", for example, are a population. "All baboons", or "all wildlife in the Amboseli basin", are not.

In the common vernacular, a population is often referred to only by the name of its site, e.g. "Gombe" when referring to the Gombe chimpanzees. Because of this, the Pop_Name and Site columns may seem redundant, but when setting vernacular aside it should be obvious that these two columns contain objectively different information. In practice, users may elect to enter the same value in both of these columns, but the two columns remain independent of each other.

Special Values

PopId 1 has special meaning to the system. Data integrity rules for the UNIQUE_INDIVS table presume that the population with this PopId is the population whose individuals are recorded in BIOGRAPH. No other code should be created to refer to that population.

Column Descriptions

PopId (Population Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Pop_Name (Population Name)

The name of the population.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Species_Sci_Name (Scientific Name of the Species)

The scientific name of this population's species.

This column may be NULL, when unknown or not applicable. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Species_Common_Name (Common Name of the Species)

The common name of this population's species.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Wild_Captive

A code indicating whether or not the population is wild or captive. The legal values are shown below.

POPULATIONS.Wild_Captive Values

W

Wild.

C

Captive.

U

Unknown.

NA

Not applicable.

This column may not be NULL.

Site

The location of the population.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Notes

Comments or miscellaneous information about this population.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TISSUE_DATA (General information about TISSUE samples)

This table contains one row for every tissue sample that is or ever has been in the inventory.

Tip

Always use the TISSUES view in place of this table. It contains additional related columns which may be of interest.

This table records a tissue sample's current location using the LocId column. Values in this column constrain and are constrained by values in the NUCACID_DATA.LocId column, and may or may not be unique, as discussed in the LOCATIONS table.

If a sample was collected from an individual in BIOGRAPH — if the related UNIQUE_INDIVS.UIId has a PopId of 1 — the sample's Collection_Date must be on or after that individual's Entrydate. Depending on the sample's Tissue_Type, the Collection_Date may also be constrained by the individual's Statdate. See TISSUE_TYPES for more information.

The system will return a warning if a sample's Collection_Date is after the individual's Statdate, but only when the sample's Tissue_Type indicates that the Collection_Date is not constrained by the individual's Statdate. That is, when the related TISSUE_TYPES.Max_After_Statdate is NULL.

From time to time, field observers may mistakenly record the wrong collection date on a tube. To help identify when this has occurred, the system uses the CENSUS table to confirm whether the Collection_Date is a date that the individual was actually observed[120]. The result of that confirmation is indicated in the Collection_Date_Status column.

When a sample's Collection_Date is not a Date on which the individual was recorded present in CENSUS, the Collection_Date is not necessarily "wrong". There are numerous circumstances in which a sample may have been collected without a census being performed. Still, the absence of a related row in CENSUS is suspicious, so it elicits a warning. That is, the system will return a warning a tissue sample's Collection_Date_Status is 1.

Tip

Do not assume that the date written on a sample's label will always match the Collection_Date. When data managers determine that the date written on a label is erroneous, they may be able to determine the true date and update the Collection_Date as needed.

Column Descriptions

TId (Tissue Identifier)

A unique identifier for the tissue sample. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

UIId (Unique Individual Identifier)

The UNIQUE_INDIVS.UIId of the individual from whom this tissue sample was collected.

This column may not be NULL.

LocId (Identifier for the sample's current location

The LOCATIONS.LocId indicating the current locale and location of the sample.

This column may not be NULL.

Name_on_Tube

The name of the individual from whom this tissue sample was collected, according to the label on the tube.

This column may be NULL, when there is no identifying information on the tube. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Collection_Date

The date the sample was collected or originated.

This column may be NULL, when the date is unknown.

Collection_Time

The time the sample was collected or originated.

This column may be NULL, when the time is unknown.

Tissue_Type

The TISSUE_TYPES.Tissue_Type of this tissue sample.

This column may not be NULL.

Storage_Medium

The STORAGE_MEDIA.Storage_Medium in which the sample is stored.

This column may not be NULL.

Misid_Status (Misidentification Status)

The MISID_STATUSES.Misid_Status of this tissue sample.

This column may not be NULL.

Collection_Date_Status

A code indicating whether this row's Collection_Date is or isn't plausible according to available CENSUS data. The legal values are:

Valid TISSUE_DATA.Collection_Date_Status Values
CodeDescription
0This individual is part of the main population and has a non-"absent" CENSUS row on this Collection_Date, OR this individual is not part of the main population and we have no basis to question the accuracy of this Collection_Date
1This Collection_Date is NULL, OR this individual is part of the main population and either i) has no CENSUS rows on this Collection_Date or ii) has only "absent" censuses on this Collection_Date

This column is automatically maintained by the database and may not be NULL. Attempts to manually populate or update this column are silently ignored.

Notes

Comments or miscellaneous information about this tissue sample.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TISSUE_LOCAL_IDS (LOCAL IDentifierS for TISSUE samples)

This table contains one row for every name or ID used only at a specific institution (an ID that is "local" to that institution) to describe a particular tissue sample.

For more details about the reason for this table and the difference between a "local" name/identifier and an ID generated by the database, see the discussion for the NUCACID_LOCAL_IDS table.

Every combination of TId and Institution must be unique; a TId cannot go by more than one name at the same Institution.

Every combination of Institution and LocalId must be unique; the same local name cannot be used at a single Institution to describe more than one sample.

Column Descriptions

TId (Tissue Identifier)

The TISSUE_DATA.TId of the tissue sample.

This column may not be NULL.

Institution

The INSTITUTIONS.Institution indicating the locale in which this TId's name is used.

This column may not be NULL.

LocalId

The local name used for this TId at this Institution.

This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TISSUE_SOURCES

This table contains one row for every tissue sample having another tissue as its source.

In addition to recording the identity of the source tissue, this table includes the Relationship column, which indicates the nature of the connection between the row's tissue and its source tissue. This relationship may be simple enough to explain in a single word (e.g. "ALIQUOT"), or complex enough to require a lengthy explanation. To allow this flexibility, Relationship is not constrained to a set of legal values in a support table.

A tissue sample cannot indicate itself as its source; the TId and Source_TId cannot be equal.

A tissue sample cannot have more than one other sample as its source; this table's TId column is unique[121].

A tissue sample cannot have been collected before its source; the related Collection_Date of this TId must be on or after the Source_TId's related Collection_Date.

Depending on the details of the Relationship, a tissue sample and its source may or may not be from a different individual. The system does not require that the related UIId's of the TId and Source_TId be equal. However, the system will return a warning when they are not equal.

Column Descriptions

TId (Tissue Identifier)

The TISSUE_DATA.TId of the tissue sample that has another tissue sample as its source.

This column may not be NULL.

Source_TId (Tissue Identifier of Source)

The TISSUE_DATA.TId of the source tissue sample.

This column may not be NULL.

Relationship

A textual description of how this tissue sample and its source are connected.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

UNIQUE_INDIVS (All UNIQUE INDIVidualS)

This table contains one row for every individual under observation, and every individual from whom tissue or nucleic acid samples have been collected.

In contrast to BIOGRAPH, which records the identities of every individual in the main study population[122], this table also records the identities of all the individuals in other populations from whom there are tissue or nucleic acid samples recorded in the inventory. All individuals in BIOGRAPH are also included in this table, whether or not tissue or nucleic acid samples exist in the inventory. This presents a problem: there are two tables that separately track the identities of all individuals in the main population. To address this, the triggers have been written to ensure that BIOGRAPH retains primary authority over all individuals in the main population.

Management of individuals in the main population is done by BIOGRAPH (see its discussion for more information), so the ability to perform inserts/updates/deletes in this table for those individuals is heavily constrained, as follows:

  • Inserting rows for individuals in the main population is only allowed for the unknown individual or for individuals in BIOGRAPH who have not yet been added to this table[123].

  • The unknown individual's row can only be updated or deleted by an administrator.

  • Deleting rows for individuals in the main population is only allowed for individuals who are no longer in BIOGRAPH[124].

  • Updating rows for individuals in the main population is only allowed when changing only the Notes column.

  • Any individual's PopId cannot be updated to add or remove the individual from the main population.

Tip

Do not manually insert or delete rows in this table for individuals in BIOGRAPH. Perform those actions in BIOGRAPH, and the action will automatically be performed in this table, as well. Manual inserts and deletes in this table should only be done for individuals who are not in BIOGRAPH.

The IndivId column is used to record the individual's name or similar ID. Study projects and research institutions each have their own rules of nomenclature for their individuals, so this might be a lengthy name, an abbreviation, a series of numbers, or some mix of these. This value is not unique; the same identifier may be used more than once across different populations. However, per PopId, each IndivId must be unique; a population cannot use the same identifier more than once.

Special Values

PopId 1 is the population recorded in BIOGRAPH, so any row with this PopId (with a few exceptions, discussed below) must use the individual's Bioid as its IndivId.

IndivId UNKNOWN indicates the unknown individual, and is allowed to have PopId 1 and not be a Bioid.

IndivId MULTIPLE is used to indicate when TISSUE_DATA row includes samples from multiple individuals. It is allowed to have PopId 1 and not be a Bioid.

Column Descriptions

UIId (Unique Individual Identifier)

A unique identifier for the individual. This is an automatically generated sequential number that uniquely identifies the row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

IndivId (Individual Identifier)

The name/identifier for this individual.

This column may not be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

PopId (Population Identifier)

The POPULATIONS.PopId of the individual's population.

This column may not be NULL.

Notes

Comments or miscellaneous information about this individual.

This column may be NULL. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SWERB Data (Group-level Geolocation Data)

This section contains timestamped geolocation data on groups, observers, and significant landscape features (groves, waterholes[125], and possibly other temporary or permanent landmarks), either recorded in a quad coordinate system or collected from GPS units. SWERB stands for Sleeping grove, Waterhole, End time, Ranging, and Begin time. Typically SWERB data are collected at hourly or half hourly intervals. Supporting information includes the locations of tree groves and waterholes. For more information see the Protocol for Data Management: Amboseli Baboon Project.

The quad coordinate system was devised prior to the incorporation of GPS technology into the data collection protocols. It is based on regular sub-divisions of the landscape into a grid of squares, 170 m per side. There is no altitude information associated with quad coordinate points. The IDs and coordinates of these quads are recorded in QUAD_DATA.

The GPS X and Y coordinates are in the WGS 1984 UTM Zone 37South coordinate system. The units of these coordinates are meters, as is the recorded altitude. The recorded precision of the X and Y values include at most 1 non-zero digit to the right of the decimal place, but when the coordinates were recorded using another system the transformation to UTM may yield values with more digits to the right of the decimal. X and Y coordinates must be on or within the bounding rectangle having X coordinates between 42300.0 and 651000.0, inclusive, and Y coordinates between 9497000.0 and 9894500.0, inclusive. The system will generate a warning when the location falls outside the bounding rectangle having X coordinates between 277000.0 and 311100.0, inclusive, and Y coordinates between 9689200.0 and 9709500.0, inclusive. The accuracy may vary; see the Protocol for Data Management: Amboseli Baboon Project for further information on accuracy at various times. Altitude is in meters. Altitude values must be between 0 and 10000, inclusive. There must be no (non-zero) digits to the right of the decimal place for altitude measurements taken before 2004-01-01. After 2004-01-01 one digit may appear to the right of the decimal place. The system will generate a warning when altitude values are NULL but X and Y coordinates are non-NULL.

Some devices and data-exporting applications favor longitude and latitude coordinates via the WGS 1984 2D CRS. Because of this, Babase can read coordinates in that system and transform them to WGS 1984 UTM Zone 37South. Regardless of the coordinate system used when the data are inserted, the coordinates are recorded using UTM. That is, "XYLoc" columns in all Babase tables have the PostGIS "geometry" datatype with SRID 32737, that of WGS 1984 UTM Zone 37South.

All PDOP columns must have values between 0 and 50, inclusive, and have one digit of precision to the right of the decimal. PDOP values are unit-less and should be multiplied by the specified accuracy in meters of the GPS unit to produce a 3 dimensional vector, in meters, representing the possible distance from the true location.[126]

All accuracy columns are in meters[127] with one digit of precision to the right of the decimal and must have values between 0 and 15, inclusive.

The kind of reported error is partially determined by characteristics of the the GPS unit used for data collection. GPS units which report error as a PDOP reading, those with GPS_UNITS.Errortype values of PDOP, cannot be related to rows with non-NULL Accuracy values. GPS units which report error as an accuracy reading, those with GPS_UNITS.Errortype values of accuracy, cannot be related to rows with non-NULL PDOP values. PDOP values must be NULL for data collected before 1993-09-01 or after 2001-01-31. Accuracy values must be NULL for data collected before 2001-02-01.[128] The system will report a warning when data collected with a GPS unit supporting PDOP or accuracy does not include, respectively, PDOP or accuracy values.

Warning

On 2000-05-02, the United States government ended its use of Selective Availability, a national security measure which intentionally lowered the accuracy of GPS signals. For more information about this, see Selective Availability on GPS.gov. The GPS accuracy indices in Babase (Accuracy and PDOP) do not and cannot account for this inaccuracy, so users should be aware that any GPS data collected through 2000-05-02 are likely less accurate than indicated.

Warning

GPS data between May and August 2019 are unreliable, apparently thanks to some issues with the European Union's Galileo satellites. See the SWERB Notebook for more information and documentation of this issue.

Starting 2004-01-01, GPS data began to be downloaded directly from the GPS units instead of being transcribed by hand. One consequence is that starting 2004-01-01 operators entered up to 10 characters of descriptive codes with each GPS waypoint taken. This information is processed and distributed throughout the SWERB data but the various Garmincode columns retain the raw data as entered by the operator.[129] Before 2004-01-01 the Garmincode columns must contain a NULL. On or after this date the Garmincode columns must not be NULL, but may be a string 0 characters long.[130] SWERB_DATA are the exception to this rule and may always be NULL. Begin and end rows, rows with a SWERB_DATA.Event values of B or E, may have NULL Garmincode columns regardless of date so that the data entry staff may supply begin and end rows without X and Y coordinates should the field team forget to record a begin or end row. Other SWERB_DATA rows are except from the Garmincode requirement to handle situations, notably those which involve lone animals, where data was written manually for some reason.

Before 2004-01-01 the GPS_Datetime columns must be NULL. The date portion of the GPS_Datetime columns must correspond to the date related to containing row. The time portion of the GPS_Datetime column is not validated, although the time portion of the GPS_Datetime value occasionally serves as data against which other columns are validated.

Caution

The Garmincode and GPS_Datetime columns may be NULL, without warning, no matter the date. This is to accommodate the manual recording of data taken using GPS units.[131]

Note

Data is validated per-observation team, per-group, per-day. Data upload and maintenance must be done within transactions that produce valid per-observation team, per-group, per-day data sets.

Note that it may be more convenient to use the views that support the SWERB data than to access the raw data.

AERIALS (Aerial photos)

This table contains one row for every aerial photo used in the specification of map quadrant system used in the early SWERB data.

Aerial (Aerial Identifier)

A unique identifier of the aerial photo. This is an integer greater than or equal to 1. It is used to refer to a particular aerial photo.

This column may not be NULL.

Date

The date the aerial photo was taken. This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

GPS_UNITS (Individual GPS Devices)

This table contains one row for each GPS unit which has been used in the field.

Note

In actual fact early records of unit identification may have been lost. In such cases a row in GPS_UNITS represents a number of units having the same capabilities (i.e. of the same make and model).

The date the unit was first used (Start) must be on or before the date the unit was last used (Finish).

The label on the GPS unit, the Label value must be unique within the time period in during which the GPS unit was in use, between the Start and Finish dates, inclusive.

GPS (GPS unit identifier)

A 2 digit non-negative numeric value that identifies the GPS unit as a distinct object throughout all time.

This column may not be NULL.

Descr (Description)

A short textual description of the GPS unit. If necessary this may include additional notes on such details as when the unit was used, its purpose, and so forth.

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character. This column may not be NULL.

Make (Manufacturer)

The manufacturer of the GPS unit.

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character. This column may not be NULL.

Model

The model of the GPS unit. This should be sufficiently detailed that the technical specifications of the unit can be found given this information.[132]

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character. This column may not be NULL.

Errortype (Type of Error reporting)

The type of error the unit reports. This must be one of:

PDOP

The error is supplied as positional dilution of position.

accuracy

The error is in meters.

See the SWERB Data overview for more information.

This column may not be NULL.

Label (Identifying letter marked on the unit)

The letter code marked on the unit. Note that this information is not enough to uniquely identify the unit because the same letter codes have been used on different units at different times.

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character. This column may not be NULL.

Start (Date of first use)

The date the GPS device entered service. This date cannot be before 1993-09-01, the date GPS units were first used. This column may not be NULL.

Finish (Date of last use)

The date the GPS unit was taken out of service. This column may be NULL when the unit is still in service.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

QUAD_DATA (map Quadrants)

The QUAD_DATA table contains one row for every map quadrant. For more information, see above in the introduction to the SWERB data.

Note

Before these quads were delineated on 1981-11-01, large scale aerial photographs were used to signify location in SWERB data.

Note

The QUADS view can be used to maintain the QUAD_DATA table. This view may also be more useful than the table when querying.

Quad (map Quadrant identifier)

The unique identifier code used to refer to a particular map quadrant.[133] This column may not be NULL.

XYLoc (X and Y WGS 1984 UTM Zone 37South coordinates)

The X and Y WGS 1984 UTM Zone 37South coordinates of the centroid of the map quadrant. This column may be NULL.

See the SWERB Data overview for more information.

Aerial (Aerial photo Identifier)

Code indicating the aerial photo in which the map quadrant is located, if any. Must be a value on the AERIALS table.

This column may be NULL when there is no aerial photo for the map quadrant.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SWERB_BES (Begin/Ends: Uninterrupted bouts of group-level observation)

This table contains one row for every uninterrupted bout of group-level observation for which there is SWERB data.

Start and Stop values are automatically assigned the SWERB_DATA.Time value from the related SWERB_DATA row with an Event value of B and E rows, respectively. The begin and end of the bout of observation is determined by the begin and end rows entered in the field (or determined by the data manager).

Start must be NULL or be after the related SWERB_DEPARTS_DATA.Time, if any.

The Start value records the start of the day's observation of the group when there exists a related SWERB_DATA.Event value of B and that value is the first for that group/day and there is no earlier SWERB_DATA.Event E value. Likewise the Stop value records the end of the day's observation of the group when there exists a related SWERB_DATA.Event value of E and that value is the last for that group/day and there is no later SWERB_DATA.Event B value. The Start time cannot be after the Stop time.

The Btimeest value is only meaningful when either there is a begin time value or when investigation of existing records indicates that no record of a begin time on file -- when either the Start time value is non-NULL or the Bsource value is NR. The Etimeest value is only meaningful when either there is an end time value or when investigation of existing records indicates that no record of an end time on file -- when the Stop time value is not NULL or the Esource value is NR. When the values in these columns are meaningful they must contain a non-NULL value, otherwise they must contain a NULL value.[134] When the source of the start or stop time is NR then the estimated time flag must be FALSE and the time must be NULL.[135][136] It is required that there be a record of whether the start and stop times are estimated when there are start and stop times -- the Start and Stop columns cannot be non-NULL when the Btimeest and Etimeest columns, respectively, are NULL.[137] It is required that there be a record of the source of the start and stop times when there are start and stop times -- the Bsource and Esource values must be NULL unless, respectively, the Btimeest and Etimeest values are non-NULL.

SWERB_BES rows are automatically sequenced when no Seq is specified[138]by Start value, unless the Start value is NULL in which case they are sequenced last of all existing SWERB_BES rows for the group/day when initially inserted and otherwise not automatically sequenced.[139] In the case of a tie the automatic sequencing places the newly inserted row[140] last among the rows that are tied. Seq values may be manually assigned so long as the manual sequencing does not result in out-of-order Start values, or in those cases where Start is NULL, so long as the manually assigned sequence number is less than or equal to that which would be automatically assigned.[141]

As expected, changing the Start value (via a SWERB_DATA row with an Event value which indicates the start of observation) will automatically change the Seq value. Should there be other SWERB_BES rows for that group/day with the same SWERB_BEs-Start value the newly changed row will be be sequenced after the existing rows.[142]

Every bout of observation must have exactly one beginning -- there must be exactly one related row on SWERB_DATA with an Event of B. Every bout of observation must have exactly one end -- there must be exactly one related row on SWERB_DATA with an Event of E. These requirements are enforced on transaction commit, so the SWERB_BE row and the begin and end SWERB_DATA rows must all be created within a single transaction. The system will generate a warning when there are no observations in a bout of observation -- when there are no related SWERB_DATA rows with Event values other than B and E.

The focal group, Focal_grp, must be in existence, based on GROUPS.Start and GROUPS.Cease_To_Exist, on the date of the observation.

BEId (Begin/End Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular bout of uninterrupted observation.

This column is automatically maintained by the database[143], cannot be changed, and must not be NULL.

DId (Departure Identifier)

The id of the SWERB_DEPARTS_DATA row representing the departure from camp of the observation team. This column cannot be changed.[144] This column must not be NULL.

Focal_grp

The group under observation. The legal values for this column are from the Gid column of the GROUPS table. This column cannot be changed.[145]This column may not be NULL.

Start (observation Starting time)

The time the bout of observation started. The time may not be before 05:00 and may not be after 20:00. The time must be on the minute mark; the seconds must be zero. This column may be NULL when the start of observation is unknown.

Btimeest (Begin Time is Estimated)

TRUE when the Start value is an estimation of the time the daily observation of the group began. FALSE otherwise. This column should be NULL when the Start time is the start of a uninterrupted bout of observation but is not the start of the day's observation of a group.

Bsource (Begin time estimation Source)

The source of the data used to estimate the Start value when that value is estimated and represents the start of the day's observation of the group -- how the start of the daily observation of the group was estimated. The legal values of this column are defined by the SWERB_TIME_SOURCES table. This column must be NULL when the Start time is the start of a uninterrupted bout of observation but is not the start of the day's observation of a group.

Stop (observation ending time)

The time the bout of observation ended. The time may not be before 05:00 and may not be after 20:00. The time must be on the minute mark; the seconds must be zero. This column may be NULL when the end of observation is unknown.

Etimeest (End Time is Estimated)

TRUE when the Stop value is an estimation of the time the daily observation of the group began. FALSE otherwise. This column should be NULL when the Stop time is the end of a uninterrupted bout of observation but is not the end of the day's observation of a group.

Esource (End time estimation Source)

The source of the data used to estimate the Stop value when that value is estimated and represents the end of the day's observation of the group -- how the end of the daily observation of the group was estimated. The legal values of this column are defined by the SWERB_TIME_SOURCES table. This column must be NULL when the Stop time is the end of a uninterrupted bout of observation but is not the end of the day's observation of a group.

Seq (daily per-group Sequence number)

A sequence number indicating the ordering of the bouts of uninterrupted observation of each group each day. The first bout of observation for the group for the day has a Seq value of 1, the second a value of 2, etc.

The system automatically re-computes Seq values to ensure that they are contiguous and begin with 1. See the overview of the SWERB_BES table and the Automatic Sequencing section for further information.

Is_Effort (does the bout count toward observer Effort)

A boolean value. TRUE means that the bout of observation counts toward total observer effort. FALSE means that the bout is concurrent with another bout of observation by the same team and should not count toward observer effort.

This column cannot be NULL.

Notes (Notes on the bout of observation)

Notes, if any, on the bout of observation. This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SWERB_DATA (Group Level GPS Point Samples)

This table contains one row for every event related to group-level geolocation.[146]Such events geolocate a group upon the occurrence of a significant activity, including but not limited to ascent, descent, and drinking. Other events include geolocation at regular intervals and the begin and end of each bout of uninterrupted observation.

Note

The typical Babase user may find the SWERB view to be easier to query than SWERB_DATA and its related tables. It may be easier to use the SWERB_DATA_XY view to maintain SWERB_DATA than it is to modify the table content directly.

Rows with an Event value of O or P are not part of an observation bout of the focal group and so, unless the observed group is a Subgroup[147] or is the unknown group[148], must have a Seen_grp value which differs from that of the group under observation -- the SWERB_BES.Focal_grp value of the related SWERB_BES row. Likewise, rows which do not have an Event value of O or P must have a Seen_grp value of the group under observation -- a value which equals the SWERB_BES.Focal_grp value in the related SWERB_BES row.[149]The system will generate a warning when the SWERB_DATA row is for a non-focal group and the observed group is a subgroup and the observed group is the same as the focal group -- when Event is O and Subgroup is TRUE and SWERB_DATA.Seen_grp is the same as the related SWERB_BES.Focal_grp.

Per bout of observation, per BEId, there must be exactly one SWERB_DATA row recording the start and exactly one recording the finish of the bout -- exactly one SWERB_DATA row having an Event value of B and exactly one having a E value, respectively.

The time of the observation must be between the start and stop times of the bout of observation -- the Time value must be between (inclusive) the related SWERB_BES.Start and SWERB_BES.Stop values. Because SWERB_BES.Start may be NULL the Time value is also checked to be sure that it's not before the time the observation team departed from camp, before SWERB_DEPARTS_DATA.Time. Because SWERB_DEPARTS_DATA.Time may also be NULL the Time value is checked to be sure that it is not before 05:00. Because SWERB_BES.Stop may be NULL the Time value is checked to be sure that it is not after 20:00.

The date portion of the GPS_Datetime value must be the date of the observation team's departure from camp -- must equal the related SWERB_DEPARTS_DATA.Date value. The waypoint time recorded by the operator cannot be more than 15 minutes before the actual time the observation was taken -- the Time value cannot be more than 15 minutes before the time portion of the GPS_Datetime value. The exception to this rule is when a group drinks from a water hole; for these water hole events, the waypoint time cannot be more than 30 minutes minutes before the actual time the observation was taken. The waypoint time recorded by the operator cannot be more than 5 minutes after the actual time the observation was taken -- the Time value cannot be more than 5 minutes after the time portion of the GPS_Datetime value.

The Quad column records group location based on map quadrants and is used only in older data. Data recorded after 1994-09-30, rows associated with SWERB_DEPARTS_DATA rows with Date values after 1994-09-30, must have NULL Quad values. GPS units were used in later SWERB data collection so data recorded before 1993-09-01, rows associated with SWERB_DEPARTS_DATA rows having Date values before 1993-09-01, must have NULL XYLoc values.

Only data collected using GPS units have altitude, PDOP, accuracy, a GPS timestamp, or Garmincode values -- when the XYLoc column is NULL then the Altitude, PDOP, Accuracy GPS_Datetime, and Garmincode values must also be NULL.

The observed lone animal must be NULL unless the waypoint is an observation of a lone animal/non-focal group — Lone_Animal must be NULL unless Event is O.

Note

An other group observation of an unknown lone animal is recorded in a SWERB_DATA row having a NULL Lone_Animal value and a Seen_grp value of 10.0 (the group denoting a lone animal).

The observed predator must be NULL unless the waypoint is an observation of a predator — the Predator must be NULL unless the Event is P, in which case the Predator must not be NULL.

The observer's distance from the observed lone animal, predator, or non-focal group must be NULL unless the waypoint is an observation of a lone animal, predator, or non-focal group -- Ogdistance must be NULL unless Event is O or P.

Note

Through the end of 2022, the observers' protocol for recording this distance was either been poorly defined or poorly adhered-to. (It is unclear which.) Distances were occasionally recorded but usually not. It is unclear what decisions were made at the time that might decide whether or not to record this distance. To avoid fallacious assumptions about the nature of the data, all distances recorded before 01 Jan 2023 have been manually set to NULL.

In case someone wants to use the SWERB_DATA_HISTORY table to retrieve the once-present distances, they were set to NULL at 2023-03-22 00:20:44.206126+03 (Nairobi time).

The observed group, Seen_grp, must be in existence, based on GROUPS.Start and GROUPS.Cease_To_Exist, on the date of the observation.

An observed lone animal, Lone_Animal, must have already entered the study population and must be alive on the date of observation -- the SWERB_DEPARTS_DATA.Date related to the SWERB_DATA row must be between individual's related Entrydate and Statdate, inclusive. The system will return a warning if the related Date is before the individual's LatestBirth.

The system will generate a warning if a lone animal is a male and is observed more than 60 days before his assigned dispersal date -- before DISPERSEDATES.Dispersed.

When a lone individual is observed, the observed group must be the group reserved for lone animals -- when SWERB_DATA.Lone_Animal is non-NULL then SWERB_DATA.Seen_grp must be 10.0.

Caution

Interpolation does not reference SWERB data when making its computations. Consequently the MEMBERS table does not reflect SWERB sightings of lone individuals -- unless those sightings are otherwise recorded in the DEMOG table.

When a predator is observed, the observed group must be the group reserved for predator sightings -- when SWERB_DATA.Predator is non-NULL then SWERB_DATA.Seen_grp must be 99.0.

Note

It is not possible from these data to determine the number (quantity) of predators observed. Information like this is recorded, but not in the GPS units[150]. See the Amboseli Baboon Research Project Monitoring Guide for more information.

SWId (SWerb event Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular GPS event.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

BEId (Begin/End Identifier)

The id of the SWERB_BES row representing the bout of uninterrupted observation of which the SWERB_DATA row is a part. This column cannot be changed and must not be NULL.

Seen_grp

The group under observation. Note that this is not always the focal group which the observation team set out to observe. For further details see the Protocol for Data Management: Amboseli Baboon Project. The legal values for this column are from the Gid column of the GROUPS table.

This column may not be NULL.

Lone_Animal

The BIOGRAPH.Sname of the observed lone animal.

This column may be NULL to indicate either that a lone animal was not observed or that an unknown lone animal was observed.

Event (half hourly/hourly, watering, begin, end, other group)

A code indicating what sort of event the row represents. The following codes are defined:

B

The row represents the beginning of a bout of uninterrupted observation of the focal group.

E

The row represents the end of a bout of uninterrupted observation of the focal group.

H

The row represents an observation of the focal group. These occur on half hourly or hourly intervals, depending on the protocol used to record the data.

W

The row records the focal group's drinking.

O

The row represents the observation of a non-focal group or lone animal.

P

The row represents a sighting of a predator.

This column may not be NULL.

Time

The time of the observation. This is usually the time manually entered by the observer but in those cases where the observer does not enter a time (such as begin and end rows) the SWERB_UPLOAD view may use GPS supplied information to calculate a time. See the section on the SWERB_UPLOAD.Description column. The time must be on the minute mark; the seconds must be zero. This column may be NULL when the time is not known.

Quad (map Quadrant)

The map quadrant of the seen group's location, when recorded in the field. The legal values for this column are from the Quad column of the QUAD_DATA table.

This column may be NULL.

XYLoc (X and Y WGS 1984 UTM Zone 37South coordinates)

The X and Y WGS 1984 UTM Zone 37South coordinates of the seen group. This column may be NULL.

See the SWERB Data overview for more information.

Altitude

The altitude, in meters, of the landscape on which the seen group is located. This column may be NULL.

See the SWERB Data overview for more information.

PDOP (error in Positional Dilution Of Precision)

The amount of error reported as positional dilution of precision. This column may be NULL when there is no PDOP information.

See the SWERB Data overview for more information.

Accuracy (in meters)

The accuracy of the GPS reading, in meters. This column may be NULL when there is no accuracy information in meters.

See the SWERB Data overview for more information.

Subgroup

TRUE when the observation is of a subgroup, FALSE when not.

Note that the field team cannot always record subgroup information and the value in this column is therefore sometimes determined heuristically[151] when the data is uploaded by the SWERB_UPLOAD view.

This column must not be NULL.

Ogdistance (Distance to Other Group)

The distance, in meters, between the observer and the observed non-focal group or the observer and the observed lone animal. This value must be a 3 digit non-negative integer that is also a mulitple of 0.

This column may be NULL when the observers did not record an Ogdistance (i.e. NULL values are not to be confused with zero distance).

GPS_Datetime (GPS supplied Date and Time)

The date and time automatically supplied by the GPS unit at the time the waypoint was recorded. For further information on when this column is NULL and when non-NULL see the SWERB Data overview.

This column may be NULL.

Garmincode (operator supplied waypoint value)

The information manually entered by the observer into the GPS unit as a coded waypoint that describe the SWERB data being recorded. This column may be empty, it need not contain characters, but it may not contain only whitespace characters. For further information on when this column is NULL and when non-NULL see the SWERB Data overview.

This column may be NULL. See the SWERB Data overview for more information.

Predator (code for observed predator)

The PREDATORS.Predator code of the observed predator.

This column may be NULL, when this row is not for a predator sighting.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SWERB_DEPARTS_DATA (Observation team departures from camp)

This table contains one row for every departure from camp of every observation team, for those observation teams which have collected SWERB data.

The Time value may not be NULL when there is a related SWERB_DEPARTS_GPS row -- data collected using the GPS units must have a non-NULL time.

One observer may not depart camp on the same day at the same time with two different observation teams -- the combination of SWERB_DEPARTS_DATA.Date, SWERB_DEPARTS_DATA.Time, and SWERB_OBSERVERS.Observer, when all are non-NULL, must be unique.

The system will generate a warning for SWERB_DEPARTS_DATA rows having a Date after 1994-09-30 that do not also have a related SWERB_DEPARTS_GPS row.

The system will generate a warning for SWERB_DEPARTS_DATA rows for which no SWERB data was collected; that do not have a related SWERB_BES row.

Note

The SWERB_DEPARTS view can be used to maintain the SWERB_DEPARTS_DATA table. This view may also be more useful than the table when querying.

Warning

At the time of this writing departure data prior to about March of 2011 is not in the database. The process involved in loading historical data fabricates (departure date excepted, the actual departure date is used) the minimal required departure information. The early process used by the Data Manager involving loading data from the GPS units sometimes involved removing departure information. For further information and exact dates see the Data Manager's [Process for Uploading SWERB] document.

DId (Departure Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular departure from camp of a particular observation team.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Date

The date of departure. This date must be on or after 1981-11-01 This column must not be NULL.

Time

The time of departure. The time may not be before 04:00 and may not be after 20:00. The system will generate a warning if the time is before 05:00 or after 14:30. The time must be on the minute mark; the seconds must be zero. This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SWERB_DEPARTS_GPS (SWERB GPS Departure data)

This table contains one row for every departure from camp of every observation team, for those observation teams which have collected SWERB data using GPS units. This table is an extension of the SWERB_DEPARTS_DATA that contains the additional information collected when a GPS unit is used to record the departure. There is at most one row in this table for every row in SWERB_DEPARTS_DATA. When a row exists it contains the information involving the GPS unit used by the observation team on that day. All SWERB_DEPARTS_DATA rows having associated SWERB_DEPARTS_GPS rows must have SWERB_DEPARTS_DATA.Date values on or after 1993-09-01.

The date of departure (SWERB_DEPARTS_DATA.Date) must be between the SWERB_DEPARTS_GPS' Start and Finish dates, inclusive.

Note

The SWERB_DEPARTS view can be used to maintain the SWERB_DEPARTS_GPS table. This view may also be more useful than the table when querying.

The system will generate a warning when there is more than one departure per GPS unit per day.

DId (Departure Identifier)

The id of the SWERB_DEPARTS_DATA row representing the departure from camp of the observation team. This column cannot be changed and must not be NULL.

XYLoc (X and Y WGS 1984 UTM Zone 37South coordinates)

The X and Y WGS 1984 UTM Zone 37South coordinates at departure. This column must not be NULL.

See the SWERB Data overview for more information.

Altitude

The altitude in meters of the GPS unit. This column may be NULL.

See the SWERB Data overview for more information.

PDOP (error in Positional Dilution Of Precision)

The error reported as positional dilution of precision. This column may be NULL.

See the SWERB Data overview for more information.

Accuracy (in meters)

The error reported in meters. This column may be NULL.

See the SWERB Data overview for more information.

GPS (GPS used by the team)

The identifier of the GPS device (the GPS_UNITS.GPS) used by the observation team. The legal values of this column are defined by the GPS_UNITS support table.

This column must not be NULL.

Garmincode (operator supplied waypoint value)

The information manually entered into the waypoint by the observer. This is a set of, mostly, single character codes that describe the SWERB data being recorded. This column may be empty, it need not contain characters, but it may not contain only whitespace characters. For further information on when this column is NULL and when non-NULL see the SWERB Data overview.

This column may be NULL. See the SWERB Data overview for more information.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SWERB_GWS (SWERB Grove and Waterholes)

This table contains one row for every geolocated physical object, that is, for every grove and waterhole.[152]

Caution

This table may contain one row with special meaning. The SWERB_GWS row with a Loc value of UNK represents the unknown grove -- a grove with special properties. When a SWERB_GWS row exists with a SWERB_GWs-Loc value of UNK then the Type value must be G (grove). No trees may be located in the unknown grove -- TREES.Loc may not be UNK. The unknown grove may not be located anywhere -- SWERB_GW_LOC_DATA.Loc may not be UNK. And when it is not known where a group slept there can be no uncertainty regarding the sleeping grove -- when SWERB_LOC_DATA.Loc is UNK then SWERB_LOC_DATA.Loc_Status must be C (certain).

SWERB_GWS rows that represent groves, those with a SWERB_GWs-Type of G, have restrictions on the allowed Loc values due to the data structure supplied the SWERB_UPLOAD view (the Name column sometimes contains a grove code prefaced with the letter P). There cannot be two codes for groves, one which begins with the letter P and another which consists entirely of the same characters as the first but with the initial P omitted.[153] Because of this restriction the Babase administrator is the only user allowed to create Loc values which begin with the letter P.

With the exception of the unknown grove, the system will report a warning when the grove or waterhole has not been geolocated -- when there is no related SWERB_GW_LOC_DATA row.

Loc (Location)

A unique identifier. Up to 4 alphanumeric non-lowercase characters that uniquely identifies the row and may be used to refer to the grove or waterhole.

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character. This column cannot be changed and must not be NULL.

Type (Type of place)

The type of place; whether grove, waterhole, or some other landmark. The legal values for this column are from the Place column of the PLACE_TYPES (codes for various landscape features) table.

This column must not be NULL.

Altname (Alternative Name)

Up to 20 characters of alternative name for the grove or waterhole.

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character. This column may be NULL.

Start (Starting date)

The date when the grove or waterhole was named. This date cannot be before 1981-11-01.

This column must not be NULL.

Finish (Finish date)

The date of last known use after which the resource became permanently unavailable.

This column may be NULL when observations are ongoing or the row represents an object that cannot become unavailable.

Notes

Textual notes on the grove or waterhole, if any.

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character. This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SWERB_GW_LOC_DATA (SWERB Grove/Waterhole Location Data)

This table contains one row for each time a location of a place, a grove or waterhole is recorded. Any given grove or waterhole may have its location recorded more than once.

Note

The typical Babase user may find the SWERB_GW_LOCS view to be easier to query than SWERB_GW_LOC_DATA and its related tables. It may be easier to use the SWERB_GW_LOC_DATA_XY view to maintain SWERB_GW_LOC_DATA than it is to modify the table content directly.

The date related to the location (SWERB_GW_LOC_DATA.Date) may not be before the grove or waterhole was first observed, may not be before the related SWERB_GWS.Start value. The date related to the location (SWERB_GW_LOC_DATA.Date) may not be after the grove or waterhole ceases existance, may not be after the related SWERB_GWS.Finish value.

The Quad column records group location based on map quadrants and is used only in older data. Data recorded after 1994-09-30, rows with Date values after 1994-09-30, must have NULL Quad values. GPS units were used in later SWERB data collection so data recorded before 1993-09-01, rows having Date values before 1993-09-01, must have NULL XYLoc values, unless the UTM XY coordinates were obtained through other means (XYSource is non-NULL).

There can only be a source for the recorded X and Y coordinates when there are recorded UTM coordinates -- the XYSource value may be non-NULL only when XYLoc is non-NULL. There must be X and Y UTM coordinates when there is a recorded source for the X and Y coodinates -- XYLoc must be non-NULL when XYSource is non-NULL.

Only data collected using GPS units have altitude, PDOP, accuracy, and GPS values -- when the XYLoc column is NULL then the Altitude, PDOP, Accuracy, GPS values must also be NULL.

The GPS unit used to make the observation must be in service on the date of the observation -- the date of the observation (Date) must be between the SWERB_DEPARTS_GPS' Start and Finish dates, inclusive, of the related GPS_UNITS row.

SGWLId (SWERB Grove/Waterhole Location Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to an observation which recorded the location of a particular grove or waterhole.

This column is automatically maintained by the database and must not be NULL.

Loc (Location)

The SWERB_GWS.Loc of the grove or waterhole associated with the recorded location.

This column must not be NULL.

Date

The date related to the location. This is either the date the location was caculated or an observation date. See the Protocol for Data Management: Amboseli Baboon Project for further information. This column must not be NULL.

Time

The time of the observation. When the data are taken with a GPS unit this is the time recorded by the GPS unit. The time cannot be before 05:00 and cannot be after 20:00. The time must be on the minute mark; the seconds must be zero. This column may be NULL when the time is not known.

Quad (map Quadrant)

The map quadrant of the grove or waterhole's location, when recorded. The legal values for this column are from the Quad column of the QUAD_DATA table.

This column may be NULL.

XYSource (Source of X/Y coordinates data)

The source of the UTM coodinate data. The legal values for this column are from the XYSource column of the SWERB_XYSOURCES (SWERB Time Sources) table.

This column may be NULL.

XYLoc (X and Y WGS 1984 UTM Zone 37South coordinates)

The X and Y WGS 1984 UTM Zone 37South coordinates of the grove or waterhole. This column may be NULL.

See the SWERB Data overview for more information.

Altitude

The altitude, in meters, of the grove or waterhole. This column may be NULL.

See the SWERB Data overview for more information.

PDOP (error in Positional Dilution Of Precision)

The error reported as positional dilution of precision. This column may be NULL when there is no PDOP information.

See the SWERB Data overview for more information.

Accuracy (in meters)

The error reported in meters. This column may be NULL when there is no accuracy information in meters.

See the SWERB Data overview for more information.

GPS (GPS used by the team)

The identifier of the GPS device (the GPS_UNITS.GPS) used in the observation. The legal values of this column are defined by the GPS_UNITS support table.

This column may be NULL.

See the SWERB Data overview for more information.

Notes

Textual notes regarding the record of the grove or waterhole's location, if any.

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character. This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SWERB_LOC_DATA (LOCation-Specific DATA from GPS Points)

This table contains one row every time a group is observed at a geolocated physical object, i.e. at a grove or a waterhole or, possibly, some other physical landmark.[154]

SWERB_LOC_DATA rows must place a group at a single location -- each SWERB_DATA row has at most one related SWERB_LOC_DATA row. In effect, SWERB_LOC_DATA extends SWERB_DATA with additional columns.

Descent from, or ascent into, groves is indicated in the ADcode column. To indicate a descent, the ADCode value must relate to an ADCODES row a ADN value of D. To indicate an ascent, the ADCode value must relate to an ADCODES row a ADN value of A.

The observations recording descent from or ascent into sleeping groves must be related to groves (the related SWERB_GWS rows must have a Type of G).

The SWERB_DATA row representing the "begin" of the team's first bout of observation (the bout with the smallest SWERB_BES.Seq value) of any group (except the unknown group, group 9.0)[155] in a day must be related to a SWERB_LOC_DATA row recording descent from a sleeping grove. This enforces the requirement that a day's observations of a group must include the group's descent from exactly one grove (possibly the unknown grove). A group can be recorded as descending from more than one grove, but only when all of the descents are by subgroups (the related SWERB_DATA.Subgroup is TRUE), or all but one of the descents are by subgroups and those subgroup descents are from the unknown grove.

Similarly, the SWERB_DATA row representing the "end" of the team's final bout of observation (the bout with the greatest SWERB_BES.Seq value) of any group (except the unknown group, group 9.0)[156] in a day must be related to a SWERB_LOC_DATA row recording ascent into a sleeping grove. This enforces the requirement that a day's observations of a group must include the group's ascent into exactly one grove (possibly the unknown grove). A group can be recorded as ascending into more than one grove, but only when all of the ascents are by subgroups (the related SWERB_DATA.Subgroup is TRUE), or all but one of the ascents are by subgroups and those subgroup ascents are into the unknown grove. The database rules that enforce these "ascent into sleeping grove" rules are checked at transaction commit.[157]

Tip

When a group splits into subgroups and descends from or ascends into multiple groves there must be a separate bout of observation, another SWERB_BES row, to record the location of each subgroup.

Whether a SWERB_LOC_DATA row must have a NULL ADtime value or must have a non-NULL ADtime value is determined by the related ADCODES.Time flag.[158] Ascent and descent times related to a bout of observation cannot be before the beginning of the bout of observation -- SWERB_LOC_DATA.ADtime cannot be before the related SWERB_BES.Start time.[159] The database rules that enforce ADtime values are checked at transaction commit.[160]

Note

Descent and ascent times are recorded manually; they are not taken from the timestamps supplied by the GPS units. This necessitates additional columns for descent and ascent information. For further information see the Amboseli Baboon Research Project Monitoring Guide.

When the location is the unknown grove, status of that location must be 'certain'. That is, when the Loc value is UNK then the Loc_Status value must be C.

Babase allows SWERB data to record group presence at arbitrary landmarks, but some possibilities are rare and result in a warning. The system will issue a warning when a group is located at a waterhole but the recorded activity is not water (when the SWERB_GWS row's Type is W but the related SWERB_DATA row's Event value is not W).

SWERB_DATA rows representing observation of a group drinking at a waterhole must be related to waterholes. That is, when SWERB_DATA.Event is W there must be a related SWERB_GWS row, even if it is the generic and non-specific row which represents all rainpools, and the related SWERB_GWS row must have a Type value of W. In some cases this check is at transaction commit time and in other cases not.

Rows that record a drinking event -- those related to SWERB_DATA rows which have W Event values -- must have SWERB_LOC_DATA.ADcode values that indicate no involvement with a sleeping grove; the related ADCODES row must have a ADN value of N.

Groups may not be located at a place before observations began at the place or after observations ended at the place. That is, the SWERB_DEPARTS_DATA.Date related to the SWERB_DATA row referenced by the SWERB_LOC_DATA.SWId value must not be before the related SWERB_GWS.Start value and must not be after the related SWERB_GWS.Finish value.

SWId (SWERB Identifier)

The number that uniquely identifies the row and may be used to refer to an observation of a group at a particular time at a particular grove or waterhole. This is also the SWERB_DATA.SWId identifying the group, place, and time of the observation.

This column must not be NULL and cannot be changed.

Loc (Location)

The SWERB_GWS.Loc of the object (grove, waterhole, or landmark) where the group was observed.

This column must not be NULL.

ADcode (Ascent/Descent Code)

A code representing the nature of the relationship between the baboon group and the landscape feature at which the SWERB_LOC_DATA row places the group. The legal values of this column are defined by the ADCODES support table.[161]

This column must not be NULL.

Loc_Status (Location Observation Status)

The SWERB_LOC_STATUSES.Loc_Status value indicating the status of this observation of the location on record (this row's Loc). Usually, this will indicate whether the observers actually saw the group at the location or inferred that the group was there. For instance, if the group is still in a sleeping grove when the observers arrive then they will be "certain" about that grove (Loc_Status = C), but if the group is walking away from the grove when the observers arrive then they may indicate the grove as 'probable'(Loc_Status = P).

Note

Although the database supports degrees of certainty with respect to any group location, in practical terms the only time that there will be any degree of uncertainty will involve sleeping groves. This is for two reasons. First, at present the only provision in the Amboseli Baboon Research Project Monitoring Guide involving uncertainty is with respect to sleeping groves. Second, the SWERB_UPLOAD will only ever enter an indication of uncertainty into the database when the location is a sleeping grove.[162]

This column may not be NULL.

ADtime (Ascent/Descent Time)

The median time of group decent from or ascent into a sleeping grove. See the Amboseli Baboon Research Project Monitoring Guide for information regarding how median descent and ascent times are determined. The time may not be before 05:00 and may not be after 20:00. The time must be on the minute mark; the seconds must be zero. This column may be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SWERB_LOC_GPS (Secondary Data for LOCations in GPS points)

The SWERB data collection protocol sometimes requires 2 GPS waypoint entries to record a group's presence at a physical landscape feature. (At the time of this writing descent from and ascent into sleeping groves requires 2 GPS waypoint entries.) This table contains one row every time a group is observed at a geolocated landscape feature and 2 GPS waypoints are required to record the data. The rows of this table contain the information stored in the second GPS waypoint, information automatically generated by the GPS unit or manually entered into the GPS unit, that otherwise have no place in the database.

Note

It may be easier to use the SWERB_LOC_GPS_XY view to maintain SWERB_LOC_GPS table than it is to modify the table content directly.

The SWERB_LOC_GPS table extends the SWERB_LOC_DATA table[163] with additional columns; SWERB_LOC_GPS contains at most one row for every row in SWERB_LOC_DATA.

As described in the SWERB Data overview above, data was first obtained directly from the GPS units on 2004-01-01. Consequently, this table cannot contain rows dated earlier than 2004-01-01.

SWId (SWerb event Identifier)

The number that uniquely identifies the row and may be used to refer to the GPS information involving an observation of a group at a particular time at a particular grove or waterhole. This is also the SWERB_DATA.SWId value, identifying the group, place, and time of the observation, and the SWERB_LOC_DATA.SWId value, identifying the placement of the group at a landscape feature.

This column must not be NULL and cannot be changed.

XYLoc (X and Y WGS 1984 UTM Zone 37South coordinates)

The X and Y WGS 1984 UTM Zone 37South coordinates of the SWERB_DATA.seen group. This column may not be NULL.

See the SWERB Data overview for more information.

Altitude

The altitude, in meters, of the landscape on which the seen group is located. This column may be NULL.

See the SWERB Data overview for more information.

PDOP (error in Positional Dilution Of Precision)

The amount of error reported as positional dilution of precision. This column may be NULL when there is no PDOP information.

See the SWERB Data overview for more information.

Accuracy (in meters)

The accuracy of the GPS reading, in meters. This column may be NULL when there is no accuracy information in meters.

See the SWERB Data overview for more information.

GPS_Datetime (GPS supplied Date and Time)

The date and time automatically supplied by the GPS unit at the time the waypoint was recorded. This column may not be NULL.

This column may be NULL.

Garmincode (operator supplied waypoint value)

The information manually entered by the observer into the GPS unit as a coded waypoint that describe the SWERB data being recorded. This column may be empty, it need not contain characters, but it may not contain only whitespace characters. This column may not be NULL, although it may be a string 0 characters long. See the SWERB Data overview for more information.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

SWERB_OBSERVERS

For teams collecting SWERB data this table contains one row for every departure from camp of every member of the departing observation team for those team members who drive or record data.

The system will generate a warning for those SWERB_DEPARTS_DATA rows without at least one related row in SWERB_OBSERVERS.

SWERBOId (SWERB Observers Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular observer's departure from camp as part of a particular observation team.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

DId (Departure Identifier)

The id of the SWERB_DEPARTS_DATA row representing the departure from camp of the observer's observation team. This column must not be NULL.

Observer (Observer code)

Initials of the observer. The legal values of this column are defined by the OBSERVERS support table.

This column must not be NULL.

Role

The role assumed by the member of the SWERB observation team. The legal values of this column are defined by the OBSERVER_ROLES support table.

This column must not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TREES

This table contains one row for every tree in the tree monitoring project.

Trees can only be located in groves -- the value of the TREES.Loc column must reference a SWERB_GWS row which has a SWERB_GWS.Type of G (Grove).

Tree numbers are unique within each grove. The combination of Loc and Tree must be unique.

TId (Tree Identifier)

A unique identifier. This is an automatically generated sequential number that uniquely identifies the row and may be used to refer to a particular tree.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Loc (Tree Location)

The identifier of the grove, a SWERB_GWS.Loc value, in which the tree is located.

This column must not be NULL.

Tree (Tree number)

The integer used to uniquely identify a tree within a particular grove.

This column must not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

Weather Data

The data in this section are collected from manually read instruments, with one notable exception: the DIGITAL_WEATHER table contains data from electronic instruments that record weather data automatically.

Tip

The MIN_MAXS view provides a way to view all the tables containing manually collected weather data at once, with each weather data collection event appearing as a single row.

Note

The weather-related tables contain weather-related information and so do not directly relate to any of the baboon information contained in Babase.

RAINGAUGES (Rain Measurements)

This table contains one row for every time a rain gauge reading is recorded. There can be at most one RAINGAUGES row per WREADINGS row.

WRid

The identifier of the meteorological collection event during which the rain gauge was read. Must be a value contained in the WRid column of a row on the WREADINGS table, and the associated row may not be associated with any other row in RAINGAUGES.

This column cannot be changed; and must not be NULL.

RGspan

The interval, in an integral number of seconds, since the previous rain gauge collection event.

This column is automatically maintained by the database and cannot be changed. This column must not be NULL.

Caution

When the WREADINGS.WRdaytime values used to compute RGspan are not integral, the resulting RGspan value is rounded to the nearest second. Values of .5 seconds are rounded to the nearest even number of seconds.

Warning

When a new row is inserted the value of this column is silently ignored and an automatically computed value is used in its place. It is best to omit this column from the inserted data (or specify the NULL value).

EstRGspan

Whether or not any estimated WREADINGS.WRdaytime values were used in the computation of the RGspan column. TRUE if any of the relevant WREADINGS.Estdaytime values are true, FALSE otherwise.

This column is automatically maintained by the database and cannot be changed. This column must not be NULL.

Warning

When a new row is inserted the value of this column is silently ignored and an automatically computed value is used in its place. It is best to omit this column from the inserted data (or specify the NULL value).

Rain

The measurement of rain accumulated since the last time the rain gauge was read. In millimeters stored using a data type having a precision of 0.1 millimeter. For the precision and accuracy of the data itself see the Amboseli Baboon Research Project Monitoring Guide.

This column must be non-negative and may not be more than 200.0. This column may not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

RGSETUPS (Rain Gauge Setups)

This table contains one row for every time a rain gauge is installed. There can be no RAINGAUGES rows recording rain gauge measurements at any given weather station (WSTATIONS) unless there is a prior record of a rain gauge installation in RGSETUPS.

Rain gauge measurements are only meaningful when it is known how long the rain has been collected. In the event that, e.g., an elephant steps on the rainguage, there will be a period of time until the rain gauge is replaced. The first reading of the replacement rain gauge is not a measurement of rain since the last rainguage reading, but is instead a measurement of the rain collected since the replacement rain gauge was installed. The RGSETUPS table allows the system to compute RAINGAUGES.RGspan intervals when rain gauges are replaced, first installed, or after an interval of corrupted measurements.[164]

There cannot be a RGSETUPS row and a RAINGAUGES row for the same location at the same time.

The combination of RGSdaytime and Wstation must be unique.

RGSid

A unique positive integer representing the rain gauge setup event.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Wstation

Code indicating the station at which the rain gauge was installed. Must be a value on the WSTATIONS table.

This column cannot be changed and must not be NULL.

RGSdaytime

The day and time the rain gauge was installed. The time zone is Nairobi local time.

RGSestdaytime

TRUE when the RGSdaytime column contains an estimated time. FALSE when the RGSdaytime column is an accurate record of the time the rain gauge was installed.

RGSPerson

Initials of the person who collected the data. Must be a value contained in the Initials column of a row on the OBSERVERS table.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TEMPMINS (Minimum Temperature Measurements)

This table contains one row for every time a minimum temperature reading was recorded. There can be at most one TEMPMINS row for every WREADINGS row.

The Tempmin column has one decimal point of precision, but thanks to limitations of the thermometers the temperature is normally collected with a half decimal point of precision; the digit to the right of the decimal point should be either a 0 or a 5. This may not always be so, however. The system will return a warning when the Tempmin is not a multiple of 0.5.

Beginning 01 July 2022, a new thermometer with higher accuracy and precision was deployed, allowing for reliable recording of temperature to the nearest tenth of a degree. For this reason, the above warning only applies to data collected before that date[165].

WRid

The identifier of the meteorological collection event during which the minimum temperature was read. Must be a value contained in the WRid column of a row on the WREADINGS table, and the associated row may not be associated with any other row in TEMPMINS.

This column cannot be changed; and must not be NULL.

Tempmin

The minimum temperature recorded since the last minimum temperature reading.

This table must contain a value between -5 and 35, inclusive of endpoints, and must not be NULL.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

TEMPMAXS (Maximum Temperature Measurements)

This table contains one row for every time a maximum temperature reading was recorded. There can be at most one TEMPMAXS row for every WREADINGS row.

In extreme circumstances where a temperature reading is known to be spurious in some way, it may be desirable to record a correction or adjustment from the original temperature. When this is done, the adjusted temperature should be recorded in the Tempmax column, and the unadjusted temperature in the Unadjusted_Tempmax column. If no adjustment has been made, the Unadjusted_Tempmax should be NULL.

Because a non-NULL Unadjusted_Tempmax indicates that an adjustment has occurred, the Unadjusted_Tempmax cannot be equal to the Tempmax.

Both temperature columns have one decimal point of precision, but thanks to limitations of the thermometers the temperatures are normally collected with a half decimal point of precision; the digit to the right of the decimal point should be either a 0 or a 5. This may not always be so, however. Newer thermometers may be more precise, and temperature adjustments may not conveniently be to the nearest 0.5°. The system will return a warning when either Tempmax or Unadjusted_Tempmax is not a multiple of 0.5.

Beginning 01 July 2022, a new thermometer with higher accuracy and precision was deployed, allowing for reliable recording of temperature to the nearest tenth of a degree. For this reason, the above warning only applies to data collected before that date[166].

Values in both of the temperature columns in this table must be between 10 and 50, inclusive.

Historical Note

Weather station BC1 was positioned too close to the kitchen, resulting in spuriously high Tempmax readings. To correct for this, all Tempmax readings from that weather station have been adjusted by -4.2°C (rounded from -4.245). This adjustment was calculated as the residual + fixed effect from a model of Tempmax as a function of day of the year + random intercept of weather station with only BC1 and BC2, BC3, BC4 combined in the dataset (i.e., Tempmax ∼ day of the year + (1 | Wstation)). Day of the year was included in the model to correct for the fact that BC1 had an overrepresentation of January to June dates compared to the other three BC weather stations. BC5 was not used in the calculation because at the time of calculation there was less than one year of weather data from this station. We also calculated adjustment factors in two alternative ways which yielded extremely similar values: (1) taking the difference between the mean Tempmax of BC1 and mean Tempmax of BC2, BC3, BC4 combined (adjustment factor = -4.29°C) and (2) taking a residual + fixed effect from a model of Tempmax as a function of a fixed intercept + random intercept of weather station with only BC1 and BC2, BC3, BC4 combined in the dataset (i.e., Tempmax ∼ 1 + (1 | Wstation); adjustment factor = -4.28°C).

Column Descriptions

WRid

The WREADINGS.WRid of the meteorological collection event during which this maximum temperature was read.

This column is unique, cannot be changed, and must not be NULL.

Tempmax

The maximum temperature recorded since the last maximum temperature reading.

This column may not be NULL.

Unadjusted_Tempmax

The original, unadjusted maximum temperature, when the value in the Tempmax column has been adjusted in some way.

This column may be NULL, when the Tempmax has not been adjusted.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

DIGITAL_WEATHER (Digitally Collected Weather Data)

This table records the weather data that are automatically collected each hour by an electronic weather collection instrument.

Note

Originally, this table only contained data from WeatherHawk devices and was therefore named WEATHERHAWK. Likewise, the WEATHER_SOFTWARES table was originally named WEATHERHAWK_SOFTWARES. On 28 Nov 2023, these tables were renamed to reflect that they may also contain data related to other devices. Ideally, the WEATHERHAWK_HISTORY and WEATHERHAWK_SOFTWARES_HISTORY tables should remain in the babase_history schema so that changes to those tables will remain accessible. However, when these tables were renamed their history tables were both empty. There were no archived changes in either table that needed to be preserved, so the old history tables were not retained.

A weather station cannot have more than one reading at the same time. That is, the combination of TimeStamp and WStation must be unique.

Instrument accuracy may not, and probably does not, correspond with the recorded degree of precision. These instruments collect their data in engineering units, which are interpreted and converted to standardized units (degrees, kPa, etc.) by PC software when the data are retrieved from the instrument. Different PC software programs may vary in terms of units used, the number of significant figures employed, or other ways that are not immediately apparent. There are even some values that are simply not recorded by some programs or devices.

Despite hardware and software differences, most measurements saved in this table use a single column and a specified unit. Data managers should ensure that data are converted to the appropriate units, if needed. The allowed precision in these columns — usually a single digit to the right of the decimal — is based on a private message from WeatherHawk's technical support[167], who asserted that this is the maximum plausible precision that WeatherHawk devices are capable of measuring. It is presumed that this is also the maximum plausible precision for other (non-WeatherHawk) devices. This might be more or less precise than the value originally reported by the software.

Tip

Use the WEATHER_SOFTWARES table to see what is known about differences in these programs, including precision of measurements, units used, etc.

The WSoftware column is used to indicate which software was used to generate the data in each row, but the system does not treat data any differently based on this value. Users should be aware of the possibility of differences between programs, and decide for themselves how to handle any possible discrepancies.

Information about the voltage of the device's battery is provided in the BatVolt and BatVolt_Min columns. These values are not directly relevant to weather but can be useful if technical support is needed.

Wind speed may be recorded in km/hr as an integer or m/s with 1 decimal point of precision, depending on the software used. The precision difference between these two measures is large enough that they are divided into separate columns. Each row must indicate the average wind speed; exactly one (not both) of the WindSpeed_Avg_Km_Hr and WindSpeed_Avg_M_S columns must not be NULL. Maximum wind speed is not required, but when recorded it must be in either the WindSpeed_Max_Km_Hr or WindSpeed_Max_M_S column, but not both.

Each row must only use a single unit for all of its wind speed values; when WindSpeed_Avg_Km_Hr is NULL, WindSpeed_Max_Km_Hr must also be NULL, and when WindSpeed_Avg_M_S is NULL, WindSpeed_Max_M_S must also be NULL.

The barometric pressure value provided in this table (Barometer) is corrected, accounting for Amboseli's elevation: ~1130 m. To calculate the uncorrected values, ask a meteorologist.

Note

Prior to Babase 5.5.3, this column contained only UNcorrected values. Those values were corrected simply by adding 12.94503[168]to the uncorrected value.

When devices like these record rainfall, they often use a small "tip bucket" that only records rain when the bucket fills (see the device's user's manual for more information) and which theoretically may contribute to small errors in the accuracy of the measurement. For example, the WeatherHawk used a 1-mm tip bucket. If there is less than 1 mm of rainfall over the course of a given hour, the bucket may not fill up at that time and the rain will not be measured until later or may evaporate before the bucket fills. When there is a gap in the hourly measurements (due to changing out sensors, battery malfunctions, etc.), rainfall data during the down period might not be recorded.

Despite the fact that data are recorded every hour, some devices (e.g. WeatherHawk) do not simply report the amount of rainfall measured in that hour. Instead, these devices report the cumulative amount of rainfall measured since the beginning of the year[169]. That value is recorded in the YearlyRain column, for those devices that report it.

The rainfall for each hour is recorded in the TimeStampRain column. For rows whose YearlyRain column is not NULL, this value is the result of a simple calculation: this row's YearlyRain minus that of the chronologically previous row.

Note

When this table was first created and only contained data from WeatherHawk devices, the value of the TimeStampRain column was automatically calculated when new rows were added. That is, for a given row, the YearlyRain of the most recent row from the same calendar year and the same WStation was subtracted from the given row's YearlyRain, resulting in the amount of rainfall that was measured since the previous TimeStamp.

After the last WeatherHawk device was retired and data from other devices began to be added, this automatic calculation stopped being useful. In Babase 5.5.1, this table's ability to calculate TimeStampRain from YearlyRain was removed, largely based on the assumption that future devices are unlikely to use the dubious YearlyRain measurement. All previously calculated TimeStampRain values were not removed, so the TimeStampRain in a row with a non-NULL YearlyRain can safely be assumed to be a result of that functionality.

The amount of rain measured in the year cannot be less than the amount measured in a single timestamp. That is, when the YearlyRain is not NULL it cannot be greater than the TimeStampRain.

Warning

Do not assume that TimeStampRain values always describe a single hour's worth of rain. When one or more hours is absent from the data, the TimeStampRain value is the amount of rainfall measured since the previous row in the same year. Also do not assume that these values describe all of the rain that occurred in the intervening hours. If the device was off or malfunctioning at the time, then actual rainfall may have occurred and/or evaporated without being measured.

Column Descriptions

DWId (Digital_Weather Identifier)

A unique positive integer identifying the device's meteorological data collection that is recorded in this row.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

WStation (Weather Station Identifier)

The WSTATIONS.Wstation of the device used to collect the data.

This column may not be NULL.

TimeStamp

Date and time of the measurement. Measurements must be taken on the hour. Minutes, seconds, microseconds etc must be 0.

Warning

As indicated by the name, this value is a time stamp. It indicates the end of the period described in each row, not the beginning. This means that the last hour of a day will have a TimeStamp from the next day, e.g. the data from 23:00-23:59 on 31 Dec 1999 will have a TimeStamp of 2000-01-01 00:00.

This column may not be NULL.

WSoftware (Weather Software Identifier)

The WEATHER_SOFTWARES.WSoftware value indicating which software was used to generate the data.

This column may not be NULL.

RecordNum (Record Number)

The record number for this line, exported in the software. This appears to be a unique ID number used by the device or the software, or both.

This column may be NULL if the software did not report this value.

BatVolt (Battery Voltage)

The voltage of the battery at the TimeStamp. Values must be between 10.00 and 14.00, inclusive.

This column may not be NULL.

BatVolt_Min (Minimum Battery Voltage)

The minimum voltage of the battery in this hour. Values must be between 10.00 and 14.00, inclusive.

This column may be NULL if the software did not report this value.

AirTemp_Avg

Average air temperature for this hour, in degrees Celsius. Values must be between -10.0 and 50.0, inclusive.

This column may not be NULL.

RelativeHumidity_Avg

Average relative humidity for this hour in percent humidity. Values must be between 0.0 and 100.0, inclusive.

This column may not be NULL.

WindSpeed_Avg_Km_Hr (Average Wind Speed, in km/hr)

Average wind speed for this hour, in km/hr. Values must be between 0 and 30, inclusive.

This column may be NULL.

WindSpeed_Avg_M_S (Average Wind Speed, in m/s)

Average wind speed for this hour, in m/s. Values must be between 0.0 and 15.0, inclusive.

This column may be NULL.

Solar (Solar radiation)

Solar radiation in Watts per square meter. Values must be between 0.0 and 2000.0, inclusive.

This column may be NULL if the device did not report this value or if a reported value was subsequently recognized as erroneous.

AirTemp_Min (Minimum Air Temperature)

Minimum air temperature for this hour, in degrees Celsius. Values must be between -10.0 and 50.0, inclusive.

This column may be NULL if the software did not report this value.

AirTemp_Min_Time (Time of Minimum Air Temperature)

A time stamp indicating the minute in which the AirTemp_Min occurred.

This column may be NULL if the software did not report this value.

AirTemp_Max (Maximum Air Temperature)

Maximum air temperature for this hour, in degrees Celsius. Values must be between -10.0 and 50.0, inclusive.

This column may be NULL if the software did not report this value.

AirTemp_Max_Time (Time of Maximum Air Temperature)

A time stamp indicating the minute in which the AirTemp_Max occurred.

This column may be NULL if the software did not report this value.

Wind_Dir (Wind Direction)

Wind direction in degrees from North. Values must be between 0.0 and 360.0, inclusive.

Caution

The values of 0.0 and 360.0 represent the same direction. There's no telling if one or the other of them means something special, like no measurement. If they really do represent the same direction then we should probably change the rules and adjust the data values so that legal values are between 0 and 359.

This column may not be NULL.

WindSpeed_Max_Km_Hr (Maximum Wind Speed, in km/hr)

Maximum wind speed for this hour, in km/hr. Values must be between 0 and 30, inclusive.

This column may be NULL if the software did not report this value.

WindSpeed_Max_M_S (Maximum Wind Speed, in m/s)

Maximum wind speed for this hour, in m/s. Values must be between 0.0 and 15.0, inclusive.

This column may be NULL if the software did not report this value.

WindSpeed_Max_Time (Time of Maximum Wind Speed)

A time stamp indicating the minute in which the maximum wind speed[170] was recorded.

This column may be NULL if the software did not report this value.

Barometer (Barometric pressure)

Atmospheric pressure at the TimeStamp, expressed in kPa and corrected for elevation. Standard atmospheric pressure at sea level is 101.325 kPa, so this column's values must be between 96.3 and 106.3, inclusive.

This column may be NULL if the software did not report this value or the reported value was subsequently recognized as erroneous.

YearlyRain (Yearly Rainfall)

The amount of rain measured since the beginning of the year, in millimeters. Values must be integers greater than or equal to 0.

This column may be NULL if the device did not report this value.

TimeStampRain (Rainfall for this TimeStamp)

The amount of rain that was measured at this WStation since the previous TimeStamp.

This column may not be NULL.

Lightning_Strikes

An integer, indicating the number of lightning strikes recorded throughout the hour represented by this row.

This column may be NULL if the software or device did not report this value.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.

WREADINGS (Weather Readings)

The WREADINGS table contains one row for each time a person has collected data from the meteorological instruments. So, each WREADINGS row should have at least one associated RAINGAUGES, TEMPMINS, or TEMPMAXS row, but no more than one associated row from any one of these tables.

Note

Automated weather readings are not recorded in WREADINGS .

For any one weather reading the minimum recorded temperature cannot exceed the maximum recorded temperature -- the TEMPMINS.Tempmin value related to the WREADINGS row cannot exceed the related TEMPMAXS.Tempmax value.

The combination of WRdaytime and Wstation must be unique.

The Wstation column cannot be changed when there is a related RAINGAUGES row.

WRid

A unique positive integer representing the meteorological data collection event.

This column is automatically maintained by the database, cannot be changed, and must not be NULL.

Wstation

Code indicating the station from which the data were collected. Must be a value on the WSTATIONS table.

WRdaytime

The day and time the meteorological data were collected. The time zone is Nairobi local time.

Estdaytime

TRUE when the WRdaytime column contains an estimated time. FALSE when the WRdaytime column is an accurate record of the time the measurement was taken.

WRperson

Initials of the person who collected the data. Must be a value contained in the Initials column of a row on the OBSERVERS table.

WRnotes

Textual notes on the weather reading.

This column may be NULL when there are no notes.

This column may not be empty, it must contain characters, and it must contain at least one non-whitespace character.

Sys_Period

The timestamp range during which this row's data are considered valid. See The Sys_Period Column for more information.



[26] There are, of course, also system generated row identifiers, which are arbitrary and not derived from any field collected data.

[27] As opposed to using a query to let the database do all the considering for you.

[28] This is a generated error instead of one that is immediately raised in order to ease the data entry process. Because births are recorded before CENSUS rows are entered so that new births do not raise errors when uploading census data, new births regularly have dates that follow the mother's Statdate. This could be avoided by entering births without a Pid and then updating the Pid once the CENSUS table has been updated but this was deemed overly burdensome.

[29] Recall that when an individual has no non-absent CENSUS rows, their Statdate is set to their Entrydate, which might be before the LatestBirth. It is therefore presumed that a Statdate being before a LatestBirth will only ever be a temporary occurrence that will go away after the individual's CENSUS data have been added.

[30] If this is ever so, an Entrytype indicating "Birth" is arguably not appropriate.

[31] This number was chosen based on data management minutiae related to the fact that a single census in a group can interpolate an individual present in the group for up to 14 days. If this value in the interpolation code ever changes, then the number of days that LatestBirth is allowed to be after Entrydate should be re-evaluated.

[32] Thanks to the annoying habit of certain months to not be exactly thirty days — not to mention "leap days" — it's possible that different users may have slightly different interpretations of how many days are contained in "X" years. To allow some flexibility when making these estimates, this rule is implemented as a warning and not an error.

[33] This column was added when PostgreSQL depreciated its hidden identifier column, Oid.

[34] This is unlikely as the database will not allow entry of a duplicate Sname.

[35] At the time of this writing, the focal sample data collection devices use the Sname XXX for their own special purposes. There may be other such reserved Sname values unknown to Babase.

[36] Or whatever you want to call it in the case of a fetal loss.

[37] This is termed a visit in the Protocol for Data Management: Amboseli Baboon Project, which should be consulted for further details.

[38] Unless subsequent data (retcons) suggest that the January visit was the true Start of the JLA?

[39] D usually occurs when a male is seen alone or in a non-census group.

[40] When the Status column is D, the value of the Cen column indicates whether or not the individual was marked absent on the field census for the day.

[41] Facilities exist to require such CENSUS rows and their associated DEMOG rows be entered in a single transaction, and the rule requiring CENSUS rows with a Status of D to have a related DEMOG rows could then be enforced.

[42] DEMOG nearly makes the M CENSUS Status code obsolete, were it not so hard to search on textual data. Indeed, it was created in response to difficulties with the M code.

[43] It may seem odd that the Comment column may be NULL given that this is the only column in the table containing baboon-related data. However the data entered into the database can be an abbreviated version of the actual demography note, abbreviated even into non-existence.

[44] The system checks the group in which the individual was last censused present rather than the individual's Matgrp in order to accommodate group splitting.

[45] Presently group 9.0. This value is hardcoded at present.

Individuals are generally put in the unknown group when interpolation does not know their group membership, but it is also possible for an individual to be explicitly placed in the unknown group.

[46] This implies that a GROUPS row's From_group and it's To_group cannot be equal.

[47] As opposed to it being merely a coincidence that the gap began the same date that the group did.

[48] Again, as opposed to it being coincidence that gap and group ended at the same time.

[49] Because there is not a separate column for fusion start date Babase can only track fusions when all groups involved start fusing on the same date. Babase cannot track fusions involving more than one group when 2 groups begin to fuse and others fuse later before the first 2 groups complete fusing.

[50] The precise definition of an "official" study group is left for data management to determine.

[51] In constrast to birth and death, which mercifully tend to be pretty definite.

[52] At the time of this writing, the date used in the case where the transition to sexual maturity was not observed is the date when the individual first came under observation and was already mature.

[53] The ON date MSTATUSES code is a special value. See MSTATUSES: Special Values.

[54] Note that this is not literally true, because testicular changes in males are not tracked on a daily basis - males are assigned a matured date on the first day of the month in which seen with fully round testes. Likewise, a female's first Tdate will sometimes have a few days of error around it, as might other transitions.

[55] ...or average, or standard deviation, etc....

[56] This value was chosen somewhat arbitrarily. It's certainly possible to have more than 9 of a particular wound or pathology affecting a body part, but for our uses such a high number is implausible. This value may need to be adjusted in the future.

[57] Therefore during periods of continuous observation no sexual cycle transition events can go unrecorded. See the CYCPOINTS documentation below for the constraints this places on CYCPOINTS within a series.

[58] Yes, updates to CYCPOINTS can result in automatic changes to the CYCGAPS.State, meaning that updates to both tables are occurring in a single transaction. This is okay, because updates to State do not result in changes to CYCGAPDAYS.

[59] Admittedly, validation on CYCPOINTS and other tables could be rewritten to eschew CYCGAPDAYS and use CYCGAPS instead. However, that would result in a major performance dip, so let's not do it unless we have to.

[60] The CYCPOINTS.Cid column relates the CYCPOINTS row to the cycle of which it is a part.

[61] It is expected that these rows will exist only until related CYCPOINTS rows are entered.

[62] See Appendix C for an example.

[63] This rule minimizes the degree to which CYCPOINTS move between cycles, minimizes the degree to which their Cids change.

[64] It may not be worth documenting this, as there are certainly cases where it is not clear which rows are earlier. One such case is changing the date of a Ddate to a later date, that fall after subsequent cycles. If there is concern about the permanence of Cids then it may be best to simply delete CYCPOINTS rows and re-insert them rather than modify existing rows. This at least gives the greatest degree of control over the Cid values.

[65] Quite a bit of Babase's logic relies on there being a continuous series of Mdate, Tdate, Ddate sequences unless there are gaps in observation. It is for this reason that cycles must be complete.

[66] This is checked rather than enforced by index or trigger because the condition must exist temporarily as the triggers update the Seq.

[67] See the PREGS.Conceive documentation.

[68] The system allows the condition to occur to provide an opportunity to insert a new Mdate, Ddate, Tdate aggregate -- a new cycle -- into the middle of a period of observation. One of these dates must be inserted first, breaking, for the moment, the pattern of cycling -- the repetition of the Mdate, Ddate, Tdate sequence.

[69] This is enforced in triggers rather than by index as the triggers use this condition as a test for whether a new CYCLES row must be created.

[70] It is expected that such rows will exist only until PREGS.Conceive is updated with a reference to them.

[71] Note that cycles may be cut off, for a variety of reasons; some cycles may only contain a single CYCPOINTS row, that is, the Cid value may be unique to a single CYCPOINTS row.

[73] Or was in progress when observation ceased, which Babase treats the same as pregnancies in progress at the time data entry ceased. When now is is an important consideration in the determination of what in progress means. The cessation of data entry (e.g. BIOGRAPH.Statdate), for whatever reason, is the closest Babase comes to the concept of now.

[74] This implies that each Resume value differs from all the others.

[75] Zdate really.

[76] This condition also ensures that a female will not have more than one ongoing pregnancy, as pregnancies require a conception cycle.

[77] It is expected that such Tdates will exist only long enough to update a pregnancy's Resume value.

[78] There should only be CYCGAPS rows when a sexual cycle event may have been missed, but clearly when there is a CYCPOINTS.Resume value then no sexual cycle was missed.

[79] The MATERNITIES view does exactly this. It can be used whenever there is a need for these tables to be joined in this way.

[80] Why is this round-about-the-barn way preferred? Because curmudgeonly old database designers like to insist that keys contain no meaningful information, that's why.

[81] See? We told you that keys should not contain meaningful information.

[82] This indication of a period of no observation is not validated against the CYCGAPS table, that serves as a record of periods of no observation which are long enough that a sexual cycle transition event (Mdate, Tdate, or Ddate) may be missed. Babase does not have records of periods of no observation that are long enough to miss pregnancies. Although it would seem that CYCGAPS could be used for this purpose, and indeed CYCGAPS does black out REPSTATS, validating parity against CYCGAPS has not been thought through and awaits a future Babase enhancement.

Regardless, Babase does not presently automatically place a parity in the 100's -- the decision to switch between the 100s and the 1s (or 10s) must be made manually.

[83] This criteria is carefully phrased to account for gaps in the recorded data during the time period in which deturgesence probably began.

[84] When an individual matures, at menarche, there is no Mdate in the first sexual cycle.

[85] notably consortships

[86] There is no restriction on the age or maturity status of the female.

[87] This is not always as useful as it seems. See the rationale for the PARTS table.

[88] It is not that these interactions never occur among young individuals, it is that the researchers' interest is in paternity and maternity and so find that having to concern themselves with filtering out sexual interactions between juvenile individuals is distracting.

[90] and perhaps ejaculation

[91] Presumably data that is collected on a Psion or other electronic device.

[92] Requiring INTERACT_DATA.Observer be NULL, even when the existing value is correct and synchronized with SAMPLES.Observer, ensures that the value of the observer column has been taken into consideration by the person modifying the database.

[93] Consult the Amboseli Baboon Research Project Monitoring Guide to be sure, but this is because the accuracy of the data are never more than one minute, if that.

[95] As opposed to recording the interaction with an electronic device.

[96] Whether or not a MPI_DATA row records a request for help is determined by whether or not the value of the related MPIACTS.Kind column is R.

[97] Because the individual from whom help was requested is unknown, there is no way to tell if help was given in response to the request.

[98] Note that if we had the time the sample started, to the second, and we knew that the operator never took more than 59 seconds to enter the point data, and we assume that the operator makes the observation when the timer chimes, then we could calculate the actual time the point was observed. Absent these conditions it appears difficult or impossible to tell which of the 1 minute observation intervals were missed when there is not an exact match between the number of points taken and the total number of minutes in the sample.

[99] It is possible to create a view that extends the NEIGHBORS table by adding another column, call it Neighbor, that contains either the Sname or the Unksname, which ever is not NULL. However, the utility of such a column is not obvious because it seems that any analysis done using such a column would have to consistently use outer joins and then constantly test for NULL results, lest the Unksname data disappear from the analysis. At first glance this seems similar to the testing which must be done to when using two separate columns, the existing design, so it is not clear whether there's anything to be gained.

Such a view can always be added in the future without breaking backward compatibility.

[100] Assuming that the neighbor is a known individual, that the NEIGHBORS.Sname column is not NULL.

[101] The information on the actual unknown neighbor codes used in the field does not appear to be in the Amboseli Baboon Research Project Monitoring Guide.

[102] The name of the focal individual is always recorded, as there is always the intention to observe the focal individual even though this does not always happen.

[103] As the values in the POINT_DATA.Ptime column has little to do with the actual time of observation, it is impossible for Babase to perform additional consistency checks to between the points and the corresponding summary information in SAMPLES. Fortunately, as the data loading process is automated, there is little opportunity for data corruption.

[104] As all observation occurs during the day there are no issues surrounding samples taken just before midnight that start on one day and end on the next. Should there ever be such, this should be the date the sample started.

[105] The anesthetic administration times are not aggregated in this view although it could be useful to aggregate the difference between the time of darting and the time additional anesthetic was administered.

[106] To cover the case where Dartings-Pickuptime is NULL.

[107] To catch the case where Downtime is NULL.

[108] The column is allowed to be NULL due to data entry procedural constraints. The first data uploaded creates rows in DARTINGS but the data set containing mass is not uploaded until later.

[109] In a canonical database design this column would be on the DPHYS table. The column is part of the DARTINGS table due to concerns that the column might be overlooked by a user because so many other note columns are on the DARTINGS table.

[110] In a canonical database design this column would be on the DART_SAMPLES table. The column is part of the DARTINGS table due to concerns that the column might be overlooked by a user because so many other columns are on the DART_SAMPLES table and DSAMPLES view.

[111] This behavior exists so that rows can be inserted into TEETH via the DENT_CODES view.

[112] The alternative to this, an approach closer to the ideal database design, is to have separate tables for width and length measurements. This seems excessive.

[113] This rule is a result of the aforementioned design choice that places Testwidth and Testlength in the same table. A consequence of this choice is that this rule must exist to ensure that Testseq values are, effectively, contiguous.

Note that this condition must remain true even while the rows are in the process of automatic re-sequencing. It may be that some combinations of data values will simply not work with all possible UPDATE statements that change the row sequencing. Those experiencing problems should delete the rows in question and re-insert them with the correct sequence numbers.

[114] The alternative to this, an approach closer to the ideal database design, is to have separate tables for width and length measurements. This seems excessive.

[115] This rule is a result of the aforementioned design choice that places Testwidth and Testlength in the same table. A consequence of this choice is that this rule must exist to ensure that Testseq values are, effectively, contiguous.

Note that this condition must remain true even while the rows are in the process of automatic re-sequencing. It may be that some combinations of data values will simply not work with all possible UPDATE statements that change the row sequencing. Those experiencing problems should delete the rows in question and re-insert them with the correct sequence numbers.

[116] Also "tissue" and "tissue sample", but those two terms aren't terribly different anyway.

[117] This is expected to be the highest plausible accuracy to ever be used for the concentrations stored in this table. This can easily be expanded if needed.

[118] Even in the coldest of cold storage, frozen samples will slowly evaporate over time. A 100-μL sample that is frozen and stored for 5 years is unlikely to still be the full 100 μL at the end of that time.

[119] It is presumed that any reader who cares enough about nucleic acid samples to read this documentation is already familiar with the polymerase chain reaction. We will not attempt to explain it here.

[120] Admittedly, this approach is imperfect and is likely underestimating the true prevalance of the problem. The date written on a sample may not be the true date it was collected but may still be a date that the individual was censused. Unfortunately, there is little else that the system can do to recognize when this occurs.

[121] In real life, this rule could easily be violated. Something is going to need to change, before long.

[122] That is, the population whose data are recorded throughout the many tables in Babase.

[123] Related rows in this table are automatically inserted when rows are inserted into BIOGRAPH, so manual insertion of these rows is effectively not allowed.

[124] Similar to inserts, related rows in this table are automatically deleted when rows are deleted from BIOGRAPH, so manual deletion of these rows is effectively not allowed.

[125] Waterholes may be more or less permanent features of the landscape, or only temporary rain pools. This is no surprise to those familiar to the SWERB dataset, but whenever waterholes are mentioned in relation to SWERB data the waterhole may be either a waterhole or a rainpool.

[126] It is believed but not certain that this is the way PDOP is used.

[127] It is not clear whether the accuracy is 2 or 3 dimensional vector; whether the reported distance includes error in altitude.

[128] Because database rules which enforce when PDOP and Accuracy values must be NULL are hardcoded into the database it will take programmatic changes to change these limits. Normally this would be avoided by adding a column to the GPS_UNITS table to indicate whether or not the particular GPS unit records a PDOP or accuracy reading, thus allowing new units to be introduced which record such data. However because records have been lost as to which specific GPS units were used when and, as of the time of this writing, no one wishes to reconstruct the categories of GPS units in use based on a PDOP/Accuracy capability criteria the system design uses hardcoded dates to validate. Note further that given the existing set of validation criteria for PDOP and Accuracy there is never a circumstance which requires a PDOP or accuracy to be present. Normally the values of GPS_UNITS.Errortype would force the presence of PDOP or Accuracy values. Instead they merely enforce their absence. This is partly for reasons similar to the preceding and partly because, particularly during periods when GPS data was hand-transcribed, sometimes data is missing.

[129] And, possibly, subsequently corrected by the data specialists after consultation with the field teams.

Because the data manager expands the observer codes in the departure rows from 1 to 3 characters the SWERB_DEPARTS_GPS.Garmincode column can hold more than 10 characters.

[130] From a database design perspective it would make sense to control whether or not a Garmincode must be present based on a column in the GPS_UNITS table. In practice because all future GPS units will very likely allow the entry of data when waypoints are taken the matter is moot.

[131] While it may be desirable to have a cutoff date after which all data obtained using GPS units must come from the GPS units themselves, no such cutoff date has been established.

[132] Electronic manufacturers have taken to silently changing the specifications of a device without changing the model, a situation which is quite annoying when the specifications matter. When no other sort of identifying information is available sometimes the serial number can be used to determine device capabilities.

[133] The Amboseli Baboon project data protocols require these codes have a particular structure. Babase does not enforce these requirements, primarily because the QUAD_DATA table is essentially a support table and, once created, is static so enforcing specific rules in the database is not worth the time.

[134] Note that rows that violate this rule are not instantly rejected; the error is caught at the time of transaction commit. This is so that during data entry Btimeest and Etimeest values may be entered without Start and Stop values in the expectation that by the time the transaction is committed the insertion of SWERB_DATA rows will have automatically filled in the missing Start and Stop values.

[135] This last check is also performed at transaction commit time, for the same reason.

[136] Ideally, a begin or end time should not be NULL unless the records have been perused and no time found, in which case the time source would always be bb_norecord when there was no time. In practice this has not been done.

[137] Note that this rule is tested for immediately, not at the time of transaction commit. This means that the Btimeest and Etimeest columns must be non-NULL before inserting SWERB begin and end rows that have non-NULL times.

[138] More precisely, when the SWERB_BES.Seq is NULL. This typically amounts to the automatic sequencing of newly inserted rows because those are the rows which typically have no Seq value.

[139] At first glance it would seem appropriate to sequence those SWERB_BES rows with NULL Start times based on the first related SWERB_DATA.Time value but this presents a number of problems. Such a design would not allow for any flexibility in manually re-sequencing such rows unless automatic sequencing took place only upon insert of SWERB_DATA rows, in which case inserting and then deleting the inserted row could change the sequencing of the SWERB_BES rows. Such un-reversible changes can be confusing.

[140] Or whatever row has has a NULL SWERB_BES.Seq value.

[141] Manual sequencing is therefore only useful when the SWERB_BES.Start is NULL or when there are ties. Sequencing is normally manipulated by changing SWERB_BES.Start values, which are themselves automatically picked up from SWERB_DATA rows with B Event values.

When testing for correct sequencing of a SWERB_BES row other bouts of observation (other SWERB_BES rows) related to the same group on the same day cannot have a smaller Seq and also have a Start value greater than the smallest related SWERB_DATA.Time related to the given row. In those cases where other bouts of observations related to the same group on the same day have a NULL Start value the comparison is instead against the other bout's earliest related SWERB_DATA.