[Babase] Jeanne's comments on Karl's Babase documentation
Susan Alberts
babase@www.eco.princeton.edu
Wed, 10 Aug 2005 14:52:56 -0400
Susan's comments included below with Jeanne's.
>
>
>I've typed Jeanne's comments into the text of Karl's Babase
>documentation. Jeanne had written her notes on print-outs, so I've
>done my best to transfer everything to this electronic version. For
>the most part, any inserted comments refer to the paragraph, line,
>or sentence immediately above the added text.
>
>Note that comments on the Interpolation section are not yet
>complete. I'll send the rest of Jeanne's notes for that section
>soon. (Comments on Biograph, Members, Census, and Demog are
>complete.)
>
>Catherine
>
>> Babase:
>>
>> Technical Specifications for the Amboseli Baboon Project Data
>> Management System
>>
>> Karl O. Pinc
>>
>> The Meme Factory, Inc.
>>
>> Jeanne Altmann, PhD.
>>
>> Princeton University
>>
>> Susan C. Alberts, PhD.
>>
>> Duke University
>>
>> ER Diagram layout and conversion to Dia: Leah Gerber
>>
>> Docbook formatting: Anne Ndeti Hubbard, Karl O. Pinc
>>
>> Copyright (c) 2005 Karl O. Pinc, Jeanne Altmann, Susan
>> Alberts, Leah Gerber, The Meme Factory, Inc.
>>
>> Permission is granted to copy, distribute and/or modify
>> this document under the terms of the GNU Free
>> Documentation License, Version 1.2 or any later version
>> published by the Free Software Foundation; with no
>> Invariant Sections, no Front-Cover Texts, and no
>> Back-Cover Texts. A copy of the license is included in
>> the section entitled "GNU Free Documentation License."
>>
>> March 2, 2005
>>
>> +---------------------------------------------------------+
>> | Revision History |
>> |---------------------------------------------------------|
>> | Revision 0.0 | March 2, 2004 |
>> |---------------------------------------------------------|
>> | Initial document |
>> +---------------------------------------------------------+
>>
>> -------------------------------------------------------
>>
>> Table of Contents
>>
>> Introduction
>>
>> This Document
>>
>> System Designs
>>
>> To Start BABASE
>>
>> Data organization
>>
>> Databases
>>
>> Users, Groups and Database Permissions
>>
>> Schemas
>>
>> Organization of the Babase Program Code
>>
>> Data Relationships
>>
>> The Master Tables
>>
>> GROUPS (Groups)
>>
>> Data Entry Rules
>>
>> Data Element Descriptions
>>
>> BIOGRAPH (Baboon Biographical Data)
>>
>> Column Descriptions
>>
>> MATUREDATES (Sexual Maturity Dates)
>>
>> Matured
>>
>> Mstatus (Sexual Maturity Status)
>>
>> RANKDATES (Adult Rank Attainment Dates)
>>
>> Ranked
>>
>> CONSORTDATES (First Consortship Dates)
>>
>> Consorted
>>
>> DISPERSEDATES (Dispersal Dates)
>>
>> Dispersed
>>
>> PREGS (Pregnancies)
>>
>> Data Entry Rules
>>
>> Data Element Descriptions
>>
>> CYCGAPS
>>
>> Data Entry Rules
>>
>> Data Element Descriptions
>>
>> CYCPOINTS
>>
>> Data Entry Rules
>>
>> Data Element Descriptions
>>
>> SEXSKINS (Sexskin Turgescence Measurements)
>>
>> Data Entry Rules
>>
>> Data Element Descriptions
>>
>> MEMBERS (Group Membership)
>>
>> Column Descriptions
>>
>> CENSUS
>>
>> Column Descriptions
>>
>> DEMOG (Demography Notes)
>>
>> Column Descriptions
>>
>> RANKS (Rankings Within Groups)
>>
>> Data Entry Rules
>>
>> Data Element Descriptions
>>
>> INTERACT (Interactions)
>>
>> Data Entry Rules
>>
>> Data Element Descriptions
>>
>> PARTS (Participants in interactions)
>>
>> Data Entry Rules
>>
>> Data Element Descriptions
>>
>> Sname
>>
>> Role
>>
>> Iid
>>
>> REPSTATS
>>
>> Data Entry Rules
>>
>> Data Element Descriptions
>>
>> CYCSTATS
>>
>> Data Entry Rules
>>
>> Data Element Descriptions
>>
>> THE SUPPORT TABLES
>>
>> BSTATUSES
>>
>> STATUSES
>>
>> DCAUSES
>>
>> MSTATUSES
>>
>> WSTATIONS
>>
>> ACTS
>>
>> RNKTYPES
>>
>> Interpolation
>>
>> Interpolation's 3 Fundamentals
>>
>> Interpolation Visualized
>>
>> The Interpolation Rules
>>
>> Expectations and Implications
>>
>> Data Entry
>>
>> Automatically Generated IDs
>>
>> The Dataset Tables
>>
>> Datasets Containing INTERACT and PARTS Data
>>
>> Datasets Containing CENSUS Data
>>
>> Datasets Containing DEMOG Data
>>
>> Datasets Containing CYCLES Data
>>
>> BABASE PROGRAMS
>>
>> Data Maintenance Programs
>>
>> Useful Programs and Functions
>>
>> A. Changes to Babase between 1.0 and 2.0
>>
>> Changes to .Statdate
>>
>> Changes To Interpolation and MEMBERS
>>
>> Changes To The Sexual Cycle Information
>>
>> B. Docbook, Styling and other issues
>>
>>Introduction
>>
>> This Document
>>
>> This document describes the BABASE baboon data management
>> system. This includes a description of the tables, the
>> intended use of all related programs and directories, the
>> design of the system, and procedures for maintaining the
>> data management system itself. This document does not
>> include the procedures actually used to enter data into the
>> system, or the details of how to operate the system's
>> programs. Nor does it include any instructions on the
>> operation or administration of the computer itself. Further
>> information on the topics not covered in this document can
>> be found in the Protocol for Data Management: Amboseli
>> Baboon Project document.
>>
>> The Protocol for Data Management: Amboseli Baboon Project
>> document is an important adjunct to the BABASE system, but
>> it is not considered part of the system itself because it
>> describes the use of the system but not the capabilities of
>> the system. It is important to maintain the distinction
>> between use and capabilities so that when an enhancement is
>> needed, it is clear whether the desired result can be
>> obtained by altering the way the system is used, or whether
>> the system itself needs to be modified. It is also
>> important to provide different types of documentation to
>> those who operate the system and to those who manage and
>> maintain it, because neither group needs to know all the
>> details of the other's work.
>>
>> Any deviation from the standards described in this document
>> should be discussed with the project director and may God
>> have mercy on your souls.
>>
>> Conventions Used In This Document
>>
>> All TABLE NAMES are written in UPPER CASE. Column Names are
>> in lower case with Initial Capitals.
>>
>> Significant but often slightly off-topic paragraphs are set
>> off from the surrounding material as a note, presented as
>> follows:
>>
>> Example 1. An example of a note
>>
>> Note
>>
>> Written material has no voice that can be raised, but
>> attention can be drawn with typographical conventions.
>>
>> When the reader should take care, particularly when the
>> system might do something unexpected in a given
>> circumstance, this is noted in a caution. Cautions are set
>> off from the surrounding text like this:
>>
>> Example 2. An example of a caution
>>
>> Babase will reject your change if you try to do something
>> that is not allowed, like giving a male an onset of
>> turgescence date.
>>
>> Caution
>>
>> When the rejected change is one of a number of changes
>> bundled into a transaction, none of the changes will make
>> it into the database.
>>
>> When a mis-use of the system will lead to incorrect
>> results, particularly when such results are not obvious,
>> this document contains a warning. Warnings are set off from
>> the surrounding text like this:
>>
>> Example 3. An example of a warning
>>
>> Warning
>>
>> Babase cannot detect when an Sname is mis-typed, so it is
>> possible to inadvertently assign a female's sexual cycle
>> to the wrong female.
>>
>> To otherwise draw the reader's attention to material, some
>> text is marked important. Important text is set off from
>> the surrounding material like this:
>>
>> Example 4. An example of text denoted important
>>
>> Babase has a number of components; many of them, like the
>> SQL web interface, are third party tools, not written by
>> the Babase developers.
>>
>> Important
>>
>> When the third party tools are upgraded their "look" may
>> change but the features they provide should remain. As
>> Babase is composed of Free Software the Babase project
>> always has the option of customizing any of its third
>> party tools and can contribute its improvements back to
>> the program's developers for inclusion into future
>> releases.
>>
>> Suggestions as to how to use Babase are noted in tips, as
>> are remarks on how data is presently entered in Babase or
>> recorded in the field. These are set off from the regular
>> text of the document like this:
>>
>> Example 5. An example of a tip
>>
>> Tip
>>
>> Lick all the chocolate off your fingers before beginning
>> data entry.
>>
>> Often, the tips are the result of best practice developed
>> from considered experience and so document how Babase is
>> used at the time of this writing. However, as best practice
>> continues to develop and field protocols change, the
>> Protocol For Data Management and the field protocol
>> documentation should always be consulted. Those documents
>> have precedence over the tips presented herein should there
>> be conflicting advice.
>>
>> Supplemental and cross referential material is presented in
>> footnotes.
>>
>> A Guide for the Reader
>>
>> Anyone who is changing or adding programs to the system
>> should read this entire document. Section III The Master
>> Tables is particularly important for all those using the
>> system. Section V The Dataset Tables is of little interest
>> to those who only want to retrieve information from the
>> system. Section VI.B Useful Programs and Functions is of
>> interest to the more sophisticated user. People who will
>> not be programming need not read section VIII Standards,
>> although sections VIII.A General Program Standards and
>> VIII.B User Interface Standards may prove interesting.
>>
>> System Designs
>>
>> The BABASE system is designed to facilitate the retrieval,
>> storage, and maintenance of the Amboseli Baboon Project
>> data. The system consists of tables to store and organize
>> the data, programs used to facilitate the entry and
>> maintenance of the data, and this set of procedures and
... and the set of procedures and ... (not ...and this set...)
>> standards supporting the maintenance of the BABASE system
>> itself. It is currently expected that the bulk of the data
>> retrieval will be done using standard utilities and so
>> there is little programmatic support for data retrieval and
>> manipulation. The overall philosophy of the system's
>> implementation is to keep the system as easy to maintain as
>> possible. To this end, the system is designed to contain
>> only the most crucial features. The number of FoxPro**
>> program development tools used in the construction of the
>> software is kept to a minimum so that a new programmer need
>> not spend a lot of time learning the FoxPro** development
>> environment. At the present time, this means that the
>> programs are limited to use of the FoxPro** language and
>> the screen builder; no queries, reports, etc., are used.
>> All data are kept in FoxPro** databases.
Note that references to foxpro still need to be modified throughout.
>>
>> Note
>>
>> The design attempts 5th normal form1,
Is this supposed to 5th normal form or 5th normal form1?
>> no redundant data, no
>> empty data elements allowed, etc. What we've actually wound
>> up with is about 3rd normal form.
>>
>> To Start BABASE
>>
>> The Babase system is accessed over the web. Any web browser
>> may be used to view the data using the phpPgAdmin generic
>> database interface. More advanced usage of the website will
>> likely require a web browser which conforms to the
>> international standards for the web defined by the World
>> Wide Web Consortium, otherwise known as the W3C, as we
>> have put forth no particular effort to accommodate
>> non-standards conforming browsers. The browser must support
>> CSS2 style sheets and XHTML 1.0. Note that at the time of
>> this writing Microsoft Internet Explorer does not provide
>> adequate style sheet support. Other browsers which do have
>> such support include Mozilla, Mozilla Firefox, Apple's
>> Safari, and Opera. The W3.org site maintains a list of
>> browsers supporting style sheets.
>>
>> Babase's URL (web address) is
>> https://papio.biology.duke.edu. Be sure to type the s in
>> https. This secures your web connection.
>>
>> You must access most of the Babase web site using a secure
>> communications protocol (HTTPS) which encrypts all
>> communication to foil eavesdroppers and checks the identity
>> of the web site itself. The Babase project has signed its
>> own security certificate, the certificate that ensures you
>> are talking with the website you think you are. ^[1] Our
>> certificate expires annually and is re-generated.
>>
>> Your browser very likely will not trust that our website is
>> who it says it is and so will very likely object when you
>> first access the Babase web site, and annually thereafter.
>> You may tell your browser to accept our certificate
>> permanently.
>>
>>Data organization
>>
>> Databases
>>
>> Databases are collections of information, all of which can
>> be queried and otherwise manipulated alone or in
>> aggregation with all other database content. ^[2] Babase
>> contains three databases.
>>
>> The babase Database
>>
>> The babase database contains the "real" information. All
>> research takes place in this database.
>>
>> The babase_copy Database
>>
>> The babase_copy database contains a copy of the babase
>> database. It is a place to try out dangerous things which
>> might break the babase database.
>>
>> The babase_test Database
>>
>> The babase_test database contains a few bits of made up
>> information. It is a place to try out random things and a
>> place where the babase developers can work on alterations
>> and enhancements.
>>
>> Users, Groups and Database Permissions
>>
>> Each user is given a login and a password they must use to
>> gain access to the database. It is good form to change your
>> password occasionally. ^[3]
>>
>> The database can grant specific users various levels of
>> access to specific tables, although such access is not
>> common as it is difficult to administer and maintain such a
>> fine grained degree of control. For further information see
>> the PostgreSQL Manual section on Database Users and
>> Privileges.
>>
>> Rather than maintain database access privileges on a
>> per-user basis it is more convenient to place users in
>> groups and then grant these groups different levels of
>> database access.
>>
>> Babase contains the following groups:
>>
>> The babase_readers group
>>
>> The members of this group have read access to Babase data
>> and cannot add, delete, or otherwise alter any of the data.
>>
>> The babase_editors group
>>
>> The members of this group have unlimited rights to the
>> Babase data. They may add data, delete data, or alter
>> existing data. They may not, however, alter the structure
>> of the babase database or change the rules to which the
>> data is required to conform. Thus, they may not add or
>> delete tables, alter triggers, or write or replace stored
>> procedures.
>>
>> Schemas
>>
>> Schemas are a convenient way to allow or deny people access
>> to various parts of a database. Tables, procedures,
>> triggers, and so forth are all kept in schemas. Schemas are
>> like mini databases, except that a single SQL statement can
>> refer to objects in the different schemas within a
>> database, but cannot refer to objects in other databases.
>>
>> Each database is divided into the same schemas. That is,
>> each schema described below exists within each of the
>> databases described here.
>>
>> The system looks at the different schemas for objects, for
>> example table names appearing in SQL queries, in the order
>> in which the schemas are listed below. If the table does
>> not appear in the first schema it looks in the second, and
>> so forth. As soon as a table is found with the name given,
>> that table is used and the search stops.
>>
>> To reference an object in a specific schema, place the name
>> of the schema in front of the object, separating the two
>> with a period (e.g. schemaname.tablename).
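
The search order described above can be sketched as follows. This is a minimal illustration in Python, not the database's own lookup code: the schema contents, the `mylogin` user, and the `resolve` helper are all hypothetical, and the search order is assumed to be the order in which the schemas are described in this section (babase, babase_sandbox, then the user's own schema).

```python
# Hypothetical schema contents; the table names are for illustration only.
SCHEMAS = {
    "babase": {"biograph", "members", "census"},
    "babase_sandbox": {"census", "scratch"},
    "mylogin": {"foo", "biograph"},
}

# Assumed search order: the order in which the schemas are described.
SEARCH_ORDER = ["babase", "babase_sandbox", "mylogin"]


def resolve(name, schemas=SCHEMAS, search_order=SEARCH_ORDER):
    """Return (schema, table) for a possibly schema-qualified table name."""
    if "." in name:
        # A qualified name (schemaname.tablename) bypasses the search order.
        schema, table = name.split(".", 1)
        if table in schemas.get(schema, set()):
            return schema, table
        raise LookupError(f"no table {table!r} in schema {schema!r}")
    for schema in search_order:
        # The first schema holding the table wins and the search stops.
        if name in schemas[schema]:
            return schema, name
    raise LookupError(f"no table {name!r} in any searched schema")
```

With these made-up contents, `resolve("biograph")` finds the table in the babase schema even though `mylogin` also has one, which is why a table in a per-user schema has to be reached with a qualified name such as `mylogin.biograph`.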
>>
>> The babase schema
>>
>> The babase schema holds the "official" Babase tables.
>> Everything in the babase schema is documented and
>> supported.
>>
>> In this schema the babase_readers and babase_editors have
>> the access described above.
>>
>> The babase_sandbox schema
>>
>> The babase_sandbox schema holds tables that are used
>> together with the "official" Babase tables but have not yet
>> made it into the Babase project. They will not be
>> documented in the Babase documentation.
>>
>> The groups have the following permissions:
>>
>> babase_readers permissions in babase_sandbox
>>
>> The babase_readers have all the permissions in the
>> babase_sandbox schema that the babase_editors have in the
>> babase schema. They may add, delete, or modify any
>> information in the schema but may not alter the structure
>> of the schema by adding or removing tables, procedures,
>> triggers, or anything else.
>>
>> babase_editors permissions in babase_sandbox
>>
>> The babase_editors have all the permissions of the
>> babase_readers, plus they may add or delete tables, stored
>> procedures, or any other sort of object necessary to
>> control the structure of the data.
>>
>> The per-user schemas
>>
>> Each user has his own schema, a schema named with the
>> user's login. Users have permissions to do anything they
>> want in their own schema, and no permissions whatsoever to
>> anybody else's schema. A user's schema is private.
>>
>> Because of the schema search order, the schema name must be
>> used to qualify anything created in the user's schema. E.g.
>> CREATE TABLE mylogin.foo (somecolumn INTEGER);
>>
>> Organization of the Babase Program Code
>>
>> See the README files in the source tree directories for
>> information on how the source code is organized.
>>
>> Data Relationships
>>
>> The data in babase is stored in tables. Tables can be
>> visualized as grids, with rows and columns. Each row
>> represents a collection of related information, e.g. a
>> baboon's birth date, name, and sex. Each cell in the row
>> contains a single unit of information, e.g. a birth date,
>> a name, or a sex. Each column contains only one kind of
>> information, e.g. birth date.
>>
>> The relationship between the various Babase tables can be
>> visualized in entity relationship diagrams, as shown here.
>> In this diagram each table is a box, and each box contains
>> a list of the table's columns.
>>
>> Figure 1. Babase Sexual Cycle Entity Relationship Diagram
>>
>> If we could we would display a diagram here.
>>
>> Figure 2. Babase Group Membership Entity Relationship
>> Diagram
>>
>> If we could we would display a diagram here.
>>
>> Figure 3. Babase Social Interactions Entity Relationship
>> Diagram
>>
>> If we could we would display a diagram here.
>>
>> Figure 4. Key to the Babase Entity Relationship Diagrams
>>
>> If we could we would display a diagram here.
>>
>>The Master Tables
>>
>> The master tables contain the permanent records of the
>> system. Figure 1 is a graphical overview of the master
>> tables, showing all the recorded data elements and the
>> common relationships between them.
>>
>> Note
>>
>> The indexes on these tables are not documented. In general,
>> there is an index for each way the tables are commonly
>> referenced. For example, if records are often looked up on
>> the basis of date, there will be an index on the date. As a
>> practical guide, there is an index on each of the columns
>> at the endpoint of a "relational line" in the Database
>> Overview diagram below, as well as an index on every date
>> column with the exception of the CYCLES table's Edate and
>> Ldate columns. All indexes on these tables are compound
>> structural indexes. No indexes are unique, even though some
>> of the data always contains unique values, because the
>> unique indexes hide problems with duplicate data.
>>
>>GROUPS (Groups)
>>
>> This table contains one row for every group on which there
>> is some recorded information. This includes not only the
>> regular groups, but also temporary sub-groups and the
>> special group "Unknown"^[4]. (See the protocol for when to
>> use this special group.) When a sub-group becomes a
>> regular group, the new group should be given a new row in
>> the GROUPS table; the new row will be a permanent group
>> with a new Gid value. The "old" sub-group should be left in
>> GROUPS to support the sub-grouping membership history.
>>
>> The field names in this table are actually somewhat
>> misleading. The group membership recorded in other
>> database tables is generally the most detailed information
>> available. Consequently, what is usually recorded as the
>> "group" in these other tables is membership in the
>> sub-group, and many of the rows in this table represent
>> sub-groups. Often, then, the recorded group identifier
>> (Gid) is really a sub-group identifier. The Supergroup
>> field of this table, which aggregates sub-groups, really
>> represents the regular and permanent groups which the
>> animals form.
>>
>> Every reference to a group elsewhere in the BABASE system
>> should correspond to a Gid of one of the records in this
>> table. When a new record is manually entered into this
>> table, the UPSUPERG program should be run to calculate the
>> Supergroup for the new record. Temporary groups should have
>> a From_group value and should not have their own Gid value
>> as their Supergroup. Permanent groups should have their own
>> Gid as their Supergroup. Permanent groups may or may not
>> have a From_group value. During data entry for groups that
>> are fission products of other groups, the fission products
>> have the mother group as their Supergroup. This is always a
>> temporary condition.
>>
>> Data Entry Rules
>>
>> This table is updated with the regular FoxPro** data
>> maintenance tools and the UPSUPERG program. See section 2.0
>> in the Protocol for Data Management: Amboseli Baboon
>> Project document.
>>
>> Data Element Descriptions
>>
>> Gid
>>
>> A positive numeric value with four digits (2 decimal
>> places), which identifies the group. Zero is not a valid
>> group Gid, and each Gid must be unique. This field should
>> not be blank.
>>
>> Name
>>
>> The spelled out name of the group, if any. This field
>> should be unique, ignoring case. This field
>> may be blank.
Can these fields still be blank in PostgreSQL? This applies to the
entries below as well. Should this say "This field may be NULL"?
>>
>> From_group
>>
>> The Gid of the group from which this group split off. If
>> this field is changed manually, the UPSUPERG program should
>> be run. This field may be blank.
>>
>> Permanent
>>
>> This field indicates whether the group is a permanent group
>> or not. The character "Y" in this field means that the
>> group is permanent. If this field is empty, the group is
>> temporary. If this field is changed manually, the UPSUPERG
>> program should be run. This field may be blank.
>>
>> Supergroup
>>
>> The Gid of the permanent group from which this group split
>> off. Most of the time this will be the same as the
>> From_group field, but if the From_group is itself a
>> temporary sub-group then the value would be different:
>> the first permanent group from which the group is
>> descended. This field should not be entered manually. Its
>> value is derived by the UPSUPERG program. This field should
>> not be blank.
>>
>> Note
>>
>> This field is necessary because of the restrictions on
>> user-defined functions usable within SQL statements. If
>> these restrictions did not exist, a user defined function
>> could be used to transform actual groups into their
>> supergroups. This would be a better solution because you
>> would not have to worry whether the UPSUPERG program had
>> been run.
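
The Supergroup derivation described above can be sketched as follows. This is only an illustration in Python under stated assumptions, not the UPSUPERG program: the `supergroup` function, the dictionary layout, and the example Gid values are all hypothetical. It assumes that a permanent group is its own Supergroup and that a temporary group inherits the first permanent group reached by following From_group.

```python
# Hypothetical data: Gid -> (From_group, Permanent). Gids are kept as
# strings so the two decimal places are preserved exactly.
EXAMPLE_GROUPS = {
    "1.00": (None, True),     # a permanent group with no From_group
    "1.10": ("1.00", False),  # a temporary sub-group of 1.00
    "1.11": ("1.10", False),  # a temporary sub-group of a sub-group
    "2.00": ("1.00", True),   # a permanent fission product of 1.00
}


def supergroup(gid, groups):
    """Return the Gid of the first permanent group that gid descends from."""
    seen = set()
    current = gid
    while True:
        if current in seen:
            raise ValueError(f"From_group cycle detected at Gid {current}")
        seen.add(current)
        from_group, permanent = groups[current]
        if permanent:
            # A permanent group is its own Supergroup.
            return current
        if from_group is None:
            raise ValueError(f"temporary group {current} has no From_group")
        # Otherwise keep walking up the From_group chain.
        current = from_group
```

In this made-up data, `supergroup("1.11", EXAMPLE_GROUPS)` walks through 1.10 to reach 1.00, while `supergroup("2.00", EXAMPLE_GROUPS)` returns 2.00 itself because that group is permanent.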
>>
>>BIOGRAPH (Baboon Biographical Data)
>>
>> This table records the basic biographical data on the
>> baboons. It contains one row for each baboon, including
>> aborted fetuses and fetal deaths [Catherine sez: Shouldn't
>> this be "still births"?] (collectively, fetal losses), on
>> which any data have been collected. Those rows that record
>> data on fetal losses must maintain the following relations
>> between their data values: the Sname and Name values must
>> be NULL; the Statdate must be the same as the birth date
>> (Birth); the Status must be 1 (definitely dead); and the
>> Dcause must be 7 (unknown) or 5 (loss of mother). [Jeanne
>> needs to confirm that this is still the case since her
>> changes to DCAUSES.] In all cases the Statdate value must
>> not be less than the Birth value. Live animals must not
>> have a recorded cause of death. Live animals that have no
>> associated CENSUS rows (absences excepted) must have a
>> Statdate equal to their Birth date.
>
>Revisions to the paragraph above:
>
>This table records the basic biographical data on baboons. It
>contains one row for each baboon, including fetal deaths, on which
>data have been collected. In all cases, the Statdate value must not
>be less than the Birth value. Live animals must not have a recorded
>cause of death. Live animals that have no associated CENSUS rows
>(absences excepted) must have a Statdate equal to their Birth date.
>Those rows that record data on fetal losses must maintain the
>following relations between their data values: the Sname and Name
>values must be NULL; the Statdate must be the same as the birth date
>(Birth); the Status must be 1 (definitely dead); and the Dcause must
>be 7 (unknown) or 5 (loss of mother).
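
The fetal loss rules in the paragraph above can be sketched as a row check. This is a hypothetical Python illustration, not Babase code: the `biograph_errors` function and the dictionary keys are made up, a row is assumed to be a fetal loss exactly when its Sname is NULL (None here), and the Dcause codes 5 and 7 are taken from the paragraph as written, which is still awaiting confirmation.

```python
from datetime import date


def biograph_errors(row):
    """Return a list of rule violations for one BIOGRAPH row (a dict)."""
    errors = []
    if row["statdate"] < row["birth"]:
        errors.append("Statdate is earlier than Birth")
    if row["sname"] is None:
        # Assumption: a NULL Sname marks the row as a fetal loss.
        if row["name"] is not None:
            errors.append("fetal loss has a Name")
        if row["statdate"] != row["birth"]:
            errors.append("fetal loss Statdate differs from Birth")
        if row["status"] != 1:
            errors.append("fetal loss Status is not 1")
        if row["dcause"] not in (5, 7):
            errors.append("fetal loss Dcause is not 5 or 7")
    elif row["name"] is None:
        errors.append("individual with an Sname has no Name")
    return errors


# A made-up fetal loss row that satisfies every rule checked above.
EXAMPLE_FETAL_LOSS = {
    "sname": None,
    "name": None,
    "birth": date(2004, 6, 1),
    "statdate": date(2004, 6, 1),
    "status": 1,
    "dcause": 7,
}
```

An empty list means the row passed these particular checks; it says nothing about rules that are not covered here, such as the CENSUS-related conditions on Statdate.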
>
>>
>> All individuals with an Sname, i.e., those which aren't
>> fetal losses, must have a Name and will have rows on
>> MEMBERS. Because the fetal losses have no Sname they cannot
>
>Revisions to first sentence above:
>
>All individuals with an Sname, i.e. those that aren't fetal losses,
>must have a Name and will have rows in MEMBERS.
>
>> have corresponding CENSUS rows and so there will not be any
>> record of their group membership in MEMBERS.
>>
>> Column Descriptions
>>
>> Sname
>>
>> The short name of the individual. This is an exactly three
>> character long name abbreviation which is used to identify
>
>Add to the opening of this paragraph that the sname is usually, but
>not always, the first 3 letters of the name.
Suggested by SCA: The short name of the individual. This is an
exactly three character long name abbreviation. It is usually, but
not always, the first 3 letters of the Name. It is used to identify
>
>> the individual and so must be a unique data value. This
>> value appears in many other places in the system and so
>> should not be changed without changing all the other places
>> in the database where the abbreviation appears; really,
>> once established, the only reason to change this column is
>> because the short name had already been used.^[5] The Sname
>> is always composed of capital letters and may not contain a
>> space. This column should only be NULL if the row
>> represents a fetal loss.
>
>Jeanne wrote in the margin beside the paragraph above, "We have no
>way of locating all uses, do we?"
>
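
The Sname format rules in the column description above can be sketched as follows. This is a hypothetical Python illustration, not Babase code: the function names are made up, and "capital letters" is assumed to mean the ASCII letters A through Z. NULL Snames, which are legal for fetal losses, are outside the format check and are skipped by the duplicate check.

```python
import re

# Assumption: exactly three ASCII capital letters, which also rules out spaces.
SNAME_PATTERN = re.compile(r"[A-Z]{3}")


def is_valid_sname(sname):
    """Return True when a non-NULL Sname has the documented format."""
    return SNAME_PATTERN.fullmatch(sname) is not None


def duplicate_snames(snames):
    """Return the sorted Snames that occur more than once, ignoring None."""
    seen = set()
    duplicates = set()
    for sname in snames:
        if sname is None:
            continue  # fetal losses have no Sname
        if sname in seen:
            duplicates.add(sname)
        seen.add(sname)
    return sorted(duplicates)
```

A check like this only covers the format and the uniqueness of the value; as the warning earlier in the document notes, it cannot tell whether a well-formed Sname was typed for the wrong animal.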
>>
>> Name
>>
>> The name of the individual. This is a textual column used
>> for descriptive purposes. This value must be unique when a
>> comparison is done in a case insensitive fashion. This
>> column should only be NULL if the row records a fetal loss.
>
>Jeanne wrote in the margin beside the paragraph above, "Pregnancies
>ongoing at the onset of a gap?"
>
>>
>> Pid
>>
>> The Pid value, from the PREGS (Pregnancies) table, of the
>> individual's mother's pregnancy that ended in the
>> birth^[6] of the individual. This column may be NULL when
>> there is no record of the individual's mother.
>>
>> Birth
>>
>> The date the pregnancy ends. If the pregnancy results in a
>> birth, this date is the birth date of the offspring,
>
>Revision to the first clause of the second sentence above:
>
>If the pregnancy results in a live birth . . .
>
>> otherwise, this is the date of the fetal loss. (A pregnancy
>> that ends with the mother's death is considered as a
>> spontaneous abortion for this purpose.)
>
>Revision to the last sentence above:
>
>A pregnancy that ends with the mother's death is considered a
>spontaneous abortion (fetal loss) for this purpose.
>
>>
>> This column may not be NULL.
>>
>> Bstatus
>>
>> Birthday status. This column records the quality of the
>> birth date estimate. The legal values for this column are
>> defined by the BSTATUSES support table.
>>
>> Tip
>>
>> At the time of this writing the legal values are:
>>
>> The BSTATUSES Table
>>
>> Code Description
>> 0 Known exactly (to within several weeks, usually to
>> within a few days)
>> 1 Estimate good to within 1 year
>> 2 Estimate good to within 2 years
>> 3 Estimate good to within 3 years
>> 4 Estimate good to within 4 years
>> 9 Unknown, i.e. these dates are guesses and should not
>> be used
>>
>> I don't think it's a particularly good idea to show support
>> table values in this document. The procedure manual is the
>> place for that. The whole point of support tables is that
>> you can put anything you like in them.
>
>Regarding the comment above, Jeanne says ok.
SCA says - I see the point especially given that you can query the
table so easily. However, can you clarify where we might want to
include a list of legal values and where we don't want to show all
values? For instance does the comment apply to the valid sex values
below? As an end user of this document I know that I have many times
used this as an easy source of info about what is in the table. Also
in at least one case (when changes were made to some, but not all, of
the dcauses in biograph) having the dcauses support table in the
babase documentation provided an unequivocal record of the original
state of the data. But I realize that this is not a great argument
for keeping it in.
>
>>
>> This column may not be NULL.
>>
>> Sex
>>
>> The sex of the individual. The legal values are:
>>
>> Valid Sex Values
>>
>> Code Description
>> M the individual is male
>> F the individual is female
>> U the individual is of unknown sex
>>
>> This column may not be NULL.
>>
>> Matgrp
>>
>> The maternal group of the individual, the Gid of the
>> sub-group into which the individual was born.
>>
>> This column must contain a Gid value of a row on the GROUPS
>> table. This column may not be NULL.
>>
>> Tip
>>
>> If the maternal group is not known, the maternal group
>> should be recorded as the unknown group.
>>
>> Statdate
>>
>> The status date of the individual. When the individual is
>> alive, this is the latest date on which the animal was
>> censused and found in a group^[7]; absences don't count.
>> When there are no such censuses, and the individual is
>> alive, then the Statdate is the birth date. This column is
>> automatically updated when CENSUS is updated to ensure the
>> these relationship remain true. When the individual is not
>
>In the last full sentence above, Jeanne circled "these relationship"
>and put a question mark beside it - check with her for more info.
>
>> alive the Statdate is the date of death.
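
The Statdate definition above can be sketched as a small function. This is a hypothetical Python illustration, not the code that updates BIOGRAPH when CENSUS changes: the `statdate` function and its parameters are made up, and each census is assumed to be a (date, found_in_group) pair in which an absence has found_in_group set to False.

```python
from datetime import date


def statdate(birth, alive, censuses=(), death=None):
    """Return the Statdate implied by the definition above.

    censuses is an iterable of (date, found_in_group) pairs; absences
    (found_in_group False) do not count.
    """
    if not alive:
        if death is None:
            raise ValueError("a dead individual needs a date of death")
        return death
    present = [day for day, found_in_group in censuses if found_in_group]
    # With no qualifying census, a live individual's Statdate is its birth date.
    return max(present, default=birth)
```

For a live animal born on 2000-01-01 with one census in a group on 2001-05-01 and a later absence on 2001-06-01, this returns 2001-05-01.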
>>
>> Caution
>>
>> Living individuals, unlike dead ones, can have MEMBERS rows
>> created by the interpolation procedure that locate the
>> individual in a group on a date later than the individual's
>> Statdate. For further information see: Interpolation at the
>> Statdate.
>>
>> Statdate (almost, given the preceding caveat) provides a
>> convenient way of determining the end of the time interval
>> during which there is data on an individual, a way that is
>
>Revision to start of the sentence above:
>
>. . . determining the end of the time interval during which there
>are data on an individual . . .
>
>> independent of whether the individual is alive or dead.
>>
>> This column may not be NULL.
>>
>> Status
>>
>> The state of the individual's life at the Statdate. The
>> legal values for this column are defined by the STATUSES
>> support table.
>>
>> Tip
>>
>> At the time of this writing the legal values are:
>>
>> The STATUSES Table
>>
>> Code Description
>> 0 alive
>> 1 known death
>> 2 suspected death
>>
>> This column may not be NULL.
>>
>> Dcause
>>
>> The cause of death or circumstances associated with death.
>> The legal values for this column are defined by the DCAUSES
>> support table.
>>
>> Tip
>>
>> At the time of this writing the legal values are:
>>
>> The DCAUSES Table
>>
>> Code Description
>> 1 predation
>> 2 conspecific
>> 3 other wounds or injuries
>> 4 Pathology or congenital problem
>> 5 loss of mother
>> 6 human action
>> 7 unknown
>> 8 under review
>>
>> Tip
>>
>> A value of 5 should only be present for individuals whose
>> mother has died or disappeared at the same time or shortly
>> before said individual.
>>
>> This column may not be NULL.
>>
>>MATUREDATES (Sexual Maturity Dates)
>>
>> This table records sexual maturity dates, the dates of
>> menarche or testicular enlargement. It contains one row for
>> every animal who matured in a study group or who lived in a
>> study group as an adult, and it may occasionally contain a
>> row for a male who was known to mature but who did not live
>> in a study group.
>>
>> Matured
>>
>> This is the date of menarche for females and the date of
>> testicular enlargement for males. For females, the date the
>> individual reached sexual maturity. For males, the date he
>> reached sexual maturity OR the date by which we consider
>> him to be sexually mature, if we did not observe his
>> transition to mature. All sexually mature individuals
>> should have a non-blank value in this column. For females,
>> this date should be the first "T" date recorded for the
>> female's sexual cycling data in the CYCLES table. See also
>> the Mstatus field. This field may not be NULL.
>>
>> Mstatus (Sexual Maturity Status)
>>
>> This column records whether the animal became mature ON a
>> given (known) date, or BY a given (known) date. If a date
>> is designated as an "ON" date then we are saying that we
>> know the animal attained that marker ON that date (although
>> note that this is not literally true, because we don't
>> track rank changes or testicular changes in males on a
>> daily basis - males are assigned a matured date on the
>> first day of the month in which we saw them with fully
>> enlarged testes). If a date is designated as a "BY" date
>> then we know the animal was adult or subadult BY that date
>> but we don't know when he attained it. This will allow us
>> to easily identify which animals in any group on any day are
>> juvenile, subadult and adult. The legal values for this
>> field are defined by the MSTATUSES support table, see
>> below. This column may not be NULL.
>>
>>RANKDATES (Adult Rank Attainment Dates)
>>
>> This table records dates individuals first attained adult
>> rank. It allows one row for every individual who has
>> attained adult rank.
>>
>> Tip
>>
>> RANKDATES currently contains only data for males but data
>> for females may be added.
>>
>> When there is a row in this table there must be a sexual
>> maturity date in MATUREDATES (Sexual Maturity Dates), and
>> the rank attainment date must be later than the sexual
>> maturity date.
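
The RANKDATES rule above can be sketched as a small check. This is only an illustration under stated assumptions: the function name is hypothetical, and a missing MATUREDATES row is represented by None.

```python
from datetime import date

def rank_date_problems(matured, ranked):
    """List the ways a RANKDATES row breaks the rule stated above.

    matured is the MATUREDATES.Matured date, or None when the
    individual has no MATUREDATES row; ranked is RANKDATES.Ranked.
    """
    problems = []
    if matured is None:
        problems.append("no MATUREDATES row")
    elif ranked <= matured:
        problems.append("Ranked is not later than Matured")
    return problems
```

An empty list means the row is consistent. The same shape of check would apply to CONSORTDATES, which states the same rule for the consortship date.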
>>
>> Ranked
>>
>> The date the individual first attained a rank among adults.
>> This column may not be NULL.
>>
>> Rstatus
>>
>> Rank date status.
>>
>> This column records the quality of the rank date. The legal
>> values for this column are O (for ON) and B (for BY), as
>> with Mstatus in the MATUREDATES table; see above for
>> definitions, and see description of the MSTATUSES support
>> table, below. This column may not be NULL.
>>
>>CONSORTDATES (First Consortship Dates)
>>
>> This table records the dates of first consortship. It
>> contains one row for every individual for which there is a
>> recorded first consortship.
>>
>> Tip
>>
>> Currently it only contains values for males; females may be
>> added if desired.
>>
>> Tip
>>
>> All dates are exact, no "BY" dates are entered as we do for
>> MATUREDATES (Sexual Maturity Dates) and RANKDATES (Adult
>> Rank Attainment Dates), so there is no "Status" column.
>>
>> When there is a row in this table there must be a sexual
>> maturity date in MATUREDATES (Sexual Maturity Dates), and
>> the consortship date must be later than the sexual maturity
>> date.
>>
>> Consorted
>>
>> The date the individual had its first consortship. This
>> column may not be NULL.
>>
>>DISPERSEDATES (Dispersal Dates)
>>
>> This table records dates of dispersal for males (females do
>> not disperse and do not appear in this table). It contains
>> one row for every male who has a known date of dispersal
>> from the study groups. Only males can have rows on this
>> table.
>>
>> Tip
>>
>> All dates are exact, no "BY" dates are entered as we do for
>> MATUREDATES (Sexual Maturity Dates) and RANKDATES (Adult
>> Rank Attainment Dates), so there is no "Status" column.
>>
>> When there is a row in this table there must be a sexual
>> maturity date in MATUREDATES (Sexual Maturity Dates).
>>
>> Dispersed
>>
>> The date the individual (male) left its maternal group.
>> This column may not be NULL.
>>
>>PREGS (Pregnancies)
>>
>> This table records pregnancies.
>>
>> It contains one row for each recorded pregnancy. A
>> pregnancy is defined to be an event occurring to some
>> mother; a single pregnancy could result in more than one
>> fetus. The only time there will not be an associated
>> BIOGRAPH row is when the pregnancy is still in progress;
>> otherwise there will always be a BIOGRAPH row which records
>> the progeny of the pregnancy.
>>
>> The conception sexual cycle dates (Conceive) of the
>> pregnancy should not be later than the birth date value
>> (Birth) of the associated BIOGRAPH row. The birth date
>> value (Birth) of the associated BIOGRAPH row should not be
>> later than the resumption of cycling date values (Resume).
>>
>> The sequence number (Seq on CYCLES) of the sexual cycle
>> immediately following pregnancy (Resume) should always be
>> exactly one more than the sequence number of the sexual
>> cycle associated with conception (Conceive). The female
>> associated with the conception sexual cycle (Conceive)
>> should be the same as the female associated with the sexual
>> cycle immediately following pregnancy (Resume). There
>> should be no overlap of pregnancy time periods, from
>> conception date to birth date or, if known, resumption of
>> sexual cycling date, among the pregnancies associated with
>> a particular female.
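
The PREGS consistency rules above can be sketched as one check per pregnancy. The Cycle record, its single representative date ("when"), and the function name are all assumptions made for illustration; the real tables carry more columns, and the overlap rule between a female's pregnancies is not covered here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Cycle:
    sname: str   # female the cycle belongs to
    seq: int     # per-female sequence number of the cycle
    when: date   # one representative date for the cycle (assumption)

def preg_problems(conceive: Cycle, birth: Optional[date],
                  resume: Optional[Cycle] = None):
    """List which of the PREGS rules above a pregnancy breaks."""
    problems = []
    if birth is not None and conceive.when > birth:
        problems.append("conception is later than birth")
    if resume is not None:
        if birth is not None and birth > resume.when:
            problems.append("birth is later than resumption of cycling")
        if resume.seq != conceive.seq + 1:
            problems.append("Resume Seq is not Conceive Seq + 1")
        if resume.sname != conceive.sname:
            problems.append("Conceive and Resume belong to different females")
    return problems
```

A birth of None stands for a pregnancy still in progress, and a resume of None for an unknown resumption of cycling.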
>>
>> Data Entry Rules
>>
>> No special program supports the maintenance of this table.
>>
>> Data Element Descriptions
>>
>> Pid
>>
>> The contents of this column uniquely identify the
>> pregnancy record. The Pid is the mother's Sname followed by
>> the probable parity. Because the Pid is only used to
>> identify the record, it is not necessary to change the Pid
>> just because the parity of the pregnancy is found to have
>> changed. In general, once a unique Pid is established, it
>> should not be changed. When retrieving data from this table
>> the safe approach is to assume nothing about the contents
>> of this column except that it will uniquely identify a
>> pregnancy. The safe way to obtain the bearer of the
>> pregnancy is to find the female associated with the
>> ovulation by joining PREGS.Conceive with CYCLES.Csid to
>> find CYCLES.Sname. Likewise, the Parity column should
>> always be used to obtain a meaningful parity value.
>>
>> Parity
>>
>> (The use of 100, etc. in this column is under review.) The
>> cardinality of the pregnancy. 1 for a female's first
>> pregnancy, 2 for a female's second pregnancy, and so forth.
>> There should be no "gaps" in the pregnancies, sequenced by
>> Parity, of any female. When the first pregnancy is known,
>> the Parity sequence begins with 1. When the first pregnancy
>> is not known, the Parity sequence begins with 101.
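
A small sketch of the Parity rule as written above: the sequence for one female starts at 1 or 101 and has no gaps. The function name is hypothetical, and since the use of 100 and up is marked as under review, the starting values are an assumption that may change.

```python
def parity_ok(parities):
    """True when one female's Parity values start at 1 or 101 with no gaps."""
    if not parities:
        return True
    ordered = sorted(parities)
    if ordered[0] not in (1, 101):
        return False
    # Each value must be exactly one more than the one before it.
    return all(b - a == 1 for a, b in zip(ordered, ordered[1:]))
```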
>>
>> Conceive
>>
>> The information recorded on the sexual cycle of the
>> conception which initiated the pregnancy. This is the Cid
>> of a CYCLES row of the mother. The associated CYCLES row
>> should contain a Ddate value to record the date of
>> conception. The dates of the associated CYCLES record, when
>> dates are present, should be between the sexual maturity
>> date and the death date of the mother. This column should
>> contain a unique datum.
>>
>> When the date of conception is estimated because there is
>> no sexual cycle data, the conception date recorded should
>> be 178 days before the recorded birthday.
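
The estimated conception date is simple date arithmetic. A sketch, with a hypothetical function name, using the 178 days given above:

```python
from datetime import date, timedelta

GESTATION_DAYS = 178  # interval stated in the Conceive description

def estimated_conception(birth):
    """Estimated conception date when no sexual cycle data exist."""
    return birth - timedelta(days=GESTATION_DAYS)
```

For example, a birth recorded on 2005-08-10 gives an estimated conception date of 2005-02-13.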
>>
>> Resume (NULL allowed)
>>
>> The resumption of cycle information of the first cycle
>> following the pregnancy. This is the Cid of a row in
>> CYCLES. The associated CYCLES record will usually not have
>> a resumption of menses date. This column may be NULL for
>> those cases when resumption of cycle information is not
>> known. When this column is not NULL, it should contain a
>> unique datum.
>>
>>CYCGAPS
>>
>> Records of the initiation and cessation of continuous
>> periods of observation during which a female's cycles are
>> presumed to all have been recorded. This table contains one
>> row for each female for each initiation or cessation of a
>> continuous period of observation.
>>
>> Rows with a Code value of "S" or "P", which mark the
>> beginning of observational periods or represent isolated
>> single days of observation, must have a value in the State
>> column. All other rows, those with a Code of "E" that
>> represent the end of an observational period, must not
>> have a value in the State column.
>>
>> This table is used to construct the reproductive state
>> tables, REPSTATS and CYCSTATS.
>>
>> The combination of Sname and Date is unique.
>>
>> Data Entry Rules
>>
>> We'll figure something out.
>>
>> Data Element Descriptions
>>
>> Gapid
>>
>> A number which uniquely identifies each row.
>>
>> Sname
>>
>> The short name of the female. This column should contain
>> the Sname of a female in BIOGRAPH. This column should not
>> be blank.
>>
>> Code
>>
>> What kind of endpoint the date records. Legal values are
>> "S" (Start), the date is the start of a period of
>> observation; "E" (End), the date is the end of a period of
>> observation; "P" (point), the date is an isolated
>> observation that belongs with no other observations, it is
>> both a start and an end of an observational period.
>>
>> Date
>>
>> The date upon which observations began or ended.
>> Observations were made on the given date.
>>
>> State (NULL allowed)
>>
>> The state of the female's sexual cycle on the given date.
>> Valid values are:
>>
>> "M", menses-follicular: Mdate (inclusive) to Tdate
>> (exclusive)
>>
>> "S", swelling-follicular: Tdate (inclusive) to 5 days
>> prior to Ddate (exclusive)
>>
>> "O", ovulating: 5 days prior to Ddate (inclusive) to
>> Ddate (exclusive)
>>
>> "D", deturgescence, luteal: Ddate (inclusive) to Mdate
>> (exclusive)
>>
>> "P", pregnant: Ddate (inclusive) to birth (exclusive)
>>
>> "L", lactating: birth (inclusive) to Mdate (exclusive)
>>
>> Must not be NULL when Code is "S" or "P"; must be NULL
>> when Code is "E". See discussion in the table description
>> above.
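
The relationship between Code and State in CYCGAPS can be sketched as a row check. The function name is hypothetical and None stands in for NULL; the code and state letters are the ones listed above.

```python
VALID_STATES = {"M", "S", "O", "D", "P", "L"}

def cycgaps_row_ok(code, state):
    """Check the Code/State rule for one CYCGAPS row (None means NULL)."""
    if code in ("S", "P"):
        # Start and point rows must carry a valid State.
        return state in VALID_STATES
    if code == "E":
        # End rows must not carry a State.
        return state is None
    return False  # not a legal Code value
```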
>>
>>CYCPOINTS
>>
>> This table records information on the sexual cycle of the
>> females. The usual events that mark the transitions of a
>> female baboon's sexual cycles are onset of menses, onset of
>> turgescence, and beginning of deturgescence (turgescence
>> peak). CYCPOINTS contains one row for every recorded
>> transition of a female's sexual cycle. In addition to the
>> usual recorded observations of transition states there are
>> additional rows that record estimations of when unobserved
>> transitions occurred, notably onset of menses dates
>> (Mdates) but also unobserved onset of deturgescence dates
>> for pregnancies.
>>
>> The transition events recorded in CYCPOINTS are collected
>> into sexual cycles, each cycle having (at most) an onset of
>> menses date (Mdate), an onset of turgescence date (Tdate),
>> and an onset of deturgescence date (Ddate). Each cycle is
>> assigned a sequence (Seq) beginning with 1 and the
>> different transition event dates are distinguished by Code
>> values of M, T, and D respectively. The combination of
>> Sname, Code, and Seq must be unique. Some sexual cycles may
>> be missing one or more of the transition codes, should
>> there be no record of an observation. In this case the
>> respective row is omitted from the table.
>>
>> The sexual cycles themselves are aggregated into periods of
>> continuous observation, termed Series, indicated by the
>> assignment of a Series number to each row. The first period
>> of continuous observation for an individual has a Series of
>> 1, the second a Series of 2, etc. Aggregating a female's
>> CYCPOINTS rows into a Series indicates that the collection
>> of data points is believed to be complete, no unobserved or
>> unrecorded sexual cycle transitions occurring during the
>> time spanned by the series. This allows the Series to be
>> used as the basis of an analysis of sexual cycle transition
>> intervals. All a female's CYCPOINTS belonging to the same
>> sexual cycle, i.e. having that same Seq value, must also
>> belong to the same Series, have the same Series value.
>>
>> CYCPOINTS includes rows for those cases in which we know
>> that a female had a cycle, because a pregnancy/birth
>> resulted, but we don't know anything else about the cycle.
>> The child will have a pregnancy record, and CYCPOINTS will
>> contain an associated row to record the estimated D date of
>> the pregnancy. (See the PREGS Conceive documentation
>> above.) CYCPOINTS also includes rows that record other
>> estimated conception dates.
>>
>> Because every female that has reached sexual maturity
>> should have a maturation date (Matured) in BIOGRAPH, with a
>> corresponding row for the first sexual cycle, the sexual
>> cycle with a sequence (Seq) of 1 should be the female's
>> first sexual cycle and should not, in general, have an onset
>> of menses date (Mdate).
>>
>> A cycle's onset of menses date (Mdate) should not be after
>> the onset of turgescence date (Tdate). A cycle's onset of
>> turgescence date (Tdate) should be before the onset of
>> deturgescence date (Ddate).
>>
>> The onset of menses date (Mdate), onset of turgescence
>> (Tdate), and onset of deturgescence (Ddate) of any cycle
>> should not be equal to or between any pair of the onset of
>> menses dates (Mdate), onset of turgescence dates (Tdate), or
>> onset of deturgescence dates (Ddate) of any other recorded
>> cycle for an individual.
>>
>> The earliest possible onset of menses (Emdate), onset of
>> turgescence (Etdate), and onset of deturgescence (Eddate)
>> columns should not be after the onset of menses (Mdate),
>> onset of turgescence (Tdate), and onset of deturgescence
>> (Ddate) columns, respectively. The latest possible onset of
>> menses (Lmdate), onset of turgescence (Ltdate), and onset of
>> deturgescence (Lddate) columns should not be before the
>> onset of menses (Mdate), onset of turgescence (Tdate), and
>> onset of deturgescence (Ddate) columns, respectively.
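
The earliest/latest rule above amounts to earliest <= recorded <= latest for each transition, with either bound allowed to be missing. A sketch with a hypothetical function name, using None for a blank bound:

```python
from datetime import date

def bounds_ok(recorded, earliest=None, latest=None):
    """True when the earliest and latest possible dates bracket the recorded date."""
    if earliest is not None and earliest > recorded:
        return False  # earliest possible date is after the recorded date
    if latest is not None and latest < recorded:
        return False  # latest possible date is before the recorded date
    return True
```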
>>
>> Only one of (each different kind of date, early, regular,
>> and late) an individual's M date values may be before the
>> individual's onset of first menses date
>> (MATUREDATES.Matured), and all should be on or before the
>> individual's Statdate. All of (each different kind of date,
>> early, regular, and late) an individual's T and D date
>> values must be after the individual's onset of first menses
>> date (MATUREDATES.Matured), and all should be on or before
>> the individual's Statdate.
>>
>> Each series must either consist of a single observation
>> (Period = P) or have starting and ending dates that are
>> marked with a Period of S and E respectively.
>>
>> Note
>>
>> There are plans afoot to automatically fill in the early
>> and late dates. The early dates would include the
>> immediately prior census date, the late date would be the
>> day before the immediately following census date. There
>> must also be a mechanism for manually overriding the
>> automatic dates.
>>
>> Data Entry Rules
>>
>> We'll figure something out.
>>
>> Data Element Descriptions
>>
>> Cpid
>>
>> A numeric identifier unique to each row. This is used to
>> reference the sexual cycle transition elsewhere in the
>> database. This column should not be blank.
>>
>> Sname
>>
>> The short name of the female. This column should contain
>> the Sname of a female in BIOGRAPH. This column should not
>> be blank.
>>
>> Cid
>>
>> A numeric identifier identifying each sexual cycle. It is
>> unique across all cycles of all females. A cycle is defined
>> to begin with onset of menses, encompass the turgescence and
>> deturgescence transitions, and end the day before the next
>> onset of menses. Note that some cycles may only contain a
>> single CYCPOINTS row, that is, the Cid value may be unique
>> to a single CYCPOINTS row.
>>
>> Seq
>>
>> Sequence. The first sexual cycle of a female has a Seq
>> value of 1, the second a value of 2, etc. This column does
>> not need to be manually maintained. There are no gaps in
>> the sequence numbers assigned to a female. Even when
>> records of cycles are missing, the first recorded cycle
>> after the missing period has a sequence one greater than
>> the last recorded cycle before the missing period. This
>> column should not be blank.
>>
>> Date
>>
>> The date of the transition event. This column may not be
>> blank.
>>
>> Edate
>>
>> Earliest possible date of the transition event. This column
>> may be blank when there is no need to record a range of
>> date values.
>>
>> Ldate
>>
>> Latest possible date of the transition event. This column
>> may be blank when there is no need to record a range of
>> date values.
>>
>> Source
>>
>> Code indicating from whence the data was derived. D (Data)
>> for observed data. A (Auto) for automatically calculated
>> dates, such as M dates computed by adding 13 to the
>> previous D date, when the previous D date is not the start
>> of a pregnancy. E (Estimated) for estimated values not to
>> be used in other computations, such as estimated D dates
>> entered to relate mothers and pregnancies.
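
The automatically calculated M date mentioned under Source can be sketched as follows. The function name and the boolean flag are hypothetical; the 13-day offset and the pregnancy exception come from the description above.

```python
from datetime import date, timedelta

def auto_mdate(prev_ddate, starts_pregnancy):
    """Auto (Source A) M date: 13 days after the previous D date.

    Returns None when the previous D date is the start of a
    pregnancy, since no M date is computed in that case.
    """
    if starts_pregnancy:
        return None
    return prev_ddate + timedelta(days=13)
```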
>>
>> Code
>>
>> The type of sexual cycle transition. M (onset of Menses), T
>> (onset of Turgescence), or D (onset of Deturgescence).
>>
>> Series
>>
>> Number indicating with which Series of continuous
>> observation the transition event belongs. Events that are
>> isolated observations have a series of their own. As with
>> Seq, the Series are per-female. Each female begins with a
>> Series of 1 and the Series is incremented with each
>> interruption in regular observation.
>>
>>SEXSKINS (Sexskin Turgescence Measurements)
>>
>> This table records information on the size of the females'
>> sexskins. It contains one row for every recorded
>> measurement of each female's sexskin.
>>
>> The combination of Sname, from the associated CYCLES row,
>> and Date should be unique.
>>
>> The earliest measurement larger than 1 of all the
>> measurements associated with a particular cycle, should
>> have a Date the same as or after the Tdate of the cycle.
>> The Tdate of a cycle should be after the dates of all the
>> cycle's sexskin measurements of zero which precede the
>> earliest non-zero measurement occurring in the cycle. The
>> Ddate of a cycle should be after the last measurement
>> before the largest measurement of the cycle. ^[8] The
>> Ddates of most cycles will be on or before the first
>> measurement following the largest sexskin measurement(s) of
>> the cycle. The earliest measurement larger than zero of all
>> the measurements associated with a particular cycle, should
>> have a Date after the Mdate of the cycle. All of the
>> SEXSKINS Date values associated with a particular cycle
>> should be later than the M, T, and D dates of the previous
>> cycle and earlier than the M, T, and D dates of the
>> succeeding cycle. There should not be any overlap of the
>> cycles' sexskin measurement dates, over the time period
>> from a cycle's earliest sexskin measurement date to its
>> latest, between the sexskin measurement dates of a female's
>> cycles.
>>
>> Data Entry Rules
>>
>> This table is updated with the regular FoxPro** data
>> maintenance tools. See section 2.0 in the Protocol for Data
>> Management: Amboseli Baboon Project document.
>>
>> Data Element Descriptions
>>
>> Date
>>
>> The date of the observation. This date should be between
>> the individual's puberty date (Matured) and Statdate,
>> inclusive. This field should not be blank.
>>
>> Size
>>
>> This field contains a number indicating the size of the
>> sexskin in relative units which are integers ranging from 0
>> through 20, inclusive. This field should not be blank.
>>
>> Note
>>
>> A zero is a blank. Nonetheless, the information in this
>> field is always meaningful, even when the datum is zero.
>>
>> Cid
>>
>> The sexual cycle associated with the sexskin measurement.
>> This is a Cid from the CYCLES table. This field can be used
>> to retrieve the Sname of the female which was measured as
>> well as all other data collected on the cycle. This field
>> should not be blank.
>>
>>MEMBERS (Group Membership)
>>
>> The group membership table. This table records which group
>> each animal is in on which date, excepting fetal losses
>> (individuals with no Sname). There is a row in MEMBERS for
>> every individual for every day between Birth and Statdate,
>> inclusive, including periods during which the whereabouts
>> of an individual are either recorded as being unknown or
>> assumed unknown by the interpolation procedure. (See: the
>> unknown group.) Some living individuals have MEMBERS rows
>> after their Statdate, for more information see the section:
>> Interpolation at the Statdate. MEMBERS is most useful when
>> one is interested in an individual's location on a
>> particular date. Simply check MEMBERS for the individual on
>> that date. To find all the individuals in a group on a
>> date, look at all the rows in the table on that date for
>> the group.
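
The two lookups described above can be sketched against a toy MEMBERS table. This uses an in-memory SQLite database only as a stand-in; the rows, the group identifiers, and the helper names are made up for illustration and say nothing about how Babase itself is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per individual per day; (sname, date) is unique.
conn.execute(
    "CREATE TABLE members (sname TEXT, date TEXT, grp TEXT, "
    "UNIQUE (sname, date))"
)
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?)",
    [
        ("AAA", "2005-01-01", "1.1"),  # made-up rows
        ("BBB", "2005-01-01", "1.1"),
        ("CCC", "2005-01-01", "2.0"),
    ],
)

def group_of(sname, day):
    """Group an individual is in on a date, or None if there is no row."""
    row = conn.execute(
        "SELECT grp FROM members WHERE sname = ? AND date = ?",
        (sname, day),
    ).fetchone()
    return row[0] if row else None

def members_of(grp, day):
    """All individuals in a group on a date, sorted by Sname."""
    rows = conn.execute(
        "SELECT sname FROM members WHERE grp = ? AND date = ? ORDER BY sname",
        (grp, day),
    )
    return [r[0] for r in rows]
```

The UNIQUE constraint mirrors the requirement that an animal be located in exactly one group on any particular day.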
>>
>> MEMBERS is a single population-wide table created and
>> updated automatically using information from CENSUS,
>> BIOGRAPH, and DEMOG. The method used to do this is called
>> interpolation and is described fully in a section below.
>> Briefly, interpolation guesses which group an individual is
>> likely to be in when there is no observational data. The
>> MEMBERS rows which are the result of guessing have an I as
>> their Origin value.
>>
>> Note
>>
>> Babase requires that an animal be located in exactly one
>> group on any particular day, the combination of Sname and
>> Date should be unique. The intent of this table is to
>> record the location of each animal at the start of each
>> day. See other documents for further information on how the
>> actual practice of data acquisition and entry impacts this
>> goal.
>
>Jeanne wrote in the margin beside the paragraph above, "How are visits done?"
>
>>
>> Column Descriptions
>>
>> Sname
>>
>> The individual whose location is being recorded. The three
>> letter code that identifies the individual's row in the
>> BIOGRAPH table. There will always be a row in BIOGRAPH for
>> the individual identified here.
>>
>> This column may not be NULL.
>>
>> Date
>>
>> The date.
>>
>> This column may not be NULL.
>>
>> Grp
>>
>> The group where the individual is located. This is a Gid
>> value from GROUPS. This field should contain the most
>> specific sub-grouping available -- subject to the
>> constraints of the data entry protocol, of course.
>> Aggregation into larger groupings is accomplished by
>> retrieving the associated Supergroup from GROUPS.
>>
>> This column may not be NULL.
>>
>> Note
>>
>> Usage exception: For the years 1989-1991, inclusive, the
>> groups recorded for the sub-groups of Alto's group do not
>> necessarily reflect the actual groupings of the animals on
>> a particular day, but are instead indications of the
>> group-splitting process. See Jeanne Altmann and the Data
>> Management Manual for a further explanation.
>>
>> Origin
>>
>> A one letter code indicating the source of the location
>> information. This information is derived from, and has the
>> same values as, the Status column of CENSUS, although
>> MEMBERS.Origin contains the I (interpolated) value not
>> found in CENSUS, and does not contain the A (absent) value.
>
>Revision to the second sentence of the paragraph above:
>
>This information is derived from, and has the same values as, the
>Status column of CENSUS, with the exceptions that MEMBERS.Origin
>contains the I (interpolated) value not found in CENSUS, and does
>not contain the A (absent) value.
>
>> The codes are as follows: C (CENSUS) values represent
>> census data points, I (interpolated) values are derived
>> from the census data points, D (demography) values
>> represent demography notes not present in the census
>> sheets, M and N (manual) values represent census data
>> points due to operator intervention in CENSUS. The S, E,
>> F, B, G, T, L, and R codes are derived from analysis of
>> historical data. See the CENSUS section for further
>> information.
>>
>> This column may not be NULL.
>>
>> Interp
>>
>> The time interval, in days, from the date on which an
>> individual was previously observed to be in a group
>> (censused -- automatic placement in the unknown group does
>> not count) to the date of the MEMBERS row. So the value is
>> 0 on those days on which the individuals are censused, 1 on
>> those (non-census) days immediately before or after the
>> census days, etc. For those MEMBERS rows that the
>> interpolation procedure has placed in the unknown group for
>> lack of a better place to put them, the Interp column is
>
>Revision to the first clause of the sentence above (was written in
>with a question mark - might want to discuss):
>
>For those MEMBERS rows that the interpolation procedure has placed
>the individual in the unknown group . . .
SCA suggests: For those MEMBERS rows for which the interpolation
procedure has placed the individual in the unknown group...
>
>> the number of days "distant" from the interpolating CENSUS
>> row, or the birth date, that determined the group
>> membership. Note that the CENSUS row that determined that
>> the MEMBERS.Grp should be unknown may record an absence.
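
One reading of the Interp description is "days to the nearest census date, in either direction", which gives 0 on census days and 1 on the days immediately before or after. The sketch below follows that reading only; the function name is hypothetical, and it leaves out the birth date case and the intervals where Interp is NULL.

```python
from datetime import date

def interp(day, census_days):
    """Days between a MEMBERS date and the nearest census date.

    census_days must be non-empty; an absence-only history and the
    analyzed intervals where Interp is NULL are out of scope here.
    """
    return min(abs((day - census).days) for census in census_days)
```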
>>
>> Important
>>
>> The Interp value is not meaningful over intervals that
>> contain census rows which are themselves the result of an
>> analysis. Over these intervals Interp is NULL. For more
>> information see Interpolation, Data is not Re-Analyzed.
>
>Revision to the paragraph above:
>
>The Interp value is not meaningful over intervals that contain
>census rows that are themselves the result of an analysis. Over
>these intervals the Interp is NULL. For more information see
>Interpolation, Data are not Re-Analyzed.
SCA doesn't understand this ..."that are themselves the result of an analysis."
>
>>
>> This column many be NULL.
>
>Revision to the sentence above:
>
>This column may be NULL.
>
>>
>>CENSUS
>>
>> The population census table. Aside from the BIOGRAPH Matgrp
>> column, this table is the origin of all information
>> regarding group membership. This table holds all the field
>> census data and any information regarding group membership
>> that is recorded in the field demography notes. It contains
>> one row per animal per group per day censused. There is an
>> additional row per individual per demography note for those
>> days when there is a demography note regarding the
>> individual and group but no census of the group. (See
>> DEMOG.)
>
>Revision to the start of the paragraph above (note question mark -
>may want to discuss):
>
>The population census table. Aside from the BIOGRAPH Matgrp column,
>this table is the origin of all information regarding group
>membership. This table holds all the field census data and any
>information regarding group membership that is recorded in the table
>(?) DEMOG.
SCA agrees that this should refer to the table not the actual notes.
>
>>
>> Tip
>>
>> One way to have Babase record that an individual is alone
>> is to first create a row in GROUPS (Groups) meaning alone,
>> and then to assign individuals who are alone to this group.
>> The "alone-ness" of an individual can then be tracked in
>> the same fashion as group membership, although the Babase
>> user does then need to be aware that the members of the
>> "alone" group are not actually proximate to one another.
>
>Jeanne wrote in the column beside the paragraph above, "So CENSUS
>will have 2 rows for individuals for the day but not MEMBERS?"
SCA says: In other words, if an animal is marked absent from a study
group AND is later seen alone with a note in demography notes, then
s/he gets an A in the CENSUS for the study group he was marked absent
from and a D in the CENSUS for "alone". Is this correct? On the other
hand if an individual is habitually living alone for several months
(or has only been known as a member of a non-study group in recent
years), they will eventually stop being recorded as absent from any
group and will just be alone -- so in that case they will just get
one row in Census with a D for alone.
>
>>
>> As noted in the MEMBERS documentation, Babase does not
>> allow an individual to be in more than one group on a given
>> day.
>>
>> The original field census data sheets can be recovered from
>> CENSUS, with one exception. Data is lost when an individual
>> is actually censused in two groups on the same day because
>> of movement between groups and the timing of the
>> censuses.^[9] In this situation a decision should be made
>
>Revision to the second sentence in the paragraph above:
>
>Data are lost . . .
>
>> as to which group CENSUS should record the individual's
>> presence on that day. A demography note should then be
>> added to DEMOG, with text that notes the individual's
>> presence in the second group. Although it is technically
>> true that this does put into the database all of the
>> information from the censuses in the field, as the
>> information regarding the second census is in textual
>> information it is not readily available to automated tools.
>
>Revisions to the last sentence in the paragraph above:
>
>Although it is technically true that this does put into the database
>all of the information from the censuses or other locational
>information in the field, because the information regarding the
>second census is in textual information it is not readily available
>to automated tools.
SCA still had a hard time understanding this. How about:
This results, technically, in all of the information from both
censuses or other locational information being entered into the
database. However, it should be remembered that, because the
information regarding the second census is in textual information, it
is not readily available to automated tools.
>
>>
>> Caution
>>
>> Be careful when changing this data; remember that rank will
>> almost certainly change should group membership change.
>
>Revisions to the sentence above:
>
>Be careful when changing these data . . .
>
>Jeanne wrote in the margin beside the sentence above, "What does one
>need to do?"
>
>>
>> Column Descriptions
>>
>> Cenid
>>
>> A unique identifier. This is an automatically generated
>> sequential number. Cenid links CENSUS to DEMOG.
>>
>> This column may not be NULL.
>>
>> Date
>>
>> The date of the census, or the date of the demography note
>> (when Status is D).
>>
>> This column may not be NULL.
>>
>> Sname
>>
>> The individual whose location is being recorded. The three
>> letter code that identifies an individual in BIOGRAPH.
>> There will always be a row in BIOGRAPH for the individual
>> identified here.
>>
>> This column may not be NULL.
>>
>> Grp
>>
>> The group where the individual is located. This is a Gid
>> value from GROUPS. This column should contain the most
>> specific sub-grouping available -- subject to the
>> constraints of the data entry protocol, of course.
>> Aggregation into larger groupings is accomplished by
>> retrieving the associated Supergroup from GROUPS.
>>
>> This column may not be NULL.
>>
>> Note
>>
>> Usage exception: For the years 1989-1991, inclusive, the
>> group recorded for the sub-groups of Alto's group do not
>> necessarily reflect the actual groupings of the animals on
>> a particular day, but are instead indications of the
>> group-splitting process. See Protocol for Data Management:
>> Amboseli Baboon Project document for a further explanation.
>>
>> Status
>>
>> A one letter code indicating the source of the location
>> information. Status is the source of MEMBERS.Origin data.
>> The current codes are as follows: C (census), A (absent), D
>> (demography), and M or N (manual). Other values derived
>> from analysis of historical data include: S, E, F, B, G, T,
>> L, and R.
>>
>> The CENSUS.Status Codes
>>
>> C
>>
>> (census) The animal was found in the group on a
>> field census sheet: from the census datasheets.
>> (There may or may not be a corresponding demography
>> note on DEMOG as well.)
>
>Revision to the last sentence above (note question mark):
>
>There may or may not be a corresponding demography note in (?) DEMOG as well.
>
>>
>> Tip
>>
>> A C Status is marked on the census data sheet as an
>> "X".
>
>Revision to the sentence above:
>
>A C Status is marked on the field census data sheet as an "X".
>
>>
>> A
>>
>> (absent) The animal was not found in the group on a
>> field census sheet. Note that while an individual
>> should not be recorded "present" in more than one
>> group on the same day, s/he may be absent from
>> several groups on any given day.
>
>Revision to the last sentence in the paragraph above (note question
>mark - may want to discuss):
>
>Note that while an individual should not be recorded "present" in
>CENSUS (?) in more than one group on the same day, s/he may be
>absent from several groups on any given day.
>
>>
>> Tip
>>
>> An A Status is marked on the census data sheet as
>> an "0".
>
>Revision to sentence above:
>
>An A Status is marked on the field census data sheet as an "0".
>
>>
>> D
>>
>> (demography) The animal was noted, in the field
>> notebooks or elsewhere, to be in a group but was
>> not marked present in a field census on that day.
>
>Revision to the sentence above:
>
>(demography) The animal was noted, in the field notebooks or
>elsewhere, to be in a group but was not marked present in a field
>census of a study group that day.
>
>> There is a DEMOG row associated with
>> the CENSUS row. The individual may or may not have
>> been marked "absent" on the same group's field
>> census for the day.^[10]
>>
>> Tip
>>
>> A D Status is marked on the census data sheet as an
>> "0", when there exists a corresponding place on the
>> census data sheet.
>
>Revision to the sentence above:
>
>A D Status is marked on the field census data sheet as an "0", when
>there exists a corresponding place on the census data sheet.
But this won't always be true I think. If a male is censused in
Viola's in the morning and censused present in Linda's in the
afternoon, we will make a decision about whether he is in Vio or Lin
for that day, there will be an X for him in the field census data sheet,
and the Status in CENSUS will always be D and there will be a Demog
note noting that he was in the other group too. How about:
A D Status will usually be associated with an absence (O) on the
field census sheet when the individual has a row on the field census
sheet, but this will not always be true.
>
>>
>> M
>>
>> (manual, interpolated) This code provides a way to
>> manually supplement what is in the CENSUS table
>> when there is no other way to get the data in.
>> Babase considers this code to be the same as the C
>> code.
>>
>> N
>>
>> (manual, not interpolated) This code provides an
>> alternate way to manually supplement what is in the
>
>Revision to the start of the sentence above:
>
>This code provides an alternative way . . .
>
>> CENSUS table when there is no other way to get the
>> data in. This code does not interpolate; it is
>> presumed to be the result of some analysis.
>>
>> S
>>
>> (Susan's data) The data comes from the old DISPERSE
>> database where the record had both a Datein and a
>> Dateout.
>>
>> E
>>
>> (ending date) The data comes from the old DISPERSE
>> database where the record had a Datein but not a
>> Dateout.
>>
>> F
>>
>> (final date) The data comes from the old DISPERSE
>> database where there is a Dateout and the last
>> recorded location is before the Statdate.
>>
>> B
>>
>> (birth date) The data comes from the old DISPERSE
>> database where the record had a Dateout but not a
>> Datein.
>>
>> T
>>
>> (total) The data comes from the old DISPERSE
>> database where the record had neither a Datein nor
>> a Dateout.
>>
>> G
>>
>> (gap) The data is a record of the animal in the
>> unknown group when the animal appeared in the old
>> DISPERSE database but where there was a gap between
>> times of recorded location.
>>
>> L
>>
>> (lineage) The group is from the Matgrp on the old
>> CYCTOT database, either because the animal did not
>> appear in the DISPERSE database, or because the
>> first location for the animal in the old DISPERSE
>> database had a Datein and this Datein was after the
>> birth date of the animal.
>>
>> R
>>
>> (result of Alto's breakup) The data is S, E, F, B,
>
>Revision to the start of the sentence above:
>
>The data are . . .
>
>> G, T, or L data which has had locations which were
>> changed from 1.0 to the group in which the animal
>> was censused on 15/4/92. This change left all R
>> rows as part of a contiguous series of days during
>> which the animals are located in the Alto's
>> sub-group as censused on 15/4/92, and the
>> time-adjacent locations were not 1.0.
>>
>> This column may not be NULL.
>>
>> Cen
>>
>> Whether or not the CENSUS row represents an entry on a
>
>Revision to the start of the sentence above:
>
>Whether or not the CENSUS row represents an entry on a field census
>data sheet, . . .
>
>> census data sheet. TRUE means the CENSUS row exists because
>> of an entry on a census data sheet, FALSE means there was
>
>Jeanne circled "entry on a census data sheet" in the sentence above
>but made no other marks - may want to clarify
>
>> no census done and the CENSUS row exists to support a
>> demography note, manual notation of absence, etc. Cen
>> should only be TRUE when Status is C, A, or D.
>>
>> This column may not be NULL.
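The rule relating Cen to Status could be checked along these lines. This is only a minimal sketch in Python; the function and constant names are hypothetical and are not part of Babase:

```python
# Hypothetical check of the documented rule:
# "Cen should only be TRUE when Status is C, A, or D."
CENSUS_SHEET_STATUSES = {"C", "A", "D"}


def cen_is_consistent(status: str, cen: bool) -> bool:
    """Return True when the (Status, Cen) pair obeys the rule.

    Cen = FALSE is accepted for any Status; Cen = TRUE is accepted
    only for the three census-sheet statuses.
    """
    return (not cen) or status in CENSUS_SHEET_STATUSES
```

Note that the rule is one-directional: a Status of D with Cen FALSE is still accepted, because a demography note may exist without a census entry.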
>>
>>DEMOG (Demography Notes)
>>
>> This table holds the text which records group membership
>
>Revision to the start of the sentence above:
>
>This table holds the text that records group membership . . .
>
>> information not written on the regular field census sheets,
>> especially that from the field demography notes. DEMOG
>> provides a means of notating CENSUS rows, and thus
>> facilitates management of additional "free form" CENSUS
>> rows, rows that do not directly correspond with the field
>> census sheets.^[11] Thus, in conjunction with these
>> corresponding CENSUS rows, the DEMOG rows capture group
>> membership information that otherwise would not appear in
>> the CENSUS table.
>>
>> DEMOG contains one row for every individual for every date
>> for every group where the individual was noted present in
>> the free form textual field notes. The DEMOG row holds
>
>Jeanne wrote "not quite" beside the first sentence in the paragraph
>above - check with her for comments.
>
>> textual information. There is always exactly one
>> corresponding CENSUS row which holds the corresponding
>
>Revision to the start of the sentence above:
>
>There is always exactly one corresponding CENSUS row, which holds the . . .
>
>> group membership information in the usual coded and
>> structured form. (Note that only some CENSUS rows will have
>> DEMOG rows; CENSUS rows that originate entirely in the
>> regular censuses of groups will not, in general, have an
>> associated DEMOG row). A single field note referring to
>> more than one individual must appear in DEMOG as two (or
>> more) separate rows, one row per individual. Multiple field
>> notes pertaining to a single individual on a single date
>> must be combined into one piece of text and entered in a
>> single DEMOG row. (See Protocol notes for structure of the
>> demography data as entered by the operator.)
>>
>> Column Descriptions
>>
>> Cenid
>>
>> A unique identifier. This is an automatically generated
>> sequential number. Cenid links CENSUS to DEMOG.
>>
>> This column may not be NULL.
>>
>> Reference
>>
>> A GROUPS Gid value that links the DEMOG row with the
>> written field notebook where the note can be found.
>>
>> This column may not be NULL.
>>
>> Comment
>>
>> The demography note text pertaining to the CENSUS row with
>> the given Cenid.
>>
>> This column may not be NULL.
>>
>>RANKS (Rankings Within Groups)
>>
>> The ranking of individuals within groups. This table
>> contains a row for every month for every ranked individual
>> for every type of rank assigned to the individual. When the
>> ranking has not been done for a type of rank in a month,
>> there are no rows for members of that group for that month
>> with that type of rank.
>>
>> The combination of Sname, Rnkdate, Grp, and Rnktype should
>> be unique.
>>
>> Data Entry Rules
>>
>> This table is updated with the RANKER maintenance program.
>> See section 2.0 in the Protocol for Data Management:
>> Amboseli Baboon Project document.
>>
>> Data Element Descriptions
>>
>> Sname
>>
>> The individual whose rank is being recorded. The three
>> letter code which identifies an individual (in Sname) in
>> BIOGRAPH. There should always be a record in BIOGRAPH for
>> the individual identified here. This field should not be
>> blank.
>>
>> Rnkdate
>>
>> The year and month of the ranking. This is always a 6 digit
>> number YYYYMM where YYYY is the year of the ranking and MM
>> is the month of the ranking. The year must be between 1940
>> and the present year, inclusive. This field should not be
>> blank.
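As an illustration only, the Rnkdate constraints above might be checked like this in Python. The function name is hypothetical, and the `today` parameter exists only so the "present year" can be fixed when testing:

```python
from datetime import date
from typing import Optional


def rnkdate_is_valid(rnkdate: int, today: Optional[date] = None) -> bool:
    """Check a YYYYMM integer: year in 1940..present year, month in 1..12."""
    today = today or date.today()
    # Split 199204 into (1992, 4).
    year, month = divmod(rnkdate, 100)
    return 1940 <= year <= today.year and 1 <= month <= 12
```

Because the year must be at least 1940, any value that passes is necessarily a 6 digit number.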
>>
>> Grp
>>
>> The group or Supergroup (**Karl is this correct? Group or
>> supergroup? I think it has to be so or else we can't
>> explain the Alto's fission double rankings clearly) in
>> which the individual is ranked. This is a Supergroup value
>> from GROUPS. This field should contain the most general
>> grouping available. This field should not be blank. The
>> individual should be located in this group, or a sub-group
>> of this group, according to MEMBERS, at some point during
>> the month. Be careful when changing these data; remember
>> that the rank will almost certainly change when the group
>> is changed.
>>
>> Rnktype
>>
>> The kind of rank assigned to the individual, a Rnktype
>> value from the RNKTYPES table. This field should not be
>> blank. Examples of various rankings are: Adult Females, All
>> Females, etc., as defined in the RNKTYPES table.
>>
>> Rank
>>
>> This is the ranking among all the animals of the Rnktype in
>> the group over the Rnkdate period. The most dominant
>> individual is given a rank of 1, the next most dominant a
>> rank of 2, etc. This information is updated through the
>> ranking program and should not have to be manually updated.
>> This field may not be blank. The ranks must be contiguous.
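One possible reading of "the ranks must be contiguous" is that the ranks for a given Grp, Rnkdate, and Rnktype are exactly 1, 2, ..., n with no gaps and no ties. That reading is an assumption, as is the function name in this Python sketch:

```python
def ranks_are_contiguous(ranks):
    """Return True when ranks are exactly 1..n, in any order.

    Assumes no ties are allowed; the documentation does not say
    this explicitly.
    """
    return sorted(ranks) == list(range(1, len(ranks) + 1))
```

For example, the ranks 2, 1, 3 pass, while 1, 3 (a gap) and 1, 1, 2 (a tie) do not.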
>>
>>INTERACT (Interactions)
>>
>> This table contains ad lib monitoring data on interactions
>> between animals. Each record in the table records an
>> occurrence of an interaction between two individuals at a
>> particular time, and no further information on the
>> interaction. Each interaction is represented as though it
>> occurs between two ordered individuals designated "actor"
>> and "actee" -- these individuals are recorded in the PARTS
>> table. All characters are entered as capital letters. In
>> general, the Stop time must follow the Start time. The
>> exceptions to this are: For mounts and ejaculations, Start
>> equals Stop. When the interaction records a consortship and
>> the stop time is unknown, Stop will be 00:00 and Start will
>> record the actual start time. When times are not recorded,
>> as is the case for agonism and grooming, both Start and
>> Stop should be 00:00.
>>
>> Data Entry Rules
>>
>> This table is updated with the VALINTER and UPINTER dataset
>> based update programs. See section 2.0 in the Protocol for
>> Data Management: Amboseli Baboon Project document.
>>
>> Data Element Descriptions
>>
>> Iid
>>
>> A positive integer that uniquely identifies the
>> interaction. This number is assigned by the system. This
>> field should not be blank.
>>
>> Act
>>
>> A code indicating the kind of interaction. These codes are
>> the only legal values for this field: "A" is entered for
>> agonism data, "G" is entered for grooming data, "C" is
>> entered for consortships, "M" is entered for mounts, "E"
>> is entered for ejaculations. The ACTS support table, see
>> below, defines the legal values for this field. This field
>> should not be blank.
>>
>> Date
>>
>> The date on which the interaction took place. Currently,
>> the system is configured to display dates in the eight
>> digit dd/mm/yy format. This field should not be blank.
>> Usage exception: For agonism and grooming data, only the
>> month and year of the interaction are valid. For these
>> data, all dates are entered as the first day of the month
>> for that year, unless the two individuals are not both in
>> the same group on that day. In these cases, the date is the
>> first day of the month on which the two animals are in the
>> same group.
>>
>> Start
>>
>> The time the interaction began. This is recorded as a 5
>> character string. The first two characters are the hour,
>> in 24 hour time. The third character is a colon. The last
>> two characters are the minutes. The Start time is entered
>> for mounts, consorts, and ejaculations. When there is no
>> start time in the data, the value "00:00" appears. For
>> agonism and grooming data, Start is always "00:00". (This
>> restriction is here because Start data is not being
>> collected for agonism and grooming.) This field should not
>> be blank.
>>
>> Stop
>>
>> The time the interaction stopped. This is recorded as a 5
>> character string. The first two characters are the hour,
>> in 24 hour time. The third character is a colon. The last
>> two characters are the minutes. For consorts, Stop is
>> entered from the data. Stop is "00:00" when no stop time is
>> indicated in the data. For agonism and grooming, Stop is
>> always "00:00". (This restriction is here because Stop data
>> is not being collected for agonism and grooming.) This
>> field cannot be
>> blank.
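The Start and Stop rules might be expressed as follows. This Python sketch reflects one reading of the text above; the function names are hypothetical, and the treatment of consortships (Stop is either the "00:00" placeholder or later than Start) is an assumption that ignores, for instance, consortships spanning midnight:

```python
import re

# HH:MM in 24 hour time, always five characters.
TIME_RE = re.compile(r"([01]\d|2[0-3]):[0-5]\d")


def is_valid_time(value: str) -> bool:
    """Return True for a five character HH:MM string."""
    return TIME_RE.fullmatch(value) is not None


def times_are_consistent(act: str, start: str, stop: str) -> bool:
    """Check Start/Stop against the Act code (A, G, C, M, or E)."""
    if not (is_valid_time(start) and is_valid_time(stop)):
        return False
    if act in ("A", "G"):
        # Times are not collected for agonism and grooming.
        return start == "00:00" and stop == "00:00"
    if act in ("M", "E"):
        # Mounts and ejaculations: Start equals Stop.
        return start == stop
    # Consortships: unknown Stop is "00:00"; otherwise Stop follows Start.
    # Zero-padded HH:MM strings compare correctly as plain strings.
    return stop == "00:00" or stop > start
```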
>>
>>PARTS (Participants in interactions)
>>
>> This table contains records of the participants in observed
>> interactions between animals. Each record in the table
>> records a participant. Each interaction is represented as
>> though it occurs between two ordered individuals designated
>> "actor" and "actee";^[12] interactions between multiple
>> individuals are broken down into interactions between pairs
>> according to rules described elsewhere. Therefore, this
>> table contains two rows for every record of an interaction,
>> one row to record the actor, and one to record the actee.
>> Rules for classifying individuals as actor or actee are
>> documented below in the description of the Role field. All
>> characters are entered as capital letters.
>>
>> The date of the interaction must be between the birthdate
>> and statdate, inclusive, of the participants.
>>
>> Note that the actor and the actee of an interaction should
>> not be the same.
>>
>> Data Entry Rules
>>
>> This table is updated with the VALINTER and UPINTER dataset
>> based update programs. See section 2.0 in the Protocol for
>> Data Management: Amboseli Baboon Project document.
>>
>>Data Element Descriptions
>>
>> Sname
>>
>> A three-letter code (an id) that identifies a particular
>> animal (an Sname) in BIOGRAPH. This code can be used to
>> retrieve information, such as the maternal group of the
>> animal, from BIOGRAPH or other places where the animal's
>> three letter code appears. This field should not be blank.
>>
>> Role
>>
>> This field designates whether the row records the actor or
>> the actee of the interaction. The two possible values are:
>>
>> R) Actor. The actor is usually the one performing the act.
>> For grooming data, the individual that is grooming is the
>> actor. For the agonism data, the individual that is the
>> winner (does not perform a submissive behavior) is the
>> actor. For mounts, consortships, and ejaculations, the male
>> is the actor.
>>
>> E) Actee. The actee is usually the one that is the
>> recipient of another animal's attentions. For grooming
>> data, the individual that is groomed is the actee. For the
>> agonism data, the individual that is the loser (performing
>> a submissive behavior) is the actee. For mounts,
>> consortships, and ejaculations, the female is recorded as
>> actee. This field should not be blank.
>>
>> Iid
>>
>> Interaction identifier. This field holds the Iid value of
>> the row on the INTERACT table containing further
>> information on the interaction in which the animal is a
>> participant. It can be used to retrieve the other
>> information recorded on the interaction. There should be a
>> row in INTERACT with an Iid of this value. This field
>> should not be blank.
>>
>>REPSTATS
>>
>> (REProductive STATus) Contains one row per female per day
>> for every day during continuous observation periods from
>> date of menarche through date of death (inclusive). When
>> menarche is unobserved then REPSTATS rows begin on a
>> beginning of observation date. Likewise, the cessation or
>> resumption of observation interrupts or resumes the
>> contiguous series of the female's REPSTATS dates. (See
>> CYCGAPS.) While the individual is alive the last date is
>> either the BIOGRAPH.Statdate or the last recorded sexual
>> cycle endpoint, whichever is later. End of cycle dates are
>> (exclusive of both) M (menses onset) date or
>> end-of-pregnancy date. The day-by-day nature of this table
>> makes it easy to correlate reproductive cycle information
>> with other events.
>>
>> Note
>>
>> Note that because of gaps in the observational record some
>> sexual cycles may not be recorded, or may be partially
>> recorded. In these cases the Dins and Dr columns are NULL.
>> (See below.)
>>
>> See CYCSTATS for more fertility detail.
>>
>> Data Entry Rules
>>
>> This table is not maintainable by the user. The system
>> constructs this table automatically from the data values
>> recorded in the CYCLES table, the BIOGRAPH.Status and
>> BIOGRAPH.Statdate columns, and the CYCGAPS table.
>>
>> Data Element Descriptions
>>
>> Rid
>>
>> A unique number which serves to identify the row.
>>
>> Date
>>
>> The row records a female's reproductive state on this day.
>>
>> Sname
>>
>> The Sname identifying the female whose reproductive state
>> is recorded. (See BIOGRAPH.)
>>
>> State
>>
>> General reproductive state of the female on the given Date.
>> The legal values are:
>>
>> C (cycling), from (including) the T (turgescence onset) up
>> to (but not including) the D date of the onset of
>> pregnancy.
>>
>> P (Pregnant), from (including) the D (deturgescence onset)
>> date up to (but not including) the end-of-pregnancy date,
>> date of birth, abortion, or death.
>>
>> L (lactating), from (including) the end-of-pregnancy date
>> to (but not including) the next T date.
>>
>> Note
>>
>> Note that post menopausal individuals have a state of C, or
>> possibly L if the last cycle resulted in a pregnancy.
>>
>> Any of the above states may start late or end early in the
>> event of gaps in observation. (See CYCGAPS.)
>>
>> Dins (NULL allowed)
>>
>> (Days INto State) The number of days since the state
>> started. The first day of the state has a value of 1, the
>> next a value of 2, etc.
>>
>> This column is NULL when the system cannot determine when
>> the state began. This occurs when the beginning of the
>> reproductive state occurs during a period when the
>> individual is not under regular observation. (See CYCGAPS.)
>>
>> Dr (NULL allowed)
>>
>> (Days Remaining) The number of days remaining in the state.
>> The last day of the state has a value of 0, the next to
>> last day a value of 1, etc. Note that the sum of Dins and
>> Dr is always the total number of days the cycle spent in
>> the state.
>>
>> This column is NULL when the system cannot determine when
>> the state ends. This occurs when the end of the
>> reproductive state was not observed due to cessation of
>> regular observation. (See CYCGAPS.) It also occurs while
>> the individual is alive and the state has not ended, or
>> rather when the observations of the state have not been
>> entered into the system. Finally, it occurs when the
>> individual dies as it is not known when the state would
>> have ended.
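To make the Dins and Dr counting concrete, here is a small Python sketch for a state whose first and last days are both known. The function name is hypothetical; the NULL cases described above are not modeled:

```python
from datetime import date, timedelta


def dins_dr(state_start: date, state_end: date):
    """Yield (date, Dins, Dr) for each day of a state.

    Both endpoints are inclusive. Dins counts up from 1 on the first
    day; Dr counts down to 0 on the last day.
    """
    total_days = (state_end - state_start).days + 1
    for offset in range(total_days):
        day = state_start + timedelta(days=offset)
        yield day, offset + 1, total_days - offset - 1
```

For a three day state, the rows carry (Dins, Dr) values of (1, 2), (2, 1), and (3, 0), so Dins plus Dr equals the state's length on every row.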
>>
>> Pid (NULL allowed)
>>
>> (Pregnancy IDentifier) The Pid of the pregnancy associated
>> with the state. This value must be present when the state
>> is P or L. There is also a Pid value for those C cycles
>> that result in pregnancy. (See PREGS table.)
>>
>>CYCSTATS
>>
>> (fertility CYCle STATus) Contains one row per female per
>> day, for those days in REPSTATS where the REPSTATS State
>> is C (cycling). This is a day-by-day record of the details
>> of the females' fertile cycles. The day-by-day nature of
>> this table makes it easy to correlate sexual cycle
>> information with other events.
>>
>> While the individual is alive the last date is either the
>> BIOGRAPH.Statdate or the last recorded sexual cycle
>> endpoint, whichever is later.
>>
>> Rows exist only when there is information on a female's
>> sexual cycle, or enough information to estimate sexual
>> cycle transition dates. (See CYCLES.) There are no CYCSTATS
>> rows when a female is pregnant or lactating. Likewise there
>> are no CYCSTATS rows when there are gaps in the
>> observational record. (See CYCGAPS.) See the description of
>> the Dins and Dr columns below for further information on
>> how sexual cycles are recorded when there are missing
>> sexual cycle transition markers.
>>
>> Note
>>
>> Note that post-menopausal individuals' final cycles will
>> have a State of D and a long duration, with the
>> individual's date of death being the last day of the cycle.
>>
>> When the end of S (swelling, follicular) cycle state is not
>> known, that is Dr (days remaining in state) is NULL, some
>> of the computed Dins (days into state) values may be skewed
>> as the end of the state is counted backward from the
>> beginning of the D date, the next observed transition
>> marker. See the information on the calculation of the O
>> (ovulatory) state below.
>>
>> Data Entry Rules
>>
>> This table is not maintainable by the user. The system
>> constructs this table automatically from the data values
>> recorded in the CYCLES table, the BIOGRAPH.Status and
>> BIOGRAPH.Statdate columns, and the CYCGAPS table.
>>
>> Data Element Descriptions
>>
>> Csid
>>
>> A unique number which serves to identify the row.
>>
>> Date
>>
>> The row records a female's reproductive state on this day.
>>
>> Sname
>>
>> The Sname identifying the female whose reproductive state
>> is recorded. (See BIOGRAPH.)
>>
>> State
>>
>> Categorizes the period within the reproductive cycle. Legal
>> values are:
>>
>> M (menses, follicular), the M (onset of menses) date to the
>> day before the T (turgescence onset) date (inclusive of
>> endpoints)
>>
>> S (swelling, follicular), the T date through 6 days before
>> the D (deturgescence onset) date (inclusive of endpoints)
>>
>> O (ovulating), from 5 days before the D date through the
>> day before the D date (inclusive of endpoints)
>>
>> D (deturgescence, luteal), from the D date through the day
>> before the M date (inclusive of endpoints).
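The four State definitions above can be sketched as a single classification function for a cycle in which all transition dates are known. This is illustrative Python only; the function and parameter names are hypothetical, and cycles with missing markers are not handled:

```python
from datetime import date, timedelta


def cycle_state(day, m_date, t_date, d_date, next_m_date):
    """Return the CYCSTATS State (M, S, O, or D) for a day, or None.

    m_date, t_date, d_date are the cycle's M, T, and D dates;
    next_m_date is the M date that begins the following cycle.
    """
    if m_date <= day < t_date:
        return "M"  # M date through the day before T
    if t_date <= day <= d_date - timedelta(days=6):
        return "S"  # T date through 6 days before D
    if d_date - timedelta(days=5) <= day < d_date:
        return "O"  # 5 days before D through the day before D
    if d_date <= day < next_m_date:
        return "D"  # D date through the day before the next M
    return None  # the day falls outside this cycle
```

With M on 1 January, T on 5 January, D on 20 January, and the next M on 3 February, the S state runs through 14 January and the O state covers 15 through 19 January.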
>>
>> Dins (NULL allowed)
>>
>> (Days INto State) The number of days since the state
>> started. The first day of the state has a value of 1, the
>> next a value of 2, etc.
>>
>> This column is NULL when the system cannot determine when
>> the state began. This occurs when the cycle is the female's
>> first cycle, as there is no menses to begin the cycle, and
>> likewise for the first cycle after pregnancy. The cycle's
>> starting date is also unknown when it occurs during a
>> period when the individual is not under regular
>> observation. (See CYCGAPS.)
>>
>> Dr (NULL allowed)
>>
>> (Days Remaining) The number of days remaining in the state.
>> The last day of the state has a value of 0, the next to
>> last day a value of 1, etc. Note that the sum of Dins and
>> Dr is always the total number of days the cycle spent in
>> the state.
>>
>> This column is NULL when the system cannot determine when
>> the state ends. This occurs when the end of the
>> reproductive state was not observed due to cessation of
>> regular observation. (See CYCGAPS.) It also occurs while
>> the individual is alive and the state has not ended, or
>> rather when the observations of the state have not been
>> entered into the system. Finally, it occurs when the
>> individual dies as it is not known when the state would
>> have ended.
>>
>> MMINTERVALS
>>
>> One row for every day the female is cycling, counting
>> between M dates.
>>
>> When cycles start late, due to pregnancy, menarche, or
>> resumption of observation, the Dins column is blank.
>> However, the corresponding row in the REPSTATS table
>> contains a relevant Dins value.
>>
>> When cycles end early due to pregnancy, the Dr value can
>> likewise be found in the REPSTATS table.
>> There is no relevant Dr value when cycles end early due to
>> death or interruption of observation.
>>
>> MDINTERVALS
>>
>> One row for every day the female is cycling, between M and
>> D dates.
>>
>>THE SUPPORT TABLES
>>
>> The support tables are those tables that define various
>> codes used as data values in the master tables. Each
>> support table contains only two fields, a key or id field,
>> that usually has the same name as the field in the master
>> tables for which the support table defines data values, and
>> a field called Descr. The key field contains the valid code
>> values, and the Descr field contains a short description of
>> the code. Because the support tables define many of the
>> codes used in the master tables, new code values can be
>> added to the system and used in the master tables by adding
>> new rows to the support tables. The system will validate
>> the new code values in the master tables against the rows
>> in the support tables without any changes to the existing
>> programs. At times, the BABASE programs recognize that
>> particular codes have special meanings. For example, the
>> BIOGRAPH Dcause code of 5 (death due to death of the
>> mother) can only be assigned to unborn individuals, those
>> with no Sname. The meaning of these codes is fixed into the
>> logic of the programs. Therefore, these codes cannot be
>> removed from the BABASE system, nor should they be given a
>> meaning different from the one they presently have. For
>> example, the Dcause of 5 should not be changed to mean
>> "death due to meteorite impact" because the system's
>> programs would prevent any individuals with an Sname from
>> having this value as a cause of death. Each of the
>> "special" values that the system requires to retain a
>> particular meaning is listed in the Special
>> Values section of the table's documentation. For further
>> information on the meaning of the "special" values, see the
>> description of the master table(s) that contain the code
>> values. Should the meaning of one of these "special" values
>> need to be changed, the logic in the BABASE programs should
>> be adjusted to reflect the change.
>>
>> BSTATUSES
>>
>> The different accuracies of birthday estimates.
>>
>> Master Table Columns Defined
>>
>> BSTATUSES defines values for the Bstatus field of BIOGRAPH.
>>
>> Key Name
>>
>> Bstatus
>>
>> Special Values
>>
>> None.
>>
>> STATUSES
>>
>> The different states of an individual, reflecting what sort
>> of record keeping needs to be done on the individual in the
>> future.
>>
>> Master Table Columns Defined
>>
>> STATUSES defines values for the Status field of BIOGRAPH.
>>
>> Key Name
>>
>> Status
>>
>> Special Values
>>
>> The values of 0 (alive) and 1 (dead) have a special meaning
>> to the system's programs.
>>
>> DCAUSES
>>
>> The different causes of death.
>>
>> Master Table Columns Defined
>>
>> DCAUSES defines values for the Dcause field of BIOGRAPH.
>>
>> Key Name
>>
>> Dcause
>>
>> Special Values
>>
>> The values 0 (no cause of death), 4 (unknown cause of
>> death) and 5 (fetus died due to death of mother) have a
>> special meaning to the system's programs.
>>
>> MSTATUSES
>>
>> The different meanings of various maturity marker date
>> values.
>>
>> Master Table Columns Defined
>>
>> MSTATUSES defines values for MATUREDATES.Matured and
>> RANKDATES.Ranked columns. May be O (ON) or B (BY). O
>> indicates a known date. B indicates that we know that the
>> animal had reached that maturational marker BY the given
>> date but we have no information about the actual date on
>> which the marker was attained.
>>
>> Key Name
>>
>> Mstatus
>>
>> Special Values
>>
>> None.
>>
>> WSTATIONS
>>
>> The different weather stations from which meteorological data
>> is obtained.
>>
>> Master Table Columns Defined
>>
>> WSTATIONS defines values for WREADINGS.WStation.
>>
>> Key Name
>>
>> Wstation
>>
>> Special Values
>>
>> None.
>>
>> ACTS
>>
>> The different kinds of interactions between individuals
>> which may be recorded.
>>
>> Master Table Columns Defined
>>
>> ACTS defines values for the Act field of INTERACT.
>>
>> Key Name
>>
>> Act
>>
>> Special Values
>>
>> All of the codes defined by ACTS have a special meaning to
>> the system's programs. Some codes on ACTS are marked with a
>> "Y" value in the Old column. This mark indicates the code
>> is present in older data but is not allowed in data entered
>> at the present.
>>
>> RNKTYPES
>>
>> The different categories of rankings that order individuals
>> by dominance within a group within a month. Each category
>> of ranking is identified with a row of this table.
>>
>> Master Table Columns Defined
>>
>> RNKTYPES defines values for the Rnktype field of RANKS.
>>
>> Key Name
>>
>> Rnktype
>>
>> Special Values
>>
>> None.
>>
>> The Query Column
>>
>> This table contains a "special" column, Query. The Query
>> column is an SQL query which defines which individuals are
>> eligible for inclusion in this category of ranking. The SQL
>> statement determines which individuals are included in any
>> given ranking. It must return distinct Snames of
>> individuals to be ranked within a given group over a given
>> time period. In general the query is a SELECT statement
>> which uses the BIOGRAPH and MEMBERS tables to determine who
>> is to be ranked within a group over a month. A number of
>> "special symbols" may be, and will need to be, included in
>> the SQL query. Each "special symbol" represents a value
>> which changes depending on the month or group ranked. The
>> "special symbols" are:
>>
>> %g
>>
>> Group id -- a number. The Gid of the group being ranked.
>>
>> %s
>>
>> Start date -- a date, should not be bracketed in the SQL
>> statement. Date of the first day of the interval over which
>> the individuals are ranked (inclusive).
>>
>> %f
>>
>> Finish date -- a date, should not be bracketed in the SQL
>> statement. Date of the last day of the interval over which
>> the individuals are ranked (inclusive). Note that ages,
>> maturation dates, and so forth are often computed using or
>> compared to the Finish Date value.
>>
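A minimal Python sketch of how the special symbols might be expanded before a Query is run. This is not the Babase implementation. It assumes that "not bracketed" means the query author writes %s and %f bare and the system supplies a quoted date literal; the function name, the sample query text, and its column names are hypothetical.

```python
from datetime import date


def expand_query(template: str, gid: int, start: date, finish: date) -> str:
    """Replace %g, %s and %f in a ranking query with literal values.

    Assumption: dates are substituted as quoted ISO literals, so the
    query author must not add quotes around %s or %f.
    """
    return (
        template
        .replace("%g", str(int(gid)))
        .replace("%s", f"'{start.isoformat()}'")
        .replace("%f", f"'{finish.isoformat()}'")
    )


# Hypothetical query text; the column names are placeholders only.
TEMPLATE = (
    "SELECT DISTINCT members.sname FROM members "
    "WHERE members.grp = %g AND members.date BETWEEN %s AND %f"
)

expanded = expand_query(TEMPLATE, 1, date(2005, 3, 1), date(2005, 3, 31))
print(expanded)
```

Plain string replacement is enough for a sketch because none of the substituted values can themselves contain a special symbol.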
>> The Weather Tables
>>
>> This has not yet been implemented.
>>
>> The weather-related tables contain weather information and
>> so do not directly relate to any of the baboon information
>> contained in Babase.
>>
>> WREADINGS
>>
>> The WREADINGS table contains one row for each time a person
>> has collected data from the meteorological instruments. So,
>> each WREADINGS row should have at least one associated
>> RAINGAUGES, TEMPMINS, or TEMPMAXS row, but no more than one
>> associated row from any one of these tables.
>>
>> Wrid
>>
>> A unique positive integer representing the meteorological
>> data collection event.
>>
>> WStation
>>
>> Code indicating the station from which the data was
>> collected. Must be a value on the WSTATIONS table.
>>
>> Wrdaytime
>>
>> The day and time the meteorological data was collected. The
>> time zone is Nairobi local time.
>>
>> Wrperson
>>
>> Initials of the person who collected the data. Must be a
>> value contained in the Initials column of a row on the
>> PEOPLE table.
>>
>> RAINGAUGES
>>
>> This table contains one row for every time a rain gauge
>> reading is recorded.
>>
>> Wrid
>>
>> The identifier of the meteorological collection event during
>> which the rain gauge was read. Must be a value contained in
>> the Wrid column of a row on the WREADINGS table, and the
>> associated row may not be associated with any other row in
>> RAINGAUGES.
>>
>> Rgspan
>>
>> The interval, in seconds, since the previous raingauge
>> collection event. This column must be NULL when the
>> collection event is the very first collection event,
>> otherwise it must not be NULL.
>>
>> Rain
>>
>> The measurement of rain accrued since the last time the
>> raingauge was read. In millimeters with a precision of one
>> millimeter.
>>
>> TEMPMINS
>>
>> This table contains one row for every time a minimum
>> temperature reading was recorded. Depending on thermometer
>> availability, some temperature readings may be recorded in
>> Fahrenheit. Use the be_celsius() function to coerce all
>> temperature measurement units to Centigrade.
>>
>> Wrid
>>
>> The identifier of the meteorological collection event during
>> which the minimum temperature was read. Must be a value
>> contained in the Wrid column of a row on the WREADINGS
>> table, and the associated row may not be associated with
>> any other row in TEMPMINS.
>>
>> Tempmin
>>
>> The minimum temperature recorded since the last minimum
>> temperature reading. This column has one decimal point of
>> precision, although the actual precision of the reading may
>> be different depending upon the units in which the
>> temperature reading was recorded.
>>
>> TMUnits
>>
>> The units used to record the temperature reading. A value
>> of "C" means the units are Centigrade and the precision of
>> the reading is 1/10th of a degree Centigrade. A value of
>> "F" means that the units are Fahrenheit and the precision
>> of the reading is ' of a degree Fahrenheit.
>>
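The conversion that be_celsius() is described as performing can be sketched in Python as follows. This is only an illustration of the arithmetic, not the database function itself; the name to_celsius, its argument order, and the rounding to one decimal place (chosen to match the column's stated precision) are assumptions.

```python
def to_celsius(reading: float, units: str) -> float:
    """Return a temperature reading in degrees Centigrade.

    `units` follows the TMUnits codes: "C" for Centigrade, "F" for
    Fahrenheit. Rounding to one decimal place is an assumption.
    """
    if units == "C":
        return round(reading, 1)
    if units == "F":
        return round((reading - 32) * 5 / 9, 1)
    raise ValueError(f"unknown temperature units: {units!r}")


print(to_celsius(86.0, "F"))
print(to_celsius(21.5, "C"))
```

A converted Fahrenheit reading still carries only the precision of the original thermometer, so the extra decimal place should not be over-interpreted.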
>> TEMPMAXS
>>
>> This table contains one row for every time a maximum
>> temperature reading was recorded. Depending on thermometer
>> availability, some temperature readings may be recorded in
>> Fahrenheit. Use the be_celsius() function to coerce all
>> temperature measurement units to Centigrade.
>>
>> Wrid
>>
>> The identifier of the meteorological collection event during
>> which the maximum temperature was read. Must be a value
>> contained in the Wrid column of a row on the WREADINGS
>> table, and the associated row may not be associated with
>> any other row in TEMPMAXS.
>>
>> Tempmax
>>
>> The maximum temperature recorded since the last maximum
>> temperature reading. This column has one decimal point of
>> precision, although the actual precision of the reading may
>> be different depending upon the units in which the
>> temperature reading was recorded.
>>
>> TMUnits
>>
>> The units used to record the temperature reading. A value
>> of "C" means the units are Centigrade and the precision of
>> the reading is 1/10th of a degree Centigrade. A value of
>> "F" means that the units are Fahrenheit and the precision
>> of the reading is ' of a degree Fahrenheit.
>>
>> PEOPLE
>>
>> Contains one row for each person who records information,
>> or at least information that we tag with a record of who
>> recorded it.
>>
>> Initials
>>
>> The initials of the person. This is used to uniquely
>> identify the person, so may not be the person's actual
>> initials if there is ever a conflict.
>>
>> GPSInitials
>>
>> The initials used to identify the person when recording GPS
>> data.
>>
>> Name
>>
>> The person's real name.
>>
>> Notes
>>
>> Any notes you may wish to make on the person. For example
>> you may wish to record that the person is John Smith the
>> graduate student, not John Smith the, for example,
>> President of Kenya who asked to be able to collect data and
>> actually did so for a day.
>>
>>Interpolation
>>
>> The Babase database uses a procedure called interpolation
>> to update MEMBERS whenever the CENSUS table, or the
>> BIOGRAPH.Birth, or BIOGRAPH.Statdate columns are updated.
>> Interpolation extrapolates the group membership of
>> individuals into days for which there is no actual
>> observation of the individuals' whereabouts. It "guesses"
>> with which group an individual is associated, given
>> knowledge of the individual's group membership (or lack
>> thereof) at given points in time, and records the result in
>> MEMBERS. Thus, MEMBERS always has a row recording group
>> membership for every day of every individual's life.
>>
>> This section is comprised of 3 sub-sections. The first
>> section introduces interpolation incrementally. Rules are
>> presented in an informal fashion and examples and
>> exceptions progressively developed. The second section is a
>> formal specification of interpolation. The third section
>> supplements the formal specification with expectations
>> regarding the use of interpolation and brief descriptions
>> of interpolation's implications. Most of the third section
>> is a restatement of material already presented in the first
>> section.
>>
>> Interpolation's 3 Fundamentals
>>
>> It is primarily by census records that Babase tracks group
>> membership. The CENSUS table is the source of all group
>
>Revision to the first sentence above (note question mark):
>
>It is primarily by census records of primary study groups (?) that
>Babase tracks group membership.
SCA suggests: Babase tracks group membership through census records,
which are defined as field-derived information about an individual's
presence or absence from a group, or information about an animal
living alone. Census records comprise actual field censuses of study
groups and of non-study groups, as well as field notes collected ad
libitum during field work; all of these records are referred to here
as census records even though each is collected somewhat differently
in the field. In Babase, the CENSUS table is the source of all group
...
>
>> membership information. Babase places rows in the CENSUS
>> table to indicate presence in a group whenever demography
>> information is stored in other tables.^[13][14] Throughout
>> this section it is to be understood that any sort of
>> demographic information which results in CENSUS data is
>
>Revision to the line above:
>
>. . . demographic information that results in CENSUS data is . . .
>
>> implied when the term census, or its plural, is used.
>> Unfortunately, the term census is further overloaded. It is
>> occasionally used in the colloquial sense, meaning present
>> -- found when a group census was taken, the alternative
>> being absent. It is hoped the meaning will be clear from
>> context.
>>
>> It is important to remember that censuses record absence
>> from a group as well as presence in a group, that there are
>> two mutually exclusive classes of CENSUS rows: absences,
>> records of absence from specific groups on specific days;
>> and "locating censuses", records that place the individual
>> in specific groups on specific days.
>>
>> The premise of interpolation is that an individual is
>> assumed to be in the group where observed for a period of
>> 14 days to either side of the observation unless there's
>> indication otherwise. To this end, interpolation keeps an
>> individual in the group where a census locates him for a
>> time period that is the shorter of:
>>
>> 1. Half of the time interval between the individual's next
>> (or prior) census which finds the individual in any
>> group.
>
>Revision to the sentence above:
>
>Half of the time interval between the individual's next (or prior)
>census that finds the individual in any group.
>
>>
>> 2. Half of the time interval between the next (or prior)
>> recorded absence from the group in which the individual
>> was censused. Absences from other groups are ignored.
>>
>> 3. The 14 day Interpolation Limit. Given no other
>> information, an individual is considered to remain (or
>> have been) in the group where observed for 14 days
>> following (or preceding) the date of observation.
>>
>> Should the above process not place an individual in a
>> group, the individual is placed in the unknown group, so
>> long as the individual is alive on the day in question.
>>
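The three rules above amount to taking a minimum, which a short Python sketch can make concrete. It is an illustration under stated assumptions, not the Babase code: the function name and arguments are hypothetical, gaps are measured in days from the locating census, and days that fall exactly on a halfway point are not resolved here.

```python
from typing import Optional

INTERPOLATION_LIMIT = 14  # days, from rule 3


def presumed_span(to_locating: Optional[int],
                  to_absence: Optional[int],
                  limit: int = INTERPOLATION_LIMIT) -> float:
    """Days, on one side of a locating census, the individual stays put.

    to_locating: days to the next (or prior) census locating the
                 individual in any group, or None if there is none.
    to_absence:  days to the next (or prior) recorded absence from the
                 censused group, or None if there is none.
    """
    candidates = [float(limit)]
    if to_locating is not None:
        candidates.append(to_locating / 2)
    if to_absence is not None:
        candidates.append(to_absence / 2)
    return min(candidates)


# Census on day 3, absence on day 8, next locating census on day 11:
print(presumed_span(to_locating=8, to_absence=5))
```

With those numbers the span is 2.5 days, so days 4 and 5 are presumed in the group and day 6 is not. The function is applied once looking forward and once looking backward from each locating census.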
>> There are some subtleties to these rules, and there is
>> further elaboration necessary to allow for "old style"
>> CENSUS rows, which do not directly correspond with actual
>> census taking, and other factors. But these rules are the
>> foundation and we begin with them.
>>
>> Interpolation Visualized
>>
>> Interpolation is best described with the help of diagrams
>> as it is all about computing and comparing time intervals
>> of various lengths, which are easily represented in a
>> diagram by lines of various lengths. We begin with the
>> simplest case, an individual censused present and absent in
>> a single group.
>>
>> Tip
>>
>> As the examples throughout this section are developed be
>> sure to pay close attention to the diagrams' keys. At times
>> the meaning of a symbol changes from diagram to diagram to
>> reflect a subtlety.
>
>Jeanne wrote the following in the margin here:
>
>"Note that first section is assuming only a single group is censused."
>
>>
>> Interpolating presences and absences
>>
>> Figure 5 shows a record of one individual's censuses. The
>> group, for the moment we'll assume group 1, is censused
>> several times over a period of days. One day the individual
>> is absent.
>
>Jeanne wrote in the margin here that "rows are horizontal in examples here."
>
>>
>> Figure 5. An Individual is Censused Present and Absent
>>
>> One individual's census records
>>       CENSUS: C       C                   A           C
>>         Date: 1   2   3   4   5   6   7   8   9   10  11
>>
>> Key:
>> C Censused present in group (group 1)
>> A Censused absent in group (group 1)
>>
>>
>> The first step in interpolation is to construct the various
>> intervals from the given CENSUS rows. Figure 6 shows how
>> interpolation "splits the difference" between presences and
>> absences to construct two intervals for each locating
>> census, one preceding the census and one following it. As
>> the diagrams given here can only show a window in time and
>> omit what falls outside that window, only one interval each
>> is shown for the censuses taken on day 1 and day 11.
>>
>> Figure 6. Interpolating From Presences and Absences
>>
>> Interpolation intervals within a group
>>       CENSUS: C       C                   A           C
>>    Intervals: X---|---X---------|         O     |-----X
>>         Date: 1   2   3   4   5   6   7   8   9   10  11
>>
>> Key:
>> C Censused present in group (group 1)
>> A Censused absent in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Presumed in group (group 1)
>> | Midpoint between census takings
>>
>>
>> Interpolation creates MEMBERS rows that place the
>> individual in a group each day. Figure 7 shows how group
>> membership assignment is based upon the computed intervals.
>> Because of the absence, there are days when the individual
>> is placed in group 9, the unknown group.
>
>Revision to the last sentence above:
>
>Because of the absence, the individual is placed in group 9, the
>unknown group, on some days.
>
>>
>> Figure 7. Interpolating Group Membership
>>
>> Intervals determine group membership
>>       CENSUS: C       C                   A           C
>>    Intervals: X---|---X---------|         O     |-----X
>> MEMBERS.
>>        Group: 1   1   1   1   1   9   9   9   9   1   1
>>       Origin: C   I   C   I   I   I   I   I   I   I   C
>>
>>         Date: 1   2   3   4   5   6   7   8   9   10  11
>>
>> Key:
>> C Censused present in group (group 1)
>> A Censused absent in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Presumed in group (group 1)
>> | Midpoint between census takings
>>
>>
>> Figure 7 also introduces the MEMBERS' Origin column. As can
>> be seen, the Origin column mimics the corresponding CENSUS
>> Status column on those days when interpolation is not
>> guessing group membership. Origin is I on those days when
>> interpolation is guessing.
>>
>> The MEMBERS' Interp column represents distance from a
>> census. Interp is zero on those days when a census has
>
>Revision to the first sentence above:
>
>The MEMBERS' Interp column represents distance from a census with
>presence or "a locational census".
SCA suggests: The MEMBERS Interp column represents number of days
from a census in which an individual was recorded as present in some
known group. Interp is zero on those days....
>
>> located the individual. The recorded absence is reflected
>> in the group, but is immaterial to Interp. Even though
>> there's an absence, the Interp count is over the interval
>> between the two locating censuses. Interp gets its value
>> from a "split the difference" between censuses which record
>
>Revision to the line above:
>
>. . . from a "split the difference" between censuses that record . . .
>
>> presence in the group, a different sort of "split the
>> difference" than is used to determine into which group an
>> individual should be placed. Figure 8 extends Figure 7,
>> showing the computation of Interp. With this addition the
>> interpolation has finished, the MEMBERS table can be
>> constructed from the given CENSUS rows.
>>
>> Figure 8. Computing Interp Values
>>
>> The resulting MEMBERS rows
>>       CENSUS: C       C                   A           C
>> Intervals
>>    For Group: X---|---X---------|         O     |-----X
>>   For Interp: X~~~|~~~X~~~~~~~~~~~~~~~|~~~~~~~~~~~~~~~X
>> MEMBERS.
>>        Group: 1   1   1   1   1   9   9   9   9   1   1
>>       Interp: 0   1   0   1   2   3   4   3   2   1   0
>>       Origin: C   I   C   I   I   I   I   I   I   I   C
>>
>>         Date: 1   2   3   4   5   6   7   8   9   10  11
>>
>> Key:
>> C Censused present in group (group 1)
>> A Censused absent in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Presumed in group (group 1)
>> ~ Inside of interval
>> | Midpoint of interval
>>
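The Interp row of Figure 8 can be reproduced with a few lines of Python. This is a sketch of the counting rule only, assuming Interp is simply the number of days to the nearest locating census and that absences play no part; the function name is hypothetical.

```python
def interp_values(locating_days, first_day, last_day):
    """Distance in days from each day to the nearest locating census."""
    return [
        min(abs(day - census) for census in locating_days)
        for day in range(first_day, last_day + 1)
    ]


# Figure 8: locating censuses on days 1, 3 and 11; the absence is ignored.
figure_8 = interp_values([1, 3, 11], 1, 11)
print(figure_8)
```

The values count up from each locating census and back down to the next one, peaking on the day midway between them.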
>>
>> Applying the 14 day interpolation limit
>>
>> So far we have only explored the first 2 of the 3
>> fundamental interpolation intervals, those dealing with
>> being censused present and absent. Before we elaborate
>> further and examine the more complicated interactions
>> between presences and absences let us dispense with the 14
>> day interpolation limit.
>>
>> Figure 9 shows the effect of the 14 day interpolation
>> limit. For reasons of space some days are removed from the
>> interval.
SCA suggests "To save space in this document, some days are not shown
in the figure"
>> There are no censuses, present or absent, on the
>> days omitted. As the "Date:" line shows, a total of 33 days
>> are examined, an entire month 31 days in length and the
>> first two days of the following month. Again, we assume the
>> censuses are taken in group 1.
>>
>> Figure 9. The 14 Day Interpolation Limit
>>
>> The shorter intervals are chosen
>> CENSUS: C C
>> C C Interval: X----- ... -----------|------- ... ---------X
>> 14 Day Limit: X----- ... -------| |--- ... ---------X
>> MEMBERS.
>> Group: 1 1 ... 1 1 9 9 1 ... 1 1 1
>> Interp: 0 1 ... 13 14 15 15 14 ... 2 1 0
>> Origin: C I ... I I I I I ... I I C
>>
>> Date: 1 2 ... 14 15 16 17 18 ... 31 1 2
>>
>> Key:
>> C Censused present in group (group 1)
>> X Known present in group (group 1)
>> - Inside of interval
>> | Interval endpoint
>>
>>
>> As the 16th and 17th are more than 14 days away from either
>> census the individual is placed in the unknown group on
>
>Revision to the start of the sentence above:
>
>Because the 16th and 17th are more than 14 days away from either
>census, the individual . . .
>
>> those days. Days that are closer to the actual censuses are
>> interpolated into group 1. So, as the rules require, the
>> individual is interpolated into the censused group for the
>> shorter of the two time periods. As before, all the
>> interpolated MEMBERS rows, those which do not correspond to
>> an actual census, have an Origin of I. And as before the
>> Interp column counts up from and down to the actual
>> censuses.
>
>Revision to the start of the sentence above:
>
>And as before, the Interp . . .
>
>>
>> Interpolation and Birth Dates
>>
>> There are some exceptions to the rules as stated so far.
>> Not surprisingly, interpolation will not presume to put an
>> individual in a group, create a MEMBERS row, before the
>> individual's Birth date.
>>
>> The birth date is an exception another fashion, it locates
>
>In the first part of the sentence above, Jeanne circled "exception
>another" and put a question mark beside it.
>
>> the individual in his Matgrp like a special sort of census.
>> The rationale for this is that although the birth may not
>> be observed the individual most certainly enters the group
>
>Revision to the line above:
>
>. . . be observed, the individual most certainly enters the group . . .
>
>> when born. Further, this rule ensures that we have a row in
>> MEMBERS for every day the individual is alive. When there
>> is a regular census on the birth date, finding the
>> individual in his Matgrp -- or so one would hope, the
>> interpolated MEMBERS row is like that for any other census.
SCA suggests for the last sentence: When there is a regular census on
the birth date, we assume that the field workers find the individual
in its Matgrp and the MEMBERS row [not "interpolated MEMBERS row"?]
is like that for any other census.
>> But when there is no locating census on the birth date the
>> resulting MEMBERS row has an Origin of I and an Interp of 0.
>> This is shown in Figure 10.
>>
>> Figure 10. Interpolation at Birth
>>
>> Individual born into group 1
>> CENSUS: B C C C
>> Intervals: X-----|-----X-|-X-----|-----X
>> MEMBERS.
>> Group: 1 1 1 1 1 1 1 1
>> Interp: 0 1 1 0 0 1 1 0
>> Origin: I I I C C I I C
>>
>> Date: 1 2 3 4 5 6 7 8 9 10
>>
>> Key:
>> B Born (into group 1)
>> C Censused present in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Presumed in group (group 1)
>> | Midpoint between census takings
>>
>>
>> Clearly, there are no MEMBERS rows before the birth date,
>> the individual is in his Matgrp on the day of his birth,
>> and the Interp value counts up from the birth date and then
>> down to the next census as though there were a census on
>> the birth date.
>
>Jeanne wrote in the margin beside the paragraph above:
>
>"Interp is with respect to a group, not an (or in addition to?) an
>individual?"
>
>>
>> An individual is placed in his Matgrp on his birth date
>> even when a regular census has an absence recorded for the
>> individual on the date of birth.^[15]
>>
>> Interpolation at the Statdate
>>
>> Another exception to the rules, or rather two exceptions,
>> occur at the Statdate. You might expect that interpolation
>> would not place a row after the individual's Statdate, and
>> this is indeed true, but true only when the individual is
>> dead. When an individual is alive, interpolation will place
>> a row after the individual's Statdate, but only when there
>> is a subsequent absence from the same group as the group in
>> which the individual was censused.^[16][17] While at first
>> this may seem odd, the reasoning behind this behavior is
>> clear -- the Statdate is not the last date on which there
>> is data for the individual. This is elaborated below.
>
>Revision to the line above:
>
>. . . are data for the individual . . .
>
>>
>> All the same, at times there is a reason to have
>> interpolation halt at the Statdate. When individuals are
>> alive the system should not try to interpolate into time
>> periods for which data has yet to be entered, elsewise
>
>Revision to the line above:
>
>. . . periods for which data have yet to be entered, elsewise . . .
>
>> there would always be spurious interpolated MEMBERS rows
>> which vanish as soon as additional data is entered. The
>> trouble with creating such rows is that, although the
>> interpolation is corrected and the rows disappear once data
>> entry resumes, the use of these rows in analysis is always
>> inappropriate. Such rows will exist at the end of every
>> period of data entry, as there will always be a large
>> number of living individuals found in their groups on the
>> last census entered. The solution is to not create the
>> rows.^[18] When a living individual has no later absences
>> from the group where last located, no absences from the
>> group of his last locating census that post-date his last
>> locating census, this is taken to mean that there is
>
>Revision to the line above:
>
>. . . locating census, this is taken to mean that there are . . .
>
>> additional as yet unentered data on the individual. In this
>> case interpolation stops on the day the individual was last
>> found in a group. This situation is shown in Figure 11,
>> where the last census taken found the individual in group 1
>> on day 5, and so this day is the individual's Statdate as
>> well. There is no interpolation past the last census.
>>
>> Figure 11. Alive and Present When Last Censused
>>
>> Living individual with Statdate of 5
>>       CENSUS: C           A   C
>>    Intervals: X-----|     O |-X
>> MEMBERS.
>>        Group: 1   1   9   9   1
>>       Interp: 0   1   2   1   0
>>       Origin: C   I   I   I   C
>>
>>         Date: 1   2   3   4   5   6   7   8   9   10  11
>>
>> Key:
>> C Censused present in group (group 1)
>> A Censused absent in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Presumed in group (group 1)
>> | Midpoint between census takings
>>
>>
>> In Figure 12 more data has been entered, the individual has
>
>Revision to the start of the sentence above:
>
>In Figure 12 more data have been entered, the individual . . .
>
>> been missing since the last census shown in Figure 11
>> above. As there have been no further censuses during which
>> the individual was found the individual's Statdate is still
>> day 5, although there is now subsequent interpolation.
>> Notice that there are no MEMBERS rows created after day 7.
>> When interpolating a living individual, after the Statdate
>> there is no default placement of the individual into the
>> unknown group.^[19]
>>
>> Figure 12. Alive and Absent in Last Census^[20]
>>
>> Living individual with Statdate of 5
>> CENSUS: C A C A A
>> Intervals: X-----| O |-X---------| O
>> MEMBERS.
>> Group: 1 1 9 9 1 1 1
>> Interp: 0 1 2 1 0 1 2
>> Origin: C I I I C I I
>>
>> Date: 1 2 3 4 5 6 7 8 9 10 11
>>
>> Key:
>> C Censused present in group (group 1)
>> A Censused absent in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Presumed in group (group 1)
>> | Midpoint between census takings
>>
>>
>> Although the only change between Figure 11 and Figure 12 is
>> the entry into CENSUS of rows recording absence, that is
>> enough to signal that interpolation can go forward without
>> creating spurious MEMBERS rows -- rows likely erased upon
>> the entry of more data. It is important that interpolation
>> does go forward in this case, past the Statdate, as
>> otherwise bias would be introduced. The last C CENSUS would
>> be interpolated differently from all the other censuses. To
>> be sure, there is bias introduced in Figure 11 when
>> interpolation is cut short. But censoring bias at the end
>> of data collection is unavoidable, whereas we can avoid
>> introducing bias here.
>>
>> Warning
>>
>> So long as an individual is alive the last CENSUS to locate
>
>In the line above, Jeanne wrote "considered" above the word alive.
>
>> the individual ought be followed by a record of absence, an
>> absence from the group where the individual was last found.
>> To do otherwise, as must occur when there is simply no
>> further data to be entered, is to introduce a bias into
>> MEMBERS.
SCA doesn't quite understand this comment. To whom is the "ought to"
directed -- to data enterers?
>>
>> In Figure 13 there is no additional census information, but
>> the individual's Status has been adjusted to mark the
>> individual dead. A new Statdate value indicates the
>> individual died on day 9 and interpolation is now up to and
>> including the day of death. As is usual, when an
>> individual's group membership cannot be determined he is
>> placed in the unknown group.
>>
>> Figure 13. Interpolation to Statdate When Dead
>>
>> Dead individual with Statdate of 9
>> CENSUS: C A C A A
>> Intervals: X-----| O |-X---------| O
>> MEMBERS.
>> Group: 1 1 9 9 1 1 1 9 9
>> Interp: 0 1 2 1 0 1 2 3 4
>> Origin: C I I I C I I I I
>>
>> Date: 1 2 3 4 5 6 7 8 9 10 11
>>
>> Key:
>> C Censused present in group (group 1)
>> A Censused absent in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Presumed in group (group 1)
>> | Midpoint between census takings
>>
>>
>> Although Figure 13 does not show this, the 14 day
>> interpolation limit applies when the individual is dead.
>> When there are no absences after the last census and there
>> are more than 14 days between the last census and the
>> Statdate the individual is placed in the unknown group from
>> the 15th day through the day of death.
>>
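The Statdate behaviour shown in Figures 11 through 13 can be summarized as "what is the last day that gets a MEMBERS row?". The Python sketch below is one reading of the text, not the Babase code. The function and parameter names are hypothetical, later_absence is assumed to be the first absence from the last censused group that post-dates the last locating census, and a day falling exactly halfway is resolved by the even/odd convention (even to the earlier half, odd to the later half).

```python
from typing import Optional


def last_members_day(alive: bool, statdate: int, last_locating: int,
                     later_absence: Optional[int] = None,
                     limit: int = 14) -> int:
    """Last day (as a day number) for which a MEMBERS row is created."""
    if not alive:
        # Dead: interpolate up to and including the Statdate.
        return statdate
    if later_absence is None:
        # Alive, no later absence: stop at the last locating census.
        return last_locating
    gap = later_absence - last_locating
    day = last_locating
    while day + 1 - last_locating <= limit:
        dist = day + 1 - last_locating
        if 2 * dist > gap:
            break  # past the halfway point toward the absence
        if 2 * dist == gap and (day + 1) % 2 == 1:
            break  # odd midpoint day belongs to the later half
        day += 1
    return day


print(last_members_day(True, 5, 5))        # Figure 11
print(last_members_day(True, 5, 5, 10))    # Figure 12
print(last_members_day(False, 9, 5, 10))   # Figure 13
```

For a dead individual the days between the end of the interval and the Statdate still get rows, but in the unknown group; for a living individual no such rows are created.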
>> The Midpoint Rule
>>
>> The alert reader may have noticed that the above examples
>> are carefully crafted so that the midpoint between
>> presences and absences always falls between two days. What
>> happens when there is an odd number of days in the interval
>> so that the midpoint is a day exactly in between the
>> endpoints, as occurs 3 times in Figure 14?
>>
>> Figure 14. Midpoint Days
>>
>> Intervals with an odd number of days
>>       CENSUS: C       A               C   C       A   C
>>    Intervals: X---|   O       |-------X-|-X---|   O |-X
>>         Date: 1   2   3   4   5   6   7   8   9   10  11
>>
>> Key:
>> C Censused present in group (group 1)
>> A Censused absent in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Presumed in group (group 1)
>> | Midpoint between census takings
>>
>>
>> The MEMBERS table has a 1 day precision; there is no way to
>> be in a group in the morning and out of it in the
>> afternoon, so on any one midpoint day the individual must
>> either be in the group or out of it. Should the individual
>> be in the group on midpoint day or out of it? The question
>> is resolved using a property of the date itself. Briefly,
>> the Julian dating system is a method of assigning every day
>> a unique number. As a midpoint day is no more likely to
>> fall on an even date than an odd one, we can avoid bias by
>> using whether the midpoint day falls on an even or an odd
>> Julian date to resolve the problem.
>>
>> Whenever interpolation is called upon to halve an interval
>> between two CENSUS rows that contains an odd number of days
>> then the "midpoint day" is assigned to the left, earlier,
>> half of the interval when the Julian date of the midpoint
>> day is even. A midpoint day is assigned to the right,
>> later, half of the interval when the Julian date of the
>> midpoint day is odd.
>>
>> So, The Midpoint Rule resolves the issue by adjusting the
>> intervals as shown in Figure 15. The intervals are no
>> longer perfectly halved. On the midpoint day there is no
>> preference either for or against interpolating the
>> individual into the group censused.
>>
>> Figure 15. The Midpoint Rule Adjusts Intervals
>>
>> Intervals with an odd number of days
>>       CENSUS: C       A               C   C       A   C
>>    Intervals: X-----| O     |---------X-|-X-|     O |-X
>> MEMBERS.
>>        Group: 1   1   9   9   1   1   1   1   9   9   1
>>       Interp: 0   1   2   3   2   1   0   0   1   1   0
>>       Origin: C   I   I   I   I   I   C   C   I   I   C
>>
>>  Julian Date: 1   2   3   4   5   6   7   8   9   10  11
>>
>> Key:
>> C Censused present in group (group 1)
>> A Censused absent in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Presumed in group (group 1)
>> | Interval endpoint
>>
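Putting the interval rules and the Midpoint Rule together, the Group rows of Figure 7 and Figure 15 can be reproduced for the single-group case. The Python below is a sketch under stated assumptions rather than the Babase procedure: all names are hypothetical, the figures' day numbers stand in for Julian dates, only one group is censused, and birth and Statdate handling are left out.

```python
def owns_day(census, other, day):
    """True if `day` lies in the half of the census..other interval
    nearest `census`, applying the Midpoint Rule to an exact midpoint."""
    gap = abs(other - census)
    dist = abs(day - census)
    if 2 * dist != gap:
        return 2 * dist < gap
    # Midpoint day: even date -> earlier half, odd date -> later half.
    census_is_earlier = census < other
    return census_is_earlier if day % 2 == 0 else not census_is_earlier


def nearest(events, census, day):
    """Nearest event beyond `census` on the same side as `day`, or None."""
    if day > census:
        later = [e for e in events if e > census]
        return min(later) if later else None
    earlier = [e for e in events if e < census]
    return max(earlier) if earlier else None


def assign_groups(present, absent, first, last,
                  group=1, unknown=9, limit=14):
    """Group value for each day from `first` to `last`, one group only."""
    rows = []
    for day in range(first, last + 1):
        placed = day in present or any(
            abs(day - c) <= limit
            and all(n is None or owns_day(c, n, day)
                    for n in (nearest(present, c, day),
                              nearest(absent, c, day)))
            for c in present
        )
        rows.append(group if placed else unknown)
    return rows


figure_15 = assign_groups([1, 7, 8, 11], [3, 10], 1, 11)
figure_7 = assign_groups([1, 3, 11], [8], 1, 11)
print(figure_15)
print(figure_7)
```

Each locating census claims a day only when that day falls inside both its "halfway to census" and "halfway to absence" intervals and within the 14 day limit, which is the "shorter of" rule applied day by day.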
>>
>> Interpolating When The Group Changes
>>
>> Having dispensed with the various elaborations and
>> exceptions that occur in unusual cases it is time to return
>> to the fundamentals of interpolation and examine what
>> happens when an individual moves between groups. What comes
>> into play are the first 2 of the 3 interpolation intervals.
>> Recall:
>>
>> Interpolation keeps an individual in the group where a
>> census locates him for a time period that is the shorter
>> of:
>>
>> 1. Half of the time interval between the individual's
>> next (or prior) census which finds the individual in
>> any group.
>>
>> 2. Half of the time interval between the next (or prior)
>> recorded absence from the group in which the
>> individual was censused. Absences from other groups
>> are ignored.
>>
>> Figure 16 shows a record of one individual's censuses. He,
>> a male, is censused in 2 groups, group 1 and group 2. The
>> census records for each group reflect both presence in the
>> group and absence from the group.
>>
>> Figure 16. An Individual is Censused in 2 Groups
>>
>> One individual's census records
>> Group 1: C C A C A
>> Group 2: A C C
>>
>> Date: 1 2 3 4 5 6 7 8 9 10
>>
>> Key:
>> C Censused present
>> A Censused absent
>>
>>
>> Figure 17 shows what would happen if interpolation worked
>> with each group separately. There are conflicts, days when
>> the individual is in both groups. Something else must be
>> done.
>>
>> Caution
>>
>> Figure 17 is an example of an interpolation method that
>> does not work. The method shown in the figure is not one
>> Babase uses when interpolating.
>>
>> Figure 17. Interpolating Each Group Separately
>>
>> One individual's census records
>> Group 1: C C A C A
>> Group 2: A C C
>>
>> Group 1 Interpolating just group 1
>> CENSUS: C C A C A
>> Intervals: X---|---X---------| O |-X-| O
>>
>> Group 2 Interpolating just group 2
>> CENSUS: A C C
>> Intervals: O |---------X-------|-------X
>>
>> Date: 1 2 3 4 5 6 7 8 9 10
>>
>> Key:
>> C Censused present
>> A Censused absent
>> X Known present
>> O Known absent
>> - Presumed present
>> | Interval endpoint
>>
>>
>> The solution is to return to the interpolation fundamentals.
>> We begin by taking a closer look at the way we have been
>> diagramming intervals. In Figure 17 the first group has 3
>> locating censuses and 2 absences, and yet we've diagrammed
>> the resultant intervals on a single line. The interpolation
>> fundamentals tell us to obtain 2 pairs of intervals for
>> each locating census. A "halfway to census" pair of
>> intervals and a "halfway to absence" pair of intervals.
>> Figure 18 takes the CENSUS rows of the first group shown in
>> Figures 16 and 17 and does this for each locating census.
>> In Figure 18 the CENSUS rows of days 1, 3 and 9 each have
>> their own sections detailing the intervals to the nearest
>> censuses and intervals to the nearest absences. The lines
>> labeled Presence show the intervals that are halfway from
>> each locating census to the next. The lines labeled Absence
>> show the intervals that are halfway from each census to the
>> nearest absence. This detailed breakdown is followed by a
>> composite interval diagram of the familiar type encountered
>> in Figures 6 through 17 above. It should be clear that we
>> have arrived at the "composite" form of the interval
>> diagram by following the fundamentals: the composite is
>> made up of the shorter of each census's intervals. The
>> result is correct; the composite constructed in Figure 18
>> is identical to the one shown previously in Figure 17. It
>> had better be, or else the interpolations of Figure 17
>> would be in conflict with the fundamental interpolation
>> rules.
>>
>> Figure 18. A Closer Look at Intervals
>>
>> CENSUS rows from group 1
>> CENSUS: C C A C A
>>
>> Day 1 Intervals by presence and absence
>> Presence: X---| X
>> Absence: X-------------| O
>>
>> Day 3 Intervals by presence and absence
>> Presence: X |---X-----------| X
>> Absence: X---------| O
>>
>> Day 9 Intervals by presence and absence
>> Presence: X |-----------X
>> Absence: O |-X-| O
>>
>> Combining the shorter intervals
>> Interval: X---|---X---------| O |-X-|
>>
>> Date: 1 2 3 4 5 6 7 8 9 10
>>
>> Key:
>> C Censused present
>> A Censused absent
>> X Known present in same group
>> x Known present in different group
>> O Known absent in same group
>> - Inside of interval
>> | Interval endpoint
>>
>>
>> The intervals in Figure 18 did not have to be grouped by
>> censused day; they could have been grouped by Presence and
>> Absence or any other way. For each set of locating censuses
>> we can always split out the "halfway to census" intervals
>> from the "halfway to absence" intervals, group them any way
>> we like, and later use the interpolation fundamentals to
>> recombine them, without affecting the result. This has not
>> been necessary so far, but it is essential if we are to
>> correctly interpolate when an individual moves between
>> groups, as above in Figure 16: "An Individual is Censused
>> in 2 Groups". We must return to the fundamentals to make
>> sense of interpolation. Rather than trying to combine the
>> results of interpolating the groups separately, as was done
>> in Figure 17: "Interpolating Each Group Separately",
>> instead combine the results of interpolating the presences
>> in all the groups with separate interpolations of the
>> absences in each group. Each time a census finds an
>> individual in a group, separately compute both the interval
>> halfway to the nearest census that finds the individual in
>> any group and the interval halfway to the nearest absence
>> from the particular group being censused. In Figure 19,
>> this method is applied to the data first seen in Figure 16.
>> For clarity the intervals surrounding the censuses that
>> belong to one group are shown separately from those
>> belonging to the other group.^[21] The lines labeled
>> Presence show the intervals that are halfway from each
>> census to the nearest census that finds the individual in
>> any group. The lines labeled Absence show the intervals
>> that are halfway from each census to the nearest absence in
>> the same group. Censuses with no neighboring absence do not
>> have this latter sort of interval shown.^[22]
>>
>> Figure 19. Presence and Absence Interpolated Separately
>>
>> One individual's census records
>> Group 1: C C A C A
>> Group 2: A C C
>>
>> Group 1 The intervals of group 1's censuses
>> Presence: X---|---X-----| x |-----X-| x
>> Absence: X---------| O |-X-| O
>>
>> Group 2 The intervals of group 2's censuses
>> Presence: x x |-----X-----| x |-X
>> Absence: O |---------X
>>
>> Date: 1 2 3 4 5 6 7 8 9 10
>>
>> Key:
>> C Censused present
>> A Censused absent
>> X Known present in same group
>> x Known present in different group
>> O Known absent in same group
>> - Inside of interval
>> | Interval endpoint
>>
>>
>> Figure 20 shows how interpolation combines the "presence"
>> and "absence" intervals by choosing the shorter of the two
>> as the period during which the individual is assumed to
>> be in the group where censused. The line labeled Used
>> contains the shorter of each census's two intervals.^[23]
>>
>> Figure 20. Combining Presence and Absence Intervals
>>
>> One individual's census records
>> Group 1: C C A C A
>> Group 2: A C C
>>
>> Group 1 The intervals of group 1's censuses
>> Presence: X---|---X-----| x |-----X-| x
>> Absence: X---------| O |-X-| O
>> Used: X---|---X-----| |-X-|
>> In Group: 1 1 1 1 ? ? ? ? 1 ?
>>
>> Group 2 The intervals of group 2's censuses
>> Presence: x x |-----X-----| x |-X
>> Absence: O |---------X
>> Used: |-----X-----| |-X
>> In Group: ? ? ? ? 2 2 2 ? ? 2
>>
>> Date: 1 2 3 4 5 6 7 8 9 10
>>
>> Key:
>> C Censused present
>> A Censused absent
>> X Known present in same group
>> x Known present in different group
>> O Known absent in same group
>> - Inside of interval
>> | Interval endpoint
>>
>>
>> Having interpolated the intervals surrounding each census,
>> determining the final group membership is a straightforward
>> matter of placing the individual in the unknown group when
>> there's nowhere else to put him. Figure 21 shows this
>> process. All that remains is to compute the Interp values
>> in the usual fashion, by ignoring absences and counting
>> distance from the nearest census. In Figure 21 the
>> intervals between locating censuses are shown, labeled For
>> Interp, to support the Interp values given.
>>
>> Figure 21. Group Membership Given Multiple Groups
>>
>> One individual's census records
>> Group 1: C C A C A
>> Group 2: A C C
>>
>> Group 1 The intervals of group 1's censuses
>> Used: X---|---X-----| |-X-|
>> In Group: 1 1 1 1 ? ? ? ? 1 ?
>>
>> Group 2 The intervals of group 2's censuses
>> Used: |-----X-----| |-X
>> In Group: ? ? ? ? 2 2 2 ? ? 2
>>
>> Intervals between locating censuses
>> For Interp: X~~~|~~~X~~~~~|~~~~~X~~~~~|~~~~~X~|~X
>>
>> MEMBERS.
>> Group: 1 1 1 1 2 2 2 9 1 2
>> Interp: 0 1 0 1 1 0 1 1 0 0
>> Origin: C I C I I C I I C C
>>
>> Date: 1 2 3 4 5 6 7 8 9 10
>>
>> Key:
>> C Censused present
>> A Censused absent
>> X Known present in same group
>> - Presumed present
>> ~ Inside of interval
>> | Interval endpoint
>>
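The Interp row of Figure 21 can be reproduced with a short sketch. It assumes every locating census is an interpolating one (no N or historical rows, which would force NULL values), and it reads the locating census days 1, 3, 6, 9, and 10 from the Origin row of the figure. The name interp_values is illustrative only and is not part of Babase.

```python
def interp_values(locating_days, first_day, last_day):
    """Map each day to its distance, in days, from the nearest locating census.

    Absences are ignored entirely, as the text describes.
    Assumption: all locating censuses are interpolating censuses.
    """
    if not locating_days:
        raise ValueError("at least one locating census is required")
    return {
        day: min(abs(day - census_day) for census_day in locating_days)
        for day in range(first_day, last_day + 1)
    }


# Locating censuses of Figure 21 (groups 1 and 2 combined).
figure_21 = interp_values({1, 3, 6, 9, 10}, first_day=1, last_day=10)
print([figure_21[day] for day in range(1, 11)])
# prints [0, 1, 0, 1, 1, 0, 1, 1, 0, 0]
```

The group in which each census was taken plays no part here: Interp counts distance to the nearest census in any group.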
>>
>> By now it should be clear that interpolation^[24] is a
>> function over CENSUS row sets. It is a function: for every
>> input you get exactly one output. It takes sets of CENSUS
>> rows as input. Because sets are unordered you can put
>> CENSUS rows into the database in any order and the result
>> will be the same. And, because it is a function, you can
>> re-interpolate the same CENSUS rows as many times as
>> desired without altering the final result.
>>
>> It should also be clear why interpolation always chooses to
>> use "the shorter interval", and why this always produces
>> the "correct" result. The shorter interval is short for a
>> reason: there is some reason to believe the individual is
>> not in the group, or else the interval would be longer.
>> Further, every time the shorter interval is chosen a
>> possible overlap with another interval from a different
>> locating census is eliminated. By always choosing the
>> shorter interval, interpolation ensures that the
>> interpolations of any two locating censuses will not
>> conflict.
>>
>> Pre-Analyzed Data Disturbs Interpolation
>>
>> In addition to that most important distinction which
>> classifies CENSUS rows into absent and locating censuses
>> there is a second distinction which further divides
>> locating censuses into those which interpolate and those
>> which do not. Those CENSUS rows that record observational
>> data, those with Status values of C, D, and M,^[25] are
>> interpolating censuses. (All of the previous examples have
>> concerned CENSUS rows of this type.) The remaining
>> CENSUS.Status values, all of the "old style", that is
>> "historical", CENSUS.Status values and the N manual Status
>> value, indicate that the CENSUS row is the result of
>> analysis. These are the non-interpolating censuses.
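A minimal sketch of this classification follows. The interpolating Status values C, D, and M come from the text. Treating a Status of "A" as the absence code is an assumption based on the figure keys, and the function names are illustrative only.

```python
# Status values that record observational data (from the text).
INTERPOLATING_STATUSES = frozenset({"C", "D", "M"})

# Assumption: absences carry a Status of "A", as in the figure keys.
ABSENT_STATUS = "A"


def classify_census_status(status):
    """Return 'absence', 'interpolating', or 'non-interpolating'."""
    if status == ABSENT_STATUS:
        return "absence"
    if status in INTERPOLATING_STATUSES:
        return "interpolating"
    # Everything else (historical codes and the manual N code) is
    # treated as already analyzed data.
    return "non-interpolating"


def is_locating(status):
    """A locating census is any CENSUS row that is not an absence."""
    return classify_census_status(status) != "absence"


for code in ("C", "N", "A"):
    print(code, classify_census_status(code), is_locating(code))
```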
>>
>> This further division of locating censuses into
>> interpolating and non-interpolating, the division between
>> raw and already analyzed data, leads to the final
>> refinement to the interpolation procedure. We do not want
>> interpolation to produce re-analyzed results from already
>> analyzed data. Interpolation occurs only between "regular",
>> that is to say interpolating, censuses (and to the birth
>> date as a special case). "Non-interpolating" census rows
>> are copied directly from CENSUS to MEMBERS, CENSUS.Status
>> becomes MEMBERS.Origin, and Interp is set to 0. When a
>> non-interpolating census is found on the birth date, the
>> birth date will not interpolate.
SCA asks, what does this mean exactly? What are the consequences?
Just that interpolation won't occur and the data will all be treated
at face value?
>>
>> Interpolation looks at "regular" census rows and attempts
>> to guess the individual's location on those days when there
>> are no observations. It does so by looking at the intervals
>> between the "regular" censuses. Finding non-interpolating
>> CENSUS rows, that is to say already analyzed data, on one
>> of these intervals breaks the assumptions interpolation
>> uses in its "guessing". The previously analyzed data point
>> could be there for any reason at all, and there's no point
>> in pretending it's not there either. What interpolation
>> does is give up. It interpolates up to the offending data
>> point and then stops.^[26] After that it still creates rows
>> in MEMBERS, but it does not attempt to make guesses about
>> where to place an individual or what the interpolated row
>> means.
>>
>> Note
>>
>> This situation is not expected to occur, or, rather,
>> whenever there are non-interpolating CENSUS rows between
>> interpolating censuses, the non-interpolating CENSUS rows
>> are expected to be contiguous over the entire interval
>> between the interpolating censuses. So, the expected cases
>> are the trivial degenerate ones. Nonetheless, such
>> situations probably do occur in the existing data. It would
>> probably best to either require the expected behavior, or
...probably be best to either require...
>> to get rid of all the pre-analyzed CENSUS rows and replace
>> them with raw data. Especially given the design problems
>> pointed out below.
>>
>> Regardless, non-trivial examples are presented here so that
>> a complete understanding of interpolation can be developed.
>>
>> Figure 22 shows that the 3 fundamental interpolation
>> intervals are shortened when a non-interpolating census is
>> found between interpolating censuses. The intervals for
>> each locating census are examined separately. The
>> non-interpolating census has no interpolation intervals.
>> The intervals of the interpolating censuses are truncated,
>> reduced to the interval between the interpolating and
>> non-interpolating censuses. By this means a portion of the
>> diagram, days 4 and 5, is blocked from interpolating into
>> the group. If there were no N census, the Absence interval
>> would be day 1's shortest interval, and days 4 and 5 (as
>> well as day 3) would interpolate into the group. (Notice
>> that day 1's Absence interval has a midpoint day, day 5,
>> and that it would have been included in the interval.)
>> Interpolation is prevented from placing individuals in the
>> group of their interpolating census on the "far side" of
>> non-interpolating censuses.
>>
>> Figure 22. Pre-Analyzed Data Truncates Interpolation
>> Intervals
>>
>> CENSUS rows from group 1
>> CENSUS: C N A C
>>
>> Day 1 Intervals per fundamental type
>> Presence: X-----| N X
>> Absence: X-----| N O
>> 14 Day Lim: X-----| N
>>
>> Day 3 Intervals per fundamental type
>> Presence: N
>> Absence: N
>> 14 Day Lim: N
>>
>> Day 12 Intervals per fundamental type
>> Presence: X N |---------------------X
>> Absence: N O |-----X
>> 14 Day Lim: N |---------------------------------X
>>
>> Julian Day: 1 2 3 4 5 6 7 8 9 10 11 12
>>
>> Key:
>> C Censused present in group (group 1)
>> N Manual entry,
>> present in group but non-interpolating (group 1)
>> A Censused absent in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Inside of interval
>> | Interval endpoint
>>
>>
>> In Figure 23 the shortest intervals of each locating census
>> have been chosen and combined; the result is the line
>> labeled For Group. This is then used to determine group
>> membership.
>>
>> The interesting part of Figure 23 is the computation of the
>> Interp values. The "halfway to census" intervals of
>> Figure 22 have been combined and labeled For Interp. Recall
>> that it is these intervals that are used to compute the
>> Interp values. The N census has created a "gap" in
>> interpolation, clearly shown on the For Interp line as
>> running from day 3 through day 6. Over this interval
>> interpolation's assumptions have been violated and it does
>> not know what to do. The group membership is easy. On day
>> 3, the day of the N census, it can simply copy the CENSUS
>> row's Grp and Status into the appropriate MEMBERS columns
>> in the same fashion it would for any other locating census.
>> On days 4 through 6 it can do what it usually does with
>> group membership when it does not know where to locate an
>> individual: it places the individual in the unknown group
>> with an Origin of I. On days 3 through 6 interpolation has
>> no way of knowing how far away the day is from the nearest
>> locating census, which is what is supposed to go in the
>> Interp column. Due to this lack of information it assigns
>> the Interp column a value of NULL, no data, on this
>> interval.
>>
>> Figure 23. Pre-Analyzed Data Interrupts Interpolation
>>
>> An individual is censused
>> CENSUS: C N A C
>> Intervals
>> For Group: X-----| N O |-----X
>> For Interp: X~~~~~| |~~~~~~~~~~~~~~~~~~~~~X
>> MEMBERS.
>> Group: 1 1 1 9 9 9 9 9 9 9 1 1
>> Interp: 0 1 5 4 3 2 1 0
>> Origin: C I N I I I I I I I I C
>>
>> Date: 1 2 3 4 5 6 7 8 9 10 11 12
>>
>> Key:
>> C Censused present in group (group 1)
>> N Manual entry,
>> present in group but non-interpolating (group 1)
>> A Censused absent in group (group 1)
>> X Known present in group (group 1)
>> O Known absent in group (group 1)
>> - Presumed in group (group 1)
>> ~ Inside of interval
>> | Interval endpoint
>>
>>
>> When looking at Figure 23, one way to explain what happens
>> to Interp is to say that it is fixed at NULL over that
>> portion of the day 1 census's "halfway to census" interval
>> that was truncated because the N row showed up. (See
>> Figure 22.) Effectively, as MEMBERS Interp counts up with
>> increasing distance from the interpolating census, the
>> count is fixed at NULL upon encountering a
>> non-interpolating census until the point is reached at
>> which counting back down to the next interpolating census
>> begins, at which point the count downward resumes as though
>> never interrupted.^[28]
>>
>> The approach interpolation takes, in some sense, attempts
>> to minimize the disturbance created when already analyzed
>> census data is mixed in with raw census information.
>> However, as can be seen in Figure 23, it is not entirely
>> successful. Although day 7, for example, has an Interp
>> value indicating it is 5 days away from a census, it is
>> really 4 days away from the N census. If the N CENSUS does
>> really represent a census, then day 7's Interp value is
>> wrong. And the problems are not restricted to Interp
>> values. Is it really true that days 4 and 5 should be
>> assigned to the unknown group? If so then why aren't there
>> N rows that say so? Day 2 is even more disturbing. There is
>> no diagram for this, but suppose the N census found the
>> individual in a different group. Figure 22 would be
>> unchanged, all of day 1's intervals would be truncated at
>> the N census. The effect would be more clear if the
>> interval between the preceding C census and the following N
>> census were larger, but consider that day 2, by the
>> midpoint rule, would be "assigned" to the N census. That
>> means that if the N census really does represent a census
>> in a different group, day 2 should be assigned to that
>> group, not to group 1.
>>
>> Note that, in the general case, even though the "halfway to
>> census" interval does not determine group membership (all
>> the intervals are truncated, leaving a "gap" in which
>> interpolation defaults to the unknown group), whether this
>> interval has a midpoint day, and if so where it falls, does
>> matter to the computation of Interp. If the midpoint day
>> happens to fall into the side of the interval containing
>> the non-interpolating census then the Interp value will be
>> NULL. Otherwise, it will have a value representing the
>> number of days to the nearest locating, and interpolating,
>> census.
>>
>> Incorporating the above safety checks into the rules we
>> already have, ensuring that data is not re-analyzed,
>> produces the actual interpolation rules.
>>
>> The Interpolation Rules
>>
>> Using these rules interpolation creates rows in MEMBERS
>> based on the information it finds in CENSUS, and the
>> BIOGRAPH columns Birth, Matgrp, Statdate and Status.
>>
>> I. CENSUS Rows Are Either Absences, Interpolating, or
>> Non-Interpolating
>>
>> Interpolation partitions all CENSUS rows into one of 3
>> categories:
>>
>> 1. Absences
>>
>> CENSUS rows which indicate absence from a group.
>>
>> 2. Interpolating censuses
>>
>> Those CENSUS rows that record observational data,
>> those with Status values of C, D, and M, are
>> interpolating censuses.
>>
>> 3. Non-interpolating censuses
>>
>> The remaining CENSUS.Status values indicate the
>> CENSUS row is the result of analysis. These rows,
>> all of the "old style", that is "historical",
>> CENSUS.Status values and the N manual Status value,
>> are not re-analyzed and so do not interpolate.
>>
>> For convenience, the CENSUS rows that are not absences,
>> the interpolating and the non-interpolating censuses,
>> are termed "locating censuses".
>>
>> II. Censusing Assigns Group Membership
>>
>> On those days when an individual is censused in a
>> group, when there is a locating CENSUS row, a row is
>> created in MEMBERS to place that individual in the
>> group on the given day. The Origin value is the CENSUS
>> row's Status value. When the CENSUS row is
>> interpolating the Interp value is 0. When the CENSUS
>> row is non-interpolating the Interp value is NULL.
>>
>> III. The 3 Interpolation Intervals
>>
>> Interpolation places an individual in the group into
>> which he is censused, the Grp of an interpolating
>> CENSUS row (Status values C, D, and M), on the days to
>> either side of the census being interpolated for a
>> time period that is the shorter of:
>>
>> 1. The Halfway to Census Interval
>>
>> Half of the time interval between the
>> individual's next (or prior) locating and
>> interpolating census, which may locate the
>> individual in any group.
>>
>> 2. The Halfway to Absence Interval
>>
>> Half of the time interval between the next (or
>> prior) recorded absence, considering only
>> absences from the same group in which the
>> individual was censused. Absences from other
>> groups are ignored.
>>
>> 3. The 14 day Interpolation Limit
>>
>> Given no other information, an individual is
>> considered to remain (or have been) in the group
>> where observed for 14 days following (or
>> preceding) the date of observation.
>>
>> The resulting MEMBERS rows have an Origin of I and an
>> Interp value of the number of days difference between
>> the MEMBERS row's Date and the date of the nearest
>> locating census; Interp values count up over The
>> Halfway to Census Interval as the distance from the
>> interpolated census increases. An interpolated MEMBERS
>> row falling on the day after a census has an Interp of
>> 1, the day after that the Interp is 2, and so forth,
>> assuming, of course, the individual has no other
>> nearby CENSUS rows.
>>
>> IV. The Midpoint Rule
>>
>> This rule qualifies how interpolation assigns the
>> halfway point between two CENSUS rows in The Halfway to
>> Census Interval and The Halfway to Absence Interval,
>> above, when the number of days in the interval cannot
>> be divided into equal halves. Whenever interpolation is
>> called upon to halve an interval between two CENSUS
>> rows that contains an odd number of days, the
>> "midpoint day" is assigned to the left, earlier, half
>> of the interval when the Julian date of the midpoint
>> day is even. A midpoint day is assigned to the right,
>> later, half of the interval when the Julian date of the
>> midpoint day is odd.
>>
>> V. Births Locate Individuals
>>
>> This rule declares a live birth to be the equivalent of
>> an interpolating census, one that indicates presence in
>> the individual's Matgrp. Fetal losses, individuals with
>> NULL Snames, are not considered births and are never
>> interpolated. An individual is placed in his Matgrp on
>> his birth date even when a regular census has an absence
>> recorded for the individual on the date of birth. In
>> this case interpolation always entirely ignores the
>> absence and will not use such an absence to compute a
>> Halfway To Absence Interval.
>>
>> When there is a locating census on the birth date, the
>> MEMBERS row interpolation creates is like that made for
>> any other locating census with the given Status. But,
>> when there is no locating census on the birth date, the
>> resulting MEMBERS row has an Origin of I (and an Interp
>> of 0, as any census with a Status of C would have). Aside
>> from their I Origin value, births interpolate as would
>> any CENSUS with a C Status.
>>
>> VI. No Data Implies Unknown Group Membership
>>
>> On days when none of the above rules serve to place an
>> individual in a group, the individual is placed in the
>> unknown group. The resulting MEMBERS rows have an
>> Origin of I and an Interp value of the number of days
>> difference between the MEMBERS row's Date and the date
>> of the individual's nearest interpolating census.^[29]
>>
>> VII. Birth stops interpolation
>>
>> Interpolation will not place a row in MEMBERS before
>> an individual's Birth date.
>>
>> VIII. Death stops interpolation
>>
>> When an individual is dead, interpolation will not
>> place a row after the individual's Statdate.
>>
>> IX. Data Entry Cessation Stops Interpolation of Living
>> Individuals
>>
>> When an individual is alive, interpolation will create
>> rows after the individual's last locating census only
>> when there are subsequent absences; absences, that is,
>> from the group in which the individual was
>> censused.^[30] In this case, unlike above, no data does
>> not imply unknown group membership; such rows are
>> created only so long as the individual is interpolated
>> into the group of his last locating census. When a
>> living individual has no absences after their last
>> locating census, absences from the group of their last
>> locating census, interpolation assumes that there is
>> further data available which has yet to be entered and
>> interpolation stops at the last locating census.
>>
>> X. Data is not Re-Analyzed
Revision: X. Data are not Re-Analyzed
>>
>> Interpolation is only done to regular, that is
>> interpolating, CENSUS rows; data that was collected in
revision: data that were collected in
>> the field. Other data, the "non-interpolating" census
>> rows that represent the result of prior analysis, do not
>> interpolate; they are copied directly from CENSUS to
>> MEMBERS, CENSUS.Status becomes MEMBERS.Origin and Interp
>> is set to 0. Further, when a non-interpolating census is
>> found on one of The 3 Interpolation Intervals the
>> interval is shortened enough that the non-interpolating
>> census is no longer on the interval. When a
>> non-interpolating census is found on a birth date, the
>> birth date does not interpolate.
>>
>> The MEMBERS Interp column is fixed at NULL on the
>> interval from the non-interpolating census row through
>> the "midpoint" end of The Halfway to Census Interval,
>> endpoints included.^[31] Here we are speaking of The
>> Halfway to Census Interval as computed, not a Halfway to
>> Census Interval shortened in the preceding paragraph.
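The Midpoint Rule (rule IV above) can be sketched as follows. The sketch takes the wording of rule IV literally: an even midpoint day goes to the earlier half and an odd one to the later half. It assumes days are given as integer Julian day numbers, and the function name is illustrative only. If the production code assigns midpoints the other way round, only the parity test needs to change.

```python
def last_day_of_earlier_half(left_day, right_day):
    """Halve the days lying strictly between two CENSUS rows.

    Returns the last day assigned to the earlier (left) CENSUS row.
    The result equals left_day when no day is assigned to it.
    Days are integer Julian day numbers (an assumption of this sketch).
    """
    if right_day <= left_day:
        raise ValueError("right_day must be later than left_day")
    days_between = right_day - left_day - 1
    half = days_between // 2
    last_left = left_day + half
    if days_between % 2 == 1:
        # An odd number of days leaves one midpoint day to assign.
        midpoint = left_day + half + 1
        if midpoint % 2 == 0:
            # Even Julian date: the midpoint joins the earlier half.
            last_left = midpoint
        # Odd Julian date: the midpoint joins the later half.
    return last_left


print(last_day_of_earlier_half(1, 12))  # 10 days between, no midpoint day
print(last_day_of_earlier_half(1, 3))   # midpoint day 2 is even
print(last_day_of_earlier_half(2, 4))   # midpoint day 3 is odd
```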
>>
>> Expectations and Implications
>>
>> It is expected that all non-interpolating CENSUS rows, that
>> is to say CENSUS rows produced by prior analysis, will be
>> clustered in contiguous intervals with "regular" census
>> rows at the endpoints. This is particularly expected of
>> "old style" census rows from before Babase, as they precede
>> all "regular" census data, but is also expected of the N
>> non-interpolating, manual, Status code, should it ever be
>> used. If these expectations are borne out, the Data is not
>> Re-Analyzed rule will never be invoked.
>>
>> There are some not-quite-obvious implications given these
>> interpolation rules:
>>
>> o The only rows in MEMBERS that have an Origin of I, and
>> an Interp of 0, and are not placed in the unknown group
>> are birth dates. Not every birth date will have an
>> associated MEMBERS row with these values, as some birth
>> dates have locating censuses, but MEMBERS rows with
>> these values will be birth dates.
>>
>> o Living individuals, but not dead ones, can have MEMBERS
>> rows created by the interpolation procedure that locate
>> the individual in a group on a date later than the
>> individual's Statdate.^[32]
>>
>> o So long as an individual is alive the last CENSUS to
>> locate the individual ought to be followed by a record of
>> absence, an absence from the group where the individual
>> was last found. To do otherwise, as must occur when
>> there is simply no further data to be entered, is to
>> introduce a bias into MEMBERS.
>>
>> o Aside from births, the only other rows in MEMBERS with
>> an Origin of I and an Interp of 0 are those in the
>> unknown group which were created by Data is not
>> Re-Analyzed.
SCA: don't understand.
>>
>> o As fetal losses, individuals with NULL Snames, cannot
>> appear in CENSUS, are not considered a live birth, and
>> always have their birth date equal to their Statdate,
>> they never have MEMBERS rows associated with them.
>>
>> o When computing Interp values from The Halfway to Census
>> Interval The Midpoint Rule is usually immaterial.
>> However, when non-interpolating censuses affect the
>> interpolation The Midpoint Rule can be the factor that
>> determines whether or not a MEMBERS row has a 0 Interp
>> value.
>>
>>Data Entry
>>
>> Automaticlly Generated IDs
Typo: Automatically
>>
>> The system will automatically generate id columns whenever
>> a new row is inserted and an id column is not supplied.
>> When an id value is supplied the system does not check to
>> see that it is indeed the next id in sequence, nor does it
>> update the sequence number to be automatically supplied the
>> next time a row is inserted without an id column. Should an
>> id value be manually supplied, it may be necessary to
>> update the internal id counter so that future system
>> generated ids will not conflict with an id already used.
>>
>> See the PostgreSQL documentation section on sequence
>> functions for reference material on the requisite PostgreSQL
>> functions.
>>
>> Tip
>>
>> Don't supply id values manually unless you know what you're
>> doing.
>>
>>The Dataset Tables
>>
>> Data are entered into the system in datasets, small
>> temporary tables that each hold a batch of data to be
>> loaded.
>>
>> Datasets Containing INTERACT and PARTS Data
>>
>> INTERACT and PARTS are updated from data entered into a
>> single dataset. Their validation and update programs were
>> designed to allow additional information not presently
>> entered for agonism and grooming data to be entered. The
>> programs therefore base their actions on the presence of
>> particular field names. For example, the validation program
>> validates a date column if it is present, but runs just
>> fine if there is no date column. In keeping with this, the
>> programs are written so that they are dependent only on the
>> names of the data fields. This is good and bad. The good
>> part is that data can be more or less comprehensive and the
>> programs will still work. Also, the programs are not
>> dependent upon the creation order of the different columns
>> of the table's structure. The bad part is that an
>> unrecognized or misspelled column name will result in the
>> programs ignoring that column, even though the operator
>> might think that something is being done with the column.
>>
>> Datasets used to update INTERACT and PARTS should have
>> columns with the same names and datatypes as INTERACT and
>> two additional columns, Actor and Actee. The Actor column
>> contains the Sname of the actor, and the Actee the Sname of
>> the actee. Also, the Start and Stop columns are somewhat
>> different in the dataset than in INTERACT. In INTERACT,
>> the Start and Stop columns are character fields, 5 long,
>> in which no character may be a space: the first two
>> characters indicate the hour, the third is a colon, and the
>> last 2 indicate the minute. In the datasets, the Start and
>> Stop columns are 4
>> digit numeric fields. The first two digits are the hour (in
>> 24 hour format) and the last two are the minutes. Leading
>> zeros need not be entered and will not be retained.
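As an illustration of the two formats, here is a hedged sketch of converting a dataset Start or Stop value into the 5 character form used in INTERACT. The function name is hypothetical, and the range checks are an assumption about what counts as a valid time. The actual update program may validate differently.

```python
def dataset_time_to_interact(value):
    """Convert a numeric dataset time (e.g. 905) to 'HH:MM' (e.g. '09:05').

    The dataset does not retain leading zeros, so 905 means 09:05
    and 5 means 00:05.
    """
    hour, minute = divmod(int(value), 100)
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError(f"not a valid 24 hour time: {value!r}")
    # Zero-pad both parts so the result is always 5 characters long.
    return f"{hour:02d}:{minute:02d}"


print(dataset_time_to_interact(905))   # prints 09:05
print(dataset_time_to_interact(1430))  # prints 14:30
```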
>>
>> Datasets containing more than one type of data must contain
>> an Act column. The values of an Act column must be in the
>> Type column of the ACTS table, and the ACTS row must not
>> have an Old value of "Y". (The act must be one of the
>> codes currently in use.) Agonism and grooming datasets
>> must contain Actor and Actee columns. Act, Date, Start,
>> and Stop columns are optional. Mount and ejaculation
>> datasets must contain Date, Actor, Actee, and Start
>> columns. Act and Stop columns are optional. Consort
>> datasets must contain Date, Actor, Actee, Start, and Stop
>> columns. The Act column is optional.
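The column requirements above can be summarized in a small lookup table. This is a sketch only: the dataset kind labels ("agonism", "consort", and so on) are descriptive names chosen for the example rather than codes from the ACTS table, and comparing column names without regard to case is an assumption.

```python
# Required columns per kind of dataset, as listed in the text.
REQUIRED_COLUMNS = {
    "agonism": {"actor", "actee"},
    "grooming": {"actor", "actee"},
    "mount": {"date", "actor", "actee", "start"},
    "ejaculation": {"date", "actor", "actee", "start"},
    "consort": {"date", "actor", "actee", "start", "stop"},
}


def missing_columns(kind, columns):
    """Return the required columns absent from a dataset, sorted by name."""
    present = {name.lower() for name in columns}
    return sorted(REQUIRED_COLUMNS[kind] - present)


print(missing_columns("consort", ["Date", "Actor", "Actee", "Start"]))
# prints ['stop']
```

A check like this reports a missing or misspelled column name up front, instead of the column being silently ignored.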
>>
>> Datasets Containing CENSUS Data
>>
>> The CENSUS validation and maintenance programs operate on
>> datasets that look something like census logs. The datasets
>> should contain a Date column (of type date) and a column
>> for each individual in the census group. The columns for
>> the individuals should have as their name, the short name
>> for the individual. The individuals' columns should all be
>> defined as character data, one character long. Each row in
>> the dataset should contain the date of the census in the
>> Date column, and either nothing or a "0" (the character
>> zero) in the individuals' columns. A "0" means the
>> individual was absent when censused on the census date, an
>> empty column means that the individual was present when
>> censused.
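A rough sketch of how one row of such a dataset could be read. The short names ABC and XYZ and the function name are made up for the example. Rejecting any cell other than "0" or empty is an assumption about validation, not a description of Valcen.

```python
def census_rows(dataset_row):
    """Turn one census-log row into (sname, date, present) tuples.

    dataset_row maps column names to values: a Date column plus one
    single-character column per individual, named by short name.
    """
    date = dataset_row["Date"]
    rows = []
    for sname, value in dataset_row.items():
        if sname == "Date":
            continue
        cell = (value or "").strip()
        if cell == "0":
            present = False   # "0" records an absence
        elif cell == "":
            present = True    # an empty column records presence
        else:
            raise ValueError(f"unexpected value {value!r} for {sname}")
        rows.append((sname, date, present))
    return rows


# Hypothetical short names, for illustration only.
print(census_rows({"Date": "2005-03-02", "ABC": "", "XYZ": "0"}))
```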
>>
>> Datasets Containing DEMOG Data
>>
>> Entry of demography notes is made into a dataset with the
>> columns: date, Sname, group, reference, and comment. The
>> reference and comment columns correspond to the columns of
>> DEMOG. The other columns are used to link the DEMOG rows to
>> the CENSUS rows.
>>
>> Datasets Containing CYCLES Data
>>
>> The CYCLES validation and maintenance programs operate on
>> datasets that look just like the table except that they do
>> not contain a Cid column. The Cid and Seq columns are
>> created by the update program.
>>
>>BABASE PROGRAMS
>>
>> Data Maintenance Programs
>>
>> These are the programs that are used in the entry and
>> maintenance of the BABASE Master tables. Their use is fully
>> documented in the procedure manual. The summary written
>> here provides a statement of purpose and a mention of all
>> updated data. The operation and behavior of the programs
>> supports the table and program characteristics documented
>> in this manual. For more information on the actual
>> capabilities of the programs, see the documentation in the
>> programs' headers.
>>
>> Integrit
>>
>> Validate database integrity.
>>
>> Valinter
>>
>> Validate an INTERACT/PARTS dataset.
>>
>> Upinter
>>
>> Update INTERACT and PARTS master tables.
>>
>> Valcen
>>
>> Validate a CENSUS dataset.
>>
>> Upcen
>>
>> Update CENSUS and MEMBERS master tables with a census. The
>> statdate on BIOGRAPH will also be increased to the last
>> observed date. Note that all living animals must be
>> censused regularly in order to keep them from
>> "disappearing" when attempting to locate them with a join
>> to MEMBERS. The operator should enter group numbers where
>> appropriate. The group censused must exist in GROUPS
>> before UPMEMB can be run.
>>
>> Newincen
>>
>> Report on individuals appearing in a census who did not
>> appear on the immediately prior month's census of the
>> group. This can alert the operator to the need to manually
>> add an "A" census for new males on the immediately prior
>> census date of the group to prevent interpolation from
>> presuming the male's presence in the prior month. Also see
>> the description of the interpolation procedure below.
>>
>> Valdemog
>>
>> Validate a DEMOG dataset.
>>
>> Updemog
>>
>> Update DEMOG, CENSUS (if there is no census for the
>> individual on the day of the demography note), and MEMBERS
>> in a manner analogous to Upcen above. Also see the
>> description of the interpolation procedure below.
>>
>> Reinterp
>>
>> Re-creates the MEMBERS rows from BIOGRAPH and CENSUS in
>> the event of manual changes to either. Also see the
>> description of the interpolation procedure below.
>>
>> Valcycle
>>
>> Validate a CYCLES dataset.
>>
>> Upcycle
>>
>> Update the CYCLES master table.
>>
>> Ranker
>>
>> Generate an interaction matrix and update the group
>> rankings for a month.
>>
>> Subrank
>>
>> Generate ranking of a new type and update the RANKS table
>> by taking a subset of individuals from a previously
>> created ranking.
>>
>> Upseq
>>
>> Update the Seq field on the CYCLES master table.
>>
>> Upsuperg
>>
>> Update the Supergroup field of the GROUPS master table.
>>
>> Addcen
>>
>> Add rows to CENSUS recording an individual's membership in
>> a group over a time period. This program is exceptional in
>> that it can be invoked directly from the command window
>> with all the information necessary to run, or be invoked
>> interactively. This program is also exceptional in that it
>> is expected to be run in conjunction with regular FoxPro**
>> commands for ad-hoc maintenance of CENSUS. Consequently,
>> the program does not log the changes it makes to the
>> database, and checks only that the Sname and group Gid are
>> valid and that the individual is not located in any group
>> during the specified time period. Other database integrity
>> rules are not checked, as the program assumes that the
>> operator knows what s/he is doing. Also see the
>> description of the interpolation procedure below.
>>
>> Makerep
>>
>> Re-create the REPSTATS and CYCSTATS tables from the CYCLES
>> and PREGS tables.
>>
>> Useful Programs and Functions
>>
>> This section describes programs and functions available for
>> general use. These functions are in addition to those
>> supplied as part of the FoxPro** system. Typically, one
>> would use one of these programs as part of a special
>> process not part of the regular BABASE system. One would
>> use one of these functions in a SQL - SELECT statement, a
>> query, a report, or perhaps in a special purpose program,
>> or a new BABASE system program you might want to write. For
>> more detailed information on the operation of these
>> programs and functions see the documentation written into
>> the program header of the program source code.
>>
>> Functions
>>
>>Name
>>
>> be_celsius -- convert to centigrade if necessary
>>
>>Synopsis
>>
>> numeric(4, 2) be_celsius ( temperature,
>> units);
>>
>> numeric(5, 2) temperature ;
>> char(1) units ;
>>
>>Input
>>
>> temperature
>>
>> A temperature.
>>
>> units
>>
>> The units in which the temperature is measured. C when
>> Centigrade units, F when Fahrenheit units.
>>
>>Description
>>
>> A function which returns temperature measurements in
>> degrees Centigrade. Fahrenheit measurements are converted to
>> Centigrade; Centigrade measurements are left untouched.
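The conversion described here can be sketched in Python as follows. This is an illustration only, not the database function itself: rounding to two decimal places is an assumption made to mirror the numeric(4, 2) return type in the synopsis, and raising an error on an unrecognized unit code is likewise assumed.

```python
def be_celsius(temperature, units):
    """Return `temperature` in degrees Centigrade.

    `units` is "C" for Centigrade or "F" for Fahrenheit.
    """
    if units == "C":
        return temperature  # already Centigrade: left untouched
    if units == "F":
        # Standard Fahrenheit-to-Centigrade conversion, rounded to two
        # decimal places (an assumption based on the synopsis).
        return round((temperature - 32) * 5 / 9, 2)
    raise ValueError(f"unknown units code {units!r}")
```

For example, `be_celsius(98.6, "F")` gives 37.0, and `be_celsius(25.5, "C")` returns 25.5 unchanged.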
>>
>> Documentation on the use of these programs can be found in
>> the Protocol for Data Management: Amboseli Baboon Project
>> and in this document. This document also contains the
>> coding standards and design philosophy of the system, which
>> should be followed by anyone modifying or adding programs
>> to this directory.
>>
>> AGE()
>>
>> You supply this function with two dates, the first is the
>> earlier date and the second is the later date, and it
>> returns the number of full years between the two dates.
>>
>> Note
>>
>> To obtain the number of days between two dates, subtract
>> one date from the other.
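A rough Python equivalent of the AGE() behavior and of the date subtraction mentioned in the note. The function name `age` is borrowed from the text; how the real AGE() treats a February 29 birth date is not stated, so the handling here (the anniversary counts only once the month and day have been reached) is an assumption.

```python
from datetime import date


def age(earlier, later):
    """Number of full years between two dates, earlier date first."""
    years = later.year - earlier.year
    # Subtract one if the anniversary has not yet been reached in the later year.
    if (later.month, later.day) < (earlier.month, earlier.day):
        years -= 1
    return years


birth = date(1980, 6, 15)
full_years = age(birth, date(1990, 6, 14))      # day before the 10th anniversary
days_between = (date(1990, 6, 15) - birth).days  # plain date subtraction
```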
>>
>> JULIAN()
>>
>> You supply this function with a date and it returns the
>> integer that represents the given date as the number of
>> days since a particular reference date. This number is
>> known as the julian date representation of the given date.
>> (Day number 2,361,222 is September 14, 1752.) Legal values
>> for the date are between September 14, 1752 and December
>> 31, 9999, inclusive.
>>
>> UNJULIAN()
>>
>> This function undoes the JULIAN() function. You supply an
>> integer, the number of days since day number 0, and it
>> returns the date that goes with the number.
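For readers checking values outside FoxPro, the two functions can be approximated in Python. This sketch assumes the day numbering is the usual Julian day number on the Gregorian calendar; the offset constant is chosen so that the reference point given above (day 2,361,222 is September 14, 1752) holds. It does not enforce the legal date range.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal and the Julian day
# number, chosen so that September 14, 1752 maps to 2,361,222.
_ORDINAL_TO_JULIAN = 1721425


def julian(d):
    """Day number of date `d`."""
    return d.toordinal() + _ORDINAL_TO_JULIAN


def unjulian(n):
    """Date that corresponds to day number `n`; undoes julian()."""
    return date.fromordinal(n - _ORDINAL_TO_JULIAN)


reference = julian(date(1752, 9, 14))
round_trip = unjulian(julian(date(1988, 1, 1)))
```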
>>
>> COMPARE
>>
>> You supply the names of two tables and the program does a
>> row by row comparison of the two tables. The rows must be
>> in the same order or it will report a discrepancy.
>>
>> MERGEIP
>>
>> Creates a table which combines the data in the INTERACT
>> table and the PARTS table. This program exists because such
>> a table is often useful because the Record-count. A Scale
>> column is added to the second table. This column records
>> the number of days in which the paired individuals were in
>> the same Supergroup during the month. It may be used to
>> scale the number of interactions to the time period over
>> which interactions were possible. The Scale column may be
>> blank. A blank in the Scale column means that interactions
>> were recorded during the month, but that there is no record
>> in the system of the individuals being in the same
>> Supergroup during the month. The operator must also enter
>> the starting and ending dates of the time period over which
>> pairs should be generated.
>>
>> A "pair" is recorded as two individuals, in the animal1 and
>> animal2 columns. The "pairs" are order sensitive, so
>> recording individual "A" in the Animal1 and "B" in Animal2
>> is a different pair than recording individual "B" in
>> Animal1 and individual "A" in Animal2. The universe of
>> individuals for which pairs are considered consists of all
>> the individuals in either column in the table, together
>> with all the individuals in a second table supplied by the
>> operator.
>>
>> A typical select statement to create the interaction table
>> is (named "kop2" here):
>>
>> select actor.sname animal1, actee.sname animal2, ;
>>   count(*) count, firstday(interact.date) month ;
>>   from parts actor, parts actee, interact, ;
>>   biograph bactor, biograph bactee ;
>>   into table kop2 ;
>>   where actor.iid = interact.iid ;
>>   and actee.iid = interact.iid ;
>>   and actor.sname != actee.sname ;
>>   and bactor.sname = actor.sname ;
>>   and bactee.sname = actee.sname ;
>>   and interact.act = "G" ;
>>   and interact.date >= {1/1/1988} ;
>>   and interact.date < {1/1/1990} ;
>>   and actor.role = "R" ;
>>   and bactor.matgrp = 1 ;
>>   and bactee.matgrp = 1 ;
>>   and bactor.sex = "F" ;
>>   and bactee.sex = "F"
>>
>> ble which will record the animals Snames.
>>
>> The first two columns of the old table must be the columns
>> which contain the paired Snames. The data columns of the
>> new table will be named based on the column names of the
>> old table. When the data in the column comes from the row
>> of the old table that recorded the action of the first
>> column upon the second column the column name in the new
>> table will end in "_1" , when the data comes from the row
>> where the second column acted upon the first column, the
>> column name in the new table will end in "_2" .
>>
>> ZEROS
>>
>> This is the original version of the ZEROS2 program. It
>> operates on less structured tables than ZEROS2 and may
>> still be useful. The program is given two tables. The first
>> table contains a single column, a list of all the Snames to
>> appear as pairs in the resulting analysis. For safety's
>> sake, all the Snames in the first 2 columns of the second
>> table will also be included in the resultant pair
>> combinations. (Note that a Sname will not be paired with
>> itself, i.e. no self-directed behaviors are allowed.)
>>
>> The second table counts interactions between individuals.
>> The table must contain 3 columns. The first two columns
>> contain Snames, the third column records the number of
>> interactions between the individuals. (This number will
>> always be a positive integer, never zero. Which is the
>> problem.) The third column must have a data type of
>> "Number" .
>>
>> The program adds rows to the second table, ensuring that
>> all possible pairs of Snames appear. Any rows added are
>> given a zero value in the third column to record that the
>> individuals never interacted.
>>
>> Note
>>
>> Note that the order of the pairs in the second table is
>> significant. "A" in the first column and "B" in the second
>> might mean that individual "A" has groomed individual "B" ,
>> not vice versa -- the pairing is ordered. This program
>> ensures that there will be interaction counts for both the
>> pair "A" , "B" and the pair "B" , "A" . If necessary new
>> rows will be added for both pairs.
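The zero-filling that ZEROS performs can be sketched in Python as below. This is not the FoxPro program: the function name `add_zero_pairs`, the dict-based representation of the count table, and the Snames "A" through "D" are assumptions for illustration. The rules it follows (ordered pairs, no self-pairs, Snames from the count table also included, missing pairs given a count of zero) are those described above.

```python
from itertools import permutations


def add_zero_pairs(snames, counts):
    """Return a copy of `counts` with every missing ordered pair set to 0.

    `snames` is the list of Snames to pair (the first table).
    `counts` maps (first Sname, second Sname) to a positive interaction
    count (the second table).
    """
    universe = set(snames)
    # For safety, Snames that appear only in the count table are included too.
    for first, second in counts:
        universe.update((first, second))

    filled = dict(counts)
    # permutations(..., 2) yields every ordered pair and never pairs an
    # Sname with itself.
    for pair in permutations(sorted(universe), 2):
        filled.setdefault(pair, 0)
    return filled


# Hypothetical data: "D" appears only in the count table.
filled = add_zero_pairs(["A", "B", "C"], {("A", "B"): 3, ("D", "A"): 1})
```

With four Snames in the universe the result holds all 12 ordered pairs: the two recorded counts are kept and the other ten pairs are zero.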
>>
>> A typical select statement to create the interaction table
>> is (named "kop" here):
>>
>> select actor.sname, actee.sname, count(*) ;
>>   from interact, parts actor, parts actee, ;
>>   biograph bactor, biograph bactee ;
>>   into table kop ;
>>   where actor.iid = interact.iid ;
>>   and actee.iid = interact.iid ;
>>   and actor.sname != actee.sname ;
>>   and bactor.sname = actor.sname ;
>>   and bactee.sname = actee.sname ;
>>   and actor.role = "R" ;
>>   and interact.act = "G" ;
>>   and year(interact.date) = 1988 ;
>>   and bactor.matgrp = 1 ;
>>   and bactee.matgrp = 1 ;
>>   and bactor.sex = "F" ;
>>   and bactee.sex = "F"
>>
>>A. Changes to Babase between 1.0 and 2.0
>>
>> A number of changes were made to Babase in the transition
>> from FoxPro (Babase 1.0) to Postgresql (Babase 2.0). This
>> appendix attempts to document changes made to data
>> semantics.
>>
>> Changes to BIOGRAPH.Statdate
>>
>> The Statdate is now constrained, when the individual is
>> alive, to be the most recent date on which a census located
>> an individual in a group. Although this was true in
>> practice, the 1.0 system did not require it.
>>
>> This constraint leads directly to another: when the
>> individual is alive and there are no (non-absent) censuses,
>> the individual's Statdate must be the individual's
>> birth date. Because arbitrary Statdates are not allowed, we
>> prevent automatic changes from erasing manually set
>> Statdates.
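The two Statdate rules for a live individual can be restated as a small Python sketch. The function name and arguments are assumptions for illustration, and "locating census" is used here to mean a non-absent census that places the individual in a group.

```python
from datetime import date


def expected_statdate(birth, locating_census_dates):
    """Statdate a live individual must have under the 2.0 constraint.

    Returns the most recent locating census date, or the birth date
    when there are no locating censuses at all.
    """
    return max(locating_census_dates, default=birth)


birth = date(1990, 1, 1)
no_censuses = expected_statdate(birth, [])
with_censuses = expected_statdate(birth, [date(1990, 2, 1), date(1990, 3, 1)])
```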
>>
>> Changes To Interpolation and MEMBERS
>>
>> The interpolation procedure changed somewhat. As the
>> interpolation is what creates the MEMBERS table this
>> appendix also describes the changes made to MEMBERS between
>> 1.0 and 2.0.
>>
>> o Individuals have a row in MEMBERS for every day of
>> their lives.
For every day between birth and stat date right, with some rows
interpolated after stat date in the case of live animals. My point is
that a male that disappears and we have reason to think that he is
probably still alive somewhere doesn't keep showing up with a 9 in
MEMBERS.
>>
>> Interpolation now places individuals in the unknown
>> group when individuals' locations cannot be otherwise
>> assigned, for example outside of the 14 day
>> interpolation limit. Formerly, when the individual
>> could not be placed in a group on a particular day the
>> individual had no row in MEMBERS on that day.
>>
>> o Individuals are no longer placed in a group, the group
>> in which they were last censused, on their Statdate and
>> this "location" no longer interpolates.
>>
>> When first written, the interpolation procedure was
>> designed to work with females, who are unlikely to be
>> absent from their group for more than 28 days. (Twice
>> the 14 day interpolation limit.) By placing an
>> individual in a group on their Statdate, the group in
>> which they were last censused, the females were assured
>> a row in MEMBERS for every day of their lives. Further,
>> analysis was simplified as each of these rows
>> associated the females with their group (even though at
>> the end of their lives they may not have been present
>> in the group.)
>>
>> The new interpolation procedure does not consider the
>> Statdate in its determination of the individual's
>> group membership on that day, although, as always, when
>> the Statdate is a death date it does stop
>> interpolation.
>>
>> o There is a change in what happens when an individual is
>> censused absent on his birth date. In the new system, if
>> the individual is censused "absent" on his birth date,
>> interpolation will "override" the absence and place the
>> individual in his Matgrp group in MEMBERS.
>>
>> In the old system, if the individual was censused
>> "absent" on his birth date, interpolation would not
>> "override" the absence and place the individual in a
>> group in MEMBERS. As the individual is expected to be
>> somewhere on his birth date, it was expected that a
>> demography note be made for the individual on that date
>> to give the individual a location -- a row in MEMBERS.
>>
>> o MEMBERS.Interp may now be NULL. The FoxPro system did
>> not have NULL values. In the new system Interp is NULL
>> when interpolation does not know where the nearest
>> locating census is. See Pre-Analyzed Data Disturbs
>> Interpolation.
>>
>> o The behavior of interpolation on the last census is now
>> documented.
>>
>> The interpolation procedure changed during the period
>> of use of Babase 1.0, but the changes were not
>> documented. The primary change was that interpolation
>> was altered so that it did not interpolate if there were
>> no subsequent censuses, absent or not. This prevented
>> (almost) every living individual currently monitored
>> from having a 14 day "tail" of interpolated values
>> following the last entered census -- a "tail" that
>> would disappear the next time the census information
>> was updated.
>>
>> Changes To The Sexual Cycle Information
>>
>> The structure of the sexual cycle portion of the database
>> was changed. The CYCLES table became CYCPOINTS, the CYCGAPS
>> table was added, and the CYCSTATS and REPSTATS tables were
>> modified and made useful. For further information please
>> compare the old and new documentation.
>>
>>B. Docbook, Styling and other issues
>>
>> All things Docbook can be found at The Docbook Project. The
>> basic Docbook reference is Docbook: The Definitive
>> Guide.^[33] While this book describes how to write Docbook,
>> it does not describe how to generate output or how to vary
>> the "look" (the term of art is style) of the generated
>> output. A more gentle introduction can be found in Writing
>> Documentation Using DocBook. Babase uses the Unix xmlto
>> command, in conjunction with make, to generate various
>> Docbook output formats; to go into further detail is beyond
>> the scope of this document. However, as altering the style
>> of the Docbook output is something done rarely, it is useful
>> for the project to have some reference material on-hand as
>> a guide when needed.
>>
>> Those who wish to alter the style of the Babase
>> documentation should start by reading the Makefile to see
>> how xmlto is invoked. Follow this with an examination of
>> the style sheet fragments supplied to xmlto. These files
>> contain XSL, the Extensible Stylesheet Language, explained
>> in What is XSL?. To make further sense of this see the
>> reference material on styling Docbook. This is covered in
>> Docbook XSL: The Complete Guide, Part II. Stylesheet
>> options. Additional detail may be found in XSL Frequently
>> Asked Questions, and its companion Docbook Frequently
>> Asked Questions. The FO Parameter Reference is the
>> comprehensive list of formatting "customization variables".
>> The XSL specifications are available from the W3C, The
>> World Wide Web Consortium.
>>
>> An overview of XML and where XSL fits in can be found at
>> XML: The Big Picture.
>>
>> --------------
>>
>> ^[1] We do this rather than paying one of the regular
>> certification authorities to validate our identity. These
>> certification authorities appear to validate the identity
>> of their customers by virtue of having successfully been
>> paid.
>>
>> ^[2] As security restrictions permit, of course.
>>
>> ^[3] That way if you unknowingly revealed your password to
>> the terrorists last weekend when you were drunk, by the
>> time everybody sobers up the password will have been
>> changed and the amount of damage done will be limited.
>>
>> ^[4] Presently group 9.0. This is hardcoded at present.
>>
>> Individuals are generally put in the unknown group when
>> interpolation does not know their group membership, but it
>> is also possible for an individual to be explicitly placed
>> in the unknown group.
>>
>> ^[5] This is unlikely as the database will not allow entry
>> of a duplicate Sname.
>
>Jeanne wrote "it did" beside foot note number 5
>
>>
>> ^[6] Or whatever you want to call it in the case of a fetal
>> loss.
>
>Jeanne had a question mark beside foot note number 6
>
>>
>> ^[7] An actual census does not have to be taken, as the
>> Statdate of live individuals is derived from the CENSUS
>> table, any observation of an individual in a group which
>> results in a row being added to CENSUS is sufficient.
>
>Jeanne had a question mark beside foot note number 7.
>
>>
>> ^[8] This criterion is specifically phrased to account for
>> gaps in the recorded data during the time period in which
>> the peak turgescence probably occurred.
>>
>> ^[9] This is termed a visit in the Protocol for Data
>> Management, which should be consulted for further details.
>>
>> ^[10] D usually occurs when a male is seen alone or in a
>> non-census group.
>
>Revision to foot note number 10:
>
>D usually occurs when a male is seen alone or in a non-study census group.
>
>>
>> ^[11] DEMOG nearly makes the M CENSUS Status code obsolete,
>> were it not so hard to search on textual data. Indeed, it
>> was created in response to difficulties with the M code.
>>
>> ^[12] One would think that, in order to maintain perfect
>> database consistency, the actor and actee participants in
>> an interaction should be in the same Supergroup, according
>> to the MEMBERS table. The database consistency checker
>> (integrit.prg) does report when the actor and actee are not
>> members of the same Supergroup. However, there is currently
>> no check for actor/actee location correspondence in the
>> update programs. This is for three reasons. First,
>> movements between groups and the timing of censuses and
>> interaction data collection may result in valid records of
>> interactions between individuals that are recorded as being
>> in different supergroups. The effects of this on the manual
>> data correction process could be reduced by having the
>> interaction master table update process add additional
>> location data into the MEMBERS table, but not totally
>> eliminated because the resolution of the MEMBERS table is
>> one day and individuals can move between groups during a
>> one day interval. Second, some of the interaction data are
>> entered with a date of the first of the month, not the
>> actual date of the interaction. Thus, the animals could be
>> in different groups on the first of the month and still
>> interact during the month. When this situation is
>> discovered, the date of the interaction for these
>> interactions should be manually changed to the first day of
>> the month on which the two animals were in the same group.
>> Third, the lack of a check allows the interaction data to
>> be entered before the census data for the month. Also, from
>> 1989 through 1991, inclusive, the recorded group for the
>> sub-groups of Alto's group does not always represent the
>> actual location of the individual. (See the MEMBERS
>> documentation.)
>>
>> ^[13] At this time only DEMOG, the demography notes table,
>> contributes to CENSUS any information regarding group
>> membership.
>>
>> ^[14] Sometimes, when demography information is added into
>> other tables, CENSUS rows are altered rather than removed.
>> Likewise, CENSUS rows are removed (or altered as necessary)
>> when demography information is removed from other tables.
>>
>> ^[15] This is the one exception, if you wish to consider it
>> so, to the rule that an individual cannot be censused both
>> present and absent in the same group on the same day.
>>
>> ^[16] The "same group" condition is one that must be met
>> whenever interpolation examines intervals between presence
>> and absence.
>>
>> ^[17] As the individual is alive, every census that
>> post-dates the individual's Statdate must record an
>> absence, else the Statdate would be adjusted to reflect the
>> date of last census.
>>
>> ^[18] This is a heuristic. While it should work well enough
>> most of the time the Babase user must be aware of the
>> pitfalls in this approach. These are explained below.
>>
>> ^[19] Without this restriction interpolation would have to
>> insert rows forever, placing the individual in the unknown
>> group off into the indefinite future.
>>
>> ^[20] Notice that interpolation does not bother analyzing
>> absences, such as the last-most, that are not neighbor to
>> censuses.
>>
>> ^[21] As locating censuses are interpolated individually
>> the figure could diagram the intervals associated with each
>> census separately, as in Figure 18, work out group
>> membership from that, and then combine the results; the
>> outcome would be unaffected. The chosen presentation form
>> allows the interval endpoints to "match up" in a revealing
>> fashion. As an exercise the reader should prove to himself
>> that the intervals associated with each locating census are
>> accurately depicted, and that the order in which locating
>> censuses are interpolated does indeed make no difference.
>>
>> ^[22] Figure 18: " A Closer Look at Intervals" makes clear
>> that it is not necessary to show these intervals. By
>> definition, the omitted intervals will always be longer
>> than the "halfway to census" interval of the census being
>> interpolated. As the shorter interval is the one used the
>> longer may be ignored.
>>
>> ^[23] When there are two intervals. When there's no
>> "absence" interval the "Used:" line shows the "presence"
>> interval.
>>
>> ^[24] The proper term is "The Glorious Interpolation
>> Procedure", but we don't tell this to just anybody.
>>
>> ^[25] See MEMBERS.Origin.
>>
>> ^[26] It might be better if interpolation did not
>> interpolate at all on those intervals between interpolating
>> censuses that contain a non-interpolating census^[27] -- if
>> it put the individual in the unknown group, with an Interp
>> of 0 and an Origin of NULL whenever there was no locating
>> census. However, this could easily cause problems because
>> interpolation has always worked as the body of this
>> document describes. Although these situations are not
>> supposed to occur, it is likely the data contains such
>> situations and changes should not be made to interpolation
>> which break the database.
>>
>> ^[27] I have not thought this through. At first glance it
>> seems the code would be simpler, but perhaps not. And the
>> effect on data analysis is unclear. It is probably best to
>> adopt one of the solutions presented in the note below.
>>
>> ^[28] Although in this example we "count up" traversing the
>> timeline from left to right, had the N census been
>> closer to the right side of the diagram than the left we
>> would be "counting up" the interval by traversing the
>> timeline in the opposite direction, from right to left.
>>
>> ^[29] The same method is used to compute Interp values when
>> interpolation uses The 3 Interpolation Intervals, above.
>>
>> ^[30] This "same group" criterion corresponds with the
>> criterion found in The Halfway to Absence Interval.
>>
>> ^[31] Interp is fixed at 0 over the portion of The Halfway
>> to Census Interval that was truncated in the preceding
>> paragraph. Effectively, as MEMBERS Interp counts up with
>> increasing distance from the interpolating census, the
>> count is fixed at NULL upon encountering a
>> non-interpolating census until the point is reached at
>> which counting back down to the next interpolating census
>> begins, at which point the count downward resumes as though
>> never interrupted.
>>
>> ^[32] This is examined in detail in Interpolation at the
>> Statdate .
>>
>> ^[33] Be sure to read the edition that describes the
>> version of Docbook you're using. This text was written for
>> Docbook 4.3.
>>
>
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Susan Alberts, Associate Professor
Department of Biology, Duke University, Box 90338, Durham NC 27708
phone 919-660-7272 fax 919-660-7293