JBG, SCA, NHL, JA, EAA Sep 2018. JBG 2022
Complete the Babase Training Course before doing anything else.
Familiarize yourself with the SQL for Babase section on the Babase Wiki. It has lots of useful information about building queries.
- While planning analyses or exploring new data sets:
- Discuss with your advisor how to identify questions that are not already being asked of the data, and whom to approach for ‘clearance’ and training on various data sets, particularly those that are highly specialized and/or not in Babase proper, e.g., hormone data (Laurence and Susan), genetic/genomic resources (Jenny), or parasite data (Beth).
Read the documentation for the table(s) you want to use.
- Learn about any views pertinent to your data of interest by reading the documentation for views. They may save you considerable time and effort.
- Search on particular words in the technical specifications to find pertinent documentation for useful Views (e.g., searching on ‘Groups’).
- Ask database managers or more experienced users to help you identify useful views.
If you have questions about how to calculate something, find something, etc., be sure to check the Babase FAQs.
- Be aware of and properly employ all status columns (e.g., bstatus, mstatus, rstatus) and confidence columns (e.g., dcauseagentconfidence, dcausenatureconfidence, dispconfidence) that are relevant to your question.
- Study the GROUPS_HISTORY view, the BEHAVE_GAPS table, and the DATA_SUMMARY_MONTHLY and DATA_SUMMARY_YEARLY views as starting places for considering what kinds of data are available during which periods of time.
- Don’t assume that data coverage is equal across all time and all groups. Examine how your data of interest are distributed across time and groups. The BEHAVE_GAPS table and the DATA_SUMMARY_MONTHLY and DATA_SUMMARY_YEARLY views provide rough guides to time periods when behavioral data may be sparse, but they are not a substitute for examining the data yourself. It is your responsibility to know the limitations of the data.
- Plan carefully which groups you will include in your analysis. Lodge group experienced a very different environment than the other groups; in most cases you will want to exclude Lodge group and the groups into which it fissioned. Beware also of study groups that have been dropped but continue to exist after regular data collection has ended. This has become increasingly common and warrants careful use of GROUPS_HISTORY.
During fission periods, some types of demographic measures (e.g., adult sex ratios) are not reliable because group membership may shift on an hourly or daily basis. More recent fissions have real-time membership data in census, whereas this information still needs to be backfilled for older fissions; see the DatasetStatusTable on the wiki for updates.
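The advice above about checking how data are distributed across time and groups can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with a toy in-memory table, and the table and column names (`observations`, `grp`, `obs_date`) are hypothetical, not actual Babase names. Date functions also differ between SQL dialects, so the `strftime` call would need to be replaced with the equivalent for the database you are querying.

```python
import sqlite3

# Toy stand-in for a database table. The table and column names here
# (observations, grp, obs_date) are hypothetical, not real Babase names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (grp TEXT, obs_date TEXT)")
conn.executemany(
    "INSERT INTO observations VALUES (?, ?)",
    [
        ("A", "2001-03-14"),
        ("A", "2001-07-02"),
        ("A", "2003-01-20"),
        ("B", "2001-05-05"),
        ("B", "2002-11-30"),
    ],
)

# Count records per group per year. strftime() is SQLite-specific.
COVERAGE_SQL = """
    SELECT grp,
           CAST(strftime('%Y', obs_date) AS INTEGER) AS yr,
           COUNT(*) AS n_records
      FROM observations
     GROUP BY grp, yr
     ORDER BY grp, yr
"""
coverage = conn.execute(COVERAGE_SQL).fetchall()


def missing_years(rows):
    """Map each group to the years with no records between its first and last year."""
    years_by_grp = {}
    for grp, yr, _n in rows:
        years_by_grp.setdefault(grp, set()).add(yr)
    return {
        grp: [y for y in range(min(yrs), max(yrs) + 1) if y not in yrs]
        for grp, yrs in years_by_grp.items()
    }


gaps = missing_years(coverage)
for row in coverage:
    print(row)
print(gaps)
```

A table like this only flags years with no records at all. Sparse years, or gaps shorter than a year, still require looking at the counts themselves.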
- While carrying out your analyses:
Keep an electronic notebook—the equivalent of a lab notebook—that records and explains all queries you use to gather data. Documenting your queries is extremely important both so that you know how you got the dataset you’re working with, and so that you don’t have to re-invent the wheel when you need to reconstruct old datasets.
See examples of query documentation on the wiki.
- When you run a query and save its output for an analysis, be sure to save the date and time—in Babase, not your local time—that you did it. You should probably mention this date in any publications or data repositories where your data will actually be used. Adding “NOW()” or “CURRENT_TIMESTAMP” to the list of columns you’re selecting is a good way to ensure you've got the right time.
- Keep an eye out for errors in the database! We make every effort to prevent erroneous data from being entered into Babase, but it can happen. If you find something that you suspect to be an error:
- First check the documentation for the table/view where the “error” is stored, and make sure you correctly understand the data in question.
- Make every effort to track down exactly what, in your query, is producing the error. This will help you figure out whether it’s an error in the data or an error in your query, an important step before going to a manager.
- If you still suspect you have an error, contact a data manager and send them the SQL code you used that produces/highlights the error. Below is a list of the data managers and the datasets each of them oversees:
Niki (Princeton): Census/Demography, Female cycling/pregnancy, Mature dates, Disperse Dates, Female ranks, Rank dates (some), Consort dates, Weather, Wounds/Pathology, Intergroup encounters, Subgroup notes, Neonatals, Deaths and corpse_info, Entrydates, Hybrid scores
Jake (Duke): Interactions (agonisms, groomings, mounts/consorts, multiparty interactions), Predations, Human disturbances, Focal sampling, SWERB, Male ranks, Rank dates (some), Dartings, inventory at Duke
Willi (Notre Dame): Parasites, inventory at Notre Dame
Laurence (Duke): Hormones
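The record-keeping advice in this section (document every query, and save the database's own time at which it was run) could be sketched as below. This is a minimal illustration rather than a prescribed workflow: it uses Python's built-in sqlite3 module as a stand-in database, and the table `individuals`, its columns, and the helper `notebook_entry` are all hypothetical. The one idea taken from the guidelines is adding `CURRENT_TIMESTAMP` to the selected columns so the timestamp comes from the database rather than from your local clock.

```python
import json
import sqlite3

# Toy stand-in database. The table and columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE individuals (id TEXT, sex TEXT)")
conn.executemany(
    "INSERT INTO individuals VALUES (?, ?)",
    [("X01", "F"), ("X02", "M"), ("X03", "F")],
)

# CURRENT_TIMESTAMP is evaluated by the database, not the local machine.
QUERY = """
    SELECT id, sex, CURRENT_TIMESTAMP AS queried_at
      FROM individuals
     WHERE sex = 'F'
     ORDER BY id
"""


def notebook_entry(conn, sql, purpose):
    """Run sql and return (rows, entry), where entry documents the query.

    Assumes the query selects a column aliased as queried_at.
    """
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    ts_idx = columns.index("queried_at")
    entry = {
        "purpose": purpose,
        "sql": sql.strip(),
        "queried_at": rows[0][ts_idx] if rows else None,
        "n_rows": len(rows),
    }
    return rows, entry


rows, entry = notebook_entry(conn, QUERY, "Toy example: list females")
print(json.dumps(entry, indent=2))
```

Saving each entry alongside the exported data keeps the exact SQL and its run time together. The same SQL is also what you would send to a data manager when reporting a suspected error.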