New users must successfully complete this quiz before working on their own projects!

1. What does ending your query with “limit 100” do and when should you use this statement?

2. Write a query that returns all the information in biograph about offspring of the female named PLUM in birth order. (Hint: Don’t order by pid – pids may not match parity and 10 through 19 will organize themselves between 1 and 2.)

3. The result of the below query could be described as "a list of all dead individuals".

SELECT *
  FROM biograph
  WHERE status = 1;

The result of the below query will return a list of...what?

SELECT *
  FROM biograph, members
  WHERE matgrp = grp
    AND biograph.sname = members.sname
    AND sex = 'M'
    AND status = 0
    AND dcause = 0
    AND statdate = date
  ORDER BY matgrp, pid;

4. Write a query whose result shows each individual’s short name, birth date, status of their birthdate estimate, their year of birth, maturedate, and whether their maturedate is an “on” or a “by” date, and that will write that table to your own schema in a table called “temptable”, and will order it by the individual’s name. The table should return only females, and only ones that actually have a mature date. (Hint: You'll need biograph and maturedates.)

For questions 5 and 6, you are asked to fix a query. If you have trouble understanding how/why the query is wrong and/or how to fix it, you may find our discussion about wrong queries useful.

5. You are running a query to find all grooming events on adult males between 2000 and 2008, inclusive, organizable by year and the male’s age, but your query is hanging up in Babase (because you forgot to use limit 100). After you ask a Babase administrator to kill your query, you need to figure out what is wrong. Add a line to the query to fix it.

select *, (actor_actees.date - biograph.birth)/365.25 as male_age
  from actor_actees
  join ranks
    on ranks.grp = actor_actees.actee_grp
  join rankdates
    on rankdates.sname = actor_actees.actee
  join biograph
    on biograph.sname = actor_actees.actee
where actor_actees.date >= '2000-01-01' and
       actor_actees.date <= '2008-12-31' and
       actor_actees.date >= rankdates.ranked and
       rnkdate(actor_actees.date) = ranks.rnkdate and
       actor_actees.act = 'G' and
       ranks.rnktype = 'ALM'
order by actor_actees.date;

New users--for example, people who are taking this quiz--are often confused by the use of the "rnkdate()" function in the line, "rnkdate(actor_actees.date) = ranks.rnkdate". That is a custom function that is unique to Babase. See the documentation for more information.

6. You want a list of all conceptions that have occurred since the beginning of hydrological year 2000 (i.e., since 1 Nov 1999), including which group the mom was in at conception. This time you remembered to use “limit 100” and don’t have to have your query killed but the output is clearly wrong – when Babase finally spits out an answer, all 100 rows returned by the query are about one conception. Add a line to fix the query.

select mtd_cycles.sname
     , mtd_cycles.ddate
     , extract(year from mtd_cycles.ddate) as year
     , hydroyear(mtd_cycles.ddate) as hydroyr
     , extract(month from mtd_cycles.ddate) as month
     , pregs.pid
     , pregs.resume
     , members.grp
  from mtd_cycles
  join pregs
    on pregs.conceive = mtd_cycles.dcpid
  join members
    on members.sname = mtd_cycles.sname
where mtd_cycles.ddate > '1999-10-31'
order by ddate
limit 100;

7.

select * from biograph, members
where matgrp = grp and
biograph.sname = members.sname and
sex = 'M' and
status = 0 and
dcause = 0 and
statdate = date
order by matgrp, pid;

select * from biograph, members
where biograph.matgrp = members.grp and
biograph.sname = members.sname and
biograph.sex = 'M' and
biograph.status = 0 and
biograph.dcause = 0 and
biograph.statdate = members.date
order by biograph.matgrp, biograph.pid;

select *
  from biograph
       join members on (biograph.matgrp = members.grp and
                        biograph.sname = members.sname)
   where biograph.sex = 'M' and
         biograph.status = 0 and
         biograph.dcause = 0 and
         biograph.statdate = members.date
   order by biograph.matgrp, biograph.pid;

-- This is an illustrative query used in the Babase Quiz
-- This is how you can write notes about the query to yourself within the query.
SELECT *
  FROM biograph
       JOIN members ON (biograph.matgrp = members.grp AND
                        biograph.sname = members.sname)
  WHERE biograph.sex = 'M' AND   --  M means male (tricky 'eh?)
        biograph.status = 0 AND
        biograph.dcause = 0 AND
        biograph.statdate = members.date
  ORDER BY biograph.matgrp, biograph.pid;

8. Here’s another option, which may be especially helpful for longer queries – you can "alias" the tables. As an example, let’s rewrite the query from question 6 using aliases.

Instead of:

from mtd_cycles, pregs, members

You could say:

from mtd_cycles as mc, pregs as p, members as m

Or simply:

from mtd_cycles mc, pregs p, members m

With these, you could rewrite the query from question 6 as follows: (this query has not been fixed by the answer to #6, so don't run this query unless you've fixed it first!)

select mc.sname
     , mc.ddate
     , extract(year from mc.ddate) as year
     , hydroyear(mc.ddate) as hydroyr
     , extract(month from mc.ddate) as month
     , p.pid
     , p.resume
     , m.grp
  from mtd_cycles mc
  join pregs p
    on p.conceive = mc.dcpid
  join members m
    on m.sname = mc.sname
  where mc.ddate > '1999-10-31'
  order by mc.ddate limit 100;

When you give the tables aliases, you can name them whatever you want; you don't HAVE to pick the first letter of the table like in the example above.

Now, write a query using aliases that will provide the minimum, maximum, mean, and standard error (std dev/square root of n) for age (in years) at maturity of males and females, with separate rows for males and females but using only one query. Your answer will need only the BIOGRAPH and MATUREDATES tables, and will need to use “GROUP BY”. Be sure to name the output columns by placing “as column_name” at the end of each row that’s doing math for you.

Next, add some constraints to your query (in the "WHERE" clause) so that it only includes baboons [1] born in wild-feeding study groups (group is < 3) [2] whose birth date is both known to within a few weeks and [3] whose maturedate is known (Hint: pay attention to the matgrp, bstatus, and mstatus columns).

How do the results compare after you constrained the results by matgrp, bstatus, and mstatus?

9. See the Babase FAQs to learn how to identify a wild-feeding study group. Note that there are separate sections for identifying wild-feeding and study groups. In the previous question, you needed only to say that a specific "grp" column needed to be < 3, but in this question we need to take a more thorough approach.

Write a query that will create a list of all adult males who have ever lived (i.e. were ever present) in a wild-feeding study group. Include the male’s sname, birth date, the date he attained adult rank, his statdate, his biograph "status" (alive, dead, or censored), and a count of all offspring the male is known to have sired. Be sure to include adult males who have no known offspring. (Hint: You’ll need biograph, rankdates, and parents.) To get all the males who have ever lived in a wild-feeding study group, you'll probably need to employ the information you learned from the Babase FAQs in a subquery.

10. Throughout the 50+ years of baboon project history, there have been periods of time when data collection was interrupted or was generally sparser than usual. Also, there are periods of time where study groups are in the process of a fission into two or more new groups, or fusing multiple groups into a new one. Data from these fission/fusion periods can be weird in numerous ways. For most analyses, it's important that you be aware of these periods (both the data interruptions and the fissions/fusions) and handle your data appropriately. Exactly what is "appropriate" is a topic that you will need to discuss with your advisor/principal investigator/etc.

In this question, we'll work with some of the tools available to you to help you identify these periods:

Before you proceed with this question, go look at each of these tables/views and make sure you understand what information they contain.

Now let's use these tools to answer some questions. You’re interested in creating grooming networks for groups six months before, during, and six months after each fission. Before you do that, you need to know which fissions have grooming data consistently throughout the periods you are looking at.

11. This question concerns the content in the Temporal Tables appendix in the Babase tech specs. You aren't expected to have a thorough understanding of what is discussed there, but you should at least be generally aware of what a temporal table allows you to do.

12. Did you join the Babase mailing list yet?


QuizAnswers


There are additional quiz and quiz-like questions available that are more illustrative of real-world queries.

Questions 1-7 were created by Karl O. Pinc, Susan C. Alberts and Niki H. Learn. Questions 8-10 were written by Niki H. Learn, with some revisions by Jake Gordon. Questions 11-12 are all Jake's fault.

BabaseQuiz (last edited 2023-09-11 20:12:28 by JakeGordon)

Wiki content based upon work supported by the National Science Foundation under Grant Nos. 0323553 and 0323596. Any opinions, findings, conclusions or recommendations expressed in this material are those of the wiki contributor(s) and do not necessarily reflect the views of the National Science Foundation.