[Babase] Males with Matgrp 9, not in grp 9 at birth

Lacey Maryott Roerish lroerish4 at gmail.com
Sat Dec 26 22:12:26 EST 2009


On Sat, Dec 26, 2009 at 10:04 PM, Karl O. Pinc <kop at meme.com> wrote:

> On 12/26/2009 08:32:27 PM, Lacey Maryott Roerish wrote:
>
> > There are no longer any males in the DB with a matgrp of 9, but whose
> > members records place them in a study group since birth!
>
> I'm not sure exactly what you're cleaning up but there are a few
> left matching your conditions above.   And there also seem to be
> a number of others where somebody tried to clean up by putting
> census.status of A and C at the birthdate but census.status B
> rows remain and place the individuals into study groups (because
> the Bs were never removed?).  It's possible somebody did something
> for the conversion, maybe, to fix some sort of data integrity error
> and get the rows to go into the database.   In any case, there's
> something artificial going on because we don't have real A and C
> census rows going back that far.  (Do we?)
>

This is our temporary way of fixing this until real data goes in, but yes,
in most (all but about 2) cases, these were pulled from actual census sheets
that are just not yet digitized.

>
> Maybe what's left are the males that really were in the
> study groups when real censuses started?
>
> babase=> select distinct sname from census where status = 'B' order by sname;
>  sname
> -------
>  ALY
>  BAR
>  BJX
>  COW
>  DAR
>  DUT
>  HAR
>  IBI
>  IVA
>  KUS
>  LIP
>  MAX
>  MWA
>  NGU
>  PET
>  PIG
>  RAD
>  SEK
>  SIN
>  SLK
>  STB
>  TUL
>  WYM
> (23 rows)
>

This matches perfectly the group of males I just fixed. I wasn't getting
rid of all B rows. I was just fixing their records to indicate their exact
date of entry into a study group and, before that date, to place them in
grp 9 until their arrival. The rest of the rows will get fixed with the
'true' census fix in the spring.
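
For anyone who wants to re-check the condition in the subject line, something
along these lines might work. It is only a sketch: the table layouts are
simplified stand-ins, the column names (sname, sex, birth, matgrp, date, grp)
are assumptions about biograph and members, and the three individuals are
invented for illustration. It uses an in-memory SQLite database rather than
babase itself.

```python
import sqlite3

# Hypothetical, heavily simplified stand-ins for biograph and members.
# Column names are assumptions, not the real babase definitions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE biograph (sname TEXT PRIMARY KEY, sex TEXT, birth TEXT, matgrp REAL);
CREATE TABLE members  (sname TEXT, date TEXT, grp REAL);
""")

# Invented individuals, not real snames from the database.
conn.executemany("INSERT INTO biograph VALUES (?, ?, ?, ?)", [
    ("AAA", "M", "1970-01-01", 9.0),  # in a study group at birth: should be flagged
    ("BBB", "M", "1970-01-01", 9.0),  # in grp 9 at birth: fine
    ("CCC", "M", "1970-01-01", 4.0),  # matgrp is a study group: not relevant
])
conn.executemany("INSERT INTO members VALUES (?, ?, ?)", [
    ("AAA", "1970-01-01", 1.0),
    ("BBB", "1970-01-01", 9.0),
    ("BBB", "1971-08-02", 1.0),       # later entry into a study group is allowed
    ("CCC", "1970-01-01", 4.0),
])

# Males with matgrp 9 whose membership row on their birth date is not grp 9.
CHECK = """
SELECT b.sname
  FROM biograph AS b
  JOIN members  AS m ON m.sname = b.sname AND m.date = b.birth
 WHERE b.sex = 'M' AND b.matgrp = 9 AND m.grp <> 9
 ORDER BY b.sname
"""
offenders = [row[0] for row in conn.execute(CHECK)]
print(offenders)
```

An empty list would mean no male with matgrp 9 is placed in a study group on
his birth date, under the assumptions above.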

>
> babase=> select census.grp, census.sname, biograph.matgrp, count(*)
>   from census, biograph
>   where census.status = 'B' and biograph.sname = census.sname
>   group by census.grp, census.sname, biograph.matgrp
>   order by census.grp, census.sname;
>  grp  | sname | matgrp | count
> ------+-------+--------+-------
>  1.00 | BJX   |   9.00 |     4
>  1.00 | COW   |   9.00 |   210
>  1.00 | DUT   |   9.00 |  1486
>  1.00 | IVA   |   9.00 |   329
>  1.00 | MAX   |   9.00 |  2023
>  1.00 | PET   |   9.00 |  2639
>  1.00 | SIN   |   9.00 |   471
>  1.00 | STB   |   9.00 |  1564
>  2.00 | ALY   |   9.00 |   515
>  2.00 | BAR   |   9.00 |   664
>  2.00 | HAR   |   9.00 |   810
>  2.00 | LIP   |   9.00 |    98
>  2.00 | SEK   |   9.00 |   232
>  2.00 | SLK   |   9.00 |  1902
>  2.00 | WYM   |   9.00 |   222
>  3.00 | NGU   |   9.00 |  1853
>  3.00 | RAD   |   9.00 |  3078
>  3.00 | TUL   |   9.00 |  1494
>  4.00 | IBI   |   9.00 |   570
>  4.00 | KUS   |   4.00 |  3413
>  4.00 | MWA   |   9.00 |  4019
>  4.00 | PIG   |   9.00 |   121
>  6.00 | DAR   |   9.00 |     1
>  9.00 | DAR   |   9.00 |  3638
>  9.00 | IBI   |   9.00 |  3267
> (25 rows)
>
> A lot seem to match the pattern shown below....
>
> babase=> select * from census where census.sname = 'BJX' order by census.date;
>   cenid  |    date    | sname |  grp  | status | cen
> ---------+------------+-------+-------+--------+-----
>  2024695 | 1971-07-31 | BJX   |  1.00 | A      | t
>  2024694 | 1971-08-01 | BJX   |  1.00 | C      | t
>   147562 | 1971-08-02 | BJX   |  1.00 | B      | f
>   147635 | 1971-08-03 | BJX   |  1.00 | B      | f
>   147708 | 1971-08-04 | BJX   |  1.00 | B      | f
>   147781 | 1971-08-05 | BJX   |  1.00 | B      | f
>   147924 | 1971-08-06 | BJX   | 99.00 | S      | f
>   147997 | 1971-08-07 | BJX   | 99.00 | S      | f
>   148070 | 1971-08-08 | BJX   | 99.00 | S      | f
>   148143 | 1971-08-09 | BJX   | 99.00 | S      | f
>   148216 | 1971-08-10 | BJX   | 99.00 | S      | f
>   148289 | 1971-08-11 | BJX   | 99.00 | S      | f
> ...
>
> While you're cleaning up, aren't the 'S' rows with
> a group of 99.00 redundant and able to be eliminated?
>

Group 99 is our next project to get rid of.  It's on the list.  I am just
doing one thing at a time based on problems that show up (males in study
groups at birth who have mat grp 9, etc).

Like I said above, we hope to have a LOT of the true census entered next
semester.  We can't be sure yet how much will get done, but plan to devote
at least 1 undergrad to that project alone. At that time this will all be
cleaned up. This was just a temporary fix. It wasn't intended to affect
males' records after their date of entry, only before, and it wasn't meant
to be a full cleanup of B rows. But we're getting there :)

Cheers
Lacey

>
>
> Karl <kop at meme.com>
> Free Software:  "You don't pay back, you pay forward."
>                 -- Robert A. Heinlein
>
>
> _______________________________________________
> Babase mailing list
> Babase at www.eco.princeton.edu
> http://www.eco.princeton.edu/mailman/listinfo/babase
>



-- 
- -
Lacey K. Maryott Roerish
Alberts Lab
Department of Biology
Duke University
ph: 919-660-7306
fax: 919-660-7293
Lacey.Maryott at duke.edu