[Babase] Interpolation, is it self-censoring? Please review
Karl O. Pinc
babase@www.eco.princeton.edu
Fri, 22 Jul 2005 02:01:15 +0000
Hi,
I'd like a response from Jeanne or Susan before going ahead
with the changes to interpolation we discussed over lunch
today.
In FoxPro, interpolation works (in part) as follows:
3. The 14 day Interpolation Limit
Given no other information, an individual is considered to
remain (or have been) in the group where observed for 14 days
following (or preceding) the date of observation.
Outside of this 14 day range the individual is placed in the
unknown group.
Clarifications:
a. The 14 day Interpolation Limit will not place a row in
MEMBERS before an individual's Birth date.
b. When an individual is dead, The 14 day Interpolation Limit
will not place a row after the individual's Statdate.
c. When an individual is alive, The 14 day Interpolation Limit
will place a row after the individual's Statdate, but only
when there is a subsequent absence. Again, as in The
Halfway to Absence Interval, the absence must be one
recorded for the same group as the previous locating
censuses.^[14]
^[14] Note that, as the individual is alive, any censuses that
post-date the individual's Statdate must record an absence,
else the Statdate would be adjusted to reflect the date of
last census.
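For anyone who wants to poke at the forward half of the rule quoted above, here is a minimal Python sketch of it. It is not the FoxPro code. The function name, its parameters, and the reading of "14 days following" as the 14 days after the observation date are all assumptions made for illustration, and it ignores the backward direction, clarification a, and any interaction with The Halfway to Absence Interval.

```python
from datetime import date, timedelta

LIMIT_DAYS = 14  # the 14 day Interpolation Limit


def forward_interpolated_days(observed, statdate, is_dead,
                              later_absence_same_group):
    """Dates after `observed` on which the individual is still placed
    in the group (hypothetical helper, forward direction only)."""
    days = []
    for offset in range(1, LIMIT_DAYS + 1):
        day = observed + timedelta(days=offset)
        if day > statdate:
            if is_dead:
                break  # clarification b: nothing after a dead individual's Statdate
            if not later_absence_same_group:
                break  # clarification c: alive, but no subsequent absence recorded
        days.append(day)
    return days


# A living individual last seen on the Statdate, with a later absence
# recorded for the same group, gets the full 14 days.
last_seen = date(2005, 7, 1)
rows = forward_interpolated_days(last_seen, last_seen, False, True)
```

With the same dates, passing is_dead=True, or later_absence_same_group=False, yields no days after the Statdate.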
Over lunch we decided that interpolation would never interpolate
past an individual's Statdate. However, this
change seems to introduce self-censoring.
(I knew I had a problem with this when Cathrine
and I discussed it earlier, but I didn't recall
it over lunch.)
First, a review:
Rule c keeps interpolation from putting a trailing
14 days of in-the-group onto the end of each
individual at the end of each month, until the
next census data sheet is entered. Clearly, we don't
want such artifacts of the data entry process.
This too is a certain amount of self-censoring,
but we have unavoidable problems at "the end
of time" -- when data entry ceases -- and this is
as good a way as any to work around the problem.
But a subsequent absence means we're not at
"the end of time" for the individual. There
has been further data entry.
To eliminate interpolation after the Statdate
would be self-censoring because the census
when the individual was last observed is treated differently
from all the other censuses. If the individual is dead,
then we don't want to interpolate their last census,
but if they just wander off I don't see why we would
not interpolate to the subsequent absence just as
we do with any other census. The only difference
is that they happened to return to the group in the
previous censuses. (To speak for Cathrine,
she thinks the last census _is_ different
because the individual _could_ be dead. But I
think that if we thought they were dead
we'd mark them dead and do whatever we like
with the Statdate to indicate date of death.
The ones we don't mark dead should be considered
alive (past their Statdate) just like everybody
else who's not dead.)
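To make the asymmetry concrete, here is a small Python sketch comparing the two policies for a living individual who has a subsequent absence recorded. The function and variable names and the dates are made up for illustration, and the Statdate is assumed to equal the date of the last census, as the footnote describes.

```python
from datetime import date, timedelta


def days_in_group_after(census, statdate, cap_at_statdate, limit=14):
    """Days after `census` on which the individual is interpolated into
    the group (hypothetical helper; assumes a living individual with a
    subsequent absence recorded for the same group)."""
    days = [census + timedelta(days=n) for n in range(1, limit + 1)]
    if cap_at_statdate:
        # The change discussed over lunch: never go past the Statdate.
        days = [d for d in days if d <= statdate]
    return days


earlier_census = date(2005, 6, 1)  # made-up dates
last_census = date(2005, 7, 1)
statdate = last_census             # assumed: Statdate is the date of last census

# Existing behavior: every census is treated alike.
existing = [len(days_in_group_after(c, statdate, False))
            for c in (earlier_census, last_census)]

# Proposed behavior: the last census alone loses its trailing days.
proposed = [len(days_in_group_after(c, statdate, True))
            for c in (earlier_census, last_census)]
```

Under these assumptions `existing` is [14, 14] and `proposed` is [14, 0]: only the last census is cut short, which is the different treatment I'm objecting to.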
This all rang a bell with me because IIRC we
introduced the "interpolation after Statdate"
the last time we took a serious look at interpolation
and made changes (and didn't document them).
Please decide whether we should make this change
to the interpolation procedure.
Thanks.
Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein