Knowing that every change to the data is recorded with a
timestamp, it is now possible to "go back in time" and query
tables in the database "as of" a specific time. Unfortunately,
PostgreSQL does not include the syntax AS
OF
. However, the range operator @>
effectively means the same thing.
Example I.1. Querying "as of" a date
SELECT *
FROM sometable
WHERE sys_period @> '2022-02-22 12:34:56'::timestamptz;
Note that in this example, only the fictional table SOMETABLE is
being selected-from, so the query will only return rows that 1)
were in the table at 2022-02-22 12:34:56, and 2) are still in
the table now. Rows that were in the table on 2022-02-22
12:34:56 but which have since been removed or updated are in
SOMETABLE_HISTORY in the babase_history schema, which is not
selected-from here.
When querying for a table's data "as of" a specific date
in the past, the best practice is to query both the table in
babase and its history
table in babase_history,
simultaneously. The UNION
or UNION
ALL
operators are ideal for this.
Example I.2. Querying a table's history "as of" a date
WITH sometable_all AS (SELECT * FROM babase.sometable
UNION
SELECT * FROM babase_history.sometable_history)
SELECT *
FROM sometable_all
WHERE sys_period @> '2022-02-22 12:34:56'::timestamptz;
Querying data from a view is more complicated. Views usually represent data from two or more tables that have been joined together somehow, so recreating data from a view will require that the Sys_Period of each component table is accounted-for. The "best" way to do this depends somewhat on the view itself, so a generalized example about "someview" is not provided here.
See below for specific examples using real tables and a real view.
Returning to the story of TIM and JIM in the previous section, suppose a researcher named Dawn published a paper in 2015 that relied on data from a query that she executed at 2015-05-05 05:05:05. In 2015, it was not yet known that TIM and JIM were two names for the same individual, so Dawn's data includes both names and presumes that they are distinct individuals.
Several years later a new researcher, Pam, wants to revisit Dawn's analysis. As a first step, she needs to recreate Dawn's dataset. Pam cannot do this with the data in the babase schema alone; the TIM/JIM misidentification has already been identified and addressed in her time, so only one of those IDs is still present in Pam's "current" data. She needs the data exactly as it was when Dawn collected it.
Dawn's analysis used data from the BIOGRAPH and MATUREDATES tables, and interactions recorded in the ACTOR_ACTEES view. When collecting data to recreate the analysis, the code Pam would use to collect data from the tables is relatively simple:
-- For BIOGRAPH
WITH biograph_all AS (SELECT * FROM babase.biograph
UNION
SELECT * FROM babase_history.biograph_history)
SELECT *
FROM biograph_all
WHERE sys_period @> '2015-05-05 05:05:05'::timestamptz
AND [WHATEVER OTHER CONSTRAINTS DAWN USED];
-- For MATUREDATES
WITH maturedates_all AS (SELECT * FROM babase.maturedates
UNION
SELECT * FROM babase_history.maturedates_history)
SELECT *
FROM maturedates_all
WHERE sys_period @> '2015-05-05 05:05:05'::timestamptz
AND [WHATEVER OTHER CONSTRAINTS DAWN USED];
Knowing that the ACTOR_ACTEES view is a join between INTERACT_DATA, two instances of PARTS, and two subqueries of MEMBERS, Pam accounted for the Sys_Period of each of those tables and recreated Dawn's May 2015 data using:
WITH dawns_time AS -- Declare the date/time once here so it doesn't
-- need to be retyped for every table
(SELECT '2015-05-05 05:05:05'::timestamptz AS this_time)
, interact_data_all AS (SELECT * FROM babase.interact_data
UNION
SELECT * FROM babase_history.interact_data_history)
, dawns_interact_data AS (SELECT *
FROM interact_data_all
WHERE sys_period @> (SELECT this_time FROM dawns_time))
, parts_all AS (SELECT * FROM babase.parts
UNION
SELECT * FROM babase_history.parts_history)
, dawns_parts AS (SELECT *
FROM parts_all
WHERE sys_period @> (SELECT this_time FROM dawns_time))
, members_all AS (SELECT * FROM babase.members
UNION
SELECT * FROM babase_history.members_history)
, dawns_members AS (SELECT *
FROM members_all
WHERE sys_period @> (SELECT this_time FROM dawns_time))
, dawns_actor_actees AS (SELECT dawns_interact_data.iid AS iid
, dawns_interact_data.sid AS sid
, dawns_interact_data.act AS act
, dawns_interact_data.date AS date
, dawns_interact_data.start AS start
, dawns_interact_data.stop AS stop
, dawns_interact_data.observer AS observer
, actor.partid AS actorid
, COALESCE(actor.sname, '998'::CHAR(3)) AS actor
, (SELECT actorms.grp
FROM dawns_members AS actorms
WHERE actorms.sname = actor.sname
AND actorms.date = interact_data.date) AS actor_grp
, actee.partid AS acteeid
, COALESCE(actee.sname, '998'::CHAR(3)) AS actee
, (SELECT acteems.grp
FROM dawns_members AS acteems
WHERE acteems.sname = actee.sname
AND acteems.date = interact_data.date) AS actee_grp
, dawns_interact_data.handwritten AS handwritten
, dawns_interact_data.exact_date AS exact_date
FROM dawns_interact_data
LEFT OUTER JOIN dawns_parts AS actor
ON (actor.iid = dawns_interact_data.iid AND actor.role = 'R')
LEFT OUTER JOIN dawns_parts AS actee
ON (actee.iid = dawns_interact_data.iid AND actee.role = 'E')
SELECT *
FROM dawns_actor_actees
WHERE [WHATEVER CONSTRAINTS DAWN USED];