Differences between revisions 2 and 16 (spanning 14 versions)
Revision 2 as of 2007-02-02 17:51:54
Size: 2494
Editor: KarlPinc
Comment: Continue with initial documentation
Revision 16 as of 2008-02-27 23:07:08
Size: 7762
Editor: KarlPinc
Comment: Note use as actual documentation of this particular version of the format
Deletions are marked like this. Additions are marked like this.
Line 1: Line 1:
= The Psion Data Format = = The Psion Data Format (Version 703) =

This page documents version
703 (the third version?) of the Psion's text output. This page is the definitive
reference for version 703 of the Psion data format and is linked to by the
[http://papio.biology.duke.edu/babase_system_html/ Babase: Technical Specifications for the Amboseli Baboon Project Data Management System]
document. This page also
notes what data is used and what is ignored.
Line 7: Line 14:
and that document should also be consulted. This documents version
703 (the third version?) of the Psion's text output. This document also
notes what data is used and what is ignored.
and that document should also be consulted.
Line 18: Line 23:
Line 25: Line 29:
The order of the lines in the file is significant. The occurrence of a header
line indicates the start of a new data collection session. Within each data
collection session the point and ad-lib lines appear in data acquisition
sequence; for point lines, this sequence determines the POINT_DATA.Min value.
Line 39: Line 48:
 1 The string: {{{<HDR>}}}

 2 Sample collection information of the form: YYMMDDHHMMIII

   YY Year the sampling began. The year portion of SAMPLES.Date.

   MM Month the sampling began. The month portion of SAMPLES.Date.

   DD Day of the month the sampling began. The day portion of SAMPLES.Date.

   
   HH Hour the sampling began (24 hour time). The hour portion of SAMPLES.Stime.

   MM Minutes within the hour the sampling began. The minutes portion of SAMPLES.Stime.

   III The initials of the person who took the sample. Ignored.

 3 Psion program version id. A PROGRAMIDS.Programid value.

 4 Psion setup id. A SETUPIDS.Setupid value.

 5 Psion id. A PALMTOPS.Palmtop value.

 6 Date the sample was taken. Format is "DD MMM YYYY", MMM is month abbreviation.

 7 Group being observed. A GROUPS.Lettercode value.

 8 Observer taking the sample. An OBSERVERS.Initials value.

 9 Focal individual. A BIOGRAPH.Sname value.

 10 The type of sample being collected, juvenile or female. Corresponds to SAMPLES.Stype. A value of {{{FEM}}} maps to a {{{F}}} SAMPLES.Stype value. A value of {{{JUV}}} maps to a {{{J}}} SAMPLES.Stype value.
 1 : The string: {{{<HDR>}}}

 2 : A unique sample identifier; collection information of the form: YYMMDDHHMMIII

   YY : Year the sampling began. A 2 digit number. Ignored.

   MM : Month the sampling began. A 2 digit number. The month portion of SAMPLES.Date.

   DD : Day of the month the sampling began. A 2 digit number. The day portion of SAMPLES.Date.

   HH : Hour the sampling began (24 hour time). A 2 digit number. The hour portion of SAMPLES.Stime.

   MM : Minutes within the hour the sampling began. A 2 digit number. The minutes portion of SAMPLES.Stime.

   III : The Sname of the focal individual. Ignored.

 3 : Psion program version id. A PROGRAMIDS.Programid value.  Always {{{PTSAMPLR_JUL03}}} when the data layout matches the description on this page.

 4 : Psion setup id. A SETUPIDS.Setupid value.  Always {{{SETUP_JUL03}}} when the data layout matches the description on this page.

 5 : Psion id. A PALMTOPS.Palmtop value.

 6 : Date the sample was taken. Format is "DD MMM YYYY". DD is a 2 digit day of the month. MMM is the 3 character month abbreviation.   YYYY is a 4 digit year. The year portion is used as the year portion of SAMPLES.Date, the remainder of the data is ignored.

 7 : Group being observed. A GROUPS.Lettercode value.

 8 : Observer taking the sample. An OBSERVERS.Initials value.

 9 : Focal individual. A BIOGRAPH.Sname value.

 10 : The type of sample being collected, juvenile or female. Corresponds to SAMPLES.Stype. A value of {{{FEM}}} maps to a {{{F}}} SAMPLES.Stype value. A value of {{{JUV}}} maps to a {{{J}}} SAMPLES.Stype value.
Line 73: Line 82:
All point lines, whether recording female or juvenile samples, share some fields.

=== Special neighbor codes used in PNT lines ===

The value of {{{XXX}}} as a Sname for a neighbor code means that there is no neighbor of that sort.

Neighbor codes in the Psion may also contain values found in UNKSNAMES.Unksname column, in which case the data is stored in the NEIGHBORS.Unksname column rather than the NEIGHBORS.Sname column.

=== Sample (Juvenile) Point line ===

{{{
<PNT>,0602011512EAS,15:19:44,R2,YAN,VEB,XXX,}}}

=== Common Field layout ===

The fields common to all point lines are:

 1 : The string: {{{<PNT>}}}

 2 : A unique sample identifier of the same form as the header line. Ignored except for error checking by the psionload program.

 3 : The time the point was recorded (not the time the point was observed). In a HH:MM:SS format. A POINT_DATA.Ptime value.

   HH : The hour the point was recorded (24 hour time). A 2 digit number.

   MM : The minutes within the hour the point was recorded. A 2 digit number.

   SS : The seconds within the minute the point was recorded. A 2 digit number.

 4 : The first two characters of this field are common to both juvenile and female point samples. They record either activity and posture, as two characters of the form AP, or out of sight, as the characters {{{OS}}}. (In the latter case there is no POINT_DATA row but the SAMPLES.Mins value is incremented for the line.)

    A : Activity code. A POINTS.Activity value.

    P : Posture code. A POINTS.Posture value.

 5 : First neighbor. A NEIGHBORS.Sname value with a related Ncode value of {{{1}}}.

 8 : Foodcode or no data. A POINT_DATA.Foodcode value.

==== Juvenile Field layout ====

The fields unique to the Juvenile samples are:

 4 : This field contains no data that is unique to juvenile samples.

 6 : 2nd nearest neighbor. A NEIGHBORS.Sname value with a related Ncode of {{{2}}}.

 7 : 3rd nearest neighbor. A NEIGHBORS.Sname value with a related Ncode of {{{3}}}.

=== Female Point lines ===

Female point lines contain information not present in Juvenile point lines.

==== Sample Female Point line ====

{{{
<PNT>,0602011523DUN,15:24:41,R2OS,EVA,EVA,XXX,}}}

==== Female Field layout ====

The fields unique to female samples are:

 4 : This field shares its first two characters with Juvenile point lines, including the possibility of recording "no data" by using {{{OS}}}. The field adds an additional 2 characters resulting in a 4 character format that looks like: APCS

   C : Kid contact information. A FPOINTS.Kidcontact value.

   S : Kid suckling information. A FPOINTS.Kidsuckle value.

 6 : Nearest adult neighbor. A NEIGHBORS.Sname value with a related Ncode of {{{A}}}.

 7 : Other adult neighbor. A NEIGHBORS.Sname value with a related Ncode of {{{O}}}.
Line 74: Line 155:

The Ad-lib lines are of 6 different types, as determined by the content of field 4. Groomings, agonisms, requests to groom, and approaches are recorded in the INTERACT_DATA and PARTS tables. The other types, consortships and unspecified, are recorded in the ALLMISCS table.

=== Sample ===

{{{
<ADL>,0602011512EAS,15:20:36,G,ELD,G,EAS}}}

=== Common Field layout ===

 1 : The string: {{{<ADL>}}}

 2 : A unique sample identifier of the same form as the header line. Ignored except for error checking in the psionload program.

 3 : The time the observation was recorded. The same format as the point line field 3 documented above.

 4 : The observation type.

   G : Grooming.

   A : Agonism.

   R : Request to groom.

   P : Approach.

   C : Consortship.

   U : Unspecified.

   O : Other Social.

=== Type G, A, R, or P Field layout ===

 3 : The time. The INTERACT_DATA.Start and Stop value, both.

 5 : The actor. A PARTS.Sname value which has a related Role of {{{R}}}.

 6 : The act. A PARTS.Act value.

 7 : The actee. A PARTS.Sname value which has a related Role of {{{E}}}.

=== Non-G, A, R, and P Field layout ===

 3 : The time. An ALLMISCS.Atime value.

 fields 4 and up : The entire end of the adlib line is made the ALLMISCS.Txt value.

The Psion Data Format (Version 703)

This page documents version 703 (the third version?) of the Psion's text output. This page is the definitive reference for version 703 of the Psion data format and is linked to by the [http://papio.biology.duke.edu/babase_system_html/ Babase: Technical Specifications for the Amboseli Baboon Project Data Management System] document. This page also notes what data is used and what is ignored.

The Psion hand-held data collection devices output the data as text.

The format of the collected data and the output format have changed over time. The data format is closely tied to the field data collection protocols and that document should also be consulted.

Overall Structure

Psion files consist of ASCII text. The end-of-line sequence is carriage-return, line-feed, corresponding to the MS-DOS text file conventions. Contrary to the MS-DOS convention, the files do not seem to use the Ctrl-Z character as an end-of-file marker. Commas are used within the lines to delimit data.

Each line is one of 3 formats:

  • A header line. These begin with <HDR>.

  • A point line. These begin with <PNT>.

  • An ad-lib line. These begin with <ADL>.

The order of the lines in the file is significant. The occurrence of a header line indicates the start of a new data collection session. Within each data collection session the point and ad-lib lines appear in data acquisition sequence; for point lines, this sequence determines the POINT_DATA.Min value.
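The overall structure described above can be sketched in Python. This is only an illustration, not the psionload program: the function name `split_sessions` and the dictionary keys are hypothetical, and the sketch assumes the whole file has already been read into a string.

```python
def split_sessions(text):
    """Group Psion lines into sessions, each one started by a <HDR> line."""
    sessions = []
    # splitlines() treats the carriage-return, line-feed pair as one line break.
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # The first comma-delimited field identifies the kind of line.
        tag = line.split(",", 1)[0]
        if tag == "<HDR>":
            sessions.append({"header": line, "points": [], "adlibs": []})
        elif tag in ("<PNT>", "<ADL>"):
            if not sessions:
                raise ValueError("data line found before any header line")
            key = "points" if tag == "<PNT>" else "adlibs"
            # Appending preserves the data acquisition sequence.
            sessions[-1][key].append(line)
        else:
            raise ValueError(f"unrecognized line: {line!r}")
    return sessions
```

Each session keeps its point and ad-lib lines in file order, so the position of a point line within its session remains available to later processing.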

Header lines

Header lines indicate the start of a set of data points. The file should begin with a header line.

Sample

<HDR>,0602011300VEH,PTSAMPLR_JUL03,SETUP_JUL03,2,01 Feb 2006,VIO,SNS,VEH,JUV

Field layout

The fields of the header lines are:

  • 1 : The string: <HDR>

    2 : A unique sample identifier; collection information of the form: YYMMDDHHMMIII

      • YY : Year the sampling began. A 2 digit number. Ignored.

        MM : Month the sampling began. A 2 digit number. The month portion of SAMPLES.Date.

        DD : Day of the month the sampling began. A 2 digit number. The day portion of SAMPLES.Date.

        HH : Hour the sampling began (24 hour time). A 2 digit number. The hour portion of SAMPLES.Stime.

        MM : Minutes within the hour the sampling began. A 2 digit number. The minutes portion of SAMPLES.Stime.

        III : The Sname of the focal individual. Ignored.

    3 : Psion program version id. A PROGRAMIDS.Programid value. Always PTSAMPLR_JUL03 when the data layout matches the description on this page.

    4 : Psion setup id. A SETUPIDS.Setupid value. Always SETUP_JUL03 when the data layout matches the description on this page.

    5 : Psion id. A PALMTOPS.Palmtop value.

    6 : Date the sample was taken. Format is "DD MMM YYYY". DD is a 2 digit day of the month. MMM is the 3 character month abbreviation. YYYY is a 4 digit year. The year portion is used as the year portion of SAMPLES.Date; the remainder of the data is ignored.

    7 : Group being observed. A GROUPS.Lettercode value.

    8 : Observer taking the sample. An OBSERVERS.Initials value.

    9 : Focal individual. A BIOGRAPH.Sname value.

    10 : The type of sample being collected, juvenile or female. Corresponds to SAMPLES.Stype. A value of FEM maps to a F SAMPLES.Stype value. A value of JUV maps to a J SAMPLES.Stype value.
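As an illustration of the header layout, the following Python sketch splits a header line into its fields and assembles the date and start time as described above. The function name `parse_header` and the dictionary keys are hypothetical and are not part of the psionload program.

```python
from datetime import date, time


def parse_header(line):
    """Parse one version 703 <HDR> line into a dictionary (illustrative only)."""
    fields = line.strip().split(",")
    if len(fields) != 10 or fields[0] != "<HDR>":
        raise ValueError("not a version 703 header line")
    sample_id = fields[1]
    # Field 2 is YYMMDDHHMMIII; the YY and III portions are ignored.
    month, day = int(sample_id[2:4]), int(sample_id[4:6])
    hour, minute = int(sample_id[6:8]), int(sample_id[8:10])
    # Field 6 is "DD MMM YYYY"; only the 4 digit year is used.
    year = int(fields[5].split()[2])
    return {
        "sample_id": sample_id,
        "programid": fields[2],
        "setupid": fields[3],
        "palmtop": fields[4],
        "date": date(year, month, day),
        "stime": time(hour, minute),
        "group": fields[6],
        "observer": fields[7],
        "sname": fields[8],
        # FEM maps to F and JUV maps to J; any other value raises KeyError.
        "stype": {"FEM": "F", "JUV": "J"}[fields[9]],
    }
```

Applied to the sample header line above, this sketch yields a date of 2006-02-01, a start time of 13:00, and a sample type of J.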

Point lines

All point lines, whether recording female or juvenile samples, share some fields.

Special neighbor codes used in PNT lines

The value of XXX as a Sname for a neighbor code means that there is no neighbor of that sort.

Neighbor codes in the Psion may also contain values found in UNKSNAMES.Unksname column, in which case the data is stored in the NEIGHBORS.Unksname column rather than the NEIGHBORS.Sname column.
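The handling of the special neighbor codes can be sketched as a small Python function. The function name `classify_neighbor` is hypothetical, and the `unksnames` argument is assumed to be a set of the values found in the UNKSNAMES.Unksname column, loaded by the caller.

```python
def classify_neighbor(code, unksnames):
    """Return the NEIGHBORS column a code belongs in, or None for no neighbor."""
    if code == "XXX":
        # XXX means there is no neighbor of that sort.
        return None
    if code in unksnames:
        return ("Unksname", code)
    return ("Sname", code)
```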

Sample (Juvenile) Point line

<PNT>,0602011512EAS,15:19:44,R2,YAN,VEB,XXX,

Common Field layout

The fields common to all point lines are:

  • 1 : The string: <PNT>

    2 : A unique sample identifier of the same form as the header line. Ignored except for error checking by the psionload program.

    3 : The time the point was recorded (not the time the point was observed). In a HH:MM:SS format. A POINT_DATA.Ptime value.

      • HH : The hour the point was recorded (24 hour time). A 2 digit number.

        MM : The minutes within the hour the point was recorded. A 2 digit number.

        SS : The seconds within the minute the point was recorded. A 2 digit number.

    4 : The first two characters of this field are common to both juvenile and female point samples. They record either activity and posture, as two characters of the form AP, or out of sight, as the characters OS. (In the latter case there is no POINT_DATA row but the SAMPLES.Mins value is incremented for the line.)

      • A : Activity code. A POINTS.Activity value.

        P : Posture code. A POINTS.Posture value.

    5 : First neighbor. A NEIGHBORS.Sname value with a related Ncode value of 1.

    8 : Foodcode or no data. A POINT_DATA.Foodcode value.

Juvenile Field layout

The fields unique to the Juvenile samples are:

  • 4 : This field contains no data that is unique to juvenile samples.

    6 : 2nd nearest neighbor. A NEIGHBORS.Sname value with a related Ncode of 2.

    7 : 3rd nearest neighbor. A NEIGHBORS.Sname value with a related Ncode of 3.

Female Point lines

Female point lines contain information not present in Juvenile point lines.

Sample Female Point line

<PNT>,0602011523DUN,15:24:41,R2OS,EVA,EVA,XXX,

Female Field layout

The fields unique to female samples are:

  • 4 : This field shares its first two characters with Juvenile point lines, including the possibility of recording "no data" by using OS. The field adds an additional 2 characters resulting in a 4 character format that looks like: APCS

      • C : Kid contact information. A FPOINTS.Kidcontact value.

        S : Kid suckling information. A FPOINTS.Kidsuckle value.

    6 : Nearest adult neighbor. A NEIGHBORS.Sname value with a related Ncode of A.

    7 : Other adult neighbor. A NEIGHBORS.Sname value with a related Ncode of O.
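The juvenile and female point layouts can be sketched together in Python. The function name `parse_point` and the dictionary keys are hypothetical. The sketch assumes the sample type (J or F) is taken from the session's header line, and it does not attempt to read kid contact or suckling data from an out of sight line, since this page does not say what field 4 holds beyond OS in that case.

```python
def parse_point(line, stype):
    """Parse one <PNT> line; stype is 'J' or 'F' from the session header."""
    fields = line.strip().split(",")
    if len(fields) != 8 or fields[0] != "<PNT>":
        raise ValueError("not a version 703 point line")
    code = fields[3]
    point = {
        "sample_id": fields[1],
        "ptime": fields[2],
        "out_of_sight": code[:2] == "OS",
        # An empty field 8 means no foodcode was recorded.
        "foodcode": fields[7] or None,
    }
    if not point["out_of_sight"]:
        point["activity"], point["posture"] = code[0], code[1]
        if stype == "F":
            # Female lines add kid contact and kid suckling characters.
            point["kidcontact"], point["kidsuckle"] = code[2], code[3]
    # Fields 5-7 are neighbors; the related Ncode depends on the sample type.
    ncodes = ("1", "A", "O") if stype == "F" else ("1", "2", "3")
    point["neighbors"] = {
        ncode: sname
        for ncode, sname in zip(ncodes, fields[4:7])
        if sname != "XXX"  # XXX means no neighbor of that sort
    }
    return point
```

Keying the neighbors by Ncode keeps the juvenile (1, 2, 3) and female (1, A, O) layouts in one structure; a real loader would also need to check each code against UNKSNAMES.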

Ad-lib lines

The Ad-lib lines are of 6 different types, as determined by the content of field 4. Groomings, agonisms, requests to groom, and approaches are recorded in the INTERACT_DATA and PARTS tables. The other types, consortships and unspecified, are recorded in the ALLMISCS table.

Sample

<ADL>,0602011512EAS,15:20:36,G,ELD,G,EAS

Common Field layout

  • 1 : The string: <ADL>

    2 : A unique sample identifier of the same form as the header line. Ignored except for error checking in the psionload program.

    3 : The time the observation was recorded. The same format as the point line field 3 documented above.

    4 : The observation type.

      • G : Grooming.

        A : Agonism.

        R : Request to groom.

        P : Approach.

        C : Consortship.

        U : Unspecified.

        O : Other Social.

Type G, A, R, or P Field layout

  • 3 : The time. The INTERACT_DATA.Start and Stop value, both.

    5 : The actor. A PARTS.Sname value which has a related Role of R.

    6 : The act. A PARTS.Act value.

    7 : The actee. A PARTS.Sname value which has a related Role of E.

Non-G, A, R, and P Field layout

  • 3 : The time. An ALLMISCS.Atime value.

    fields 4 and up : The entire end of the adlib line is made the ALLMISCS.Txt value.
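The two ad-lib layouts can be sketched in Python as follows. The function name `parse_adlib`, the constant `INTERACTION_TYPES`, and the dictionary keys are hypothetical. The sketch assumes that every observation type other than G, A, R, or P is treated as free text.

```python
INTERACTION_TYPES = {"G", "A", "R", "P"}


def parse_adlib(line):
    """Parse one <ADL> line and note which table group it belongs to."""
    line = line.strip()
    fields = line.split(",")
    if len(fields) < 4 or fields[0] != "<ADL>":
        raise ValueError("not a version 703 ad-lib line")
    when, kind = fields[2], fields[3]
    if kind in INTERACTION_TYPES:
        if len(fields) < 7:
            raise ValueError("interaction line is missing actor, act, or actee")
        return {
            "table": "INTERACT_DATA",
            # The recorded time serves as both the Start and the Stop value.
            "start": when,
            "stop": when,
            "actor": fields[4],  # PARTS.Sname with a Role of R
            "act": fields[5],    # PARTS.Act
            "actee": fields[6],  # PARTS.Sname with a Role of E
        }
    # Everything from field 4 to the end of the line becomes the text value.
    return {"table": "ALLMISCS", "atime": when, "txt": line.split(",", 3)[3]}
```

Applied to the sample ad-lib line above, this sketch reports ELD as the actor, G as the act, and EAS as the actee.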

PsionFormat (last edited 2012-02-07 19:17:18 by localhost)

Wiki content based upon work supported by the National Science Foundation under Grant Nos. 0323553 and 0323596. Any opinions, findings, conclusions or recommendations expressed in this material are those of the wiki contributor(s) and do not necessarily reflect the views of the National Science Foundation.