The Psion Data Format (Version 703)
This page is the definitive reference for version 703 (the third version?) of the Psion data format, the text output of the Psion devices. It is linked to by the Babase: Technical Specifications for the Amboseli Baboon Project Data Management System document. This page also notes which data are used and which are ignored.
The Psion hand-held data collection devices output the data as text.
Both the format of the collected data and the output format have changed over time. The data format is closely tied to the field data collection protocols, and the document describing those protocols should also be consulted.
Overall Structure
Psion files consist of ASCII text. The end-of-line sequence is carriage-return, line-feed, following the MS-DOS text file convention. Contrary to the MS-DOS convention, the files do not seem to use the Ctrl-Z character as an end-of-file marker. Commas are used within the lines to delimit data.
Each line is one of 3 formats:
A header line. These begin with <HDR>.
A point line. These begin with <PNT>.
An ad-lib line. These begin with <ADL>.
The order of the lines in the file is significant. A header line indicates the start of a new data collection session. Within each data collection session the point and ad-lib lines appear in data acquisition sequence; for point lines, this sequence determines the POINT_DATA.Min value.
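The overall structure described above can be sketched in Python as follows. This is a minimal illustration, not the psionload program: the function name split_sessions and the dictionary layout it returns are assumptions made for this example, as is the choice to skip blank lines. The sample text is stitched together from the sample lines on this page, so its sample identifiers do not agree the way they would in a real file.

```python
def split_sessions(text):
    """Group the lines of a Psion file into data collection sessions.

    Hypothetical helper: returns one dict per header line, holding the
    header's fields and the point and ad-lib lines that follow it,
    kept in file order because the order is significant.
    """
    sessions = []
    # Lines end with carriage-return, line-feed; splitlines() handles that.
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line:
            continue  # Assumption: blank lines carry no data.
        fields = line.split(",")
        kind = fields[0]
        if kind == "<HDR>":
            # A header line starts a new data collection session.
            sessions.append({"header": fields, "records": []})
        elif kind in ("<PNT>", "<ADL>"):
            if not sessions:
                raise ValueError(f"line {lineno}: data before any header line")
            sessions[-1]["records"].append(fields)
        else:
            raise ValueError(f"line {lineno}: unknown line type {kind!r}")
    return sessions


# Sample lines taken from this page and joined with MS-DOS line endings.
sample = (
    "<HDR>,0602011300VEH,PTSAMPLR_JUL03,SETUP_JUL03,2,01 Feb 2006,VIO,SNS,VEH,JUV\r\n"
    "<PNT>,0602011512EAS,15:19:44,R2,YAN,VEB,XXX,\r\n"
    "<ADL>,0602011512EAS,15:20:36,G,ELD,G,EAS\r\n"
)
sessions = split_sessions(sample)
print(len(sessions), [record[0] for record in sessions[0]["records"]])
```

Keeping point and ad-lib lines in a single ordered list preserves the data acquisition sequence, which a loader would need when deriving per-point values from line position.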
Header lines
Header lines indicate the start of a set of data points. The file should begin with a header line.
Sample
<HDR>,0602011300VEH,PTSAMPLR_JUL03,SETUP_JUL03,2,01 Feb 2006,VIO,SNS,VEH,JUV
Field layout
The fields of the header lines are:
1 : The string: <HDR>
2 : A unique sample identifier; a collection of information of the form: YYMMDDHHMMIII
- YY : Year the sampling began. A 2 digit number. Ignored.
- MM : Month the sampling began. A 2 digit number. The month portion of SAMPLES.Date.
- DD : Day of the month the sampling began. A 2 digit number. The day portion of SAMPLES.Date.
- HH : Hour the sampling began (24 hour time). A 2 digit number. The hour portion of SAMPLES.Stime.
- MM : Minutes within the hour the sampling began. A 2 digit number. The minutes portion of SAMPLES.Stime.
- III : The Sname of the focal individual. Ignored.
3 : Psion program version id. A PROGRAMIDS.Programid value. Always PTSAMPLR_JUL03 when the data layout matches the description on this page.
4 : Psion setup id. A SETUPIDS.Setupid value. Always SETUP_JUL03 when the data layout matches the description on this page.
5 : Psion id. A PALMTOPS.Palmtop value.
6 : Date the sample was taken. Format is "DD MMM YYYY". DD is a 2 digit day of the month. MMM is the 3 character month abbreviation. YYYY is a 4 digit year. The year portion is used as the year portion of SAMPLES.Date; the remainder of the field is ignored.
7 : Group being observed. A GROUPS.Three_letter_code value.
8 : Observer taking the sample. An OBSERVERS.Initials value.
9 : Focal individual. A BIOGRAPH.Sname value.
10 : The type of sample being collected, juvenile or female. Corresponds to SAMPLES.Stype. A value of FEM maps to a SAMPLES.Stype value of F. A value of JUV maps to a SAMPLES.Stype value of J.
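The header field layout above can be illustrated with a short Python sketch. The function name parse_header and the keys of the returned dictionary are hypothetical, chosen only for this example. The sketch follows the rules stated above: the month, day, hour and minutes come from field 2, the year comes from field 6, and field 10 is mapped to an Stype letter.

```python
from datetime import date, time

# Field 10 value -> SAMPLES.Stype value, as described above.
STYPE = {"FEM": "F", "JUV": "J"}


def parse_header(line):
    """Decode one <HDR> line into a dict (hypothetical layout)."""
    fields = line.rstrip("\r\n").split(",")
    if len(fields) != 10 or fields[0] != "<HDR>":
        raise ValueError("not a 10-field header line")
    sample_id = fields[1]  # YYMMDDHHMMIII
    # The YY and III portions of the identifier are ignored.
    month, day = int(sample_id[2:4]), int(sample_id[4:6])
    hour, minute = int(sample_id[6:8]), int(sample_id[8:10])
    # Only the year of field 6 ("DD MMM YYYY") is used.
    year = int(fields[5].split()[-1])
    return {
        "date": date(year, month, day),  # SAMPLES.Date
        "stime": time(hour, minute),     # SAMPLES.Stime
        "programid": fields[2],          # PROGRAMIDS.Programid
        "setupid": fields[3],            # SETUPIDS.Setupid
        "palmtop": fields[4],            # PALMTOPS.Palmtop
        "group": fields[6],              # GROUPS.Three_letter_code
        "observer": fields[7],           # OBSERVERS.Initials
        "sname": fields[8],              # BIOGRAPH.Sname
        "stype": STYPE[fields[9]],       # SAMPLES.Stype
    }


header = parse_header(
    "<HDR>,0602011300VEH,PTSAMPLR_JUL03,SETUP_JUL03,2,01 Feb 2006,VIO,SNS,VEH,JUV"
)
print(header["date"], header["stime"], header["stype"])
```

For the sample header line this yields a date of 1 February 2006, a start time of 13:00 and an Stype of J.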
Point lines
All point lines, whether recording female or juvenile samples, share some fields.
Special neighbor codes used in PNT lines
A value of XXX in place of an Sname in a neighbor field means that there is no neighbor of that sort.
Neighbor fields in the Psion data may also contain values found in the UNKSNAMES.Unksname column, in which case the data is stored in the NEIGHBORS.Unksname column rather than the NEIGHBORS.Sname column.
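The handling of special neighbor codes can be sketched as a small Python function. The name classify_neighbor is hypothetical, and the set of UNKSNAMES.Unksname values is passed in as a parameter because this page does not list them; the value ZZZ used in the usage lines is a made-up placeholder, not a real Unksname.

```python
def classify_neighbor(code, unksnames):
    """Return (column, value) saying where a neighbor code is stored.

    `unksnames` stands in for the contents of UNKSNAMES.Unksname;
    a real loader would presumably read them from the database.
    """
    if code == "XXX":
        return (None, None)        # No neighbor of this sort.
    if code in unksnames:
        return ("Unksname", code)  # Stored in NEIGHBORS.Unksname.
    return ("Sname", code)         # Stored in NEIGHBORS.Sname.


placeholder_unksnames = {"ZZZ"}  # Hypothetical value, for illustration only.
print(classify_neighbor("YAN", placeholder_unksnames))
print(classify_neighbor("XXX", placeholder_unksnames))
```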
Sample (Juvenile) Point line
<PNT>,0602011512EAS,15:19:44,R2,YAN,VEB,XXX,
Common Field layout
The fields common to all point lines are:
1 : The string: <PNT>
2 : A unique sample identifier of the same form as the header line. Ignored except for error checking by the psionload program.
3 : The time the point was recorded (not the time the point was observed), in HH:MM:SS format. A POINT_DATA.Ptime value.
- HH : The hour the point was recorded (24 hour time). A 2 digit number.
- MM : The minutes within the hour the point was recorded. A 2 digit number.
- SS : The seconds within the minute the point was recorded. A 2 digit number.
4 : The first two characters of this field are common to both juvenile and female point samples. They are either an activity and posture pair of the form AP, or the characters OS, meaning out of sight. (In the latter case there is no POINT_DATA row, but the SAMPLES.Mins value is incremented for the line.)
- A : Activity code. A POINTS.Activity value.
- P : Posture code. A POINTS.Posture value.
5 : First neighbor. A NEIGHBORS.Sname value with a related Ncode value of 1.
8 : Foodcode or no data. A POINT_DATA.Foodcode value.
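The common point line fields can be decoded along the following lines. This is a sketch under stated assumptions: the function name parse_point_common and its return layout are hypothetical, and it assumes every point line has exactly 8 comma-separated fields, as the sample line does (its trailing comma leaves field 8 empty).

```python
from datetime import datetime


def parse_point_common(line):
    """Decode the fields shared by juvenile and female <PNT> lines."""
    fields = line.rstrip("\r\n").split(",")
    # Assumption: a point line always has exactly 8 fields.
    if len(fields) != 8 or fields[0] != "<PNT>":
        raise ValueError("not an 8-field point line")
    code = fields[3]
    if len(code) < 2:
        raise ValueError("field 4 must have at least 2 characters")
    # OS in the first two characters means out of sight: no POINT_DATA row.
    out_of_sight = code[:2] == "OS"
    return {
        "sample_id": fields[1],
        "ptime": datetime.strptime(fields[2], "%H:%M:%S").time(),
        "out_of_sight": out_of_sight,
        "activity": None if out_of_sight else code[0],
        "posture": None if out_of_sight else code[1],
        "neighbor1": fields[4],          # Ncode 1
        "foodcode": fields[7] or None,   # Empty field means no data.
    }


point = parse_point_common("<PNT>,0602011512EAS,15:19:44,R2,YAN,VEB,XXX,")
print(point["activity"], point["posture"], point["neighbor1"], point["foodcode"])
# prints: R 2 YAN None
```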
Juvenile Field layout
The fields unique to the Juvenile samples are:
4 : This field contains no data that is unique to juvenile samples.
6 : 2nd nearest neighbor. A NEIGHBORS.Sname value with a related Ncode of 2.
7 : 3rd nearest neighbor. A NEIGHBORS.Sname value with a related Ncode of 3.
Female Point lines
Female point lines contain information not present in Juvenile point lines.
Sample Female Point line
<PNT>,0602011523DUN,15:24:41,R2OS,EVA,EVA,XXX,
Female Field layout
The fields unique to female samples are:
4 : This field shares its first two characters with juvenile point lines, including the possibility of recording "no data" by using OS. It adds 2 further characters, giving a 4 character format of the form: APCS
- C : Kid contact information. A FPOINTS.Kidcontact value.
- S : Kid suckling information. A FPOINTS.Kidsuckle value.
6 : Nearest adult neighbor. A NEIGHBORS.Sname value with a related Ncode of A.
7 : Other adult neighbor. A NEIGHBORS.Sname value with a related Ncode of O.
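The differences between juvenile and female point lines can be sketched in Python as below. The names NCODES, neighbors and parse_female_field4 are hypothetical. The sketch assumes that an out of sight female point is recognized by OS in the first two characters of field 4, whatever follows; this page does not say whether further characters appear in that case.

```python
# Ncode values for fields 5, 6 and 7, keyed by SAMPLES.Stype.
NCODES = {
    "J": ("1", "2", "3"),  # Juvenile: first, 2nd nearest, 3rd nearest.
    "F": ("1", "A", "O"),  # Female: first, nearest adult, other adult.
}


def neighbors(fields, stype):
    """Pair each neighbor field with its Ncode, skipping XXX (no neighbor)."""
    return [
        (ncode, code)
        for ncode, code in zip(NCODES[stype], fields[4:7])
        if code != "XXX"
    ]


def parse_female_field4(code):
    """Split the 4 character APCS form of field 4 on a female point line."""
    if code[:2] == "OS":
        return {"out_of_sight": True}  # Assumption: OS prefix is enough.
    if len(code) != 4:
        raise ValueError("expected 4 characters of the form APCS")
    activity, posture, kidcontact, kidsuckle = code
    return {
        "out_of_sight": False,
        "activity": activity,
        "posture": posture,
        "kidcontact": kidcontact,  # FPOINTS.Kidcontact
        "kidsuckle": kidsuckle,    # FPOINTS.Kidsuckle
    }


female = "<PNT>,0602011523DUN,15:24:41,R2OS,EVA,EVA,XXX,".split(",")
print(parse_female_field4(female[3]))
print(neighbors(female, "F"))
```

In the sample female line, field 4 is R2OS: the OS there occupies the kid contact and kid suckling positions, so the point is not out of sight.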
Ad-lib lines
The ad-lib lines are of several different types, as determined by the content of field 4. Groomings, agonisms, requests to groom, and approaches are recorded in the INTERACT_DATA and PARTS tables. The other types, such as consortships and unspecified, are recorded in the ALLMISCS table.
Sample
<ADL>,0602011512EAS,15:20:36,G,ELD,G,EAS
Common Field layout
1 : The string: <ADL>
2 : A unique sample identifier of the same form as the header line. Ignored except for error checking in the psionload program.
3 : The time the observation was recorded. The same format as the point line field 3 documented above.
4 : The observation type.
- G : Grooming.
- A : Agonism.
- R : Request to groom.
- P : Approach.
- C : Consortship.
- U : Unspecified.
- O : Other Social.
Type G, A, R, or P Field layout
3 : The time. Used as both the INTERACT_DATA.Start and the INTERACT_DATA.Stop value.
5 : The actor. A PARTS.Sname value which has a related Role of R.
6 : The act. A PARTS.Act value.
7 : The actee. A PARTS.Sname value which has a related Role of E.
Non-G, A, R, and P Field layout
3 : The time. An ALLMISCS.Atime value.
fields 4 and up : The entire remainder of the ad-lib line, from field 4 onward, becomes the ALLMISCS.Txt value.
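The routing of ad-lib lines between the two layouts can be sketched as follows. The function name parse_adlib and the shape of the returned dictionaries are hypothetical. The sketch splits off only the first three fields so that the remainder of the line, commas included, can be kept verbatim as the ALLMISCS.Txt value; whether the real loader preserves embedded commas this way is an assumption.

```python
INTERACTION_TYPES = {"G", "A", "R", "P"}  # Recorded in INTERACT_DATA and PARTS.


def parse_adlib(line):
    """Decode one <ADL> line according to its observation type (field 4)."""
    head = line.rstrip("\r\n").split(",", 3)
    if len(head) != 4 or head[0] != "<ADL>":
        raise ValueError("not an ad-lib line")
    _, sample_id, when, rest = head  # rest is field 4 onward, verbatim.
    detail = rest.split(",")
    kind = detail[0]
    if kind in INTERACTION_TYPES:
        if len(detail) != 4:
            raise ValueError("interaction lines need fields 4 through 7")
        return {
            "table": "INTERACT_DATA",
            "start": when,       # The one recorded time serves as both
            "stop": when,        # the Start and the Stop value.
            "actor": detail[1],  # PARTS.Sname with Role R
            "act": detail[2],    # PARTS.Act
            "actee": detail[3],  # PARTS.Sname with Role E
        }
    # Every other type: the end of the line becomes ALLMISCS.Txt.
    return {"table": "ALLMISCS", "atime": when, "txt": rest}


grooming = parse_adlib("<ADL>,0602011512EAS,15:20:36,G,ELD,G,EAS")
print(grooming["actor"], grooming["act"], grooming["actee"])
# prints: ELD G EAS
```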