NAME

update_obo_oa_ontologies.pl - script to create, populate, or update postgres tables for ontology annotator obotables.


SYNOPSIS

Edit the arrays of users to grant permissions to (both 'select' and 'all'), edit the obotable-to-URL entries in the %obos hash, add optional code for specific obotable types, then run with

  ./update_obo_oa_ontologies.pl


DESCRIPTION

The ontology_annotator.cgi allows .obo files to be generically parsed into postgres tables for obotables used in fields of type 'ontology' or 'multiontology'.

.obo data changes routinely, so this script can run as a cronjob to update the data whenever an .obo file's 'date:' line has changed.

SCRIPT REQUIREMENTS

Create a directory to store the last version of each obo file. Change the path to it in the $directory variable.

Edit the arrays of postgres database users to grant permissions to (both 'select' and 'all').

Edit the %obos hash, which maps each obotable to the URL of its .obo file.
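
As an illustration of the settings described above, here is a minimal sketch of what the edited configuration might look like. Only $directory and %obos are names taken from this document; the array names @users_select and @users_all, the grant_statements helper, the table name prefix, the user names, the path, and the URL are all hypothetical placeholders.

```perl
use strict;
use warnings;

# All values below are hypothetical placeholders; replace them with your own.
my $directory    = '/path/to/obo_files';    # where the last copy of each .obo file is kept
my @users_select = ('readonly_user');       # users granted 'select' (assumed array name)
my @users_all    = ('curator_user');        # users granted 'all' (assumed array name)

# obotable name => URL of its .obo file
my %obos = (
    exampleontology => 'http://example.org/ontologies/example.obo',
);

# Build the GRANT statements for one table (hypothetical helper, not from the script).
sub grant_statements {
    my ($table, $select_ref, $all_ref) = @_;
    my @sql;
    push @sql, "GRANT SELECT ON TABLE $table TO $_;" foreach @$select_ref;
    push @sql, "GRANT ALL ON TABLE $table TO $_;"    foreach @$all_ref;
    return @sql;
}

foreach my $obotable (sort keys %obos) {
    my $table = "obo_name_$obotable";       # assumed table naming, for illustration only
    print "$obotable: $obos{$obotable} (stored under $directory)\n";
    print "  $_\n" foreach grant_statements($table, \@users_select, \@users_all);
}
```

The sketch only prints the statements; running them against postgres is left to whatever database handle the script already uses.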

CREATE TABLES

If creating an obotable type for the first time:

TERM INFO OBO TREE BROWSING

The script can be edited to add custom handling for specific obotables, such as parsing names and IDs, adding URL links to the term information, or creating obo tree links to browse the term info obo structure.

When creating obo tree links to browse the term info obo structure:

The %children hash is populated by matching on the 'is_a:' and 'relationship: part_of' tags in the .obo file.
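
The matching described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the sample .obo text and the EX: identifiers are made up, and the hash-of-hashes layout of %children (parent id to child ids) is an assumption.

```perl
use strict;
use warnings;

# Hypothetical sample data in .obo format.
my $obo_text = <<'END_OBO';
[Term]
id: EX:0000001
name: root term

[Term]
id: EX:0000002
name: child term
is_a: EX:0000001 ! root term

[Term]
id: EX:0000003
name: part term
relationship: part_of EX:0000002 ! child term
END_OBO

my %children;    # parent id => { child id => count } (assumed layout)
foreach my $entry (split /\[Term\]/, $obo_text) {
    my ($id) = $entry =~ m/^id: (\S+)/m;
    next unless defined $id;    # skips the empty chunk before the first [Term]
    # Each is_a or part_of target is a parent of the current term.
    while ($entry =~ m/^is_a: (\S+)/mg)                 { $children{$1}{$id}++; }
    while ($entry =~ m/^relationship: part_of (\S+)/mg) { $children{$1}{$id}++; }
}

foreach my $parent (sort keys %children) {
    print "$parent -> ", join(', ', sort keys %{ $children{$parent} }), "\n";
}
```

With the sample data, EX:0000001 gets EX:0000002 as a child and EX:0000002 gets EX:0000003, which is the parent-to-child direction a tree browser needs when expanding a term.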

SCRIPT FUNCTION

For each obotable, compare the date of the downloaded .obo file with the date of the last .obo file used to populate the postgres tables. If the downloaded date is more recent, delete all data from the tables, populate them from the new file, and write the new file to a flatfile for future comparison.
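
One way to make the date comparison concrete is sketched below. It assumes the header line follows the OBO 1.2 format 'date: dd:MM:yyyy HH:mm'; the function names obo_date_key and needs_update are hypothetical, and the script itself may compare dates differently.

```perl
use strict;
use warnings;

# Return a sortable YYYYMMDDhhmm key from the 'date:' header line, or undef if absent.
# Assumes the OBO 1.2 header format 'date: dd:MM:yyyy HH:mm'.
sub obo_date_key {
    my ($text) = @_;
    return unless defined $text;
    if ($text =~ m/^date: (\d{2}):(\d{2}):(\d{4}) (\d{2}):(\d{2})/m) {
        return "$3$2$1$4$5";
    }
    return;
}

# True when the downloaded file should replace the stored one.
sub needs_update {
    my ($new_text, $old_text) = @_;
    my $new_key = obo_date_key($new_text);
    return 0 unless defined $new_key;    # no date in the download: leave tables alone
    my $old_key = obo_date_key($old_text);
    return 1 unless defined $old_key;    # nothing stored yet: populate
    return ($new_key gt $old_key) ? 1 : 0;
}

# Hypothetical header fragments standing in for the stored and downloaded files.
my $stored     = "format-version: 1.2\ndate: 23:09:2011 14:37\n";
my $downloaded = "format-version: 1.2\ndate: 05:01:2012 09:12\n";
print needs_update($downloaded, $stored) ? "update\n" : "unchanged\n";
```

The date is reordered to year-month-day before comparing because the raw strings do not sort chronologically: '05:01:2012' sorts before '23:09:2011' as plain text even though it is the later date.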

The downloaded data file is split on '[Term]'. For each entry, the id is the match from 'id: ' to the newline, the name is the match from 'name: ' to the newline, and the synonyms are all matches of 'synonym: "<match>"'. The data is the whole entry: it is split into lines, and on each line the tag (everything up to the first colon) is wrapped in an html span element to bold it. Each term id has a single entry for name and a single entry for data, but can have multiple synonym entries, one per synonym.
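
The parsing steps above can be sketched as follows. This is an illustration under assumptions, not the script's code: the sample entry is made up, the hash names %name, %syn, and %data are hypothetical stand-ins for the postgres tables, and the exact span markup is a guess.

```perl
use strict;
use warnings;

# Hypothetical sample data in .obo format.
my $obo_text = <<'END_OBO';
format-version: 1.2

[Term]
id: EX:0000001
name: example term
synonym: "sample term" EXACT []
synonym: "demo term" RELATED []
def: "A term used for illustration." []
END_OBO

my (%name, %syn, %data);    # stand-ins for the name, synonym, and data tables
my @entries = split /\[Term\]/, $obo_text;
shift @entries;             # drop whatever precedes the first [Term] (the file header)

foreach my $entry (@entries) {
    my ($id) = $entry =~ m/^id: (.*)$/m;
    next unless defined $id;
    ($name{$id}) = $entry =~ m/^name: (.*)$/m;

    # One synonym entry per quoted match.
    while ($entry =~ m/^synonym: "(.*?)"/mg) { push @{ $syn{$id} }, $1; }

    # Bold the tag (everything up to the first colon) on each data line.
    my @lines;
    foreach my $line (split /\n/, $entry) {
        next unless $line =~ m/\S/;
        $line =~ s{^([^:]+):}{<span style="font-weight: bold">$1:</span>};
        push @lines, $line;
    }
    $data{$id} = join "\n", @lines;
}

foreach my $id (sort keys %name) {
    print "id: $id\nname: $name{$id}\n";
    print "synonym: $_\n" foreach @{ $syn{$id} || [] };
    print "data:\n$data{$id}\n";
}
```

Because only the first colon on a line ends the tag, a line such as 'id: EX:0000001' bolds 'id:' and leaves the colon inside the identifier untouched.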