Task 264: .GO File Format

1. Properties of the .GO File Format Intrinsic to Its File System

The .GO file format described here is the OBO flat file format used to represent the Gene Ontology (GO), a structured vocabulary for biological functions. The format is text-based and consists of a header section followed by stanzas that describe ontology elements. It is specified in the OBO Flat File Format Guide, version 1.4. Its properties are tags: key-value pairs that carry ontology metadata and structure. Because the file is plain text, it can be stored and exchanged on any file system without special handling.

The file structure includes:

  • A header with tag-value pairs providing ontology metadata.
  • Zero or more stanzas, each starting with a type in square brackets (e.g., [Term]), followed by tag-value pairs.
  • Supported stanza types: [Term] (for classes), [Typedef] (for relationship types), and [Instance] (for instances).
  • Each line is parsed as tag: value, optionally followed by trailing modifiers and a comment; everything after an unescaped ! is ignored.
  • Required elements include the format-version header tag and id in each stanza.
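
The line-level rule in the list above (tag: value, with anything after an unescaped ! treated as a comment) can be sketched as follows. This is a minimal illustration rather than a full OBO parser: the function name split_obo_line is hypothetical, trailing modifiers are left inside the value, and escape sequences are kept as written instead of being decoded.

```python
import re

# Matches the first '!' that is not preceded by a backslash.
_COMMENT = re.compile(r'(?<!\\)!')


def split_obo_line(line):
    """Split one OBO line into (tag, value, comment).

    Returns None for blank lines, comment-only lines, and stanza headers.
    """
    line = line.strip()
    if not line or line.startswith('!') or line.startswith('['):
        return None
    tag, sep, rest = line.partition(':')
    if not sep:
        return None  # no "tag: value" separator on this line
    parts = _COMMENT.split(rest, maxsplit=1)
    value = parts[0].strip()
    comment = parts[1].strip() if len(parts) == 2 else ''
    return tag.strip(), value, comment


print(split_obo_line('is_a: GO:0008150 ! biological_process'))
# ('is_a', 'GO:0008150', 'biological_process')
print(split_obo_line('[Term]'))
# None
```

Splitting on the first colon only matters because values such as GO:0008150 contain colons themselves.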

Below is a comprehensive enumeration of all properties (tags), categorized by header and stanza type, including descriptions, cardinality, and examples where applicable (derived from the format guide).

Header Tags

These appear at the file's beginning and apply to the entire ontology.

Tag Name | Description | Cardinality | Example
format-version | Specifies the OBO specification version used by the file. | Required (1) | format-version: 1.4
data-version | Indicates the version of the ontology data. | Optional (0 or 1) | data-version: releases/2025-09-01
date | Records the file creation or update date in dd:MM:yyyy HH:mm format. | Optional (0 or 1) | date: 21:09:2025 14:30
saved-by | Identifies the user who last saved the file. | Optional (0 or 1) | saved-by: jsmith
auto-generated-by | Names the program that generated the file. | Optional (0 or 1) | auto-generated-by: OBO-Edit 2.3
import | References an external OBO or OWL document to import. | Optional (any) | import: http://purl.obolibrary.org/obo/cl.obo
subsetdef | Defines a subset with a name and description. | Optional (any) | subsetdef: goslim "GO slim"
synonymtypedef | Defines a custom synonym type with name, description, and optional scope. | Optional (any) | synonymtypedef: systematic_synonym "Systematic synonym" EXACT
default-namespace | Sets the default namespace for IDs. | Optional (0 or 1) | default-namespace: gene_ontology
namespace-id-rule | Defines a rule for generating namespace-specific IDs. | Optional (any) | namespace-id-rule: * GO:$sequence(7,0,9999999)$
idspace | Maps a local ID space to a global one with optional description and URL. | Optional (any) | idspace: CL http://purl.obolibrary.org/obo/cl.owl
treat-xrefs-as-equivalent | Treats xrefs from a prefix as equivalent terms. | Optional (any) | treat-xrefs-as-equivalent: UBERON
treat-xrefs-as-genus-differentia | Treats xrefs as genus-differentia axioms. | Optional (any) | treat-xrefs-as-genus-differentia: CL cell located_in $
treat-xrefs-as-relationship | Treats xrefs as specific relationships. | Optional (any) | treat-xrefs-as-relationship: part_of CL
treat-xrefs-as-is_a | Treats xrefs as is_a relationships. | Optional (any) | treat-xrefs-as-is_a: CL
treat-xrefs-as-has-subclass | Treats xrefs as has-subclass relationships. | Optional (any) | treat-xrefs-as-has-subclass: TAX
treat-xrefs-as-reverse-genus-differentia | Treats xrefs as reverse genus-differentia. | Optional (any) | treat-xrefs-as-reverse-genus-differentia: TAX organism classification
ontology | Specifies the ontology ID space. | Optional (0 or 1) | ontology: go
remark | Provides general comments or notes. | Optional (any) | remark: This is a note.
property_value | Adds arbitrary property-value pairs with optional type and description. | Optional (any) | property_value: seeAlso http://www.geneontology.org
owl-axioms | Embeds OWL axioms in OWL functional-style syntax. | Optional (any) | owl-axioms: EquivalentClasses(http://example.org/A http://example.org/B)
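
The header date layout is unusual because it uses colons between day, month, and year. A small sketch of reading and writing it with the Python standard library is shown here; the helper names parse_header_date and format_header_date are illustrative and not part of any OBO library.

```python
from datetime import datetime

# Layout described for the header 'date' tag: dd:MM:yyyy HH:mm
OBO_DATE_FORMAT = '%d:%m:%Y %H:%M'


def parse_header_date(value):
    """Convert a header 'date' value into a datetime object."""
    return datetime.strptime(value.strip(), OBO_DATE_FORMAT)


def format_header_date(moment):
    """Render a datetime back into the header 'date' layout."""
    return moment.strftime(OBO_DATE_FORMAT)


stamp = parse_header_date('21:09:2025 14:30')
print(stamp.isoformat())          # 2025-09-21T14:30:00
print(format_header_date(stamp))  # 21:09:2025 14:30
```

Note that a naive tag/value split on the first colon still works for this tag, since the tag name itself contains no colon.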

Stanza Tags for [Term]

These describe ontology classes (terms).

Tag Name | Description | Cardinality | Example
id | Unique identifier for the term. | Required (1) | id: GO:0000001
name | Human-readable name. | Required unless obsolete (0 or 1) | name: mitochondrion inheritance
namespace | Namespace for the term; when absent, the header's default-namespace applies. | Optional (0 or 1) | namespace: biological_process
alt_id | Alternative ID. | Optional (any) | alt_id: GO:0000004
def | Definition with quote-enclosed text and dbxrefs. | Optional (0 or 1) | def: "The something." [GO:curators]
comment | Additional notes. | Optional (0 or 1) | comment: This is a comment.
subset | Assigns the term to a subset. | Optional (any) | subset: goslim_generic
synonym | Synonym with quote-enclosed text, scope, type, and dbxrefs. | Optional (any) | synonym: "reproduction" EXACT []
xref | Cross-reference to external resources. | Optional (any) | xref: Wikipedia:Mitochondrion
is_a | Subclass relationship. | Optional (any) | is_a: GO:0008150
intersection_of | Logical intersection definition. | Optional (any) | intersection_of: GO:0006810 ! transport
union_of | Logical union definition. | Optional (any) | union_of: GO:0000001
disjoint_from | Disjointness axiom. | Optional (any) | disjoint_from: GO:0000002
relationship | Typed relationship to another term. | Optional (any) | relationship: part_of GO:0008150
is_obsolete | Marks the term as obsolete. | Optional (0 or 1) | is_obsolete: true
replaced_by | Replacement for an obsolete term. | Optional (any) | replaced_by: GO:0000003
consider | Suggested replacement for an obsolete term. | Optional (any) | consider: GO:0000004
created_by | Creator of the term. | Optional (0 or 1) | created_by: jsmith
creation_date | Creation date in ISO 8601 format. | Optional (0 or 1) | creation_date: 2009-04-13T01:32:36Z
property_value | Arbitrary property-value pair. | Optional (any) | property_value: seeAlso http://example.org
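
Some tag values carry internal structure. The synonym value, for example, packs quoted text, a scope, an optional synonym type, and a dbxref list into one line. The following is a rough sketch of pulling those parts out. It assumes the scope is always present and is one of EXACT, BROAD, NARROW, or RELATED, and that dbxrefs contain no commas or brackets of their own; parse_synonym is a hypothetical helper name.

```python
import re

# "text" SCOPE [optional type] [comma-separated dbxrefs]
_SYNONYM = re.compile(
    r'^"(?P<text>(?:[^"\\]|\\.)*)"\s+'
    r'(?P<scope>EXACT|BROAD|NARROW|RELATED)'
    r'(?:\s+(?P<type>[^\s\[]+))?'
    r'\s*\[(?P<xrefs>[^\]]*)\]\s*$'
)


def parse_synonym(value):
    """Break a synonym tag value into text, scope, type, and xrefs."""
    match = _SYNONYM.match(value.strip())
    if match is None:
        raise ValueError(f'unrecognized synonym value: {value!r}')
    xrefs = [x.strip() for x in match.group('xrefs').split(',') if x.strip()]
    return {
        'text': match.group('text'),
        'scope': match.group('scope'),
        'type': match.group('type'),  # None when no synonym type is given
        'xrefs': xrefs,
    }


print(parse_synonym('"reproduction" EXACT []'))
# {'text': 'reproduction', 'scope': 'EXACT', 'type': None, 'xrefs': []}
```

The def tag follows a similar quoted-text-plus-dbxrefs shape, so the same approach could be adapted for it.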

Stanza Tags for [Typedef]

These describe relationship types.

Tag Name | Description | Cardinality | Example
id | Unique identifier. | Required (1) | id: part_of
name | Human-readable name. | Required unless obsolete (0 or 1) | name: part of
namespace | Namespace; when absent, the header's default-namespace applies. | Optional (0 or 1) | namespace: relations
alt_id | Alternative ID. | Optional (any) | alt_id: is_part_of
def | Definition. | Optional (0 or 1) | def: "A part-of relationship." []
comment | Notes. | Optional (0 or 1) | comment: Transitive.
subset | Subset assignment. | Optional (any) | subset: relation_slim
synonym | Synonym. | Optional (any) | synonym: "parte de" EXACT []
xref | Cross-reference. | Optional (any) | xref: RO:0000001
domain | Domain class. | Optional (0 or 1) | domain: GO:0005575
range | Range class. | Optional (0 or 1) | range: GO:0005575
is_anti_symmetric | Anti-symmetric property. | Optional (0 or 1) | is_anti_symmetric: true
is_cyclic | Cyclic property. | Optional (0 or 1) | is_cyclic: true
is_reflexive | Reflexive property. | Optional (0 or 1) | is_reflexive: true
is_symmetric | Symmetric property. | Optional (0 or 1) | is_symmetric: true
is_transitive | Transitive property. | Optional (0 or 1) | is_transitive: true
is_functional | Functional property. | Optional (0 or 1) | is_functional: true
is_inverse_functional | Inverse functional property. | Optional (0 or 1) | is_inverse_functional: true
is_a | Subproperty relationship. | Optional (any) | is_a: OBO_REL:part_of
intersection_of | Intersection definition. | Optional (any) | intersection_of: OBO_REL:part_of
union_of | Union definition. | Optional (any) | union_of: OBO_REL:part_of
disjoint_from | Disjointness. | Optional (any) | disjoint_from: OBO_REL:has_part
inverse_of | Inverse relationship. | Optional (0 or 1) | inverse_of: has_part
transitive_over | Transitive over chain. | Optional (any) | transitive_over: part_of
holds_over_chain | Holds over chain axiom. | Optional (any) | holds_over_chain: part_of part_of
equivalent_to_chain | Equivalent to chain axiom. | Optional (any) | equivalent_to_chain: part_of has_part
disjoint_over | Disjoint over axiom. | Optional (any) | disjoint_over: part_of
relationship | Nested relationship. | Optional (any) | relationship: inverse_of has_part
is_obsolete | Obsolete flag. | Optional (0 or 1) | is_obsolete: true
replaced_by | Replacement. | Optional (any) | replaced_by: RO:0001025
consider | Suggested replacement. | Optional (any) | consider: RO:0001025
created_by | Creator. | Optional (0 or 1) | created_by: jsmith
creation_date | Creation date. | Optional (0 or 1) | creation_date: 2009-04-13T01:32:36Z
property_value | Property-value pair. | Optional (any) | property_value: seeAlso http://example.org
expand_assertion_to | Macro expansion for assertions. | Optional (any) | expand_assertion_to: "Class: ?X EquivalentTo: ?Y"
expand_expression_to | Macro expansion for expressions. | Optional (any) | expand_expression_to: "Class: ?X EquivalentTo: ?Y"
is_metadata_tag | Marks the relationship type as metadata. | Optional (0 or 1) | is_metadata_tag: true
is_class_level | Class-level tag. | Optional (0 or 1) | is_class_level: true

Stanza Tags for [Instance]

These describe instances.

Tag Name | Description | Cardinality | Example
id | Unique identifier. | Required (1) | id: GO:0000001_instance
name | Name. | Optional (0 or 1) | name: specific instance
namespace | Namespace; when absent, the header's default-namespace applies. | Optional (0 or 1) | namespace: gene_ontology
alt_id | Alternative ID. | Optional (any) | alt_id: GO_instance
def | Definition. | Optional (0 or 1) | def: "Instance def." []
comment | Notes. | Optional (0 or 1) | comment: Example instance.
synonym | Synonym. | Optional (any) | synonym: "inst synonym" EXACT []
xref | Cross-reference. | Optional (any) | xref: PMID:1234567
instance_of | Type instantiation. | Required (1 or more) | instance_of: GO:0008150
property_value | Property-value pair (used for relations). | Optional (any) | property_value: part_of GO:0008150
is_obsolete | Obsolete flag. | Optional (0 or 1) | is_obsolete: true
replaced_by | Replacement. | Optional (any) | replaced_by: GO_instance2
consider | Suggested replacement. | Optional (any) | consider: GO_instance3
created_by | Creator. | Optional (0 or 1) | created_by: jsmith
creation_date | Creation date. | Optional (0 or 1) | creation_date: 2009-04-13T01:32:36Z
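
The cardinality column in these tables lends itself to a simple mechanical check. The sketch below assumes a stanza's tags are held as a mapping from tag name to a list of values. The rule table covers only a few [Instance] tags taken from the table above, and both INSTANCE_RULES and check_cardinality are hypothetical names chosen for illustration.

```python
# Hypothetical rule table: tag -> (minimum, maximum); None means unbounded.
INSTANCE_RULES = {
    'id': (1, 1),
    'name': (0, 1),
    'instance_of': (1, None),
    'is_obsolete': (0, 1),
}


def check_cardinality(tags, rules):
    """Return a list of human-readable cardinality problems (empty if none)."""
    problems = []
    for tag, (low, high) in rules.items():
        count = len(tags.get(tag, []))
        if count < low:
            problems.append(f'{tag}: expected at least {low}, found {count}')
        elif high is not None and count > high:
            problems.append(f'{tag}: expected at most {high}, found {count}')
    return problems


good = {'id': ['GO:0000001_instance'], 'instance_of': ['GO:0008150']}
bad = {'id': ['a', 'b']}
print(check_cardinality(good, INSTANCE_RULES))  # []
print(check_cardinality(bad, INSTANCE_RULES))
# ['id: expected at most 1, found 2', 'instance_of: expected at least 1, found 0']
```

Tags that are absent from the rule table are simply not checked, which keeps the sketch tolerant of unknown or extension tags.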

Direct download links for example .GO files (in OBO format for Gene Ontology) are as follows:

These files contain the full Gene Ontology in the specified format.

3. Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .GO File Dump

The following is an embeddable HTML snippet with JavaScript suitable for a Ghost blog post. It enables users to drag and drop a .GO (OBO) file, parses it, and dumps all properties (header and stanza tags with values) to the screen in a structured list.

Drag and drop a .GO file here

4. Python Class for Handling .GO Files

The following Python class opens a .GO (OBO) file, decodes (parses) its content, reads the properties, allows writing back to a new file, and prints properties to the console.

import re

# The first '!' that is not escaped with a backslash starts a trailing comment.
_COMMENT_START = re.compile(r'(?<!\\)!')


class GoFile:
    def __init__(self, filename):
        self.filename = filename
        self.header = {}   # tag -> list of values, in file order
        self.stanzas = []  # each: {'type': '[Term]', 'tags': {tag: [values]}}
        self.parse()

    def parse(self):
        # Read as UTF-8 explicitly instead of relying on the platform default.
        with open(self.filename, 'r', encoding='utf-8') as f:
            content = f.read()
        # Drop blank lines and whole-line comments.
        lines = [line.strip() for line in content.splitlines()
                 if line.strip() and not line.strip().startswith('!')]
        current_stanza = None
        in_header = True
        for line in lines:
            if line.startswith('['):
                # A bracketed line such as [Term] opens a new stanza.
                in_header = False
                if current_stanza:
                    self.stanzas.append(current_stanza)
                current_stanza = {'type': line, 'tags': {}}
            else:
                parts = line.split(':', 1)
                if len(parts) == 2:
                    tag = parts[0].strip()
                    # Cut only at an unescaped '!', so an escaped '\!' stays in the value.
                    value = _COMMENT_START.split(parts[1], maxsplit=1)[0].strip()
                    if in_header:
                        self.header.setdefault(tag, []).append(value)
                    elif current_stanza:
                        current_stanza['tags'].setdefault(tag, []).append(value)
        if current_stanza:
            self.stanzas.append(current_stanza)

    def get_properties(self):
        return {'header': self.header, 'stanzas': self.stanzas}

    def print_properties(self):
        print("Header Properties:")
        for tag, values in self.header.items():
            for value in values:
                print(f"{tag}: {value}")
        print("\nStanzas:")
        for i, stanza in enumerate(self.stanzas, 1):
            print(f"Stanza {i}: {stanza['type']}")
            for tag, values in stanza['tags'].items():
                for value in values:
                    print(f"  {tag}: {value}")

    def write(self, output_filename):
        with open(output_filename, 'w', encoding='utf-8') as f:
            # Write header
            for tag, values in self.header.items():
                for value in values:
                    f.write(f"{tag}: {value}\n")
            f.write("\n")
            # Write stanzas
            for stanza in self.stanzas:
                f.write(f"{stanza['type']}\n")
                for tag, values in stanza['tags'].items():
                    for value in values:
                        f.write(f"{tag}: {value}\n")
                f.write("\n")
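
Once a file is parsed, the structure returned by get_properties() can be queried directly. As a rough illustration, the sketch below builds an id-to-name lookup for [Term] stanzas. It runs on hand-written sample data in the same shape, not on a real GO release, and index_terms is a hypothetical helper name.

```python
def index_terms(properties):
    """Map each [Term] id to its first name (None if the term has no name)."""
    index = {}
    for stanza in properties['stanzas']:
        if stanza['type'] != '[Term]':
            continue  # skip [Typedef] and [Instance] stanzas
        tags = stanza['tags']
        for term_id in tags.get('id', []):
            index[term_id] = (tags.get('name') or [None])[0]
    return index


# Sample data shaped like the result of get_properties(); values are illustrative.
sample = {
    'header': {'format-version': ['1.4']},
    'stanzas': [
        {'type': '[Term]',
         'tags': {'id': ['GO:0000001'], 'name': ['mitochondrion inheritance']}},
        {'type': '[Typedef]',
         'tags': {'id': ['part_of'], 'name': ['part of']}},
    ],
}

print(index_terms(sample))
# {'GO:0000001': 'mitochondrion inheritance'}
```

The stanza type is compared against '[Term]' with brackets because the parser stores the stanza header line as written.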

5. Java Class for Handling .GO Files

The following Java class opens a .GO (OBO) file, decodes its content, reads properties, allows writing to a new file, and prints properties to the console.

import java.io.*;
import java.util.*;

public class GoFile {
    private String filename;
    private Map<String, List<String>> header = new LinkedHashMap<>();
    private List<Map<String, Object>> stanzas = new ArrayList<>();

    public GoFile(String filename) {
        this.filename = filename;
        parse();
    }

    private void parse() {
        try (BufferedReader reader = new BufferedReader(new FileReader(filename))) {
            String line;
            Map<String, List<String>> currentStanzaTags = null;
            String currentStanzaType = null;
            boolean inHeader = true;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty() || line.startsWith("!")) continue;
                if (line.startsWith("[")) {
                    inHeader = false;
                    if (currentStanzaType != null) {
                        Map<String, Object> stanza = new HashMap<>();
                        stanza.put("type", currentStanzaType);
                        stanza.put("tags", currentStanzaTags);
                        stanzas.add(stanza);
                    }
                    currentStanzaType = line;
                    currentStanzaTags = new LinkedHashMap<>();
                } else {
                    String[] parts = line.split(":", 2);
                    if (parts.length == 2) {
                        String tag = parts[0].trim();
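                        // Cut at the first unescaped '!'; the limit of 2 keeps the
                        // array non-empty even when the value is only a comment.
                        String value = parts[1].split("(?<!\\\\)!", 2)[0].trim();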
                        if (inHeader) {
                            header.computeIfAbsent(tag, k -> new ArrayList<>()).add(value);
                        } else if (currentStanzaTags != null) {
                            currentStanzaTags.computeIfAbsent(tag, k -> new ArrayList<>()).add(value);
                        }
                    }
                }
            }
            if (currentStanzaType != null) {
                Map<String, Object> stanza = new HashMap<>();
                stanza.put("type", currentStanzaType);
                stanza.put("tags", currentStanzaTags);
                stanzas.add(stanza);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public Map<String, Object> getProperties() {
        Map<String, Object> properties = new HashMap<>();
        properties.put("header", header);
        properties.put("stanzas", stanzas);
        return properties;
    }

    public void printProperties() {
        System.out.println("Header Properties:");
        for (Map.Entry<String, List<String>> entry : header.entrySet()) {
            for (String value : entry.getValue()) {
                System.out.println(entry.getKey() + ": " + value);
            }
        }
        System.out.println("\nStanzas:");
        for (int i = 0; i < stanzas.size(); i++) {
            Map<String, Object> stanza = stanzas.get(i);
            System.out.println("Stanza " + (i + 1) + ": " + stanza.get("type"));
            @SuppressWarnings("unchecked")
            Map<String, List<String>> tags = (Map<String, List<String>>) stanza.get("tags");
            for (Map.Entry<String, List<String>> entry : tags.entrySet()) {
                for (String value : entry.getValue()) {
                    System.out.println("  " + entry.getKey() + ": " + value);
                }
            }
        }
    }

    public void write(String outputFilename) {
        try (PrintWriter writer = new PrintWriter(new File(outputFilename))) {
            // Write header
            for (Map.Entry<String, List<String>> entry : header.entrySet()) {
                for (String value : entry.getValue()) {
                    writer.println(entry.getKey() + ": " + value);
                }
            }
            writer.println();
            // Write stanzas
            for (Map<String, Object> stanza : stanzas) {
                writer.println(stanza.get("type"));
                @SuppressWarnings("unchecked")
                Map<String, List<String>> tags = (Map<String, List<String>>) stanza.get("tags");
                for (Map.Entry<String, List<String>> entry : tags.entrySet()) {
                    for (String value : entry.getValue()) {
                        writer.println(entry.getKey() + ": " + value);
                    }
                }
                writer.println();
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}

6. JavaScript Class for Handling .GO Files

The following JavaScript class opens a .GO (OBO) file (using Node.js fs module), decodes its content, reads properties, allows writing to a new file, and prints properties to the console.

const fs = require('fs');

class GoFile {
  constructor(filename) {
    this.filename = filename;
    this.header = {};
    this.stanzas = [];
    this.parse();
  }

  parse() {
    const content = fs.readFileSync(this.filename, 'utf8');
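    // Trim every line (this also removes '\r' from Windows line endings), then
    // drop blank lines and whole-line comments.
    const lines = content.split(/\r?\n/).map(line => line.trim()).filter(line => line && !line.startsWith('!'));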
    let currentStanza = null;
    let inHeader = true;
    lines.forEach(line => {
      if (line.startsWith('[')) {
        inHeader = false;
        if (currentStanza) this.stanzas.push(currentStanza);
        currentStanza = { type: line, tags: {} };
      } else {
        const match = line.match(/^([^:]+):\s*(.*)/);
        if (match) {
          const tag = match[1].trim();
          // Cut at the first '!' that is not escaped with a backslash.
          const value = match[2].split(/(?<!\\)!/)[0].trim();
          if (inHeader) {
            if (!this.header[tag]) this.header[tag] = [];
            this.header[tag].push(value);
          } else if (currentStanza) {
            if (!currentStanza.tags[tag]) currentStanza.tags[tag] = [];
            currentStanza.tags[tag].push(value);
          }
        }
      }
    });
    if (currentStanza) this.stanzas.push(currentStanza);
  }

  getProperties() {
    return { header: this.header, stanzas: this.stanzas };
  }

  printProperties() {
    console.log('Header Properties:');
    for (const [tag, values] of Object.entries(this.header)) {
      values.forEach(value => console.log(`${tag}: ${value}`));
    }
    console.log('\nStanzas:');
    this.stanzas.forEach((stanza, index) => {
      console.log(`Stanza ${index + 1}: ${stanza.type}`);
      for (const [tag, values] of Object.entries(stanza.tags)) {
        values.forEach(value => console.log(`  ${tag}: ${value}`));
      }
    });
  }

  write(outputFilename) {
    let output = '';
    // Write header
    for (const [tag, values] of Object.entries(this.header)) {
      values.forEach(value => output += `${tag}: ${value}\n`);
    }
    output += '\n';
    // Write stanzas
    this.stanzas.forEach(stanza => {
      output += `${stanza.type}\n`;
      for (const [tag, values] of Object.entries(stanza.tags)) {
        values.forEach(value => output += `${tag}: ${value}\n`);
      }
      output += '\n';
    });
    fs.writeFileSync(outputFilename, output);
  }
}

7. C++ Class for Handling .GO Files

The following C++ class opens a .GO (OBO) file, decodes its content, reads properties, allows writing to a new file, and prints properties to the console.

#include <fstream>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// One stanza: its bracketed type line (e.g. "[Term]") plus its tag-value
// pairs in file order. Repeated tags are kept as separate entries, so no
// separator character is needed to join multiple values.
struct Stanza {
    std::string type;
    std::vector<std::pair<std::string, std::string>> tags;
};

class GoFile {
private:
    std::string filename;
    std::vector<std::pair<std::string, std::string>> header;  // in file order
    std::vector<Stanza> stanzas;

    // Remove leading and trailing spaces, tabs, and carriage returns.
    static std::string trim(const std::string& s) {
        const char* ws = " \t\r";
        size_t start = s.find_first_not_of(ws);
        if (start == std::string::npos) return "";
        size_t end = s.find_last_not_of(ws);
        return s.substr(start, end - start + 1);
    }

    // Drop a trailing comment, which starts at the first '!' that is not
    // escaped with a backslash.
    static std::string stripComment(const std::string& s) {
        for (size_t i = 0; i < s.size(); ++i) {
            if (s[i] == '\\') { ++i; continue; }  // skip the escaped character
            if (s[i] == '!') return s.substr(0, i);
        }
        return s;
    }

public:
    explicit GoFile(const std::string& fn) : filename(fn) {
        parse();
    }

    void parse() {
        std::ifstream file(filename);
        if (!file.is_open()) {
            std::cerr << "Error opening file: " << filename << std::endl;
            return;
        }
        std::string line;
        bool inHeader = true;
        while (std::getline(file, line)) {
            line = trim(line);
            if (line.empty() || line[0] == '!') continue;
            if (line[0] == '[') {
                // A bracketed line starts a new stanza and ends the header.
                inHeader = false;
                Stanza stanza;
                stanza.type = line;
                stanzas.push_back(stanza);
                continue;
            }
            size_t colonPos = line.find(':');
            if (colonPos == std::string::npos) continue;
            std::string tag = trim(line.substr(0, colonPos));
            std::string value = trim(stripComment(line.substr(colonPos + 1)));
            if (inHeader) {
                header.emplace_back(tag, value);
            } else {
                // Safe: inHeader is false only after a stanza was pushed.
                stanzas.back().tags.emplace_back(tag, value);
            }
        }
    }

    const std::vector<std::pair<std::string, std::string>>& getHeader() const {
        return header;
    }

    const std::vector<Stanza>& getStanzas() const {
        return stanzas;
    }

    void printProperties() const {
        std::cout << "Header Properties:" << std::endl;
        for (const auto& entry : header) {
            std::cout << entry.first << ": " << entry.second << std::endl;
        }
        std::cout << "\nStanzas:" << std::endl;
        for (size_t i = 0; i < stanzas.size(); ++i) {
            std::cout << "Stanza " << (i + 1) << ": " << stanzas[i].type << std::endl;
            for (const auto& entry : stanzas[i].tags) {
                std::cout << "  " << entry.first << ": " << entry.second << std::endl;
            }
        }
    }

    void write(const std::string& outputFilename) const {
        std::ofstream outFile(outputFilename);
        if (!outFile.is_open()) {
            std::cerr << "Error opening output file: " << outputFilename << std::endl;
            return;
        }
        // Write header
        for (const auto& entry : header) {
            outFile << entry.first << ": " << entry.second << "\n";
        }
        outFile << "\n";
        // Write stanzas, one tag-value pair per line, in original order
        for (const auto& stanza : stanzas) {
            outFile << stanza.type << "\n";
            for (const auto& entry : stanza.tags) {
                outFile << entry.first << ": " << entry.second << "\n";
            }
            outFile << "\n";
        }
    }
};