Task 475: .ODS File Format

1. Properties of the .ODS File Format Intrinsic to Its File System

An .ODS file is an OpenDocument Spreadsheet package: a ZIP archive containing XML files. The properties intrinsic to the format are the metadata stored in meta.xml within the archive, as child elements of <office:meta> (itself a child of the <office:document-meta> root element). These are standardized in the OASIS OpenDocument v1.2 schema and apply to spreadsheet documents; some elements (e.g., document statistics) carry spreadsheet-specific details such as table and cell counts.

  • meta:generator: Identifies the software application that generated the document.
  • dc:title: Specifies the title of the document.
  • dc:description: Provides a textual description of the document content.
  • dc:subject: Indicates the subject or category of the document.
  • meta:keyword: Defines individual keywords associated with the document (multiple instances permitted).
  • meta:initial-creator: Records the name of the initial creator of the document.
  • dc:creator: Specifies the name of the person who last modified the document.
  • meta:printed-by: Identifies the user who last printed the document.
  • meta:creation-date: Records the date and time of document creation (in ISO 8601 format).
  • dc:date: Indicates the date and time of the last modification (in ISO 8601 format).
  • meta:print-date: Records the date and time of the last print operation (in ISO 8601 format).
  • meta:template: References the path or name of the template used for the document.
  • meta:auto-reload: Specifies whether the document is reloaded, or replaced by another document, after a given delay.
  • meta:hyperlink-behaviour: Defines the behavior for handling hyperlinks within the document.
  • dc:language: Indicates the primary language of the document.
  • meta:editing-cycles: Counts the number of editing sessions performed on the document.
  • meta:editing-duration: Measures the total duration of editing time (in ISO 8601 duration format).
  • meta:document-statistic: Provides statistical information about the document, with attributes including:
      • meta:table-count (number of tables, relevant to spreadsheets).
      • meta:cell-count (number of cells, relevant to spreadsheets).
      • meta:row-count (number of rows, relevant to spreadsheets).
      • meta:column-count (number of columns, relevant to spreadsheets).
      • meta:object-count (number of embedded objects).
  • meta:user-defined: Allows custom metadata fields, with a required meta:name attribute for the field name.

These properties are extracted from the official OASIS specification and represent the core metadata embedded in the file structure.
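
To make the archive layout concrete, the following Python sketch builds a tiny in-memory ZIP that contains only a hand-written meta.xml and reads back a few of the properties listed above. The sample XML, its values, and the helper names build_sample_archive and read_meta are illustrative assumptions, not part of the specification; a real .ODS file also carries other entries (such as mimetype and content.xml) that are omitted here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample; all values are made up for illustration.
SAMPLE_META = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <office:meta>
    <dc:title>Budget</dc:title>
    <meta:keyword>finance</meta:keyword>
    <meta:keyword>2024</meta:keyword>
    <meta:document-statistic meta:table-count="2" meta:cell-count="40"/>
    <meta:user-defined meta:name="Project">Alpha</meta:user-defined>
  </office:meta>
</office:document-meta>
"""


def build_sample_archive() -> bytes:
    """Return the bytes of a ZIP archive holding only the sample meta.xml."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("meta.xml", SAMPLE_META)
    return buf.getvalue()


def read_meta(archive_bytes: bytes) -> dict:
    """Extract a handful of metadata properties from an archive's meta.xml."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        root = ET.fromstring(zf.read("meta.xml"))
    meta = root.find("office:meta", NS)
    stats = meta.find("meta:document-statistic", NS)
    # Attribute names are stored in Clark notation: {namespace-uri}local-name
    meta_attr = lambda local: f"{{{NS['meta']}}}{local}"
    return {
        "title": meta.findtext("dc:title", namespaces=NS),
        # meta:keyword may repeat, so collect every instance
        "keywords": [k.text for k in meta.findall("meta:keyword", NS)],
        "table_count": stats.get(meta_attr("table-count")),
        "user_defined": {
            u.get(meta_attr("name")): u.text
            for u in meta.findall("meta:user-defined", NS)
        },
    }


if __name__ == "__main__":
    print(read_meta(build_sample_archive()))
```

Note that the statistics are attributes of meta:document-statistic rather than child elements, and that attribute values come back as strings.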

The following are two direct download links to sample .ODS files, sourced from publicly available repositories for testing purposes:

3. HTML JavaScript for Drag-and-Drop .ODS File Processing

The following is a self-contained HTML document with embedded JavaScript that enables drag-and-drop functionality for .ODS files. It utilizes the JSZip library (included via CDN) to unzip the file, extracts the meta.xml content, parses it using DOMParser, and displays the metadata properties listed above on the screen. Ensure the JSZip library is accessible for proper functioning.

ODS Metadata Dumper
Drag and drop an .ODS file here

4. Python Class for .ODS File Processing

The following Python class uses the zipfile and xml.etree.ElementTree modules to open, decode, read, modify (write), and print the metadata properties from an .ODS file to the console. It assumes the file is a valid .ODS archive.

import zipfile
import xml.etree.ElementTree as ET
from io import BytesIO
from datetime import datetime

class ODSMetadataHandler:
    NAMESPACES = {
        'office': 'urn:oasis:names:tc:opendocument:xmlns:office:1.0',
        'meta': 'urn:oasis:names:tc:opendocument:xmlns:meta:1.0',
        'dc': 'http://purl.org/dc/elements/1.1/'
    }

    def __init__(self, file_path):
        self.file_path = file_path
        self.meta_tree = None
        self.meta_root = None
        self._load_meta()

    def _load_meta(self):
        with zipfile.ZipFile(self.file_path, 'r') as zf:
            if 'meta.xml' in zf.namelist():
                meta_content = zf.read('meta.xml')
                self.meta_tree = ET.parse(BytesIO(meta_content))
                self.meta_root = self.meta_tree.find('office:meta', self.NAMESPACES)
            else:
                raise ValueError('Invalid .ODS file: meta.xml not found.')

    def read_properties(self):
        properties = {}
        elements = [
            'meta:generator', 'dc:title', 'dc:description', 'dc:subject',
            'meta:keyword', 'meta:initial-creator', 'dc:creator', 'meta:printed-by',
            'meta:creation-date', 'dc:date', 'meta:print-date', 'meta:template',
            'meta:auto-reload', 'meta:hyperlink-behaviour', 'dc:language',
            'meta:editing-cycles', 'meta:editing-duration'
        ]
        for tag in elements:
            elem = self.meta_root.find(tag, self.NAMESPACES)
            if elem is not None:
                properties[tag] = elem.text

        stats = self.meta_root.find('meta:document-statistic', self.NAMESPACES)
        if stats is not None:
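            for attr in ['meta:table-count', 'meta:cell-count', 'meta:row-count', 'meta:column-count', 'meta:object-count']:
                # ElementTree stores attribute names in Clark notation ({namespace-uri}local-name),
                # so the prefixed name must be expanded before the lookup
                prefix, local = attr.split(':')
                value = stats.get(f'{{{self.NAMESPACES[prefix]}}}{local}')
                if value is not None:
                    properties[attr] = value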

        user_defined = self.meta_root.findall('meta:user-defined', self.NAMESPACES)
        for i, ud in enumerate(user_defined):
            name = ud.attrib.get('{urn:oasis:names:tc:opendocument:xmlns:meta:1.0}name')
            properties[f'meta:user-defined[{i}] ({name})'] = ud.text

        return properties

    def print_properties(self):
        properties = self.read_properties()
        for key, value in properties.items():
            print(f'{key}: {value}')

    def write_property(self, key, value):
        if key.startswith('meta:user-defined'):
            # Handle user-defined separately if needed; simplified here
            pass
        else:
            elem = self.meta_root.find(key, self.NAMESPACES)
            if elem is not None:
                elem.text = value
            else:
                ns, tag = key.split(':') if ':' in key else ('meta', key)
                ET.SubElement(self.meta_root, f'{{{self.NAMESPACES[ns]}}}{tag}').text = value

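    def save(self, output_path=None):
        output_path = output_path or self.file_path
        # Register the known prefixes so the output keeps office:/meta:/dc: instead of ns0:/ns1:
        # (any other namespaces present in meta.xml still receive generated prefixes)
        for prefix, uri in self.NAMESPACES.items():
            ET.register_namespace(prefix, uri)
        meta_xml = ET.tostring(self.meta_tree.getroot(), encoding='utf-8', xml_declaration=True)
        # Read every entry before writing, because output_path may be the input file itself
        with zipfile.ZipFile(self.file_path, 'r') as zf_in:
            entries = [(item, zf_in.read(item.filename)) for item in zf_in.infolist()]
        with zipfile.ZipFile(output_path, 'w') as zf_out:
            for item, data in entries:
                # Passing the ZipInfo keeps each entry's original compression method
                zf_out.writestr(item, meta_xml if item.filename == 'meta.xml' else data)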

# Example usage:
# handler = ODSMetadataHandler('path/to/file.ods')
# handler.print_properties()
# handler.write_property('dc:title', 'Updated Title')
# handler.save()
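
The class above returns meta:editing-duration as raw text. As a minimal sketch of how that ISO 8601 duration could be turned into a Python timedelta, the hypothetical helper parse_editing_duration below handles only the day, hour, minute, and second designators (for example PT1H23M45S); year, month, and week designators are deliberately rejected because they have no fixed length. The helper name and the regular expression are assumptions made for illustration.

```python
import re
from datetime import timedelta

# Accepts P[nD][T[nH][nM][nS]]; years, months, and weeks are not supported.
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_editing_duration(text: str) -> timedelta:
    """Convert a duration such as 'PT1H23M45S' into a timedelta."""
    match = _DURATION_RE.match(text.strip())
    # Reject non-matching input and strings with no components at all (e.g. 'P' or 'PT')
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"Unsupported duration: {text!r}")
    parts = {name: float(value) for name, value in match.groupdict().items() if value is not None}
    return timedelta(**parts)


if __name__ == "__main__":
    print(parse_editing_duration("PT1H23M45S"))
```

With the class above, the input would be the value stored under the 'meta:editing-duration' key of read_properties(), when that element is present.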

5. Java Class for .ODS File Processing

The following Java class uses java.util.zip and javax.xml.parsers to handle .ODS files, reading, modifying, and printing metadata properties to the console.

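import java.io.*;
import java.util.zip.*;
import javax.xml.parsers.*;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.*;
import org.xml.sax.*;
import java.util.HashMap;
import java.util.Map;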

public class ODSMetadataHandler {
    private static final String[] NAMESPACES = {"office", "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
                                                "meta", "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
                                                "dc", "http://purl.org/dc/elements/1.1/"};
    private String filePath;
    private Document metaDoc;
    private Element metaRoot;

    public ODSMetadataHandler(String filePath) throws Exception {
        this.filePath = filePath;
        loadMeta();
    }

    private void loadMeta() throws Exception {
        try (ZipFile zf = new ZipFile(filePath)) {
            ZipEntry entry = zf.getEntry("meta.xml");
            if (entry == null) throw new IOException("Invalid .ODS file: meta.xml not found.");
            InputStream is = zf.getInputStream(entry);
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            DocumentBuilder db = dbf.newDocumentBuilder();
            metaDoc = db.parse(is);
            metaRoot = (Element) metaDoc.getElementsByTagNameNS(getNamespace("office"), "meta").item(0);
        }
    }

    private String getNamespace(String prefix) {
        for (int i = 0; i < NAMESPACES.length; i += 2) {
            if (NAMESPACES[i].equals(prefix)) return NAMESPACES[i + 1];
        }
        return null;
    }

    public Map<String, String> readProperties() {
        Map<String, String> properties = new HashMap<>();
        String[] tags = {
            "meta:generator", "dc:title", "dc:description", "dc:subject",
            "meta:keyword", "meta:initial-creator", "dc:creator", "meta:printed-by",
            "meta:creation-date", "dc:date", "meta:print-date", "meta:template",
            "meta:auto-reload", "meta:hyperlink-behaviour", "dc:language",
            "meta:editing-cycles", "meta:editing-duration"
        };
        for (String tag : tags) {
            String[] parts = tag.split(":");
            Element elem = (Element) metaRoot.getElementsByTagNameNS(getNamespace(parts[0]), parts[1]).item(0);
            if (elem != null) properties.put(tag, elem.getTextContent());
        }

        Element stats = (Element) metaRoot.getElementsByTagNameNS(getNamespace("meta"), "document-statistic").item(0);
        if (stats != null) {
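            String[] attrs = {"table-count", "cell-count", "row-count", "column-count", "object-count"};
            for (String attr : attrs) {
                // Look attributes up by namespace URI so the check does not depend on the prefix used in the file
                if (stats.hasAttributeNS(getNamespace("meta"), attr)) {
                    properties.put("meta:" + attr, stats.getAttributeNS(getNamespace("meta"), attr));
                }
            }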
        }

        NodeList userDefined = metaRoot.getElementsByTagNameNS(getNamespace("meta"), "user-defined");
        for (int i = 0; i < userDefined.getLength(); i++) {
            Element ud = (Element) userDefined.item(i);
            String name = ud.getAttributeNS(getNamespace("meta"), "name");
            properties.put("meta:user-defined[" + i + "] (" + name + ")", ud.getTextContent());
        }
        return properties;
    }

    public void printProperties() {
        readProperties().forEach((key, value) -> System.out.println(key + ": " + value));
    }

    public void writeProperty(String key, String value) {
        String[] parts = key.split(":");
        Element elem = (Element) metaRoot.getElementsByTagNameNS(getNamespace(parts[0]), parts[1]).item(0);
        if (elem != null) {
            elem.setTextContent(value);
        } else {
            Element newElem = metaDoc.createElementNS(getNamespace(parts[0]), key);
            newElem.setTextContent(value);
            metaRoot.appendChild(newElem);
        }
    }

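    public void save(String outputPath) throws Exception {
        outputPath = outputPath != null ? outputPath : filePath;
        // Write to a temporary file first: outputPath may be the very file still being read
        File tmp = File.createTempFile("ods-meta", ".tmp");
        try (ZipFile zfIn = new ZipFile(filePath);
             ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(tmp))) {
            for (java.util.Enumeration<? extends ZipEntry> entries = zfIn.entries(); entries.hasMoreElements(); ) {
                ZipEntry entry = entries.nextElement();
                zos.putNextEntry(new ZipEntry(entry.getName()));
                if (entry.getName().equals("meta.xml")) {
                    // Serialize the (possibly modified) DOM in place of the original meta.xml
                    javax.xml.transform.TransformerFactory.newInstance().newTransformer().transform(
                        new javax.xml.transform.dom.DOMSource(metaDoc),
                        new javax.xml.transform.stream.StreamResult(zos));
                } else {
                    try (InputStream is = zfIn.getInputStream(entry)) {
                        byte[] buffer = new byte[8192];
                        int len;
                        // read() returns -1 at end of stream
                        while ((len = is.read(buffer)) != -1) {
                            zos.write(buffer, 0, len);
                        }
                    }
                }
                zos.closeEntry();
            }
        }
        // Replace the target only after the new archive has been written completely
        java.nio.file.Files.move(tmp.toPath(), java.nio.file.Paths.get(outputPath),
                java.nio.file.StandardCopyOption.REPLACE_EXISTING);
    }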

    // Example usage:
    // public static void main(String[] args) throws Exception {
    //     ODSMetadataHandler handler = new ODSMetadataHandler("path/to/file.ods");
    //     handler.printProperties();
    //     handler.writeProperty("dc:title", "Updated Title");
    //     handler.save(null);
    // }
}

6. JavaScript Class for .ODS File Processing

The following JavaScript class is designed for a Node.js environment, using the fs module for file operations and jszip for ZIP handling (install via npm install jszip), along with xml2js for XML parsing (install via npm install xml2js). It decodes, reads, modifies, and prints properties to the console.

const fs = require('fs');
const JSZip = require('jszip');
const xml2js = require('xml2js');

class ODSMetadataHandler {
    constructor(filePath) {
        this.filePath = filePath;
        this.doc = null;       // full parsed meta.xml, kept so save() can write namespaces back
        this.metaJson = null;  // shortcut to the office:meta element
    }

    // Constructors cannot be async, so this factory returns a fully loaded handler.
    static async create(filePath) {
        const handler = new ODSMetadataHandler(filePath);
        await handler.loadMeta();
        return handler;
    }

    async loadMeta() {
        const data = fs.readFileSync(this.filePath);
        const zip = await JSZip.loadAsync(data);
        const metaFile = zip.file('meta.xml');
        if (!metaFile) throw new Error('Invalid .ODS file: meta.xml not found.');
        const metaXml = await metaFile.async('string');
        // Attributes stay under "$" and text under "_" so xml2js.Builder can write them back as attributes
        const parser = new xml2js.Parser({ explicitArray: false });
        this.doc = await parser.parseStringPromise(metaXml);
        this.metaJson = this.doc['office:document-meta']['office:meta'];
    }

    readProperties() {
        const properties = {};
        // An element is a plain string, or an object with text in "_" when it also has attributes
        const textOf = node => (node !== null && typeof node === 'object' ? node._ : node);
        const elements = [
            'meta:generator', 'dc:title', 'dc:description', 'dc:subject',
            'meta:keyword', 'meta:initial-creator', 'dc:creator', 'meta:printed-by',
            'meta:creation-date', 'dc:date', 'meta:print-date', 'meta:template',
            'meta:auto-reload', 'meta:hyperlink-behaviour', 'dc:language',
            'meta:editing-cycles', 'meta:editing-duration'
        ];
        elements.forEach(tag => {
            const value = this.metaJson[tag];
            if (value === undefined) return;
            // Repeated elements (e.g. several meta:keyword entries) arrive as an array
            properties[tag] = Array.isArray(value) ? value.map(textOf).join(', ') : textOf(value);
        });

        if (this.metaJson['meta:document-statistic']) {
            const stats = this.metaJson['meta:document-statistic'].$ || {};
            ['meta:table-count', 'meta:cell-count', 'meta:row-count', 'meta:column-count', 'meta:object-count'].forEach(attr => {
                if (stats[attr] !== undefined) properties[attr] = stats[attr];
            });
        }

        if (this.metaJson['meta:user-defined']) {
            const userDefined = Array.isArray(this.metaJson['meta:user-defined']) ? this.metaJson['meta:user-defined'] : [this.metaJson['meta:user-defined']];
            userDefined.forEach((ud, i) => {
                const name = ud !== null && typeof ud === 'object' && ud.$ ? ud.$['meta:name'] : undefined;
                properties[`meta:user-defined[${i}] (name: ${name})`] = textOf(ud);
            });
        }
        return properties;
    }

    printProperties() {
        const properties = this.readProperties();
        Object.entries(properties).forEach(([key, value]) => console.log(`${key}: ${value}`));
    }

    writeProperty(key, value) {
        // Simplified: handles single text elements; keeps existing attributes, ignores repeated elements
        const existing = this.metaJson[key];
        if (existing !== null && typeof existing === 'object' && !Array.isArray(existing)) {
            existing._ = value;
        } else {
            this.metaJson[key] = value;
        }
    }

    async save(outputPath = this.filePath) {
        // Rebuild from the full parsed document so the root element keeps its xmlns declarations
        const builder = new xml2js.Builder({ xmldec: { version: '1.0', encoding: 'UTF-8' } });
        const metaXml = builder.buildObject(this.doc);

        const data = fs.readFileSync(this.filePath);
        const zip = await JSZip.loadAsync(data);
        zip.file('meta.xml', metaXml);
        const updatedZip = await zip.generateAsync({ type: 'nodebuffer' });
        fs.writeFileSync(outputPath, updatedZip);
    }
}

// Example usage:
// (async () => {
//     const handler = await ODSMetadataHandler.create('path/to/file.ods');
//     handler.printProperties();
//     handler.writeProperty('dc:title', 'Updated Title');
//     await handler.save();
// })();

7. C Implementation for .ODS File Processing

Since C does not support classes natively, the following implementation uses a struct with associated functions to open, decode, read, modify (write), and print metadata properties from an .ODS file to the console. It requires external libraries libzip for ZIP handling and libxml2 for XML parsing (compile with -lzip -lxml2).

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zip.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

#define NAMESPACE_OFFICE "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
#define NAMESPACE_META "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
#define NAMESPACE_DC "http://purl.org/dc/elements/1.1/"

typedef struct {
    char *file_path;
    xmlDocPtr meta_doc;
    xmlNodePtr meta_root;
} ODSMetadataHandler;

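// Declared up front because ods_create calls it before its definition
void ods_load_meta(ODSMetadataHandler *handler);

ODSMetadataHandler* ods_create(const char *file_path) {
    ODSMetadataHandler *handler = malloc(sizeof(ODSMetadataHandler));
    if (!handler) return NULL;
    handler->file_path = strdup(file_path);
    handler->meta_doc = NULL;
    handler->meta_root = NULL;
    ods_load_meta(handler);
    return handler;
}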

void ods_load_meta(ODSMetadataHandler *handler) {
    zip_t *zf = zip_open(handler->file_path, ZIP_RDONLY, NULL);
    if (!zf) {
        fprintf(stderr, "Error opening ZIP file\n");
        return;
    }
    zip_file_t *meta_file = zip_fopen(zf, "meta.xml", 0);
    if (!meta_file) {
        fprintf(stderr, "meta.xml not found\n");
        zip_close(zf);
        return;
    }

    zip_stat_t stat;
    zip_stat(zf, "meta.xml", 0, &stat);
    char *buffer = malloc(stat.size + 1);
    zip_fread(meta_file, buffer, stat.size);
    buffer[stat.size] = '\0';

    handler->meta_doc = xmlReadMemory(buffer, stat.size, NULL, NULL, 0);
    if (!handler->meta_doc) {
        fprintf(stderr, "Error parsing XML\n");
    } else {
        // Walk the children of office:document-meta until the office:meta element is found
        xmlNodePtr root = xmlDocGetRootElement(handler->meta_doc);
        handler->meta_root = root ? root->children : NULL;
        while (handler->meta_root &&
               (handler->meta_root->type != XML_ELEMENT_NODE ||
                strcmp((char*)handler->meta_root->name, "meta") != 0)) {
            handler->meta_root = handler->meta_root->next;
        }
    }

    free(buffer);
    zip_fclose(meta_file);
    zip_close(zf);
}

void ods_read_and_print_properties(ODSMetadataHandler *handler) {
    if (!handler->meta_root) return;

    const char *tags[] = {
        "generator", "title", "description", "subject", "keyword", "initial-creator",
        "creator", "printed-by", "creation-date", "date", "print-date", "template",
        "auto-reload", "hyperlink-behaviour", "language", "editing-cycles", "editing-duration", NULL
    };
    const char *ns[] = {NAMESPACE_META, NAMESPACE_DC, NAMESPACE_DC, NAMESPACE_DC, NAMESPACE_META, NAMESPACE_META,
                         NAMESPACE_DC, NAMESPACE_META, NAMESPACE_META, NAMESPACE_DC, NAMESPACE_META, NAMESPACE_META,
                         NAMESPACE_META, NAMESPACE_META, NAMESPACE_DC, NAMESPACE_META, NAMESPACE_META};

    for (int i = 0; tags[i]; i++) {
        xmlNodePtr node = handler->meta_root->children;
        while (node) {
            // node->ns is NULL for elements without a namespace, so check it before dereferencing
            if (node->type == XML_ELEMENT_NODE && xmlStrcmp(node->name, (xmlChar*)tags[i]) == 0 &&
                node->ns && xmlStrcmp(node->ns->href, (xmlChar*)ns[i]) == 0) {
                xmlChar *content = xmlNodeGetContent(node);
                printf("%s: %s\n", tags[i], content);
                xmlFree(content);
                break;
            }
            node = node->next;
        }
    }

    // Statistics
    xmlNodePtr stats = handler->meta_root->children;
    while (stats) {
        if (stats->type == XML_ELEMENT_NODE && xmlStrcmp(stats->name, (xmlChar*)"document-statistic") == 0) {
            const char *attrs[] = {"table-count", "cell-count", "row-count", "column-count", "object-count", NULL};
            for (int j = 0; attrs[j]; j++) {
                xmlChar *val = xmlGetNsProp(stats, (xmlChar*)attrs[j], (xmlChar*)NAMESPACE_META);
                if (val) {
                    printf("meta:%s: %s\n", attrs[j], val);
                    xmlFree(val);
                }
            }
            break;
        }
        stats = stats->next;
    }

    // User-defined
    xmlNodePtr ud = handler->meta_root->children;
    int index = 0;
    while (ud) {
        if (ud->type == XML_ELEMENT_NODE && xmlStrcmp(ud->name, (xmlChar*)"user-defined") == 0) {
            xmlChar *name = xmlGetNsProp(ud, (xmlChar*)"name", (xmlChar*)NAMESPACE_META);
            xmlChar *content = xmlNodeGetContent(ud);
            printf("meta:user-defined[%d] (name: %s): %s\n", index++, name ? (char*)name : "", content);
            if (name) xmlFree(name);
            xmlFree(content);
        }
        ud = ud->next;
    }
}

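void ods_write_property(ODSMetadataHandler *handler, const char *key, const char *value) {
    // Simplified: key is the local tag name without a prefix; the element is created if missing
    if (!handler->meta_root) return;
    xmlNodePtr node = handler->meta_root->children;
    while (node) {
        if (node->type == XML_ELEMENT_NODE && xmlStrcmp(node->name, (xmlChar*)key) == 0) {
            xmlNodeSetContent(node, (xmlChar*)value);
            return;
        }
        node = node->next;
    }
    // Dublin Core elements belong to the dc namespace; the other elements handled here are in meta.
    // Passing a NULL namespace to xmlNewChild would make the new element inherit office: from its parent.
    const char *dc_tags[] = {"title", "description", "subject", "creator", "date", "language", NULL};
    const char *href = NAMESPACE_META;
    for (int i = 0; dc_tags[i]; i++) {
        if (strcmp(key, dc_tags[i]) == 0) href = NAMESPACE_DC;
    }
    xmlNsPtr ns = xmlSearchNsByHref(handler->meta_doc, handler->meta_root, (xmlChar*)href);
    xmlNewChild(handler->meta_root, ns, (xmlChar*)key, (xmlChar*)value);
}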

void ods_save(ODSMetadataHandler *handler, const char *output_path) {
    const char *out = output_path ? output_path : handler->file_path;
    zip_t *zf_in = zip_open(handler->file_path, ZIP_RDONLY, NULL);
    zip_t *zf_out = zip_open(out, ZIP_CREATE | ZIP_TRUNCATE, NULL);
    if (!zf_in || !zf_out) {
        fprintf(stderr, "Error opening ZIP file for saving\n");
        if (zf_in) zip_close(zf_in);
        if (zf_out) zip_discard(zf_out);
        return;
    }

    // Serialize once; libzip reads this buffer during zip_close, so it is freed only afterwards
    xmlChar *xml_buff = NULL;
    int xml_size = 0;
    xmlDocDumpFormatMemory(handler->meta_doc, &xml_buff, &xml_size, 1);

    zip_int64_t num_entries = zip_get_num_entries(zf_in, 0);
    for (zip_int64_t i = 0; i < num_entries; i++) {
        const char *name = zip_get_name(zf_in, i, 0);
        zip_source_t *src;
        if (strcmp(name, "meta.xml") == 0) {
            src = zip_source_buffer(zf_out, xml_buff, xml_size, 0);
        } else {
            zip_stat_t stat;
            zip_stat_index(zf_in, i, 0, &stat);
            char *buffer = malloc(stat.size ? stat.size : 1);
            zip_file_t *file_in = zip_fopen_index(zf_in, i, 0);
            if (file_in) {
                zip_fread(file_in, buffer, stat.size);
                zip_fclose(file_in);
            }
            // The final argument 1 hands ownership of buffer to libzip, which frees it
            src = zip_source_buffer(zf_out, buffer, stat.size, 1);
        }
        if (zip_file_add(zf_out, name, src, ZIP_FL_OVERWRITE) < 0) zip_source_free(src);
    }

    zip_close(zf_out);
    zip_close(zf_in);
    xmlFree(xml_buff);
}

void ods_destroy(ODSMetadataHandler *handler) {
    xmlFreeDoc(handler->meta_doc);
    free(handler->file_path);
    free(handler);
}

// Example usage:
// int main() {
//     ODSMetadataHandler *handler = ods_create("path/to/file.ods");
//     ods_read_and_print_properties(handler);
//     ods_write_property(handler, "title", "Updated Title");
//     ods_save(handler, NULL);
//     ods_destroy(handler);
//     return 0;
// }