Task 474: .ODP File Format

1. List of Properties Intrinsic to the .ODP File Format

The .ODP file format adheres to the OpenDocument Format (ODF) specification, as defined by OASIS and ISO/IEC 26300. It is a ZIP-compressed archive containing XML files, with metadata properties stored primarily in the meta.xml file within the archive. These properties are intrinsic to the format's structure and are used for document identification, authorship, and statistics. Based on the ODF specification (version 1.2 and later), the predefined metadata elements in meta.xml are as follows:

  • Title (<dc:title>): Represents the document's title, typically displayed in application interfaces.
  • Subject (<dc:subject>): Describes the document's topic or keywords for categorization.
  • Description (<dc:description>): Provides a textual summary or comments about the document's content.
  • Creator (<dc:creator>): Identifies the last user who modified the document.
  • Date (<dc:date>): Records the date and time of the last modification, in ISO 8601 format.
  • Language (<dc:language>): Specifies the primary language of the document, using a language code (e.g., "en-US").
  • Generator (<meta:generator>): Indicates the software application and version that generated or last saved the document.
  • Initial Creator (<meta:initial-creator>): Identifies the user who originally created the document.
  • Creation Date (<meta:creation-date>): Records the date and time of the document's creation, in ISO 8601 format.
  • Keywords (multiple <meta:keyword> elements): A list of keywords associated with the document for search and indexing purposes.
  • Editing Cycles (<meta:editing-cycles>): Indicates the number of times the document has been edited (revision count).
  • Editing Duration (<meta:editing-duration>): Represents the total time spent editing the document, in ISO 8601 duration format (e.g., "PT1H28M55S").
  • User-Defined Fields (multiple <meta:user-defined meta:name="..."> elements): Custom metadata fields defined by the user, consisting of a name attribute and corresponding value.
  • Document Statistics (<meta:document-statistic> with attributes): Provides counts of structural elements, including:
      • meta:page-count: Number of pages or slides.
      • meta:paragraph-count: Number of paragraphs.
      • meta:word-count: Number of words.
      • meta:character-count: Number of characters.
      • meta:image-count: Number of images.
      • meta:object-count: Number of embedded objects.
      • meta:table-count: Number of tables.
      • Other format-specific counts (e.g., meta:draw-count for drawings in presentations).

These properties are extracted from the <office:meta> element in meta.xml and utilize namespaces such as Dublin Core (dc:) and ODF metadata (meta:).
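
As a rough illustration of how these elements look in practice, the sketch below parses a small meta.xml fragment and converts the typed fields (editing cycles, editing duration, statistics) into Python values. The sample XML, the `summarize` function, and the `parse_duration` helper are all hypothetical and written for this example; the duration parser only covers the day/hour/minute/second forms, not the full ISO 8601 grammar.

```python
import re
import xml.etree.ElementTree as ET
from datetime import timedelta

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
}

# Hypothetical meta.xml content, invented for illustration only
SAMPLE_META = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0">
  <office:meta>
    <dc:title>Quarterly Review</dc:title>
    <meta:keyword>sales</meta:keyword>
    <meta:keyword>q3</meta:keyword>
    <meta:editing-cycles>4</meta:editing-cycles>
    <meta:editing-duration>PT1H28M55S</meta:editing-duration>
    <meta:user-defined meta:name="Department">Finance</meta:user-defined>
    <meta:document-statistic meta:object-count="12" meta:page-count="3"/>
  </office:meta>
</office:document-meta>"""

# Simplified pattern: days, hours, minutes, seconds only (an assumption of this sketch)
_DURATION = re.compile(
    r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$"
)


def parse_duration(text):
    """Convert a duration such as 'PT1H28M55S' into a timedelta."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = (float(g) if g else 0.0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def summarize(meta_xml_bytes):
    """Extract a few typed values from the bytes of a meta.xml file."""
    root = ET.fromstring(meta_xml_bytes)
    meta = root.find("office:meta", NS)
    prefix = "{" + NS["meta"] + "}"

    # Statistics are attributes in the meta: namespace; strip the namespace part
    stat = meta.find("meta:document-statistic", NS)
    statistics = {}
    if stat is not None:
        for name, value in stat.attrib.items():
            if name.startswith(prefix):
                statistics[name[len(prefix):]] = int(value)

    return {
        "title": meta.findtext("dc:title", default="", namespaces=NS),
        "keywords": [k.text or "" for k in meta.findall("meta:keyword", NS)],
        "editing_cycles": int(
            meta.findtext("meta:editing-cycles", default="0", namespaces=NS)
        ),
        "editing_duration": parse_duration(
            meta.findtext("meta:editing-duration", default="PT0S", namespaces=NS)
        ),
        "user_defined": {
            u.get(prefix + "name"): u.text or ""
            for u in meta.findall("meta:user-defined", NS)
        },
        "statistics": statistics,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_META.encode("utf-8")))
```

Converting the duration and counts to native types makes them easier to compare or aggregate, whereas the raw strings are enough when the goal is only to display the metadata.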

2. Sample .ODP Files

The following are two direct download links for sample .ODP files, sourced from publicly available repositories for testing purposes:

These files can be downloaded directly and used to verify compatibility with .ODP handling tools.
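
Before feeding a downloaded sample to other tools, it can help to confirm that it really is an ODF presentation package. The sketch below is one possible check, not a full validator: it relies on the ODF packaging convention that a `mimetype` entry comes first in the archive, is stored uncompressed, and holds the presentation media type. The function names `check_odp` and `build_sample` are hypothetical, and `build_sample` only creates a tiny in-memory stand-in so the example runs without a real file.

```python
import io
import zipfile

ODP_MIMETYPE = "application/vnd.oasis.opendocument.presentation"


def check_odp(source):
    """Return a list of problems; an empty list means the basic checks passed.

    `source` may be a file path or a binary file-like object.
    """
    if not zipfile.is_zipfile(source):
        return ["not a ZIP archive"]

    problems = []
    with zipfile.ZipFile(source) as zf:
        entries = zf.infolist()
        names = [entry.filename for entry in entries]

        # ODF packages put an uncompressed 'mimetype' entry first
        if not entries or entries[0].filename != "mimetype":
            problems.append("'mimetype' is not the first entry")
        elif entries[0].compress_type != zipfile.ZIP_STORED:
            problems.append("'mimetype' is compressed")

        if "mimetype" in names:
            value = zf.read("mimetype").decode("ascii", "replace").strip()
            if value != ODP_MIMETYPE:
                problems.append(f"unexpected mimetype: {value!r}")

        # meta.xml is the file the property readers in this document rely on
        if "meta.xml" not in names:
            problems.append("missing meta.xml")
    return problems


def build_sample(mimetype=ODP_MIMETYPE):
    """Build a minimal in-memory archive for demonstration (not a usable presentation)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("mimetype", mimetype, compress_type=zipfile.ZIP_STORED)
        zf.writestr("meta.xml", '<?xml version="1.0"?><placeholder/>')
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(check_odp(build_sample()))
    print(check_odp(build_sample("text/plain")))
```

For a real download, `check_odp("sample.odp")` would be called with the path of the saved file.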

3. HTML/JavaScript Code for Drag-and-Drop .ODP Property Dump

The following is a self-contained HTML document with embedded JavaScript that can be embedded in a Ghost blog or similar platform. It allows users to drag and drop a .ODP file, unzips it using the JSZip library (loaded from a CDN), parses the meta.xml file, extracts the properties listed in section 1, and displays them on the screen. The code assumes browser support for FileReader and DOMParser.

ODP Property Dumper
Drag and drop a .ODP file here

4. Python Class for .ODP File Handling

The following Python class uses the standard library (zipfile and xml.etree.ElementTree) to open a .ODP file, decode and read the properties from meta.xml, print them to the console, and support writing modifications back to a new file.

import zipfile
import xml.etree.ElementTree as ET
from io import BytesIO

class ODPFile:
    def __init__(self, path):
        self.path = path
        self.zip = zipfile.ZipFile(path, 'r')
        self.tree = None
        self.properties = self._read_properties()

    def _read_properties(self):
        with self.zip.open('meta.xml') as f:
            self.tree = ET.parse(f)
        meta = self.tree.find('{urn:oasis:names:tc:opendocument:xmlns:office:1.0}meta')
        ns = {
            'dc': 'http://purl.org/dc/elements/1.1/',
            'meta': 'urn:oasis:names:tc:opendocument:xmlns:meta:1.0'
        }
        def text(tag):
            # findtext returns '' for a missing or empty element, so values are never None
            return meta.findtext(tag, default='', namespaces=ns)

        name_attr = '{urn:oasis:names:tc:opendocument:xmlns:meta:1.0}name'
        stat = meta.find('meta:document-statistic', ns)
        properties = {
            'title': text('dc:title'),
            'subject': text('dc:subject'),
            'description': text('dc:description'),
            'creator': text('dc:creator'),
            'date': text('dc:date'),
            'language': text('dc:language'),
            'generator': text('meta:generator'),
            'initial_creator': text('meta:initial-creator'),
            'creation_date': text('meta:creation-date'),
            'keywords': [k.text or '' for k in meta.findall('meta:keyword', ns)],
            'editing_cycles': text('meta:editing-cycles'),
            'editing_duration': text('meta:editing-duration'),
            'user_defined': {ud.get(name_attr, ''): ud.text or '' for ud in meta.findall('meta:user-defined', ns)},
            'statistics': dict(stat.attrib) if stat is not None else {}
        }
        return properties

    def print_properties(self):
        print("Properties:")
        for key, value in self.properties.items():
            if isinstance(value, list):
                print(f"{key}: {', '.join(value)}")
            elif isinstance(value, dict):
                print(f"{key}:")
                for sub_key, sub_value in value.items():
                    print(f"  {sub_key}: {sub_value}")
            else:
                print(f"{key}: {value}")

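    def set_property(self, key, value):
        # Accepts 'dc:title', 'meta:generator', or a bare name such as 'title' or 'initial_creator'
        ns = {
            'dc': 'http://purl.org/dc/elements/1.1/',
            'meta': 'urn:oasis:names:tc:opendocument:xmlns:meta:1.0'
        }
        dc_names = {'title', 'subject', 'description', 'creator', 'date', 'language'}
        if ':' in key:
            prefix, local = key.split(':', 1)
        else:
            local = key.replace('_', '-')
            prefix = 'dc' if local in dc_names else 'meta'
        meta = self.tree.find('{urn:oasis:names:tc:opendocument:xmlns:office:1.0}meta')
        element = meta.find(f"{{{ns[prefix]}}}{local}")
        if element is None:
            # Create the element when the document does not contain it yet
            element = ET.SubElement(meta, f"{{{ns[prefix]}}}{local}")
        element.text = value
        # Handles single-valued elements only; extend for keywords, user-defined fields, and statistics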

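    def save(self, new_path):
        # Register the prefixes so ElementTree writes office:/dc:/meta: rather than ns0:/ns1:
        ET.register_namespace('office', 'urn:oasis:names:tc:opendocument:xmlns:office:1.0')
        ET.register_namespace('dc', 'http://purl.org/dc/elements/1.1/')
        ET.register_namespace('meta', 'urn:oasis:names:tc:opendocument:xmlns:meta:1.0')
        with zipfile.ZipFile(new_path, 'w', zipfile.ZIP_DEFLATED) as new_zip:
            for item in self.zip.infolist():
                if item.filename == 'meta.xml':
                    meta_io = BytesIO()
                    self.tree.write(meta_io, encoding='utf-8', xml_declaration=True)
                    new_zip.writestr('meta.xml', meta_io.getvalue())
                else:
                    # Passing the ZipInfo keeps each entry's position and compression method,
                    # so 'mimetype' stays first and uncompressed as ODF packaging expects
                    new_zip.writestr(item, self.zip.read(item.filename))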

# Example usage:
# odp = ODPFile('sample.odp')
# odp.print_properties()
# odp.set_property('title', 'New Title')
# odp.save('modified.odp')
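
The class above leaves multi-valued properties to the reader. One way to extend it for keywords and user-defined fields is sketched below; `set_keywords` and `set_user_defined` are hypothetical helpers that operate on the root element of a parsed meta.xml (with the class above, that would be `odp.tree.getroot()`). The inline sample document is made up so the block runs on its own.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def _office_meta(root):
    """Return the <office:meta> child of the document root."""
    meta = root.find(f"{{{OFFICE_NS}}}meta")
    if meta is None:
        raise ValueError("no <office:meta> element found")
    return meta


def set_keywords(root, keywords):
    """Replace every <meta:keyword> element with one element per given keyword."""
    meta = _office_meta(root)
    for old in meta.findall(f"{{{META_NS}}}keyword"):
        meta.remove(old)
    for word in keywords:
        ET.SubElement(meta, f"{{{META_NS}}}keyword").text = word


def set_user_defined(root, name, value):
    """Update the <meta:user-defined> field called `name`, creating it if absent."""
    meta = _office_meta(root)
    name_attr = f"{{{META_NS}}}name"
    for field in meta.findall(f"{{{META_NS}}}user-defined"):
        if field.get(name_attr) == name:
            field.text = value
            return
    field = ET.SubElement(meta, f"{{{META_NS}}}user-defined", {name_attr: name})
    field.text = value


# Hypothetical minimal document used only to exercise the helpers
SAMPLE = (
    f'<office:document-meta xmlns:office="{OFFICE_NS}" xmlns:meta="{META_NS}">'
    "<office:meta>"
    "<meta:keyword>old</meta:keyword>"
    '<meta:user-defined meta:name="Reviewer">Ann</meta:user-defined>'
    "</office:meta>"
    "</office:document-meta>"
)

if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    set_keywords(root, ["budget", "forecast"])
    set_user_defined(root, "Reviewer", "Bo")
    print(ET.tostring(root, encoding="unicode"))
```

Replacing the whole keyword list in one call avoids having to track which individual keywords were added or removed.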

5. Java Class for .ODP File Handling

The following Java class uses java.util.zip and javax.xml.parsers to open a .ODP file, decode and read the properties from meta.xml, print them to the console, and support writing modifications back to a new file.

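import java.io.*;
import java.util.*;
import java.util.zip.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.*;
import org.xml.sax.SAXException;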

public class ODPFile {
    private String path;
    private ZipFile zip;
    private Document doc;
    private Map<String, Object> properties;

    public ODPFile(String path) throws IOException, ParserConfigurationException, SAXException {
        this.path = path;
        this.zip = new ZipFile(path);
        this.properties = readProperties();
    }

    private Map<String, Object> readProperties() throws IOException, ParserConfigurationException, SAXException {
        ZipEntry entry = zip.getEntry("meta.xml");
        InputStream is = zip.getInputStream(entry);
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        doc = db.parse(is);
        Element meta = (Element) doc.getElementsByTagNameNS("urn:oasis:names:tc:opendocument:xmlns:office:1.0", "meta").item(0);

        Map<String, Object> props = new HashMap<>();
        props.put("title", getText(meta, "http://purl.org/dc/elements/1.1/", "title"));
        props.put("subject", getText(meta, "http://purl.org/dc/elements/1.1/", "subject"));
        props.put("description", getText(meta, "http://purl.org/dc/elements/1.1/", "description"));
        props.put("creator", getText(meta, "http://purl.org/dc/elements/1.1/", "creator"));
        props.put("date", getText(meta, "http://purl.org/dc/elements/1.1/", "date"));
        props.put("language", getText(meta, "http://purl.org/dc/elements/1.1/", "language"));
        props.put("generator", getText(meta, "urn:oasis:names:tc:opendocument:xmlns:meta:1.0", "generator"));
        props.put("initial_creator", getText(meta, "urn:oasis:names:tc:opendocument:xmlns:meta:1.0", "initial-creator"));
        props.put("creation_date", getText(meta, "urn:oasis:names:tc:opendocument:xmlns:meta:1.0", "creation-date"));
        
        List<String> keywords = new ArrayList<>();
        NodeList kwList = meta.getElementsByTagNameNS("urn:oasis:names:tc:opendocument:xmlns:meta:1.0", "keyword");
        for (int i = 0; i < kwList.getLength(); i++) {
            keywords.add(kwList.item(i).getTextContent());
        }
        props.put("keywords", keywords);

        props.put("editing_cycles", getText(meta, "urn:oasis:names:tc:opendocument:xmlns:meta:1.0", "editing-cycles"));
        props.put("editing_duration", getText(meta, "urn:oasis:names:tc:opendocument:xmlns:meta:1.0", "editing-duration"));
        
        Map<String, String> userDefined = new HashMap<>();
        NodeList udList = meta.getElementsByTagNameNS("urn:oasis:names:tc:opendocument:xmlns:meta:1.0", "user-defined");
        for (int i = 0; i < udList.getLength(); i++) {
            Element ud = (Element) udList.item(i);
            userDefined.put(ud.getAttributeNS("urn:oasis:names:tc:opendocument:xmlns:meta:1.0", "name"), ud.getTextContent());
        }
        props.put("user_defined", userDefined);
        
        Map<String, String> stats = new HashMap<>();
        Element statElem = (Element) meta.getElementsByTagNameNS("urn:oasis:names:tc:opendocument:xmlns:meta:1.0", "document-statistic").item(0);
        if (statElem != null) {
            NamedNodeMap attrs = statElem.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr attr = (Attr) attrs.item(i);
                stats.put(attr.getLocalName(), attr.getValue());
            }
        }
        props.put("statistics", stats);

        return props;
    }

    private String getText(Element parent, String ns, String tag) {
        Node node = parent.getElementsByTagNameNS(ns, tag).item(0);
        return node != null ? node.getTextContent() : "";
    }

    public void printProperties() {
        System.out.println("Properties:");
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            String key = entry.getKey();
            Object value = entry.getValue();
            if (value instanceof List) {
                System.out.println(key + ": " + String.join(", ", (List<String>) value));
            } else if (value instanceof Map) {
                System.out.println(key + ":");
                ((Map<String, String>) value).forEach((subKey, subValue) -> System.out.println("  " + subKey + ": " + subValue));
            } else {
                System.out.println(key + ": " + value);
            }
        }
    }

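    public void setProperty(String key, String value) {
        // Accepts "dc:title", "meta:generator", or a bare Dublin Core name such as "title"
        String tag = key.replace("dc:", "").replace("meta:", "");
        boolean isDc = key.startsWith("dc:") || (!key.startsWith("meta:")
                && Arrays.asList("title", "subject", "description", "creator", "date", "language").contains(tag));
        String ns = isDc ? "http://purl.org/dc/elements/1.1/" : "urn:oasis:names:tc:opendocument:xmlns:meta:1.0";
        Node node = doc.getElementsByTagNameNS(ns, tag).item(0);
        if (node != null) {
            node.setTextContent(value);
        }
        // Handles single-valued elements only; extend for keywords, user-defined fields, and statistics
    }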

    public void save(String newPath) throws IOException, TransformerException {
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer transformer = tf.newTransformer();
        DOMSource source = new DOMSource(doc);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        transformer.transform(source, new StreamResult(baos));

        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(newPath))) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.getName().equals("meta.xml")) {
                    ZipEntry newEntry = new ZipEntry("meta.xml");
                    zos.putNextEntry(newEntry);
                    zos.write(baos.toByteArray());
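                } else {
                    ZipEntry newEntry = new ZipEntry(entry.getName());
                    if (entry.getMethod() == ZipEntry.STORED) {
                        // Keep stored entries uncompressed; ODF expects this for 'mimetype'
                        newEntry.setMethod(ZipEntry.STORED);
                        newEntry.setSize(entry.getSize());
                        newEntry.setCompressedSize(entry.getSize());
                        newEntry.setCrc(entry.getCrc());
                    }
                    zos.putNextEntry(newEntry);
                    try (InputStream is = zip.getInputStream(entry)) {
                        byte[] buffer = new byte[8192];
                        int len;
                        while ((len = is.read(buffer)) != -1) {
                            zos.write(buffer, 0, len);
                        }
                    }
                }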
                zos.closeEntry();
            }
        }
    }

    // Example usage:
    // public static void main(String[] args) throws Exception {
    //     ODPFile odp = new ODPFile("sample.odp");
    //     odp.printProperties();
    //     odp.setProperty("title", "New Title");
    //     odp.save("modified.odp");
    // }
}

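Note: The save method depends on javax.xml.transform (TransformerFactory, Transformer, TransformerException), javax.xml.transform.dom.DOMSource, and javax.xml.transform.stream.StreamResult, so those imports must be present.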

6. JavaScript Class for .ODP File Handling

The following JavaScript class is designed for Node.js, using fs for file I/O and jszip for ZIP handling (install via npm install jszip), along with xml2js for XML parsing (install via npm install xml2js). It opens a .ODP file, decodes and reads the properties from meta.xml, prints them to the console, and supports writing modifications back to a new file.

const fs = require('fs');
const JSZip = require('jszip');
const xml2js = require('xml2js');

// Maps the property names used by this class to the element names in meta.xml
const SIMPLE_FIELDS = {
    title: 'dc:title',
    subject: 'dc:subject',
    description: 'dc:description',
    creator: 'dc:creator',
    date: 'dc:date',
    language: 'dc:language',
    generator: 'meta:generator',
    initial_creator: 'meta:initial-creator',
    creation_date: 'meta:creation-date',
    editing_cycles: 'meta:editing-cycles',
    editing_duration: 'meta:editing-duration'
};

// xml2js yields a string for text-only elements and { _: text, $: attrs } when attributes exist
const textOf = (node) => (node && typeof node === 'object' ? node._ || '' : node || '');
const asArray = (node) => (node === undefined ? [] : Array.isArray(node) ? node : [node]);

class ODPFile {
    constructor(path) {
        this.path = path;
        this.properties = {};
        this.metaDoc = null;  // parsed meta.xml, kept so save() can write it back
        this.zipData = null;
    }

    // Constructors cannot await, so callers must run load() before reading properties
    async load() {
        this.zipData = await fs.promises.readFile(this.path);
        const zip = await JSZip.loadAsync(this.zipData);
        const metaXml = await zip.file('meta.xml').async('string');
        // Attributes stay under '$' (no mergeAttrs) so the Builder can emit them as attributes again
        const parser = new xml2js.Parser({ explicitArray: false });
        this.metaDoc = await parser.parseStringPromise(metaXml);
        const meta = this.metaDoc['office:document-meta']['office:meta'];

        this.properties = {};
        for (const [name, tag] of Object.entries(SIMPLE_FIELDS)) {
            this.properties[name] = textOf(meta[tag]);
        }
        this.properties.keywords = asArray(meta['meta:keyword']).map(textOf);
        this.properties.user_defined = {};
        for (const ud of asArray(meta['meta:user-defined'])) {
            const name = ud && ud.$ ? ud.$['meta:name'] : undefined;
            if (name !== undefined) this.properties.user_defined[name] = textOf(ud);
        }
        const stat = meta['meta:document-statistic'];
        this.properties.statistics = stat && stat.$ ? { ...stat.$ } : {};
    }

    printProperties() {
        console.log('Properties:');
        for (const [key, value] of Object.entries(this.properties)) {
            if (Array.isArray(value)) {
                console.log(`${key}: ${value.join(', ')}`);
            } else if (typeof value === 'object') {
                console.log(`${key}:`);
                for (const [subKey, subValue] of Object.entries(value)) {
                    console.log(`  ${subKey}: ${subValue}`);
                }
            } else {
                console.log(`${key}: ${value}`);
            }
        }
    }

    setProperty(key, value) {
        // Only the single-valued fields in SIMPLE_FIELDS are supported; extend for lists and maps
        const tag = SIMPLE_FIELDS[key];
        if (!tag) throw new Error(`Unsupported property: ${key}`);
        this.metaDoc['office:document-meta']['office:meta'][tag] = value;
        this.properties[key] = value;
    }

    async save(newPath) {
        // Rebuild meta.xml from the parsed document so namespaces and untouched elements survive
        const builder = new xml2js.Builder({ renderOpts: { 'pretty': true, 'indent': '  ', 'newline': '\n' } });
        const newMetaXml = builder.buildObject(this.metaDoc);

        const zip = await JSZip.loadAsync(this.zipData);
        zip.file('meta.xml', newMetaXml);
        const newBuffer = await zip.generateAsync({ type: 'nodebuffer' });
        await fs.promises.writeFile(newPath, newBuffer);
    }
}

// Example usage:
// (async () => {
//     const odp = new ODPFile('sample.odp');
//     await odp.load();
//     odp.printProperties();
//     odp.setProperty('title', 'New Title');
//     await odp.save('modified.odp');
// })();

Note: The save method rebuilds meta.xml from the parsed xml2js object, so element prefixes and attributes are carried over; whitespace and formatting may differ from the original file.

7. C++ Class for .ODP File Handling

The following C++ class uses minizip for ZIP handling (from zlib contrib) and tinyxml2 for XML parsing (include via external libraries). It opens a .ODP file, decodes and reads the properties from meta.xml, prints them to the console, and supports writing modifications back to a new file. Assume minizip and tinyxml2 are linked.

#include <iostream>
#include <string>
#include <map>
#include <vector>
#include <fstream>
#include "unzip.h"  // From minizip
#include "zip.h"    // From minizip
#include "tinyxml2.h"

class ODPFile {
private:
    std::string path;
    tinyxml2::XMLDocument doc;
    std::map<std::string, std::string> simpleProps;
    std::vector<std::string> keywords;
    std::map<std::string, std::string> userDefined;
    std::map<std::string, std::string> statistics;

    void readProperties() {
        unzFile uf = unzOpen(path.c_str());
        if (uf == nullptr) return;

        if (unzLocateFile(uf, "meta.xml", 0) != UNZ_OK) {
            unzClose(uf);
            return;
        }
        unz_file_info fileInfo;
        unzGetCurrentFileInfo(uf, &fileInfo, nullptr, 0, nullptr, 0, nullptr, 0);

        std::vector<char> buffer(fileInfo.uncompressed_size);
        unzOpenCurrentFile(uf);
        unzReadCurrentFile(uf, buffer.data(), fileInfo.uncompressed_size);
        unzCloseCurrentFile(uf);
        unzClose(uf);

        // Bail out instead of dereferencing a null pointer when meta.xml is malformed
        if (doc.Parse(buffer.data(), buffer.size()) != tinyxml2::XML_SUCCESS) return;
        tinyxml2::XMLElement* root = doc.FirstChildElement("office:document-meta");
        tinyxml2::XMLElement* meta = root ? root->FirstChildElement("office:meta") : nullptr;
        if (meta == nullptr) return;

        // GetText() returns nullptr for missing or empty elements, and a std::string
        // must not be constructed from nullptr, so map that case to ""
        auto text = [meta](const char* tag) -> std::string {
            const tinyxml2::XMLElement* e = meta->FirstChildElement(tag);
            const char* t = e ? e->GetText() : nullptr;
            return t ? t : "";
        };

        simpleProps["title"] = text("dc:title");
        simpleProps["subject"] = text("dc:subject");
        simpleProps["description"] = text("dc:description");
        simpleProps["creator"] = text("dc:creator");
        simpleProps["date"] = text("dc:date");
        simpleProps["language"] = text("dc:language");
        simpleProps["generator"] = text("meta:generator");
        simpleProps["initial_creator"] = text("meta:initial-creator");
        simpleProps["creation_date"] = text("meta:creation-date");
        simpleProps["editing_cycles"] = text("meta:editing-cycles");
        simpleProps["editing_duration"] = text("meta:editing-duration");

        tinyxml2::XMLElement* kw = meta->FirstChildElement("meta:keyword");
        while (kw) {
            keywords.push_back(kw->GetText() ? kw->GetText() : "");
            kw = kw->NextSiblingElement("meta:keyword");
        }

        tinyxml2::XMLElement* ud = meta->FirstChildElement("meta:user-defined");
        while (ud) {
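            // Attribute() returns nullptr when meta:name is absent; skip such entries
            const char* name = ud->Attribute("meta:name");
            if (name) userDefined[name] = ud->GetText() ? ud->GetText() : "";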
            ud = ud->NextSiblingElement("meta:user-defined");
        }

        tinyxml2::XMLElement* stat = meta->FirstChildElement("meta:document-statistic");
        if (stat) {
            for (const tinyxml2::XMLAttribute* attr = stat->FirstAttribute(); attr; attr = attr->Next()) {
                statistics[attr->Name()] = attr->Value();
            }
        }
    }

public:
    ODPFile(const std::string& p) : path(p) {
        readProperties();
    }

    void printProperties() {
        std::cout << "Properties:" << std::endl;
        for (const auto& pair : simpleProps) {
            std::cout << pair.first << ": " << pair.second << std::endl;
        }
        std::cout << "keywords: ";
        for (size_t i = 0; i < keywords.size(); ++i) {
            std::cout << keywords[i];
            if (i < keywords.size() - 1) std::cout << ", ";
        }
        std::cout << std::endl;
        std::cout << "user_defined:" << std::endl;
        for (const auto& pair : userDefined) {
            std::cout << "  " << pair.first << ": " << pair.second << std::endl;
        }
        std::cout << "statistics:" << std::endl;
        for (const auto& pair : statistics) {
            std::cout << "  " << pair.first << ": " << pair.second << std::endl;
        }
    }

    void setProperty(const std::string& key, const std::string& value) {
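        tinyxml2::XMLElement* root = doc.FirstChildElement("office:document-meta");
        tinyxml2::XMLElement* meta = root ? root->FirstChildElement("office:meta") : nullptr;
        if (meta == nullptr) return;  // nothing was parsed, so there is nothing to update
        tinyxml2::XMLElement* elem = meta->FirstChildElement(key.c_str());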
        if (elem) {
            elem->SetText(value.c_str());
        }
        // For lists/dicts, extend as needed
    }

    void save(const std::string& newPath) {
        std::string xmlStr;
        tinyxml2::XMLPrinter printer;
        doc.Print(&printer);
        xmlStr = printer.CStr();

        zipFile zf = zipOpen(newPath.c_str(), APPEND_STATUS_CREATE);
        unzFile uf = unzOpen(path.c_str());

        unz_global_info gi;
        unzGetGlobalInfo(uf, &gi);
        for (int i = 0; i < gi.number_entry; ++i) {
            unz_file_info fileInfo;
            char filename[256];
            unzGetCurrentFileInfo(uf, &fileInfo, filename, sizeof(filename), nullptr, 0, nullptr, 0);

            if (std::string(filename) == "meta.xml") {
                zipOpenNewFileInZip(zf, "meta.xml", nullptr, nullptr, 0, nullptr, 0, nullptr, Z_DEFLATED, Z_DEFAULT_COMPRESSION);
                zipWriteInFileInZip(zf, xmlStr.c_str(), xmlStr.size());
                zipCloseFileInZip(zf);
            } else {
                std::vector<char> buffer(fileInfo.uncompressed_size);
                unzOpenCurrentFile(uf);
                unzReadCurrentFile(uf, buffer.data(), fileInfo.uncompressed_size);
                unzCloseCurrentFile(uf);

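                // ODF expects the 'mimetype' entry to be stored uncompressed (method 0)
                bool stored = std::string(filename) == "mimetype";
                zipOpenNewFileInZip(zf, filename, nullptr, nullptr, 0, nullptr, 0, nullptr, stored ? 0 : Z_DEFLATED, stored ? 0 : Z_DEFAULT_COMPRESSION);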
                zipWriteInFileInZip(zf, buffer.data(), fileInfo.uncompressed_size);
                zipCloseFileInZip(zf);
            }
            unzGoToNextFile(uf);
        }
        unzClose(uf);
        zipClose(zf, nullptr);
    }
};

// Example usage:
// int main() {
//     ODPFile odp("sample.odp");
//     odp.printProperties();
//     odp.setProperty("dc:title", "New Title");
//     odp.save("modified.odp");
//     return 0;
// }

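Note: tinyxml2 is not namespace-aware, so element lookups match the literal prefixed names (e.g., "dc:title"); a file that binds the same namespaces to different prefixes will not match without adjustment. Ensure minizip and tinyxml2 are properly included and compiled.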