Task 238: .FODT File Format

1. List of All Properties Intrinsic to the .FODT File Format

The .FODT (Flat OpenDocument Text) format is an uncompressed, single-file XML representation of an OpenDocument text document, as defined by the ODF 1.2 specification. Because there is no ZIP container, the file can be processed directly with ordinary XML tools. The list below covers the properties intrinsic to the format, derived from its structure, schema requirements, and standard conventions: file-level attributes, XML structure, expected elements, namespaces, and key metadata elements. "Intrinsic" is interpreted here as the core format-defining traits at the file and structural level (e.g., encoding, schema conformance, and mandatory components).

  • File Extension: .fodt (standard extension for flat XML text documents).
  • MIME Type: application/vnd.oasis.opendocument.text-flat-xml.
  • Content Type: Plain XML (uncompressed; no ZIP container).
  • Encoding: UTF-8 (the encoding used in practice; it is stated in the XML declaration).
  • XML Declaration: <?xml version="1.0" encoding="UTF-8"?> (at the file start).
  • XML Version: 1.0 (as per XML standard).
  • Root Element: office:document (mandatory; encapsulates the entire document).
  • Version Attribute: office:version="1.2" (on root element; specifies ODF conformance level).
  • Namespaces: Declared on root element; core set includes:
      • xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
      • xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
      • xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
      • xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0" (for extended metadata)
      • xmlns:config="urn:oasis:names:tc:opendocument:xmlns:config:1.0"
      • xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
      • xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
      • xmlns:xlink="http://www.w3.org/1999/xlink"
      • xmlns:dc="http://purl.org/dc/elements/1.1/"
  • Child Elements under Root (in schema order; office:body is the one the schema requires, while the others are optional in ODF 1.2 but normally written by producers such as LibreOffice):
      • office:meta (document metadata section; may be empty).
      • office:settings (application settings, expressed with elements from the config namespace).
      • office:styles (named styles; may be empty).
      • office:automatic-styles (generated styles; may be empty).
      • office:master-styles (page layouts and master pages; normally holds at least one style:master-page).
      • office:body (main content wrapper; for a text document it must contain office:text).
  • Body Content Structure: office:text under office:body holds the text elements, such as text:p for paragraphs, text:h for headings, and text:list for lists. Real documents normally contain at least one text:p.
  • Metadata Elements (under office:meta; Dublin Core and ODF-specific, all optional):
      • dc:title (document title; common).
      • dc:creator (author/creator; in ODF, the person who last modified the document).
      • dc:subject (document subject).
      • dc:description (abstract/summary).
      • meta:initial-creator (original creator).
      • meta:creation-date (ISO 8601 timestamp).
      • dc:date (last-modified timestamp; ODF uses this element rather than a meta:modification-date element).
      • meta:document-statistic (stats such as word count, stored as attributes).
  • Style Properties: Styles are defined via style:style elements carrying style:name and style:family attributes (family values include text, paragraph, and table); automatic styles apply only to the content that references them.
  • Schema Validation: Must conform to the ODF 1.2 RELAX NG schema (OpenDocument-v1.2-os-schema.rng), which enforces element order and attributes.
  • File Size: Variable, but typically larger than the equivalent zipped .odt because nothing is compressed.
  • Interoperability: Platform-independent; images can be linked via xlink:href or embedded inline as Base64 in office:binary-data.

These properties ensure a valid, self-contained XML file that can be directly parsed without extraction.
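The structural properties listed above can be spot-checked with a few lines of standard-library Python. The following is a minimal sketch, not a schema validator: the sample document and the function name check_fodt_structure are hypothetical, and only the root element, the office:version attribute, and the office:body/office:text nesting are tested.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hypothetical minimal document, for illustration only.
MINIMAL_FODT = f"""<?xml version="1.0" encoding="UTF-8"?>
<office:document xmlns:office="{OFFICE}" xmlns:text="{TEXT}"
    office:version="1.2"
    office:mimetype="application/vnd.oasis.opendocument.text">
  <office:body>
    <office:text>
      <text:p>Hello, flat ODF.</text:p>
    </office:text>
  </office:body>
</office:document>
"""


def check_fodt_structure(xml_text):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    root = ET.fromstring(xml_text.encode("utf-8"))

    # ElementTree reports qualified names as "{namespace-uri}local-name".
    if root.tag != f"{{{OFFICE}}}document":
        problems.append("root element is not office:document")
    if root.get(f"{{{OFFICE}}}version") is None:
        problems.append("office:version attribute is missing")

    body = root.find(f"{{{OFFICE}}}body")
    if body is None:
        problems.append("office:body is missing")
    elif body.find(f"{{{OFFICE}}}text") is None:
        problems.append("office:text is missing under office:body")
    return problems


if __name__ == "__main__":
    print(check_fodt_structure(MINIMAL_FODT) or "basic structure looks fine")
```

Full conformance checking would still require validating against the RELAX NG schema with a dedicated validator; this sketch only catches the most obvious structural mistakes.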

3. Ghost Blog Embedded HTML JavaScript

This is a self-contained HTML snippet with embedded JavaScript for drag-and-drop functionality. It can be embedded in a Ghost blog post (e.g., via the HTML card). It reads the dropped .FODT file, parses it as XML using DOMParser, extracts the properties from part 1, and dumps them to a <pre> element on screen. No external libraries needed.

Drag and drop a .FODT file here to view its properties.

4. Python Class

This class uses xml.etree.ElementTree (standard library) to open, parse, extract, and print properties. It reads the file, prints the properties to the console, and writes a re-serialized copy (same content, though not necessarily byte-identical) to output.fodt.

import xml.etree.ElementTree as ET

OFFICE_NS = 'urn:oasis:names:tc:opendocument:xmlns:office:1.0'
DC_NS = 'http://purl.org/dc/elements/1.1/'

class FODTParser:
    def __init__(self, file_path):
        # ElementTree has no nsmap (that is an lxml feature), so collect the
        # prefix declarations from "start-ns" events while the tree is built.
        self.namespaces = {}
        self.root = None
        for event, data in ET.iterparse(file_path, events=('start', 'start-ns')):
            if event == 'start-ns':
                prefix, uri = data
                self.namespaces[prefix] = uri
            elif self.root is None:
                self.root = data  # first "start" event is the root element
        self.tree = ET.ElementTree(self.root)

    def extract_properties(self):
        return {
            'File Extension': '.fodt',
            'MIME Type': 'application/vnd.oasis.opendocument.text-flat-xml',
            # ElementTree does not expose the XML declaration, so these two
            # values are the format's conventional ones, not read from the file.
            'Encoding': 'UTF-8',
            'XML Version': '1.0',
            'Root Element': self.root.tag,
            'Version Attribute': self.root.get(f'{{{OFFICE_NS}}}version', 'N/A'),
            'Namespaces': self.namespaces,
            'Required Child Elements': ', '.join(child.tag for child in self.root),
            'Metadata Title': self.root.findtext(f'.//{{{DC_NS}}}title', default='N/A'),
            'Metadata Creator': self.root.findtext(f'.//{{{DC_NS}}}creator', default='N/A'),
        }

    def print_properties(self):
        props = self.extract_properties()
        for key, value in props.items():
            print(f"{key}: {value}")

    def write_file(self, output_path='output.fodt'):
        # Register the original prefixes so the output keeps "office:", "text:",
        # and so on instead of auto-generated ns0, ns1, ... Declarations for
        # namespaces that are never used in the document are not written back.
        for prefix, uri in self.namespaces.items():
            ET.register_namespace(prefix, uri)
        self.tree.write(output_path, encoding='UTF-8', xml_declaration=True)

# Usage
if __name__ == "__main__":
    parser = FODTParser('input.fodt')
    parser.print_properties()
    parser.write_file()
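
The class above reports only the title and creator. As a rough sketch of how the remaining metadata elements from part 1 could be read, the function below walks office:meta with a prefix map. The names read_metadata, META_FIELDS, and SAMPLE are hypothetical, and the sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Simple text-valued metadata elements under office:meta.
META_FIELDS = [
    "dc:title", "dc:creator", "dc:subject", "dc:description",
    "dc:date", "meta:initial-creator", "meta:creation-date",
]

# Hypothetical sample, for illustration only.
SAMPLE = f"""<office:document xmlns:office="{NS['office']}"
    xmlns:meta="{NS['meta']}" xmlns:dc="{NS['dc']}" office:version="1.2">
  <office:meta>
    <dc:title>Sample</dc:title>
    <meta:creation-date>2024-01-01T00:00:00</meta:creation-date>
    <meta:document-statistic meta:word-count="3" meta:paragraph-count="1"/>
  </office:meta>
  <office:body><office:text/></office:body>
</office:document>"""


def read_metadata(root):
    """Collect the metadata that is present; absent elements are skipped."""
    result = {}
    meta = root.find("office:meta", NS)
    if meta is None:
        return result
    for name in META_FIELDS:
        value = meta.findtext(name, namespaces=NS)
        if value is not None:
            result[name] = value
    stat = meta.find("meta:document-statistic", NS)
    if stat is not None:
        # Statistics are attributes; strip the "{namespace-uri}" part of each key.
        result["meta:document-statistic"] = {
            key.rsplit("}", 1)[-1]: val for key, val in stat.attrib.items()
        }
    return result


if __name__ == "__main__":
    for key, value in read_metadata(ET.fromstring(SAMPLE)).items():
        print(f"{key}: {value}")
```

Passing a prefix map to find and findtext keeps the lookups readable and avoids repeating the full namespace URIs in every path.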

5. Java Class

This class uses javax.xml.parsers and DocumentBuilder (standard JDK) to open, parse, extract, and print properties. It reads the file, prints to the console, and writes a re-serialized copy to output.fodt.

import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.*;

public class FODTParser {
    private Document doc;
    private Element root;

    public FODTParser(String filePath) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        this.doc = builder.parse(new File(filePath));
        this.root = doc.getDocumentElement();
    }

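    public void extractAndPrintProperties() {
        final String officeNs = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
        final String dcNs = "http://purl.org/dc/elements/1.1/";
        System.out.println("File Extension: .fodt");
        System.out.println("MIME Type: application/vnd.oasis.opendocument.text-flat-xml");
        // Report the declared encoding; fall back to UTF-8 when the declaration omits it.
        System.out.println("Encoding: " + (doc.getXmlEncoding() != null ? doc.getXmlEncoding() : "UTF-8"));
        System.out.println("XML Version: " + (doc.getXmlVersion() != null ? doc.getXmlVersion() : "1.0"));
        System.out.println("Root Element: " + root.getTagName());
        // Look the attribute up by namespace URI so the result does not depend on the prefix.
        String version = root.getAttributeNS(officeNs, "version");
        System.out.println("Version Attribute: " + (version.isEmpty() ? "N/A" : version));
        // Namespace declarations appear as xmlns / xmlns:prefix attributes on the root.
        NamedNodeMap attrs = root.getAttributes();
        StringBuilder nsList = new StringBuilder();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            if (attr.getNodeName().startsWith("xmlns")) {
                if (nsList.length() > 0) nsList.append(", ");
                nsList.append(attr.getNodeName()).append("=").append(attr.getNodeValue());
            }
        }
        System.out.println("Namespaces: " + nsList);
        // Join child element names without leaving a trailing separator.
        NodeList children = root.getChildNodes();
        StringBuilder childList = new StringBuilder();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                if (childList.length() > 0) childList.append(", ");
                childList.append(((Element) children.item(i)).getTagName());
            }
        }
        System.out.println("Required Child Elements: " + childList);
        Node title = doc.getElementsByTagNameNS(dcNs, "title").item(0);
        System.out.println("Metadata Title: " + (title != null ? title.getTextContent() : "N/A"));
        Node creator = doc.getElementsByTagNameNS(dcNs, "creator").item(0);
        System.out.println("Metadata Creator: " + (creator != null ? creator.getTextContent() : "N/A"));
    }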

    public void writeFile(String outputPath) throws Exception {
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        DOMSource source = new DOMSource(doc);
        StreamResult result = new StreamResult(new File(outputPath));
        transformer.transform(source, result);
    }

    public static void main(String[] args) throws Exception {
        FODTParser parser = new FODTParser("input.fodt");
        parser.extractAndPrintProperties();
        parser.writeFile("output.fodt");
    }
}

6. JavaScript Class

This is a Node.js-compatible class using fs and xml2js (assume installed via npm; for browser, adapt to DOMParser as in part 3). It opens, parses, extracts, prints to console, and writes to output.fodt. For pure vanilla JS, use the browser version from part 3.

const fs = require('fs');
const { parseStringPromise } = require('xml2js');

class FODTParser {
  constructor(filePath) {
    this.filePath = filePath;
    this.xmlContent = fs.readFileSync(filePath, 'utf-8');
  }

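  async extractProperties() {
    const result = await parseStringPromise(this.xmlContent, { explicitArray: false });
    const root = result['office:document'];
    if (!root) {
      throw new Error('Root element office:document not found');
    }
    const attrs = root['$'] || {};
    // office:meta may be absent or empty; fall back to an empty object.
    const meta = root['office:meta'] || {};
    // root['$'] holds every root attribute, so keep only the xmlns declarations.
    const namespaces = Object.fromEntries(
      Object.entries(attrs).filter(([name]) => name.startsWith('xmlns'))
    );
    return {
      'File Extension': '.fodt',
      'MIME Type': 'application/vnd.oasis.opendocument.text-flat-xml',
      'Encoding': 'UTF-8',
      'XML Version': '1.0',
      'Root Element': 'office:document',
      'Version Attribute': attrs['office:version'] || 'N/A',
      'Namespaces': namespaces,
      'Required Child Elements': Object.keys(root).filter(k => k !== '$').join(', '),
      'Metadata Title': meta['dc:title'] || 'N/A',
      'Metadata Creator': meta['dc:creator'] || 'N/A'
    };
  }

  async printProperties() {
    const props = await this.extractProperties();
    for (const [key, value] of Object.entries(props)) {
      // Objects (such as the namespace map) would otherwise print as "[object Object]".
      const shown = typeof value === 'object' ? JSON.stringify(value, null, 2) : value;
      console.log(`${key}: ${shown}`);
    }
  }

  writeFile(outputPath = 'output.fodt') {
    // The raw text is written back, so the copy is byte-for-byte unchanged.
    fs.writeFileSync(outputPath, this.xmlContent, 'utf-8');
    console.log(`Written to ${outputPath}`);
  }
}

// Usage
const parser = new FODTParser('input.fodt');
parser.printProperties().catch(err => console.error(err.message));
parser.writeFile();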

7. C "Class" (Struct and Functions)

C lacks classes, so this uses a struct with functions (OOP-like). Requires libxml2 (compile with gcc -o fodt_parser fodt_parser.c $(xml2-config --cflags --libs)). It opens, parses, extracts, prints to stdout, and writes to output.fodt.

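#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libxml/parser.h>
#include <libxml/tree.h>
#include <libxml/xpath.h>
#include <libxml/xpathInternals.h>

#define OFFICE_NS "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
#define DC_NS "http://purl.org/dc/elements/1.1/"

typedef struct {
    xmlDocPtr doc;
    xmlNodePtr root;
} FODTParser;

FODTParser* fodt_parser_new(const char* file_path) {
    FODTParser* parser = malloc(sizeof(FODTParser));
    if (parser == NULL) {
        return NULL;
    }
    /* A NULL encoding lets libxml2 honor the file's own XML declaration. */
    parser->doc = xmlReadFile(file_path, NULL, 0);
    if (parser->doc == NULL) {
        fprintf(stderr, "Error parsing file\n");
        free(parser);
        return NULL;
    }
    parser->root = xmlDocGetRootElement(parser->doc);
    return parser;
}

/* Print the text of the first node matching an XPath expression, or "N/A". */
static void print_first_match(xmlXPathContextPtr ctx, const char* label, const char* expr) {
    xmlXPathObjectPtr res = xmlXPathEvalExpression((const xmlChar*)expr, ctx);
    if (res && res->nodesetval && res->nodesetval->nodeNr > 0) {
        xmlChar* text = xmlNodeGetContent(res->nodesetval->nodeTab[0]);
        printf("%s: %s\n", label, text ? (const char*)text : "N/A");
        if (text) xmlFree(text);
    } else {
        printf("%s: N/A\n", label);
    }
    if (res) xmlXPathFreeObject(res);
}

void extract_and_print_properties(FODTParser* parser) {
    printf("File Extension: .fodt\n");
    printf("MIME Type: application/vnd.oasis.opendocument.text-flat-xml\n");
    printf("Encoding: %s\n", parser->doc->encoding ? (const char*)parser->doc->encoding : "UTF-8");
    printf("XML Version: %s\n", parser->doc->version ? (const char*)parser->doc->version : "1.0");
    printf("Root Element: %s\n", (const char*)parser->root->name);
    /* xmlGetNsProp takes the local name plus the namespace URI; a prefixed
       name such as "office:version" would not match. */
    xmlChar* version = xmlGetNsProp(parser->root, (const xmlChar*)"version", (const xmlChar*)OFFICE_NS);
    printf("Version Attribute: %s\n", version ? (const char*)version : "N/A");
    if (version) xmlFree(version);
    /* Namespace declarations made on the root element. */
    printf("Namespaces:\n");
    for (xmlNsPtr ns = parser->root->nsDef; ns; ns = ns->next) {
        printf("  %s = %s\n", ns->prefix ? (const char*)ns->prefix : "(default)", (const char*)ns->href);
    }
    /* Build the child list with bounds checking instead of unchecked strcat. */
    char child_list[1024] = {0};
    size_t used = 0;
    for (xmlNodePtr child = parser->root->children; child; child = child->next) {
        if (child->type != XML_ELEMENT_NODE) continue;
        int n = snprintf(child_list + used, sizeof(child_list) - used, "%s%s",
                         used ? ", " : "", (const char*)child->name);
        if (n < 0 || (size_t)n >= sizeof(child_list) - used) break; /* list truncated */
        used += (size_t)n;
    }
    printf("Required Child Elements: %s\n", child_list);
    xmlXPathContextPtr ctx = xmlXPathNewContext(parser->doc);
    if (ctx) {
        /* XPath knows no prefixes until they are registered on the context. */
        xmlXPathRegisterNs(ctx, (const xmlChar*)"dc", (const xmlChar*)DC_NS);
        print_first_match(ctx, "Metadata Title", "//dc:title");
        print_first_match(ctx, "Metadata Creator", "//dc:creator");
        xmlXPathFreeContext(ctx);
    }
}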

void write_file(FODTParser* parser, const char* output_path) {
    int res = xmlSaveFormatFileEnc(output_path, parser->doc, "UTF-8", 1);
    if (res == -1) {
        fprintf(stderr, "Error writing file\n");
    }
}

void fodt_parser_free(FODTParser* parser) {
    xmlFreeDoc(parser->doc);
    free(parser);
}

int main() {
    FODTParser* parser = fodt_parser_new("input.fodt");
    if (parser) {
        extract_and_print_properties(parser);
        write_file(parser, "output.fodt");
        fodt_parser_free(parser);
    }
    return 0;
}