Task 258: .GGB File Format

1. List of All Properties of the .GGB File Format Intrinsic to Its File System

The .GGB file format is the GeoGebra worksheet format. A .ggb file is a standard ZIP archive with a different extension, so its intrinsic file system properties derive from the ZIP specification (PKZIP format, version 2.0 or later), with specific expected contents tailored to GeoGebra's structure. These properties cover the archive's structural elements and the mandatory and optional internal files that define the worksheet. Below is a comprehensive list:

  • Archive Format: ZIP (PKZIP compatible), using little-endian byte order. No encryption or password protection is used.
  • Compression Method: Primarily Deflate (method code 8) for all internal files; uncompressed data is possible but rare.
  • Local File Headers: Each file entry starts with a 30-byte fixed header that begins with the 4-byte signature 0x04034b50, followed by version needed to extract (typically 20 for Deflate), general-purpose bit flag (bit 3 set if a data descriptor follows), compression method, last modification time and date, CRC-32, compressed and uncompressed sizes, file name length, and extra field length. The file name (UTF-8 encoded), the optional extra field, and the compressed data follow the fixed header.
  • Data Descriptor: Optional structure, present when bit 3 of the general-purpose flag is set, containing CRC-32, compressed size, and uncompressed size (12 bytes, or 16 bytes when preceded by the optional signature 0x08074b50).
  • Central Directory: Located at the end of the archive. It holds one central file header per entry (signature 0x02014b50; 46 fixed bytes plus the variable-length file name, extra field, and comment), including version made by, version needed, bit flags, compression method, last mod file time/date, CRC-32, sizes, file name length, extra field length, comment length, disk number start, internal/external file attributes, and local header offset. The end of central directory record follows (signature 0x06054b50; 22 bytes plus an optional comment) with disk numbers, entry counts, and the central directory size and offset.
  • File Name Encoding: UTF-8 for all internal paths.
  • Mandatory Internal File: geogebra.xml - An XML file (uncompressed or Deflate-compressed) containing the construction data, GUI state, kernel settings, and metadata. Root element <geogebra format="5.0" ...> (version varies, e.g., 5.2 as of recent releases), with child elements like <construction>, <euclidianView>, <kernel>, <gui>, and <perspectives>. Intrinsic properties: Text-based, human-readable, schema-defined for GeoGebra objects (points, lines, functions, etc.).
  • Optional Internal File: geogebra_thumbnail.png - A PNG image (Deflate-compressed) for preview, typically 200x150 pixels, embedded as binary data.
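  • Optional Internal File: geogebra_main.js or geogebra.js - A JavaScript file (uncompressed or Deflate) containing global definitions, macros, or scripts for dynamic elements. Archives saved by current GeoGebra releases typically name this entry geogebra_javascript.js, so a reader should check for that name as well.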
  • Optional Directory: images/ - A subdirectory containing embedded image files (e.g., .png, .jpg) used in the construction via the Image tool or custom tools. Filenames are obfuscated (non-human-readable, e.g., UUID-like), stored as binary with Deflate compression.
  • File Attributes: Internal file attributes are 0 (no special Unix modes); external attributes follow ZIP defaults. No digital signatures or extended headers specific to .GGB.
  • Archive Comment: Optional ZIP comment field, rarely used in .GGB but can contain metadata like GeoGebra version.
  • Maximum File Size: Limited by the ZIP spec (4 GiB per file and 65,535 entries without ZIP64 extensions), but the practical limit is much lower (roughly 100MB) because of construction complexity.
  • Version Compatibility: Supports ZIP version 2.0+; GeoGebra writes with version 20 (Deflate).

These properties ensure portability and easy extraction/modification by renaming to .zip.
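
The header layouts listed above can be checked by hand. The following is a minimal Python sketch, not GeoGebra's own reader, that locates the end of central directory record and walks the central file headers with the standard struct module. The function names list_central_directory and make_demo_ggb are illustrative assumptions, the demo archive is built in memory rather than taken from a real worksheet, and the backward search assumes the archive comment does not itself contain the record signature.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"     # 0x06054b50 as stored on disk (little-endian)
CENTRAL_SIG = b"PK\x01\x02"  # 0x02014b50 as stored on disk


def list_central_directory(data: bytes):
    """Return (name, method, compressed_size, uncompressed_size) per entry."""
    # The EOCD record is 22 bytes plus an optional comment, so search backward.
    eocd_pos = data.rfind(EOCD_SIG)
    if eocd_pos < 0:
        raise ValueError("no end of central directory record found")
    # signature, 4 x uint16, 2 x uint32, uint16 = 22 bytes in total
    (_sig, _disk, _cd_disk, _n_here, n_total,
     _cd_size, cd_offset, _comment_len) = struct.unpack_from(
        "<4s4H2LH", data, eocd_pos)

    entries = []
    pos = cd_offset
    for _ in range(n_total):
        # 46 fixed bytes: signature, 6 x uint16, 3 x uint32, 5 x uint16, 2 x uint32
        f = struct.unpack_from("<4s6H3L5H2L", data, pos)
        if f[0] != CENTRAL_SIG:
            raise ValueError("corrupt central directory")
        method, comp_size, uncomp_size = f[4], f[8], f[9]
        name_len, extra_len, comment_len = f[10], f[11], f[12]
        name = data[pos + 46:pos + 46 + name_len].decode("utf-8")
        entries.append((name, method, comp_size, uncomp_size))
        # Skip the variable-length name, extra field, and comment.
        pos += 46 + name_len + extra_len + comment_len
    return entries


def make_demo_ggb() -> bytes:
    """Build a tiny stand-in archive in memory (not a real worksheet)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("geogebra.xml", '<geogebra format="5.0"></geogebra>')
    return buf.getvalue()


if __name__ == "__main__":
    for name, method, comp, uncomp in list_central_directory(make_demo_ggb()):
        print(f"{name}: method={method}, compressed={comp}, uncompressed={uncomp}")
```

For real files, pass the bytes of a .ggb file read in binary mode to list_central_directory; the zipfile module remains the more robust choice for anything beyond inspection.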

3. Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .GGB Property Dump

Embed the following HTML code into a Ghost blog post (use the HTML card in the editor). It includes a drag-and-drop zone that reads the .GGB file, parses it as a ZIP using JSZip (loaded from CDN), extracts and dumps all properties to a <pre> element on screen. Handles binary/text appropriately.

Drag and drop a .GGB file here to view its properties




4. Python Class for .GGB Handling

import zipfile
import xml.etree.ElementTree as ET
import os
import shutil

class GGBFile:
    def __init__(self, path):
        self.path = path
        self.zip = None
        self.properties = {}

    def read(self):
        """Read and decode .GGB, print properties to console."""
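        if not os.path.exists(self.path):
            print("File not found.")
            return
        # A .GGB must be a ZIP archive; bail out early on anything else
        if not zipfile.is_zipfile(self.path):
            print("Not a valid ZIP archive.")
            return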

        with zipfile.ZipFile(self.path, 'r') as self.zip:
            print("=== .GGB File Properties ===")
            print(f"Archive Format: ZIP")
            print(f"Number of Files: {len(self.zip.namelist())}")

            print("\nFiles in Archive:")
            for name in self.zip.namelist():
                info = self.zip.getinfo(name)
                print(f"- {name} (Uncompressed: {info.file_size} bytes, Compressed: {info.compress_size} bytes)")

            # geogebra.xml
            if 'geogebra.xml' in self.zip.namelist():
                xml_content = self.zip.read('geogebra.xml').decode('utf-8')
                print(f"\ngeogebra.xml: Present")
                root = ET.fromstring(xml_content)
                print(f"  Version: {root.get('format', 'Unknown')}")
                print(f"  Content (first 500 chars): {xml_content[:500]}...")
            else:
                print("\ngeogebra.xml: Missing")

            # geogebra_thumbnail.png
            if 'geogebra_thumbnail.png' in self.zip.namelist():
                thumb_info = self.zip.getinfo('geogebra_thumbnail.png')
                print(f"\ngeogebra_thumbnail.png: Present (Size: {thumb_info.file_size} bytes)")
            else:
                print("\ngeogebra_thumbnail.png: Missing")

            # geogebra.js (report whichever of the two names is present)
            js_files = ['geogebra.js', 'geogebra_main.js']
            js_name = next((js for js in js_files if js in self.zip.namelist()), None)
            if js_name:
                js_info = self.zip.getinfo(js_name)
                print(f"\n{js_name}: Present (Size: {js_info.file_size} bytes)")
            else:
                print("\ngeogebra.js: Missing")

            # images/
            images = [n for n in self.zip.namelist() if n.startswith('images/')]
            print(f"\nImages Directory Files: {len(images)}")
            for img in images:
                img_info = self.zip.getinfo(img)
                print(f"- {img} (Size: {img_info.file_size} bytes)")

    def write(self, output_path):
        """Write .GGB to new path (simple copy for demo; extend for modifications)."""
        shutil.copy2(self.path, output_path)
        print(f"\nWritten to {output_path}")

# Usage example:
# ggb = GGBFile('example.ggb')
# ggb.read()
# ggb.write('output.ggb')
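
The write method above only copies the file. As a sketch of how it might be extended for modifications, the function below re-packs every entry into a new archive and passes the parsed geogebra.xml root through a caller-supplied callback. The names rewrite_ggb, mark_edited, and make_demo_ggb are assumptions made for this example, and the "edited" attribute is purely hypothetical (it is not part of GeoGebra's schema). The xml_declaration argument of ET.tostring requires Python 3.8 or later.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def rewrite_ggb(src, dst, edit_xml):
    """Copy all entries from src to dst, editing geogebra.xml on the way.

    src and dst may be paths or binary file objects; edit_xml receives the
    parsed root element and modifies it in place.
    """
    with zipfile.ZipFile(src, "r") as zin, \
         zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename == "geogebra.xml":
                root = ET.fromstring(data)
                edit_xml(root)
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            zout.writestr(info.filename, data)


def mark_edited(root):
    # Hypothetical edit for illustration only: "edited" is not a GeoGebra attribute.
    root.set("edited", "true")


def make_demo_ggb():
    """Build a tiny stand-in archive in memory (not a real worksheet)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("geogebra.xml",
                    '<geogebra format="5.0"><construction/></geogebra>')
        zf.writestr("geogebra_thumbnail.png", b"\x89PNG placeholder bytes")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    out = io.BytesIO()
    rewrite_ggb(make_demo_ggb(), out, mark_edited)
    with zipfile.ZipFile(out) as zf:
        print(zf.namelist())
        print(zf.read("geogebra.xml").decode("utf-8"))
```

Re-packing rather than patching in place keeps the central directory consistent, at the cost of recompressing every entry.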

5. Java Class for .GGB Handling

import java.io.*;
import java.nio.file.*;  // Files, Paths, StandardCopyOption (used by write)
import java.util.Enumeration;
import java.util.zip.*;

public class GGBFile {
    private String path;
    private ZipFile zip;

    public GGBFile(String path) {
        this.path = path;
    }

    public void read() {
        // Read and decode .GGB, print properties to console
        File file = new File(path);
        if (!file.exists()) {
            System.out.println("File not found.");
            return;
        }

        try {
            zip = new ZipFile(file);
            System.out.println("=== .GGB File Properties ===");
            System.out.println("Archive Format: ZIP");
            System.out.println("Number of Files: " + zip.size());

            System.out.println("\nFiles in Archive:");
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                System.out.println("- " + entry.getName() +
                    " (Uncompressed: " + entry.getSize() + " bytes, Compressed: " + entry.getCompressedSize() + " bytes)");
            }

            // geogebra.xml
            ZipEntry xmlEntry = zip.getEntry("geogebra.xml");
            if (xmlEntry != null) {
                System.out.println("\ngeogebra.xml: Present");
                // Read at most the first 500 characters, decoding explicitly as UTF-8
                char[] head = new char[500];
                int len = 0;
                try (Reader reader = new BufferedReader(new InputStreamReader(
                        zip.getInputStream(xmlEntry), java.nio.charset.StandardCharsets.UTF_8))) {
                    int n;
                    while (len < head.length
                            && (n = reader.read(head, len, head.length - len)) != -1) {
                        len += n;
                    }
                }
                String xmlContent = new String(head, 0, len);
                // Parse version (simple search for the root element's format attribute)
                int start = xmlContent.indexOf("format=\"");
                if (start >= 0) {
                    start += 8;
                    int end = xmlContent.indexOf("\"", start);
                    if (end > start) {
                        System.out.println("  Version: " + xmlContent.substring(start, end));
                    }
                }
                System.out.println("  Content (first 500 chars): " + xmlContent.trim() + "...");
            } else {
                System.out.println("\ngeogebra.xml: Missing");
            }

            // geogebra_thumbnail.png
            ZipEntry thumbEntry = zip.getEntry("geogebra_thumbnail.png");
            if (thumbEntry != null) {
                System.out.println("\ngeogebra_thumbnail.png: Present (Size: " + thumbEntry.getSize() + " bytes)");
            } else {
                System.out.println("\ngeogebra_thumbnail.png: Missing");
            }

            // geogebra.js
            boolean jsPresent = zip.getEntry("geogebra.js") != null || zip.getEntry("geogebra_main.js") != null;
            if (jsPresent) {
                ZipEntry jsEntry = zip.getEntry("geogebra.js") != null ? zip.getEntry("geogebra.js") : zip.getEntry("geogebra_main.js");
                System.out.println("\ngeogebra.js: Present (Size: " + jsEntry.getSize() + " bytes)");
            } else {
                System.out.println("\ngeogebra.js: Missing");
            }

            // images/ (collect the list first so the count heads the output)
            int imageCount = 0;
            StringBuilder imageList = new StringBuilder();
            entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.getName().startsWith("images/")) {
                    imageCount++;
                    imageList.append("- ").append(entry.getName())
                             .append(" (Size: ").append(entry.getSize()).append(" bytes)\n");
                }
            }
            System.out.println("\nImages Directory Files: " + imageCount);
            System.out.print(imageList);

            zip.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void write(String outputPath) {
        // Write .GGB to new path (simple copy for demo)
        try {
            Files.copy(Paths.get(path), Paths.get(outputPath), StandardCopyOption.REPLACE_EXISTING);
            System.out.println("\nWritten to " + outputPath);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Usage: new GGBFile("example.ggb").read();
}

(Note: write uses Files, Paths, and StandardCopyOption from java.nio.file, so the file needs import java.nio.file.*; alongside the java.io and java.util.zip imports.)

6. JavaScript Class for .GGB Handling (Node.js)

This assumes Node.js with fs and jszip (install via npm install jszip).

const fs = require('fs');
const JSZip = require('jszip');

class GGBFile {
  constructor(path) {
    this.path = path;
    this.zip = null;
  }

  async read() {
    // Read and decode .GGB, print properties to console
    if (!fs.existsSync(this.path)) {
      console.log('File not found.');
      return;
    }

    const buffer = fs.readFileSync(this.path);
    this.zip = await JSZip.loadAsync(buffer);

    console.log('=== .GGB File Properties ===');
    console.log('Archive Format: ZIP');
    console.log(`Number of Files: ${Object.keys(this.zip.files).length}`);

    console.log('\nFiles in Archive:');
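    for (const [name, entry] of Object.entries(this.zip.files)) {
      // _data is an internal JSZip field, not public API; it may change between versions
      const size = entry._data ? entry._data.uncompressedSize || 0 : 0;
      const compressed = entry._data ? entry._data.compressedSize || 0 : 0;
      console.log(`- ${name} (Uncompressed: ${size} bytes, Compressed: ${compressed} bytes)`);
    }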

    // geogebra.xml
    if (this.zip.files['geogebra.xml']) {
      const xmlContent = await this.zip.file('geogebra.xml').async('text');
      console.log('\ngeogebra.xml: Present');
      // Simple version parse
      const versionMatch = xmlContent.match(/format="([^"]+)"/);
      console.log(`  Version: ${versionMatch ? versionMatch[1] : 'Unknown'}`);
      console.log(`  Content (first 500 chars): ${xmlContent.substring(0, 500)}...`);
    } else {
      console.log('\ngeogebra.xml: Missing');
    }

    // geogebra_thumbnail.png
    if (this.zip.files['geogebra_thumbnail.png']) {
      const thumbSize = this.zip.files['geogebra_thumbnail.png']._data.uncompressedSize;
      console.log(`\ngeogebra_thumbnail.png: Present (Size: ${thumbSize} bytes)`);
    } else {
      console.log('\ngeogebra_thumbnail.png: Missing');
    }

    // geogebra.js
    const jsFiles = ['geogebra.js', 'geogebra_main.js'];
    const jsPresent = jsFiles.some(js => this.zip.files[js]);
    if (jsPresent) {
      const jsName = jsFiles.find(js => this.zip.files[js]);
      const jsSize = this.zip.files[jsName]._data.uncompressedSize;
      console.log(`\ngeogebra.js: Present (Size: ${jsSize} bytes)`);
    } else {
      console.log('\ngeogebra.js: Missing');
    }

    // images/
    const images = Object.keys(this.zip.files).filter(n => n.startsWith('images/'));
    console.log(`\nImages Directory Files: ${images.length}`);
    images.forEach(img => {
      const imgSize = this.zip.files[img]._data.uncompressedSize;
      console.log(`- ${img} (Size: ${imgSize} bytes)`);
    });
  }

  async write(outputPath) {
    // Write .GGB to new path (simple copy for demo)
    const buffer = fs.readFileSync(this.path);
    fs.writeFileSync(outputPath, buffer);
    console.log(`\nWritten to ${outputPath}`);
  }
}

// Usage:
// const ggb = new GGBFile('example.ggb');
// ggb.read().then(() => ggb.write('output.ggb'));

7. C Class for .GGB Handling

This is a basic implementation in C without external libraries (C has no classes, so a struct and a set of functions stand in for one). It parses the ZIP headers minimally to list the entries and their sizes and does not decompress Deflate data, so entry contents can only be read directly when they are stored uncompressed; for full decompression, zlib would be the natural choice. Compile with gcc ggb.c -o ggb. Write is a simple file copy.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ZIP_LOCAL_SIG 0x04034B50UL
#define ZIP_CENTRAL_SIG 0x02014B50UL
#define ZIP_END_SIG 0x06054B50UL
#define BUFFER_SIZE 1024

typedef struct {
    unsigned long compressed_size;
    unsigned long uncompressed_size;
    unsigned long local_offset;       // Offset of the entry's local file header
    unsigned short compression_method;
    char *name;
} ZipEntry;

typedef struct {
    char *path;                       // Source path, kept for reads and ggb_write
    int num_entries;
    ZipEntry *entries;
} GGBFile;

void ggb_close(GGBFile *ggb);

// Read little-endian integers byte by byte (independent of host word size)
static unsigned short rd16(const unsigned char *p) {
    return (unsigned short)(p[0] | (p[1] << 8));
}

static unsigned long rd32(const unsigned char *p) {
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

GGBFile *ggb_open(const char *path) {
    FILE *fp = fopen(path, "rb");
    if (!fp) return NULL;

    // Find end of central directory (scan backward over a possible comment)
    fseek(fp, 0, SEEK_END);
    long file_size = ftell(fp);
    unsigned char buf[22];
    long pos = file_size - 22;
    int found = 0;
    while (pos >= 0 && pos >= file_size - 22 - 65535L) {
        fseek(fp, pos, SEEK_SET);
        if (fread(buf, 1, 22, fp) == 22 && rd32(buf) == ZIP_END_SIG) {
            found = 1;
            break;
        }
        pos--;
    }
    if (!found) {
        fclose(fp);
        return NULL;
    }

    // Total entry count is at offset 10, central directory offset at 16
    unsigned short total = rd16(buf + 10);
    unsigned long central_offset = rd32(buf + 16);

    GGBFile *ggb = calloc(1, sizeof(GGBFile));
    if (!ggb) {
        fclose(fp);
        return NULL;
    }
    ggb->path = malloc(strlen(path) + 1);
    ggb->entries = calloc(total ? total : 1, sizeof(ZipEntry));
    if (!ggb->path || !ggb->entries) {
        fclose(fp);
        ggb_close(ggb);
        return NULL;
    }
    strcpy(ggb->path, path);

    // Parse central directory: 46 fixed bytes, then name, extra field, comment
    fseek(fp, (long)central_offset, SEEK_SET);
    unsigned char hdr[46];
    while (ggb->num_entries < total) {
        if (fread(hdr, 1, 46, fp) != 46 || rd32(hdr) != ZIP_CENTRAL_SIG) break;

        ZipEntry *entry = &ggb->entries[ggb->num_entries];
        entry->compression_method = rd16(hdr + 10);
        entry->compressed_size = rd32(hdr + 20);
        entry->uncompressed_size = rd32(hdr + 24);
        unsigned short name_len = rd16(hdr + 28);
        unsigned short extra_len = rd16(hdr + 30);
        unsigned short comment_len = rd16(hdr + 32);
        entry->local_offset = rd32(hdr + 42);

        // The name follows the fixed header directly
        entry->name = malloc(name_len + 1);
        if (!entry->name || fread(entry->name, 1, name_len, fp) != name_len) {
            free(entry->name);
            entry->name = NULL;
            break;
        }
        entry->name[name_len] = '\0';
        ggb->num_entries++;

        // Skip extra field and comment to reach the next header
        fseek(fp, (long)extra_len + comment_len, SEEK_CUR);
    }

    fclose(fp);
    return ggb;
}

void ggb_print_properties(GGBFile *ggb) {
    if (!ggb) return;

    printf("=== .GGB File Properties ===\n");
    printf("Archive Format: ZIP\n");
    printf("Number of Files: %d\n\n", ggb->num_entries);

    printf("Files in Archive:\n");
    for (int i = 0; i < ggb->num_entries; i++) {
        ZipEntry *e = &ggb->entries[i];
        printf("- %s (Uncompressed: %lu bytes, Compressed: %lu bytes)\n",
               e->name, e->uncompressed_size, e->compressed_size);
    }

    // geogebra.xml (content is shown only when stored; Deflate needs zlib)
    int xml_found = 0;
    for (int i = 0; i < ggb->num_entries; i++) {
        ZipEntry *e = &ggb->entries[i];
        if (strcmp(e->name, "geogebra.xml") != 0) continue;
        xml_found = 1;
        printf("\ngeogebra.xml: Present\n");
        if (e->compression_method != 0) {
            printf("  Content is Deflate-compressed; inflate it (e.g., with zlib) to read the version.\n");
            break;
        }
        FILE *fp = fopen(ggb->path, "rb");
        unsigned char lh[30];
        if (fp && fseek(fp, (long)e->local_offset, SEEK_SET) == 0 &&
            fread(lh, 1, 30, fp) == 30 && rd32(lh) == ZIP_LOCAL_SIG) {
            // Data starts after the fixed header, the file name, and the extra field
            fseek(fp, (long)rd16(lh + 26) + rd16(lh + 28), SEEK_CUR);
            char xml_buf[501];
            size_t want = e->uncompressed_size < 500 ? (size_t)e->uncompressed_size : 500;
            size_t n = fread(xml_buf, 1, want, fp);
            xml_buf[n] = '\0';
            // Find version without modifying the buffer
            char *ver = strstr(xml_buf, "format=\"");
            if (ver) {
                ver += 8;
                char *end = strchr(ver, '"');
                if (end) printf("  Version: %.*s\n", (int)(end - ver), ver);
            }
            printf("  Content (first 500 chars): %s...\n", xml_buf);
        }
        if (fp) fclose(fp);
        break;
    }
    if (!xml_found) printf("\ngeogebra.xml: Missing\n");

    // Similar for thumbnail (size only)
    int thumb_found = 0;
    for (int i = 0; i < ggb->num_entries; i++) {
        if (strcmp(ggb->entries[i].name, "geogebra_thumbnail.png") == 0) {
            printf("\ngeogebra_thumbnail.png: Present (Size: %lu bytes)\n", ggb->entries[i].uncompressed_size);
            thumb_found = 1;
            break;
        }
    }
    if (!thumb_found) printf("\ngeogebra_thumbnail.png: Missing\n");

    // geogebra.js
    int js_found = 0;
    for (int i = 0; i < ggb->num_entries; i++) {
        if (strcmp(ggb->entries[i].name, "geogebra.js") == 0 || strcmp(ggb->entries[i].name, "geogebra_main.js") == 0) {
            printf("\ngeogebra.js: Present (Size: %lu bytes)\n", ggb->entries[i].uncompressed_size);
            js_found = 1;
            break;
        }
    }
    if (!js_found) printf("\ngeogebra.js: Missing\n");

    // images/
    int img_count = 0;
    printf("\nImages Directory Files: ");
    for (int i = 0; i < ggb->num_entries; i++) {
        if (strncmp(ggb->entries[i].name, "images/", 7) == 0) {
            img_count++;
            printf("%s (%lu bytes) ", ggb->entries[i].name + 7, ggb->entries[i].uncompressed_size);
        }
    }
    printf("(%d total)\n", img_count);
}

void ggb_write(GGBFile *ggb, const char *output_path) {
    // Simple copy of the file that ggb_open read
    if (!ggb) return;
    FILE *in = fopen(ggb->path, "rb");
    FILE *out = fopen(output_path, "wb");
    if (!in || !out) {
        if (in) fclose(in);
        if (out) fclose(out);
        printf("\nCould not copy to %s\n", output_path);
        return;
    }
    char buf[BUFFER_SIZE];
    size_t bytes;
    while ((bytes = fread(buf, 1, BUFFER_SIZE, in)) > 0) {
        fwrite(buf, 1, bytes, out);
    }
    fclose(in);
    fclose(out);
    printf("\nWritten to %s\n", output_path);
}

void ggb_close(GGBFile *ggb) {
    if (!ggb) return;
    for (int i = 0; i < ggb->num_entries; i++) {
        free(ggb->entries[i].name);
    }
    free(ggb->entries);
    free(ggb->path);
    free(ggb);
}

// Usage:
// GGBFile *ggb = ggb_open("example.ggb");
// ggb_print_properties(ggb);
// ggb_write(ggb, "output.ggb");
// ggb_close(ggb);

(Note: This C implementation is simplified; full ZIP parsing (e.g., variable lengths, deflate) requires more code. For production, use libzip.)
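
The Deflate step that the note leaves out is easiest to show in Python, the language of section 4. The sketch below reads one local file header by hand and inflates the raw Deflate stream that follows it. ZIP stores Deflate data without a zlib wrapper, which is why the window-bits argument is -15. The function name read_entry_raw is an assumption for this example, the demo archive is built in memory, and entries that use a data descriptor (flag bit 3) are rejected rather than handled.

```python
import io
import struct
import zipfile
import zlib


def read_entry_raw(data: bytes, local_offset: int) -> bytes:
    """Decode the entry whose local file header starts at local_offset."""
    # 30 fixed bytes: signature, 5 x uint16, 3 x uint32, 2 x uint16
    (sig, _version, flags, method, _time, _date,
     crc, comp_size, _uncomp_size, name_len, extra_len) = struct.unpack_from(
        "<4s5H3L2H", data, local_offset)
    if sig != b"PK\x03\x04":
        raise ValueError("no local file header at this offset")
    if flags & 0x08:
        # With bit 3 set the sizes live in the data descriptor or the
        # central directory; this sketch does not follow them.
        raise ValueError("entry uses a data descriptor")

    start = local_offset + 30 + name_len + extra_len
    payload = data[start:start + comp_size]
    if method == 0:        # stored
        out = payload
    elif method == 8:      # Deflate, raw stream (no zlib header)
        out = zlib.decompress(payload, -15)
    else:
        raise ValueError(f"unsupported compression method {method}")
    if zlib.crc32(out) != crc:
        raise ValueError("CRC-32 mismatch")
    return out


if __name__ == "__main__":
    xml = '<geogebra format="5.0"><construction/></geogebra>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("geogebra.xml", xml)
    # header_offset is the entry's local header position as recorded by zipfile
    with zipfile.ZipFile(buf) as zf:
        offset = zf.getinfo("geogebra.xml").header_offset
    print(read_entry_raw(buf.getvalue(), offset).decode("utf-8"))
```

A C version would follow the same steps, using zlib's inflateInit2 with a window-bits value of -15 in place of zlib.decompress.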