Task 405: .MOBI File Format

1. MOBI File Format Specifications

The .MOBI file format is an eBook format originally developed for the Mobipocket Reader and later adopted by Amazon for Kindle devices (as AZW). It is based on the Palm Database (PDB) format and extends the PalmDOC format with HTML-like tags for structure, metadata, and multimedia. MOBI supports compression (none, PalmDOC LZ77, or HUFF/CDIC), encryption (old or new Mobipocket schemes), images, indices, and extended metadata via EXTH records. The format is binary, with multi-byte values stored big-endian. A file consists of a PDB header and its record info list, then record 0 containing the PalmDOC, MOBI, and (optionally) EXTH headers, then the text, image, and index records.

Key characteristics:

  • File Extension: .mobi
  • MIME Type: application/x-mobipocket-ebook
  • Based On: Palm PDB (Creator ID: "MOBI", Type: "BOOK")
  • Compression: Optional (types: 1=none, 2=PalmDOC, 17480=HUFF/CDIC)
  • Encryption: Optional (0=none, 1=old Mobipocket, 2=Mobipocket)
  • Text Encoding: CP1252 or UTF-8
  • Structure: Headers in record 0; text in compressed 4096-byte blocks; optional images, indices, DRM
  • Versions: Supports up to KF8 (Mobi Type 248) for enhanced features
  • Metadata: Stored in EXTH records (e.g., author, ISBN, cover offset)
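
As a quick illustration of the "Based On" entry above, the sketch below checks the PDB type and creator fields. The offsets 60 and 64 follow from the field sizes in the PDB header list in section 2; the function name `looks_like_mobi` is hypothetical, and the sample bytes are synthetic rather than a real book.

```python
def looks_like_mobi(data: bytes) -> bool:
    """Return True if the PDB type/creator fields read b'BOOK' / b'MOBI'."""
    # The fixed PDB header is 78 bytes; type sits at offset 60, creator at 64.
    if len(data) < 78:
        return False
    return data[60:64] == b"BOOK" and data[64:68] == b"MOBI"


# Synthetic 78-byte header for illustration only (not a complete MOBI file).
sample = bytearray(78)
sample[60:68] = b"BOOKMOBI"
print(looks_like_mobi(bytes(sample)))
print(looks_like_mobi(b"not a mobi"))
```

A check like this only confirms the container type; it says nothing about whether the MOBI header in record 0 is well-formed.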

2. List of Properties Intrinsic to the MOBI File Format

The following list covers the properties (fields) of the MOBI file structure, derived from its headers and sections. They are intrinsic to the format in the sense that they define the binary layout, record management, metadata, and content organization of the PDB-based container. Properties are grouped by header or section, with types, sizes, and descriptions. Variable-length or optional fields are noted.

PDB Header (Offsets from file start; big-endian)

  • Database Name: string (32 bytes, null-terminated) - Book title/author snippet.
  • Attributes: uint16 - Bit flags (e.g., read-only, backup).
  • Version: uint16 - Database version.
  • Creation Date: uint32 - Seconds since Jan 1, 1904.
  • Modification Date: uint32 - Seconds since Jan 1, 1904.
  • Last Backup Date: uint32 - Seconds since Jan 1, 1904.
  • Modification Number: uint32 - Internal tracking.
  • App Info ID: uint32 - Offset to app info area (or 0).
  • Sort Info ID: uint32 - Offset to sort info area (or 0).
  • Type: char[4] - "BOOK" for MOBI.
  • Creator: char[4] - "MOBI".
  • Unique ID Seed: uint32 - For record IDs.
  • Next Record List ID: uint32 - Always 0 in files.
  • Number of Records: uint16 - Total PDB records (N).
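
The fields above add up to a fixed 78-byte header, so they can be unpacked in one call. The following is a minimal sketch, assuming the 1904 epoch stated in the list; the names `PDB_HEADER` and `parse_pdb_header` are hypothetical, decoding the name as latin-1 is an assumption, and the sample header is synthetic.

```python
import struct
from datetime import datetime, timedelta

# 32s name, 2 x uint16, 6 x uint32, 2 x char[4], 2 x uint32, uint16 = 78 bytes
PDB_HEADER = struct.Struct(">32sHHIIIIII4s4sIIH")
PALM_EPOCH = datetime(1904, 1, 1)


def parse_pdb_header(data: bytes) -> dict:
    """Unpack the fixed 78-byte PDB header at the start of the file."""
    (name, attributes, version, created, modified, backed_up,
     mod_number, app_info_id, sort_info_id, type_, creator,
     uid_seed, next_record_list_id, num_records) = PDB_HEADER.unpack_from(data, 0)
    return {
        # latin-1 is an assumption; it never fails on arbitrary bytes
        "name": name.split(b"\x00", 1)[0].decode("latin-1"),
        "type": type_,
        "creator": creator,
        "created": PALM_EPOCH + timedelta(seconds=created),
        "modified": PALM_EPOCH + timedelta(seconds=modified),
        "num_records": num_records,
    }


# Synthetic header for illustration only.
sample = PDB_HEADER.pack(b"My_Book", 0, 0, 86400, 86400, 0, 0, 0, 0,
                         b"BOOK", b"MOBI", 0, 0, 3)
info = parse_pdb_header(sample)
print(PDB_HEADER.size, info["name"], info["created"].date(), info["num_records"])
```

Using a single `struct.Struct` keeps the field order visible in one place, which makes it easier to compare against the list above.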

Record Info List (Repeats N times, 8 bytes each)

  • Record Data Offset: uint32 - File offset to record data.
  • Record Attributes: uint8 - Bit flags (e.g., secret, dirty).
  • Unique ID: uint24 - Sequential or unique record ID.
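
Each entry stores only a start offset, so a record's length has to be inferred from where the next record begins (or from the end of the file for the last record). A rough sketch of that calculation, with a hypothetical helper name `read_record_table` and a synthetic two-record sample:

```python
import struct


def read_record_table(data: bytes) -> list:
    """Return (offset, length, attributes, unique_id) for every PDB record."""
    (num_records,) = struct.unpack_from(">H", data, 76)
    entries = []
    for i in range(num_records):
        # 8 bytes per entry: uint32 offset, then 1 attribute byte + 3 ID bytes
        offset, attr_and_id = struct.unpack_from(">II", data, 78 + 8 * i)
        entries.append((offset, attr_and_id >> 24, attr_and_id & 0xFFFFFF))
    records = []
    for i, (offset, attributes, unique_id) in enumerate(entries):
        # A record ends where the next one starts; the last one runs to EOF.
        end = entries[i + 1][0] if i + 1 < len(entries) else len(data)
        records.append((offset, end - offset, attributes, unique_id))
    return records


# Synthetic file: 78-byte header, two entries, 2 filler bytes, two records.
blob = bytearray(78)
struct.pack_into(">H", blob, 76, 2)
blob += struct.pack(">II", 96, 0x00000000)
blob += struct.pack(">II", 100, 0x40000002)
blob += b"\x00\x00"            # filler so the first record starts at 96
blob += b"AAAA" + b"BBBBBB"    # record 0 (4 bytes) and record 1 (6 bytes)
print(read_record_table(bytes(blob)))
```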

PalmDOC Header (First 16 bytes of record 0)

  • Compression: uint16 - 1=none, 2=PalmDOC, 17480=HUFF/CDIC.
  • Unused: uint16 - Always 0.
  • Text Length: uint32 - Uncompressed text length.
  • Record Count: uint16 - Text records count.
  • Record Size: uint16 - Max text record size (4096).
  • Encryption Type: uint16 - 0=none, 1=old Mobipocket, 2=Mobipocket.
  • Unknown: uint16 - Usually 0.
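
The seven PalmDOC fields fit a single 16-byte big-endian struct. The sketch below decodes them and maps the compression and encryption codes to the labels used above; `parse_palmdoc_header` and the lookup tables are hypothetical names, and the sample bytes are synthetic.

```python
import struct

PALMDOC_HEADER = struct.Struct(">HHIHHHH")  # 16 bytes at the start of record 0
COMPRESSION = {1: "none", 2: "PalmDOC", 17480: "HUFF/CDIC"}
ENCRYPTION = {0: "none", 1: "old Mobipocket", 2: "Mobipocket"}


def parse_palmdoc_header(record0: bytes) -> dict:
    """Decode the PalmDOC header found in the first 16 bytes of record 0."""
    (compression, _unused, text_length, record_count,
     record_size, encryption, _unknown) = PALMDOC_HEADER.unpack_from(record0, 0)
    return {
        "compression": COMPRESSION.get(compression, f"unknown ({compression})"),
        "text_length": text_length,
        "record_count": record_count,
        "record_size": record_size,
        "encryption": ENCRYPTION.get(encryption, f"unknown ({encryption})"),
    }


# Synthetic header: PalmDOC compression, 10000 bytes of text in 3 records.
sample = PALMDOC_HEADER.pack(2, 0, 10000, 3, 4096, 0, 0)
print(parse_palmdoc_header(sample))
```

The value 17480 is 0x4448, which is the two ASCII bytes "DH".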

MOBI Header (Follows PalmDOC in record 0; variable length; hex offsets in field names are measured from the start of record 0)

  • Identifier: char[4] - "MOBI".
  • Header Length: uint32 - MOBI header size.
  • Mobi Type: uint32 - e.g., 2=book, 248=KF8.
  • Text Encoding: uint32 - 1252=CP1252, 65001=UTF-8.
  • Unique ID: uint32 - Random ID.
  • File Version: uint32 - MOBI format version.
  • Orthographic Index: uint32 - Meta index section (or 0xFFFFFFFF).
  • Inflection Index: uint32 - Meta index section (or 0xFFFFFFFF).
  • Index Names: uint32 - (or 0xFFFFFFFF).
  • Index Keys: uint32 - (or 0xFFFFFFFF).
  • Extra Index 0-5: uint32 (6 fields) - Meta indices (or 0xFFFFFFFF).
  • First Non-Book Index: uint32 - First non-text record.
  • Full Name Offset: uint32 - Offset to book name in record 0.
  • Full Name Length: uint32 - Book name length.
  • Locale: uint32 - Language code (e.g., 1033=US English).
  • Input Language: uint32 - Dictionary input lang.
  • Output Language: uint32 - Dictionary output lang.
  • Min Version: uint32 - Min reader version required.
  • First Image Index: uint32 - First image record.
  • Huffman Record Offset: uint32 - First Huffman record.
  • Huffman Record Count: uint32 - Huffman records count.
  • Huffman Table Offset: uint32.
  • Huffman Table Length: uint32.
  • EXTH Flags: uint32 - Bitfield (0x40 if EXTH present).
  • Unknown (0x84): bytes[32] - Unknown.
  • Unknown (0xA4): uint32 - 0xFFFFFFFF.
  • DRM Offset: uint32 - DRM info offset (or 0xFFFFFFFF).
  • DRM Count: uint32 - DRM entries (or 0xFFFFFFFF).
  • DRM Size: uint32 - DRM bytes.
  • DRM Flags: uint32 - DRM flags.
  • Unknown (0xB8): bytes[8] - 0x0000000000000000.
  • First Content Record Number: uint16 - Usually 1.
  • Last Content Record Number: uint16 - Last content record.
  • Unknown (0xC4): uint32 - 0x00000001.
  • FCIS Record Number: uint32.
  • Unknown (0xCC): uint32 - 0x00000001.
  • FLIS Record Number: uint32.
  • Unknown (0xD4): uint32 - 0x00000001.
  • Unknown (0xD8): bytes[8] - 0x0000000000000000.
  • Unknown (0xE0): uint32 - 0xFFFFFFFF.
  • First Compilation Data Section Count: uint32 - 0x00000000.
  • Number of Compilation Data Sections: uint32 - 0xFFFFFFFF.
  • Unknown (0xEC): uint32 - 0xFFFFFFFF.
  • Extra Record Data Flags: uint32 - Bits for extra data in text blocks.
  • INDX Record Offset: uint32 - First INDX record (or 0xFFFFFFFF).
  • Unknown (0xF8 to 0x104): uint32 (5 fields) - 0xFFFFFFFF.
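
To show how a few of these fields work together, here is a sketch that reads the book's full name. It assumes the Full Name Offset is measured from the start of record 0 and that the MOBI identifier sits right after the 16-byte PalmDOC header; the offsets 28, 84, and 88 are derived by summing the field sizes above. `read_full_name` is a hypothetical helper and the record is synthetic.

```python
import struct


def read_full_name(record0: bytes) -> str:
    """Return the full book name referenced by the MOBI header."""
    if record0[16:20] != b"MOBI":
        raise ValueError("record 0 does not contain a MOBI header")
    (encoding,) = struct.unpack_from(">I", record0, 28)
    name_offset, name_length = struct.unpack_from(">II", record0, 84)
    # 65001 = UTF-8; anything else is treated as CP1252 in this sketch
    codec = "utf-8" if encoding == 65001 else "cp1252"
    raw = record0[name_offset:name_offset + name_length]
    return raw.decode(codec, errors="replace")


# Synthetic record 0: only the fields used above are filled in.
title = "Émile".encode("utf-8")
rec = bytearray(300)
rec[16:20] = b"MOBI"
struct.pack_into(">I", rec, 28, 65001)
struct.pack_into(">II", rec, 84, 280, len(title))
rec[280:280 + len(title)] = title
print(read_full_name(bytes(rec)))
```

A more careful reader would also compare each field's offset against Header Length before trusting it, since older files carry shorter MOBI headers.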

EXTH Header (Follows MOBI if present; variable length, padded to 4 bytes)

  • Identifier: char[4] - "EXTH".
  • Header Length: uint32 - EXTH size (excl. padding).
  • Record Count: uint32 - Number of EXTH records.
  • EXTH Records: Variable (repeats: uint32 type, uint32 length, data[length-8]). Common types include:
      • 100: Author (string)
      • 101: Publisher (string)
      • 103: Description (string)
      • 104: ISBN (string)
      • 105: Subject (string, multiple possible)
      • 106: Publishing Date (string)
      • 108: Contributor (string)
      • 109: Rights (string)
      • 113: ASIN (string)
      • 116: Start Reading Offset (uint32)
      • 117: Adult (string, "yes")
      • 118: Retail Price (string)
      • 119: Retail Price Currency (string)
      • 121: KF8 Boundary Offset (uint32)
      • 201: Cover Offset (uint32)
      • 202: Thumb Offset (uint32)
      • And others (1-3: DRM IDs, 200: Dict Short Name, etc.)
  • Padding: Null bytes to 4-byte alignment.
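
Because each EXTH record carries its own length (which includes the 8-byte type/length prefix), the records can be walked without knowing every type in advance. The sketch below treats the uint32 types listed above as numbers and everything else as text; decoding text as UTF-8 is an assumption (a real reader would follow the Text Encoding field), and `parse_exth` and `_rec` are hypothetical helpers operating on synthetic data.

```python
import struct

# Types the list above describes as uint32 rather than string
EXTH_UINT32_TYPES = {116, 121, 201, 202}


def parse_exth(block: bytes) -> list:
    """Return (type, value) pairs from a block that starts with b'EXTH'."""
    if block[:4] != b"EXTH":
        raise ValueError("missing EXTH identifier")
    _header_length, count = struct.unpack_from(">II", block, 4)
    pos = 12
    records = []
    for _ in range(count):
        rec_type, rec_len = struct.unpack_from(">II", block, pos)
        if rec_len < 8:
            raise ValueError("EXTH record length must include its 8-byte prefix")
        payload = block[pos + 8:pos + rec_len]
        if rec_type in EXTH_UINT32_TYPES and len(payload) == 4:
            value = struct.unpack(">I", payload)[0]
        else:
            value = payload.decode("utf-8", errors="replace")
        records.append((rec_type, value))
        pos += rec_len
    return records


def _rec(rec_type: int, payload: bytes) -> bytes:
    """Build one synthetic EXTH record for the demo below."""
    return struct.pack(">II", rec_type, len(payload) + 8) + payload


body = _rec(100, b"Jane Doe") + _rec(201, struct.pack(">I", 0))
sample = b"EXTH" + struct.pack(">II", 12 + len(body), 2) + body
print(parse_exth(sample))
```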

Other Sections/Properties

  • Text Records: Compressed HTML/text blocks (4096 bytes uncompressed).
  • Image Records: Sequential binary images (JPEG/GIF, referenced by index).
  • Index Records: INDX for TOC/search (variable structure with tags, offsets).
  • HUFF/CDIC Records: For advanced compression (tables, dictionaries).
  • FDST Record: Flow data for reflowable text.
  • DATP Record: Unknown/optional.
  • FCIS/FLIS Records: Fixed strings for reader info (e.g., "FCIS" with lengths).
  • Extra Data in Text Blocks: Optional (multibyte, TBS indexing, based on flags).

These properties define the file's layout, content access, and metadata.
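
For text records stored with compression type 2, the commonly described PalmDOC LZ77 scheme can be decoded in a few lines. This is a sketch of that well-known byte-code layout, not code taken from any particular reader; `palmdoc_decompress` is a hypothetical name, and it does not strip any trailing extra data that the Extra Record Data Flags may indicate.

```python
def palmdoc_decompress(data: bytes) -> bytes:
    """Decode one PalmDOC-compressed text record."""
    out = bytearray()
    i = 0
    while i < len(data):
        c = data[i]
        i += 1
        if c == 0x00 or 0x09 <= c <= 0x7F:
            out.append(c)                     # plain literal byte
        elif c <= 0x08:
            out += data[i:i + c]              # the next c bytes are literals
            i += c
        elif c <= 0xBF:
            if i >= len(data):
                raise ValueError("truncated distance/length pair")
            pair = ((c << 8) | data[i]) & 0x3FFF
            i += 1
            distance = pair >> 3              # 11-bit distance back into output
            length = (pair & 0x07) + 3        # 3-bit length, biased by 3
            if distance == 0 or distance > len(out):
                raise ValueError("invalid back-reference")
            for _ in range(length):           # byte-wise, since the copy may overlap
                out.append(out[-distance])
        else:
            out += bytes((0x20, c ^ 0x80))    # space followed by an ASCII character
    return bytes(out)


# 0x02 'h' 'i'  -> two literals; 0xC1 -> " A"; "abc"; 0x80 0x18 -> copy 3 from 3 back
print(palmdoc_decompress(b"\x02hi\xc1abc\x80\x18"))
```

The copy loop runs one byte at a time on purpose: when the length exceeds the distance, later bytes of the copy read output produced earlier in the same copy.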

3. Ghost Blog Embedded HTML/JavaScript for Drag-and-Drop .MOBI Parser

Embed this in a Ghost blog post using the HTML card. It uses the FileReader API to read the dropped .MOBI file client-side, parses the binary data (big-endian), extracts all properties from the list above, and dumps them to the screen in a pre-formatted block.

[Embedded widget "MOBI Property Dumper": a drop zone labeled "Drag and drop a .MOBI file here". The HTML/JavaScript source is not included here.]

4. Python Class for .MOBI Handling

This class opens a .MOBI file, decodes/parses all properties, prints them to console, and can write a modified file (e.g., updates a property and saves).

import struct

class MobiHandler:
    def __init__(self, filepath):
        self.filepath = filepath
        self.properties = {}
        with open(filepath, 'rb') as f:
            self.data = f.read()
        self.offset = 0
        self.parse()

    def read_uint8(self):
        val = self.data[self.offset]
        self.offset += 1
        return val

    def read_uint16(self):
        val, = struct.unpack('>H', self.data[self.offset:self.offset+2])
        self.offset += 2
        return val

    def read_uint32(self):
        val, = struct.unpack('>I', self.data[self.offset:self.offset+4])
        self.offset += 4
        return val

    def read_string(self, length):
        str_data = self.data[self.offset:self.offset+length]
        self.offset += length
        return str_data.split(b'\x00', 1)[0].decode('utf-8', errors='ignore')

    def read_bytes(self, length):
        bytes_data = self.data[self.offset:self.offset+length]
        self.offset += length
        return bytes_data

    def parse_pdb_header(self):
        self.properties['Database Name'] = self.read_string(32)
        self.properties['Attributes'] = self.read_uint16()
        self.properties['Version'] = self.read_uint16()
        self.properties['Creation Date'] = self.read_uint32()
        self.properties['Modification Date'] = self.read_uint32()
        self.properties['Last Backup Date'] = self.read_uint32()
        self.properties['Modification Number'] = self.read_uint32()
        self.properties['App Info ID'] = self.read_uint32()
        self.properties['Sort Info ID'] = self.read_uint32()
        self.properties['Type'] = self.read_string(4)
        self.properties['Creator'] = self.read_string(4)
        self.properties['Unique ID Seed'] = self.read_uint32()
        self.properties['Next Record List ID'] = self.read_uint32()
        self.properties['Number of Records'] = self.read_uint16()

    def parse_record_info_list(self):
        num_records = self.properties['Number of Records']
        self.properties['Record Infos'] = []
        for _ in range(num_records):
            info = {
                'Record Data Offset': self.read_uint32(),
                'Record Attributes': self.read_uint8(),
                'Unique ID': (self.read_uint8() << 16) | (self.read_uint8() << 8) | self.read_uint8()
            }
            self.properties['Record Infos'].append(info)

    def parse_palmdoc_header(self):
        self.offset = self.properties['Record Infos'][0]['Record Data Offset']
        self.properties['Compression'] = self.read_uint16()
        self.properties['Unused'] = self.read_uint16()
        self.properties['Text Length'] = self.read_uint32()
        self.properties['Record Count'] = self.read_uint16()
        self.properties['Record Size'] = self.read_uint16()
        self.properties['Encryption Type'] = self.read_uint16()
        self.properties['Unknown (PalmDOC)'] = self.read_uint16()

    def parse_mobi_header(self):
        self.properties['Identifier (MOBI)'] = self.read_string(4)
        self.properties['Header Length (MOBI)'] = self.read_uint32()
        # Header Length counts from the "MOBI" identifier, so subtract the 8 bytes already read
        header_end = self.offset + self.properties['Header Length (MOBI)'] - 8
        self.properties['Mobi Type'] = self.read_uint32()
        self.properties['Text Encoding'] = self.read_uint32()
        self.properties['Unique ID'] = self.read_uint32()
        self.properties['File Version'] = self.read_uint32()
        self.properties['Orthographic Index'] = self.read_uint32()
        self.properties['Inflection Index'] = self.read_uint32()
        self.properties['Index Names'] = self.read_uint32()
        self.properties['Index Keys'] = self.read_uint32()
        self.properties['Extra Index 0'] = self.read_uint32()
        self.properties['Extra Index 1'] = self.read_uint32()
        self.properties['Extra Index 2'] = self.read_uint32()
        self.properties['Extra Index 3'] = self.read_uint32()
        self.properties['Extra Index 4'] = self.read_uint32()
        self.properties['Extra Index 5'] = self.read_uint32()
        self.properties['First Non-Book Index'] = self.read_uint32()
        self.properties['Full Name Offset'] = self.read_uint32()
        self.properties['Full Name Length'] = self.read_uint32()
        self.properties['Locale'] = self.read_uint32()
        self.properties['Input Language'] = self.read_uint32()
        self.properties['Output Language'] = self.read_uint32()
        self.properties['Min Version'] = self.read_uint32()
        self.properties['First Image Index'] = self.read_uint32()
        self.properties['Huffman Record Offset'] = self.read_uint32()
        self.properties['Huffman Record Count'] = self.read_uint32()
        self.properties['Huffman Table Offset'] = self.read_uint32()
        self.properties['Huffman Table Length'] = self.read_uint32()
        self.properties['EXTH Flags'] = self.read_uint32()
        self.properties['Unknown (0x84)'] = self.read_bytes(32)
        self.properties['Unknown (0xA4)'] = self.read_uint32()
        self.properties['DRM Offset'] = self.read_uint32()
        self.properties['DRM Count'] = self.read_uint32()
        self.properties['DRM Size'] = self.read_uint32()
        self.properties['DRM Flags'] = self.read_uint32()
        self.properties['Unknown (0xB8)'] = self.read_bytes(8)
        self.properties['First Content Record Number'] = self.read_uint16()
        self.properties['Last Content Record Number'] = self.read_uint16()
        self.properties['Unknown (0xC4)'] = self.read_uint32()
        self.properties['FCIS Record Number'] = self.read_uint32()
        self.properties['Unknown (0xCC)'] = self.read_uint32()
        self.properties['FLIS Record Number'] = self.read_uint32()
        self.properties['Unknown (0xD4)'] = self.read_uint32()
        self.properties['Unknown (0xD8)'] = self.read_bytes(8)
        self.properties['Unknown (0xE0)'] = self.read_uint32()
        self.properties['First Compilation Data Section Count'] = self.read_uint32()
        self.properties['Number of Compilation Data Sections'] = self.read_uint32()
        self.properties['Unknown (0xEC)'] = self.read_uint32()
        self.properties['Extra Record Data Flags'] = self.read_uint32()
        self.properties['INDX Record Offset'] = self.read_uint32()
        self.offset = header_end  # Skip remaining unknown

    def parse_exth_header(self):
        if self.properties['EXTH Flags'] & 0x40 == 0:
            return
        self.properties['Identifier (EXTH)'] = self.read_string(4)
        self.properties['Header Length (EXTH)'] = self.read_uint32()
        self.properties['Record Count (EXTH)'] = self.read_uint32()
        self.properties['EXTH Records'] = []
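        for _ in range(self.properties['Record Count (EXTH)']):
            type_ = self.read_uint32()
            len_ = self.read_uint32()
            data = self.read_bytes(len_ - 8)  # len_ includes the 8-byte type/length prefix
            if type_ in (116, 121, 201, 202) and len(data) == 4:
                # Offset-style records hold a big-endian uint32, not text
                value = struct.unpack('>I', data)[0]
            else:
                value = data.decode('utf-8', errors='ignore')
            self.properties['EXTH Records'].append({'type': type_, 'data': value})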

    def parse(self):
        self.parse_pdb_header()
        self.parse_record_info_list()
        self.parse_palmdoc_header()
        self.parse_mobi_header()
        self.parse_exth_header()

    def print_properties(self):
        import pprint
        pprint.pprint(self.properties)

    def write(self, new_filepath, modify_prop=None):
        # Simple write: copy data, optionally modify a property (e.g., {'Database Name': 'New Name'})
        data = bytearray(self.data)
        if modify_prop:
            for key, value in modify_prop.items():
                if key == 'Database Name':
                    # Example modification: overwrite at offset 0
                    new_name = value.encode('utf-8')[:31] + b'\x00'
                    data[0:32] = new_name.ljust(32, b'\x00')
                # Add more modifications as needed
        with open(new_filepath, 'wb') as f:
            f.write(data)

# Usage example
if __name__ == '__main__':
    handler = MobiHandler('example.mobi')
    handler.print_properties()
    handler.write('modified.mobi', {'Database Name': 'Modified Title'})

5. Java Class for .MOBI Handling

This Java class opens a .MOBI file, decodes/parses properties, prints to console, and can write a modified file.

import java.io.*;
import java.nio.*;
import java.nio.channels.FileChannel;
import java.util.*;

public class MobiHandler {
    private String filepath;
    private Map<String, Object> properties = new LinkedHashMap<>(); // keeps fields in file order when printed
    private ByteBuffer buffer;

    public MobiHandler(String filepath) throws IOException {
        this.filepath = filepath;
        try (RandomAccessFile raf = new RandomAccessFile(filepath, "r")) {
            byte[] bytes = new byte[(int) raf.length()];
            raf.readFully(bytes); // a single channel read is not guaranteed to fill the buffer
            buffer = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN);
        }
        parse();
    }

    private int readUint8() {
        return buffer.get() & 0xFF;
    }

    private int readUint16() {
        return buffer.getShort() & 0xFFFF;
    }

    private long readUint32() {
        return buffer.getInt() & 0xFFFFFFFFL;
    }

    private String readString(int length) {
        byte[] bytes = new byte[length];
        buffer.get(bytes);
        int nullIndex = 0;
        for (; nullIndex < length && bytes[nullIndex] != 0; nullIndex++);
        return new String(bytes, 0, nullIndex);
    }

    private byte[] readBytes(int length) {
        byte[] bytes = new byte[length];
        buffer.get(bytes);
        return bytes;
    }

    private void parsePdbHeader() {
        properties.put("Database Name", readString(32));
        properties.put("Attributes", readUint16());
        properties.put("Version", readUint16());
        properties.put("Creation Date", readUint32());
        properties.put("Modification Date", readUint32());
        properties.put("Last Backup Date", readUint32());
        properties.put("Modification Number", readUint32());
        properties.put("App Info ID", readUint32());
        properties.put("Sort Info ID", readUint32());
        properties.put("Type", readString(4));
        properties.put("Creator", readString(4));
        properties.put("Unique ID Seed", readUint32());
        properties.put("Next Record List ID", readUint32());
        properties.put("Number of Records", readUint16());
    }

    private void parseRecordInfoList() {
        int numRecords = (int) properties.get("Number of Records");
        List<Map<String, Object>> recordInfos = new ArrayList<>();
        for (int i = 0; i < numRecords; i++) {
            Map<String, Object> info = new HashMap<>();
            info.put("Record Data Offset", readUint32());
            info.put("Record Attributes", readUint8());
            info.put("Unique ID", (readUint8() << 16) | (readUint8() << 8) | readUint8());
            recordInfos.add(info);
        }
        properties.put("Record Infos", recordInfos);
    }

    private void parsePalmDocHeader() {
        long record0Offset = (long) ((List<Map<String, Object>>) properties.get("Record Infos")).get(0).get("Record Data Offset");
        buffer.position((int) record0Offset);
        properties.put("Compression", readUint16());
        properties.put("Unused", readUint16());
        properties.put("Text Length", readUint32());
        properties.put("Record Count", readUint16());
        properties.put("Record Size", readUint16());
        properties.put("Encryption Type", readUint16());
        properties.put("Unknown (PalmDOC)", readUint16());
    }

    private void parseMobiHeader() {
        properties.put("Identifier (MOBI)", readString(4));
        long headerLength = readUint32();
        int headerEnd = buffer.position() + (int) headerLength - 8;
        properties.put("Header Length (MOBI)", headerLength);
        properties.put("Mobi Type", readUint32());
        properties.put("Text Encoding", readUint32());
        properties.put("Unique ID", readUint32());
        properties.put("File Version", readUint32());
        properties.put("Orthographic Index", readUint32());
        properties.put("Inflection Index", readUint32());
        properties.put("Index Names", readUint32());
        properties.put("Index Keys", readUint32());
        properties.put("Extra Index 0", readUint32());
        properties.put("Extra Index 1", readUint32());
        properties.put("Extra Index 2", readUint32());
        properties.put("Extra Index 3", readUint32());
        properties.put("Extra Index 4", readUint32());
        properties.put("Extra Index 5", readUint32());
        properties.put("First Non-Book Index", readUint32());
        properties.put("Full Name Offset", readUint32());
        properties.put("Full Name Length", readUint32());
        properties.put("Locale", readUint32());
        properties.put("Input Language", readUint32());
        properties.put("Output Language", readUint32());
        properties.put("Min Version", readUint32());
        properties.put("First Image Index", readUint32());
        properties.put("Huffman Record Offset", readUint32());
        properties.put("Huffman Record Count", readUint32());
        properties.put("Huffman Table Offset", readUint32());
        properties.put("Huffman Table Length", readUint32());
        properties.put("EXTH Flags", readUint32());
        properties.put("Unknown (0x84)", readBytes(32));
        properties.put("Unknown (0xA4)", readUint32());
        properties.put("DRM Offset", readUint32());
        properties.put("DRM Count", readUint32());
        properties.put("DRM Size", readUint32());
        properties.put("DRM Flags", readUint32());
        properties.put("Unknown (0xB8)", readBytes(8));
        properties.put("First Content Record Number", readUint16());
        properties.put("Last Content Record Number", readUint16());
        properties.put("Unknown (0xC4)", readUint32());
        properties.put("FCIS Record Number", readUint32());
        properties.put("Unknown (0xCC)", readUint32());
        properties.put("FLIS Record Number", readUint32());
        properties.put("Unknown (0xD4)", readUint32());
        properties.put("Unknown (0xD8)", readBytes(8));
        properties.put("Unknown (0xE0)", readUint32());
        properties.put("First Compilation Data Section Count", readUint32());
        properties.put("Number of Compilation Data Sections", readUint32());
        properties.put("Unknown (0xEC)", readUint32());
        properties.put("Extra Record Data Flags", readUint32());
        properties.put("INDX Record Offset", readUint32());
        buffer.position(headerEnd);
    }

    private void parseExthHeader() {
        long exthFlags = (long) properties.get("EXTH Flags");
        if ((exthFlags & 0x40) == 0) return;
        properties.put("Identifier (EXTH)", readString(4));
        long headerLength = readUint32();
        properties.put("Header Length (EXTH)", headerLength);
        int recordCount = (int) readUint32();
        properties.put("Record Count (EXTH)", recordCount);
        List<Map<String, Object>> exthRecords = new ArrayList<>();
        for (int i = 0; i < recordCount; i++) {
            Map<String, Object> record = new HashMap<>();
            long type = readUint32();
            long len = readUint32();
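            byte[] data = readBytes((int) (len - 8)); // len includes the 8-byte type/length prefix
            record.put("type", type);
            // Offset-style records (116, 121, 201, 202) hold a big-endian uint32, not text
            boolean numeric = data.length == 4 && (type == 116 || type == 121 || type == 201 || type == 202);
            record.put("data", numeric
                    ? (Object) (ByteBuffer.wrap(data).getInt() & 0xFFFFFFFFL)
                    : new String(data, java.nio.charset.StandardCharsets.UTF_8));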
            exthRecords.add(record);
        }
        properties.put("EXTH Records", exthRecords);
    }

    private void parse() {
        parsePdbHeader();
        parseRecordInfoList();
        parsePalmDocHeader();
        parseMobiHeader();
        parseExthHeader();
    }

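    public void printProperties() {
        properties.forEach((key, value) -> {
            // byte[] has no useful toString(), so render it element by element
            Object shown = (value instanceof byte[]) ? Arrays.toString((byte[]) value) : value;
            System.out.println(key + ": " + shown);
        });
    }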

    public void write(String newFilepath, Map<String, String> modifyProps) throws IOException {
        // Work on a copy so the parsed buffer is left untouched
        byte[] out = Arrays.copyOf(buffer.array(), buffer.limit());
        if (modifyProps != null) {
            for (Map.Entry<String, String> entry : modifyProps.entrySet()) {
                if (entry.getKey().equals("Database Name")) {
                    byte[] newName = entry.getValue().getBytes(java.nio.charset.StandardCharsets.UTF_8);
                    Arrays.fill(out, 0, 32, (byte) 0); // clear the old name; also null-terminates
                    System.arraycopy(newName, 0, out, 0, Math.min(newName.length, 31));
                }
                // Add more modifications
            }
        }
        // FileOutputStream truncates an existing file, unlike RandomAccessFile in "rw" mode
        try (FileOutputStream fos = new FileOutputStream(newFilepath)) {
            fos.write(out);
        }
    }

    public static void main(String[] args) throws IOException {
        MobiHandler handler = new MobiHandler("example.mobi");
        handler.printProperties();
        Map<String, String> mods = new HashMap<>();
        mods.put("Database Name", "Modified Title");
        handler.write("modified.mobi", mods);
    }
}

6. JavaScript Class for .MOBI Handling (Node.js)

This Node.js class opens a .MOBI file, decodes/parses properties, prints to console, and can write a modified file.

const fs = require('fs');

class MobiHandler {
    constructor(filepath) {
        this.filepath = filepath;
        this.data = fs.readFileSync(filepath);
        this.offset = 0;
        this.properties = {};
        this.parse();
    }

    readUint8() {
        return this.data.readUInt8(this.offset++);
    }

    readUint16() {
        const val = this.data.readUInt16BE(this.offset);
        this.offset += 2;
        return val;
    }

    readUint32() {
        const val = this.data.readUInt32BE(this.offset);
        this.offset += 4;
        return val;
    }

    readString(length) {
        const str = this.data.slice(this.offset, this.offset + length).toString('utf8').split('\0')[0];
        this.offset += length;
        return str;
    }

    readBytes(length) {
        const bytes = this.data.slice(this.offset, this.offset + length);
        this.offset += length;
        return bytes;
    }

    parsePdbHeader() {
        this.properties['Database Name'] = this.readString(32);
        this.properties['Attributes'] = this.readUint16();
        this.properties['Version'] = this.readUint16();
        this.properties['Creation Date'] = this.readUint32();
        this.properties['Modification Date'] = this.readUint32();
        this.properties['Last Backup Date'] = this.readUint32();
        this.properties['Modification Number'] = this.readUint32();
        this.properties['App Info ID'] = this.readUint32();
        this.properties['Sort Info ID'] = this.readUint32();
        this.properties['Type'] = this.readString(4);
        this.properties['Creator'] = this.readString(4);
        this.properties['Unique ID Seed'] = this.readUint32();
        this.properties['Next Record List ID'] = this.readUint32();
        this.properties['Number of Records'] = this.readUint16();
    }

    parseRecordInfoList() {
        const numRecords = this.properties['Number of Records'];
        this.properties['Record Infos'] = [];
        for (let i = 0; i < numRecords; i++) {
            const info = {
                'Record Data Offset': this.readUint32(),
                'Record Attributes': this.readUint8(),
                'Unique ID': (this.readUint8() << 16) | (this.readUint8() << 8) | this.readUint8()
            };
            this.properties['Record Infos'].push(info);
        }
    }

    parsePalmDocHeader() {
        this.offset = this.properties['Record Infos'][0]['Record Data Offset'];
        this.properties['Compression'] = this.readUint16();
        this.properties['Unused'] = this.readUint16();
        this.properties['Text Length'] = this.readUint32();
        this.properties['Record Count'] = this.readUint16();
        this.properties['Record Size'] = this.readUint16();
        this.properties['Encryption Type'] = this.readUint16();
        this.properties['Unknown (PalmDOC)'] = this.readUint16();
    }

    parseMobiHeader() {
        this.properties['Identifier (MOBI)'] = this.readString(4);
        const headerLength = this.readUint32();
        this.properties['Header Length (MOBI)'] = headerLength;
        const headerEnd = this.offset + headerLength - 8;
        this.properties['Mobi Type'] = this.readUint32();
        this.properties['Text Encoding'] = this.readUint32();
        this.properties['Unique ID'] = this.readUint32();
        this.properties['File Version'] = this.readUint32();
        this.properties['Orthographic Index'] = this.readUint32();
        this.properties['Inflection Index'] = this.readUint32();
        this.properties['Index Names'] = this.readUint32();
        this.properties['Index Keys'] = this.readUint32();
        this.properties['Extra Index 0'] = this.readUint32();
        this.properties['Extra Index 1'] = this.readUint32();
        this.properties['Extra Index 2'] = this.readUint32();
        this.properties['Extra Index 3'] = this.readUint32();
        this.properties['Extra Index 4'] = this.readUint32();
        this.properties['Extra Index 5'] = this.readUint32();
        this.properties['First Non-Book Index'] = this.readUint32();
        this.properties['Full Name Offset'] = this.readUint32();
        this.properties['Full Name Length'] = this.readUint32();
        this.properties['Locale'] = this.readUint32();
        this.properties['Input Language'] = this.readUint32();
        this.properties['Output Language'] = this.readUint32();
        this.properties['Min Version'] = this.readUint32();
        this.properties['First Image Index'] = this.readUint32();
        this.properties['Huffman Record Offset'] = this.readUint32();
        this.properties['Huffman Record Count'] = this.readUint32();
        this.properties['Huffman Table Offset'] = this.readUint32();
        this.properties['Huffman Table Length'] = this.readUint32();
        this.properties['EXTH Flags'] = this.readUint32();
        this.properties['Unknown (0x84)'] = this.readBytes(32);
        this.properties['Unknown (0xA4)'] = this.readUint32();
        this.properties['DRM Offset'] = this.readUint32();
        this.properties['DRM Count'] = this.readUint32();
        this.properties['DRM Size'] = this.readUint32();
        this.properties['DRM Flags'] = this.readUint32();
        this.properties['Unknown (0xB8)'] = this.readBytes(8);
        this.properties['First Content Record Number'] = this.readUint16();
        this.properties['Last Content Record Number'] = this.readUint16();
        this.properties['Unknown (0xC4)'] = this.readUint32();
        this.properties['FCIS Record Number'] = this.readUint32();
        this.properties['Unknown (0xCC)'] = this.readUint32();
        this.properties['FLIS Record Number'] = this.readUint32();
        this.properties['Unknown (0xD4)'] = this.readUint32();
        this.properties['Unknown (0xD8)'] = this.readBytes(8);
        this.properties['Unknown (0xE0)'] = this.readUint32();
        this.properties['First Compilation Data Section Count'] = this.readUint32();
        this.properties['Number of Compilation Data Sections'] = this.readUint32();
        this.properties['Unknown (0xEC)'] = this.readUint32();
        this.properties['Extra Record Data Flags'] = this.readUint32();
        this.properties['INDX Record Offset'] = this.readUint32();
        this.offset = headerEnd;
    }

    parseExthHeader() {
        if ((this.properties['EXTH Flags'] & 0x40) === 0) return;
        this.properties['Identifier (EXTH)'] = this.readString(4);
        this.properties['Header Length (EXTH)'] = this.readUint32();
        this.properties['Record Count (EXTH)'] = this.readUint32();
        this.properties['EXTH Records'] = [];
        for (let i = 0; i < this.properties['Record Count (EXTH)']; i++) {
            const type = this.readUint32();
            const len = this.readUint32();
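            const raw = this.readBytes(len - 8); // len includes the 8-byte type/length prefix
            // Offset-style records (116, 121, 201, 202) hold a big-endian uint32, not text
            const numeric = raw.length === 4 && [116, 121, 201, 202].includes(type);
            const data = numeric ? raw.readUInt32BE(0) : raw.toString('utf8');
            this.properties['EXTH Records'].push({ type, data });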
        }
    }

    parse() {
        this.parsePdbHeader();
        this.parseRecordInfoList();
        this.parsePalmDocHeader();
        this.parseMobiHeader();
        this.parseExthHeader();
    }

    printProperties() {
        console.log(JSON.stringify(this.properties, null, 2));
    }

    write(newFilepath, modifyProp) {
        const data = Buffer.from(this.data);
        if (modifyProp) {
            for (const [key, value] of Object.entries(modifyProp)) {
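                if (key === 'Database Name') {
                    // The name field is 32 bytes; keep at most 31 so it stays null-terminated
                    const newName = Buffer.from(value, 'utf8').subarray(0, 31);
                    data.fill(0, 0, 32); // Clear the old name so no stale bytes remain after the terminator
                    newName.copy(data, 0);
                }
                // Other properties are not handled in this sketch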
            }
        }
        fs.writeFileSync(newFilepath, data);
    }
}

// Usage
const handler = new MobiHandler('example.mobi');
handler.printProperties();
handler.write('modified.mobi', { 'Database Name': 'Modified Title' });

7. C++ Class for .MOBI Handling

This C++ class opens a .MOBI file, parses its header properties, prints them to the console, and can write a modified copy of the file.

#include <iostream>
#include <fstream>
#include <vector>
#include <map>
#include <string>
#include <iomanip>
#include <cstring>
#include <cstdint>
#include <stdexcept>

class MobiHandler {
private:
    std::string filepath;
    std::vector<char> data;
    size_t offset = 0;
    std::map<std::string, std::string> properties; // Simplified to string for print

public:
    MobiHandler(const std::string& fp) : filepath(fp) {
        std::ifstream file(fp, std::ios::binary | std::ios::ate);
        if (!file) throw std::runtime_error("Cannot open file: " + fp);
        auto size = file.tellg(); // Stream was opened at the end, so this is the file size
        data.resize(static_cast<size_t>(size));
        file.seekg(0);
        file.read(data.data(), size);
        parse();
    }

    uint8_t readUint8() {
        return static_cast<uint8_t>(data[offset++]);
    }

    uint16_t readUint16() {
        uint16_t val = (static_cast<uint8_t>(data[offset]) << 8) | static_cast<uint8_t>(data[offset + 1]);
        offset += 2;
        return val;
    }

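    uint32_t readUint32() {
        // Widen each byte to uint32_t before shifting; shifting a promoted int left by 24
        // can overflow a signed int when the top bit of the byte is set.
        uint32_t val = (static_cast<uint32_t>(static_cast<uint8_t>(data[offset])) << 24) |
                       (static_cast<uint32_t>(static_cast<uint8_t>(data[offset + 1])) << 16) |
                       (static_cast<uint32_t>(static_cast<uint8_t>(data[offset + 2])) << 8) |
                       static_cast<uint32_t>(static_cast<uint8_t>(data[offset + 3]));
        offset += 4;
        return val;
    }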

    std::string readString(size_t length) {
        std::string str(data.data() + offset, length);
        auto nullPos = str.find('\0');
        if (nullPos != std::string::npos) str.resize(nullPos);
        offset += length;
        return str;
    }

    std::vector<char> readBytes(size_t length) {
        std::vector<char> bytes(data.begin() + offset, data.begin() + offset + length);
        offset += length;
        return bytes;
    }

    void parsePdbHeader() {
        properties["Database Name"] = readString(32);
        properties["Attributes"] = std::to_string(readUint16());
        properties["Version"] = std::to_string(readUint16());
        properties["Creation Date"] = std::to_string(readUint32());
        properties["Modification Date"] = std::to_string(readUint32());
        properties["Last Backup Date"] = std::to_string(readUint32());
        properties["Modification Number"] = std::to_string(readUint32());
        properties["App Info ID"] = std::to_string(readUint32());
        properties["Sort Info ID"] = std::to_string(readUint32());
        properties["Type"] = readString(4);
        properties["Creator"] = readString(4);
        properties["Unique ID Seed"] = std::to_string(readUint32());
        properties["Next Record List ID"] = std::to_string(readUint32());
        properties["Number of Records"] = std::to_string(readUint16());
    }

    void parseRecordInfoList() {
        uint16_t numRecords = std::stoi(properties["Number of Records"]);
        std::string recordInfos = "[";
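        for (uint16_t i = 0; i < numRecords; ++i) {
            // Read each field into a local first: the evaluation order of operands within a
            // single expression is unspecified in C++, so chained read calls may run out of order.
            uint32_t recOffset = readUint32();
            uint8_t attr = readUint8();
            uint32_t id = readUint8(); // 3-byte unique ID, big-endian
            id = (id << 8) | readUint8();
            id = (id << 8) | readUint8();
            recordInfos += "{Offset: " + std::to_string(recOffset) + ", Attr: " + std::to_string(attr) +
                           ", ID: " + std::to_string(id) + "},";
        }
        if (recordInfos.size() > 1) recordInfos.pop_back(); // Drop the trailing comma, if any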
        recordInfos += "]";
        properties["Record Infos"] = recordInfos;
    }

    void parsePalmDocHeader() {
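        // The record info list starts right after the 78-byte PDB header, and each entry
        // begins with a uint32 file offset, so record 0's offset is stored at byte 78.
        // This assumes the file has at least one record.
        offset = 78;
        uint32_t record0Offset = readUint32();
        offset = record0Offset;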
        properties["Compression"] = std::to_string(readUint16());
        properties["Unused"] = std::to_string(readUint16());
        properties["Text Length"] = std::to_string(readUint32());
        properties["Record Count"] = std::to_string(readUint16());
        properties["Record Size"] = std::to_string(readUint16());
        properties["Encryption Type"] = std::to_string(readUint16());
        properties["Unknown (PalmDOC)"] = std::to_string(readUint16());
    }

    void parseMobiHeader() {
        properties["Identifier (MOBI)"] = readString(4);
        uint32_t headerLength = readUint32();
        properties["Header Length (MOBI)"] = std::to_string(headerLength);
        size_t headerEnd = offset + headerLength - 8;
        properties["Mobi Type"] = std::to_string(readUint32());
        properties["Text Encoding"] = std::to_string(readUint32());
        properties["Unique ID"] = std::to_string(readUint32());
        properties["File Version"] = std::to_string(readUint32());
        properties["Orthographic Index"] = std::to_string(readUint32());
        properties["Inflection Index"] = std::to_string(readUint32());
        properties["Index Names"] = std::to_string(readUint32());
        properties["Index Keys"] = std::to_string(readUint32());
        properties["Extra Index 0"] = std::to_string(readUint32());
        properties["Extra Index 1"] = std::to_string(readUint32());
        properties["Extra Index 2"] = std::to_string(readUint32());
        properties["Extra Index 3"] = std::to_string(readUint32());
        properties["Extra Index 4"] = std::to_string(readUint32());
        properties["Extra Index 5"] = std::to_string(readUint32());
        properties["First Non-Book Index"] = std::to_string(readUint32());
        properties["Full Name Offset"] = std::to_string(readUint32());
        properties["Full Name Length"] = std::to_string(readUint32());
        properties["Locale"] = std::to_string(readUint32());
        properties["Input Language"] = std::to_string(readUint32());
        properties["Output Language"] = std::to_string(readUint32());
        properties["Min Version"] = std::to_string(readUint32());
        properties["First Image Index"] = std::to_string(readUint32());
        properties["Huffman Record Offset"] = std::to_string(readUint32());
        properties["Huffman Record Count"] = std::to_string(readUint32());
        properties["Huffman Table Offset"] = std::to_string(readUint32());
        properties["Huffman Table Length"] = std::to_string(readUint32());
        properties["EXTH Flags"] = std::to_string(readUint32());
        auto unknown84 = readBytes(32);
        properties["Unknown (0x84)"] = std::string(unknown84.begin(), unknown84.end());
        properties["Unknown (0xA4)"] = std::to_string(readUint32());
        properties["DRM Offset"] = std::to_string(readUint32());
        properties["DRM Count"] = std::to_string(readUint32());
        properties["DRM Size"] = std::to_string(readUint32());
        properties["DRM Flags"] = std::to_string(readUint32());
        auto unknownB8 = readBytes(8);
        properties["Unknown (0xB8)"] = std::string(unknownB8.begin(), unknownB8.end());
        properties["First Content Record Number"] = std::to_string(readUint16());
        properties["Last Content Record Number"] = std::to_string(readUint16());
        properties["Unknown (0xC4)"] = std::to_string(readUint32());
        properties["FCIS Record Number"] = std::to_string(readUint32());
        properties["Unknown (0xCC)"] = std::to_string(readUint32());
        properties["FLIS Record Number"] = std::to_string(readUint32());
        properties["Unknown (0xD4)"] = std::to_string(readUint32());
        auto unknownD8 = readBytes(8);
        properties["Unknown (0xD8)"] = std::string(unknownD8.begin(), unknownD8.end());
        properties["Unknown (0xE0)"] = std::to_string(readUint32());
        properties["First Compilation Data Section Count"] = std::to_string(readUint32());
        properties["Number of Compilation Data Sections"] = std::to_string(readUint32());
        properties["Unknown (0xEC)"] = std::to_string(readUint32());
        properties["Extra Record Data Flags"] = std::to_string(readUint32());
        properties["INDX Record Offset"] = std::to_string(readUint32());
        offset = headerEnd;
    }

    void parseExthHeader() {
        uint32_t exthFlags = std::stoul(properties["EXTH Flags"]);
        if ((exthFlags & 0x40) == 0) return;
        properties["Identifier (EXTH)"] = readString(4);
        properties["Header Length (EXTH)"] = std::to_string(readUint32());
        uint32_t recordCount = readUint32();
        properties["Record Count (EXTH)"] = std::to_string(recordCount);
        std::string exthRecords = "[";
        for (uint32_t i = 0; i < recordCount; ++i) {
            uint32_t type = readUint32();
            uint32_t len = readUint32(); // Record length includes the 8-byte type/length prefix
            if (len < 8) break; // Malformed record; len - 8 would wrap around
            auto dataBytes = readBytes(len - 8);
            std::string data(dataBytes.begin(), dataBytes.end());
            exthRecords += "{type: " + std::to_string(type) + ", data: " + data + "},";
        }
        if (exthRecords.size() > 1) exthRecords.pop_back(); // Drop the trailing comma, if any
        exthRecords += "]";
        properties["EXTH Records"] = exthRecords;
    }

    void parse() {
        parsePdbHeader();
        parseRecordInfoList();
        parsePalmDocHeader();
        parseMobiHeader();
        parseExthHeader();
    }

    void printProperties() {
        for (const auto& prop : properties) {
            std::cout << prop.first << ": " << prop.second << std::endl;
        }
    }

    void write(const std::string& newFilepath, const std::map<std::string, std::string>& modifyProp) {
        std::vector<char> writeData = data;
        for (const auto& mod : modifyProp) {
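            if (mod.first == "Database Name" && writeData.size() >= 32) {
                // The name field is 32 bytes: up to 31 characters plus a null terminator
                std::string newName = mod.second;
                newName.resize(31, '\0');
                newName += '\0';
                std::memcpy(writeData.data(), newName.data(), 32);
            }
            // Other properties are not handled in this sketch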
        }
        std::ofstream out(newFilepath, std::ios::binary);
        out.write(writeData.data(), writeData.size());
    }
};

int main() {
    MobiHandler handler("example.mobi");
    handler.printProperties();
    std::map<std::string, std::string> mods = {{"Database Name", "Modified Title"}};
    handler.write("modified.mobi", mods);
    return 0;
}