Task 439: .MYI File Format

File Format Specifications for .MYI

The .MYI file is the index file for MySQL's MyISAM storage engine. It stores a MyISAM table's indexes as B-tree or R-tree structures, preceded by metadata. The file is divided into sections: a fixed header, the state section (MI_STATE_INFO), the base section (MI_BASE_INFO), keydef sections (MI_KEYDEF, one per key), recinfo sections (MI_COLUMNDEF, one per field in keys), and then the key blocks (pages) containing the index data. The file starts with a magic number in the file_version field, typically \xFE\xFE\x07\x01 for MyISAM indexes. The format is binary, and multi-byte values are stored high byte first (big-endian), regardless of the host's byte order. The structures are defined in the MySQL source code (e.g., myisamdef.h). The header gives the lengths and positions of the sections that follow, so their sizes vary with the table's keys and fields. Key blocks are 1024 bytes by default, and keys within a block may be prefix-compressed.
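As a quick illustration of the magic number mentioned above, the following minimal sketch checks whether a buffer starts with the MyISAM index signature. The names `MYISAM_MAGIC`, `looks_like_myi`, and `file_looks_like_myi` are hypothetical helpers introduced here, not part of MySQL.

```python
# Magic bytes of the file_version field, as given in the text above.
MYISAM_MAGIC = bytes([0xFE, 0xFE, 0x07, 0x01])


def looks_like_myi(first_bytes: bytes) -> bool:
    """Return True if the buffer starts with the MyISAM index magic number."""
    return first_bytes[:4] == MYISAM_MAGIC


def file_looks_like_myi(path: str) -> bool:
    """Read only the first four bytes of a file and test them."""
    with open(path, "rb") as f:
        return looks_like_myi(f.read(4))
```

A check like this is cheap enough to run before any further parsing, since a mismatch means the remaining offsets cannot be trusted.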

  1. List of all the properties of this file format intrinsic to its file system:

The properties are the fields in the binary structure, grouped by section. These are intrinsic to the .MYI format and define its layout, metadata, and index integrity.

Header Section (fixed 24 bytes, nested in MI_STATE_INFO):

  • file_version: 4 bytes (uchar[4]) - Magic number or version identifier (typically 0xFE 0xFE 0x07 0x01).
  • options: 2 bytes (uchar[2]) - Bit flags for table options (e.g., compression, packing).
  • header_length: 2 bytes (uchar[2]) - Length of the entire header.
  • state_info_length: 2 bytes (uchar[2]) - Length of the state section.
  • base_info_length: 2 bytes (uchar[2]) - Length of the base section.
  • base_pos: 2 bytes (uchar[2]) - Position (offset) of the base section in the file.
  • key_parts: 2 bytes (uchar[2]) - Total number of key parts across all keys.
  • unique_key_parts: 2 bytes (uchar[2]) - Number of unique key parts.
  • keys: 1 byte (uchar) - Number of keys (indexes) in the table.
  • uniques: 1 byte (uchar) - Number of unique indexes.
  • language: 1 byte (uchar) - Language/charset code for sorting.
  • max_block_size_index: 1 byte (uchar) - Maximum block size index.
  • fulltext_keys: 1 byte (uchar) - Number of full-text keys.
  • not_used: 1 byte (uchar) - Reserved/unused byte.
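The field sizes listed above can be checked mechanically. This sketch derives each header field's byte offset from the listed sizes and confirms that they add up to 24 bytes. `HEADER_LAYOUT` and `header_offsets` are illustrative names, and the `>` prefix is used only to disable native alignment padding in `struct`.

```python
import struct

# (name, struct code) pairs in the order listed above.
HEADER_LAYOUT = [
    ("file_version", "4s"), ("options", "H"), ("header_length", "H"),
    ("state_info_length", "H"), ("base_info_length", "H"), ("base_pos", "H"),
    ("key_parts", "H"), ("unique_key_parts", "H"), ("keys", "B"),
    ("uniques", "B"), ("language", "B"), ("max_block_size_index", "B"),
    ("fulltext_keys", "B"), ("not_used", "B"),
]


def header_offsets():
    """Map each header field to its (offset, size) and return the total size."""
    offsets, pos = {}, 0
    for name, code in HEADER_LAYOUT:
        size = struct.calcsize(">" + code)
        offsets[name] = (pos, size)
        pos += size
    return offsets, pos


offsets, total = header_offsets()
print(total)                # 24
print(offsets["base_pos"])  # (12, 2)
print(offsets["keys"])      # (18, 1)
```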

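State Section (variable length, follows header; MI_STATE_INFO continued). Fields are listed in struct order; the order in which they are written to disk differs, and some members (pointers, state_length, state_diff_length) exist only in memory:

  • state: MI_STATUS_INFO struct - Table status sub-fields: records (ha_rows), del (ha_rows), empty (my_off_t), key_empty (my_off_t), key_file_length (my_off_t), data_file_length (my_off_t), and checksum (ha_checksum).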
  • split: ha_rows - Number of split records.
  • dellink: my_off_t - Pointer to deleted record chain.
  • auto_increment: ulonglong - Current auto-increment value.
  • process: ulong - Process ID or counter.
  • unique: ulong - Unique row count.
  • update_count: ulong - Update counter.
  • status: ulong - Table status flags.
  • rec_per_key_part: ulong* - Array of records per key part.
  • key_root: my_off_t* - Array of root block offsets for each key.
  • key_del: my_off_t* - Array of deleted key block offsets.
  • rec_per_key_rows: my_off_t - Rows for rec_per_key.
  • sec_index_changed: ulong - Secondary index change counter.
  • sec_index_used: ulong - Secondary index usage flags.
  • key_map: ulonglong - Bitmap of used keys.
  • checksum: ha_checksum - Table checksum.
  • version: ulong - File version timestamp.
  • create_time: time_t - Table creation time.
  • recover_time: time_t - Last recovery time.
  • check_time: time_t - Last check time.
  • sortkey: uint - Sort key index.
  • open_count: uint - Open count counter.
  • changed: uint8 - Changed flag.
  • state_diff_length: uint - State difference length.
  • state_length: uint - Full state length.
  • key_info: ulong* - Pointers to key info.
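The list above follows the in-memory struct, so it does not give byte offsets directly. The sketch below computes offsets for the fixed-size part of the on-disk state, assuming the write order used by mi_state_info_write() in mi_open.c (open_count first, then changed, sortkey, and the counters). That order and the sizes are assumptions to verify against your MySQL source tree; `STATE_FIXED` and `state_fixed_offsets` are hypothetical names.

```python
HEADER_SIZE = 24  # the fixed header precedes the state fields

# (name, size in bytes) in the assumed on-disk order.
STATE_FIXED = [
    ("open_count", 2), ("changed", 1), ("sortkey", 1),
    ("records", 8), ("del", 8), ("split", 8), ("dellink", 8),
    ("key_file_length", 8), ("data_file_length", 8),
    ("empty", 8), ("key_empty", 8), ("auto_increment", 8), ("checksum", 8),
    ("process", 4), ("unique", 4), ("status", 4), ("update_count", 4),
]


def state_fixed_offsets():
    """Return ({field: absolute file offset}, offset just past the fixed part)."""
    offsets, pos = {}, HEADER_SIZE
    for name, size in STATE_FIXED:
        offsets[name] = pos
        pos += size
    return offsets, pos


offsets, end = state_fixed_offsets()
print(offsets["split"])           # 44
print(offsets["auto_increment"])  # 92
print(end)                        # 124
```

Under the same assumption, the per-key arrays (key_root, key_del) and the remaining counters follow this fixed part; they are left out here because their position depends on the key count and block-size index in the header.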

Base Section (variable length, at base_pos; MI_BASE_INFO struct):

  • rec_length: ulong - Record length.
  • fields: uint - Number of fields.
  • max_data_file_length: ulonglong - Max data file length.
  • max_key_file_length: ulonglong - Max key file length.
  • keys: uint - Number of keys (redundant with header).
  • key_parts: uint - Total key parts (redundant with header).
  • pack_bits: uint - Pack bits.
  • min_pack_length: uint - Minimum pack length.
  • max_pack_length: uint - Maximum pack length.
  • extra_rec_buff_length: uint - Extra record buffer length.
  • null_fields: uint - Number of NULL fields.
  • rec_reflength: uint - Record reference length.
  • reclength: uint - Record length (redundant).
  • table_options: uint - Table options flags.
  • raid_type: uint - RAID type (if used).
  • raid_chunks: uint - RAID chunks.
  • raid_chunk_size: ulong - RAID chunk size.
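Because the header records where the base section starts (base_pos) and how long it is (base_info_length), the section can be sliced out without decoding anything else. This is a minimal sketch under the assumption that both values are 2-byte big-endian integers at header offsets 12 and 10; `base_section` is a hypothetical helper.

```python
import struct


def base_section(data: bytes) -> bytes:
    """Slice the raw base section out of a .MYI buffer using the header fields."""
    # base_info_length is at offset 10, base_pos at offset 12 (2 bytes each).
    base_info_length, base_pos = struct.unpack_from(">HH", data, 10)
    end = base_pos + base_info_length
    if end > len(data):
        raise ValueError("base section extends past the end of the buffer")
    return data[base_pos:end]


# Synthetic buffer: a 24-byte header pointing at 4 bytes placed right after it.
fake = bytes(10) + struct.pack(">HH", 4, 24) + bytes(10) + b"BASE"
print(base_section(fake))  # b'BASE'
```

The returned bytes still have to be decoded field by field; the sketch only shows how the header's position and length fields locate the section.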

Keydef Sections (variable, one per key; MI_KEYDEF struct; the pointer and function-pointer members exist only in memory and are not stored in the file):

  • share: MYISAM_SHARE* - Pointer to shared structure (in memory).
  • keysegs: uint16 - Number of key segments.
  • flag: uint16 - Key flags (e.g., unique, primary).
  • key_alg: uint8 - Key algorithm (e.g., B-tree, R-tree).
  • block_length: uint16 - Block length.
  • underflow_block_length: uint16 - Underflow block length.
  • keylength: uint16 - Key length.
  • minlength: uint16 - Minimum key length.
  • maxlength: uint16 - Maximum key length.
  • block_size_index: uint16 - Block size index.
  • version: uint32 - Version.
  • ftkey_nr: uint32 - Full-text key number.
  • seg: HA_KEYSEG* - Pointer to key segments.
  • end: HA_KEYSEG* - End of key segments.
  • parser: st_mysql_ftparser* - Full-text parser.
  • bin_search: function pointer - Binary search function.
  • get_key: function pointer - Get key function.
  • pack_key: function pointer - Pack key function.
  • store_key: function pointer - Store key function.
  • ck_insert: function pointer - Check insert function.
  • ck_delete: function pointer - Check delete function.

Recinfo Sections (variable, one per field in keys; MI_COLUMNDEF struct):

  • type: uint16 - Field type.
  • length: uint16 - Field length.
  • null_bit: uint8 - NULL bit.
  • null_pos: uint16 - NULL position.

Key Blocks (variable, following headers; B-tree or R-tree pages):

  • Page header: 2 bytes - High bit for leaf/nonleaf, remaining for used size.
  • Key values: variable - Compressed keys with pointers to data or children.
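
The 2-byte page header described above can be decoded with two bit masks. This sketch assumes the value is stored high byte first and that a set high bit marks a non-leaf page (one that carries child pointers); `parse_page_header` is a hypothetical helper.

```python
def parse_page_header(page: bytes):
    """Split the 2-byte page header into (is_nonleaf, used_bytes)."""
    word = int.from_bytes(page[:2], "big")
    is_nonleaf = bool(word & 0x8000)  # high bit: page has child pointers (assumed)
    used_bytes = word & 0x7FFF        # remaining 15 bits: bytes used in the page
    return is_nonleaf, used_bytes


print(parse_page_header(bytes([0x80, 0x10])))  # (True, 16)
print(parse_page_header(bytes([0x03, 0xFF])))  # (False, 1023)
```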

  2. Two direct download links for files of format .MYI:

I was unable to find direct download links for .MYI files through web searches, as they are typically internal MySQL binary files not publicly hosted for download. They are generated by MySQL and often found in database backups or repositories, but no direct, public links to individual .MYI files were located. You can generate one by creating a MyISAM table in MySQL and locating the file in the data directory.
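
Once a MyISAM table exists, its index file can be located by scanning the server's data directory. The sketch below walks a directory tree and collects files with the .MYI extension. The data directory path shown in the comment is only an example, since the real location depends on the server configuration, and `find_myi_files` is a hypothetical helper.

```python
from pathlib import Path


def find_myi_files(datadir):
    """Return all .MYI files below datadir, sorted by path."""
    root = Path(datadir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.upper() == ".MYI"
    )


# Example call (the path is an assumption; check your server's datadir setting):
# for path in find_myi_files("/var/lib/mysql"):
#     print(path)
```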

  3. Ghost blog embedded HTML JavaScript for drag and drop .MYI file to dump properties:
[Embedded widget: ".MYI File Dumper", with a drop zone labeled "Drag and drop .MYI file here"]

(Note: This is a basic dumper focusing on the header and some state fields. Extend the script to parse variable sections using header lengths for full properties.)

  4. Python class for .MYI file:
import struct

class MYIParser:
    # 24-byte header: 4 magic bytes, 7 uint16 fields, 6 uint8 fields.
    # MyISAM stores multi-byte values high byte first, hence '>' (big-endian).
    HEADER_FORMAT = '>4B7H6B'
    HEADER_FIELDS = ('options', 'header_length', 'state_info_length', 'base_info_length',
                     'base_pos', 'key_parts', 'unique_key_parts', 'keys', 'uniques',
                     'language', 'max_block_size_index', 'fulltext_keys', 'not_used')

    def __init__(self, filename):
        self.filename = filename
        self.properties = {}
        self._read_file()

    def _read_file(self):
        with open(self.filename, 'rb') as f:
            data = f.read()
        # Unpack header (24 bytes)
        header = struct.unpack_from(self.HEADER_FORMAT, data, 0)
        self.properties['file_version'] = list(header[0:4])
        self.properties.update(zip(self.HEADER_FIELDS, header[4:]))

        # Parse state section (example for some fields; extend for full).
        # On disk the state starts with open_count (2 bytes), changed (1), sortkey (1),
        # records (8) and del (8), so split sits at offset 44. This follows
        # mi_state_info_write() in mi_open.c; verify against your MySQL version.
        self.properties['split'], self.properties['dellink'] = struct.unpack_from('>QQ', data, 44)
        # key_file_length, data_file_length, empty and key_empty (8 bytes each) come next
        self.properties['auto_increment'] = struct.unpack_from('>Q', data, 92)[0]
        # ... Add unpacking for other state, base, keydef, recinfo based on lengths

    def print_properties(self):
        for key, value in self.properties.items():
            print(f"{key}: {value}")

    def write_file(self, new_filename):
        # Example write: pack the header back (basic; extend for full R/W).
        # Only the 24 header bytes are written, so the output is not a usable index file.
        header = struct.pack(self.HEADER_FORMAT, *self.properties['file_version'],
                             *(self.properties[name] for name in self.HEADER_FIELDS))
        # Add packing for other sections...
        with open(new_filename, 'wb') as f:
            f.write(header)

# Usage example
# parser = MYIParser('example.MYI')
# parser.print_properties()
# parser.write_file('modified.MYI')
  5. Java class for .MYI file:
import java.io.*;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.util.HashMap;
import java.util.Map;

public class MYIParser {
    private String filename;
    private Map<String, Object> properties = new HashMap<>();

    public MYIParser(String filename) {
        this.filename = filename;
        readFile();
    }

    private void readFile() {
        try (RandomAccessFile raf = new RandomAccessFile(filename, "r")) {
            FileChannel channel = raf.getChannel();
            ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
            // MyISAM stores multi-byte values high byte first (big-endian)
            buffer.order(ByteOrder.BIG_ENDIAN);
            // A single read() may return early, so loop until the buffer is full
            while (buffer.hasRemaining() && channel.read(buffer) > 0) { }
            buffer.flip();

            // Read header (24 bytes)
            byte[] fileVersion = new byte[4];
            buffer.get(fileVersion);
            properties.put("file_version", fileVersion);
            properties.put("options", buffer.getShort());
            properties.put("header_length", buffer.getShort());
            properties.put("state_info_length", buffer.getShort());
            properties.put("base_info_length", buffer.getShort());
            properties.put("base_pos", buffer.getShort());
            properties.put("key_parts", buffer.getShort());
            properties.put("unique_key_parts", buffer.getShort());
            properties.put("keys", buffer.get());
            properties.put("uniques", buffer.get());
            properties.put("language", buffer.get());
            properties.put("max_block_size_index", buffer.get());
            properties.put("fulltext_keys", buffer.get());
            properties.put("not_used", buffer.get());

            // Parse state (example). On disk the state starts with open_count (2 bytes),
            // changed (1), sortkey (1), records (8) and del (8), so split sits at offset 44.
            buffer.position(44);
            properties.put("split", buffer.getLong());
            properties.put("dellink", buffer.getLong());
            // Skip key_file_length, data_file_length, empty and key_empty (8 bytes each)
            buffer.position(92);
            properties.put("auto_increment", buffer.getLong());
            // ... Extend for more

        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void printProperties() {
        // byte[] values need Arrays.toString to print their contents
        properties.forEach((key, value) -> System.out.println(key + ": "
                + (value instanceof byte[] ? java.util.Arrays.toString((byte[]) value) : value)));
    }

    public void writeFile(String newFilename) {
        try (FileOutputStream fos = new FileOutputStream(newFilename);
             FileChannel channel = fos.getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate(1024); // Adjust size
            buffer.order(ByteOrder.BIG_ENDIAN); // Same byte order as the reader
            byte[] fv = (byte[]) properties.get("file_version");
            for (byte b : fv) buffer.put(b);
            buffer.putShort((short) properties.get("options"));
            buffer.putShort((short) properties.get("header_length"));
            buffer.putShort((short) properties.get("state_info_length"));
            buffer.putShort((short) properties.get("base_info_length"));
            buffer.putShort((short) properties.get("base_pos"));
            buffer.putShort((short) properties.get("key_parts"));
            buffer.putShort((short) properties.get("unique_key_parts"));
            buffer.put((byte) properties.get("keys"));
            buffer.put((byte) properties.get("uniques"));
            buffer.put((byte) properties.get("language"));
            buffer.put((byte) properties.get("max_block_size_index"));
            buffer.put((byte) properties.get("fulltext_keys"));
            buffer.put((byte) properties.get("not_used"));
            // Add more...
            buffer.flip();
            channel.write(buffer);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Usage
    // public static void main(String[] args) {
    //     MYIParser parser = new MYIParser("example.MYI");
    //     parser.printProperties();
    //     parser.writeFile("modified.MYI");
    // }
}
  6. JavaScript class for .MYI file:
class MYIParser {
  constructor(file) {
    this.file = file;
    this.properties = {};
    this.readFile();
  }

  readFile() {
    const reader = new FileReader();
    reader.onload = (e) => {
      const dv = new DataView(e.target.result);
      // Header: MyISAM stores multi-byte values high byte first, so littleEndian is false
      this.properties.file_version = [];
      for (let i = 0; i < 4; i++) this.properties.file_version.push(dv.getUint8(i));
      this.properties.options = dv.getUint16(4, false);
      this.properties.header_length = dv.getUint16(6, false);
      this.properties.state_info_length = dv.getUint16(8, false);
      this.properties.base_info_length = dv.getUint16(10, false);
      this.properties.base_pos = dv.getUint16(12, false);
      this.properties.key_parts = dv.getUint16(14, false);
      this.properties.unique_key_parts = dv.getUint16(16, false);
      this.properties.keys = dv.getUint8(18);
      this.properties.uniques = dv.getUint8(19);
      this.properties.language = dv.getUint8(20);
      this.properties.max_block_size_index = dv.getUint8(21);
      this.properties.fulltext_keys = dv.getUint8(22);
      this.properties.not_used = dv.getUint8(23);

      // State example: on disk the state starts with open_count (2 bytes), changed (1),
      // sortkey (1), records (8) and del (8), so split sits at offset 44.
      // Values stay BigInt because 64-bit fields can exceed Number.MAX_SAFE_INTEGER.
      this.properties.split = dv.getBigUint64(44, false);
      this.properties.dellink = dv.getBigUint64(52, false);
      // key_file_length, data_file_length, empty and key_empty (8 bytes each) come next
      this.properties.auto_increment = dv.getBigUint64(92, false);
      // ... Extend

      this.printProperties();
    };
    reader.readAsArrayBuffer(this.file);
  }

  printProperties() {
    for (const [key, value] of Object.entries(this.properties)) {
      console.log(`${key}: ${value}`);
    }
  }

  writeFile(newFilename) {
    const buffer = new ArrayBuffer(1024); // Adjust
    const dv = new DataView(buffer);
    let offset = 0;
    this.properties.file_version.forEach(b => dv.setUint8(offset++, b));
    // Write multi-byte fields high byte first, matching the reader
    dv.setUint16(offset, this.properties.options, false); offset += 2;
    dv.setUint16(offset, this.properties.header_length, false); offset += 2;
    // ... Add more
    const blob = new Blob([buffer]);
    const url = URL.createObjectURL(blob);
    const a = document.createElement('a');
    a.href = url;
    a.download = newFilename;
    a.click();
  }
}

// Usage
// const input = document.getElementById('fileInput');
// input.addEventListener('change', (e) => {
//   const parser = new MYIParser(e.target.files[0]);
//   // Reading is asynchronous: call parser.writeFile('modified.MYI') only after the file has loaded
// });
  7. C "class" (struct with functions) for .MYI file:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    char *filename;
    // Properties (use a struct or map; for simplicity, individual vars)
    uint8_t file_version[4];
    uint16_t options;
    uint16_t header_length;
    uint16_t state_info_length;
    uint16_t base_info_length;
    uint16_t base_pos;
    uint16_t key_parts;
    uint16_t unique_key_parts;
    uint8_t keys;
    uint8_t uniques;
    uint8_t language;
    uint8_t max_block_size_index;
    uint8_t fulltext_keys;
    uint8_t not_used;
    uint64_t split;
    uint64_t dellink;
    uint64_t auto_increment;
    // ... Add more
} MYIParser;

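/* MyISAM stores multi-byte values high byte first (big-endian). */
static uint16_t be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint64_t be64(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v = (v << 8) | p[i];
    return v;
}

MYIParser* myi_parser_new(const char *filename) {
    /* 24-byte header plus the state fields up to auto_increment (offsets 24..99) */
    uint8_t buf[100];
    FILE *f = fopen(filename, "rb");
    if (!f) return NULL;
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    if (n != sizeof buf) return NULL;

    MYIParser *parser = malloc(sizeof(MYIParser));
    if (!parser) return NULL;
    parser->filename = strdup(filename);
    if (!parser->filename) {
        free(parser);
        return NULL;
    }
    // Read header
    memcpy(parser->file_version, buf, 4);
    parser->options = be16(buf + 4);
    parser->header_length = be16(buf + 6);
    parser->state_info_length = be16(buf + 8);
    parser->base_info_length = be16(buf + 10);
    parser->base_pos = be16(buf + 12);
    parser->key_parts = be16(buf + 14);
    parser->unique_key_parts = be16(buf + 16);
    parser->keys = buf[18];
    parser->uniques = buf[19];
    parser->language = buf[20];
    parser->max_block_size_index = buf[21];
    parser->fulltext_keys = buf[22];
    parser->not_used = buf[23];

    // State example: on disk the state starts with open_count (2 bytes), changed (1),
    // sortkey (1), records (8) and del (8), so split sits at offset 44
    parser->split = be64(buf + 44);
    parser->dellink = be64(buf + 52);
    // key_file_length, data_file_length, empty and key_empty (8 bytes each) come next
    parser->auto_increment = be64(buf + 92);
    // ... Extend

    return parser;
}

void myi_parser_print(MYIParser *parser) {
    printf("file_version: [%u, %u, %u, %u]\n", parser->file_version[0], parser->file_version[1], parser->file_version[2], parser->file_version[3]);
    printf("options: %u\n", parser->options);
    // ... Print all
}

void myi_parser_write(MYIParser *parser, const char *new_filename) {
    FILE *f = fopen(new_filename, "wb");
    if (!f) return;
    // Multi-byte fields are written high byte first, matching the reader
    uint8_t options_be[2] = { (uint8_t)(parser->options >> 8), (uint8_t)(parser->options & 0xFF) };
    fwrite(parser->file_version, 1, 4, f);
    fwrite(options_be, 1, 2, f);
    // ... Write all
    fclose(f);
}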

void myi_parser_free(MYIParser *parser) {
    free(parser->filename);
    free(parser);
}

// Usage
// int main() {
//     MYIParser *parser = myi_parser_new("example.MYI");
//     if (parser) {
//         myi_parser_print(parser);
//         myi_parser_write(parser, "modified.MYI");
//         myi_parser_free(parser);
//     }
//     return 0;
// }