Task 376: .MAT File Format
File Format Specifications for .MAT
The .MAT file format is the binary file format used by MATLAB to store workspace variables, such as matrices, arrays, structures, and other data types. It has evolved over versions:
- Level 4 (v4): Simple binary format for basic matrices (pre-MATLAB 5).
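- Level 5 (v5): Introduced in MATLAB 5; supports multidimensional arrays, sparse matrices, cell arrays, structures, and objects. This is the focus here, as it is the classic, documented binary format.
- v6: The Level 5 layout as written by MATLAB 5 and 6 (the -v6 save option), without compression or Unicode character encoding.
- v7: The Level 5 layout extended in MATLAB 7 with zlib compression of data elements (miCOMPRESSED, applied by default) and Unicode character encoding.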
- v7.3: Introduced in MATLAB R2006b, based on HDF5 format for large datasets (>2GB), using a different structure (HDF5 groups, datasets, attributes).
Specifications are detailed in the official MathWorks documentation, particularly the "MAT-File Format" PDF for Level 5. For v7.3, it's HDF5-based and can be parsed using HDF5 tools/libraries, but the task appears to reference the binary Level 5 format based on common usage and available specs. Intrinsic properties refer to structural metadata extractable from the file without interpreting the variable data itself.
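Because the versions above share the .mat extension, a reader usually has to sniff the leading bytes before choosing a parser. The following is a minimal sketch of that check, not an official detection routine. The function name sniff_mat_level is hypothetical, and the sketch assumes that v7.3 files keep their MATLAB text header in a 512-byte block ahead of the standard HDF5 signature and that Level 5 files carry version 0x0100.

```python
import struct

# Standard 8-byte HDF5 signature; assumed to sit at offset 512 in v7.3 files.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def sniff_mat_level(blob: bytes) -> str:
    """Guess the MAT-file level from the leading bytes (hypothetical helper)."""
    if len(blob) >= 520 and blob[512:520] == HDF5_SIGNATURE:
        return "v7.3 (HDF5)"
    if len(blob) >= 128 and blob[126:128] in (b"IM", b"MI"):
        # The indicator is 0x4D49 written in the file's byte order, so the
        # on-disk bytes "IM" mean little-endian and "MI" mean big-endian.
        order = "<" if blob[126:128] == b"IM" else ">"
        (version,) = struct.unpack(order + "H", blob[124:126])
        if version == 0x0100:
            return "Level 5"
    return "unknown (possibly Level 4)"
```

Passing the first 520 bytes of a file is enough for this check. Anything reported as unknown would need a separate Level 4 parser.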
List of all properties of this file format intrinsic to its file system:
- Descriptive Text: 116-byte ASCII string in the header, describing the file (e.g., "MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Mon Sep 29 00:00:00 2025").
- Subsystem Data Offset: 8-byte (int64) offset to subsystem-specific data (all zeros or all spaces if none).
- Version: 2-byte (uint16) value, typically 0x0100 (256 decimal) for Level 5.
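- Endian Indicator: 2-byte field holding the characters M and I as the 16-bit value 0x4D49, written in the file's byte order. On disk it reads 'IM' in a little-endian file and 'MI' in a big-endian file, which tells a reader whether byte-swapping is needed.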
- Subsystem Data: Variable-length data at the offset (if non-zero), typically platform-specific or empty.
- Data Element Count: Implicit count of top-level data elements (e.g., matrices/variables) in the file.
- Data Element Type (per element): 4-byte (uint32) tag indicating type (e.g., 14 for miMATRIX, 15 for miCOMPRESSED, 1 for miINT8, etc.).
- Data Element Byte Count (per element): 4-byte (uint32) number of bytes in the data (excluding tag; special small format for <=4 bytes).
- Compression Flag (per element): If type is miCOMPRESSED (15), the data is zlib-compressed; otherwise uncompressed.
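- Padding Bytes (per element): 0-7 bytes so that the next data element starts on a 64-bit boundary; miCOMPRESSED elements are written without padding.
- Matrix Class (for miMATRIX elements): uint8 from Array Flags (e.g., 6 for double, 5 for sparse, 2 for struct).
- Matrix Flags (for miMATRIX): uint8 bit flags stored next to the class in the first Array Flags word (0x08 complex, 0x04 global, 0x02 logical).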
- NZMax (for miMATRIX, if sparse): uint32 maximum number of nonzero elements.
- Dimensions (for miMATRIX): Array of int32 values (one per dimension).
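- Array Name (for miMATRIX): Variable-length ASCII string (miINT8), padded; names of 1-4 characters are often stored in the small element format.
- Field Name Length (for structs/objects): int32 (miINT32) maximum length of field names, including the terminating NUL (typically 32).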
- Field Names (for structs/objects): Packed miINT8 strings for each field.
- Class Name (for objects): miINT8 string.
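- Overall File Magic/Identifier: No dedicated magic number. A Level 5 file is recognized by header text whose first four bytes are all non-zero (a zero byte there indicates Level 4), together with a valid version and endian indicator.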
These properties define the file's structure, versioning, encoding, and organization, allowing parsing without loading full data.
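The tag and Array Flags properties listed above are the parts of the layout that are easiest to get wrong, so a small sketch may help. It assumes the Level 5 conventions described in the list: a normal tag is two uint32 words (type, byte count), while the small format packs the byte count into the upper 16 bits and the type into the lower 16 bits of a single word, with the data in the following 4 bytes. The helper names read_tag and decode_array_flags are hypothetical.

```python
import struct


def read_tag(buf: bytes, pos: int, order: str = "<"):
    """Return (data_type, byte_count, data_start, next_pos) for the tag at pos."""
    (word,) = struct.unpack_from(order + "I", buf, pos)
    if word >> 16:
        # Small format: the whole element is always 8 bytes long.
        return word & 0xFFFF, word >> 16, pos + 4, pos + 8
    (count,) = struct.unpack_from(order + "I", buf, pos + 4)
    start = pos + 8
    # Data is padded so the next element starts on an 8-byte boundary.
    return word, count, start, start + count + (-count) % 8


def decode_array_flags(word: int) -> dict:
    """Split the first Array Flags word into class and flag bits."""
    return {
        "class": word & 0xFF,
        "complex": bool(word & 0x0800),
        "global": bool(word & 0x0400),
        "logical": bool(word & 0x0200),
    }
```

Reading the word in the file's byte order first, then masking, keeps the same code valid for both little-endian and big-endian files.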
Two direct download links for .MAT files:
- https://web.ece.ucsb.edu/scl/datasets/Bio_ITR_retinaI.mat (BioRetina dataset, homogeneous texture descriptors).
- https://web.ece.ucsb.edu/scl/datasets/Bio_ITR_retinaII.mat (Extended BioRetina dataset).
Ghost blog embedded HTML JavaScript for drag-and-drop .MAT file dump:
This is an embeddable HTML snippet with embedded JavaScript (suitable for a Ghost blog or any HTML page). It creates a drop zone; on drop, it reads the .MAT file using FileReader and DataView, parses the Level 5 structure (handling endianness, compression via pako for zlib), and dumps the properties to the screen in a pre element. Note: This is client-side only, no server needed; assumes Level 5 (non-HDF5).
Python class for .MAT handling:
This class handles Level 5 .MAT files: opens, decodes (reads), prints properties to console, and writes a simple .MAT file with a sample matrix (for demonstration; extend for full write).
import struct
import zlib


class MatFileHandler:
    # Either pattern in the subsystem offset field means "no subsystem data"
    NO_SUBSYSTEM = (b'\x00' * 8, b' ' * 8)

    def __init__(self, filename):
        self.filename = filename
        self.little_endian = True
        self.properties = {}

    @staticmethod
    def _read_tag(buf, pos, fmt):
        """Return (type, byte count, data start, position of the next element)."""
        word, = struct.unpack_from(f'{fmt}I', buf, pos)
        if word >> 16:
            # Small format: byte count in the upper 16 bits, type in the lower 16,
            # data packed into the 4 bytes that follow
            return word & 0xFFFF, word >> 16, pos + 4, pos + 8
        count, = struct.unpack_from(f'{fmt}I', buf, pos + 4)
        return word, count, pos + 8, pos + 8 + count + (-count) % 8

    def read_decode(self):
        with open(self.filename, 'rb') as f:
            data = f.read()
        # Header (128 bytes)
        text = data[:116].decode('ascii', errors='ignore').strip()
        endian = data[126:128].decode('ascii', errors='replace')
        # The indicator is the 16-bit value 0x4D49 written in the file's byte
        # order, so little-endian files store the bytes 'IM' and big-endian 'MI'
        self.little_endian = endian != 'MI'
        fmt = '<' if self.little_endian else '>'
        has_subsystem = data[116:124] not in self.NO_SUBSYSTEM
        subsystem_offset = struct.unpack(f'{fmt}q', data[116:124])[0] if has_subsystem else 0
        version, = struct.unpack(f'{fmt}H', data[124:126])
        self.properties['Descriptive Text'] = text
        self.properties['Subsystem Data Offset'] = subsystem_offset
        self.properties['Version'] = version
        self.properties['Endian Indicator'] = endian
        if has_subsystem:
            # Parse subsystem (simplified)
            self.properties['Subsystem Data'] = 'Present'
        # Data elements
        offset = 128
        elements = []
        while offset + 8 <= len(data):
            type_val, bytes_val, start, next_offset = self._read_tag(data, offset, fmt)
            elem_props = {'Type': type_val, 'Byte Count': bytes_val, 'Compression Flag': 'No'}
            elem_data = data[start:start + bytes_val]
            if type_val == 15:  # miCOMPRESSED
                elem_props['Compression Flag'] = 'Yes'
                # Compressed elements are not padded, so advance by the stored size only
                next_offset = start + bytes_val
                inner = zlib.decompress(elem_data)
                # The decompressed bytes are a complete element with its own tag
                type_val, inner_bytes, inner_start, _ = self._read_tag(inner, 0, fmt)
                elem_props['Decompressed Type'] = type_val
                elem_props['Decompressed Byte Count'] = inner_bytes
                elem_data = inner[inner_start:inner_start + inner_bytes]
            if type_val == 14 and elem_data:  # miMATRIX
                self._parse_matrix(elem_data, fmt, elem_props)
            elements.append(elem_props)
            offset = next_offset
        self.properties['Data Element Count'] = len(elements)
        self.properties['Data Elements'] = elements

    def _parse_matrix(self, buf, fmt, props):
        # Array Flags: two uint32 words (class/flags word, then nzmax)
        _, _, start, pos = self._read_tag(buf, 0, fmt)
        flags_word, nzmax = struct.unpack_from(f'{fmt}II', buf, start)
        flags = (flags_word >> 8) & 0xFF
        props['Matrix Class'] = flags_word & 0xFF
        props['Matrix Flags'] = flags
        props['Complex'] = bool(flags & 8)
        props['Global'] = bool(flags & 4)
        props['Logical'] = bool(flags & 2)
        props['NZMax'] = nzmax
        # Dimensions
        _, dim_bytes, start, pos = self._read_tag(buf, pos, fmt)
        num_dims = dim_bytes // 4
        props['Dimensions'] = struct.unpack_from(f'{fmt}{num_dims}i', buf, start)
        # Array Name (short names may use the small format, handled by _read_tag)
        _, name_bytes, start, pos = self._read_tag(buf, pos, fmt)
        props['Array Name'] = buf[start:start + name_bytes].decode('ascii', errors='ignore')
        # The remaining subelements (the actual data) are skipped

    def print_properties(self):
        for key, value in self.properties.items():
            if key == 'Data Elements':
                for i, elem in enumerate(value, 1):
                    print(f"Data Element {i}:")
                    for ek, ev in elem.items():
                        print(f"  {ek}: {ev}")
            else:
                print(f"{key}: {value}")

    def write_simple(self, out_filename):
        # Write a little-endian file holding one 1x1 double named 'test' (= 42)
        header_text = b'MATLAB 5.0 MAT-file, Platform: python, Created on: Mon Sep 29 2025'.ljust(116, b' ')
        subsystem_offset = b'\x00' * 8
        version = struct.pack('<H', 0x0100)
        endian = struct.pack('<H', 0x4D49)  # bytes b'IM' on disk mark little-endian data
        header = header_text + subsystem_offset + version + endian
        array_flags = struct.pack('<II', 6, 0)  # class 6 (double), no flags; nzmax 0
        flags_tag = struct.pack('<II', 6, 8) + array_flags  # miUINT32, 8 bytes
        dims = struct.pack('<ii', 1, 1)
        dims_tag = struct.pack('<II', 5, 8) + dims  # miINT32, 8 bytes
        name = b'test'
        name_tag = struct.pack('<II', 1, len(name)) + name + b'\x00' * (-len(name) % 8)  # miINT8, padded
        real = struct.pack('<d', 42.0)
        real_tag = struct.pack('<II', 9, 8) + real  # miDOUBLE, 8 bytes
        matrix_data = flags_tag + dims_tag + name_tag + real_tag
        matrix_tag = struct.pack('<II', 14, len(matrix_data))  # miMATRIX
        with open(out_filename, 'wb') as f:
            f.write(header + matrix_tag + matrix_data)


# Example usage
if __name__ == '__main__':
    handler = MatFileHandler('example.mat')
    handler.read_decode()
    handler.print_properties()
    handler.write_simple('output.mat')
Java class for .MAT handling:
This class handles Level 5 .MAT files similarly: opens, decodes, prints to console, and writes a simple file. Uses ByteBuffer for parsing.
import java.io.*;
import java.nio.*;
import java.util.*;
import java.util.zip.Inflater;
public class MatFileHandler {
private String filename;
private boolean littleEndian = true;
private Map<String, Object> properties = new HashMap<>();
public MatFileHandler(String filename) {
this.filename = filename;
}
public void readDecode() throws IOException {
byte[] data;
try (FileInputStream fis = new FileInputStream(filename);
ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
byte[] buffer = new byte[1024];
int len;
while ((len = fis.read(buffer)) != -1) {
baos.write(buffer, 0, len);
}
data = baos.toByteArray();
}
ByteBuffer bb = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
// Header
byte[] textBytes = new byte[116];
bb.get(textBytes);
String text = new String(textBytes, "ASCII").trim();
long subsystemOffset = bb.getLong();
short version = bb.getShort();
byte[] endianBytes = new byte[2];
bb.get(endianBytes);
String endian = new String(endianBytes, "ASCII");
if (endian.equals("MI")) { // bytes 'MI' on disk mark a big-endian file; little-endian files store 'IM'
littleEndian = false;
bb.order(ByteOrder.BIG_ENDIAN);
bb.position(116);
subsystemOffset = bb.getLong();
version = bb.getShort();
}
properties.put("Descriptive Text", text);
properties.put("Subsystem Data Offset", subsystemOffset);
properties.put("Version", version);
properties.put("Endian Indicator", endian);
if (subsystemOffset > 0) {
properties.put("Subsystem Data", "Present");
}
// Data elements
int offset = 128;
int elementCount = 0;
List<Map<String, Object>> elements = new ArrayList<>();
while (offset < data.length) {
bb.position(offset);
int type = bb.getInt();
int bytes = bb.getInt();
boolean isSmall = (type >>> 16) != 0; // small format: byte count in the upper 16 bits, type in the lower 16
if (isSmall) {
bytes = type >>> 16;
type = type & 0xFFFF;
}
int tagSize = isSmall ? 4 : 8;
elementCount++;
Map<String, Object> elemProps = new HashMap<>();
elemProps.put("Type", type);
elemProps.put("Byte Count", bytes);
elemProps.put("Compression Flag", "No");
byte[] elemData = Arrays.copyOfRange(data, offset + tagSize, offset + tagSize + bytes);
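if (type == 15) { // miCOMPRESSED
elemProps.put("Compression Flag", "Yes");
Inflater inflater = new Inflater();
inflater.setInput(elemData);
ByteArrayOutputStream inflated = new ByteArrayOutputStream();
byte[] chunk = new byte[4096];
try {
// Inflate in a loop: the decompressed size is not known in advance
while (!inflater.finished()) {
int n = inflater.inflate(chunk);
if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) break;
inflated.write(chunk, 0, n);
}
} catch (java.util.zip.DataFormatException e) {
throw new IOException("Corrupt compressed element at offset " + offset, e);
} finally {
inflater.end();
}
elemData = inflated.toByteArray();
ByteBuffer subBb = ByteBuffer.wrap(elemData).order(littleEndian ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
type = subBb.getInt();
int innerBytes = subBb.getInt(); // 'bytes' keeps the compressed size for offset bookkeeping
elemProps.put("Decompressed Type", type);
elemProps.put("Decompressed Byte Count", innerBytes);
elemData = Arrays.copyOfRange(elemData, 8, 8 + innerBytes);
}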
if (type == 14) { // miMATRIX
ByteBuffer subBb = ByteBuffer.wrap(elemData).order(littleEndian ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
int subOffset = 0;
// Array Flags
subBb.getInt(); // flagType
int flagBytes = subBb.getInt();
subOffset += 8;
int flagsWord = subBb.getInt(); // read in file byte order: low byte = class, next byte = flags
int matrixClass = flagsWord & 0xFF;
int flags = (flagsWord >>> 8) & 0xFF;
int nzmax = subBb.getInt();
elemProps.put("Matrix Class", matrixClass);
elemProps.put("Matrix Flags", flags);
elemProps.put("Complex", (flags & 8) != 0);
elemProps.put("Global", (flags & 4) != 0);
elemProps.put("Logical", (flags & 2) != 0);
elemProps.put("NZMax", nzmax);
subOffset += flagBytes; // skip the flags payload; subOffset already counts the 8-byte tag
// Dimensions
subBb.position(subOffset);
subBb.getInt(); // dimType
int dimBytes = subBb.getInt();
subOffset += 8;
int numDims = dimBytes / 4;
int[] dims = new int[numDims];
for (int i = 0; i < numDims; i++) {
dims[i] = subBb.getInt();
}
elemProps.put("Dimensions", Arrays.toString(dims)); // store as text so printing shows the values
subOffset += dimBytes;
// Array Name
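subBb.position(subOffset);
int nameWord = subBb.getInt();
int nameBytes;
if ((nameWord >>> 16) != 0) {
nameBytes = nameWord >>> 16; // small format: the name follows the 4-byte tag directly
} else {
nameBytes = subBb.getInt(); // normal format: second word is the byte count
}
byte[] nameB = new byte[nameBytes];
subBb.get(nameB);
String name = new String(nameB, "ASCII").trim();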
elemProps.put("Array Name", name);
}
elements.add(elemProps);
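// Compressed elements are written without padding; small-format elements always occupy 8 bytes
boolean compressed = "Yes".equals(elemProps.get("Compression Flag"));
int padding = compressed ? 0 : (8 - ((tagSize + bytes) % 8)) % 8;
offset += tagSize + bytes + padding;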
}
properties.put("Data Element Count", elementCount);
properties.put("Data Elements", elements);
}
public void printProperties() {
for (Map.Entry<String, Object> entry : properties.entrySet()) {
if (entry.getKey().equals("Data Elements")) {
List<Map<String, Object>> elements = (List<Map<String, Object>>) entry.getValue();
for (int i = 0; i < elements.size(); i++) {
System.out.println("Data Element " + (i + 1) + ":");
for (Map.Entry<String, Object> e : elements.get(i).entrySet()) {
System.out.println(" " + e.getKey() + ": " + e.getValue());
}
}
} else {
System.out.println(entry.getKey() + ": " + entry.getValue());
}
}
}
public void writeSimple(String outFilename) throws IOException {
// Simple double matrix 'test' = 42
byte[] headerText = "MATLAB 5.0 MAT-file, Platform: java, Created on: Mon Sep 29 2025".getBytes("ASCII");
byte[] paddedText = Arrays.copyOf(headerText, 116);
ByteBuffer bb = ByteBuffer.allocate(128).order(ByteOrder.LITTLE_ENDIAN);
bb.put(paddedText);
bb.putLong(0); // subsystem
bb.putShort((short) 0x0100);
bb.put("IM".getBytes("ASCII")); // 0x4D49 in little-endian byte order, matching the data written below
byte[] header = bb.array();
// Matrix
byte[] arrayFlags = new byte[8];
ByteBuffer flagBb = ByteBuffer.wrap(arrayFlags).order(ByteOrder.LITTLE_ENDIAN);
flagBb.put((byte)6); // class double (low byte of the flags word)
flagBb.put((byte)0); // flags (complex/global/logical bits)
flagBb.put((byte)0); // undefined
flagBb.put((byte)0); // undefined
flagBb.putInt(0); // nzmax
byte[] flagsTag = new byte[16];
ByteBuffer.wrap(flagsTag).order(ByteOrder.LITTLE_ENDIAN).putInt(6).putInt(8).put(arrayFlags);
int[] dimsArr = {1, 1};
byte[] dims = new byte[8];
ByteBuffer.wrap(dims).order(ByteOrder.LITTLE_ENDIAN).putInt(dimsArr[0]).putInt(dimsArr[1]);
byte[] dimsTag = new byte[16];
ByteBuffer.wrap(dimsTag).order(ByteOrder.LITTLE_ENDIAN).putInt(5).putInt(8).put(dims);
String nameStr = "test";
byte[] name = nameStr.getBytes("ASCII");
byte[] namePadded = Arrays.copyOf(name, ((name.length + 7) / 8) * 8);
byte[] nameTag = new byte[8 + namePadded.length];
ByteBuffer.wrap(nameTag).order(ByteOrder.LITTLE_ENDIAN).putInt(1).putInt(name.length).put(namePadded);
double realVal = 42.0;
byte[] real = new byte[8];
ByteBuffer.wrap(real).order(ByteOrder.LITTLE_ENDIAN).putDouble(realVal);
byte[] realTag = new byte[16];
ByteBuffer.wrap(realTag).order(ByteOrder.LITTLE_ENDIAN).putInt(9).putInt(8).put(real);
int matrixLen = flagsTag.length + dimsTag.length + nameTag.length + realTag.length;
byte[] matrixTag = new byte[8];
ByteBuffer.wrap(matrixTag).order(ByteOrder.LITTLE_ENDIAN).putInt(14).putInt(matrixLen);
try (FileOutputStream fos = new FileOutputStream(outFilename)) {
fos.write(header);
fos.write(matrixTag);
fos.write(flagsTag);
fos.write(dimsTag);
fos.write(nameTag);
fos.write(realTag);
}
}
public static void main(String[] args) throws IOException {
MatFileHandler handler = new MatFileHandler("example.mat");
handler.readDecode();
handler.printProperties();
handler.writeSimple("output.mat");
}
}
JavaScript class for .MAT handling:
This class is Node.js compatible (uses fs, zlib); opens, decodes, prints to console, writes simple file. For browser, adapt with FileReader.
const fs = require('fs');
const zlib = require('zlib');
class MatFileHandler {
constructor(filename) {
this.filename = filename;
this.littleEndian = true;
this.properties = {};
}
readDecode() {
const data = fs.readFileSync(this.filename);
const view = new DataView(data.buffer, data.byteOffset, data.byteLength); // a Buffer may be a window into a larger ArrayBuffer
let offset = 0;
// Header
const textDecoder = new TextDecoder('ascii');
const text = textDecoder.decode(data.slice(0, 116)).trim();
let subsystemOffset = view.getBigInt64(116, true); // LE
let version = view.getUint16(124, true);
const endian = textDecoder.decode(data.slice(126, 128));
if (endian === 'MI') { // bytes 'MI' on disk mark a big-endian file; little-endian files store 'IM'
this.littleEndian = false;
subsystemOffset = view.getBigInt64(116, false);
version = view.getUint16(124, false);
}
this.properties['Descriptive Text'] = text;
this.properties['Subsystem Data Offset'] = subsystemOffset.toString();
this.properties['Version'] = version;
this.properties['Endian Indicator'] = endian;
if (subsystemOffset > 0n) {
this.properties['Subsystem Data'] = 'Present';
}
// Data elements
offset = 128;
let elementCount = 0;
const elements = [];
while (offset < data.length) {
let type = view.getUint32(offset, this.littleEndian);
let bytes = view.getUint32(offset + 4, this.littleEndian);
const isSmall = (type >>> 16) !== 0; // small format: byte count in the upper 16 bits, type in the lower 16
if (isSmall) {
bytes = type >>> 16;
type = type & 0xFFFF;
}
const tagSize = isSmall ? 4 : 8;
elementCount++;
const elemProps = { Type: type, 'Byte Count': bytes, 'Compression Flag': 'No' };
let elemData = data.slice(offset + tagSize, offset + tagSize + bytes);
if (type === 15) { // miCOMPRESSED
elemProps['Compression Flag'] = 'Yes';
elemData = zlib.inflateSync(elemData);
const subView = new DataView(elemData.buffer, elemData.byteOffset, elemData.byteLength);
type = subView.getUint32(0, this.littleEndian);
const innerBytes = subView.getUint32(4, this.littleEndian); // 'bytes' keeps the compressed size for offset bookkeeping
elemProps['Decompressed Type'] = type;
elemProps['Decompressed Byte Count'] = innerBytes;
elemData = elemData.slice(8, 8 + innerBytes);
}
if (type === 14) { // miMATRIX
const subView = new DataView(elemData.buffer, elemData.byteOffset, elemData.byteLength); // elemData is a slice, so honor its offset
let subOffset = 0;
subView.getUint32(subOffset, this.littleEndian); // flagType
const flagBytes = subView.getUint32(subOffset + 4, this.littleEndian);
subOffset += 8;
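const flagsWord = subView.getUint32(subOffset, this.littleEndian);
const matrixClass = flagsWord & 0xFF; // low byte: array class
const flags = (flagsWord >>> 8) & 0xFF; // next byte: complex/global/logical bits
const nzmax = subView.getUint32(subOffset + 4, this.littleEndian);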
elemProps['Matrix Class'] = matrixClass;
elemProps['Matrix Flags'] = flags;
elemProps['Complex'] = (flags & 8) > 0;
elemProps['Global'] = (flags & 4) > 0;
elemProps['Logical'] = (flags & 2) > 0;
elemProps['NZMax'] = nzmax;
subOffset += flagBytes;
subView.getUint32(subOffset, this.littleEndian); // dimType
const dimBytes = subView.getUint32(subOffset + 4, this.littleEndian);
subOffset += 8;
const numDims = dimBytes / 4;
const dims = [];
for (let i = 0; i < numDims; i++) {
dims.push(subView.getInt32(subOffset + i * 4, this.littleEndian));
}
elemProps['Dimensions'] = dims;
subOffset += dimBytes;
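const nameWord = subView.getUint32(subOffset, this.littleEndian);
let nameBytes;
if (nameWord >>> 16) { // small format: the name follows the 4-byte tag directly
nameBytes = nameWord >>> 16;
subOffset += 4;
} else {
nameBytes = subView.getUint32(subOffset + 4, this.littleEndian);
subOffset += 8;
}
const name = textDecoder.decode(elemData.slice(subOffset, subOffset + nameBytes)).trim();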
elemProps['Array Name'] = name;
}
elements.push(elemProps);
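// Compressed elements are written without padding; small-format elements always occupy 8 bytes
const compressed = elemProps['Compression Flag'] === 'Yes';
const padding = compressed ? 0 : (8 - ((tagSize + bytes) % 8)) % 8;
offset += tagSize + bytes + padding;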
}
this.properties['Data Element Count'] = elementCount;
this.properties['Data Elements'] = elements;
}
printProperties() {
for (const [key, value] of Object.entries(this.properties)) {
if (key === 'Data Elements') {
value.forEach((elem, i) => {
console.log(`Data Element ${i + 1}:`);
for (const [ek, ev] of Object.entries(elem)) {
console.log(` ${ek}: ${ev}`);
}
});
} else {
console.log(`${key}: ${value}`);
}
}
}
writeSimple(outFilename) {
const headerText = 'MATLAB 5.0 MAT-file, Platform: js, Created on: Mon Sep 29 2025'.padEnd(116, ' ');
const subsystemOffset = Buffer.alloc(8, 0);
const version = Buffer.alloc(2);
version.writeUint16LE(0x0100, 0);
const endian = Buffer.from('IM'); // 0x4D49 in little-endian byte order, matching the data written below
const header = Buffer.concat([Buffer.from(headerText), subsystemOffset, version, endian]);
// Simple matrix
const arrayFlags = Buffer.alloc(8);
arrayFlags.writeUint8(6, 0); // class
arrayFlags.writeUint8(0, 1); // flags (second byte of the little-endian flags word)
arrayFlags.writeUint32LE(0, 4); // nzmax
const flagsTag = Buffer.concat([Buffer.from([6, 0, 0, 0, 8, 0, 0, 0]), arrayFlags]);
const dims = Buffer.alloc(8);
dims.writeInt32LE(1, 0);
dims.writeInt32LE(1, 4);
const dimsTag = Buffer.concat([Buffer.from([5, 0, 0, 0, 8, 0, 0, 0]), dims]);
const name = Buffer.from('test');
const namePadded = Buffer.alloc(Math.ceil(name.length / 8) * 8, 0);
name.copy(namePadded);
const nameTag = Buffer.alloc(8 + namePadded.length);
nameTag.writeUint32LE(1, 0);
nameTag.writeUint32LE(name.length, 4);
namePadded.copy(nameTag, 8);
const real = Buffer.alloc(8);
real.writeDoubleLE(42.0, 0);
const realTag = Buffer.concat([Buffer.from([9, 0, 0, 0, 8, 0, 0, 0]), real]);
const matrixData = Buffer.concat([flagsTag, dimsTag, nameTag, realTag]);
const matrixTag = Buffer.alloc(8);
matrixTag.writeUint32LE(14, 0);
matrixTag.writeUint32LE(matrixData.length, 4);
fs.writeFileSync(outFilename, Buffer.concat([header, matrixTag, matrixData]));
}
}
// Example
const handler = new MatFileHandler('example.mat');
handler.readDecode();
handler.printProperties();
handler.writeSimple('output.mat');
C class (using C++ for class support) for .MAT handling:
This C++ class handles Level 5 .MAT: opens, decodes, prints to console, writes simple file. Uses std::ifstream, zlib.h for compression.
#include <iostream>
#include <fstream>
#include <vector>
#include <map>
#include <string>
#include <cstring>
#include <algorithm> // std::remove
#include <zlib.h> // Link with -lz
class MatFileHandler {
private:
std::string filename;
bool littleEndian;
std::map<std::string, std::string> properties; // Simplified to strings for print
std::vector<std::map<std::string, std::string>> elements;
public:
MatFileHandler(const std::string& fn) : filename(fn), littleEndian(true) {}
void readDecode() {
std::ifstream file(filename, std::ios::binary | std::ios::ate);
std::streamsize size = file.tellg();
file.seekg(0, std::ios::beg);
std::vector<char> data(size);
file.read(data.data(), size);
// Header
std::string text(data.begin(), data.begin() + 116);
text.erase(std::remove(text.begin(), text.end(), '\0'), text.end()); // Trim nulls
long long subsystemOffset;
std::memcpy(&subsystemOffset, &data[116], 8);
unsigned short version;
std::memcpy(&version, &data[124], 2);
char endian[3];
std::memcpy(endian, &data[126], 2);
endian[2] = '\0';
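if (std::string(endian) == "MI") {
littleEndian = false;
// Bytes 'MI' on disk mark a big-endian file (little-endian files store 'IM').
// This simplified reader does not byte-swap, so its output is only meaningful
// for little-endian files read on a little-endian host.
}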
properties["Descriptive Text"] = text;
properties["Subsystem Data Offset"] = std::to_string(subsystemOffset);
properties["Version"] = std::to_string(version);
properties["Endian Indicator"] = endian;
if (subsystemOffset > 0) {
properties["Subsystem Data"] = "Present";
}
// Data elements
size_t offset = 128;
int elementCount = 0;
while (offset < static_cast<size_t>(size)) {
unsigned int type, bytes;
std::memcpy(&type, &data[offset], 4);
std::memcpy(&bytes, &data[offset + 4], 4);
bool isSmall = (type >> 16) != 0; // small format: byte count in the upper 16 bits, type in the lower 16
if (isSmall) {
bytes = type >> 16;
type &= 0xFFFF;
}
size_t tagSize = isSmall ? 4 : 8;
elementCount++;
std::map<std::string, std::string> elemProps;
elemProps["Type"] = std::to_string(type);
elemProps["Byte Count"] = std::to_string(bytes);
elemProps["Compression Flag"] = "No";
std::vector<char> elemData(data.begin() + offset + tagSize, data.begin() + offset + tagSize + bytes);
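if (type == 15) { // miCOMPRESSED
elemProps["Compression Flag"] = "Yes";
// The decompressed size is not stored, so grow the buffer until uncompress() succeeds
std::vector<char> decomp(static_cast<size_t>(bytes) * 4 + 64);
uLongf decompLen = 0;
int rc = Z_BUF_ERROR;
while (rc == Z_BUF_ERROR && decomp.size() < (1u << 30)) {
decompLen = static_cast<uLongf>(decomp.size());
rc = uncompress(reinterpret_cast<Bytef*>(decomp.data()), &decompLen, reinterpret_cast<const Bytef*>(elemData.data()), bytes);
if (rc == Z_BUF_ERROR) decomp.resize(decomp.size() * 2);
}
if (rc != Z_OK || decompLen < 8) {
// Could not inflate: record the element and move on (compressed elements are not padded)
elemProps["Decompressed Type"] = "unreadable";
elements.push_back(elemProps);
offset += tagSize + bytes;
continue;
}
unsigned int innerBytes; // 'bytes' keeps the compressed size for offset bookkeeping
std::memcpy(&type, decomp.data(), 4);
std::memcpy(&innerBytes, decomp.data() + 4, 4);
elemProps["Decompressed Type"] = std::to_string(type);
elemProps["Decompressed Byte Count"] = std::to_string(innerBytes);
elemData.assign(decomp.begin() + 8, decomp.begin() + 8 + innerBytes);
}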
if (type == 14) { // miMATRIX
size_t subOffset = 0;
unsigned int flagBytes;
std::memcpy(&flagBytes, &elemData[subOffset + 4], 4);
subOffset += 8;
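unsigned int flagsWord;
std::memcpy(&flagsWord, &elemData[subOffset], 4);
unsigned char matrixClass = flagsWord & 0xFF; // low byte: array class
unsigned char flags = (flagsWord >> 8) & 0xFF; // next byte: complex/global/logical bits
unsigned int nzmax;
std::memcpy(&nzmax, &elemData[subOffset + 4], 4);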
elemProps["Matrix Class"] = std::to_string(matrixClass);
elemProps["Matrix Flags"] = std::to_string(flags);
elemProps["Complex"] = (flags & 8) ? "true" : "false";
elemProps["Global"] = (flags & 4) ? "true" : "false";
elemProps["Logical"] = (flags & 2) ? "true" : "false";
elemProps["NZMax"] = std::to_string(nzmax);
subOffset += flagBytes;
unsigned int dimBytes;
std::memcpy(&dimBytes, &elemData[subOffset + 4], 4);
subOffset += 8;
int numDims = dimBytes / 4;
std::string dimsStr;
for (int i = 0; i < numDims; i++) {
int dim;
std::memcpy(&dim, &elemData[subOffset + i * 4], 4);
dimsStr += std::to_string(dim) + " ";
}
elemProps["Dimensions"] = dimsStr;
subOffset += dimBytes;
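unsigned int nameWord, nameBytes;
std::memcpy(&nameWord, &elemData[subOffset], 4);
if (nameWord >> 16) { // small format: the name follows the 4-byte tag directly
nameBytes = nameWord >> 16;
subOffset += 4;
} else {
std::memcpy(&nameBytes, &elemData[subOffset + 4], 4);
subOffset += 8;
}
std::string name(elemData.begin() + subOffset, elemData.begin() + subOffset + nameBytes);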
name.erase(std::remove(name.begin(), name.end(), '\0'), name.end());
elemProps["Array Name"] = name;
}
elements.push_back(elemProps);
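// Compressed elements are written without padding; small-format elements always occupy 8 bytes
bool compressed = elemProps["Compression Flag"] == "Yes";
size_t padding = compressed ? 0 : (8 - ((tagSize + bytes) % 8)) % 8;
offset += tagSize + bytes + padding;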
}
properties["Data Element Count"] = std::to_string(elementCount);
}
void printProperties() {
for (const auto& prop : properties) {
std::cout << prop.first << ": " << prop.second << std::endl;
}
for (size_t i = 0; i < elements.size(); ++i) {
std::cout << "Data Element " << (i + 1) << ":" << std::endl;
for (const auto& e : elements[i]) {
std::cout << " " << e.first << ": " << e.second << std::endl;
}
}
}
void writeSimple(const std::string& outFilename) {
std::ofstream file(outFilename, std::ios::binary);
std::string headerText = "MATLAB 5.0 MAT-file, Platform: c++, Created on: Mon Sep 29 2025";
headerText.resize(116, ' ');
file.write(headerText.data(), 116);
char zero[8] = {0};
file.write(zero, 8); // subsystem
unsigned short ver = 0x0100;
file.write(reinterpret_cast<char*>(&ver), 2);
file.write("IM", 2); // 0x4D49 in little-endian byte order, matching the data written below
// Simple matrix tag
unsigned int matType = 14;
file.write(reinterpret_cast<char*>(&matType), 4);
unsigned int matBytes = 64; // Calculate based on content
file.write(reinterpret_cast<char*>(&matBytes), 4);
// Flags
unsigned int flagType = 6;
file.write(reinterpret_cast<char*>(&flagType), 4);
unsigned int flagBytes = 8;
file.write(reinterpret_cast<char*>(&flagBytes), 4);
char flagsData[8] = {6, 0, 0, 0, 0, 0, 0, 0};
file.write(flagsData, 8);
// Dims
unsigned int dimType = 5;
file.write(reinterpret_cast<char*>(&dimType), 4);
unsigned int dimBytes = 8;
file.write(reinterpret_cast<char*>(&dimBytes), 4);
int dims[2] = {1, 1};
file.write(reinterpret_cast<char*>(dims), 8);
// Name
unsigned int nameType = 1;
file.write(reinterpret_cast<char*>(&nameType), 4);
unsigned int nameBytes = 4;
file.write(reinterpret_cast<char*>(&nameBytes), 4);
file.write("test", 4);
file.write(zero, 4); // Pad
// Real
unsigned int realType = 9;
file.write(reinterpret_cast<char*>(&realType), 4);
unsigned int realBytes = 8;
file.write(reinterpret_cast<char*>(&realBytes), 4);
double val = 42.0;
file.write(reinterpret_cast<char*>(&val), 8);
}
};
int main() {
MatFileHandler handler("example.mat");
handler.readDecode();
handler.printProperties();
handler.writeSimple("output.mat");
return 0;
}