Task 282: .HDF File Format
File Format Specifications for .HDF (HDF4)
The .HDF file format refers to the Hierarchical Data Format version 4 (HDF4), a self-describing, platform-independent format for storing and organizing large amounts of scientific and graphical data in a hierarchical structure. It supports multidimensional arrays, raster images, tables, annotations, and metadata, with extensibility for custom data types. HDF5 is the recommended successor for new work, but HDF4 remains in use for legacy data.
Detailed specifications are available in the official HDF4 User's Guide and Reference Manual, which describes the binary layout, tags, data descriptors, and object parsing rules. A key document is the HDF 4.2r4 User's Guide (PDF), covering file structure in Chapter 2 and object-specific formats in subsequent chapters. Additional technical details are in the HDF Reference Manual on Zenodo.
- List of All Properties Intrinsic to Its File System
The HDF4 file system is a hierarchical, block-based structure resembling a tree (like a UNIX filesystem), with self-describing metadata via tags and references. Intrinsic properties include structural elements for identification, organization, navigation, and data management. Based on the format specification, here is a comprehensive list:
- File Signature: 4-byte magic number (0x0E 0x03 0x13 0x01, i.e. ASCII ^N^C^S^A) at offset 0, identifying the file as HDF4 on any platform.
- File Header: HDF4 has no separate header record beyond the signature; the first DD block follows it directly at offset 4, and version and utility information is stored as ordinary tagged data objects.
- Library Version: Stored as a DFTAG_VERSION (tag 0x001E) data object; three 32-bit integers (major, minor, release) followed by a string describing the HDF library version (e.g., version 4.2 release 4).
- Machine Type: DFTAG_MT (tag 0x006B); the object carries no data bytes. Instead, its 16-bit reference number encodes the writing machine's native formats in four 4-bit fields (floating-point, integer, and character representations).
- Number Type: DFTAG_NT (tag 0x006A) data object; four 1-byte fields (version, type, width in bits, class) specifying a numeric representation.
- Data Descriptor (DD): 12-byte structure per object (2-byte tag + 2-byte ref + 4-byte offset + 4-byte length); describes object type, unique ID, location, and size.
- Data Descriptor Block (DD Block): Linked list of DDs; starts with 6-byte header (2-byte count of DDs + 4-byte offset to next block); default 16 DDs per block (tunable); enables navigation via offsets.
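- Linked Blocks: DD blocks are chained through the next-block offset, which is absolute from the file start (0 for the last block). Separately, the linked-block special element (DFTAG_LINKED, tag 0x0014) lets one data element span non-contiguous blocks, which supports appendable data.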
- Reference Numbers: 16-bit unique IDs per tag instance; immutable, used for linking objects (e.g., in Vgroups).
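- Tags: 16-bit identifiers for object types; 1-32,767 are reserved for the core library, 32,768-64,999 are user-definable, and 65,000-65,535 are reserved for future expansion; DFTAG_NULL (1) marks an empty DD slot, and 0 is the wildcard value used in searches.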
- Offsets and Lengths: 32-bit signed integers in DDs; limit file size to ~2 GB; absolute byte positions from file start.
- Vgroups (Virtual Groups): Hierarchical containers (DFTAG_VG, tag 0x07AD); like directories, holding tag/ref pairs of child objects (up to 65,535); support nesting for tree structure.
- Vdatas (Virtual Data): Table-like objects (DFTAG_VH, tag 0x07AA, for the header and DFTAG_VS, tag 0x07AB, for the records); support fields, rows, and appendable storage via linked blocks.
- Scientific Data Sets (SDS): Multidimensional arrays (DFTAG_NDG, tag 0x02D0, groups the pieces; DFTAG_SD, tag 0x02BE, holds the array data); include dimensions, data types, scales, and labels.
- Raster Images (GR): 8-bit images (DFTAG_RI8, tag 0x00CA) and general rasters including 24-bit (DFTAG_RI, tag 0x012E, grouped by DFTAG_RIG, tag 0x0132); support interlacing (pixel/plane) and palettes.
- Palettes: 3x256-byte color tables (DFTAG_IP8, tag 0x00C9, or DFTAG_LUT, tag 0x012D); linked to images via refs.
- Annotations: Descriptive text (DFTAG_FID 0x0064 and DFTAG_FD 0x0065 for file labels and descriptions; DFTAG_DIL 0x0068 and DFTAG_DIA 0x0069 for data object labels and descriptions).
- Attributes: Key-value metadata; stored as small Vdatas (class name "Attr0.0") rather than under dedicated tags, and attached to objects via refs.
- Chunks: Subdivisions of large SDS/GR/Vdata for efficient access (max 65,535 per object); dimensions and compression type stored in headers.
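- Compression: Built-in methods (e.g., RLE, NBIT, skipping Huffman, deflate/gzip) applied to whole data elements or to individual chunks; parameters are stored in the compressed special element's header (DFTAG_COMPRESSED, tag 0x0028).
- Self-Describing Nature: All metadata embedded; no external schema needed; utilities like hdp parse via DDs/tags.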
- Platform Independence: Byte-swapping and type conversion handled via tags (NT/MT); supports big/little-endian.
- Extensibility: User-defined tags and linked blocks for custom objects/external files.
- Block Size Tunability: Adjustable for performance (e.g., ≥4 KB, power-of-2); affects seeking in linked/appendable data.
These properties enable the format's core functionality: hierarchical organization, efficient storage/retrieval, and portability.
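As a rough illustration of the DD and DD block layouts listed above, the following Python sketch decodes one hand-built DD block from memory. The helper names (`tag_range`, `decode_dd_block`), the three-entry tag lookup, and the sample bytes are assumptions made for this example; it is a minimal sketch, not the HDF library's own parser.

```python
import struct

# Small lookup of tag numbers mentioned in the list above (subset only).
TAG_NAMES = {30: "DFTAG_VERSION", 106: "DFTAG_NT", 107: "DFTAG_MT"}

DD_HEADER = struct.Struct(">HI")   # DD count, absolute offset of next DD block
DD_ENTRY = struct.Struct(">HHii")  # tag, ref, data offset, data length


def tag_range(tag):
    """Classify a tag by the ranges given in the list (hypothetical helper)."""
    if 1 <= tag <= 32767:
        return "core"
    if 32768 <= tag <= 64999:
        return "user-defined"
    return "other"


def decode_dd_block(buf, offset):
    """Return (next_block_offset, [dd, ...]) for the DD block at offset."""
    count, next_offset = DD_HEADER.unpack_from(buf, offset)
    dds = []
    for i in range(count):
        entry_at = offset + DD_HEADER.size + i * DD_ENTRY.size
        tag, ref, data_off, length = DD_ENTRY.unpack_from(buf, entry_at)
        dds.append({
            "tag": tag,
            "name": TAG_NAMES.get(tag, "unknown"),
            "range": tag_range(tag),
            "ref": ref,
            "offset": data_off,
            "length": length,
        })
    return next_offset, dds


# Hand-built block: two DDs and a next-block offset of 0 (last block).
block = (DD_HEADER.pack(2, 0)
         + DD_ENTRY.pack(30, 1, 34, 12)
         + DD_ENTRY.pack(40000, 7, 46, 4))
next_offset, dds = decode_dd_block(block, 0)
for dd in dds:
    print(dd)
```

Because the header is 6 bytes and each DD is 12 bytes, a block holding n DDs occupies 6 + 12n bytes, which is how the sketch locates each entry.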
- Two Direct Download Links for .HDF Files
- Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .HDF Property Dump
Embed this as a <div> in a Ghost blog post (e.g., via an HTML card). It uses the browser File API for drag-and-drop, reads the file as an ArrayBuffer, and parses basic intrinsic properties (signature, version, DD blocks with tags/refs/offsets/lengths). For the full hierarchy, it sketches Vgroup refs. Output goes to a <pre> element. No external libraries; pure JS.
- Python Class for .HDF (HDF4) Handling
This class uses struct for binary parsing (no external HDF libraries needed). read() parses the file and prints its properties to the console. write() creates a simple .HDF file with a signature, a version DD, and one empty Vgroup DD.
import os
import struct

DFTAG_NULL = 1      # Empty DD slot
DFTAG_VERSION = 30  # 0x001E
DFTAG_VG = 1965     # 0x07AD, Vgroup


class HDF4Handler:
    def __init__(self, filename=None):
        self.filename = filename
        self.file_size = 0
        self.buffer = None
        self.dds = []

    def read(self):
        if not self.filename or not os.path.exists(self.filename):
            print("File not found.")
            return
        with open(self.filename, 'rb') as f:
            self.buffer = f.read()
        self.file_size = len(self.buffer)
        self._parse_signature()
        # DDs must be parsed first: the version object is located through them
        self.dds = self._parse_dd_blocks()
        self._parse_version()
        self._print_properties()
        # Sketch Vgroup
        if self.dds:
            print(f"Example Vgroup (ref 1): {self._parse_vgroup(1)}")

    def _parse_signature(self):
        sig = tuple(self.buffer[:4])
        expected = (14, 3, 19, 1)  # 0x0E 0x03 0x13 0x01
        self.signature = 'Valid HDF4' if sig == expected else 'Invalid'
        print(f"Signature: {self.signature} ({[hex(b) for b in sig]})")

    def _parse_version(self):
        # DFTAG_VERSION payload: three big-endian uint32 values
        # (major, minor, release) followed by a descriptive string
        for dd in self.dds:
            start, end = dd['offset'], dd['offset'] + dd['length']
            if (dd['tag'] == DFTAG_VERSION and dd['length'] >= 12
                    and start >= 0 and end <= self.file_size):
                major, minor, release = struct.unpack_from('>3I', self.buffer, start)
                raw = self.buffer[start + 12:end].split(b'\x00')[0]
                ver_str = raw.decode('ascii', 'replace')
                print(f"Version: {major}.{minor}.{release} - {ver_str}")
                return
        print("Version not found")

    def _parse_dd_blocks(self):
        dds = []
        offset = 4        # First DD block follows the signature
        visited = set()   # Guards against cyclic next-block offsets
        while 0 < offset <= self.file_size - 6 and offset not in visited:
            visited.add(offset)
            block_size, next_offset = struct.unpack_from('>HI', self.buffer, offset)
            offset += 6
            for _ in range(block_size):
                if offset + 12 > self.file_size:
                    break
                tag, ref, data_offset, length = struct.unpack_from('>HHii', self.buffer, offset)
                if tag not in (0, DFTAG_NULL):  # Skip empty slots
                    dds.append({'tag': tag, 'ref': ref, 'offset': data_offset, 'length': length})
                offset += 12
            offset = next_offset  # 0 marks the last block and ends the loop
        return dds

    def _print_properties(self):
        print(f"File Size: {self.file_size} bytes")
        print(f"Data Descriptors ({len(self.dds)} total):")
        for i, dd in enumerate(self.dds):
            print(f"  DD{i}: Tag 0x{dd['tag']:04x}, Ref {dd['ref']}, Offset {dd['offset']}, Length {dd['length']}")

    def _parse_vgroup(self, ref):
        vg = next((dd for dd in self.dds if dd['ref'] == ref and dd['tag'] == DFTAG_VG), None)
        if vg:
            return f"Offset {vg['offset']}, Length {vg['length']} (hierarchy refs omitted)"
        return "Vgroup not found"

    def write(self, output_filename):
        # Dummy version payload: major, minor, release + NUL-terminated string
        version_data = struct.pack('>3I', 4, 0, 0) + b'HDF4.0\x00'
        # Placeholder bytes only; this is not a complete Vgroup record
        vgroup_data = struct.pack('>I', 0)
        # Data starts after signature (4) + DD block header (6) + two DDs (2 * 12)
        data_start = 4 + 6 + 2 * 12
        with open(output_filename, 'wb') as f:
            # Signature
            f.write(bytes([14, 3, 19, 1]))
            # DD block header: 2 DDs, no next block
            f.write(struct.pack('>HI', 2, 0))
            # Version DD, then Vgroup DD; offsets are computed, not hard-coded
            f.write(struct.pack('>HHii', DFTAG_VERSION, 1, data_start, len(version_data)))
            f.write(struct.pack('>HHii', DFTAG_VG, 2, data_start + len(version_data), len(vgroup_data)))
            f.write(version_data)
            f.write(vgroup_data)
        print(f"Simple .HDF written to {output_filename}")


# Usage
# handler = HDF4Handler('sample.hdf')
# handler.read()
# handler.write('simple.hdf')
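The class above leaves the Vgroup hierarchy refs undecoded. As a hedged sketch of how that step might look, the function below reads the child list from the start of a Vgroup payload. The layout is an assumption stated here rather than something given in the text above: a big-endian 16-bit element count, then that many 16-bit tags, then that many 16-bit refs. The function name `decode_vgroup_elements` and the sample values are hypothetical, and the remaining Vgroup fields (name, class, version) are ignored.

```python
import struct


def decode_vgroup_elements(payload):
    """Return [(tag, ref), ...] from the start of a Vgroup payload.

    Assumed layout: uint16 count, then `count` uint16 tags,
    then `count` uint16 refs, all big-endian.
    """
    if len(payload) < 2:
        raise ValueError("payload too short for an element count")
    (count,) = struct.unpack_from(">H", payload, 0)
    needed = 2 + 4 * count  # count field + tags + refs
    if len(payload) < needed:
        raise ValueError(f"payload too short: need {needed} bytes, got {len(payload)}")
    tags = struct.unpack_from(f">{count}H", payload, 2)
    refs = struct.unpack_from(f">{count}H", payload, 2 + 2 * count)
    return list(zip(tags, refs))


# Hand-built payload with two children; the tag/ref values are arbitrary.
sample = struct.pack(">H2H2H", 2, 100, 200, 5, 9)
print(decode_vgroup_elements(sample))  # [(100, 5), (200, 9)]
```

Each returned pair can then be looked up in the DD list by tag and ref to find the child object's offset and length.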
- Java Class for .HDF (HDF4) Handling
Uses ByteBuffer for binary parsing. read() prints properties to the console. write() creates a simple file.
import java.io.*;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HDF4Handler {
    private static final int DFTAG_NULL = 1;      // Empty DD slot
    private static final int DFTAG_VERSION = 30;  // 0x001E
    private static final int DFTAG_VG = 1965;     // 0x07AD, Vgroup

    private String filename;
    private byte[] buffer;
    private int fileSize;
    private List<DD> dds = new ArrayList<>();

    public HDF4Handler(String filename) {
        this.filename = filename;
    }

    public void read() throws IOException {
        if (filename == null || !new File(filename).exists()) {
            System.out.println("File not found.");
            return;
        }
        buffer = Files.readAllBytes(Paths.get(filename));
        fileSize = buffer.length;
        parseSignature();
        // DDs must be parsed first: the version object is located through them
        dds = parseDDBlocks();
        parseVersion();
        printProperties();
        // Sketch Vgroup
        if (!dds.isEmpty()) {
            System.out.println("Example Vgroup (ref 1): " + parseVgroup(1));
        }
    }

    private void parseSignature() {
        boolean valid = fileSize >= 4
                && buffer[0] == 14 && buffer[1] == 3 && buffer[2] == 19 && buffer[3] == 1;
        System.out.println("Signature: " + (valid ? "Valid HDF4" : "Invalid")
                + " (expected 0x0E 0x03 0x13 0x01)");
    }

    private void parseVersion() {
        // DFTAG_VERSION payload: three big-endian 32-bit integers
        // (major, minor, release) followed by a descriptive string
        for (DD dd : dds) {
            if (dd.tag == DFTAG_VERSION && dd.length >= 12 && dd.offset >= 0
                    && (long) dd.offset + dd.length <= fileSize) {
                ByteBuffer bb = ByteBuffer.wrap(buffer, dd.offset, dd.length).order(ByteOrder.BIG_ENDIAN);
                int major = bb.getInt();
                int minor = bb.getInt();
                int release = bb.getInt();
                byte[] verBytes = new byte[dd.length - 12];
                bb.get(verBytes);
                String verStr = new String(verBytes, StandardCharsets.US_ASCII);
                int nul = verStr.indexOf('\0');
                if (nul >= 0) verStr = verStr.substring(0, nul);
                System.out.println("Version: " + major + "." + minor + "." + release + " - " + verStr);
                return;
            }
        }
        System.out.println("Version not found");
    }

    private List<DD> parseDDBlocks() {
        List<DD> ddsLocal = new ArrayList<>();
        int offset = 4; // First DD block follows the signature
        Set<Integer> visited = new HashSet<>(); // Guards against cyclic next-block offsets
        while (offset > 0 && offset <= fileSize - 6 && visited.add(offset)) {
            ByteBuffer bb = ByteBuffer.wrap(buffer, offset, 6).order(ByteOrder.BIG_ENDIAN);
            int blockSize = bb.getShort() & 0xFFFF;
            int nextOffset = bb.getInt();
            offset += 6;
            for (int i = 0; i < blockSize; i++) {
                if (offset > fileSize - 12) break;
                bb = ByteBuffer.wrap(buffer, offset, 12).order(ByteOrder.BIG_ENDIAN);
                int tag = bb.getShort() & 0xFFFF;
                int ref = bb.getShort() & 0xFFFF;
                int dataOffset = bb.getInt();
                int length = bb.getInt();
                if (tag != 0 && tag != DFTAG_NULL) { // Skip empty slots
                    ddsLocal.add(new DD(tag, ref, dataOffset, length));
                }
                offset += 12;
            }
            offset = nextOffset; // 0 marks the last block and ends the loop
        }
        return ddsLocal;
    }

    private void printProperties() {
        System.out.println("File Size: " + fileSize + " bytes");
        System.out.println("Data Descriptors (" + dds.size() + " total):");
        for (int i = 0; i < dds.size(); i++) {
            DD dd = dds.get(i);
            System.out.println("  DD" + i + ": Tag 0x" + String.format("%04x", dd.tag) + ", Ref " + dd.ref
                    + ", Offset " + dd.offset + ", Length " + dd.length);
        }
    }

    private String parseVgroup(int ref) {
        for (DD dd : dds) {
            if (dd.ref == ref && dd.tag == DFTAG_VG) {
                return "Offset " + dd.offset + ", Length " + dd.length + " (hierarchy refs omitted)";
            }
        }
        return "Vgroup not found";
    }

    // Packs one 12-byte DD: tag, ref, offset, length (all big-endian)
    private static byte[] packDD(int tag, int ref, int offset, int length) {
        return ByteBuffer.allocate(12).order(ByteOrder.BIG_ENDIAN)
                .putShort((short) tag).putShort((short) ref).putInt(offset).putInt(length).array();
    }

    public void write(String outputFilename) throws IOException {
        // Dummy version payload: major, minor, release + NUL-terminated string
        byte[] verStr = "HDF4.0\0".getBytes(StandardCharsets.US_ASCII);
        byte[] versionData = ByteBuffer.allocate(12 + verStr.length).order(ByteOrder.BIG_ENDIAN)
                .putInt(4).putInt(0).putInt(0).put(verStr).array();
        // Placeholder bytes only; this is not a complete Vgroup record
        byte[] vgroupData = new byte[4];
        // Data starts after signature (4) + DD block header (6) + two DDs (2 * 12)
        int dataStart = 4 + 6 + 2 * 12;
        try (FileOutputStream fos = new FileOutputStream(outputFilename)) {
            // Signature
            fos.write(new byte[]{14, 3, 19, 1});
            // DD block header: 2 DDs, no next block
            fos.write(ByteBuffer.allocate(6).order(ByteOrder.BIG_ENDIAN).putShort((short) 2).putInt(0).array());
            // Version DD, then Vgroup DD; offsets are computed, not hard-coded
            fos.write(packDD(DFTAG_VERSION, 1, dataStart, versionData.length));
            fos.write(packDD(DFTAG_VG, 2, dataStart + versionData.length, vgroupData.length));
            fos.write(versionData);
            fos.write(vgroupData);
        }
        System.out.println("Simple .HDF written to " + outputFilename);
    }

    static class DD {
        int tag, ref, offset, length;

        DD(int tag, int ref, int offset, int length) {
            this.tag = tag;
            this.ref = ref;
            this.offset = offset;
            this.length = length;
        }
    }

    // Usage
    // HDF4Handler handler = new HDF4Handler("sample.hdf");
    // handler.read();
    // handler.write("simple.hdf");
}
- JavaScript Class for .HDF (HDF4) Handling
Node.js class using fs for file I/O. read() parses the file and logs its properties with console.log. write() creates a simple file. Run with node script.js.
const fs = require('fs');

const DFTAG_NULL = 1;      // Empty DD slot
const DFTAG_VERSION = 30;  // 0x001E
const DFTAG_VG = 1965;     // 0x07AD, Vgroup

class HDF4Handler {
  constructor(filename = null) {
    this.filename = filename;
    this.buffer = null;
    this.fileSize = 0;
    this.dds = [];
  }

  read() {
    if (!this.filename || !fs.existsSync(this.filename)) {
      console.log('File not found.');
      return;
    }
    this.buffer = fs.readFileSync(this.filename);
    this.fileSize = this.buffer.length;
    this.parseSignature();
    // DDs must be parsed first: the version object is located through them
    this.dds = this.parseDDBlocks();
    this.parseVersion();
    this.printProperties();
    // Sketch Vgroup
    if (this.dds.length > 0) {
      console.log(`Example Vgroup (ref 1): ${this.parseVgroup(1)}`);
    }
  }

  parseSignature() {
    const sig = this.buffer.subarray(0, 4);
    const expected = Buffer.from([0x0E, 0x03, 0x13, 0x01]);
    this.signature = Buffer.compare(sig, expected) === 0 ? 'Valid HDF4' : 'Invalid';
    console.log(`Signature: ${this.signature} (expected 0x0E 0x03 0x13 0x01)`);
  }

  parseVersion() {
    // DFTAG_VERSION payload: three big-endian uint32 values
    // (major, minor, release) followed by a descriptive string
    for (const dd of this.dds) {
      const end = dd.offset + dd.length;
      if (dd.tag === DFTAG_VERSION && dd.length >= 12 && dd.offset >= 0 && end <= this.fileSize) {
        const major = this.buffer.readUInt32BE(dd.offset);
        const minor = this.buffer.readUInt32BE(dd.offset + 4);
        const release = this.buffer.readUInt32BE(dd.offset + 8);
        const verStr = this.buffer.toString('ascii', dd.offset + 12, end).split('\0')[0];
        console.log(`Version: ${major}.${minor}.${release} - ${verStr}`);
        return;
      }
    }
    console.log('Version not found');
  }

  parseDDBlocks() {
    const dds = [];
    let offset = 4;              // First DD block follows the signature
    const visited = new Set();   // Guards against cyclic next-block offsets
    while (offset > 0 && offset + 6 <= this.fileSize && !visited.has(offset)) {
      visited.add(offset);
      const blockSize = this.buffer.readUInt16BE(offset);
      const nextOffset = this.buffer.readUInt32BE(offset + 2);
      offset += 6;
      for (let i = 0; i < blockSize; i++) {
        if (offset + 12 > this.fileSize) break;
        const tag = this.buffer.readUInt16BE(offset);
        const ref = this.buffer.readUInt16BE(offset + 2);
        const dataOffset = this.buffer.readInt32BE(offset + 4);
        const length = this.buffer.readInt32BE(offset + 8);
        // Skip empty slots
        if (tag !== 0 && tag !== DFTAG_NULL) dds.push({ tag, ref, offset: dataOffset, length });
        offset += 12;
      }
      offset = nextOffset; // 0 marks the last block and ends the loop
    }
    return dds;
  }

  printProperties() {
    console.log(`File Size: ${this.fileSize} bytes`);
    console.log(`Data Descriptors (${this.dds.length} total):`);
    this.dds.forEach((dd, i) => {
      console.log(`  DD${i}: Tag 0x${dd.tag.toString(16).padStart(4, '0')}, Ref ${dd.ref}, Offset ${dd.offset}, Length ${dd.length}`);
    });
  }

  parseVgroup(ref) {
    const vg = this.dds.find(dd => dd.ref === ref && dd.tag === DFTAG_VG);
    if (vg) {
      return `Offset ${vg.offset}, Length ${vg.length} (hierarchy refs omitted)`;
    }
    return 'Vgroup not found';
  }

  write(outputFilename) {
    // Packs one 12-byte DD: tag, ref, offset, length (all big-endian)
    const packDD = (tag, ref, offset, length) => {
      const dd = Buffer.alloc(12);
      dd.writeUInt16BE(tag, 0);
      dd.writeUInt16BE(ref, 2);
      dd.writeInt32BE(offset, 4);
      dd.writeInt32BE(length, 8);
      return dd;
    };
    // Signature
    const signature = Buffer.from([0x0E, 0x03, 0x13, 0x01]);
    // DD block header: 2 DDs, no next block
    const blockHeader = Buffer.alloc(6);
    blockHeader.writeUInt16BE(2, 0);
    blockHeader.writeUInt32BE(0, 2);
    // Dummy version payload: major, minor, release + NUL-terminated string
    // (Buffer.alloc zero-fills, so minor, release, and the terminator are already 0)
    const versionData = Buffer.alloc(12 + 7);
    versionData.writeUInt32BE(4, 0);
    versionData.write('HDF4.0', 12, 'ascii');
    // Placeholder bytes only; this is not a complete Vgroup record
    const vgroupData = Buffer.alloc(4);
    // Data starts after signature (4) + DD block header (6) + two DDs (2 * 12)
    const dataStart = 4 + 6 + 2 * 12;
    fs.writeFileSync(outputFilename, Buffer.concat([
      signature,
      blockHeader,
      packDD(DFTAG_VERSION, 1, dataStart, versionData.length),
      packDD(DFTAG_VG, 2, dataStart + versionData.length, vgroupData.length),
      versionData,
      vgroupData,
    ]));
    console.log(`Simple .HDF written to ${outputFilename}`);
  }
}

// Usage
// const handler = new HDF4Handler('sample.hdf');
// handler.read();
// handler.write('simple.hdf');
- C Class (Struct) for .HDF (HDF4) Handling
Uses stdio and stdint for binary parsing. Compile with gcc hdf4_handler.c -o hdf4_handler. hdf4_read() prints to stdout. hdf4_write() creates a simple file. (C uses struct, not class; functions wrap a struct.)
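#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

#define DFTAG_NULL    1     /* Empty DD slot */
#define DFTAG_VERSION 30    /* 0x001E */
#define DFTAG_VG      1965  /* 0x07AD, Vgroup */
#define MAX_DDS       1024  /* Arbitrary cap for this sketch */

/* DD is declared before HDF4Handler so the handler can hold a DD pointer */
typedef struct {
    uint16_t tag;
    uint16_t ref;
    int32_t offset;
    int32_t length;
} DD;

typedef struct {
    const char *filename;
    uint8_t *buffer;
    size_t file_size;
    DD *dds;
    int num_dds;
} HDF4Handler;

/* HDF4 stores integers big-endian. Decoding byte by byte keeps the code
   independent of host byte order and avoids unaligned pointer casts. */
static uint16_t be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void put_be16(FILE *f, uint16_t v) {
    uint8_t b[2] = { (uint8_t)(v >> 8), (uint8_t)v };
    fwrite(b, 1, 2, f);
}

static void put_be32(FILE *f, uint32_t v) {
    uint8_t b[4] = { (uint8_t)(v >> 24), (uint8_t)(v >> 16), (uint8_t)(v >> 8), (uint8_t)v };
    fwrite(b, 1, 4, f);
}

void parse_signature(HDF4Handler *h) {
    static const uint8_t expected[4] = {0x0E, 0x03, 0x13, 0x01};
    if (h->file_size < 4) {
        printf("Signature: Invalid (file shorter than 4 bytes)\n");
        return;
    }
    int valid = memcmp(h->buffer, expected, 4) == 0;
    printf("Signature: %s (0x%02X 0x%02X 0x%02X 0x%02X)\n", valid ? "Valid HDF4" : "Invalid",
           h->buffer[0], h->buffer[1], h->buffer[2], h->buffer[3]);
}

void parse_version(HDF4Handler *h) {
    /* DFTAG_VERSION payload: three big-endian 32-bit integers
       (major, minor, release) followed by a descriptive string */
    for (int i = 0; i < h->num_dds; i++) {
        const DD *dd = &h->dds[i];
        if (dd->tag != DFTAG_VERSION || dd->length < 12 || dd->offset < 0) continue;
        if ((size_t)dd->offset + (size_t)dd->length > h->file_size) continue;
        const uint8_t *data = h->buffer + dd->offset;
        char ver_str[81];
        size_t n = (size_t)dd->length - 12;
        if (n > sizeof ver_str - 1) n = sizeof ver_str - 1;  /* Truncate long strings */
        memcpy(ver_str, data + 12, n);
        ver_str[n] = '\0';
        printf("Version: %u.%u.%u - %s\n", (unsigned)be32(data), (unsigned)be32(data + 4),
               (unsigned)be32(data + 8), ver_str);
        return;
    }
    printf("Version not found\n");
}

DD *parse_dd_blocks(HDF4Handler *h, int *num) {
    DD *dds = malloc(MAX_DDS * sizeof(DD));
    int count = 0;
    int blocks = 0;     /* Bounds the walk if next-block offsets form a cycle */
    size_t offset = 4;  /* First DD block follows the signature */
    *num = 0;
    if (!dds) return NULL;
    while (offset != 0 && h->file_size >= 6 && offset <= h->file_size - 6 && blocks++ < MAX_DDS) {
        uint16_t block_size = be16(h->buffer + offset);
        uint32_t next_offset = be32(h->buffer + offset + 2);
        offset += 6;
        for (int i = 0; i < block_size && count < MAX_DDS; i++) {
            if (offset + 12 > h->file_size) break;
            uint16_t tag = be16(h->buffer + offset);
            if (tag != 0 && tag != DFTAG_NULL) {  /* Skip empty slots */
                dds[count].tag = tag;
                dds[count].ref = be16(h->buffer + offset + 2);
                dds[count].offset = (int32_t)be32(h->buffer + offset + 4);
                dds[count].length = (int32_t)be32(h->buffer + offset + 8);
                count++;
            }
            offset += 12;
        }
        offset = next_offset;  /* 0 marks the last block and ends the loop */
    }
    *num = count;
    return dds;
}

void print_properties(HDF4Handler *h) {
    printf("File Size: %zu bytes\n", h->file_size);
    printf("Data Descriptors (%d total):\n", h->num_dds);
    for (int i = 0; i < h->num_dds; i++) {
        printf("  DD%d: Tag 0x%04X, Ref %d, Offset %d, Length %d\n",
               i, h->dds[i].tag, h->dds[i].ref, (int)h->dds[i].offset, (int)h->dds[i].length);
    }
}

/* Returns a heap-allocated string (or NULL); the caller frees it */
char *parse_vgroup(HDF4Handler *h, uint16_t ref) {
    for (int i = 0; i < h->num_dds; i++) {
        if (h->dds[i].ref == ref && h->dds[i].tag == DFTAG_VG) {
            char *str = malloc(100);
            if (str) {
                snprintf(str, 100, "Offset %d, Length %d (hierarchy refs omitted)",
                         (int)h->dds[i].offset, (int)h->dds[i].length);
            }
            return str;
        }
    }
    return strdup("Vgroup not found");
}

void hdf4_read(HDF4Handler *h) {
    FILE *f = fopen(h->filename, "rb");
    if (!f) {
        printf("File not found.\n");
        return;
    }
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fseek(f, 0, SEEK_SET);
    if (size <= 0) {
        printf("Empty or unreadable file.\n");
        fclose(f);
        return;
    }
    h->buffer = malloc((size_t)size);
    if (!h->buffer || fread(h->buffer, 1, (size_t)size, f) != (size_t)size) {
        printf("Read failed.\n");
        fclose(f);
        return;
    }
    fclose(f);
    h->file_size = (size_t)size;
    parse_signature(h);
    /* DDs must be parsed first: the version object is located through them */
    h->dds = parse_dd_blocks(h, &h->num_dds);
    parse_version(h);
    print_properties(h);
    if (h->num_dds > 0) {
        char *vg = parse_vgroup(h, 1);
        printf("Example Vgroup (ref 1): %s\n", vg ? vg : "(out of memory)");
        free(vg);
    }
}

void hdf4_write(HDF4Handler *h, const char *output_filename) {
    (void)h;  /* Unused; kept so the signature matches the other handlers */
    static const uint8_t signature[4] = {0x0E, 0x03, 0x13, 0x01};
    static const char ver_str[] = "HDF4.0";  /* sizeof includes the NUL terminator */
    /* Data starts after signature (4) + DD block header (6) + two DDs (2 * 12) */
    const uint32_t data_start = 4 + 6 + 2 * 12;
    const uint32_t ver_len = 12 + (uint32_t)sizeof ver_str;
    const uint32_t vg_len = 4;
    FILE *f = fopen(output_filename, "wb");
    if (!f) return;
    /* Signature */
    fwrite(signature, 1, 4, f);
    /* DD block header: 2 DDs, no next block */
    put_be16(f, 2);
    put_be32(f, 0);
    /* Version DD */
    put_be16(f, DFTAG_VERSION); put_be16(f, 1);
    put_be32(f, data_start); put_be32(f, ver_len);
    /* Vgroup DD */
    put_be16(f, DFTAG_VG); put_be16(f, 2);
    put_be32(f, data_start + ver_len); put_be32(f, vg_len);
    /* Dummy version payload: major, minor, release + string */
    put_be32(f, 4); put_be32(f, 0); put_be32(f, 0);
    fwrite(ver_str, 1, sizeof ver_str, f);
    /* Placeholder bytes only; this is not a complete Vgroup record */
    put_be32(f, 0);
    fclose(f);
    printf("Simple .HDF written to %s\n", output_filename);
}

int main(void) {
    HDF4Handler h = { .filename = "sample.hdf" };
    hdf4_read(&h);
    /* Free resources (both pointers stay NULL if the read failed) */
    free(h.buffer);
    free(h.dds);
    hdf4_write(&h, "simple.hdf");
    return 0;
}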