Task 280: .HACK File Format

File Format Specifications for .HACK

The .HACK file format (extension .hack) is the machine-code representation for the Hack computer architecture defined in the Nand2Tetris project ("The Elements of Computing Systems" by Nisan and Schocken). It is a plain-text format in which each instruction is written as a string of 16 '0' and '1' characters, one instruction per line. The file is loaded into the Hack computer's ROM (instruction memory) starting at address 0. The Hack platform is a 16-bit machine with separate instruction and data memories (a Harvard-style design rather than Von Neumann): a 32K-word ROM and a data memory made up of 16K words of RAM plus a memory-mapped screen and keyboard. There is no packed binary encoding, compression, or header; the file is purely a sequence of text lines.

1. List of All Properties Intrinsic to the .HACK File Format

The following are the core structural and behavioral properties inherent to the format, derived from the Hack machine language specification. These define how files are parsed, loaded, and executed without external dependencies:

  • File Extension: .hack (lowercase by convention; whether case matters depends on the file system).
  • Content Type: Plain text file.
  • Character Encoding: ASCII (each character is '0' or '1'; no other characters allowed in instruction lines).
  • Line Structure: Each non-empty line contains exactly 16 characters representing one 16-bit instruction (no padding or delimiters within the line).
  • Line Separator: Standard newline (\n); blank lines are ignored during parsing/loading.
  • No Header/Footer/Metadata: Instructions begin immediately from the first valid line; no magic bytes, version info, or checksums.
  • Maximum File Size: Up to 32,768 lines (corresponding to 32K x 16-bit ROM capacity; addresses 0–32,767).
  • Instruction Encoding: Each line is a binary string convertible to a 16-bit integer (0 to 65,535 decimal).
  • Instruction Types (two mutually exclusive formats):
      • A-Instruction (address/constant load): Starts with 0 (bit 15 = 0), followed by a 15-bit unsigned value (0–32,767 decimal). Loads the value into the A register, which also selects the RAM address accessed as M.
      • C-Instruction (compute/store/jump): Starts with 111 (bits 15–13 = 111), followed by:
          • 1-bit a (bit 12): 0 selects the A register as the ALU's second operand, 1 selects memory (M).
          • 6-bit comp (bits 11–6): ALU computation (e.g., 'D+A', 'M-1').
          • 3-bit dest (bits 5–3): Destination for the ALU result (e.g., 'AMD', 'null').
          • 3-bit jump (bits 2–0): Jump condition (e.g., 'JMP', 'JEQ').
  • Memory Mapping (intrinsic to execution but affects valid values):
      • ROM (instruction memory): Addresses 0–32,767 (read-only, loaded from file).
      • RAM (data memory): Addresses 0–24,576 are valid, with these regions:
          • 0–16,383: General-purpose data/variables.
          • 16,384–24,575: Screen buffer (512x256 black/white pixels; each 16-bit word holds 16 horizontally adjacent pixels, so one row occupies 32 words).
          • 24,576: Keyboard register (character code of the currently pressed key, or 0 if none).
  • Error Handling (Implicit): The format defines no validation of its own. How invalid lines (wrong length, non-binary characters, or an unrecognized opcode prefix) are treated depends on the loader, which may skip them or reject the file.
  • Execution Model: Sequential fetch-execute cycle; PC starts at 0, increments by 1 or jumps based on conditions.
  • Portability: Platform-independent text; no endianness or alignment issues.
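
The bit layout listed above can be illustrated with a short sketch that extracts the raw fields from one line using shifts and masks. The function name `split_fields` and the dictionary keys it returns are illustrative assumptions, not part of the Hack specification; the sketch also follows this document's convention of rejecting words that start with 1 but not with 111.

```python
def split_fields(line: str) -> dict:
    """Split one 16-character .hack line into its raw bit fields.

    `split_fields` is a hypothetical helper used only for illustration.
    """
    if len(line) != 16 or set(line) - {"0", "1"}:
        raise ValueError(f"not a 16-bit binary string: {line!r}")
    word = int(line, 2)
    if word >> 15 == 0:
        # A-instruction: bit 15 is 0, bits 14..0 hold the constant.
        return {"type": "A", "value": word & 0x7FFF}
    if word >> 13 != 0b111:
        # Assumption of this sketch: bits 15..13 must all be 1 for a C-instruction.
        raise ValueError(f"unrecognized opcode prefix: {line!r}")
    return {
        "type": "C",
        "a": (word >> 12) & 0b1,         # bit 12
        "comp": (word >> 6) & 0b111111,  # bits 11..6
        "dest": (word >> 3) & 0b111,     # bits 5..3
        "jump": word & 0b111,            # bits 2..0
    }


print(split_fields("0000000000010101"))  # A-instruction carrying the value 21
print(split_fields("1110110000010000"))  # C-instruction: comp=110000 (A), dest=010 (D)
```

The fields are returned as integers rather than mnemonics so that the layout itself stays visible; mapping them to mnemonics is a separate table lookup.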

These properties ensure simplicity for educational purposes, allowing direct text-to-binary conversion for the Hack CPU emulator.
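
As a rough sketch of that text-to-binary conversion, the function below turns the lines of a .hack file into a list of 16-bit integers, one per ROM address. The name `load_rom` is hypothetical, and raising an error on the first malformed line is a choice made for this sketch; as noted above, real loaders may instead skip such lines.

```python
ROM_WORDS = 32 * 1024  # 32K instruction words, addresses 0..32767


def load_rom(lines):
    """Convert an iterable of .hack text lines into a list of 16-bit ints.

    Blank lines are skipped. Any other malformed line raises ValueError
    (a strictness choice of this sketch, not a rule of the format).
    """
    rom = []
    for line_num, raw in enumerate(lines, 1):
        line = raw.strip()
        if not line:
            continue
        if len(line) != 16 or set(line) - {"0", "1"}:
            raise ValueError(f"line {line_num}: expected 16 binary digits, got {line!r}")
        if len(rom) >= ROM_WORDS:
            raise ValueError(f"line {line_num}: program exceeds {ROM_WORDS} instructions")
        rom.append(int(line, 2))  # list index == ROM address
    return rom


program = ["0000000000000010", "1110110000010000", "", "0000000000000011"]
print(load_rom(program))  # three instructions at addresses 0, 1 and 2
```

Because blank lines are dropped before an address is assigned, the ROM address of an instruction can differ from its line number in the file.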

3. Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .HACK Parsing

Embed this as raw HTML in a Ghost blog post (use the HTML card in the editor). It creates a drag-and-drop zone that reads a dropped .hack file, parses it, decodes instructions using the spec, and dumps all properties (instruction count, binary/decimal values, decoded fields) to a scrollable <pre> block below.

Drag and drop a .hack file here to parse its properties.

4. Python Class for .HACK Handling

This class opens a .hack file, reads and decodes all instructions (validating the format), prints all properties (count, per-instruction details), and supports writing the validated instructions to a new .hack file. The write step emits each instruction's original binary string unchanged; it does not re-encode from the decoded fields.

import os

class HackFile:
    def __init__(self, filename):
        self.filename = filename
        self.instructions = []

    # Comp, dest, jump tables
    comp_table = {
        '0101010': '0', '0111111': '1', '0111010': '-1', '0001100': 'D', '0110000': 'A',
        '0001101': '!D', '0110001': '!A', '0001111': '-D', '0110011': '-A', '0011111': 'D+1',
        '0110111': 'A+1', '0001110': 'D-1', '0110010': 'A-1', '0000010': 'D+A', '0010011': 'D-A',
        '0000111': 'A-D', '0000000': 'D&A', '0010101': 'D|A',
        '1110000': 'M', '1110001': '!M', '1110011': '-M', '1110111': 'M+1', '1110010': 'M-1',
        '1000010': 'D+M', '1010011': 'D-M', '1000111': 'M-D', '1000000': 'D&M', '1010101': 'D|M'
    }
    dest_table = {'000': 'null', '001': 'M', '010': 'D', '011': 'MD', '100': 'A', '101': 'AM', '110': 'AD', '111': 'AMD'}
    jump_table = {'000': 'null', '001': 'JGT', '010': 'JEQ', '011': 'JGE', '100': 'JLT', '101': 'JNE', '110': 'JLE', '111': 'JMP'}

    def read(self):
        if not os.path.exists(self.filename):
            print(f"Error: File {self.filename} not found.")
            return
        self.instructions = []
        with open(self.filename, 'r') as f:
            for line_num, line in enumerate(f, 1):
                line = line.strip()
                if not line:
                    continue
                if len(line) != 16 or not all(c in '01' for c in line):
                    print(f"Warning: Invalid line {line_num}: {line}")
                    continue
                bin_str = line
                val = int(bin_str, 2)
                instr = {'binary': bin_str, 'decimal': val, 'line': line_num}
                if bin_str[0] == '0':
                    instr['type'] = 'A'
                    instr['value'] = int(bin_str[1:], 2)
                elif bin_str[:3] == '111':
                    instr['type'] = 'C'
                    a = bin_str[3]
                    comp_bin = bin_str[4:10]
                    dest_bin = bin_str[10:13]
                    jump_bin = bin_str[13:]
                    comp_key = a + comp_bin
                    instr['a'] = a
                    instr['comp'] = self.comp_table.get(comp_key, 'unknown')
                    instr['dest'] = self.dest_table.get(dest_bin, 'unknown')
                    instr['jump'] = self.jump_table.get(jump_bin, 'unknown')
                else:
                    print(f"Warning: Invalid opcode line {line_num}: {line}")
                    continue
                self.instructions.append(instr)

    def print_properties(self):
        if not self.instructions:
            print("No valid instructions found.")
            return
        print(f"File: {self.filename}")
        print(f"Number of instructions: {len(self.instructions)}")
        print()
        for i, instr in enumerate(self.instructions):
            print(f"Instruction {i+1} (line {instr['line']}):")
            print(f"  Binary: {instr['binary']}")
            print(f"  Decimal: {instr['decimal']}")
            print(f"  Type: {instr['type']}")
            if instr['type'] == 'A':
                print(f"  Value: {instr['value']}")
            else:
                print(f"  a: {instr['a']}, comp: {instr['comp']}, dest: {instr['dest']}, jump: {instr['jump']}")
            print()

    def write(self, output_filename):
        with open(output_filename, 'w') as f:
            for instr in self.instructions:
                f.write(instr['binary'] + '\n')
        print(f"Wrote {len(self.instructions)} instructions to {output_filename}")

# Example usage
if __name__ == "__main__":
    hf = HackFile("example.hack")  # Replace with actual file
    hf.read()
    hf.print_properties()
    hf.write("output.hack")
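
The write method above copies the original binary strings. If instructions were instead to be serialized from decoded mnemonics, the lookup tables would need to be applied in reverse. The following is a minimal sketch of that idea; the names `encode_a`, `encode_c`, `COMP`, `DEST`, and `JUMP` are assumptions introduced for illustration and are not part of the class above.

```python
# Mnemonic -> bit-string tables (the inverse of the decoding tables).
# COMP values are 7 bits: the a bit followed by the 6 comp bits.
COMP = {
    '0': '0101010', '1': '0111111', '-1': '0111010', 'D': '0001100', 'A': '0110000',
    '!D': '0001101', '!A': '0110001', '-D': '0001111', '-A': '0110011', 'D+1': '0011111',
    'A+1': '0110111', 'D-1': '0001110', 'A-1': '0110010', 'D+A': '0000010', 'D-A': '0010011',
    'A-D': '0000111', 'D&A': '0000000', 'D|A': '0010101',
    'M': '1110000', '!M': '1110001', '-M': '1110011', 'M+1': '1110111', 'M-1': '1110010',
    'D+M': '1000010', 'D-M': '1010011', 'M-D': '1000111', 'D&M': '1000000', 'D|M': '1010101',
}
DEST = {'null': '000', 'M': '001', 'D': '010', 'MD': '011',
        'A': '100', 'AM': '101', 'AD': '110', 'AMD': '111'}
JUMP = {'null': '000', 'JGT': '001', 'JEQ': '010', 'JGE': '011',
        'JLT': '100', 'JNE': '101', 'JLE': '110', 'JMP': '111'}


def encode_a(value: int) -> str:
    """Encode an A-instruction: a leading 0 followed by a 15-bit value."""
    if not 0 <= value <= 0x7FFF:
        raise ValueError(f"A-instruction value out of range: {value}")
    return format(value, '016b')


def encode_c(comp: str, dest: str = 'null', jump: str = 'null') -> str:
    """Encode a C-instruction as 111 + a/comp (7 bits) + dest (3) + jump (3)."""
    return '111' + COMP[comp] + DEST[dest] + JUMP[jump]


print(encode_a(21))                # 0000000000010101
print(encode_c('A', dest='D'))     # 1110110000010000
print(encode_c('D', jump='JGT'))   # 1110001100000001
```

An unknown mnemonic raises KeyError here, which keeps the sketch short; a fuller version might report the offending field by name.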

5. Java Class for .HACK Handling

This Java class reads a .hack file, decodes instructions, prints properties to the console, and writes the instructions to a new file. It requires Java 9 or later, because it uses Map.of and Map.ofEntries. Compile and run with javac HackFile.java && java HackFile example.hack.

import java.io.*;
import java.nio.file.*;
import java.util.*;

public class HackFile {
    private String filename;
    private List<Map<String, Object>> instructions = new ArrayList<>();

    // Comp, dest, jump tables
    private static final Map<String, String> COMP_TABLE = Map.ofEntries(
        Map.entry("0101010", "0"), Map.entry("0111111", "1"), Map.entry("0111010", "-1"), Map.entry("0001100", "D"), Map.entry("0110000", "A"),
        Map.entry("0001101", "!D"), Map.entry("0110001", "!A"), Map.entry("0001111", "-D"), Map.entry("0110011", "-A"), Map.entry("0011111", "D+1"),
        Map.entry("0110111", "A+1"), Map.entry("0001110", "D-1"), Map.entry("0110010", "A-1"), Map.entry("0000010", "D+A"), Map.entry("0010011", "D-A"),
        Map.entry("0000111", "A-D"), Map.entry("0000000", "D&A"), Map.entry("0010101", "D|A"),
        Map.entry("1110000", "M"), Map.entry("1110001", "!M"), Map.entry("1110011", "-M"), Map.entry("1110111", "M+1"), Map.entry("1110010", "M-1"),
        Map.entry("1000010", "D+M"), Map.entry("1010011", "D-M"), Map.entry("1000111", "M-D"), Map.entry("1000000", "D&M"), Map.entry("1010101", "D|M")
    );
    private static final Map<String, String> DEST_TABLE = Map.of(
        "000", "null", "001", "M", "010", "D", "011", "MD", "100", "A", "101", "AM", "110", "AD", "111", "AMD"
    );
    private static final Map<String, String> JUMP_TABLE = Map.of(
        "000", "null", "001", "JGT", "010", "JEQ", "011", "JGE", "100", "JLT", "101", "JNE", "110", "JLE", "111", "JMP"
    );

    public HackFile(String filename) {
        this.filename = filename;
    }

    public void read() {
        instructions.clear();
        try {
            List<String> lines = Files.readAllLines(Paths.get(filename));
            for (int lineNum = 1; lineNum <= lines.size(); lineNum++) {
                String line = lines.get(lineNum - 1).trim();
                if (line.isEmpty()) continue;
                if (line.length() != 16 || !line.matches("[01]{16}")) {
                    System.err.println("Warning: Invalid line " + lineNum + ": " + line);
                    continue;
                }
                String binStr = line;
                int val = Integer.parseInt(binStr, 2);
                Map<String, Object> instr = new HashMap<>();
                instr.put("binary", binStr);
                instr.put("decimal", val);
                instr.put("line", lineNum);
                if (binStr.charAt(0) == '0') {
                    instr.put("type", "A");
                    instr.put("value", Integer.parseInt(binStr.substring(1), 2));
                } else if (binStr.startsWith("111")) {
                    instr.put("type", "C");
                    String a = binStr.substring(3, 4);
                    String compBin = binStr.substring(4, 10);
                    String destBin = binStr.substring(10, 13);
                    String jumpBin = binStr.substring(13);
                    String compKey = a + compBin;
                    instr.put("a", a);
                    instr.put("comp", COMP_TABLE.getOrDefault(compKey, "unknown"));
                    instr.put("dest", DEST_TABLE.getOrDefault(destBin, "unknown"));
                    instr.put("jump", JUMP_TABLE.getOrDefault(jumpBin, "unknown"));
                } else {
                    System.err.println("Warning: Invalid opcode line " + lineNum + ": " + line);
                    continue;
                }
                instructions.add(instr);
            }
        } catch (IOException e) {
            System.err.println("Error reading file: " + e.getMessage());
        }
    }

    public void printProperties() {
        if (instructions.isEmpty()) {
            System.out.println("No valid instructions found.");
            return;
        }
        System.out.println("File: " + filename);
        System.out.println("Number of instructions: " + instructions.size());
        System.out.println();
        for (int i = 0; i < instructions.size(); i++) {
            Map<String, Object> instr = instructions.get(i);
            System.out.println("Instruction " + (i + 1) + " (line " + instr.get("line") + "):");
            System.out.println("  Binary: " + instr.get("binary"));
            System.out.println("  Decimal: " + instr.get("decimal"));
            System.out.println("  Type: " + instr.get("type"));
            if ("A".equals(instr.get("type"))) {
                System.out.println("  Value: " + instr.get("value"));
            } else {
                System.out.println("  a: " + instr.get("a") + ", comp: " + instr.get("comp") + ", dest: " + instr.get("dest") + ", jump: " + instr.get("jump"));
            }
            System.out.println();
        }
    }

    public void write(String outputFilename) {
        // try-with-resources closes the writer even if an exception is thrown
        try (PrintWriter writer = new PrintWriter(new FileWriter(outputFilename))) {
            for (Map<String, Object> instr : instructions) {
                writer.println(instr.get("binary"));
            }
            System.out.println("Wrote " + instructions.size() + " instructions to " + outputFilename);
        } catch (IOException e) {
            System.err.println("Error writing file: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java HackFile <filename.hack>");
            return;
        }
        HackFile hf = new HackFile(args[0]);
        hf.read();
        hf.printProperties();
        hf.write("output.hack");
    }
}

6. JavaScript Class for .HACK Handling (Node.js)

This Node.js class reads a .hack file (using fs), decodes, prints properties to console, and writes back. Run with node hackfile.js example.hack.

const fs = require('fs');

class HackFile {
  constructor(filename) {
    this.filename = filename;
    this.instructions = [];
  }

  // Comp, dest, jump tables
  compTable = {
    '0101010': '0', '0111111': '1', '0111010': '-1', '0001100': 'D', '0110000': 'A',
    '0001101': '!D', '0110001': '!A', '0001111': '-D', '0110011': '-A', '0011111': 'D+1',
    '0110111': 'A+1', '0001110': 'D-1', '0110010': 'A-1', '0000010': 'D+A', '0010011': 'D-A',
    '0000111': 'A-D', '0000000': 'D&A', '0010101': 'D|A',
    '1110000': 'M', '1110001': '!M', '1110011': '-M', '1110111': 'M+1', '1110010': 'M-1',
    '1000010': 'D+M', '1010011': 'D-M', '1000111': 'M-D', '1000000': 'D&M', '1010101': 'D|M'
  };
  destTable = { '000': 'null', '001': 'M', '010': 'D', '011': 'MD', '100': 'A', '101': 'AM', '110': 'AD', '111': 'AMD' };
  jumpTable = { '000': 'null', '001': 'JGT', '010': 'JEQ', '011': 'JGE', '100': 'JLT', '101': 'JNE', '110': 'JLE', '111': 'JMP' };

  read() {
    if (!fs.existsSync(this.filename)) {
      console.error(`Error: File ${this.filename} not found.`);
      return;
    }
    const text = fs.readFileSync(this.filename, 'utf8');
    const lines = text.split('\n');
    this.instructions = [];
    for (let lineNum = 1; lineNum <= lines.length; lineNum++) {
      let line = lines[lineNum - 1].trim();
      if (!line) continue;
      if (line.length !== 16 || !/^[01]{16}$/.test(line)) {
        console.warn(`Warning: Invalid line ${lineNum}: ${line}`);
        continue;
      }
      const binStr = line;
      const val = parseInt(binStr, 2);
      let instr = { binary: binStr, decimal: val, line: lineNum };
      if (binStr[0] === '0') {
        instr.type = 'A';
        instr.value = parseInt(binStr.slice(1), 2);
      } else if (binStr.startsWith('111')) {
        instr.type = 'C';
        const a = binStr[3];
        const compBin = binStr.slice(4, 10);
        const destBin = binStr.slice(10, 13);
        const jumpBin = binStr.slice(13);
        const compKey = a + compBin;
        instr.a = a;
        instr.comp = this.compTable[compKey] || 'unknown';
        instr.dest = this.destTable[destBin] || 'unknown';
        instr.jump = this.jumpTable[jumpBin] || 'unknown';
      } else {
        console.warn(`Warning: Invalid opcode line ${lineNum}: ${line}`);
        continue;
      }
      this.instructions.push(instr);
    }
  }

  printProperties() {
    if (this.instructions.length === 0) {
      console.log('No valid instructions found.');
      return;
    }
    console.log(`File: ${this.filename}`);
    console.log(`Number of instructions: ${this.instructions.length}`);
    console.log('');
    this.instructions.forEach((instr, i) => {
      console.log(`Instruction ${i + 1} (line ${instr.line}):`);
      console.log(`  Binary: ${instr.binary}`);
      console.log(`  Decimal: ${instr.decimal}`);
      console.log(`  Type: ${instr.type}`);
      if (instr.type === 'A') {
        console.log(`  Value: ${instr.value}`);
      } else {
        console.log(`  a: ${instr.a}, comp: ${instr.comp}, dest: ${instr.dest}, jump: ${instr.jump}`);
      }
      console.log('');
    });
  }

  write(outputFilename) {
    const content = this.instructions.map(instr => instr.binary).join('\n') + '\n';
    fs.writeFileSync(outputFilename, content);
    console.log(`Wrote ${this.instructions.length} instructions to ${outputFilename}`);
  }
}

// Example usage
if (require.main === module) {
  if (process.argv.length !== 3) {
    console.error('Usage: node hackfile.js <filename.hack>');
    process.exit(1);
  }
  const hf = new HackFile(process.argv[2]);
  hf.read();
  hf.printProperties();
  hf.write('output.hack');
}

module.exports = HackFile;

7. C Class (Struct) for .HACK Handling

This C implementation uses a struct for instructions and functions to read/decode/print/write. Compile with gcc hackfile.c -o hackfile && ./hackfile example.hack. It uses only the standard library (no external libs) and needs a C11 compiler because the instruction struct contains an anonymous union.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_INSTR 32768
#define INSTR_LEN 16

typedef struct {
    char binary[17];  // Null-terminated
    int decimal;
    int line;
    char type[2];     // 'A' or 'C'
    union {
        int value;    // For A
        struct {
            char a;
            char comp[10];  // Mnemonic
            char dest[5];   // Longest mnemonic is "null": 4 chars + terminator
            char jump[5];   // Same sizing: "null" + terminator
        } c;
    };
} Instruction;

typedef struct {
    char filename[256];
    Instruction instructions[MAX_INSTR];
    int count;
} HackFile;

// Dest and jump mnemonics, indexed by the 3-bit field value.
// Comp mnemonics are decoded by decode_comp() below.
static const char* dest_table[8] = {"null", "M", "D", "MD", "A", "AM", "AD", "AMD"};
static const char* jump_table[8] = {"null", "JGT", "JEQ", "JGE", "JLT", "JNE", "JLE", "JMP"};

// Comp decode function (hardcoded mappings for common cases)
void decode_comp(char a, const char* comp_bin, char* out) {
    char key[8];
    sprintf(key, "%c%s", a, comp_bin);
    if (strcmp(key, "0101010") == 0) strcpy(out, "0");
    else if (strcmp(key, "0111111") == 0) strcpy(out, "1");
    else if (strcmp(key, "0111010") == 0) strcpy(out, "-1");
    else if (strcmp(key, "0001100") == 0) strcpy(out, "D");
    else if (strcmp(key, "0110000") == 0) strcpy(out, "A");
    else if (strcmp(key, "0001101") == 0) strcpy(out, "!D");
    else if (strcmp(key, "0110001") == 0) strcpy(out, "!A");
    else if (strcmp(key, "0001111") == 0) strcpy(out, "-D");
    else if (strcmp(key, "0110011") == 0) strcpy(out, "-A");
    else if (strcmp(key, "0011111") == 0) strcpy(out, "D+1");
    else if (strcmp(key, "0110111") == 0) strcpy(out, "A+1");
    else if (strcmp(key, "0001110") == 0) strcpy(out, "D-1");
    else if (strcmp(key, "0110010") == 0) strcpy(out, "A-1");
    else if (strcmp(key, "0000010") == 0) strcpy(out, "D+A");
    else if (strcmp(key, "0010011") == 0) strcpy(out, "D-A");
    else if (strcmp(key, "0000111") == 0) strcpy(out, "A-D");
    else if (strcmp(key, "0000000") == 0) strcpy(out, "D&A");
    else if (strcmp(key, "0010101") == 0) strcpy(out, "D|A");
    else if (strcmp(key, "1110000") == 0) strcpy(out, "M");
    else if (strcmp(key, "1110001") == 0) strcpy(out, "!M");
    else if (strcmp(key, "1110011") == 0) strcpy(out, "-M");
    else if (strcmp(key, "1110111") == 0) strcpy(out, "M+1");
    else if (strcmp(key, "1110010") == 0) strcpy(out, "M-1");
    else if (strcmp(key, "1000010") == 0) strcpy(out, "D+M");
    else if (strcmp(key, "1010011") == 0) strcpy(out, "D-M");
    else if (strcmp(key, "1000111") == 0) strcpy(out, "M-D");
    else if (strcmp(key, "1000000") == 0) strcpy(out, "D&M");
    else if (strcmp(key, "1010101") == 0) strcpy(out, "D|M");
    else strcpy(out, "unknown");
}

void read_hack(HackFile* hf) {
    hf->count = 0;  // Reset first so callers see an empty result if the open fails
    FILE* f = fopen(hf->filename, "r");
    if (!f) {
        printf("Error: File %s not found.\n", hf->filename);
        return;
    }
    char line[256];  // Roomy enough that "\r\n" endings and stray text fit in one read
    int line_num = 1;
    while (fgets(line, sizeof(line), f) && hf->count < MAX_INSTR) {
        char* trimmed = strtok(line, "\n\r ");
        if (!trimmed || strlen(trimmed) == 0) {
            line_num++;
            continue;
        }
        // strspn counts the leading run of '0'/'1'; it must cover the whole line
        if (strlen(trimmed) != INSTR_LEN || strspn(trimmed, "01") != INSTR_LEN) {
            printf("Warning: Invalid line %d: %s\n", line_num, trimmed);
            line_num++;
            continue;
        }
        strcpy(hf->instructions[hf->count].binary, trimmed);
        hf->instructions[hf->count].decimal = strtol(trimmed, NULL, 2);
        hf->instructions[hf->count].line = line_num;
        if (trimmed[0] == '0') {
            strcpy(hf->instructions[hf->count].type, "A");
            hf->instructions[hf->count].value = strtol(trimmed + 1, NULL, 2);
        } else if (strncmp(trimmed, "111", 3) == 0) {
            strcpy(hf->instructions[hf->count].type, "C");
            char a = trimmed[3];
            char comp_bin[7];
            strncpy(comp_bin, trimmed + 4, 6);
            comp_bin[6] = '\0';
            char dest_bin[4], jump_bin[4];
            strncpy(dest_bin, trimmed + 10, 3);
            dest_bin[3] = '\0';
            strncpy(jump_bin, trimmed + 13, 3);
            jump_bin[3] = '\0';
            hf->instructions[hf->count].c.a = a;
            decode_comp(a, comp_bin, hf->instructions[hf->count].c.comp);
            int d_idx = strtol(dest_bin, NULL, 2);
            strcpy(hf->instructions[hf->count].c.dest, dest_table[d_idx]);
            int j_idx = strtol(jump_bin, NULL, 2);
            strcpy(hf->instructions[hf->count].c.jump, jump_table[j_idx]);
        } else {
            printf("Warning: Invalid opcode line %d: %s\n", line_num, trimmed);
            line_num++;
            continue;
        }
        hf->count++;
        line_num++;
    }
    fclose(f);
}

void print_properties(HackFile* hf) {
    if (hf->count == 0) {
        printf("No valid instructions found.\n");
        return;
    }
    printf("File: %s\n", hf->filename);
    printf("Number of instructions: %d\n\n", hf->count);
    for (int i = 0; i < hf->count; i++) {
        Instruction* instr = &hf->instructions[i];
        printf("Instruction %d (line %d):\n", i + 1, instr->line);
        printf("  Binary: %s\n", instr->binary);
        printf("  Decimal: %d\n", instr->decimal);
        printf("  Type: %s\n", instr->type);
        if (strcmp(instr->type, "A") == 0) {
            printf("  Value: %d\n", instr->value);
        } else {
            printf("  a: %c, comp: %s, dest: %s, jump: %s\n", instr->c.a, instr->c.comp, instr->c.dest, instr->c.jump);
        }
        printf("\n");
    }
}

void write_hack(HackFile* hf, const char* output_filename) {
    FILE* f = fopen(output_filename, "w");
    if (!f) {
        printf("Error writing file %s.\n", output_filename);
        return;
    }
    for (int i = 0; i < hf->count; i++) {
        fprintf(f, "%s\n", hf->instructions[i].binary);
    }
    fclose(f);
    printf("Wrote %d instructions to %s\n", hf->count, output_filename);
}

int main(int argc, char* argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <filename.hack>\n", argv[0]);
        return 1;
    }
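    static HackFile hf;  // Static storage: the instruction array is too large for some stacks
    snprintf(hf.filename, sizeof(hf.filename), "%s", argv[1]);  // Bounded copy, always terminated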
    read_hack(&hf);
    print_properties(&hf);
    write_hack(&hf, "output.hack");
    return 0;
}