Task 440: .NB File Format

1. List of all the properties of this file format intrinsic to its file system

Based on the specifications of the .NB file format (which is the Wolfram Mathematica Notebook format, a plain text format using Wolfram Language syntax for structured interactive documents), the intrinsic properties are as follows. These are derived from the format's structure, storage conventions, and metadata elements that define how the file is recognized, parsed, and handled by systems:

  • File Extension: .nb (case-insensitive, but typically lowercase)
  • MIME Type: application/vnd.wolfram.mathematica
  • Encoding: 7-bit ASCII (printable characters only, ensuring human-readability in any text editor)
  • Human-Readable: Yes (the file is plain text, viewable and editable outside of Mathematica, though best interpreted by Wolfram software)
  • Cross-Platform Compatibility: Yes (readable on any platform with Wolfram Language support; no platform-specific byte order or binary elements)
  • Newline Convention: Platform-independent (supports LF for macOS/Unix or CR+LF for Windows; interpreted consistently by Wolfram Language)
  • CreatedBy Metadata: A comment string indicating the Wolfram product and version that created the file, e.g., (* CreatedBy='Mathematica 14.0' *)
  • CacheID: A unique identifier in a comment for the file outline cache, e.g., (*CacheID: 232 *)
  • File Outline Cache: Optional internal cache data for incremental loading, enclosed in comments like (* Internal cache information *) ... (* End of internal cache information *); used for performance but can be omitted or invalidated
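  • Notebook Expression Structure: The core content is a Wolfram Language expression of the form Notebook[{cell1, cell2, ...}, options], where the first argument is a list of Cell[...] expressions and the notebook options follow as rules (for example, WindowSize -> {800, 600}); cells carry styles such as "Input", "Output", and "Text", with content wrapped in BoxData, TextData, etc.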
  • Security Features: Supports dynamic content evaluation flags; can include options like DynamicUpdatingEnabled or InitializationCell to control code execution on open
  • Compression Support: Inline compression possible for obfuscation (e.g., via Compress[] in cells), but base format is uncompressed text

These properties define the format's identity, parsing rules, and system interactions without relying on external metadata.
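Because the file is restricted to 7-bit ASCII, characters outside that range have to appear as escape sequences. Wolfram Language strings use forms such as \[Alpha] (a named character) and \:03b1 (four hexadecimal digits). The following is a minimal sketch of decoding those two forms in Python; the function name decode_escapes and the three-entry NAMED table are illustrative assumptions, and the real table of named characters is far larger.

```python
import re

# Tiny illustrative subset of named characters (assumption: a real decoder
# would need the full Wolfram Language named-character table).
NAMED = {"Alpha": "\u03b1", "Beta": "\u03b2", "Pi": "\u03c0"}

# Alternatives are tried left to right: an escaped backslash is consumed first
# so that "\\:03b1" (literal backslash + ":03b1") is not decoded by mistake.
_ESCAPE = re.compile(r"\\\\|\\:([0-9a-fA-F]{4})|\\\[([A-Za-z0-9]+)\]")


def decode_escapes(s: str) -> str:
    """Replace \\:xxxx and known \\[Name] escapes with Unicode characters."""
    def repl(m):
        if m.group(1):                      # \:xxxx form
            return chr(int(m.group(1), 16))
        if m.group(2):                      # \[Name] form
            return NAMED.get(m.group(2), m.group(0))
        return m.group(0)                   # escaped backslash, left as-is
    return _ESCAPE.sub(repl, s)


if __name__ == "__main__":
    print(decode_escapes(r"\:03b1 + \[Beta]"))
```

Unknown names are left untouched rather than guessed, so the output can still be inspected for escapes the table does not cover.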

Here are two direct download links to sample .NB (Mathematica Notebook) files:

These are publicly available samples from academic resources. Clicking them should prompt a download of the .nb file.

3. Ghost blog embedded HTML JavaScript for drag-and-drop .NB file dump

Here's a self-contained HTML snippet with embedded JavaScript that can be embedded in a Ghost blog post (e.g., via the HTML card in the Ghost editor). It creates a drag-and-drop area where users can drop a .NB file. The script reads the file as text, parses it using regex to extract the properties from the list above, and dumps them to the screen in a readable format.

Drag and drop a .NB file here

4. Python class for .NB file handling

Here's a Python class that can open a .NB file, decode/read its text content, extract and print the properties, and write a modified version (e.g., updating CreatedBy) to a new file.

import re

# Patterns tolerate optional whitespace just inside the comment delimiters.
CREATED_BY_RE = re.compile(r"\(\*\s*CreatedBy='(.*?)'\s*\*\)")
CACHE_ID_RE = re.compile(r"\(\*\s*CacheID:\s*(.*?)\s*\*\)")
CACHE_INFO_RE = re.compile(
    r"\(\* Internal cache information \*\)(.*?)\(\* End of internal cache information \*\)",
    re.DOTALL)


class NBFileHandler:
    def __init__(self, filepath):
        self.filepath = filepath
        self.content = None
        self.properties = {}

    def read(self):
        # latin-1 maps every byte to one character, so non-ASCII bytes can be
        # detected instead of raising; newline='' keeps CR+LF pairs intact.
        with open(self.filepath, 'r', encoding='latin-1', newline='') as f:
            self.content = f.read()
        self._decode_properties()

    @staticmethod
    def _first_group(pattern, text, default):
        # Return the first capture group of the pattern, or the default.
        match = pattern.search(text)
        return match.group(1).strip() if match else default

    def _decode_properties(self):
        if not self.content:
            raise ValueError("No content loaded")
        text = self.content
        self.properties = {
            'File Extension': '.nb',
            'MIME Type': 'application/vnd.wolfram.mathematica',
            'Encoding': '7-bit ASCII' if all(ord(c) < 128 for c in text) else 'Unknown',
            'Human-Readable': 'Yes',
            'Cross-Platform Compatibility': 'Yes',
            'Newline Convention': 'CR+LF (Windows)' if '\r\n' in text else 'LF (Unix/macOS)',
            'CreatedBy Metadata': self._first_group(CREATED_BY_RE, text, 'Not found'),
            'CacheID': self._first_group(CACHE_ID_RE, text, 'Not found'),
            'File Outline Cache': self._first_group(CACHE_INFO_RE, text, 'None'),
            'Notebook Expression Structure': 'Notebook[{...}] (simplified)',
            'Number of Cells (approximate)': len(re.findall(r'Cell\[', text)),
            'Security Features': 'Present' if 'DynamicUpdatingEnabled' in text else 'Not detected',
            'Compression Support': 'Inline compression detected' if 'Compress[' in text else 'None detected'
        }

    def print_properties(self):
        if not self.properties:
            raise ValueError("Properties not decoded")
        for key, value in self.properties.items():
            print(f"{key}: {value}")

    def write(self, new_filepath, update_created_by=None):
        if not self.content:
            raise ValueError("No content loaded")
        new_content = self.content
        if update_created_by:
            # A function replacement keeps backslashes in the new value literal.
            new_content = CREATED_BY_RE.sub(
                lambda m: f"(* CreatedBy='{update_created_by}' *)", new_content)
        # Same codec and newline handling as read(), so bytes round-trip unchanged.
        with open(new_filepath, 'w', encoding='latin-1', newline='') as f:
            f.write(new_content)

# Example usage:
# handler = NBFileHandler('example.nb')
# handler.read()
# handler.print_properties()
# handler.write('modified.nb', update_created_by='Mathematica 14.1')
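
The newline check in the class above reports CR+LF as soon as a single such pair appears, so a file with mixed line endings is labelled as a Windows file. A more careful check can count each convention on the raw bytes. The following is a minimal sketch under that assumption; the function names newline_counts and newline_convention are hypothetical and not part of the class.

```python
def newline_counts(data: bytes) -> dict:
    """Count CR+LF pairs, lone LF, and lone CR in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF not preceded by CR
    cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {"CRLF": crlf, "LF": lf, "CR": cr}


def newline_convention(data: bytes) -> str:
    """Return 'CRLF', 'LF', 'CR', 'mixed', or 'none' for the given bytes."""
    used = [name for name, n in newline_counts(data).items() if n]
    if not used:
        return "none"
    return used[0] if len(used) == 1 else "mixed"


if __name__ == "__main__":
    # Read in binary mode so Python does not translate line endings.
    with open("example.nb", "rb") as f:
        print(newline_convention(f.read()))
```

Working on bytes rather than decoded text sidesteps any newline translation done by text-mode file objects.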

5. Java class for .NB file handling

Here's a Java class that performs similar operations: open, decode/read, extract/print properties, and write a modified file.

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NBFileHandler {
    // Patterns tolerate optional whitespace just inside the comment delimiters.
    private static final Pattern CREATED_BY = Pattern.compile("\\(\\*\\s*CreatedBy='(.*?)'\\s*\\*\\)");
    private static final Pattern CACHE_ID = Pattern.compile("\\(\\*\\s*CacheID:\\s*(.*?)\\s*\\*\\)");
    private static final Pattern CACHE_INFO = Pattern.compile(
        "\\(\\* Internal cache information \\*\\)(.*?)\\(\\* End of internal cache information \\*\\)",
        Pattern.DOTALL);

    private final String filepath;
    private String content;
    // LinkedHashMap keeps the properties in insertion order when printed.
    private final Map<String, String> properties = new LinkedHashMap<>();

    public NBFileHandler(String filepath) {
        this.filepath = filepath;
    }

    public void read() throws IOException {
        // Reading raw bytes as ISO-8859-1 preserves line endings and lets the
        // ASCII check below see any byte above 0x7F.
        content = new String(Files.readAllBytes(Paths.get(filepath)), StandardCharsets.ISO_8859_1);
        decodeProperties();
    }

    // Returns the first capture group of the pattern, or the fallback.
    private static String firstGroup(Pattern pattern, String text, String fallback) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group(1).trim() : fallback;
    }

    private void decodeProperties() {
        if (content == null) throw new IllegalStateException("No content loaded");
        properties.clear();
        properties.put("File Extension", ".nb");
        properties.put("MIME Type", "application/vnd.wolfram.mathematica");
        properties.put("Encoding", content.chars().allMatch(c -> c < 128) ? "7-bit ASCII" : "Unknown");
        properties.put("Human-Readable", "Yes");
        properties.put("Cross-Platform Compatibility", "Yes");
        properties.put("Newline Convention", content.contains("\r\n") ? "CR+LF (Windows)" : "LF (Unix/macOS)");
        properties.put("CreatedBy Metadata", firstGroup(CREATED_BY, content, "Not found"));
        properties.put("CacheID", firstGroup(CACHE_ID, content, "Not found"));
        properties.put("File Outline Cache", firstGroup(CACHE_INFO, content, "None"));
        properties.put("Notebook Expression Structure", "Notebook[{...}] (simplified)");
        properties.put("Number of Cells (approximate)", String.valueOf(countMatches(content, "Cell[")));
        properties.put("Security Features", content.contains("DynamicUpdatingEnabled") ? "Present" : "Not detected");
        properties.put("Compression Support", content.contains("Compress[") ? "Inline compression detected" : "None detected");
    }

    // Counts non-overlapping occurrences of sub in str.
    private int countMatches(String str, String sub) {
        int count = 0;
        int idx = 0;
        while ((idx = str.indexOf(sub, idx)) != -1) {
            count++;
            idx += sub.length();
        }
        return count;
    }

    public void printProperties() {
        if (properties.isEmpty()) throw new IllegalStateException("Properties not decoded");
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }

    public void write(String newFilepath, String updateCreatedBy) throws IOException {
        if (content == null) throw new IllegalStateException("No content loaded");
        String newContent = content;
        if (updateCreatedBy != null) {
            // quoteReplacement keeps '$' and '\' in the new value literal.
            String replacement = Matcher.quoteReplacement("(* CreatedBy='" + updateCreatedBy + "' *)");
            newContent = CREATED_BY.matcher(content).replaceAll(replacement);
        }
        // Same charset as read(), so the bytes round-trip unchanged.
        Files.write(Paths.get(newFilepath), newContent.getBytes(StandardCharsets.ISO_8859_1));
    }

    // Example usage:
    // public static void main(String[] args) throws IOException {
    //     NBFileHandler handler = new NBFileHandler("example.nb");
    //     handler.read();
    //     handler.printProperties();
    //     handler.write("modified.nb", "Mathematica 14.1");
    // }
}

6. JavaScript class for .NB file handling

Here's a JavaScript class (Node.js compatible) that can open a .NB file, decode/read, extract/print properties to console, and write a modified file. Requires fs module.

const fs = require('fs');

class NBFileHandler {
  constructor(filepath) {
    this.filepath = filepath;
    this.content = null;
    this.properties = {};
  }

  read() {
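    // 'latin1' keeps each byte as one character; Node's 'ascii' decoding clears
    // the high bit, which would hide non-ASCII bytes from the check below.
    this.content = fs.readFileSync(this.filepath, 'latin1');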
    this._decodeProperties();
  }

  _decodeProperties() {
    if (!this.content) throw new Error('No content loaded');
    this.properties = {
      'File Extension': '.nb',
      'MIME Type': 'application/vnd.wolfram.mathematica',
      'Encoding': /^[\x00-\x7F]*$/.test(this.content) ? '7-bit ASCII' : 'Unknown',
      'Human-Readable': 'Yes',
      'Cross-Platform Compatibility': 'Yes',
      'Newline Convention': this.content.includes('\r\n') ? 'CR+LF (Windows)' : 'LF (Unix/macOS)',
      'CreatedBy Metadata': this.content.match(/\(\*\s*CreatedBy='(.*?)'\s*\*\)/)?.[1] || 'Not found',
      'CacheID': this.content.match(/\(\*\s*CacheID:\s*(.*?)\s*\*\)/)?.[1] || 'Not found',
      'File Outline Cache': this.content.match(/\(\* Internal cache information \*\)([\s\S]*?)\(\* End of internal cache information \*\)/)?.[1].trim() || 'None',
      'Notebook Expression Structure': 'Notebook[{...}] (simplified)',
      'Number of Cells (approximate)': (this.content.match(/Cell\[/g) || []).length,
      'Security Features': this.content.includes('DynamicUpdatingEnabled') ? 'Present' : 'Not detected',
      'Compression Support': this.content.includes('Compress[') ? 'Inline compression detected' : 'None detected'
    };
  }

  printProperties() {
    if (!Object.keys(this.properties).length) throw new Error('Properties not decoded');
    for (const [key, value] of Object.entries(this.properties)) {
      console.log(`${key}: ${value}`);
    }
  }

  write(newFilepath, updateCreatedBy = null) {
    if (!this.content) throw new Error('No content loaded');
    let newContent = this.content;
    if (updateCreatedBy) {
      // A replacer function keeps '$' sequences in the new value literal.
      newContent = this.content.replace(/\(\*\s*CreatedBy='(.*?)'\s*\*\)/, () => `(* CreatedBy='${updateCreatedBy}' *)`);
    }
    fs.writeFileSync(newFilepath, newContent, 'latin1');
  }
}

// Example usage:
// const handler = new NBFileHandler('example.nb');
// handler.read();
// handler.printProperties();
// handler.write('modified.nb', 'Mathematica 14.1');

7. C++ class for .NB file handling

Here's a C++ class that can open a .NB file, decode/read, extract/print properties to console, and write a modified file. Uses <regex> for parsing.

#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <map>
#include <regex>
#include <stdexcept>
#include <string>

// Pattern tolerates optional whitespace just inside the comment delimiters.
static const char* const kCreatedByPattern = R"(\(\*\s*CreatedBy='(.*?)'\s*\*\))";

class NBFileHandler {
private:
    std::string filepath;
    std::string content;
    std::map<std::string, std::string> properties;

    // Returns the first capture group of `pattern` in the content, or `fallback`.
    std::string firstGroup(const char* pattern, const std::string& fallback) const {
        std::smatch match;
        return std::regex_search(content, match, std::regex(pattern)) ? match[1].str() : fallback;
    }

public:
    explicit NBFileHandler(const std::string& fp) : filepath(fp) {}

    void read() {
        // Binary mode keeps CR+LF pairs intact so the newline check sees them.
        std::ifstream file(filepath, std::ios::in | std::ios::binary);
        if (!file) throw std::runtime_error("Failed to open file");
        content.assign(std::istreambuf_iterator<char>(file), std::istreambuf_iterator<char>());
        decodeProperties();
    }

    void decodeProperties() {
        if (content.empty()) throw std::runtime_error("No content loaded");
        properties.clear();
        properties["File Extension"] = ".nb";
        properties["MIME Type"] = "application/vnd.wolfram.mathematica";
        bool isAscii = true;
        for (char c : content) if (static_cast<unsigned char>(c) > 127) { isAscii = false; break; }
        properties["Encoding"] = isAscii ? "7-bit ASCII" : "Unknown";
        properties["Human-Readable"] = "Yes";
        properties["Cross-Platform Compatibility"] = "Yes";
        properties["Newline Convention"] = content.find("\r\n") != std::string::npos ? "CR+LF (Windows)" : "LF (Unix/macOS)";
        properties["CreatedBy Metadata"] = firstGroup(kCreatedByPattern, "Not found");
        properties["CacheID"] = firstGroup(R"(\(\*\s*CacheID:\s*(.*?)\s*\*\))", "Not found");
        // Plain substring search avoids running a lazy regex over a large file.
        const std::string openTag = "(* Internal cache information *)";
        const std::string closeTag = "(* End of internal cache information *)";
        std::string cache = "None";
        const size_t start = content.find(openTag);
        if (start != std::string::npos) {
            const size_t from = start + openTag.size();
            const size_t end = content.find(closeTag, from);
            if (end != std::string::npos) {
                cache = content.substr(from, end - from);
                std::replace(cache.begin(), cache.end(), '\n', ' ');  // flatten to one line
            }
        }
        properties["File Outline Cache"] = cache;
        properties["Notebook Expression Structure"] = "Notebook[{...}] (simplified)";
        int cellCount = 0;
        size_t pos = 0;
        while ((pos = content.find("Cell[", pos)) != std::string::npos) { cellCount++; pos += 5; }
        properties["Number of Cells (approximate)"] = std::to_string(cellCount);
        properties["Security Features"] = content.find("DynamicUpdatingEnabled") != std::string::npos ? "Present" : "Not detected";
        properties["Compression Support"] = content.find("Compress[") != std::string::npos ? "Inline compression detected" : "None detected";
    }

    void printProperties() const {
        if (properties.empty()) throw std::runtime_error("Properties not decoded");
        for (const auto& pair : properties) {
            std::cout << pair.first << ": " << pair.second << std::endl;
        }
    }

    void write(const std::string& newFilepath, const std::string& updateCreatedBy = "") const {
        if (content.empty()) throw std::runtime_error("No content loaded");
        std::string newContent = content;
        if (!updateCreatedBy.empty()) {
            // Assign the result; appending through back_inserter would duplicate the content.
            newContent = std::regex_replace(content, std::regex(kCreatedByPattern),
                                            "(* CreatedBy='" + updateCreatedBy + "' *)");
        }
        std::ofstream file(newFilepath, std::ios::out | std::ios::binary);
        if (!file) throw std::runtime_error("Failed to write file");
        file << newContent;
    }
};

// Example usage:
// int main() {
//     try {
//         NBFileHandler handler("example.nb");
//         handler.read();
//         handler.printProperties();
//         handler.write("modified.nb", "Mathematica 14.1");
//     } catch (const std::exception& e) {
//         std::cerr << e.what() << std::endl;
//     }
//     return 0;
// }
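
All four classes above count cells with a plain substring search for Cell[, which also counts occurrences inside string literals and comments. A scan that skips those regions gives a tighter estimate. The following is a minimal Python sketch of that idea, not a full Wolfram Language parser; the function name count_cells is hypothetical, and it assumes that strings use backslash escapes and that (* ... *) comments may nest.

```python
def count_cells(text: str) -> int:
    """Count `Cell[` heads that occur outside string literals and comments."""
    count = 0
    comment_depth = 0
    i, n = 0, len(text)
    while i < n:
        if text.startswith("(*", i):
            comment_depth += 1            # comments may nest
            i += 2
        elif comment_depth and text.startswith("*)", i):
            comment_depth -= 1
            i += 2
        elif comment_depth:
            i += 1                        # ignore everything inside a comment
        elif text[i] == '"':
            i += 1                        # skip a string literal,
            while i < n and text[i] != '"':
                i += 2 if text[i] == "\\" else 1   # honouring \" escapes
            i += 1                        # step past the closing quote
        elif text.startswith("Cell[", i) and not (i > 0 and text[i - 1].isalnum()):
            count += 1                    # a genuine Cell[ head
            i += 5
        else:
            i += 1
    return count


if __name__ == "__main__":
    sample = ('Notebook[{Cell["uses Cell[ in text", "Text"], '
              '(* Cell[ *) Cell[BoxData["x"], "Input"]}]')
    print(count_cells(sample), sample.count("Cell["))
```

On the sample string the scan finds two cells, while the plain substring count reports four because it also sees the occurrences inside the string and the comment.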