Task 160: .DX File Format

1. List of Properties Intrinsic to the .DX File Format

The .DX file format refers to the JCAMP-DX standard, a text-based format for exchanging spectroscopic data, particularly infrared spectra. Its intrinsic properties are defined by Labeled Data Records (LDRs): each LDR begins with "##", followed by a label, "=", and a value, and carries a header field, a parameter, or a data table. Required LDRs ensure basic compatibility; optional ones provide additional metadata. The list below covers the LDRs defined in the JCAMP-DX version 4.24 specification, grouped by category:

Required Core LDRs (Must appear in every simple data block for a valid file):

##TITLE=: Provides a concise description of the spectrum or block.
##JCAMP-DX=: Specifies the version of the format (e.g., 4.24).
##DATA TYPE=: Defines the type of data (e.g., INFRARED SPECTRUM, LINK for compound files).
##XUNITS=: Units for the abscissa (x-axis), such as 1/CM or MICROMETERS.
##YUNITS=: Units for the ordinate (y-axis), such as ABSORBANCE or TRANSMITTANCE.
##FIRSTX=: The first actual abscissa value.
##LASTX=: The last actual abscissa value.
##NPOINTS=: The number of data points in the spectral data.
##FIRSTY=: The actual y-value corresponding to ##FIRSTX=.
##XYDATA=: The spectral data table in the form of ordinates at equal x-intervals (e.g., (X++(Y..Y))).
##END=: Marks the end of the block or file.
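As a rough illustration of the required list above, the following Python sketch reports which core LDRs are missing from a block of text. The names REQUIRED_LDRS, normalize_label, and missing_required_ldrs are hypothetical, and the label normalization (ignoring case, spaces, hyphens, slashes, and underscores) is an assumption based on how the specification is generally understood to compare labels.

```python
# Required core LDR labels, taken from the list above.
REQUIRED_LDRS = [
    "TITLE", "JCAMP-DX", "DATA TYPE", "XUNITS", "YUNITS",
    "FIRSTX", "LASTX", "NPOINTS", "FIRSTY", "XYDATA", "END",
]


def normalize_label(label):
    """Upper-case a label and drop spaces, hyphens, slashes, and underscores.

    Assumption: labels such as "Data Type" and "DATATYPE" should compare equal.
    """
    return "".join(ch for ch in label.upper() if ch not in " -/_")


def missing_required_ldrs(text):
    """Return the required labels that do not appear in a JCAMP-DX text block."""
    present = set()
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("##") and "=" in line:
            # The label is everything between "##" and the first "=".
            present.add(normalize_label(line[2:].split("=", 1)[0]))
    return [name for name in REQUIRED_LDRS if normalize_label(name) not in present]


if __name__ == "__main__":
    sample = "##TITLE=demo\n##JCAMP-DX=4.24\n##Data Type=INFRARED SPECTRUM\n##END=\n"
    print(missing_required_ldrs(sample))
```

A check like this only confirms that the labels are present; it does not validate their values.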

Optional Spectral Parameter LDRs (Recommended for scaling and inspection):

##XFACTOR=: Scaling factor for x-values.
##YFACTOR=: Scaling factor for y-values.
##DELTAX=: Nominal spacing between data points.
##RESOLUTION=: Nominal spectral resolution (single value or pairs for varying resolution).
##MAXX=: Largest x-value in the spectrum.
##MINX=: Smallest x-value in the spectrum.
##MAXY=: Largest y-value in the spectrum.
##MINY=: Smallest y-value in the spectrum.
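To make the scaling parameters above concrete, here is a minimal Python sketch. It assumes that stored values are multiplied by ##XFACTOR= and ##YFACTOR= to recover actual values, and that the nominal spacing follows (LASTX - FIRSTX) / (NPOINTS - 1). The function names expected_deltax and scale_row are hypothetical.

```python
def expected_deltax(firstx, lastx, npoints):
    """Nominal spacing between points, assuming equally spaced x-values."""
    return (lastx - firstx) / (npoints - 1)


def scale_row(x_stored, y_stored, xfactor, yfactor, deltax):
    """Convert one stored row (X followed by several Y values) to actual (x, y) pairs.

    The first x is x_stored * xfactor; each later point is offset by deltax.
    """
    x0 = x_stored * xfactor
    return [(x0 + i * deltax, y * yfactor) for i, y in enumerate(y_stored)]


if __name__ == "__main__":
    dx = expected_deltax(4000.0, 400.0, 1801)
    print(scale_row(4000, [100, 200], 1.0, 0.5, dx))
```

Comparing the computed spacing against the ##DELTAX= value in the header is a cheap consistency check before plotting.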

Optional Notes LDRs (For metadata, at least one sample-related LDR recommended):

##CLASS=: Spectrum classification (e.g., Coblentz or IUPAC class).
##ORIGIN=: Source organization or individual.
##OWNER=: Owner of the data, including copyright if applicable.
##DATE=: Date of measurement (format: YY/MM/DD).
##TIME=: Time of measurement (format: HH:MM:SS).
##SOURCE REFERENCE=: Reference to the original spectrum location.
##CROSS REFERENCE=: Links to related spectra or data.
##SAMPLE DESCRIPTION=: Description of the sample (required for non-pure compounds).
##CAS NAME=: Chemical Abstracts Service name.
##MOLFORM=: Molecular formula.
##MP=: Melting point.
##BP=: Boiling point.
##REFRACTIVE INDEX=: Refractive index.
##DENSITY=: Density.
##MW=: Molecular weight.
##CAS REGISTRY NO=: CAS registry number.
##WISWESSER=: Wiswesser line notation.
##BEILSTEIN REFERENCE=: Beilstein reference.
##SPECTROMETER/DATA SYSTEM=: Instrument and system details.
##INSTRUMENT PARAMETERS=: Specific instrument settings.
##PATH LENGTH=: Sample path length.
##SAMPLING PROCEDURE=: Method of sampling.
##STATE=: Physical state of the sample.
##NAMES=: Alternate names for the compound.
##NIST COMPLEX=: NIST complex identifier.
##CONCENTRATION=: Sample concentration.
##STANDARD=: Reference to standard used.

Additional LDRs for Specialized Data (e.g., Interferograms):

##RUNITS=: Units for optical retardation.
##AUNITS=: Units for signal amplitude.
##FIRSTR=: First optical retardation value.
##LASTR=: Last optical retardation value.
##DELTAR=: Optical retardation per data point.
##ALIAS=: Alias fraction.
##ZPD=: Zero path difference position.
##RFACTOR=: Scaling factor for retardation.
##AFACTOR=: Scaling factor for amplitude.
##FIRSTA=: First amplitude value.
##RADATA=: Interferogram data.

These properties are intrinsic to the format's structure and keep files portable across systems. Files are ASCII text, each LDR starts on a new line, and tabular data may be stored either as plain numbers or compressed using the ASCII Squeezed Difference Form (ASDF).
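When a ##XYDATA= table is stored as plain numbers rather than in compressed form, its rows can be read with a few lines of Python. The sketch below is an illustration under that assumption only: it does not decode ASDF-compressed rows, it treats "$$" as the start of an inline comment, and the names NUMBER and parse_plain_xydata are hypothetical.

```python
import re

# Matches signed decimal numbers, with optional exponent. A sign also acts as
# a separator, so "400-500" yields two values.
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def parse_plain_xydata(rows):
    """Parse uncompressed (X++(Y..Y)) rows into (stored_x, [stored_y, ...]) tuples."""
    table = []
    for row in rows:
        row = row.split("$$", 1)[0]  # drop an inline comment, if any
        values = [float(tok) for tok in NUMBER.findall(row)]
        if values:
            # The first number on a row is X; the rest are Y values.
            table.append((values[0], values[1:]))
    return table


if __name__ == "__main__":
    for x, ys in parse_plain_xydata(["4000 100 200 300", "3994 400-500 $$ note"]):
        print(x, ys)
```

The returned values are still the stored ones; any x or y scaling factors from the header would need to be applied afterwards.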

Note: JCAMP-DX files commonly use .jdx or .dx extensions.

3. Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .DX File Processing

The following is a self-contained HTML page with embedded JavaScript that allows users to drag and drop a .DX (JCAMP-DX) file. It reads the file, parses the LDRs (properties), and dumps them to the screen in a structured list. Data sections (e.g., ##XYDATA=) are summarized to avoid overwhelming output.

JCAMP-DX File Parser
Drag and drop a .DX file here
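The summarizing step described for data sections can be sketched in Python as follows. This is not the page's JavaScript; it is a hypothetical helper, summarize_properties, that assumes properties are held in a dictionary keyed by labels such as "##XYDATA", with multi-line values joined by newlines.

```python
# Labels whose values are data tables and should be shortened for display.
DATA_LABELS = {"##XYDATA", "##RADATA"}


def summarize_properties(properties, preview_lines=2):
    """Return a copy of properties with long data tables truncated.

    The first line of a data value (the form, e.g. "(X++(Y..Y))") and the
    next preview_lines rows are kept; the rest are replaced by a count.
    """
    summary = {}
    for key, value in properties.items():
        lines = value.split("\n")
        head = lines[: 1 + preview_lines]
        hidden = len(lines) - len(head)
        if key.upper() in DATA_LABELS and hidden > 0:
            summary[key] = "\n".join(head + [f"... ({hidden} more data lines)"])
        else:
            summary[key] = value
    return summary


if __name__ == "__main__":
    props = {"##TITLE": "demo", "##XYDATA": "(X++(Y..Y))\n1 2\n3 4\n5 6\n7 8"}
    for key, value in summarize_properties(props).items():
        print(f"{key}: {value}")
```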

4. Python Class for .DX File Handling

The following Python class can open, decode (parse), read, write, and print all properties from a .DX (JCAMP-DX) file to the console.

import os

class DXFileHandler:
    def __init__(self, filepath):
        self.filepath = filepath
        self.properties = {}
        self.read()

    def read(self):
        """Read and decode the .DX file, storing properties."""
        if not os.path.exists(self.filepath):
            raise FileNotFoundError(f"File {self.filepath} not found.")
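        with open(self.filepath, 'r') as f:
            content = f.read()
        self.properties = {}
        current_key = None
        for line in content.splitlines():
            line = line.strip()
            if not line:
                continue  # skip blank lines so values do not collect stray newlines
            if line.startswith('##'):
                # Split on the first '=' only; values may themselves contain '='.
                key, value = line.split('=', 1) if '=' in line else (line, '')
                current_key = key.strip()
                self.properties[current_key] = value.strip()
            elif current_key:
                # Continuation line (e.g. a data row) belongs to the current LDR.
                self.properties[current_key] += '\n' + line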

    def print_properties(self):
        """Print all properties to console."""
        for key, value in self.properties.items():
            print(f"{key}: {value}")

    def write(self, new_filepath=None):
        """Write the properties back to a new .DX file."""
        filepath = new_filepath or self.filepath
        with open(filepath, 'w') as f:
            for key, value in self.properties.items():
                f.write(f"{key}={value}\n")

# Example usage:
# handler = DXFileHandler('sample.dx')
# handler.print_properties()
# handler.write('output.dx')

5. Java Class for .DX File Handling

The following Java class can open, decode (parse), read, write, and print all properties from a .DX (JCAMP-DX) file to the console.

import java.io.*;
import java.util.LinkedHashMap;
import java.util.Map;

public class DXFileHandler {
    private String filepath;
    private Map<String, String> properties = new LinkedHashMap<>();

    public DXFileHandler(String filepath) {
        this.filepath = filepath;
        read();
    }

    public void read() {
        try (BufferedReader reader = new BufferedReader(new FileReader(filepath))) {
            String line;
            String currentKey = null;
            StringBuilder valueBuilder = new StringBuilder();
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.startsWith("##")) {
                    if (currentKey != null) {
                        properties.put(currentKey, valueBuilder.toString().trim());
                    }
                    String[] parts = line.split("=", 2);
                    currentKey = parts[0].trim();
                    valueBuilder = new StringBuilder(parts.length > 1 ? parts[1].trim() : "");
                } else if (currentKey != null) {
                    valueBuilder.append("\n").append(line);
                }
            }
            if (currentKey != null) {
                properties.put(currentKey, valueBuilder.toString().trim());
            }
        } catch (IOException e) {
            System.err.println("Error reading file: " + e.getMessage());
        }
    }

    public void printProperties() {
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }

    public void write(String newFilepath) {
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(newFilepath == null ? filepath : newFilepath))) {
            for (Map.Entry<String, String> entry : properties.entrySet()) {
                writer.write(entry.getKey() + "=" + entry.getValue() + "\n");
            }
        } catch (IOException e) {
            System.err.println("Error writing file: " + e.getMessage());
        }
    }

    // Example usage:
    // public static void main(String[] args) {
    //     DXFileHandler handler = new DXFileHandler("sample.dx");
    //     handler.printProperties();
    //     handler.write("output.dx");
    // }
}

6. JavaScript Class for .DX File Handling

The following JavaScript class (Node.js compatible) can open, decode (parse), read, write, and print all properties from a .DX (JCAMP-DX) file to the console. It relies only on Node's built-in 'fs' module.

const fs = require('fs');

class DXFileHandler {
    constructor(filepath) {
        this.filepath = filepath;
        this.properties = {};
        this.read();
    }

    read() {
        try {
            const content = fs.readFileSync(this.filepath, 'utf8');
            const lines = content.split('\n');
            let currentKey = null;
            lines.forEach(line => {
                line = line.trim();
                if (line.startsWith('##')) {
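                    // split('=', 2) would drop everything after a second '=',
                    // so cut at the first '=' only.
                    const eq = line.indexOf('=');
                    currentKey = (eq === -1 ? line : line.slice(0, eq)).trim();
                    this.properties[currentKey] = eq === -1 ? '' : line.slice(eq + 1).trim();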
                } else if (currentKey) {
                    this.properties[currentKey] += '\n' + line.trim();
                }
            });
        } catch (error) {
            console.error('Error reading file:', error.message);
        }
    }

    printProperties() {
        for (const [key, value] of Object.entries(this.properties)) {
            console.log(`${key}: ${value}`);
        }
    }

    write(newFilepath = this.filepath) {
        try {
            let content = '';
            for (const [key, value] of Object.entries(this.properties)) {
                content += `${key}=${value}\n`;
            }
            fs.writeFileSync(newFilepath, content, 'utf8');
        } catch (error) {
            console.error('Error writing file:', error.message);
        }
    }
}

// Example usage:
// const handler = new DXFileHandler('sample.dx');
// handler.printProperties();
// handler.write('output.dx');

7. C++ Class for .DX File Handling

The following C++ class (using standard libraries) can open, decode (parse), read, write, and print all properties from a .DX (JCAMP-DX) file to the console.

#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

class DXFileHandler {
private:
    std::string filepath;
    // A vector of pairs keeps LDRs in file order. std::map would sort them
    // alphabetically and move ##TITLE= and ##END= out of position on write.
    std::vector<std::pair<std::string, std::string>> properties;

    // Remove leading and trailing whitespace, including a trailing '\r'.
    static std::string trim(const std::string& s) {
        const char* ws = " \t\r\n";
        std::size_t start = s.find_first_not_of(ws);
        if (start == std::string::npos) return "";
        std::size_t end = s.find_last_not_of(ws);
        return s.substr(start, end - start + 1);
    }

public:
    explicit DXFileHandler(const std::string& fp) : filepath(fp) {
        read();
    }

    void read() {
        std::ifstream file(filepath);
        if (!file.is_open()) {
            std::cerr << "Error opening file: " << filepath << std::endl;
            return;
        }
        properties.clear();
        std::string line;
        while (std::getline(file, line)) {
            std::string trimmed = trim(line);
            if (trimmed.rfind("##", 0) == 0) {
                // New LDR: split on the first '=' only.
                std::size_t eqPos = trimmed.find('=');
                std::string key = trim(trimmed.substr(0, eqPos));
                std::string value = (eqPos != std::string::npos) ? trim(trimmed.substr(eqPos + 1)) : "";
                properties.emplace_back(key, value);
            } else if (!properties.empty() && !trimmed.empty()) {
                // Continuation line (e.g. a data row) belongs to the last LDR.
                properties.back().second += "\n" + trimmed;
            }
        }
    }

    void printProperties() const {
        for (const auto& pair : properties) {
            std::cout << pair.first << ": " << pair.second << std::endl;
        }
    }

    void write(const std::string& newFilepath = "") const {
        std::string outPath = newFilepath.empty() ? filepath : newFilepath;
        std::ofstream file(outPath);
        if (!file.is_open()) {
            std::cerr << "Error writing file: " << outPath << std::endl;
            return;
        }
        for (const auto& pair : properties) {
            file << pair.first << "=" << pair.second << "\n";
        }
    }
};

// Example usage:
// int main() {
//     DXFileHandler handler("sample.dx");
//     handler.printProperties();
//     handler.write("output.dx");
//     return 0;
// }