Task 690: .STEP File Format

1. File Format Specifications for .STEP

The .STEP file format (also written .STP) is defined by ISO 10303-21, which specifies the clear-text encoding of the exchange structure for product data modeled in the EXPRESS language (ISO 10303-11). It is a vendor-neutral standard for representing 3D CAD models and related product data. The format is text-based and sequential, and its syntax is specified as a context-free grammar written in Wirth Syntax Notation (WSN). The 2016 edition encodes files as UTF-8. Key features include:

  • Starts with "ISO-10303-21;"
  • Contains a mandatory HEADER section for metadata.
  • Optional ANCHOR and REFERENCE sections for external references.
  • One or more DATA sections for entity instances.
  • Optional SIGNATURE sections for authentication.
  • Ends with "END-ISO-10303-21;"
  • Supports compression (e.g., ZIP archives).
  • Conformance classes: Syntactical (1-3), indicated in the header.
  • Data types: Integers, reals, strings, enumerations, binaries, lists/sets/arrays/bags.
  • Entities: Defined by schemas (e.g., AP203, AP214 for CAD).

The full specification is available in ISO 10303-21:2016.
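
The data types listed above map onto a small set of textual tokens. The following is a minimal sketch, not a complete encoder, of how Python values could be written as parameter tokens; `encode_value` and the `Prefix` enum are hypothetical names introduced only for illustration, and binaries, typed parameters, and the control directives for non-ASCII text are left out.

```python
import math
from enum import Enum


def encode_value(value):
    """Encode a Python value as a Part 21 parameter token (simplified)."""
    if value is None:
        return "$"  # unset optional attribute
    if isinstance(value, bool):  # test bool before int: bool is an int subclass
        return ".T." if value else ".F."
    if isinstance(value, Enum):
        return "." + value.name.upper() + "."  # enumerations are written between dots
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        if not math.isfinite(value):
            raise ValueError("reals must be finite")
        mantissa, sep, exponent = repr(value).upper().partition("E")
        if "." not in mantissa:
            mantissa += "."  # a real always carries a decimal point
        return mantissa + sep + exponent
    if isinstance(value, str):
        # Backslashes and apostrophes are escaped by doubling them
        return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"
    if isinstance(value, (list, tuple)):
        return "(" + ",".join(encode_value(v) for v in value) + ")"
    raise TypeError(f"unsupported type: {type(value).__name__}")


class Prefix(Enum):  # hypothetical enumeration, for illustration only
    MILLI = 1


print(encode_value(["O'Brien", 2, 1.0, None, True]))
print(encode_value(Prefix.MILLI))
```

Checking `bool` before `int` matters here because Python treats `True` as an integer, which would otherwise be written as `1` instead of `.T.`.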

2. List of Properties Intrinsic to the .STEP File Format

Based on the ISO 10303-21 specification, the intrinsic properties refer to the metadata fields in the mandatory HEADER section, which describe the file's context, conformance, and origin. These are extractable without interpreting the full DATA section. The properties (entities and their attributes) are:

Required Properties:

  • file_description:
      • description: List of strings (informal file contents description).
      • implementation_level: String (e.g., "4;1" for conformance class).
  • file_name:
      • name: String (file identifier, may include OID).
      • time_stamp: String (ISO 8601 timestamp, e.g., "2023-01-01T12:00:00").
      • author: List of strings (creator's name and address).
      • organization: List of strings (organization details).
      • preprocessor_version: String (creation tool version).
      • originating_system: String (source system).
      • authorization: String (authorizing person).
  • file_schema:
      • schema_identifiers: List of unique strings (EXPRESS schema names, e.g., "CONFIG_CONTROL_DESIGN", optionally with OIDs).
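
The three required header entities can be modeled as plain records. The sketch below is one possible representation, assuming that every attribute is a string or a list of strings as listed above; the class names, `quote`, `quote_list`, and `render_header` are hypothetical helpers, and the quoting is deliberately minimal (it only doubles apostrophes).

```python
from dataclasses import dataclass
from typing import List


def quote(text: str) -> str:
    """Wrap text in apostrophes, doubling any embedded apostrophe."""
    return "'" + text.replace("'", "''") + "'"


def quote_list(items: List[str]) -> str:
    """Encode a list of strings as a parenthesized aggregate."""
    return "(" + ",".join(quote(item) for item in items) + ")"


@dataclass
class FileDescription:
    description: List[str]
    implementation_level: str

    def to_step(self) -> str:
        return f"FILE_DESCRIPTION({quote_list(self.description)},{quote(self.implementation_level)});"


@dataclass
class FileName:
    name: str
    time_stamp: str
    author: List[str]
    organization: List[str]
    preprocessor_version: str
    originating_system: str
    authorization: str

    def to_step(self) -> str:
        fields = [
            quote(self.name), quote(self.time_stamp),
            quote_list(self.author), quote_list(self.organization),
            quote(self.preprocessor_version), quote(self.originating_system),
            quote(self.authorization),
        ]
        return "FILE_NAME(" + ",".join(fields) + ");"


@dataclass
class FileSchema:
    schema_identifiers: List[str]

    def to_step(self) -> str:
        return f"FILE_SCHEMA({quote_list(self.schema_identifiers)});"


def render_header(description: FileDescription, name: FileName, schema: FileSchema) -> str:
    """Emit the HEADER section with the required entities in their fixed order."""
    lines = ["HEADER;", description.to_step(), name.to_step(), schema.to_step(), "ENDSEC;"]
    return "\n".join(lines)


header = render_header(
    FileDescription(["demo part"], "2;1"),
    FileName("demo.stp", "2023-01-01T12:00:00", ["A. Author"], ["Example Org"], "tool 1.0", "system", ""),
    FileSchema(["CONFIG_CONTROL_DESIGN"]),
)
print(header)
```

Keeping one record per entity makes the positional nature of the format explicit: the order of the dataclass fields is the order of the parameters in the file.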

Optional Properties:

  • schema_population:
      • external_file_identifications: List of lists (triples: URI, optional timestamp, optional message digest).
  • file_population:
      • governing_schema: String (schema name from file_schema).
      • determination_method: String (algorithm for instance selection, e.g., "INCLUDE_ALL_COMPATIBLE").
      • governed_sections: Optional set of strings (data section names).
  • section_language:
      • section: Optional string (data section name; "$" if global).
      • default_language: String (ISO 639-2 code, e.g., "eng").
  • section_context:
      • section: Optional string (data section name; "$" if global).
      • context_identifiers: List of strings (context tags, e.g., conformance info).
  • User-defined entities: Optional custom keywords (e.g., "!CUSTOM_ENTITY") with parameters (implementation-specific).

Other format-level properties:

  • Encoding: UTF-8 (basic alphabet U+0020–U+007E, U+0080–U+10FFFF).
  • File extension: .step or .stp.
  • MIME type: model/step (not formally defined in spec, but common).
  • Magic string: Starts with "ISO-10303-21;".
  • Sections order: HEADER, [ANCHOR], [REFERENCE], {DATA}, "END-ISO-10303-21;", {SIGNATURE}.
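
The magic string and section order above lend themselves to a quick structural check before any real parsing. This is a rough sketch under simplifying assumptions: `check_envelope` is a hypothetical helper, it only looks at the magic string, the terminator, and the HEADER and first DATA sections, and it strips comments before string literals, so a string containing "/*" would confuse it.

```python
import re

MAGIC = "ISO-10303-21;"
TERMINATOR = "END-ISO-10303-21;"


def check_envelope(text):
    """Return a list of problems found in the outer structure of the text."""
    problems = []
    # Remove comments, then blank out string literals, so keywords inside them are ignored
    bare = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
    bare = re.sub(r"'(?:[^']|'')*'", "''", bare)

    if not bare.lstrip().startswith(MAGIC):
        problems.append("does not start with " + MAGIC)
    end_at = bare.find(TERMINATOR)
    if end_at == -1:
        problems.append("missing " + TERMINATOR)
    header_at = bare.find("HEADER;")
    if header_at == -1:
        problems.append("missing HEADER section")

    # A data section opens with DATA; or DATA(...); the word boundary skips names like PRODUCT_DATA
    data = re.search(r"\bDATA\s*[;(]", bare)
    if data and header_at != -1 and data.start() < header_at:
        problems.append("DATA section precedes HEADER")
    if data and end_at != -1 and data.start() > end_at:
        problems.append("DATA section follows the terminator")
    return problems


sample = """ISO-10303-21;
HEADER;
FILE_DESCRIPTION(('demo'),'2;1');
FILE_NAME('demo.stp','2023-01-01T12:00:00',(''),(''),'','','');
FILE_SCHEMA(('CONFIG_CONTROL_DESIGN'));
ENDSEC;
DATA;
ENDSEC;
END-ISO-10303-21;
"""
print(check_envelope(sample))
```

A check like this is cheap enough to run on upload and gives clearer messages than a failed parse further down the line.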

4. Ghost Blog Embedded HTML/JavaScript for Drag-and-Drop .STEP File Properties Dump

Here's an HTML snippet with embedded JavaScript that can be placed in a Ghost blog post (or any HTML page). It creates a drag-and-drop area where users can drop a .STEP file. The script reads the file as text, parses the HEADER section, extracts the properties listed above, and displays them in a readable format. It runs entirely client-side, so no server is needed. Parsing is basic (regular expressions and string manipulation) and assumes well-formed files; it does not cover the full ISO 10303-21 grammar or report errors in detail.

Drag and drop a .STEP file here

5. Python Class for .STEP File Handling

Here's a Python class StepFile that opens a .STEP file, decodes/parses the HEADER properties, prints them to console, and can write a modified version (e.g., updates properties and rewrites the file with the original DATA intact).

import re

class StepFile:
    def __init__(self, filepath):
        self.filepath = filepath
        self.properties = {}
        self.header_raw = ''
        self.rest_of_file = ''
        self.read_and_decode()

    def read_and_decode(self):
        with open(self.filepath, 'r', encoding='utf-8') as f:
            content = f.read()
        
        if not content.startswith('ISO-10303-21;'):
            raise ValueError('Not a valid .STEP file')
        
        header_match = re.search(r'HEADER;(.*?)ENDSEC;', content, re.DOTALL)
        if not header_match:
            raise ValueError('No HEADER section found')
        
        self.header_raw = header_match.group(1)
        # The match already ends just after the header's ENDSEC;, so everything
        # from there on (DATA sections, terminator) is kept verbatim
        self.rest_of_file = content[header_match.end():]
        
        # Remove comments
        header = re.sub(r'/\*.*?\*/', '', self.header_raw, flags=re.DOTALL)
        
        # Find entities
        entities = re.findall(r'([A-Z_!]+)\((.*?)\);', header, re.DOTALL)
        for key, params_str in entities:
            key = key.lower().replace('!', 'user_')
            params = self._parse_params(params_str)
            self.properties[key] = params

    def _parse_params(self, params_str):
        # Split the parameter text on top-level commas, then convert each token.
        # STEP escapes an apostrophe by doubling it (''), so toggling in_string
        # on every apostrophe keeps a whole string, escapes included, in one token.
        tokens = []
        current = ''
        in_string = False
        in_list = 0
        for char in params_str:
            if char == "'":
                in_string = not in_string
                current += char
            elif char == '(' and not in_string:
                in_list += 1
                current += char
            elif char == ')' and not in_string:
                in_list -= 1
                current += char
            elif char == ',' and not in_string and in_list == 0:
                tokens.append(current.strip())
                current = ''
            else:
                current += char
        if current.strip():
            tokens.append(current.strip())
        
        def parse_token(token):
            if token.startswith('(') and token.endswith(')'):
                return self._parse_params(token[1:-1])
            if token.startswith("'") and token.endswith("'"):
                return token[1:-1].replace("''", "'")
            if token == '$':
                return None
            if token == '*':
                return 'derived'
            try:
                return float(token) if '.' in token or 'E' in token.upper() else int(token)
            except ValueError:
                return token
        
        return [parse_token(t) for t in tokens]

    def print_properties(self):
        import json
        print(json.dumps(self.properties, indent=4))

    def write(self, new_filepath, updated_properties=None):
        if updated_properties:
            self.properties.update(updated_properties)
        
        header_lines = []
        for key, params in self.properties.items():
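            # Only a leading 'user_' marks a user-defined keyword (written as !NAME)
            orig_key = '!' + key[5:].upper() if key.startswith('user_') else key.upper()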
            params_str = self._serialize_params(params)
            header_lines.append(f"{orig_key}({params_str});")
        
        new_header = 'ISO-10303-21;\nHEADER;\n' + '\n'.join(header_lines) + '\nENDSEC;'
        with open(new_filepath, 'w', encoding='utf-8') as f:
            f.write(new_header + self.rest_of_file)

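    def _serialize_params(self, params, nested=False):
        if isinstance(params, list):
            joined = ','.join(self._serialize_params(p, nested=True) for p in params)
            # Nested aggregates keep their parentheses; write() wraps the top level
            return f'({joined})' if nested else joined
        if params is None:
            return '$'
        if params == 'derived':
            return '*'
        if isinstance(params, str):
            # Plain concatenation: reusing the same quote type inside an f-string
            # is a syntax error before Python 3.12
            return "'" + params.replace("'", "''") + "'"
        return str(params)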

# Example usage:
# sf = StepFile('example.step')
# sf.print_properties()
# sf.write('modified.step', {'file_name': ['new_name', '2025-11-22T00:00:00', ['author'], ['org'], 'v1', 'sys', 'auth']})

6. Java Class for .STEP File Handling

Here's a Java class StepFile that does the same: opens, parses HEADER properties, prints to console, and writes a modified file.

import java.io.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;
import java.util.regex.*;
import java.util.stream.Collectors;

public class StepFile {
    private String filepath;
    // LinkedHashMap keeps the header entities in the order they appeared in the file
    private Map<String, Object> properties = new LinkedHashMap<>();
    private String headerRaw = "";
    private String restOfFile = "";

    public StepFile(String filepath) {
        this.filepath = filepath;
        readAndDecode();
    }

    private void readAndDecode() {
        try {
            String content = new String(Files.readAllBytes(Paths.get(filepath)), StandardCharsets.UTF_8);
            if (!content.startsWith("ISO-10303-21;")) {
                throw new IllegalArgumentException("Not a valid .STEP file");
            }

            Pattern headerPattern = Pattern.compile("HEADER;(.*?)ENDSEC;", Pattern.DOTALL);
            Matcher headerMatcher = headerPattern.matcher(content);
            if (!headerMatcher.find()) {
                throw new IllegalArgumentException("No HEADER section found");
            }

            headerRaw = headerMatcher.group(1);
            restOfFile = content.substring(headerMatcher.end());

            // Remove comments; (?s) lets '.' match newlines so multi-line comments are covered
            String header = headerRaw.replaceAll("(?s)/\\*.*?\\*/", "");

            // Find entities
            Pattern entityPattern = Pattern.compile("([A-Z_!]+)\\((.*?)\\);", Pattern.DOTALL);
            Matcher entityMatcher = entityPattern.matcher(header);
            while (entityMatcher.find()) {
                // User-defined keywords (!NAME) are stored with a "user_" prefix
                String key = entityMatcher.group(1).toLowerCase(Locale.ROOT).replace("!", "user_");
                String paramsStr = entityMatcher.group(2);
                Object params = parseParams(paramsStr);
                properties.put(key, params);
            }
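        } catch (IOException e) {
            // Fail loudly instead of leaving a half-initialized object behind
            throw new UncheckedIOException("Could not read " + filepath, e);
        }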
    }

    private Object parseParams(String paramsStr) {
        List<Object> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inString = false;
        int inList = 0;
        for (char ch : paramsStr.toCharArray()) {
            if (ch == '\'') {
                // STEP escapes an apostrophe by doubling it (''), so toggling on
                // every apostrophe keeps the whole string in a single token
                inString = !inString;
                current.append(ch);
            } else if (ch == '(' && !inString) {
                inList++;
                current.append(ch);
            } else if (ch == ')' && !inString) {
                inList--;
                current.append(ch);
            } else if (ch == ',' && !inString && inList == 0) {
                tokens.add(current.toString().trim());
                current = new StringBuilder();
            } else {
                current.append(ch);
            }
        }
        if (!current.toString().trim().isEmpty()) {
            tokens.add(current.toString().trim());
        }

        List<Object> result = new ArrayList<>();
        for (Object tokenObj : tokens) {
            String token = (String) tokenObj;
            if (token.startsWith("(") && token.endsWith(")")) {
                result.add(parseParams(token.substring(1, token.length() - 1)));
            } else if (token.startsWith("'") && token.endsWith("'")) {
                result.add(token.substring(1, token.length() - 1).replace("''", "'"));
            } else if (token.equals("$")) {
                result.add(null);
            } else if (token.equals("*")) {
                result.add("derived");
            } else {
                try {
                    if (token.contains(".") || token.toUpperCase().contains("E")) {
                        result.add(Double.parseDouble(token));
                    } else {
                        result.add(Integer.parseInt(token));
                    }
                } catch (NumberFormatException e) {
                    result.add(token);
                }
            }
        }
        // Always return the list: collapsing one-element lists would lose the
        // parentheses of aggregates such as ('CONFIG_CONTROL_DESIGN')
        return result;
    }

    public void printProperties() {
        System.out.println(properties);  // Simple print; use JSON library for pretty print if needed
    }

    public void write(String newFilepath, Map<String, Object> updatedProperties) {
        if (updatedProperties != null) {
            properties.putAll(updatedProperties);
        }

        StringBuilder headerBuilder = new StringBuilder("ISO-10303-21;\nHEADER;\n");
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
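            // Keep underscores intact (FILE_NAME); only a "user_" prefix maps back to '!'
            String key = entry.getKey().toUpperCase(Locale.ROOT);
            if (key.startsWith("USER_")) {
                key = "!" + key.substring("USER_".length());
            }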
            String paramsStr = serializeParams(entry.getValue());
            headerBuilder.append(key).append("(").append(paramsStr).append(");\n");
        }
        headerBuilder.append("ENDSEC;");

        try (FileWriter writer = new FileWriter(newFilepath, StandardCharsets.UTF_8)) {
            writer.write(headerBuilder.toString() + restOfFile);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

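    private String serializeParams(Object params) {
        // Top-level call: write() supplies the outer parentheses
        return serializeParams(params, false);
    }

    private String serializeParams(Object params, boolean nested) {
        if (params instanceof List) {
            List<?> list = (List<?>) params;
            String joined = list.stream().map(p -> serializeParams(p, true)).collect(Collectors.joining(","));
            // Nested aggregates keep their own parentheses
            return nested ? "(" + joined + ")" : joined;
        } else if (params == null) {
            return "$";
        } else if (params.equals("derived")) {
            return "*";
        } else if (params instanceof String) {
            return "'" + ((String) params).replace("'", "''") + "'";
        } else {
            return params.toString();
        }
    }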

    // Example usage:
    // public static void main(String[] args) {
    //     StepFile sf = new StepFile("example.step");
    //     sf.printProperties();
    //     sf.write("modified.step", new HashMap<>() {{ put("file_name", Arrays.asList("new_name", "2025-11-22T00:00:00", Arrays.asList("author"), Arrays.asList("org"), "v1", "sys", "auth")); }});
    // }
}

7. JavaScript Class for .STEP File Handling

Here's a JavaScript (Node.js) class StepFile for the same functionality. Requires fs module.

const fs = require('fs');

class StepFile {
  constructor(filepath) {
    this.filepath = filepath;
    this.properties = {};
    this.headerRaw = '';
    this.restOfFile = '';
    this.readAndDecode();
  }

  readAndDecode() {
    const content = fs.readFileSync(this.filepath, 'utf-8');
    if (!content.startsWith('ISO-10303-21;')) {
      throw new Error('Not a valid .STEP file');
    }

    const headerMatch = content.match(/HEADER;(.*?)ENDSEC;/s);
    if (!headerMatch) {
      throw new Error('No HEADER section found');
    }

    this.headerRaw = headerMatch[1];
    // The match already ends just after the header's ENDSEC;, so keep everything that follows
    this.restOfFile = content.slice(headerMatch.index + headerMatch[0].length);

    // Remove comments
    const header = this.headerRaw.replace(/\/\*.*?\*\//gs, '');

    // Find entities
    const entities = header.match(/([A-Z_!]+)\((.*?)\);/gs) || [];
    entities.forEach((entity) => {
      const match = entity.match(/([A-Z_!]+)\((.*?)\);/s);
      if (match) {
        const key = match[1].toLowerCase().replace('!', 'user_');
        const paramsStr = match[2];
        const params = this.parseParams(paramsStr);
        this.properties[key] = params;
      }
    });
  }

  parseParams(paramsStr) {
    const tokens = [];
    let current = '';
    let inString = false;
    let inList = 0;
    for (const char of paramsStr) {
      if (char === "'") {
        // STEP escapes an apostrophe by doubling it (''), so toggling on every
        // apostrophe keeps the whole string in a single token
        inString = !inString;
        current += char;
      } else if (char === '(' && !inString) {
        inList++;
        current += char;
      } else if (char === ')' && !inString) {
        inList--;
        current += char;
      } else if (char === ',' && !inString && inList === 0) {
        tokens.push(current.trim());
        current = '';
      } else {
        current += char;
      }
    }
    if (current.trim()) tokens.push(current.trim());

    return tokens.map((token) => {
      if (token.startsWith('(') && token.endsWith(')')) return this.parseParams(token.slice(1, -1));
      if (token.startsWith("'") && token.endsWith("'")) return token.slice(1, -1).replace(/''/g, "'");
      if (token === '$') return null;
      if (token === '*') return 'derived';
      // Number() rejects trailing garbage that parseFloat() would silently accept
      const num = Number(token);
      return token === '' || Number.isNaN(num) ? token : num;
    });
  }

  printProperties() {
    console.log(JSON.stringify(this.properties, null, 2));
  }

  write(newFilepath, updatedProperties = {}) {
    Object.assign(this.properties, updatedProperties);

    const headerLines = [];
    for (let [key, params] of Object.entries(this.properties)) {
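      // Only a leading 'user_' marks a user-defined keyword (written as !NAME)
      const origKey = key.startsWith('user_') ? '!' + key.slice(5).toUpperCase() : key.toUpperCase();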
      const paramsStr = this.serializeParams(params);
      headerLines.push(`${origKey}(${paramsStr});`);
    }

    const newHeader = 'ISO-10303-21;\nHEADER;\n' + headerLines.join('\n') + '\nENDSEC;';
    fs.writeFileSync(newFilepath, newHeader + this.restOfFile, 'utf-8');
  }

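  serializeParams(params, nested = false) {
    if (Array.isArray(params)) {
      const joined = params.map((p) => this.serializeParams(p, true)).join(',');
      // Nested aggregates keep their parentheses; write() wraps the top level
      return nested ? `(${joined})` : joined;
    }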
    if (params === null) return '$';
    if (params === 'derived') return '*';
    if (typeof params === 'string') return `'${params.replace(/'/g, "''")}'`;
    return params.toString();
  }
}

// Example usage:
// const sf = new StepFile('example.step');
// sf.printProperties();
// sf.write('modified.step', { file_name: ['new_name', '2025-11-22T00:00:00', ['author'], ['org'], 'v1', 'sys', 'auth'] });

8. C++ Class for .STEP File Handling

Here's a C++ class StepFile (using std::regex for parsing). Compile with C++17 or later, since it relies on std::variant and structured bindings. Handles read, decode, print (to std::cout), and write.

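#include <algorithm>
#include <cctype>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <iterator>
#include <map>
#include <regex>
#include <sstream>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// A type alias cannot refer to itself, so the recursive value type is declared
// as a struct that inherits from the variant and reuses its constructors.
struct PropertyValue;
using PropertyList = std::vector<PropertyValue>;
struct PropertyValue : std::variant<std::nullptr_t, std::string, double, int, PropertyList> {
    using variant::variant;
};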

class StepFile {
private:
    std::string filepath;
    std::map<std::string, PropertyValue> properties;
    std::string header_raw;
    std::string rest_of_file;

    // Strip leading and trailing whitespace only; whitespace inside strings is kept
    static std::string trim(const std::string& s) {
        const char* ws = " \t\r\n";
        size_t first = s.find_first_not_of(ws);
        if (first == std::string::npos) return "";
        size_t last = s.find_last_not_of(ws);
        return s.substr(first, last - first + 1);
    }

    PropertyValue parse_params(const std::string& params_str) {
        // Split on top-level commas. STEP escapes an apostrophe by doubling it (''),
        // so toggling in_string on every apostrophe keeps a string in one token.
        std::vector<std::string> tokens;
        std::string current;
        bool in_string = false;
        int in_list = 0;
        for (char ch : params_str) {
            if (ch == '\'') {
                in_string = !in_string;
                current += ch;
            } else if (ch == '(' && !in_string) {
                ++in_list;
                current += ch;
            } else if (ch == ')' && !in_string) {
                --in_list;
                current += ch;
            } else if (ch == ',' && !in_string && in_list == 0) {
                tokens.push_back(trim(current));
                current.clear();
            } else {
                current += ch;
            }
        }
        if (!trim(current).empty()) {
            tokens.push_back(trim(current));
        }

        std::vector<PropertyValue> result;
        for (const std::string& token : tokens) {
            if (token.size() >= 2 && token.front() == '(' && token.back() == ')') {
                result.push_back(parse_params(token.substr(1, token.size() - 2)));
            } else if (token.size() >= 2 && token.front() == '\'' && token.back() == '\'') {
                std::string str = token.substr(1, token.size() - 2);
                // Collapse each doubled apostrophe exactly once
                size_t pos = 0;
                while ((pos = str.find("''", pos)) != std::string::npos) {
                    str.replace(pos, 2, "'");
                    pos += 1;
                }
                result.emplace_back(str);
            } else if (token == "$") {
                result.emplace_back(nullptr);
            } else if (token == "*") {
                result.emplace_back(std::string("derived"));
            } else {
                try {
                    if (token.find_first_of(".eE") != std::string::npos) {
                        result.emplace_back(std::stod(token));
                    } else {
                        result.emplace_back(std::stoi(token));
                    }
                } catch (...) {
                    result.emplace_back(token);  // enumerations and other bare keywords
                }
            }
        }
        // Always return a list so one-element aggregates keep their structure
        return PropertyValue(result);
    }

    std::string serialize_params(const PropertyValue& params, bool nested = false) {
        if (std::holds_alternative<std::nullptr_t>(params)) {
            return "$";
        } else if (auto str = std::get_if<std::string>(&params)) {
            if (*str == "derived") return "*";
            std::string escaped = *str;
            size_t pos = 0;
            while ((pos = escaped.find("'", pos)) != std::string::npos) {
                escaped.replace(pos, 1, "''");
                pos += 2;
            }
            return "'" + escaped + "'";
        } else if (auto num_d = std::get_if<double>(&params)) {
            return std::to_string(*num_d);
        } else if (auto num_i = std::get_if<int>(&params)) {
            return std::to_string(*num_i);
        } else if (auto list = std::get_if<std::vector<PropertyValue>>(&params)) {
            std::stringstream ss;
            for (size_t i = 0; i < list->size(); ++i) {
                if (i > 0) ss << ",";
                ss << serialize_params((*list)[i], true);
            }
            // Nested aggregates keep their parentheses; write() wraps the top level
            return nested ? "(" + ss.str() + ")" : ss.str();
        }
        return "";
    }

    void print_value(const PropertyValue& val, int indent = 0) {
        if (std::holds_alternative<std::nullptr_t>(val)) {
            std::cout << "null";
        } else if (auto str = std::get_if<std::string>(&val)) {
            std::cout << "\"" << *str << "\"";
        } else if (auto num_d = std::get_if<double>(&val)) {
            std::cout << *num_d;
        } else if (auto num_i = std::get_if<int>(&val)) {
            std::cout << *num_i;
        } else if (auto list = std::get_if<std::vector<PropertyValue>>(&val)) {
            std::cout << "[\n";
            for (const auto& item : *list) {
                std::cout << std::string(indent + 2, ' ');
                print_value(item, indent + 2);
                std::cout << ",\n";
            }
            std::cout << std::string(indent, ' ') << "]";
        }
    }

public:
    StepFile(const std::string& fp) : filepath(fp) {
        read_and_decode();
    }

    void read_and_decode() {
        std::ifstream file(filepath, std::ios::binary);
        if (!file) {
            throw std::runtime_error("Cannot open " + filepath);
        }
        std::string content((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());

        if (content.substr(0, 13) != "ISO-10303-21;") {
            throw std::runtime_error("Not a valid .STEP file");
        }

        // std::regex has no "dotall" option, so the section is located with plain searches
        const std::string open_tag = "HEADER;";
        const std::string close_tag = "ENDSEC;";
        size_t start = content.find(open_tag);
        size_t stop = (start == std::string::npos) ? std::string::npos : content.find(close_tag, start);
        if (stop == std::string::npos) {
            throw std::runtime_error("No HEADER section found");
        }

        header_raw = content.substr(start + open_tag.size(), stop - start - open_tag.size());
        rest_of_file = content.substr(stop + close_tag.size());

        // Remove comments; [\s\S] matches any character, including newlines
        std::regex comment_regex(R"(/\*[\s\S]*?\*/)");
        std::string header = std::regex_replace(header_raw, comment_regex, "");

        // Find entities
        std::regex entity_regex(R"(([A-Z_!]+)\(([\s\S]*?)\);)");
        std::sregex_iterator iter(header.begin(), header.end(), entity_regex);
        std::sregex_iterator end;
        for (; iter != end; ++iter) {
            std::string key = iter->str(1);
            std::transform(key.begin(), key.end(), key.begin(), ::tolower);
            // User-defined keywords (!NAME) are stored with a "user_" prefix
            if (!key.empty() && key[0] == '!') {
                key = "user_" + key.substr(1);
            }
            std::string params_str = iter->str(2);
            PropertyValue params = parse_params(params_str);
            properties[key] = params;
        }
    }

    void print_properties() {
        std::cout << "{\n";
        for (const auto& [key, val] : properties) {
            std::cout << "  \"" << key << "\": ";
            print_value(val, 2);
            std::cout << ",\n";
        }
        std::cout << "}\n";
    }

    void write(const std::string& new_filepath, const std::map<std::string, PropertyValue>& updated_properties = {}) {
        for (const auto& [k, v] : updated_properties) {
            properties[k] = v;
        }

        std::stringstream header_ss;
        header_ss << "ISO-10303-21;\nHEADER;\n";
        for (const auto& [key, params] : properties) {
            std::string orig_key = key;
            std::transform(orig_key.begin(), orig_key.end(), orig_key.begin(), ::toupper);
            // std::replace swaps single characters only; map the prefix back by hand
            if (orig_key.rfind("USER_", 0) == 0) {
                orig_key = "!" + orig_key.substr(5);
            }
            std::string params_str = serialize_params(params);
            header_ss << orig_key << "(" << params_str << ");\n";
        }
        header_ss << "ENDSEC;";

        std::ofstream out(new_filepath, std::ios::binary);
        out << header_ss.str() << rest_of_file;
    }
};

// Example usage:
// int main() {
//     StepFile sf("example.step");
//     sf.print_properties();
//     std::map<std::string, PropertyValue> updates;
//     updates["file_name"] = std::vector<PropertyValue>{"new_name", "2025-11-22T00:00:00", std::vector<PropertyValue>{"author"}, std::vector<PropertyValue>{"org"}, "v1", "sys", "auth"};
//     sf.write("modified.step", updates);
//     return 0;
// }