.CIF

Task 089: .CIF File Format

thinkmelt@protonmail.com

Sep 11, 2025 • 16 min read

1. List of All Properties of the .CIF File Format Intrinsic to Its Format

The .CIF (Crystallographic Information File) format is a standard text-based file format for exchanging crystallographic and structural data, developed by the International Union of Crystallography (IUCr). It is a self-contained, extensible syntax for data representation. Below is a list of its intrinsic properties, based on the IUCr CIF 1.1 syntax specification (the most widely used version; CIF 2.0 adds UTF-8 encoding, triple-quoted strings, and list and table values while keeping the core structure). These properties define the format's structure, syntax, and constraints:

Text-Based and Encoding: Plain text file. CIF 1.1 restricts content to printable ASCII characters plus whitespace (space, tab, and line terminators); CIF 2.0 adopts Unicode encoded as UTF-8. No binary data allowed. Files are human-readable, and tokens are separated by whitespace.
Case Sensitivity: Reserved words (e.g., data_, loop_, save_, global_), data names (tags), and block and frame codes are all case-insensitive; the case of data values is preserved. Data names are often written in lowercase by convention.
Data Blocks: The top-level organizational unit. Starts with a header data_ followed immediately by a block code (1-75 non-blank characters). Multiple blocks per file, uniquely named within the file, separated by whitespace (spaces, tabs, or newlines). An empty file or file with only comments/whitespace is valid.
Global Blocks: global_ is a reserved word inherited from the STAR format, where it introduces a block whose items apply to the data blocks that follow. CIF 1.1 reserves the word but does not permit global_ blocks in data files.
Save Frames: Sub-units within a data block for grouping related items. Starts with save_ followed by a frame code (1-75 non-blank characters), ends with a lone save_ on a new line. No nesting allowed; unique names within a block. Used primarily in CIF dictionaries.
Data Items: Core elements consisting of a tag (data name) and a value. Tags start with _ and contain no whitespace; the whole data name is limited to 75 characters. Each tag appears at most once per block or frame. Syntax: <tag> <whitespace> <value>.
Loops (Tabular Data): For arrays or tables of values. Starts with loop_, followed by one or more tags, then the values, all separated by whitespace. By convention each tag and each row sits on its own line, but rows may wrap across lines; the total number of values must be a multiple of the number of tags. No nested loops in CIF 1.1.
Values Types and Delimiters:
Null Values: . (not applicable/inapplicable) or ? (unknown).
Numeric Values: Integers, floats, or scientific notation (e.g., 1.23, 1.23e-4). Unquoted; must follow numeric grammar to be treated as numbers.
Character Strings: Short strings without whitespace or special starters (unquoted); or delimited if containing whitespace/special chars: single-quoted ('text'), double-quoted ("text"), or triple-quoted in CIF 2.0.
Text Fields (Multiline Strings): For long text or multi-line content. Begins with a semicolon (;) as the first character of a line and ends at the next line that begins with a semicolon. Preserves internal whitespace and newlines.
Whitespace Handling: Tokens are separated by any run of spaces, tabs, or newlines, which is otherwise ignored outside delimited strings and text fields. Line position matters in only a few places: a text-field semicolon must be the first character of its line, a comment ends at the end of its line, and a quoted string may not span lines.
Comments: A # that begins a line or follows whitespace (outside a quoted string or text field) starts a comment that runs to the end of the line. Comments can appear anywhere between tokens. Multi-line comments are not supported.
Character Set Restrictions: Printable non-control characters only (no tabs in tags/values except as whitespace; no unprintable chars). Blank lines are whitespace.
File Constraints: No fixed header/footer, apart from an optional first-line version comment such as #\#CIF_1.1. Lines are limited to 80 characters in CIF 1.0 and 2048 characters in CIF 1.1. Files must be parsable sequentially from start to end.
Extensibility: Relies on external dictionaries (e.g., core CIF dictionary) for semantic meaning of tags/values, but syntax is self-contained.
Error Handling Intrinsic: The syntax describes only well-formed input. How a parser reports malformed content (e.g., a loop whose value count does not match its tags, or duplicate tags) is left to the implementation.
MIME Type and Extension: Typically text/plain or chemical/x-cif; file extension .cif (case-insensitive).
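Versioning: A file may declare its syntax version with a magic comment on the first line (#\#CIF_1.1 or #\#CIF_2.0; required for CIF 2.0). The data items _audit_conform_dict_name and _audit_conform_dict_version (_audit_conform.dict_version in mmCIF) identify the dictionary, and its version, that the data names conform to, not the CIF syntax version.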
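As a rough illustration of the value rules in the list above, the following sketch classifies a single value token. The function name classify_value and the returned labels are illustrative assumptions, not part of any CIF library; the numeric pattern is meant to follow the CIF 1.1 number grammar, including an optional standard uncertainty in parentheses.

```python
import re

# Integer or float, optional exponent, optional standard uncertainty "(n)".
NUMBER_RE = re.compile(
    r'^[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?(?:\(\d+\))?$'
)


def classify_value(token):
    """Return a (kind, text) pair for one CIF value token (hypothetical helper)."""
    if token == '.':
        return ('inapplicable', None)
    if token == '?':
        return ('unknown', None)
    # A token wrapped in matching quotes is always a string, even if it looks numeric.
    if len(token) >= 2 and token[0] == token[-1] and token[0] in '\'"':
        return ('string', token[1:-1])
    if NUMBER_RE.match(token):
        return ('numeric', token)
    return ('string', token)


if __name__ == '__main__':
    for tok in ['1.23e-4', '5.43(2)', '.', '?', "'Fd -3 m'", 'e', '+-']:
        print(tok, '->', classify_value(tok))
```

Tokens such as e or +- contain only "numeric" characters but do not follow the grammar, so they fall through to the string case.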

These properties ensure CIF is flexible, extensible, and suitable for archiving complex scientific data without rigid schemas.
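Because CIF is token-oriented, a reader can first split the file into tokens and only then group them into items and loops. The sketch below is one minimal way to do that for CIF 1.1 data files, under several assumptions: it handles comments, quoted strings, and semicolon text fields, and it ignores save frames and all CIF 2.0 constructs. The names tokenize and parse_blocks, the returned dictionary layout, and the sample values are made up for illustration.

```python
import re

# A quoted string (a quote closes it only before whitespace or end of line),
# a comment running to end of line, or a run of non-blank characters.
TOKEN_RE = re.compile(r"""('(?:[^']|'(?=\S))*'|"(?:[^"]|"(?=\S))*")|(#.*)|(\S+)""")


def tokenize(text):
    """Yield (kind, value) pairs; kind is 'quoted', 'text', or 'bare'."""
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(';'):              # text field opens in column one
            body = [line[1:]]
            i += 1
            while i < len(lines) and not lines[i].startswith(';'):
                body.append(lines[i])
                i += 1
            yield ('text', '\n'.join(body))
            i += 1                            # skip the closing ';' line
            continue
        for quoted, _comment, bare in TOKEN_RE.findall(line):
            if quoted:
                yield ('quoted', quoted[1:-1])
            elif bare:
                yield ('bare', bare)
            # comments are dropped
        i += 1


def parse_blocks(text):
    """Group tokens into {block_code: {'items': {...}, 'loops': [...]}}."""
    tokens = list(tokenize(text))
    blocks, block, pos = {}, None, 0

    def is_tag(tok):
        return tok[0] == 'bare' and tok[1].startswith('_')

    def is_header(tok):
        low = tok[1].lower()
        return tok[0] == 'bare' and (low == 'loop_' or low.startswith('data_'))

    while pos < len(tokens):
        tok = tokens[pos]
        if is_header(tok) and tok[1].lower().startswith('data_'):
            block = blocks.setdefault(tok[1][5:], {'items': {}, 'loops': []})
            pos += 1
            continue
        if block is None:
            raise ValueError(f'{tok[1]!r} appears before any data_ header')
        if is_header(tok):                    # loop_
            pos += 1
            tags = []
            while pos < len(tokens) and is_tag(tokens[pos]):
                tags.append(tokens[pos][1].lower())
                pos += 1
            flat = []
            while (pos < len(tokens) and not is_tag(tokens[pos])
                   and not is_header(tokens[pos])):
                flat.append(tokens[pos][1])
                pos += 1
            if not tags or len(flat) % len(tags) != 0:
                raise ValueError('loop value count is not a multiple of the tag count')
            rows = [flat[r:r + len(tags)] for r in range(0, len(flat), len(tags))]
            block['loops'].append({'tags': tags, 'rows': rows})
        elif is_tag(tok):
            if pos + 1 >= len(tokens) or is_tag(tokens[pos + 1]) or is_header(tokens[pos + 1]):
                raise ValueError(f'missing value for {tok[1]}')
            block['items'][tok[1].lower()] = tokens[pos + 1][1]
            pos += 2
        else:
            raise ValueError(f'unexpected token {tok[1]!r}')
    return blocks


# Made-up sample: two tags share a line and a loop row wraps across lines.
SAMPLE = """\
#\\#CIF_1.1
data_demo
_cell_length_a   5.43   # made-up value
_chemical_name_common 'example compound'
_publ_section_comment
;line one
line two
;
loop_
_atom_site_label _atom_site_fract_x
A1 0.0  A2
0.25
"""

if __name__ == '__main__':
    print(parse_blocks(SAMPLE))
```

Data names are lowercased on the way in because they are case-insensitive; values are kept as strings so that a later step can decide how to interpret them.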

2. Two Direct Download Links for Files of Format .CIF

Diamond (carbon, COD ID 2002916): http://www.crystallography.net/cod/2002916.cif
Sodium Chloride (NaCl, COD ID 1011020): http://www.crystallography.net/cod/1011020.cif

These are sample files from the Crystallography Open Database (COD), a public repository of CIF files.
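Files like these can also be fetched from a script. The sketch below assumes only what the two links above show, namely that COD serves an entry at http://www.crystallography.net/cod/<ID>.cif. The helper names cod_url and fetch_cif are hypothetical, and the download uses Python's standard urllib.

```python
from urllib.request import urlopen

# Base of the two download links listed above.
COD_BASE = 'http://www.crystallography.net/cod'


def cod_url(cod_id):
    """Build the download URL for a COD entry ID (pattern inferred from the links above)."""
    return f'{COD_BASE}/{int(cod_id)}.cif'


def fetch_cif(cod_id, timeout=30):
    """Download one CIF file and return its text (requires network access)."""
    with urlopen(cod_url(cod_id), timeout=timeout) as response:
        return response.read().decode('utf-8', errors='replace')


if __name__ == '__main__':
    print(cod_url(1011020))
```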

3. Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .CIF File Parsing

This is a self-contained HTML snippet with embedded JavaScript that can be pasted into a Ghost blog post (using the HTML card in the editor). It allows users to drag-and-drop a .CIF file onto the designated area. The JS parses the file using a basic line-by-line tokenizer (handling blocks, items, loops, values, etc.) and dumps the extracted properties (e.g., data blocks, tags, values, loops) to a <pre> element on screen. It focuses on format structure, not semantic validation. Writing back is not implemented in JS here (browser security limits file writing; use Node.js for that).

Drag and drop a .CIF file here to parse its properties.

This JS is a basic parser; it identifies and extracts structural properties (blocks, frames, items, loops, comments) and outputs them as JSON. For full text fields, it notes but doesn't buffer multi-line (extend with line scanning for production).

4. Python Class for .CIF File Handling

This Python class (CIFHandler) opens a .CIF file, parses it to extract and print structural properties (the list in part 1 applied to file content: blocks, items, etc.), and supports basic writing by serializing back to a new .CIF file. It uses only the standard library. Constructing the object with a filename already parses the file; read_and_print() parses it again and prints a summary. Typical use: handler = CIFHandler('file.cif'); handler.read_and_print(); handler.write('output.cif').

import re
import sys

class CIFHandler:
    def __init__(self, filename=None):
        self.filename = filename
        self.properties = {
            'data_blocks': [],
            'global_blocks': [],
            'comments': [],
            'save_frames': [],
            'errors': []
        }
        if filename:
            self.read()

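    def read(self):
        # Reset state so a second read() (e.g. via read_and_print) does not duplicate blocks.
        for key in self.properties:
            self.properties[key] = []
        try:
            with open(self.filename, 'r', encoding='utf-8') as f:
                lines = f.read().splitlines()
        except FileNotFoundError:
            self.properties['errors'].append(f'File not found: {self.filename}')
            print("File not found.", file=sys.stderr)
            return

        # A token is a quoted string (a quote closes it only before whitespace
        # or end of line) or a run of non-blank characters.
        token_re = re.compile(r"""'(?:[^']|'(?=\S))*'|"(?:[^"]|"(?=\S))*"|\S+""")
        number_re = re.compile(r'^[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?(?:\(\d+\))?$')

        current_block = None
        current_frame = None
        in_loop = False
        loop_tags = []
        loop_values = []
        pending_tag = None      # tag whose value starts on a later line
        text_buffer = None      # list of lines while inside a ;...; text field
        text_start_line = None

        def container():
            # Items and loops belong to the open save frame, else to the open block.
            return current_frame if current_frame is not None else current_block

        def flush_loop():
            # Store the loop collected so far (if any) and leave loop mode.
            nonlocal in_loop, loop_tags, loop_values
            target = container()
            if in_loop and loop_tags and target is not None:
                target['loops'].append({'tags': loop_tags, 'values': loop_values})
            in_loop, loop_tags, loop_values = False, [], []

        def close_frame():
            nonlocal current_frame
            if current_frame is not None and current_block is not None:
                current_block['save_frames'].append(current_frame)
            current_frame = None

        def classify(value_str):
            if value_str == '.':
                return {'type': 'n/a'}
            if value_str == '?':
                return {'type': 'unknown'}
            if number_re.match(value_str):
                return {'type': 'numeric', 'value': value_str}
            return {'type': 'string', 'value': value_str}  # kept verbatim, quotes included

        for i, line in enumerate(lines, 1):
            # Inside a text field: collect raw lines until one starts with ';'.
            if text_buffer is not None:
                if line.startswith(';'):
                    target = container()
                    if pending_tag is not None and target is not None:
                        target['items'][pending_tag] = {
                            'type': 'text', 'value': '\n'.join(text_buffer),
                            'lines': (text_start_line, i)}
                    else:
                        # e.g. a text field used as a loop value; not stored by this parser
                        self.properties['errors'].append(
                            f'line {text_start_line}: text field without a preceding tag was skipped')
                    pending_tag, text_buffer = None, None
                else:
                    text_buffer.append(line)
                continue
            if line.startswith(';'):
                text_buffer, text_start_line = [line[1:]], i
                continue

            stripped = line.strip()
            if not stripped:
                continue
            if stripped.startswith('#'):
                self.properties['comments'].append({'line': i, 'content': stripped})
                continue
            lower = stripped.lower()

            # Data block header
            if re.match(r'^data_\S+$', lower):
                flush_loop()
                close_frame()
                if current_block:
                    self.properties['data_blocks'].append(current_block)
                current_block = {'name': stripped[5:], 'line': i, 'items': {}, 'loops': [], 'save_frames': []}
                continue

            # global_ is a reserved word; record where it occurs but do not parse further
            if re.match(r'^global_\S*$', lower):
                self.properties['global_blocks'].append(
                    {'name': stripped[7:], 'line': i, 'items': {}, 'loops': []})
                continue

            # Save frame start
            if re.match(r'^save_\S+$', lower):
                flush_loop()
                close_frame()
                if current_block is None:
                    self.properties['errors'].append(f'line {i}: save frame outside a data block')
                else:
                    current_frame = {'name': stripped[5:], 'line': i, 'items': {}, 'loops': []}
                continue

            # Save frame end: a lone save_
            if lower == 'save_':
                flush_loop()
                close_frame()
                continue

            # Loop start
            if lower == 'loop_':
                flush_loop()
                in_loop = True
                continue

            if in_loop:
                # Tags are only accepted before the first value row.
                if stripped.startswith('_') and not loop_values:
                    loop_tags.append({'tag': stripped.split()[0], 'line': i})
                    continue
                if not stripped.startswith('_'):
                    values = []
                    for tok in token_re.findall(stripped):
                        if tok.startswith('#'):  # trailing comment
                            break
                        values.append(tok)
                    loop_values.append({'line': i, 'values': values})
                    continue
                flush_loop()  # a tag after value rows ends the loop; parse it as an item

            # Data item: tag with its value on the same line, or tag alone
            match = re.match(r'^_(\S+)(?:\s+(.*))?$', stripped)
            if match:
                tag, value_str = match.group(1), (match.group(2) or '').strip()
                if not value_str:
                    pending_tag = tag  # value (possibly a text field) follows on a later line
                    continue
                target = container()
                if target is not None:
                    target['items'][tag] = classify(value_str)
                continue

            # A bare value line completing a tag seen on an earlier line
            if pending_tag is not None:
                target = container()
                if target is not None:
                    target['items'][pending_tag] = classify(stripped)
                pending_tag = None
                continue
            self.properties['errors'].append(f'line {i}: unrecognized content')

        flush_loop()
        close_frame()
        if current_block:
            self.properties['data_blocks'].append(current_block)
        print(self.properties)  # Prints to console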

    def write(self, output_filename):
        with open(output_filename, 'w', encoding='utf-8') as f:
            f.write('# Generated CIF\n')
            for block in self.properties['data_blocks']:
                f.write(f'data_{block["name"]}\n')
                for tag, value in block['items'].items():
                    val_str = self._value_to_str(value)
                    f.write(f'_{tag} {val_str}\n')
                for loop in block['loops']:
                    f.write('loop_\n')
                    for ltag in loop['tags']:
                        f.write(f'{ltag["tag"]}\n')
                    for lval in loop['values']:
                        f.write(' '.join(lval['values']) + '\n')
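                for frame in block['save_frames']:
                    f.write(f'save_{frame["name"]}\n')
                    for tag, value in frame['items'].items():
                        f.write(f'_{tag} {self._value_to_str(value)}\n')
                    for loop in frame['loops']:
                        f.write('loop_\n')
                        for ltag in loop['tags']:
                            f.write(f'{ltag["tag"]}\n')
                        for lval in loop['values']:
                            f.write(' '.join(lval['values']) + '\n')
                    f.write('save_\n')  # a lone save_ closes the frame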
        print(f'Written to {output_filename}')

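    def _value_to_str(self, value):
        if value['type'] == 'n/a':
            return '.'
        elif value['type'] == 'unknown':
            return '?'
        elif value['type'] == 'numeric':
            return value['value']
        elif value['type'] == 'text':
            # Multi-line text field: both semicolons must start their own line
            return f'\n;{value["value"]}\n;'
        text = value['value']
        already_quoted = len(text) >= 2 and text[0] == text[-1] and text[0] in '\'"'
        if already_quoted or (text and not re.search(r'\s', text)):
            return text  # stored verbatim by read(), so write it back unchanged
        return f'"{text}"'  # quote values that contain whitespace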

    def read_and_print(self):
        self.read()
        # Additional print for console dump
        print('CIF Properties Dump:')
        print(f'Data Blocks: {len(self.properties["data_blocks"])}')
        for block in self.properties['data_blocks']:
            print(f'  Block {block["name"]}: {len(block["items"])} items, {len(block["loops"])} loops')
        print(f'Comments: {len(self.properties["comments"])}')
        if self.properties['errors']:
            print('Errors:', self.properties['errors'])

Usage: instantiate the class in a script and run it with python script.py; the parsed structure is printed to the console. It handles read/write for basic structures; text fields are supported only in a simplified form.
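Serializing values is where a writer most easily produces invalid CIF, because the right delimiter depends on the content. As a hedged sketch of CIF 1.1 delimiter selection (the function name format_value is hypothetical, and CIF 2.0 triple quotes are not considered), the helper below picks between a bare word, single quotes, double quotes, and a semicolon text field:

```python
import re


def format_value(text):
    """Pick a CIF 1.1 delimiter for a Python string (illustrative helper)."""
    if text == '':
        return "''"
    if '\n' in text:
        if '\n;' in text:
            # A line starting with ';' would close the text field early.
            raise ValueError('text contains a line starting with ";"')
        # Text field: opening and closing ';' each start their own line.
        return f'\n;{text}\n;'
    low = text.lower()
    needs_quotes = (
        re.search(r'\s', text) is not None      # whitespace splits bare words
        or text[0] in '_#$\'";[]'               # characters a bare word may not start with
        or text in ('.', '?')                   # would read back as null values
        or low.startswith(('data_', 'save_'))   # would read back as headers
        or low in ('loop_', 'stop_', 'global_')
    )
    if not needs_quotes:
        return text
    # A quote ends a string only when followed by whitespace, so an embedded
    # apostrophe is fine unless whitespace comes right after it.
    if not re.search(r"'\s", text):
        return f"'{text}'"
    if not re.search(r'"\s', text):
        return f'"{text}"'
    return f'\n;{text}\n;'


if __name__ == '__main__':
    for sample in ['abc', 'Fd -3 m', "it's a test", "a' b", 'two\nlines', '?']:
        print(repr(sample), '->', repr(format_value(sample)))
```

Note that a string that looks like a number (for example '1.23') is written bare and will read back as a number; a caller that needs to keep it a string would have to quote it explicitly.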

5. Java Class for .CIF File Handling

This Java class (CIFHandler) uses BufferedReader for reading .CIF files, parses structural properties, prints to console (System.out), and writes to a new file. Compile with javac CIFHandler.java and run java CIFHandler input.cif output.cif. It mirrors the Python logic.

import java.io.*;
import java.util.*;
import java.util.regex.*;

public class CIFHandler {
    private String filename;
    private Map<String, Object> properties = new HashMap<>();

    public CIFHandler(String filename) {
        this.filename = filename;
        properties.put("dataBlocks", new ArrayList<>());
        properties.put("globalBlocks", new ArrayList<>());
        properties.put("comments", new ArrayList<>());
        properties.put("errors", new ArrayList<>());
        if (filename != null) {
            read();
        }
    }

    public void read() {
        List<Map<String, Object>> dataBlocks = (List<Map<String, Object>>) properties.get("dataBlocks");
        List<Map<String, Object>> comments = (List<Map<String, Object>>) properties.get("comments");
        List<String> errors = (List<String>) properties.get("errors");

        Map<String, Object> currentBlock = null;
        Map<String, Object> currentFrame = null;
        boolean inLoop = false;
        List<Map<String, Object>> loopTags = new ArrayList<>();
        List<Map<String, Object>> loopValues = new ArrayList<>();
        StringBuilder textBuffer = new StringBuilder();
        int textStartLine = -1;

        try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
            String line;
            int lineNum = 0;
            while ((line = br.readLine()) != null) {
                lineNum++;
                String stripped = line.trim();
                if (stripped.startsWith("#")) {
                    comments.add(Map.of("line", lineNum, "content", stripped));
                    continue;
                }

                Pattern dataPat = Pattern.compile("^data_\\S+$", Pattern.CASE_INSENSITIVE);
                if (dataPat.matcher(stripped).matches()) {
                    if (currentBlock != null) {
                        dataBlocks.add(currentBlock);
                    }
                    String blockName = stripped.substring(5).trim();
                    currentBlock = new HashMap<>();
                    currentBlock.put("name", blockName);
                    currentBlock.put("line", lineNum);
                    currentBlock.put("items", new HashMap<String, Map<String, Object>>());
                    currentBlock.put("loops", new ArrayList<>());
                    currentBlock.put("saveFrames", new ArrayList<>());
                    inLoop = false;
                    continue;
                }

                Pattern globalPat = Pattern.compile("^global_\\S+$", Pattern.CASE_INSENSITIVE);
                if (globalPat.matcher(stripped).matches()) {
                    // Similar handling for global
                    continue;
                }

                Pattern saveStartPat = Pattern.compile("^save_\\S+$", Pattern.CASE_INSENSITIVE);
                if (saveStartPat.matcher(stripped).matches()) {
                    if (currentFrame != null) {
                        ((List<Map<String, Object>>) currentBlock.get("saveFrames")).add(currentFrame);
                    }
                    String frameName = stripped.substring(5).trim();
                    currentFrame = new HashMap<>();
                    currentFrame.put("name", frameName);
                    currentFrame.put("line", lineNum);
                    currentFrame.put("items", new HashMap<String, Map<String, Object>>());
                    currentFrame.put("loops", new ArrayList<>());
                    continue;
                }

                if (stripped.equalsIgnoreCase("save_")) {  // a lone save_ closes the frame
                    if (currentFrame != null) {
                        ((List<Map<String, Object>>) currentBlock.get("saveFrames")).add(currentFrame);
                    }
                    currentFrame = null;
                    continue;
                }

                if (stripped.equalsIgnoreCase("loop_")) {
                    inLoop = true;
                    loopTags = new ArrayList<>();
                    loopValues = new ArrayList<>();
                    // Register the loop now; the tag and value lines below fill these lists.
                    Map<String, Object> loop = new HashMap<>();
                    loop.put("tags", loopTags);
                    loop.put("values", loopValues);
                    if (currentFrame != null) {
                        ((List<Map<String, Object>>) currentFrame.get("loops")).add(loop);
                    } else if (currentBlock != null) {
                        ((List<Map<String, Object>>) currentBlock.get("loops")).add(loop);
                    }
                    continue;
                }

                // A tag after value rows ends the loop and is parsed as a data item below.
                if (inLoop && stripped.startsWith("_") && !loopValues.isEmpty()) {
                    inLoop = false;
                }

                if (inLoop && stripped.startsWith("_")) {
                    loopTags.add(Map.of("tag", stripped, "line", lineNum));
                    continue;
                }

                if (inLoop && !stripped.isEmpty()) {
                    // Split into tokens, keeping quoted strings (which may contain spaces) intact.
                    List<String> values = new ArrayList<>();
                    Matcher tok = Pattern.compile("'(?:[^']|'(?=\\S))*'|\"(?:[^\"]|\"(?=\\S))*\"|\\S+").matcher(stripped);
                    while (tok.find()) {
                        if (tok.group().startsWith("#")) break;  // trailing comment
                        values.add(tok.group());
                    }
                    loopValues.add(Map.of("line", lineNum, "values", values));
                    continue;
                }

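                // A tag may carry its value on the same line or on the following line(s).
                Pattern itemPat = Pattern.compile("^_(\\S+)(?:\\s+(.*))?$");
                Matcher matcher = itemPat.matcher(stripped);
                if (matcher.matches()) {
                    String tag = matcher.group(1);
                    String valueStr = matcher.group(2) == null ? "" : matcher.group(2).trim();
                    Map<String, Object> value = new HashMap<>();
                    if (valueStr.isEmpty()) {
                        // Value starts on the next line: a ;...; text field or a plain value.
                        String next = br.readLine();
                        lineNum++;
                        if (next == null) {
                            errors.add("line " + lineNum + ": missing value for _" + tag);
                            break;
                        }
                        if (next.startsWith(";")) {
                            textStartLine = lineNum;
                            textBuffer.setLength(0);
                            textBuffer.append(next.substring(1));
                            // Collect lines until the next one that starts with ';'.
                            while ((next = br.readLine()) != null) {
                                lineNum++;
                                if (next.startsWith(";")) break;
                                textBuffer.append("\n").append(next);
                            }
                            value.put("type", "text");
                            value.put("value", textBuffer.toString());
                        } else {
                            valueStr = next.trim();
                        }
                    }
                    if (value.containsKey("type")) {
                        // text field already captured above
                    } else if (valueStr.equals(".")) {
                        value.put("type", "n/a");
                    } else if (valueStr.equals("?")) {
                        value.put("type", "unknown");
                    } else if (valueStr.matches("^[-+]?(\\d+\\.?\\d*|\\.\\d+)([eE][-+]?\\d+)?(\\(\\d+\\))?$")) {
                        value.put("type", "numeric");
                        value.put("value", valueStr);
                    } else {
                        value.put("type", "string");
                        value.put("value", valueStr);
                    }
                    if (currentFrame != null) {
                        ((Map<String, Map<String, Object>>) currentFrame.get("items")).put(tag, value);
                    } else if (currentBlock != null) {
                        ((Map<String, Map<String, Object>>) currentBlock.get("items")).put(tag, value);
                    }
                    continue;
                }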
            }
            if (currentBlock != null) {
                dataBlocks.add(currentBlock);
            }
        } catch (IOException e) {
            errors.add(e.getMessage());
        }
        printProperties();
    }

    private void printProperties() {
        System.out.println(properties);
    }

    public void write(String outputFilename) {
        try (PrintWriter pw = new PrintWriter(new FileWriter(outputFilename))) {
            pw.println("# Generated CIF");
            List<Map<String, Object>> dataBlocks = (List<Map<String, Object>>) properties.get("dataBlocks");
            for (Map<String, Object> block : dataBlocks) {
                pw.println("data_" + block.get("name"));
                Map<String, Map<String, Object>> items = (Map<String, Map<String, Object>>) block.get("items");
                for (Map.Entry<String, Map<String, Object>> entry : items.entrySet()) {
                    String tag = entry.getKey();
                    Map<String, Object> value = entry.getValue();
                    pw.println("_" + tag + " " + valueToString(value));
                }
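                List<Map<String, Object>> loops = (List<Map<String, Object>>) block.get("loops");
                for (Map<String, Object> loop : loops) {
                    pw.println("loop_");
                    for (Map<String, Object> ltag : (List<Map<String, Object>>) loop.get("tags")) {
                        pw.println(ltag.get("tag"));
                    }
                    for (Map<String, Object> lval : (List<Map<String, Object>>) loop.get("values")) {
                        pw.println(String.join(" ", (List<String>) lval.get("values")));
                    }
                }
                List<Map<String, Object>> saveFrames = (List<Map<String, Object>>) block.get("saveFrames");
                for (Map<String, Object> frame : saveFrames) {
                    pw.println("save_" + frame.get("name"));
                    // Write frame items...
                    pw.println("save_");  // a lone save_ closes the frame
                }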
            }
            System.out.println("Written to " + outputFilename);
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }
    }

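    private String valueToString(Map<String, Object> value) {
        String type = (String) value.get("type");
        if ("n/a".equals(type)) return ".";
        if ("unknown".equals(type)) return "?";
        String text = String.valueOf(value.get("value"));
        if ("numeric".equals(type)) return text;
        // Text fields go back between semicolons that each start their own line.
        if ("text".equals(type)) return "\n;" + text + "\n;";
        boolean quoted = text.length() >= 2 && text.charAt(0) == text.charAt(text.length() - 1)
                && (text.charAt(0) == '\'' || text.charAt(0) == '"');
        // Values are stored verbatim, so delimited or whitespace-free strings are written as-is.
        if (quoted || (!text.isEmpty() && !text.matches(".*\\s.*"))) return text;
        return "\"" + text + "\"";
    }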

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("Usage: java CIFHandler <input.cif> [output.cif]");
            return;
        }
        CIFHandler handler = new CIFHandler(args[0]);
        if (args.length > 1) {
            handler.write(args[1]);
        }
    }
}

This handles read/print/write; console output via printProperties(). Basic error handling for file I/O.

6. JavaScript Class for .CIF File Handling (Node.js)

This Node.js class (CIFHandler) reads .CIF files using fs, parses properties, prints to console (console.log), and writes to a new file using fs.writeFileSync. Run with node script.js input.cif output.cif. Similar logic to Python.

const fs = require('fs');

class CIFHandler {
  constructor(filename = null) {
    this.filename = filename;
    this.properties = {
      dataBlocks: [],
      globalBlocks: [],
      comments: [],
      errors: []
    };
    if (filename) {
      this.read();
    }
  }

  read() {
    try {
      const content = fs.readFileSync(this.filename, 'utf8');
      const lines = content.split('\n');
      let currentBlock = null;
      let currentFrame = null;
      let inLoop = false;
      let loopTags = [];
      let loopValues = [];
      let textBuffer = '';
      let textStartLine = null;

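      let pendingTag = null; // tag whose value starts on a later line
      let inText = false;    // true while inside a ;...; text field
      const tokenRe = /'(?:[^']|'(?=\S))*'|"(?:[^"]|"(?=\S))*"|\S+/g;
      const numberRe = /^[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?(?:\(\d+\))?$/;
      const container = () => currentFrame || currentBlock;
      const classify = (valueStr) => {
        if (valueStr === '.') return { type: 'n/a' };
        if (valueStr === '?') return { type: 'unknown' };
        if (numberRe.test(valueStr)) return { type: 'numeric', value: valueStr };
        return { type: 'string', value: valueStr }; // kept verbatim, quotes included
      };
      const closeFrame = () => {
        if (currentFrame && currentBlock) currentBlock.saveFrames.push(currentFrame);
        currentFrame = null;
      };

      lines.forEach((rawLine, i) => {
        const line = rawLine.replace(/\r$/, ''); // tolerate CRLF line endings
        const lineNum = i + 1;

        // Inside a text field: collect raw lines until one starts with ';'.
        if (inText) {
          if (line.startsWith(';')) {
            const target = container();
            if (pendingTag !== null && target) {
              target.items[pendingTag] = { type: 'text', value: textBuffer, lines: [textStartLine, lineNum] };
            } else {
              this.properties.errors.push(`line ${textStartLine}: text field without a preceding tag was skipped`);
            }
            pendingTag = null;
            inText = false;
          } else {
            textBuffer += '\n' + line;
          }
          return;
        }
        if (line.startsWith(';')) {
          inText = true;
          textStartLine = lineNum;
          textBuffer = line.slice(1);
          return;
        }

        const stripped = line.trim();
        if (!stripped) return;
        if (stripped.startsWith('#')) {
          this.properties.comments.push({ line: lineNum, content: stripped });
          return;
        }

        if (/^data_\S+$/i.test(stripped)) {
          closeFrame();
          if (currentBlock) this.properties.dataBlocks.push(currentBlock);
          currentBlock = { name: stripped.slice(5), line: lineNum, items: {}, loops: [], saveFrames: [] };
          inLoop = false;
          return;
        }

        if (/^global_\S*$/i.test(stripped)) {
          // global_ is a reserved word; record where it occurs but do not parse further
          this.properties.globalBlocks.push({ name: stripped.slice(7), line: lineNum });
          return;
        }

        if (/^save_\S+$/i.test(stripped)) {
          closeFrame();
          inLoop = false;
          if (currentBlock) {
            currentFrame = { name: stripped.slice(5), line: lineNum, items: {}, loops: [] };
          } else {
            this.properties.errors.push(`line ${lineNum}: save frame outside a data block`);
          }
          return;
        }

        if (stripped.toLowerCase() === 'save_') { // a lone save_ closes the frame
          closeFrame();
          inLoop = false;
          return;
        }

        if (stripped.toLowerCase() === 'loop_') {
          inLoop = true;
          loopTags = [];
          loopValues = [];
          // Register the loop now; the tag and value lines below fill these arrays.
          const target = container();
          if (target) target.loops.push({ tags: loopTags, values: loopValues });
          return;
        }

        // A tag after value rows ends the loop and is parsed as a data item below.
        if (inLoop && stripped.startsWith('_') && loopValues.length > 0) inLoop = false;

        if (inLoop && stripped.startsWith('_')) {
          loopTags.push({ tag: stripped, line: lineNum });
          return;
        }

        if (inLoop) {
          const tokens = stripped.match(tokenRe) || [];
          const hash = tokens.findIndex(t => t.startsWith('#')); // trailing comment
          const values = hash === -1 ? tokens : tokens.slice(0, hash);
          loopValues.push({ line: lineNum, values });
          return;
        }

        const match = stripped.match(/^_(\S+)(?:\s+(.*))?$/);
        if (match) {
          const tag = match[1];
          const valueStr = (match[2] || '').trim();
          if (!valueStr) {
            pendingTag = tag; // value (possibly a text field) follows on a later line
            return;
          }
          const target = container();
          if (target) target.items[tag] = classify(valueStr);
          return;
        }

        // A bare value line completing a tag seen on an earlier line
        if (pendingTag !== null) {
          const target = container();
          if (target) target.items[pendingTag] = classify(stripped);
          pendingTag = null;
        }
      });
      closeFrame();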

      if (currentBlock) this.properties.dataBlocks.push(currentBlock);
      this.printProperties();
    } catch (err) {
      this.properties.errors.push(err.message);
      console.error(err.message);
    }
  }

  printProperties() {
    console.log(this.properties);
  }

  write(outputFilename) {
    let output = '# Generated CIF\n';
    this.properties.dataBlocks.forEach(block => {
      output += `data_${block.name}\n`;
      Object.entries(block.items).forEach(([tag, value]) => {
        const valStr = this.valueToString(value);
        output += `_${tag} ${valStr}\n`;
      });
      block.loops.forEach(loop => {
        output += 'loop_\n';
        loop.tags.forEach(ltag => output += `${ltag.tag}\n`);
        loop.values.forEach(lval => output += lval.values.join(' ') + '\n');
      });
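      block.saveFrames.forEach(frame => {
        output += `save_${frame.name}\n`;
        Object.entries(frame.items).forEach(([tag, value]) => {
          output += `_${tag} ${this.valueToString(value)}\n`;
        });
        frame.loops.forEach(loop => {
          output += 'loop_\n';
          loop.tags.forEach(ltag => output += `${ltag.tag}\n`);
          loop.values.forEach(lval => output += lval.values.join(' ') + '\n');
        });
        output += 'save_\n'; // a lone save_ closes the frame
      });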
    });
    fs.writeFileSync(outputFilename, output);
    console.log(`Written to ${outputFilename}`);
  }

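  valueToString(value) {
    const type = value.type;
    if (type === 'n/a') return '.';
    if (type === 'unknown') return '?';
    if (type === 'numeric') return value.value;
    // Text fields go back between semicolons that each start their own line.
    if (type === 'text') return `\n;${value.value}\n;`;
    const text = String(value.value);
    const quoted = text.length >= 2 && text[0] === text[text.length - 1] && (text[0] === "'" || text[0] === '"');
    // Values are stored verbatim, so delimited or whitespace-free strings are written as-is.
    if (quoted || (text !== '' && !/\s/.test(text))) return text;
    return `"${text}"`;
  }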
}

// Usage: node script.js
if (process.argv.length < 3) {
  console.log('Usage: node script.js <input.cif> [output.cif]');
} else {
  const handler = new CIFHandler(process.argv[2]);
  if (process.argv.length > 3) {
    handler.write(process.argv[3]);
  }
}

Console dump via printProperties(). Supports read/write for core structures.

7. C Class (Struct-Based) for .CIF File Handling

This C implementation uses structs for properties, fopen/fgets for reading, parses and prints to stdout (printf), and writes to a file. Compile with gcc cif_handler.c -o cif_handler and run ./cif_handler input.cif output.cif. It is a basic parser with memory managed through malloc/free, and it is deliberately limited: loops are only counted, and save frames and text fields are not parsed.

#define _GNU_SOURCE  // exposes strcasestr() and strdup() on glibc
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Structs for properties
typedef struct {
    int line;
    char* content;
} Comment;

typedef struct {
    char* name;
    int line;
    // Items: simple map simulation with arrays
    char** tags;
    char** values;
    int item_count;
    // Loops and frames simplified as counts for demo
    int loop_count;
    int save_frame_count;
} Block;

typedef struct {
    Block* data_blocks;
    int block_count;
    Comment* comments;
    int comment_count;
    char** errors;
    int error_count;
} CIFProperties;

// Function prototypes
CIFProperties* read_cif(const char* filename);
void print_properties(CIFProperties* props);
void write_cif(CIFProperties* props, const char* output_filename);
void free_properties(CIFProperties* props);
char* value_to_str(const char* val_type, const char* val);  // Simplistic

int main(int argc, char* argv[]) {
    if (argc < 2) {
        printf("Usage: %s <input.cif> [output.cif]\n", argv[0]);
        return 1;
    }
    CIFProperties* props = read_cif(argv[1]);
    print_properties(props);
    if (argc > 2) {
        write_cif(props, argv[2]);
    }
    free_properties(props);
    return 0;
}

CIFProperties* read_cif(const char* filename) {
    CIFProperties* props = malloc(sizeof(CIFProperties));
    props->block_count = 0;
    props->comment_count = 0;
    props->error_count = 0;
    props->data_blocks = NULL;
    props->comments = NULL;
    props->errors = NULL;

    FILE* file = fopen(filename, "r");
    if (!file) {
        char* err = malloc(100);
        snprintf(err, 100, "File not found: %s", filename);
        props->errors = realloc(props->errors, sizeof(char*) * (props->error_count + 1));
        props->errors[props->error_count++] = err;
        return props;
    }

    Block* current_block = NULL;
    int max_blocks = 10;  // Arbitrary max
    props->data_blocks = malloc(sizeof(Block) * max_blocks);
    char line[1024];
    int line_num = 0;
    int in_loop = 0;

    while (fgets(line, sizeof(line), file)) {
        line_num++;
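        char* stripped = line;
        while (*stripped == ' ' || *stripped == '\t') stripped++;  // Trim left
        // Trim right, including the newline and any carriage return from CRLF files
        size_t len = strlen(stripped);
        while (len > 0 && (stripped[len - 1] == ' ' || stripped[len - 1] == '\t' ||
                           stripped[len - 1] == '\r' || stripped[len - 1] == '\n')) {
            stripped[--len] = '\0';
        }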

        if (stripped[0] == '#') {
            props->comments = realloc(props->comments, sizeof(Comment) * (props->comment_count + 1));
            props->comments[props->comment_count].line = line_num;
            props->comments[props->comment_count].content = strdup(stripped);
            props->comment_count++;
            continue;
        }

        // Simple string match for data_
        if (strncmp(stripped, "data_", 5) == 0 && strlen(stripped) > 5 && stripped[5] != ' ') {
            if (current_block) {
                props->data_blocks[props->block_count++] = *current_block;
                free(current_block);  // the struct was copied; its strings and arrays now belong to the array entry
                if (props->block_count >= max_blocks) {
                    max_blocks *= 2;
                    props->data_blocks = realloc(props->data_blocks, sizeof(Block) * max_blocks);
                }
            }
            current_block = malloc(sizeof(Block));
            current_block->name = strdup(stripped + 5);
            current_block->line = line_num;
            current_block->item_count = 0;
            current_block->tags = malloc(sizeof(char*) * 10);  // Max items
            current_block->values = malloc(sizeof(char*) * 10);
            current_block->loop_count = 0;
            current_block->save_frame_count = 0;
            in_loop = 0;
            continue;
        }

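        // Single-line items: a tag starting with '_' followed by whitespace and a value.
        // Loop tag lines (no value) and loop rows are not stored; loops are only counted.
        if (stripped[0] == '_' && strpbrk(stripped, " \t")) {
            if (!current_block) continue;  // item before any data_ header: ignore
            // tags/values start with room for 10 entries; grow by 10 whenever full.
            if (current_block->item_count > 0 && current_block->item_count % 10 == 0) {
                size_t new_cap = (size_t)current_block->item_count + 10;
                current_block->tags = realloc(current_block->tags, sizeof(char*) * new_cap);
                current_block->values = realloc(current_block->values, sizeof(char*) * new_cap);
            }
            char* sep = strpbrk(stripped, " \t");
            *sep++ = '\0';
            while (*sep == ' ' || *sep == '\t') sep++;  // skip padding before the value
            current_block->tags[current_block->item_count] = strdup(stripped);
            current_block->values[current_block->item_count] = strdup(sep);
            current_block->item_count++;
            continue;
        }

        // Loop header: the whole line must be "loop_" (case-insensitive).
        if (strcasestr(stripped, "loop_") == stripped && stripped[5] == '\0') {
            if (current_block) current_block->loop_count++;
            in_loop = 1;
            continue;
        }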

        // Comments/errors etc.
    }
    if (current_block) {
        props->data_blocks[props->block_count++] = *current_block;
        free(current_block);
    }
    fclose(file);
    return props;
}

void print_properties(CIFProperties* props) {
    printf("CIF Properties:\n");
    printf("Data Blocks: %d\n", props->block_count);
    for (int i = 0; i < props->block_count; i++) {
        printf("  Block %s: %d items, %d loops\n", props->data_blocks[i].name, props->data_blocks[i].item_count, props->data_blocks[i].loop_count);
    }
    printf("Comments: %d\n", props->comment_count);
    for (int i = 0; i < props->error_count; i++) {
        printf("Error: %s\n", props->errors[i]);
    }
    // Full dump would print all
}

void write_cif(CIFProperties* props, const char* output_filename) {
    FILE* out = fopen(output_filename, "w");
    if (!out) return;
    fprintf(out, "# Generated CIF\n");
    for (int i = 0; i < props->block_count; i++) {
        Block b = props->data_blocks[i];
        fprintf(out, "data_%s\n", b.name);
        for (int j = 0; j < b.item_count; j++) {
            // Tags are stored with their leading '_' and values verbatim (including any
            // quotes), so both are written back unchanged.
            fprintf(out, "%s %s\n", b.tags[j], b.values[j]);
        }
        // Loops/frames...
    }
    fclose(out);
    printf("Written to %s\n", output_filename);
}

char* value_to_str(const char* val_type, const char* val) {
    if (strcmp(val_type, "n/a") == 0) return ".";
    if (strcmp(val_type, "unknown") == 0) return "?";
    char* quoted = malloc(strlen(val) + 3);
    snprintf(quoted, strlen(val) + 3, "\"%s\"", val);
    return quoted;
}

void free_properties(CIFProperties* props) {
    for (int i = 0; i < props->block_count; i++) {
        Block b = props->data_blocks[i];
        for (int j = 0; j < b.item_count; j++) {
            free(b.tags[j]);
            free(b.values[j]);
        }
        free(b.tags);
        free(b.values);
        free(b.name);
    }
    free(props->data_blocks);
    for (int i = 0; i < props->comment_count; i++) free(props->comments[i].content);
    free(props->comments);
    for (int i = 0; i < props->error_count; i++) free(props->errors[i]);
    free(props->errors);
    free(props);
}

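This C code provides basic parsing (it counts loops and stores simple single-line items), prints to the console, and writes a basic reconstruction. Allocated memory is released in free_properties(); extend the arrays as needed. The parser relies on plain string comparisons rather than regular expressions, so no regex library needs to be linked.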