Task 089: .CIF File Format
1. List of All Properties of the .CIF File Format Intrinsic to Its Format
The .CIF (Crystallographic Information File) format is a standard text-based file format for exchanging crystallographic and structural data, developed by the International Union of Crystallography (IUCr). It is not tied to a specific file system; it is a self-contained, extensible syntax for data representation. The list below summarizes its intrinsic properties, based on the IUCr CIF 1.1 syntax specification (the most widely used version; CIF 2.0 adds UTF-8 encoding, triple-quoted strings, and list and table values, largely to support DDLm dictionaries, while keeping the core structure). These properties define the format's structure, syntax, and constraints:
- Text-Based and Encoding: Plain text. CIF 1.1 restricts files to printable ASCII characters plus tab and the line terminators; CIF 2.0 files are UTF-8 encoded Unicode. No binary data is allowed. Files are human-readable, and tokens are separated by whitespace.
- Case Sensitivity: Reserved words (data_, loop_, save_, global_) and data names (tags) are case-insensitive; data names are often written in lowercase by convention. Data values are case-sensitive.
- Data Blocks: The top-level organizational unit. Each block starts with a header data_ followed immediately by a block code (1-75 non-blank characters). A file may hold multiple blocks, each uniquely named within the file and separated by whitespace (spaces, tabs, or newlines). An empty file, or a file containing only comments and whitespace, is valid.
- Global Blocks: global_ is a reserved word inherited from the STAR syntax, where it opens a block whose items apply to the data blocks that follow. CIF 1.1 reserves the word but does not permit global blocks in CIF files.
- Save Frames: Sub-units within a data block for grouping related items. Each frame starts with save_ followed by a frame code (1-75 non-blank characters) and ends with a lone save_. No nesting is allowed, and frame names are unique within a block. Used primarily in CIF dictionaries.
- Data Items: Core elements consisting of a tag (data name) and a value. Tags start with _ followed by 1-75 non-blank characters (alphanumeric, hyphens, etc.; no spaces). Each tag appears at most once per block or frame. Syntax: <tag> <whitespace> <value>. Because whitespace includes newlines, the value may sit on the line after its tag.
- Loops (Tabular Data): For arrays or tables of values. A loop starts with loop_, followed by one or more tags and then a whitespace-separated stream of values. By convention each tag sits on its own line and each row of values on one line, but the syntax only requires that the total number of values be a multiple of the number of tags. Loops are single-level only (no nesting), and a loop ends at the next tag, keyword, or end of file.
- Value Types and Delimiters:
  - Null Values: . (inapplicable) or ? (unknown).
  - Numeric Values: Integers, floats, or scientific notation (e.g., 1.23, 1.23e-4), optionally followed by a standard uncertainty in parentheses (e.g., 1.234(5)). Unquoted; a value must follow the numeric grammar to be treated as a number.
  - Character Strings: Short strings without whitespace or special leading characters may be unquoted. Strings containing whitespace or special characters are delimited: single-quoted ('text'), double-quoted ("text"), or triple-quoted in CIF 2.0.
  - Text Fields (Multiline Strings): For long or multi-line content. A text field starts with a semicolon (;) as the first character of a line and ends with a semicolon as the first character of a later line. Internal whitespace and newlines are preserved.
- Whitespace Handling: Arbitrary spaces, tabs, and newlines may separate tokens. Whitespace is ignored except within delimited strings and text fields. Line position matters only for text-field delimiters (a semicolon in the first column) and for comments, which run to the end of the line.
- Comments: A # at the start of a line or after whitespace begins a comment that runs to the end of the line. Comments can appear anywhere between tokens. Multi-line comments are not supported.
- Character Set Restrictions: Printable non-control characters only (tabs are allowed only as whitespace, never inside tags or unquoted values). Blank lines count as whitespace.
- File Constraints: No fixed header or footer is required. CIF 1.1 limits lines to 2048 characters (CIF 1.0 limited them to 80, a width many tools still follow). Files must be parsable sequentially from start to end.
- Extensibility: Relies on external dictionaries (e.g., core CIF dictionary) for semantic meaning of tags/values, but syntax is self-contained.
- Error Handling Intrinsic: The syntax specification describes well-formed input only. How a parser reports malformed content (e.g., incomplete loops, duplicate tags) is left to the implementation.
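- MIME Type and Extension: Typically text/plain or chemical/x-cif; file extension .cif (case-insensitive).
- Versioning: The syntax version is signaled by a magic-code comment on the first line (#\#CIF_1.1, optional in CIF 1.1; #\#CIF_2.0, required in CIF 2.0). Files may also include _audit_conform_dict_version (_audit_conform.dict_version in mmCIF) to state which dictionary version they conform to.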
These properties ensure CIF is flexible, extensible, and suitable for archiving complex scientific data without rigid schemas.
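The syntax rules above can be sketched as a small tokenizer and parser. This is a minimal illustration under stated assumptions, not a reference implementation: the names tokenize, parse, loop_rows, and SAMPLE are hypothetical, tags are stored as written (no case folding), save frames are ignored, and the quoting rule assumed is the CIF 1.1 one (a quote closes a string only when followed by whitespace or end of line).

```python
import re

# Token kinds on one line; a quote only closes when followed by whitespace or EOL.
TOKEN_RE = re.compile(r"""
    (?P<comment>\#.*)                          # comment runs to end of line
  | '(?P<sq>(?:[^']|'(?!\s|$))*)'(?=\s|$)      # single-quoted string
  | "(?P<dq>(?:[^"]|"(?!\s|$))*)"(?=\s|$)      # double-quoted string
  | (?P<bare>\S+)                              # keyword, tag, or bare value
""", re.VERBOSE)


def tokenize(text):
    """Yield (kind, value) pairs; kind is 'text', 'quoted', or 'bare'."""
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        rest = lines[i]
        if rest.startswith(';'):               # text field opens in column 1
            body = [rest[1:]]
            i += 1
            while i < len(lines) and not lines[i].startswith(';'):
                body.append(lines[i])
                i += 1
            yield ('text', '\n'.join(body))
            # Tokens may follow the closing semicolon on the same line.
            rest = lines[i][1:] if i < len(lines) else ''
        for m in TOKEN_RE.finditer(rest):
            kind = m.lastgroup
            if kind != 'comment':
                yield ('quoted' if kind in ('sq', 'dq') else 'bare', m.group(kind))
        i += 1


def parse(text):
    """Return {block_code: {'items': {...}, 'loops': [...]}}."""
    blocks, block, tag, loop = {}, None, None, None
    for kind, value in tokenize(text):
        word = value.lower() if kind == 'bare' else ''
        if word.startswith('data_'):
            block = blocks.setdefault(value[5:], {'items': {}, 'loops': []})
            tag = loop = None
        elif block is None:
            raise ValueError(f'content before first data block: {value!r}')
        elif word == 'loop_':
            loop = {'tags': [], 'values': []}
            block['loops'].append(loop)
            tag = None
        elif word.startswith('_'):
            if loop is not None and not loop['values']:
                loop['tags'].append(value)     # still in the loop header
            else:
                tag, loop = value, None        # a tag after values ends the loop
        elif loop is not None:
            loop['values'].append(value)       # flat stream, reshaped later
        elif tag is not None:
            block['items'][tag] = value
            tag = None
        else:
            raise ValueError(f'value without a tag: {value!r}')
    return blocks


def loop_rows(loop):
    """Reshape the flat value stream of a loop into rows."""
    width = len(loop['tags'])
    values = loop['values']
    if width == 0 or len(values) % width:
        raise ValueError('loop values do not fill whole rows')
    return [values[i:i + width] for i in range(0, len(values), width)]


SAMPLE = """\
data_demo
_cell_length_a   5.6402(3)
_chemical_name   'sodium chloride'
_note
;first line
second line
;
loop_
_atom_site_label _atom_site_fract_x
Na1 0.0  Cl1
0.5      # rows may break anywhere
"""

demo = parse(SAMPLE)['demo']
print(demo['items']['_chemical_name'])   # sodium chloride
print(loop_rows(demo['loops'][0]))       # [['Na1', '0.0'], ['Cl1', '0.5']]
```

Treating loop values as a flat stream and reshaping afterwards is what lets rows break across lines, which a one-row-per-line parser would misread.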
2. Two Direct Download Links for Files of Format .CIF
- Diamond (carbon, COD ID 2002916): http://www.crystallography.net/cod/2002916.cif
- Sodium Chloride (NaCl, COD ID 1011020): http://www.crystallography.net/cod/1011020.cif
These are sample files from the Crystallography Open Database (COD), a public repository of CIF files.
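Both links follow the same pattern (base URL, COD ID, .cif suffix), so fetching a sample can be scripted. The sketch below assumes that pattern holds for other entries as well; the names COD_BASE, cod_url, and fetch_cif are hypothetical helpers, not part of any COD client library.

```python
import urllib.request

# Base URL taken from the two links above.
COD_BASE = "http://www.crystallography.net/cod"


def cod_url(cod_id):
    """Build the download URL for a COD entry ID (assumed URL pattern)."""
    return f"{COD_BASE}/{int(cod_id)}.cif"


def fetch_cif(cod_id, timeout=30):
    """Download one entry and return its text. Requires network access."""
    with urllib.request.urlopen(cod_url(cod_id), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Example (not run here because it needs a network connection):
#     text = fetch_cif(1011020)
#     print(text.splitlines()[0])
```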
3. Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .CIF File Parsing
This is a self-contained HTML snippet with embedded JavaScript that can be pasted into a Ghost blog post (using the HTML card in the editor). It allows users to drag and drop a .CIF file onto the designated area. The JS parses the file using a basic line-by-line tokenizer (handling blocks, items, loops, values, etc.) and dumps the extracted properties (e.g., data blocks, tags, values, loops) to a <pre> element on screen. It focuses on format structure, not semantic validation. Writing back is not implemented in JS here (a browser page cannot write to arbitrary local files; use Node.js for that).
Drag and drop a .CIF file here to parse its properties.
This JS is a basic parser; it identifies and extracts structural properties (blocks, frames, items, loops, comments) and outputs them as JSON. It detects the start of a multi-line text field but does not buffer the content; a production parser would need to scan forward to the closing semicolon.
4. Python Class for .CIF File Handling
This Python class (CIFHandler) opens a .CIF file, parses it to extract and print structural properties (similar to the list in part 1, applied to file content: blocks, items, etc.), and supports basic writing by serializing back to a new .CIF file. It uses only the standard library (no external libs needed). Run as handler = CIFHandler('file.cif'); handler.read_and_print(); handler.write('output.cif').
import re
import sys


class CIFHandler:
    # CIF numeric grammar: sign, digits, optional exponent, and an optional
    # standard uncertainty in parentheses, e.g. 1.234(5).
    NUMERIC_RE = re.compile(r'^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?(\(\d+\))?$')
    # A quoted string closes at a quote followed by whitespace or end of line.
    TOKEN_RE = re.compile(r"""'(?:[^']|'(?!\s|$))*'(?=\s|$)|"(?:[^"]|"(?!\s|$))*"(?=\s|$)|\S+""")

    def __init__(self, filename=None):
        self.filename = filename
        self.properties = self._empty_properties()
        if filename:
            self.read()

    @staticmethod
    def _empty_properties():
        return {
            'data_blocks': [],
            'global_blocks': [],
            'comments': [],
            'save_frames': [],
            'errors': []
        }

    def read(self):
        # Start clean so that repeated calls do not duplicate blocks.
        self.properties = self._empty_properties()
        errors = self.properties['errors']
        try:
            with open(self.filename, 'r', encoding='utf-8') as f:
                lines = f.read().splitlines()
        except FileNotFoundError:
            print("File not found.", file=sys.stderr)
            errors.append(f'File not found: {self.filename}')
            return
        current_block = None
        current_frame = None
        loop = None             # loop being collected, or None
        pending_tag = None      # tag whose value sits on a later line
        text_lines = None       # lines of the open text field, or None
        text_start_line = None

        def target():
            # Items and loops belong to the open save frame, else to the block.
            return current_frame if current_frame is not None else current_block

        def store(tag, value, line_no):
            if tag is None or target() is None:
                errors.append(f'line {line_no}: value without a tag or data block')
            else:
                target()['items'][tag] = value

        def close_loop():
            nonlocal loop
            if loop is not None and loop['tags'] and target() is not None:
                count = sum(len(row['values']) for row in loop['values'])
                if count % len(loop['tags']):
                    errors.append(f"line {loop['tags'][0]['line']}: incomplete loop")
                target()['loops'].append(loop)
            loop = None

        def close_frame():
            nonlocal current_frame
            close_loop()
            if current_frame is not None and current_block is not None:
                current_block['save_frames'].append(current_frame)
            current_frame = None

        for i, line in enumerate(lines, 1):
            # Inside a text field: collect lines until a ';' in the first column.
            if text_lines is not None:
                if line.startswith(';'):
                    value = {'type': 'text', 'value': '\n'.join(text_lines),
                             'lines': (text_start_line, i)}
                    store(pending_tag, value, text_start_line)
                    pending_tag = None
                    text_lines = None
                else:
                    text_lines.append(line)
                continue
            stripped = line.strip()
            if not stripped:
                continue
            # Only whole-line comments are recognized; trailing comments are not.
            if stripped.startswith('#'):
                self.properties['comments'].append({'line': i, 'content': stripped})
                continue
            # Text field start (text fields inside loops are not supported here)
            if line.startswith(';'):
                text_lines = [line[1:]]
                text_start_line = i
                continue
            lower = stripped.lower()
            # Data block
            if re.match(r'^data_\S+$', lower):
                close_frame()
                if current_block is not None:
                    self.properties['data_blocks'].append(current_block)
                current_block = {'name': stripped[5:], 'line': i, 'items': {},
                                 'loops': [], 'save_frames': []}
                continue
            # global_ is reserved in CIF 1.1; only record where it occurs
            if lower == 'global_':
                close_frame()
                self.properties['global_blocks'].append({'line': i})
                continue
            # Save frame end (a lone save_)
            if lower == 'save_':
                close_frame()
                continue
            # Save frame start
            if re.match(r'^save_\S+$', lower):
                close_frame()
                current_frame = {'name': stripped[5:], 'line': i, 'items': {}, 'loops': []}
                continue
            # Loop start
            if lower == 'loop_':
                close_loop()
                loop = {'tags': [], 'values': []}
                continue
            if stripped.startswith('_'):
                parts = stripped.split(None, 1)
                # Loop tags: bare tags directly after loop_
                if loop is not None and not loop['values'] and len(parts) == 1:
                    loop['tags'].append({'tag': parts[0], 'line': i})
                    continue
                # Data item; a new tag ends any open loop
                close_loop()
                tag = parts[0][1:]
                if len(parts) == 1:
                    pending_tag = tag
                else:
                    store(tag, self._classify(parts[1].strip()), i)
                continue
            # Anything else is a line of values
            if loop is not None:
                loop['values'].append({'line': i, 'values': self.TOKEN_RE.findall(stripped)})
            elif pending_tag is not None:
                store(pending_tag, self._classify(stripped), i)
                pending_tag = None
            else:
                errors.append(f'line {i}: unexpected content')
        if text_lines is not None:
            errors.append(f'line {text_start_line}: unterminated text field')
        close_frame()
        if current_block is not None:
            self.properties['data_blocks'].append(current_block)

    def _classify(self, value_str):
        if value_str == '.':
            return {'type': 'n/a'}
        if value_str == '?':
            return {'type': 'unknown'}
        if self.NUMERIC_RE.match(value_str):
            return {'type': 'numeric', 'value': value_str}
        if len(value_str) >= 2 and value_str[0] == value_str[-1] and value_str[0] in '\'"':
            value_str = value_str[1:-1]  # Drop the delimiters
        return {'type': 'string', 'value': value_str}

    def write(self, output_filename):
        with open(output_filename, 'w', encoding='utf-8') as f:
            f.write('# Generated CIF\n')
            for block in self.properties['data_blocks']:
                f.write(f'data_{block["name"]}\n')
                self._write_body(f, block)
                for frame in block['save_frames']:
                    f.write(f'save_{frame["name"]}\n')
                    self._write_body(f, frame)
                    f.write('save_\n')
        print(f'Written to {output_filename}')

    def _write_body(self, f, container):
        for tag, value in container['items'].items():
            f.write(f'_{tag} {self._value_to_str(value)}\n')
        for loop in container['loops']:
            f.write('loop_\n')
            for ltag in loop['tags']:
                f.write(f'{ltag["tag"]}\n')
            for lval in loop['values']:
                f.write(' '.join(lval['values']) + '\n')

    def _value_to_str(self, value):
        if value['type'] == 'n/a':
            return '.'
        if value['type'] == 'unknown':
            return '?'
        if value['type'] == 'numeric':
            return value['value']
        text = value['value']
        # Multi-line values, or values holding both quote characters, go in a text field.
        if value['type'] == 'text' or '\n' in text or ('"' in text and "'" in text):
            return f'\n;{text}\n;'
        quote = "'" if '"' in text else '"'
        return f'{quote}{text}{quote}'  # Quoted for safety

    def read_and_print(self):
        self.read()
        print(self.properties)  # Full dump to console
        print('CIF Properties Dump:')
        print(f'Data Blocks: {len(self.properties["data_blocks"])}')
        for block in self.properties['data_blocks']:
            print(f'  Block {block["name"]}: {len(block["items"])} items, {len(block["loops"])} loops')
        print(f'Comments: {len(self.properties["comments"])}')
        if self.properties['errors']:
            print('Errors:', self.properties['errors'])
Usage: instantiate the class in a script and run python script.py; the dump is printed to the console. The class handles read/write for basic structures; text fields are approximated.
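The class keeps numeric values as strings. Crystallographic values often carry a standard uncertainty in parentheses, such as 5.6402(3), which float() cannot read directly. A minimal sketch of the conversion follows; the function name parse_numeric is hypothetical and not part of the class above, and it assumes the uncertainty applies to the last quoted digits of the value.

```python
import re

NUMERIC_RE = re.compile(
    r'^(?P<number>[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)'
    r'(?:\((?P<su>\d+)\))?$'
)


def parse_numeric(text):
    """Return (value, su) for a CIF numeric string, or None if it is not numeric.

    su is the standard uncertainty scaled to the last quoted digit,
    or None when the string has no parenthesized uncertainty.
    """
    m = NUMERIC_RE.match(text)
    if m is None:
        return None
    number = m.group('number')
    value = float(number)
    if m.group('su') is None:
        return value, None
    # Scale the uncertainty by the position of the last digit of the mantissa.
    mantissa, _, exponent = number.lower().partition('e')
    decimals = len(mantissa.partition('.')[2])
    scale = 10.0 ** (int(exponent or 0) - decimals)
    return value, int(m.group('su')) * scale


value, su = parse_numeric('5.6402(3)')
print(value)                   # 5.6402
print(f'{su:.4f}')             # 0.0003
print(parse_numeric('?'))      # None
```

Null markers (. and ?) and quoted strings return None here, so a caller can fall back to treating them as non-numeric values.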
5. Java Class for .CIF File Handling
This Java class (CIFHandler) uses BufferedReader for reading .CIF files, parses structural properties, prints to console (System.out), and writes to a new file. Compile with javac CIFHandler.java and run java CIFHandler input.cif output.cif. It mirrors the Python logic and needs Java 9 or later because it uses Map.of.
import java.io.*;
import java.util.*;
import java.util.regex.*;

public class CIFHandler {
    private static final Pattern DATA_PAT = Pattern.compile("^data_\\S+$", Pattern.CASE_INSENSITIVE);
    private static final Pattern SAVE_START_PAT = Pattern.compile("^save_\\S+$", Pattern.CASE_INSENSITIVE);
    private static final Pattern ITEM_PAT = Pattern.compile("^_(\\S+)(?:\\s+(.*))?$");
    // Numeric grammar with an optional standard uncertainty such as 1.234(5)
    private static final Pattern NUMERIC_PAT =
            Pattern.compile("^[-+]?(\\d+\\.?\\d*|\\.\\d+)([eE][-+]?\\d+)?(\\(\\d+\\))?$");

    private final String filename;
    private final Map<String, Object> properties = new HashMap<>();
    private final List<Map<String, Object>> dataBlocks = new ArrayList<>();
    private final List<Map<String, Object>> globalBlocks = new ArrayList<>();
    private final List<Map<String, Object>> comments = new ArrayList<>();
    private final List<String> errors = new ArrayList<>();

    // Parser state
    private Map<String, Object> currentBlock;
    private Map<String, Object> currentFrame;
    private Map<String, Object> loop;   // loop being collected, or null
    private String pendingTag;          // tag whose value sits on a later line

    public CIFHandler(String filename) {
        this.filename = filename;
        properties.put("dataBlocks", dataBlocks);
        properties.put("globalBlocks", globalBlocks);
        properties.put("comments", comments);
        properties.put("errors", errors);
        if (filename != null) {
            read();
        }
    }

    @SuppressWarnings("unchecked")
    private static List<Map<String, Object>> list(Map<String, Object> owner, String key) {
        return (List<Map<String, Object>>) owner.get(key);
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Map<String, Object>> items(Map<String, Object> owner) {
        return (Map<String, Map<String, Object>>) owner.get("items");
    }

    private static Map<String, Object> newContainer(String name, int line, boolean isBlock) {
        Map<String, Object> c = new HashMap<>();
        c.put("name", name);
        c.put("line", line);
        c.put("items", new LinkedHashMap<String, Map<String, Object>>());
        c.put("loops", new ArrayList<Map<String, Object>>());
        if (isBlock) {
            c.put("saveFrames", new ArrayList<Map<String, Object>>());
        }
        return c;
    }

    public void read() {
        // Start clean so that repeated calls do not duplicate blocks.
        dataBlocks.clear();
        globalBlocks.clear();
        comments.clear();
        errors.clear();
        currentBlock = null;
        currentFrame = null;
        loop = null;
        pendingTag = null;
        StringBuilder textBuffer = null;   // non-null while inside a text field
        int textStartLine = -1;
        try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
            String line;
            int lineNum = 0;
            while ((line = br.readLine()) != null) {
                lineNum++;
                // Inside a text field: collect lines until a ';' in the first column.
                if (textBuffer != null) {
                    if (line.startsWith(";")) {
                        Map<String, Object> value = new HashMap<>();
                        value.put("type", "text");
                        value.put("value", textBuffer.toString());
                        store(pendingTag, value, textStartLine);
                        pendingTag = null;
                        textBuffer = null;
                    } else {
                        textBuffer.append("\n").append(line);
                    }
                    continue;
                }
                String stripped = line.trim();
                if (stripped.isEmpty()) {
                    continue;
                }
                if (stripped.startsWith("#")) {
                    comments.add(Map.of("line", lineNum, "content", stripped));
                    continue;
                }
                // Text field start (text fields inside loops are not supported here)
                if (line.startsWith(";")) {
                    textBuffer = new StringBuilder(line.substring(1));
                    textStartLine = lineNum;
                    continue;
                }
                if (DATA_PAT.matcher(stripped).matches()) {
                    closeFrame();
                    if (currentBlock != null) {
                        dataBlocks.add(currentBlock);
                    }
                    currentBlock = newContainer(stripped.substring(5), lineNum, true);
                    continue;
                }
                // global_ is reserved in CIF 1.1; only record where it occurs
                if (stripped.equalsIgnoreCase("global_")) {
                    closeFrame();
                    globalBlocks.add(Map.of("line", lineNum));
                    continue;
                }
                // Save frame end (a lone save_)
                if (stripped.equalsIgnoreCase("save_")) {
                    closeFrame();
                    continue;
                }
                if (SAVE_START_PAT.matcher(stripped).matches()) {
                    closeFrame();
                    currentFrame = newContainer(stripped.substring(5), lineNum, false);
                    continue;
                }
                if (stripped.equalsIgnoreCase("loop_")) {
                    closeLoop();
                    loop = new HashMap<>();
                    loop.put("tags", new ArrayList<Map<String, Object>>());
                    loop.put("values", new ArrayList<Map<String, Object>>());
                    continue;
                }
                Matcher matcher = ITEM_PAT.matcher(stripped);
                if (matcher.matches()) {
                    String valueStr = matcher.group(2);
                    // Loop tags: bare tags directly after loop_
                    if (loop != null && list(loop, "values").isEmpty() && valueStr == null) {
                        list(loop, "tags").add(Map.of("tag", stripped, "line", lineNum));
                        continue;
                    }
                    closeLoop();   // a new tag ends any open loop
                    if (valueStr == null) {
                        pendingTag = matcher.group(1);
                    } else {
                        store(matcher.group(1), classify(valueStr.trim()), lineNum);
                    }
                    continue;
                }
                // Anything else is a line of values
                if (loop != null) {
                    // Simplistic split: quoted values containing spaces are broken apart
                    list(loop, "values").add(
                            Map.of("line", lineNum, "values", Arrays.asList(stripped.split("\\s+"))));
                } else if (pendingTag != null) {
                    store(pendingTag, classify(stripped), lineNum);
                    pendingTag = null;
                } else {
                    errors.add("line " + lineNum + ": unexpected content");
                }
            }
            if (textBuffer != null) {
                errors.add("line " + textStartLine + ": unterminated text field");
            }
            closeFrame();
            if (currentBlock != null) {
                dataBlocks.add(currentBlock);
            }
        } catch (IOException e) {
            errors.add(e.getMessage());
        }
        printProperties();
    }

    // Items and loops belong to the open save frame, else to the block.
    private Map<String, Object> target() {
        return currentFrame != null ? currentFrame : currentBlock;
    }

    private void store(String tag, Map<String, Object> value, int lineNum) {
        if (tag == null || target() == null) {
            errors.add("line " + lineNum + ": value without a tag or data block");
        } else {
            items(target()).put(tag, value);
        }
    }

    private void closeLoop() {
        if (loop != null && target() != null && !list(loop, "tags").isEmpty()) {
            list(target(), "loops").add(loop);
        }
        loop = null;
    }

    private void closeFrame() {
        closeLoop();
        if (currentFrame != null && currentBlock != null) {
            list(currentBlock, "saveFrames").add(currentFrame);
        }
        currentFrame = null;
    }

    private static Map<String, Object> classify(String valueStr) {
        Map<String, Object> value = new HashMap<>();
        if (valueStr.equals(".")) {
            value.put("type", "n/a");
        } else if (valueStr.equals("?")) {
            value.put("type", "unknown");
        } else if (NUMERIC_PAT.matcher(valueStr).matches()) {
            value.put("type", "numeric");
            value.put("value", valueStr);
        } else {
            int n = valueStr.length();
            if (n >= 2 && (valueStr.charAt(0) == '\'' || valueStr.charAt(0) == '"')
                    && valueStr.charAt(n - 1) == valueStr.charAt(0)) {
                valueStr = valueStr.substring(1, n - 1);   // drop the delimiters
            }
            value.put("type", "string");
            value.put("value", valueStr);
        }
        return value;
    }

    private void printProperties() {
        System.out.println(properties);
    }

    public void write(String outputFilename) {
        try (PrintWriter pw = new PrintWriter(new FileWriter(outputFilename))) {
            pw.println("# Generated CIF");
            for (Map<String, Object> block : dataBlocks) {
                pw.println("data_" + block.get("name"));
                writeBody(pw, block);
                for (Map<String, Object> frame : list(block, "saveFrames")) {
                    pw.println("save_" + frame.get("name"));
                    writeBody(pw, frame);
                    pw.println("save_");
                }
            }
            System.out.println("Written to " + outputFilename);
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }
    }

    @SuppressWarnings("unchecked")
    private void writeBody(PrintWriter pw, Map<String, Object> owner) {
        for (Map.Entry<String, Map<String, Object>> entry : items(owner).entrySet()) {
            pw.println("_" + entry.getKey() + " " + valueToString(entry.getValue()));
        }
        for (Map<String, Object> lp : list(owner, "loops")) {
            pw.println("loop_");
            for (Map<String, Object> tag : list(lp, "tags")) {
                pw.println(tag.get("tag"));
            }
            for (Map<String, Object> row : list(lp, "values")) {
                pw.println(String.join(" ", (List<String>) row.get("values")));
            }
        }
    }

    private String valueToString(Map<String, Object> value) {
        String type = (String) value.get("type");
        if ("n/a".equals(type)) return ".";
        if ("unknown".equals(type)) return "?";
        String text = (String) value.get("value");
        if ("numeric".equals(type)) return text;
        // Multi-line values, or values holding both quote characters, go in a text field.
        if ("text".equals(type) || text.contains("\n")
                || (text.contains("\"") && text.contains("'"))) {
            return "\n;" + text + "\n;";
        }
        String quote = text.contains("\"") ? "'" : "\"";
        return quote + text + quote;
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("Usage: java CIFHandler <input.cif> [output.cif]");
            return;
        }
        CIFHandler handler = new CIFHandler(args[0]);
        if (args.length > 1) {
            handler.write(args[1]);
        }
    }
}
This handles read/print/write; console output goes through printProperties(). Error handling is basic and covers file I/O only.
6. JavaScript Class for .CIF File Handling (Node.js)
This Node.js class (CIFHandler) reads .CIF files using fs, parses properties, prints to console (console.log), and writes to a new file using fs.writeFileSync. Run with node script.js input.cif output.cif. The logic is similar to the Python version.
const fs = require('fs');

// CIF numeric grammar, with an optional standard uncertainty such as 1.234(5)
const NUMERIC_RE = /^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?(\(\d+\))?$/;

class CIFHandler {
  constructor(filename = null) {
    this.filename = filename;
    this.properties = {
      dataBlocks: [],
      globalBlocks: [],
      comments: [],
      errors: []
    };
    if (filename) {
      this.read();
    }
  }

  read() {
    // Start clean so that repeated calls do not duplicate blocks.
    const props = { dataBlocks: [], globalBlocks: [], comments: [], errors: [] };
    this.properties = props;
    try {
      const content = fs.readFileSync(this.filename, 'utf8');
      const lines = content.split(/\r?\n/);
      let currentBlock = null;
      let currentFrame = null;
      let loop = null;          // loop being collected, or null
      let pendingTag = null;    // tag whose value sits on a later line
      let textLines = null;     // lines of the open text field, or null
      let textStartLine = null;

      // Items and loops belong to the open save frame, else to the block.
      const target = () => currentFrame || currentBlock;
      const store = (tag, value, lineNum) => {
        if (tag === null || !target()) {
          props.errors.push(`line ${lineNum}: value without a tag or data block`);
        } else {
          target().items[tag] = value;
        }
      };
      const closeLoop = () => {
        if (loop && loop.tags.length > 0 && target()) target().loops.push(loop);
        loop = null;
      };
      const closeFrame = () => {
        closeLoop();
        if (currentFrame && currentBlock) currentBlock.saveFrames.push(currentFrame);
        currentFrame = null;
      };

      lines.forEach((line, i) => {
        const lineNum = i + 1;
        // Inside a text field: collect lines until a ';' in the first column.
        if (textLines !== null) {
          if (line.startsWith(';')) {
            store(pendingTag, { type: 'text', value: textLines.join('\n') }, textStartLine);
            pendingTag = null;
            textLines = null;
          } else {
            textLines.push(line);
          }
          return;
        }
        const stripped = line.trim();
        if (!stripped) return;
        if (stripped.startsWith('#')) {
          props.comments.push({ line: lineNum, content: stripped });
          return;
        }
        // Text field start (text fields inside loops are not supported here)
        if (line.startsWith(';')) {
          textLines = [line.slice(1)];
          textStartLine = lineNum;
          return;
        }
        const lower = stripped.toLowerCase();
        if (/^data_\S+$/i.test(stripped)) {
          closeFrame();
          if (currentBlock) props.dataBlocks.push(currentBlock);
          currentBlock = { name: stripped.slice(5), line: lineNum, items: {}, loops: [], saveFrames: [] };
          return;
        }
        // global_ is reserved in CIF 1.1; only record where it occurs
        if (lower === 'global_') {
          closeFrame();
          props.globalBlocks.push({ line: lineNum });
          return;
        }
        // Save frame end (a lone save_)
        if (lower === 'save_') {
          closeFrame();
          return;
        }
        if (/^save_\S+$/i.test(stripped)) {
          closeFrame();
          currentFrame = { name: stripped.slice(5), line: lineNum, items: {}, loops: [] };
          return;
        }
        if (lower === 'loop_') {
          closeLoop();
          loop = { tags: [], values: [] };
          return;
        }
        const match = stripped.match(/^_(\S+)(?:\s+(.*))?$/);
        if (match) {
          const [, tag, valueStr] = match;
          // Loop tags: bare tags directly after loop_
          if (loop && loop.values.length === 0 && valueStr === undefined) {
            loop.tags.push({ tag: stripped, line: lineNum });
            return;
          }
          closeLoop(); // a new tag ends any open loop
          if (valueStr === undefined) {
            pendingTag = tag;
          } else {
            store(tag, this.classify(valueStr.trim()), lineNum);
          }
          return;
        }
        // Anything else is a line of values
        if (loop) {
          // Simplistic split: quoted values containing spaces are broken apart
          loop.values.push({ line: lineNum, values: stripped.split(/\s+/) });
        } else if (pendingTag !== null) {
          store(pendingTag, this.classify(stripped), lineNum);
          pendingTag = null;
        } else {
          props.errors.push(`line ${lineNum}: unexpected content`);
        }
      });
      if (textLines !== null) props.errors.push(`line ${textStartLine}: unterminated text field`);
      closeFrame();
      if (currentBlock) props.dataBlocks.push(currentBlock);
      this.printProperties();
    } catch (err) {
      props.errors.push(err.message);
      console.error(err.message);
    }
  }

  classify(valueStr) {
    if (valueStr === '.') return { type: 'n/a' };
    if (valueStr === '?') return { type: 'unknown' };
    if (NUMERIC_RE.test(valueStr)) return { type: 'numeric', value: valueStr };
    const first = valueStr[0];
    if (valueStr.length >= 2 && (first === "'" || first === '"') && valueStr.endsWith(first)) {
      valueStr = valueStr.slice(1, -1); // drop the delimiters
    }
    return { type: 'string', value: valueStr };
  }

  printProperties() {
    console.log(this.properties);
  }

  write(outputFilename) {
    let output = '# Generated CIF\n';
    this.properties.dataBlocks.forEach(block => {
      output += `data_${block.name}\n`;
      output += this.bodyToString(block);
      block.saveFrames.forEach(frame => {
        output += `save_${frame.name}\n`;
        output += this.bodyToString(frame);
        output += 'save_\n';
      });
    });
    fs.writeFileSync(outputFilename, output);
    console.log(`Written to ${outputFilename}`);
  }

  bodyToString(container) {
    let output = '';
    Object.entries(container.items).forEach(([tag, value]) => {
      output += `_${tag} ${this.valueToString(value)}\n`;
    });
    container.loops.forEach(loop => {
      output += 'loop_\n';
      loop.tags.forEach(ltag => { output += `${ltag.tag}\n`; });
      loop.values.forEach(lval => { output += lval.values.join(' ') + '\n'; });
    });
    return output;
  }

  valueToString(value) {
    const type = value.type;
    if (type === 'n/a') return '.';
    if (type === 'unknown') return '?';
    if (type === 'numeric') return value.value;
    const text = value.value;
    // Multi-line values, or values holding both quote characters, go in a text field.
    if (type === 'text' || text.includes('\n') || (text.includes('"') && text.includes("'"))) {
      return `\n;${text}\n;`;
    }
    const quote = text.includes('"') ? "'" : '"';
    return `${quote}${text}${quote}`;
  }
}

// Usage: node script.js <input.cif> [output.cif]
if (process.argv.length < 3) {
  console.log('Usage: node script.js <input.cif> [output.cif]');
} else {
  const handler = new CIFHandler(process.argv[2]);
  if (process.argv.length > 3) {
    handler.write(process.argv[3]);
  }
}
The console dump goes through printProperties(). The class supports read/write for core structures.
7. C Class (Struct-Based) for .CIF File Handling
This C implementation uses structs for properties and fopen/fgets for reading; it parses the file, prints to stdout (printf), and writes to a file. Compile with gcc cif_handler.c -o cif_handler and run ./cif_handler input.cif output.cif. It is a basic parser with memory managed through malloc/realloc/free, limited to core features to keep the C code short.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h> // strcasecmp, strncasecmp (POSIX)

// Structs for properties
typedef struct {
    int line;
    char* content;
} Comment;

typedef struct {
    char* name;
    int line;
    // Items: simple map simulation with parallel arrays
    char** tags;
    char** values;
    int item_count;
    // Loops and frames simplified as counts for demo
    int loop_count;
    int save_frame_count;
} Block;

typedef struct {
    Block* data_blocks;
    int block_count;
    Comment* comments;
    int comment_count;
    char** errors;
    int error_count;
} CIFProperties;

// Function prototypes
CIFProperties* read_cif(const char* filename);
void print_properties(const CIFProperties* props);
void write_cif(const CIFProperties* props, const char* output_filename);
void free_properties(CIFProperties* props);

// realloc that exits on failure, so the demo needs no NULL checks
static void* xrealloc(void* ptr, size_t size) {
    void* p = realloc(ptr, size);
    if (!p) {
        fprintf(stderr, "Out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

static char* xstrdup(const char* s) {
    char* copy = xrealloc(NULL, strlen(s) + 1);
    strcpy(copy, s);
    return copy;
}

static void add_error(CIFProperties* props, const char* msg) {
    props->errors = xrealloc(props->errors, sizeof(char*) * (props->error_count + 1));
    props->errors[props->error_count++] = xstrdup(msg);
}

int main(int argc, char* argv[]) {
    if (argc < 2) {
        printf("Usage: %s <input.cif> [output.cif]\n", argv[0]);
        return 1;
    }
    CIFProperties* props = read_cif(argv[1]);
    print_properties(props);
    if (argc > 2) {
        write_cif(props, argv[2]);
    }
    free_properties(props);
    return 0;
}

CIFProperties* read_cif(const char* filename) {
    CIFProperties* props = xrealloc(NULL, sizeof(CIFProperties));
    *props = (CIFProperties){0};   // all counts 0, all pointers NULL
    FILE* file = fopen(filename, "r");
    if (!file) {
        char msg[256];
        snprintf(msg, sizeof(msg), "File not found: %s", filename);
        add_error(props, msg);
        return props;
    }
    Block* current_block = NULL;   // points into props->data_blocks
    char line[2050];               // CIF 1.1 lines hold at most 2048 characters
    int line_num = 0;
    while (fgets(line, sizeof(line), file)) {
        line_num++;
        char* stripped = line;
        while (*stripped == ' ' || *stripped == '\t') stripped++; // Trim left
        size_t len = strlen(stripped);
        while (len > 0 && isspace((unsigned char)stripped[len - 1])) stripped[--len] = '\0'; // Trim right, including \r\n
        if (len == 0) continue;
        if (stripped[0] == '#') {
            props->comments = xrealloc(props->comments, sizeof(Comment) * (props->comment_count + 1));
            props->comments[props->comment_count].line = line_num;
            props->comments[props->comment_count].content = xstrdup(stripped);
            props->comment_count++;
            continue;
        }
        // Data block header: data_ followed by a block code
        if (strncasecmp(stripped, "data_", 5) == 0 && len > 5) {
            props->data_blocks = xrealloc(props->data_blocks, sizeof(Block) * (props->block_count + 1));
            current_block = &props->data_blocks[props->block_count++];
            current_block->name = xstrdup(stripped + 5);
            current_block->line = line_num;
            current_block->tags = NULL;
            current_block->values = NULL;
            current_block->item_count = 0;
            current_block->loop_count = 0;
            current_block->save_frame_count = 0;
            continue;
        }
        if (!current_block) {
            char msg[64];
            snprintf(msg, sizeof(msg), "Line %d: content before first data block", line_num);
            add_error(props, msg);
            continue;
        }
        // Loops are only counted
        if (strcasecmp(stripped, "loop_") == 0) {
            current_block->loop_count++;
            continue;
        }
        // Save frames are only counted; a lone save_ closes a frame
        if (strncasecmp(stripped, "save_", 5) == 0) {
            if (len > 5) current_block->save_frame_count++;
            continue;
        }
        // Items: a tag and its value on one line
        if (stripped[0] == '_') {
            char* sep = strpbrk(stripped, " \t");
            if (!sep) continue; // bare tag: loop header, or value on a later line (not handled)
            *sep++ = '\0';
            while (*sep == ' ' || *sep == '\t') sep++;
            int n = current_block->item_count;
            current_block->tags = xrealloc(current_block->tags, sizeof(char*) * (n + 1));
            current_block->values = xrealloc(current_block->values, sizeof(char*) * (n + 1));
            current_block->tags[n] = xstrdup(stripped); // keeps the leading underscore
            current_block->values[n] = xstrdup(sep);
            current_block->item_count++;
            continue;
        }
        // Remaining lines (loop rows, text fields, continuation values) are skipped by this demo
    }
    fclose(file);
    return props;
}

void print_properties(const CIFProperties* props) {
    printf("CIF Properties:\n");
    printf("Data Blocks: %d\n", props->block_count);
    for (int i = 0; i < props->block_count; i++) {
        const Block* b = &props->data_blocks[i];
        printf("  Block %s: %d items, %d loops, %d save frames\n", b->name, b->item_count, b->loop_count, b->save_frame_count);
    }
    printf("Comments: %d\n", props->comment_count);
    for (int i = 0; i < props->error_count; i++) {
        printf("Error: %s\n", props->errors[i]);
    }
    // Full dump would print all
}

// Values are stored exactly as read, so most can be written back verbatim.
// Only an unquoted value containing whitespace needs delimiters.
static void write_value(FILE* out, const char* val) {
    int quoted = (val[0] == '\'' || val[0] == '"');
    if (!quoted && strpbrk(val, " \t")) {
        fprintf(out, "\"%s\"", val);
    } else {
        fputs(val, out);
    }
}

void write_cif(const CIFProperties* props, const char* output_filename) {
    FILE* out = fopen(output_filename, "w");
    if (!out) {
        perror(output_filename);
        return;
    }
    fprintf(out, "# Generated CIF\n");
    for (int i = 0; i < props->block_count; i++) {
        const Block* b = &props->data_blocks[i];
        fprintf(out, "data_%s\n", b->name);
        for (int j = 0; j < b->item_count; j++) {
            fprintf(out, "%s ", b->tags[j]); // tag already includes the underscore
            write_value(out, b->values[j]);
            fputc('\n', out);
        }
        // Loops and save frames are only counted by this demo, so they are not written
    }
    fclose(out);
    printf("Written to %s\n", output_filename);
}

void free_properties(CIFProperties* props) {
    for (int i = 0; i < props->block_count; i++) {
        Block* b = &props->data_blocks[i];
        for (int j = 0; j < b->item_count; j++) {
            free(b->tags[j]);
            free(b->values[j]);
        }
        free(b->tags);
        free(b->values);
        free(b->name);
    }
    free(props->data_blocks);
    for (int i = 0; i < props->comment_count; i++) free(props->comments[i].content);
    free(props->comments);
    for (int i = 0; i < props->error_count; i++) free(props->errors[i]);
    free(props->errors);
    free(props);
}
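This C code provides basic parsing (block, loop, and save-frame counts plus simple single-line items), prints to console, and writes a basic reconstruction. All allocations are released in free_properties, and the arrays grow as needed. The code relies on plain string functions rather than regular expressions; the case-insensitive comparisons come from the POSIX <strings.h> header, so no extra library needs to be linked.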