Task 849: .XX File Format
1. Properties of the .XX File Format Intrinsic to Its File System
The .XX file format, as defined by the xx project, is a text-based representation of binary data designed for human readability, editability, and annotation. It allows users to describe binary data using various notations interspersed with comments and artistic elements (e.g., ASCII art or Unicode diagrams). The format has no rigid binary structure, header, or footer; instead, it relies on parsing rules to extract and compile hexadecimal or other representations into binary output. The intrinsic properties include:
- Text-Based Encoding: The format uses plain text (ASCII or Unicode) to represent binary data, making it compatible with standard text editors and file systems without requiring special handling beyond text file support.
- Supported Data Notations:
  - Hexadecimal bytes in pairs (e.g., "414243" for bytes 0x41, 0x42, 0x43).
  - Prefixed hexadecimal (e.g., "0x41", "\x41", "$41", "h41").
  - Separators for hex groups (e.g., spaces, commas, colons between bytes).
  - Binary notation (e.g., "0y01010101" for the single byte 0x55; "0y" prefixes exactly 8 bits).
  - Quoted ASCII strings (e.g., "ABC" converts to "414243"), with support for the escape sequences \n, \t, \r, and \\.
- Comment Support:
  - Single-line comments starting with "#", ";", "%", "|", ESC (0x1B), "-", "/", "--", "//".
  - Multi-line comments delimited by "/*" and "*/".
  - Unicode box-drawing and block-element characters (code points U+2500 to U+259F) treated as comments, so diagrams and art can be drawn around the data.
- Parsing Behavior: Lines are processed sequentially; comments are stripped, ignored characters (e.g., prefixes, separators) are filtered, and valid data representations are concatenated into a binary buffer. Multi-line comments may span several lines. Quoted strings preserve their internal content: spaces and comment characters inside a string are kept as data, not ignored.
- Output Generation: The format compiles to binary data; the reference implementation outputs a file named after the input filename plus a short SHA-256 hash of the binary (e.g., "file.<hash>.bin").
- Version-Specific Features: As of version 0.5, binary notation ("0y") is supported.
- File System Intrinsic Aspects: As a text file, it inherits standard text file properties (e.g., line endings, encoding tolerance), with no embedded metadata, signatures, or compression. It is platform-agnostic, with no dependencies on specific file system features beyond text support.
- Error Handling: Invalid hex or malformed notations are ignored during parsing, ensuring robustness for annotated or artistic files.
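The notation and error-handling rules above can be sketched as a small token converter. This is a minimal illustration, not the reference parser: `token_to_bytes` and `compile_line` are hypothetical names, and the sketch assumes tokens are separated by whitespace and that quoted strings contain no spaces or escape sequences.

```python
import re

# Prefixes and separators that are dropped before hex decoding (assumed set,
# taken from the notation list above).
IGNORED = ("0x", "\\x", "$", "h", ",", ":")


def token_to_bytes(token: str) -> bytes:
    """Convert one whitespace-delimited token to bytes; return b"" if invalid."""
    # Quoted ASCII string: "ABC" -> b"ABC" (escape handling omitted in this sketch).
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1].encode("ascii", errors="ignore")
    # Binary notation: "0y" followed by exactly 8 bits.
    if re.fullmatch(r"0y[01]{8}", token):
        return bytes([int(token[2:], 2)])
    # Hex: strip prefixes and separators, then require whole hex byte pairs.
    cleaned = token
    for junk in IGNORED:
        cleaned = cleaned.replace(junk, "")
    if cleaned and len(cleaned) % 2 == 0 and re.fullmatch(r"[0-9a-fA-F]+", cleaned):
        return bytes.fromhex(cleaned)
    return b""  # malformed input is ignored rather than raising


def compile_line(line: str) -> bytes:
    """Concatenate the bytes of every token on a line."""
    return b"".join(token_to_bytes(t) for t in line.split())


print(compile_line('414243 0x41 $42 h43 0y01010101 "ABC" zz'))
```

The invalid token `zz` contributes nothing, which mirrors the rule that malformed notations are skipped instead of aborting the parse.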
These properties enable the format to integrate seamlessly with file systems while prioritizing flexibility for descriptive binary authoring.
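The output naming rule can be sketched as follows. The helper name `output_name` is hypothetical, and the hash length of 8 hex digits is an assumption taken from the class listings in this document rather than from the format itself.

```python
import hashlib
from pathlib import Path


def output_name(input_path: str, data: bytes, hash_len: int = 8) -> str:
    """Build '<stem>.<short sha256>.bin' for the compiled bytes."""
    # Hash the compiled binary, not the .xx source text.
    digest = hashlib.sha256(data).hexdigest()[:hash_len]
    # Path.stem drops the directory and the final extension: "file.xx" -> "file".
    stem = Path(input_path).stem
    return f"{stem}.{digest}.bin"


print(output_name("file.xx", b""))
```

Because the name depends on the compiled bytes, editing only comments or art in the source leaves the output filename unchanged.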
2. Two Direct Download Links for .XX Files
- https://raw.githubusercontent.com/netspooky/xx/main/examples/elf.xx
- https://raw.githubusercontent.com/netspooky/xx/main/examples/png.xx
3. Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .XX File Dumping
The following is a self-contained HTML snippet with embedded JavaScript suitable for embedding in a Ghost blog post (or any HTML-based platform). It creates a drag-and-drop area where a user can drop a .XX file. Upon dropping, the script parses the file according to the .XX specification and dumps the properties to the screen (displaying the extracted binary as a hex dump, along with detected comments and notations for illustration).
4. Python Class for .XX File Handling
import hashlib


class XXFileHandler:
    """Parse a .xx text file and compile it to binary data."""

    def __init__(self, filename):
        self.filename = filename
        self.binary_data = b""
        # Descriptive summary of the format, printed by print_properties().
        self.properties = {
            "text_based": True,
            "supported_notations": ["hex", "prefixed_hex", "binary", "quoted_strings"],
            "comment_styles": ["single-line", "multi-line", "unicode_box"],
            "no_header": True,
            "version_features": "0.5 (binary support)",
            "parsing_behavior": "concatenated binary extraction"
        }
        self.read_and_decode()

    def read_and_decode(self):
        # Read as UTF-8 so box-drawing characters in comments decode on every platform.
        with open(self.filename, 'r', encoding='utf-8') as f:
            lines = f.readlines()
        self.binary_data = self.parse_xx(lines)

    def parse_xx(self, lines):
        # Simplified parser modeled on the reference implementation.
        out = b""
        multiline_comment = False
        joined_line = ""
        ascii_comments = ["#", ";", "%", "|", "\x1b", "-", "/"]
        two_char_comments = ["--", "//"]
        filter_list = [",", "$", "\\x", "0x", "h", ":", " "]
        escapes = {"n": "\n", "\\": "\\", "t": "\t", "r": "\r"}
        for line in lines:
            multiline_comment, joined_line, line, must_continue = self.filter_multi_line_comments(
                multiline_comment, joined_line, line
            )
            if must_continue:
                continue
            for token in self.tokenize_xx(line):
                if token["is_string"]:
                    # Quoted strings are data even when they begin with a comment character.
                    cleaned = self.ascii_to_hex(token["data"], escapes)
                else:
                    if self.is_comment(token, ascii_comments, two_char_comments):
                        break  # a single-line comment runs to the end of the line
                    cleaned = self.filter_ignored(token["data"], filter_list)
                try:
                    if not token["is_string"] and cleaned.startswith("0y") and len(cleaned) == 10:
                        # Binary notation: "0y" plus exactly 8 bits becomes one hex byte.
                        cleaned = f"{int(cleaned[2:], 2):02x}"
                    out += bytes.fromhex(cleaned)
                except ValueError:
                    pass  # invalid hex or binary notation is ignored
        return out

    def filter_multi_line_comments(self, multiline_comment, joined_line, line):
        # Strip /* ... */ spans; data seen before an unclosed comment is carried
        # over in joined_line until the comment ends on a later line.
        line_result = joined_line
        joined_line = ""
        must_continue = False
        while line:
            if multiline_comment:
                if "*/" in line:
                    parts = line.split("*/", 1)
                    line = parts[1] if len(parts) > 1 else ""
                    multiline_comment = False
                else:
                    joined_line += line_result
                    must_continue = True
                    break
            else:
                if "/*" in line:
                    parts = line.split("/*", 1)
                    line_result += parts[0]
                    line = parts[1] if len(parts) > 1 else ""
                    multiline_comment = True
                else:
                    line_result += line
                    break
        return multiline_comment, joined_line, line_result, must_continue

    def tokenize_xx(self, line):
        tokens = []
        buf = ""
        verbatim = False   # True while inside a quoted string
        is_escape = False  # True right after a backslash inside a string
        for c in line.strip():
            if is_escape:
                # Keep the escape sequence raw; ascii_to_hex() resolves it later.
                buf += "\\" + c
                is_escape = False
                continue
            if c == "\\" and verbatim:
                is_escape = True
                continue
            if c == '"':
                if buf:
                    # Flush pending text: string data on a closing quote,
                    # plain data on an opening quote.
                    tokens.append({"data": buf, "is_string": verbatim})
                    buf = ""
                verbatim = not verbatim
                continue
            if c.isspace() and not verbatim:
                if buf:
                    tokens.append({"data": buf, "is_string": False})
                    buf = ""
                continue
            buf += c
        if buf:
            # An unterminated string still counts as string data.
            tokens.append({"data": buf, "is_string": verbatim})
        return tokens

    def is_comment(self, token, ascii_comments, two_char_comments):
        data = token["data"]
        if not data:
            return False
        if data[0] in ascii_comments or (len(data) > 1 and data[:2] in two_char_comments):
            return True
        # Box-drawing and block-element characters (U+2500..U+259F) count as comments.
        return 0x2500 <= ord(data[0]) < 0x25A0

    def filter_ignored(self, text, filter_list):
        # Remove hex prefixes and separators so only the digits remain.
        for f in filter_list:
            text = text.replace(f, "")
        return text
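
    def ascii_to_hex(self, string, escapes):
        hex_str = ""
        i = 0
        while i < len(string):
            if string[i] == "\\" and i + 1 < len(string):
                next_c = string[i + 1]
                # Known escapes map to one character; unknown ones keep the backslash.
                resolved = escapes.get(next_c, "\\" + next_c)
                i += 2
            else:
                resolved = string[i]
                i += 1
            # Encode as UTF-8 so every character yields whole hex byte pairs.
            hex_str += resolved.encode("utf-8").hex()
        return hex_str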

    def write(self, output_filename=None):
        if not output_filename:
            # Default name: <input without .xx>.<first 8 hex digits of SHA-256>.bin
            short_hash = hashlib.sha256(self.binary_data).hexdigest()[:8]
            base = self.filename[:-3] if self.filename.endswith(".xx") else self.filename
            output_filename = f"{base}.{short_hash}.bin"
        with open(output_filename, 'wb') as f:
            f.write(self.binary_data)
        return output_filename

    def print_properties(self):
        print("Properties of the .XX File Format:")
        for key, value in self.properties.items():
            print(f"{key}: {value}")
        print("\nCompiled Binary (Hex Dump):")
        self.hex_dump(self.binary_data)

    def hex_dump(self, data):
        # 16 bytes per row: offset, hex pairs, then printable ASCII.
        offset = 0
        while offset < len(data):
            chunk = data[offset:offset + 16]
            hex_str = " ".join(f"{b:02x}" for b in chunk)
            ascii_str = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
            print(f"{offset:08x}: {hex_str.ljust(47)} {ascii_str}")
            offset += 16

# Example usage:
# handler = XXFileHandler("example.xx")
# handler.print_properties()
# handler.write()
5. Java Class for .XX File Handling
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.*;
public class XXFileHandler {
private String filename;
private byte[] binaryData;
private Map<String, Object> properties;
public XXFileHandler(String filename) {
this.filename = filename;
this.properties = new HashMap<>();
properties.put("text_based", true);
properties.put("supported_notations", Arrays.asList("hex", "prefixed_hex", "binary", "quoted_strings"));
properties.put("comment_styles", Arrays.asList("single-line", "multi-line", "unicode_box"));
properties.put("no_header", true);
properties.put("version_features", "0.5 (binary support)");
properties.put("parsing_behavior", "concatenated binary extraction");
readAndDecode();
}
private void readAndDecode() {
List<String> lines = new ArrayList<>();
// Read as UTF-8 (FileReader with a Charset requires Java 11 or later).
try (BufferedReader reader = new BufferedReader(new FileReader(filename, StandardCharsets.UTF_8))) {
String line;
while ((line = reader.readLine()) != null) {
lines.add(line);
}
} catch (IOException e) {
e.printStackTrace();
}
binaryData = parseXX(lines);
}
private byte[] parseXX(List<String> lines) {
ByteArrayOutputStream out = new ByteArrayOutputStream();
boolean multilineComment = false;
String joinedLine = "";
String[] asciiComments = {"#", ";", "%", "|", "\u001B", "-", "/"};
String[] twoCharComments = {"--", "//"};
String[] filterList = {",", "$", "\\x", "0x", "h", ":", " "};
Map<String, String> escapes = new HashMap<>();
escapes.put("n", "\n");
escapes.put("\\", "\\");
escapes.put("t", "\t");
escapes.put("r", "\r");
for (String line : lines) {
Object[] result = filterMultiLineComments(multilineComment, joinedLine, line);
multilineComment = (boolean) result[0];
joinedLine = (String) result[1];
line = (String) result[2];
boolean mustContinue = (boolean) result[3];
if (mustContinue) continue;
List<Map<String, Object>> tokens = tokenizeXX(line, escapes);
String lineHex = "";
for (Map<String, Object> token : tokens) {
String data = (String) token.get("data");
boolean isString = (boolean) token.get("isString");
String cleaned;
if (isString) {
// Quoted strings are data even when they begin with a comment character.
cleaned = asciiToHex(data, escapes);
} else {
// A single-line comment runs to the end of the line.
if (isComment(data, asciiComments, twoCharComments)) break;
cleaned = filterIgnored(data, filterList);
if (cleaned.startsWith("0y") && cleaned.length() == 10) {
try {
cleaned = String.format("%02x", Integer.parseInt(cleaned.substring(2), 2));
} catch (NumberFormatException e) {
continue; // malformed binary notation is ignored
}
}
}
try {
byte[] bytes = hexStringToByteArray(cleaned);
out.write(bytes);
lineHex += cleaned;
} catch (Exception ignored) {}
}
}
return out.toByteArray();
}
private Object[] filterMultiLineComments(boolean multilineComment, String joinedLine, String line) {
String lineResult = joinedLine;
joinedLine = "";
boolean mustContinue = false;
while (!line.isEmpty()) {
if (multilineComment) {
if (line.contains("*/")) {
String[] parts = line.split("\\*/", 2);
line = parts.length > 1 ? parts[1] : "";
multilineComment = false;
} else {
joinedLine += lineResult;
mustContinue = true;
break;
}
} else {
if (line.contains("/*")) {
String[] parts = line.split("\\/\\*", 2);
lineResult += parts[0];
line = parts.length > 1 ? parts[1] : "";
multilineComment = true;
} else {
lineResult += line;
break;
}
}
}
return new Object[]{multilineComment, joinedLine, lineResult, mustContinue};
}
private List<Map<String, Object>> tokenizeXX(String line, Map<String, String> escapes) {
List<Map<String, Object>> tokens = new ArrayList<>();
StringBuilder buf = new StringBuilder();
boolean verbatim = false;
boolean isEscape = false;
boolean isString = false;
line = line.trim();
for (char c : line.toCharArray()) {
if (c == '\\' && verbatim && !isEscape) {
isEscape = true;
continue;
}
if (isEscape) {
// Keep the escape sequence raw; asciiToHex() resolves it later.
buf.append('\\').append(c);
isEscape = false;
continue;
}
if (c == '"') {
verbatim = !verbatim;
isString = true;
if (!verbatim && buf.length() > 0) {
Map<String, Object> token = new HashMap<>();
token.put("data", buf.toString());
token.put("rawData", buf.toString());
token.put("isString", isString);
tokens.add(token);
isString = false;
}
buf = new StringBuilder();
continue;
}
if (c == ' ' && !verbatim) {
if (buf.length() > 0) {
Map<String, Object> token = new HashMap<>();
token.put("data", buf.toString());
token.put("isString", isString);
tokens.add(token);
}
buf = new StringBuilder();
continue;
}
buf.append(c);
}
if (buf.length() > 0) {
Map<String, Object> token = new HashMap<>();
token.put("data", buf.toString());
token.put("isString", isString);
tokens.add(token);
}
return tokens;
}
private boolean isComment(String data, String[] asciiComments, String[] twoCharComments) {
if (data.isEmpty()) return false;
char first = data.charAt(0);
if (Arrays.asList(asciiComments).contains(String.valueOf(first)) ||
(data.length() > 1 && Arrays.asList(twoCharComments).contains(data.substring(0, 2)))) {
return true;
}
int code = (int) first;
return code >= 9472 && code < 9632;
}
private String filterIgnored(String text, String[] filterList) {
for (String f : filterList) {
text = text.replace(f, "");
}
return text;
}
private String asciiToHex(String str, Map<String, String> escapes) {
StringBuilder hex = new StringBuilder();
for (int i = 0; i < str.length(); ) {
if (str.charAt(i) == '\\' && i + 1 < str.length()) {
char next = str.charAt(i + 1);
String esc = escapes.getOrDefault(String.valueOf(next), "\\" + next);
// An unknown escape keeps both the backslash and the following character.
for (char e : esc.toCharArray()) hex.append(String.format("%02x", (int) e));
i += 2;
} else {
hex.append(String.format("%02x", (int) str.charAt(i)));
i++;
}
}
return hex.toString();
}
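private byte[] hexStringToByteArray(String hex) {
// Reject odd lengths and non-hex characters so the caller can skip the token.
if (hex.length() % 2 != 0 || !hex.matches("[0-9a-fA-F]*")) {
throw new IllegalArgumentException("not a hex string: " + hex);
}
int len = hex.length();
byte[] data = new byte[len / 2];
for (int i = 0; i < len; i += 2) {
data[i / 2] = (byte) ((Character.digit(hex.charAt(i), 16) << 4) + Character.digit(hex.charAt(i + 1), 16));
}
return data;
}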
public String write(String outputFilename) throws NoSuchAlgorithmException, IOException {
if (outputFilename == null) {
MessageDigest md = MessageDigest.getInstance("SHA-256");
byte[] hash = md.digest(binaryData);
String shortHash = bytesToHex(hash).substring(0, 8);
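// Strip only a trailing ".xx" so an input without that suffix is never overwritten.
String base = filename.endsWith(".xx") ? filename.substring(0, filename.length() - 3) : filename;
outputFilename = base + "." + shortHash + ".bin";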
}
try (FileOutputStream fos = new FileOutputStream(outputFilename)) {
fos.write(binaryData);
}
return outputFilename;
}
private String bytesToHex(byte[] bytes) {
StringBuilder sb = new StringBuilder();
for (byte b : bytes) {
sb.append(String.format("%02x", b));
}
return sb.toString();
}
public void printProperties() {
System.out.println("Properties of the .XX File Format:");
properties.forEach((key, value) -> System.out.println(key + ": " + value));
System.out.println("\nCompiled Binary (Hex Dump):");
hexDump(binaryData);
}
private void hexDump(byte[] data) {
for (int offset = 0; offset < data.length; offset += 16) {
StringBuilder hex = new StringBuilder();
StringBuilder ascii = new StringBuilder();
int end = Math.min(offset + 16, data.length);
for (int i = offset; i < end; i++) {
hex.append(String.format("%02x ", data[i]));
ascii.append((data[i] >= 32 && data[i] < 127) ? (char) data[i] : '.');
}
System.out.printf("%08x: %-47s %s%n", offset, hex.toString(), ascii.toString());
}
}
// Example usage:
// public static void main(String[] args) throws Exception {
// XXFileHandler handler = new XXFileHandler("example.xx");
// handler.printProperties();
// handler.write(null);
// }
}
6. JavaScript Class for .XX File Handling
class XXFileHandler {
constructor(filename) {
this.filename = filename;
this.binaryData = new Uint8Array();
this.properties = {
text_based: true,
supported_notations: ['hex', 'prefixed_hex', 'binary', 'quoted_strings'],
comment_styles: ['single-line', 'multi-line', 'unicode_box'],
no_header: true,
version_features: '0.5 (binary support)',
parsing_behavior: 'concatenated binary extraction'
};
this.readAndDecode();
}
async readAndDecode() {
// Assuming Node.js environment with fs module
const fs = require('fs');
const content = fs.readFileSync(this.filename, 'utf8');
const lines = content.split('\n');
this.binaryData = this.parseXX(lines);
}
parseXX(lines) {
let out = new Uint8Array();
let multilineComment = false;
let joinedLine = '';
const asciiComments = ['#', ';', '%', '|', '\x1B', '-', '/'];
const twoCharComments = ['--', '//'];
const filterList = [',', '$', '\\x', '0x', 'h', ':', ' '];
const escapes = { n: '\n', '\\': '\\', t: '\t', r: '\r' };
for (let line of lines) {
let mustContinue;
({ multilineComment, joinedLine, line, mustContinue } = this.filterMultiLineComments(multilineComment, joinedLine, line));
if (mustContinue) continue;
const tokens = this.tokenizeXX(line, escapes);
let lineHex = '';
for (let token of tokens) {
let cleaned;
if (token.isString) {
// Quoted strings are data even when they begin with a comment character.
cleaned = this.asciiToHex(token.data, escapes);
} else {
// A single-line comment runs to the end of the line.
if (this.isComment(token.data, asciiComments, twoCharComments)) break;
cleaned = this.filterIgnored(token.data, filterList);
if (/^0y[01]{8}$/.test(cleaned)) {
cleaned = parseInt(cleaned.slice(2), 2).toString(16).padStart(2, '0');
}
}
try {
const bytes = this.hexToBytes(cleaned);
out = new Uint8Array([...out, ...bytes]);
lineHex += cleaned;
} catch {}
}
}
return out;
}
filterMultiLineComments(multilineComment, joinedLine, line) {
let lineResult = joinedLine;
joinedLine = '';
let mustContinue = false;
while (line.length > 0) {
if (multilineComment) {
if (line.includes('*/')) {
const parts = line.split('*/');
line = parts.slice(1).join('*/');
multilineComment = false;
} else {
joinedLine += lineResult;
mustContinue = true;
break;
}
} else {
if (line.includes('/*')) {
const parts = line.split('/*');
lineResult += parts[0];
line = parts.slice(1).join('/*');
multilineComment = true;
} else {
lineResult += line;
break;
}
}
}
return { multilineComment, joinedLine, line: lineResult, mustContinue };
}
tokenizeXX(line, escapes) {
const tokens = [];
let buf = '';
let verbatim = false;
let isEscape = false;
let isString = false;
line = line.trim();
for (let c of line) {
if (c === '\\' && verbatim && !isEscape) {
isEscape = true;
continue;
}
if (isEscape) {
// Keep the escape sequence raw; asciiToHex() resolves it later.
buf += '\\' + c;
isEscape = false;
continue;
}
if (c === '"') {
verbatim = !verbatim;
isString = true;
if (!verbatim && buf) {
tokens.push({ data: buf, isString });
isString = false;
}
buf = '';
continue;
}
if (c === ' ' && !verbatim) {
if (buf) tokens.push({ data: buf, isString });
buf = '';
continue;
}
buf += c;
}
if (buf) tokens.push({ data: buf, isString });
return tokens;
}
isComment(data, asciiComments, twoCharComments) {
if (!data) return false;
if (asciiComments.includes(data[0]) || (data.length > 1 && twoCharComments.includes(data.slice(0, 2)))) return true;
const code = data.charCodeAt(0);
return code >= 9472 && code < 9632;
}
filterIgnored(text, filterList) {
// Plain substring removal; a RegExp would treat "$" as an anchor rather than a literal.
filterList.forEach(f => { text = text.split(f).join(''); });
return text;
}
asciiToHex(str, escapes) {
let hex = '';
for (let i = 0; i < str.length; ) {
if (str[i] === '\\' && i + 1 < str.length) {
const next = str[i + 1];
const esc = escapes[next] || '\\' + next;
// An unknown escape keeps both the backslash and the following character.
for (const ch of esc) hex += ch.charCodeAt(0).toString(16).padStart(2, '0');
i += 2;
} else {
hex += str.charCodeAt(i).toString(16).padStart(2, '0');
i++;
}
}
return hex;
}
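hexToBytes(hex) {
// Throw on odd lengths or non-hex characters so the caller can skip the token;
// otherwise parseInt would yield NaN, which a Uint8Array stores as 0.
if (hex.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(hex)) {
throw new Error(`not a hex string: ${hex}`);
}
const bytes = [];
for (let i = 0; i < hex.length; i += 2) {
bytes.push(parseInt(hex.slice(i, i + 2), 16));
}
return bytes;
}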
async write(outputFilename) {
const crypto = require('crypto');
if (!outputFilename) {
const hash = crypto.createHash('sha256').update(Buffer.from(this.binaryData)).digest('hex').slice(0, 8);
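// Strip only a trailing ".xx" so an input without that suffix is never overwritten.
const base = this.filename.endsWith('.xx') ? this.filename.slice(0, -3) : this.filename;
outputFilename = `${base}.${hash}.bin`;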
}
const fs = require('fs');
fs.writeFileSync(outputFilename, Buffer.from(this.binaryData));
return outputFilename;
}
printProperties() {
console.log('Properties of the .XX File Format:');
for (let [key, value] of Object.entries(this.properties)) {
console.log(`${key}: ${Array.isArray(value) ? value.join(', ') : value}`);
}
console.log('\nCompiled Binary (Hex Dump):');
this.hexDump(this.binaryData);
}
hexDump(data) {
for (let offset = 0; offset < data.length; offset += 16) {
const chunk = Array.from(data.slice(offset, offset + 16));
const hex = chunk.map(b => b.toString(16).padStart(2, '0')).join(' ');
const ascii = chunk.map(b => (b >= 32 && b < 127) ? String.fromCharCode(b) : '.').join('');
console.log(`${offset.toString(16).padStart(8, '0')}: ${hex.padEnd(47)} ${ascii}`);
}
}
}
// Example usage (Node.js):
// const handler = new XXFileHandler('example.xx');
// handler.printProperties();
// handler.write();
7. C++ Class for .XX File Handling
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <map>
#include <iomanip>
#include <sstream>
#include <tuple>     // std::tuple, std::get
#include <algorithm> // std::min
#include <openssl/sha.h> // Requires OpenSSL for SHA-256
class XXFileHandler {
private:
std::string filename;
std::vector<unsigned char> binaryData;
std::map<std::string, std::string> properties;
struct Token {
std::string data;
bool isString;
};
void initProperties() {
properties["text_based"] = "true";
properties["supported_notations"] = "hex, prefixed_hex, binary, quoted_strings";
properties["comment_styles"] = "single-line, multi-line, unicode_box";
properties["no_header"] = "true";
properties["version_features"] = "0.5 (binary support)";
properties["parsing_behavior"] = "concatenated binary extraction";
}
public:
XXFileHandler(const std::string& fname) : filename(fname) {
initProperties();
readAndDecode();
}
void readAndDecode() {
std::ifstream file(filename);
std::vector<std::string> lines;
std::string line;
while (std::getline(file, line)) {
lines.push_back(line);
}
file.close();
binaryData = parseXX(lines);
}
std::vector<unsigned char> parseXX(const std::vector<std::string>& lines) {
std::vector<unsigned char> out;
bool multilineComment = false;
std::string joinedLine;
std::string asciiComments[] = {"#", ";", "%", "|", "\x1B", "-", "/"};
std::string twoCharComments[] = {"--", "//"};
std::string filterList[] = {",", "$", "\\x", "0x", "h", ":", " "};
std::map<char, char> escapes = {{'n', '\n'}, {'\\', '\\'}, {'t', '\t'}, {'r', '\r'}};
for (const auto& ln : lines) {
auto result = filterMultiLineComments(multilineComment, joinedLine, ln);
multilineComment = std::get<0>(result);
joinedLine = std::get<1>(result);
std::string line = std::get<2>(result);
bool mustContinue = std::get<3>(result);
if (mustContinue) continue;
auto tokens = tokenizeXX(line, escapes);
std::string lineHex;
for (const auto& token : tokens) {
std::string cleaned;
if (token.isString) {
// Quoted strings are data even when they begin with a comment character.
cleaned = asciiToHex(token.data, escapes);
} else {
// A single-line comment runs to the end of the line.
if (isComment(token.data, asciiComments, sizeof(asciiComments)/sizeof(std::string), twoCharComments, sizeof(twoCharComments)/sizeof(std::string))) break;
cleaned = filterIgnored(token.data, filterList, sizeof(filterList)/sizeof(std::string));
// Binary notation: "0y" followed by exactly 8 binary digits.
if (cleaned.length() == 10 && cleaned.compare(0, 2, "0y") == 0 &&
cleaned.find_first_not_of("01", 2) == std::string::npos) {
unsigned char val = static_cast<unsigned char>(std::stoi(cleaned.substr(2), nullptr, 2));
std::stringstream ss;
ss << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(val);
cleaned = ss.str();
}
}
auto bytes = hexToBytes(cleaned);
out.insert(out.end(), bytes.begin(), bytes.end());
lineHex += cleaned;
}
}
return out;
}
std::tuple<bool, std::string, std::string, bool> filterMultiLineComments(bool multilineComment, std::string joinedLine, std::string line) {
std::string lineResult = joinedLine;
joinedLine = "";
bool mustContinue = false;
while (!line.empty()) {
if (multilineComment) {
size_t pos = line.find("*/");
if (pos != std::string::npos) {
line = line.substr(pos + 2);
multilineComment = false;
} else {
joinedLine += lineResult;
mustContinue = true;
break;
}
} else {
size_t pos = line.find("/*");
if (pos != std::string::npos) {
lineResult += line.substr(0, pos);
line = line.substr(pos + 2);
multilineComment = true;
} else {
lineResult += line;
break;
}
}
}
return {multilineComment, joinedLine, lineResult, mustContinue};
}
std::vector<Token> tokenizeXX(std::string line, const std::map<char, char>& escapes) {
std::vector<Token> tokens;
std::string buf;
bool verbatim = false;
bool isEscape = false;
bool isString = false;
line.erase(0, line.find_first_not_of(" \t")); // trim left
line.erase(line.find_last_not_of(" \t") + 1); // trim right
for (char c : line) {
if (c == '\\' && verbatim && !isEscape) {
isEscape = true;
continue;
}
if (isEscape) {
// Keep the escape sequence raw; asciiToHex() resolves it later.
// ('\\' + c would be integer addition, not string concatenation.)
buf += '\\';
buf += c;
isEscape = false;
continue;
}
if (c == '"') {
verbatim = !verbatim;
isString = true;
if (!verbatim && !buf.empty()) {
tokens.push_back({buf, isString});
isString = false;
}
buf.clear();
continue;
}
if (c == ' ' && !verbatim) {
if (!buf.empty()) tokens.push_back({buf, isString});
buf.clear();
continue;
}
buf += c;
}
if (!buf.empty()) tokens.push_back({buf, isString});
return tokens;
}
bool isComment(const std::string& data, const std::string* asciiComments, size_t asciiSize, const std::string* twoCharComments, size_t twoSize) {
if (data.empty()) return false;
char first = data[0];
for (size_t i = 0; i < asciiSize; ++i) {
if (asciiComments[i][0] == first) return true;
}
if (data.length() > 1) {
std::string two = data.substr(0, 2);
for (size_t i = 0; i < twoSize; ++i) {
if (twoCharComments[i] == two) return true;
}
}
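// A single char cannot hold U+2500..U+259F; in UTF-8 that range is E2 94 80 .. E2 96 9F.
if (data.size() < 3 || static_cast<unsigned char>(data[0]) != 0xE2) return false;
unsigned char b1 = static_cast<unsigned char>(data[1]);
unsigned char b2 = static_cast<unsigned char>(data[2]);
return b1 == 0x94 || b1 == 0x95 || (b1 == 0x96 && b2 <= 0x9F);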
}
std::string filterIgnored(std::string text, const std::string* filterList, size_t filterSize) {
for (size_t i = 0; i < filterSize; ++i) {
size_t pos;
while ((pos = text.find(filterList[i])) != std::string::npos) {
text.erase(pos, filterList[i].length());
}
}
return text;
}
std::string asciiToHex(const std::string& str, const std::map<char, char>& escapes) {
std::stringstream hex;
for (size_t i = 0; i < str.length(); ) {
if (str[i] == '\\' && i + 1 < str.length()) {
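char next = str[i + 1];
auto it = escapes.find(next);
if (it != escapes.end()) {
hex << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(static_cast<unsigned char>(it->second));
} else {
// Unknown escape: keep both the backslash and the following character.
hex << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>('\\');
hex << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(static_cast<unsigned char>(next));
}
i += 2;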
} else {
hex << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(static_cast<unsigned char>(str[i]));
++i;
}
}
return hex.str();
}
std::vector<unsigned char> hexToBytes(const std::string& hex) {
std::vector<unsigned char> bytes;
// Ignore odd-length or non-hex input instead of letting std::stoi throw or parse partially.
if (hex.length() % 2 != 0 || hex.find_first_not_of("0123456789abcdefABCDEF") != std::string::npos) {
return bytes;
}
for (size_t i = 0; i < hex.length(); i += 2) {
bytes.push_back(static_cast<unsigned char>(std::stoi(hex.substr(i, 2), nullptr, 16)));
}
return bytes;
}
std::string write(const std::string& outputFilename = "") {
std::string outFile = outputFilename;
if (outFile.empty()) {
unsigned char hash[SHA256_DIGEST_LENGTH];
// data() is safe on an empty vector, unlike &binaryData[0].
SHA256(binaryData.data(), binaryData.size(), hash);
std::stringstream ss;
ss << std::hex << std::setfill('0');
for (int i = 0; i < 4; ++i) { // First 8 hex chars (4 bytes)
ss << std::setw(2) << static_cast<int>(hash[i]); // setw applies to one insertion only
}
std::string shortHash = ss.str();
size_t pos = filename.rfind(".xx");
outFile = (pos != std::string::npos) ? filename.substr(0, pos) : filename;
outFile += "." + shortHash + ".bin";
}
std::ofstream ofs(outFile, std::ios::binary);
ofs.write(reinterpret_cast<const char*>(binaryData.data()), binaryData.size());
ofs.close();
return outFile;
}
void printProperties() {
std::cout << "Properties of the .XX File Format:" << std::endl;
for (const auto& prop : properties) {
std::cout << prop.first << ": " << prop.second << std::endl;
}
std::cout << "\nCompiled Binary (Hex Dump):" << std::endl;
hexDump(binaryData);
}
void hexDump(const std::vector<unsigned char>& data) {
for (size_t offset = 0; offset < data.size(); offset += 16) {
std::stringstream hex;
std::stringstream ascii;
size_t end = std::min(offset + 16, data.size());
for (size_t i = offset; i < end; ++i) {
hex << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(data[i]) << " ";
ascii << ((data[i] >= 32 && data[i] < 127) ? static_cast<char>(data[i]) : '.');
}
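// Alignment, fill, and base are sticky on std::cout, so set them explicitly for each field.
std::cout << std::right << std::hex << std::setw(8) << std::setfill('0') << offset << ": "
<< std::left << std::setw(48) << std::setfill(' ') << hex.str() << " " << ascii.str()
<< std::right << std::dec << std::endl;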
}
}
};
// Example usage:
// int main() {
// XXFileHandler handler("example.xx");
// handler.printProperties();
// handler.write();
// return 0;
// }