Task 659: .SFX File Format
File Format Specifications for .SFX
The .SFX file format refers to self-extracting archives: executable files (typically with a .exe extension) that embed a compressed archive, most commonly in the RAR format, together with a stub executable module that performs the extraction when the file is run. These files are generated by tools such as WinRAR or 7-Zip. There is no universal standard, because the structure varies with the underlying archiver, but RAR-based SFX modules are the most prevalent.
The overall structure consists of:
- SFX Stub: An executable module (variable size, up to approximately 1 MB in modern implementations) containing code to handle extraction, user prompts, and optional custom behaviors (e.g., post-extraction commands). This stub is platform-specific (e.g., Windows PE format).
- Embedded Archive: The RAR archive data, beginning with a fixed signature. The RAR format defines the intrinsic file system, representing a hierarchical structure of files and directories with metadata such as names, sizes, timestamps, and attributes.
Specifications are derived from official RAR documentation (RAR 5.0 technical note from rarlab.com) and detailed format analyses (e.g., acritum.com for RAR 3.x compatibility). Two archive generations are covered: RAR 3.x (legacy, fixed-size header fields) and RAR 5.0 (modern, variable-length integers via "vint" encoding). Parsers must search for the RAR signature to locate the archive offset after the stub. The file system is sequential: a main archive header followed by file/service headers. There is no explicit root directory; the hierarchy is encoded in the path stored with each entry (forward slashes in RAR 5.0; RAR 3.x archives written on Windows typically store backslashes).
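The signature search described above can be sketched as follows. This is a minimal illustration, not the method of any particular tool: the names `RAR3_SIGNATURE`, `RAR5_SIGNATURE`, and `find_rar_signature` are hypothetical, and the sketch assumes the whole file fits in memory and that the first match is the real archive start.

```python
# Marker bytes: "Rar!" followed by 0x1A 0x07 and a version-specific tail.
RAR3_SIGNATURE = b"Rar!\x1a\x07\x00"       # 7 bytes (RAR 3.x)
RAR5_SIGNATURE = b"Rar!\x1a\x07\x01\x00"   # 8 bytes (RAR 5.0)


def find_rar_signature(data: bytes):
    """Return (offset, version) of the earliest RAR marker, or None if absent."""
    hits = []
    for version, signature in (("5.0", RAR5_SIGNATURE), ("3.x", RAR3_SIGNATURE)):
        offset = data.find(signature)
        if offset != -1:
            hits.append((offset, version))
    # The smallest offset is treated as the end of the stub.
    return min(hits) if hits else None
```

The two markers differ in their seventh byte, so a match on one can never be mistaken for the other.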
1. List of Properties Intrinsic to the .SFX File System
The intrinsic properties pertain to the embedded RAR archive's structure and contents, forming a virtual file system with files, directories, and metadata. Properties are categorized below for clarity. Detection of RAR version (3.x or 5.0) influences field encoding.
| Category | Property | Description | RAR Version Notes |
|---|---|---|---|
| Archive-Level | SFX Stub Offset | Byte offset where the RAR signature begins (end of stub). | Common to both; searched sequentially. |
| Archive-Level | RAR Signature | Fixed bytes confirming archive start: Rar!\x1A\x07\x00 (3.x) or Rar!\x1A\x07\x01\x00 (5.0). | 7 bytes (3.x) or 8 bytes (5.0). |
| Archive-Level | Archive Header Size | Total size of the main archive header block. | Fixed 2 bytes (3.x); vint (5.0). |
| Archive-Level | Archive Flags | Bitmask including: volume (multi-part), solid (shared compression dictionary), locked (read-only), recovery record present, encrypted headers, first volume. | 2 bytes (3.x); vint (5.0). |
| Archive-Level | Volume Number | Sequential number for multi-volume archives (0 for first). | Absent in 3.x first volume; vint in 5.0 if flagged. |
| Archive-Level | Host OS | Operating system used for archiving, recorded in each file header (3.x: 0=MS-DOS, 2=Windows, 3=Unix; 5.0: 0=Windows, 1=Unix). | 1 byte (3.x); vint (5.0). |
| Archive-Level | Archive Comment | Optional UTF-8 text comment on the archive. | Separate block (3.x); service header (5.0). |
| File/Directory-Level | File Name | Path of the entry, with no trailing null. RAR 5.0 stores UTF-8 with forward slashes as separators; 3.x archives written on Windows typically store backslashes. | Variable length; supports Unicode extensions. |
| File/Directory-Level | Packed Size | Compressed size in bytes (64-bit support via high/low parts). | 4 bytes + optional 4-byte high (3.x); vint (5.0). |
| File/Directory-Level | Unpacked Size | Original uncompressed size in bytes (64-bit). | 4 bytes + optional 4-byte high (3.x); vint (5.0). |
| File/Directory-Level | File CRC32 | 32-bit checksum of unpacked data. | 4 bytes. |
| File/Directory-Level | Modification Time (mtime) | Timestamp in MS-DOS format (3.x) or Unix/Windows (5.0); optional nanoseconds. | 4 bytes (3.x); uint32/uint64 + flags (5.0). |
| File/Directory-Level | Creation/Access Time (ctime/atime) | Optional high-precision timestamps. | Absent in 3.x base; extra area (5.0). |
| File/Directory-Level | File Attributes | OS-specific flags (e.g., read-only, directory, hidden; Unix modes). | 4 bytes (3.x); vint (5.0). |
| File/Directory-Level | Directory Flag | Indicates if entry is a directory (no data). | Bits 5-7 of the header flags all set, mask 0x00E0 (3.x); bit 0x0001 of the file flags vint (5.0). |
| File/Directory-Level | Compression Method | Algorithm (e.g., 0x30=store, 0x33=normal; dictionary size encoded). | 1 byte (3.x); vint with bits (5.0). |
| File/Directory-Level | Dictionary Size | Compression window size (e.g., 64 KB to 4 GB). | Encoded in flags bits (3.x); vint bits (5.0). |
| File/Directory-Level | Unpack Version | RAR version required for extraction. | 1 byte (3.x); derived from compression info (5.0). |
| File/Directory-Level | Encryption Flag | Indicates password protection. | Bit in flags (both versions). |
| File/Directory-Level | Salt | 8-byte random value for encryption key derivation. | Optional 8 bytes (3.x); 16-byte salt + IV (5.0). |
| File/Directory-Level | Solid Flag | Uses dictionary from prior files. | Bit in flags (both). |
| File/Directory-Level | Extra Fields | Optional: NTFS ACLs, streams, hashes (e.g., BLAKE2), symlinks, owner/group. | Comments/salt in 3.x; extra area records (5.0). |
These properties enable reconstruction of the file system hierarchy, with directories implied by paths and flags.
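As an illustration of the mtime row, the 32-bit MS-DOS timestamp used by RAR 3.x can be unpacked with a few shifts and masks. The helper name `decode_dos_time` is hypothetical; the sketch assumes the standard DOS date/time bit layout and lets `datetime` reject out-of-range fields.

```python
from datetime import datetime


def decode_dos_time(ftime: int) -> datetime:
    """Convert a packed 32-bit MS-DOS date/time value to a naive datetime."""
    second = (ftime & 0x1F) * 2          # bits 0-4: seconds divided by 2
    minute = (ftime >> 5) & 0x3F         # bits 5-10
    hour = (ftime >> 11) & 0x1F          # bits 11-15
    day = (ftime >> 16) & 0x1F           # bits 16-20
    month = (ftime >> 21) & 0x0F         # bits 21-24
    year = ((ftime >> 25) & 0x7F) + 1980 # bits 25-31: years since 1980
    # datetime() raises ValueError for impossible values such as month 0.
    return datetime(year, month, day, hour, minute, second)
```

The format has a two-second resolution and no time zone, which is why RAR 5.0 moved to Unix or Windows timestamps.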
2. Two Direct Download Links for .SFX Files
- RAR 3.x SFX (Windows console module): https://github.com/ssokolow/rar-test-files/raw/master/build/testfile.rar3.wincon.sfx.exe
- RAR 5.0 SFX (Windows console module): https://github.com/ssokolow/rar-test-files/raw/master/build/testfile.rar5.wincon.sfx.exe
These are minimal test files from a public repository, suitable for verification.
3. Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .SFX Parsing
The following is a self-contained HTML snippet embeddable in a Ghost blog post (e.g., via HTML card). It creates a drag-and-drop zone for .SFX files, parses the embedded RAR archive (supports RAR 3.x for simplicity; extend for 5.0 as needed), and dumps properties to a scrollable output area. Parsing uses FileReader and DataView for binary access.
Drag and drop a .SFX file here to parse its properties.
This code handles basic RAR 3.x parsing; for RAR 5.0, extend parseRAR3 with vint decoding.
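The vint decoding mentioned above is compact. The sketch below is written in Python for consistency with the other examples added to this document; `read_vint` is a hypothetical helper, and it assumes the RAR 5.0 convention that each byte carries 7 data bits (least significant group first) with the high bit acting as a continuation flag.

```python
def read_vint(data: bytes, pos: int):
    """Decode one RAR 5.0 vint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    for _ in range(10):  # 10 bytes are enough for a 64-bit value
        if pos >= len(data):
            raise ValueError("truncated vint")
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift   # low 7 bits hold data
        if not byte & 0x80:               # high bit clear: last byte
            return value, pos
        shift += 7
    raise ValueError("vint longer than 10 bytes")
```

Returning the next position alongside the value lets a caller chain reads through a header without tracking field widths separately.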
4. Python Class for .SFX Parsing
The following Python class uses the built-in struct module for binary parsing. It supports reading and decoding; the write method currently copies the input unchanged, because updating the comment is only a placeholder and real modifications would require RAR compression logic. Run with python SfxParser.py input.sfx to print properties to console. Assumes RAR 3.x for fixed headers.
import struct
import sys


class SfxParser:
    # Marker block: "Rar!" (0x52 0x61 0x72 0x21) followed by 0x1A 0x07 0x00
    RAR_SIGNATURE_3X = b'Rar!\x1A\x07\x00'

    @staticmethod
    def find_signature(data):
        return data.find(SfxParser.RAR_SIGNATURE_3X)

    @staticmethod
    def parse_rar3(data, offset):
        pos = offset + 7  # Skip the 7-byte marker block
        props = []
        # Archive header: HEAD_CRC(2) HEAD_TYPE(1) HEAD_FLAGS(2) HEAD_SIZE(2) RESERVED(6)
        if pos + 13 > len(data) or data[pos + 2] != 0x73:
            props.append('Invalid archive header.')
            return props
        head_flags, head_size = struct.unpack_from('<HH', data, pos + 3)
        props.append(f'Archive Flags: 0x{head_flags:04x}')
        props.append(f'Archive Header Size: {head_size} bytes')
        pos += head_size
        # Files
        props.append('\nFile Properties:')
        while pos + 7 <= len(data):
            block_start = pos
            head_crc, head_type, head_flags, head_size = struct.unpack_from('<HBHH', data, pos)
            pos += 7
            if head_size < 7 or head_type == 0x7B:  # Corrupt block or end of archive
                break
            if block_start + head_size > len(data):  # Truncated header
                break
            add_size = 0
            if head_type == 0x74:  # File
                large = bool(head_flags & 0x0100)  # 64-bit size fields present
                if head_size < (40 if large else 32):
                    break
                # Fixed fields; note that ATTR precedes the file name
                (pack_size, unp_size, host_os, file_crc, ftime,
                 unp_ver, method, name_size, attr) = struct.unpack_from('<IIBIIBBHI', data, pos)
                pos += 25
                if large:
                    high_pack, high_unp = struct.unpack_from('<II', data, pos)
                    pos += 8
                    pack_size |= high_pack << 32
                    unp_size |= high_unp << 32
                # With the Unicode flag, a zero byte separates the plain and encoded names
                raw_name = data[pos:pos + name_size].split(b'\x00', 1)[0]
                name = raw_name.decode('utf-8', errors='replace')
                is_dir = (head_flags & 0x00E0) == 0x00E0  # Bits 5-7 all set
                props.extend([
                    f'- Name: {name}',
                    f'  Packed Size: {pack_size}',
                    f'  Unpacked Size: {unp_size}',
                    f'  CRC32: 0x{file_crc:08x}',
                    f'  Host OS: {host_os}',
                    f'  mtime (DOS): {ftime}',
                    f'  Attributes: 0x{attr:08x}',
                    f'  Directory: {is_dir}',
                    f'  Method: {method}',
                    f'  Unpack Ver: {unp_ver}'
                ])
                add_size = pack_size  # Packed data follows the header
            elif head_flags & 0x8000 and head_size >= 11:
                # Other long blocks store the size of their trailing data first
                add_size = struct.unpack_from('<I', data, pos)[0]
            pos = block_start + head_size + add_size  # Skip header remainder and data
        return props

    @classmethod
    def parse(cls, filename):
        with open(filename, 'rb') as f:
            data = f.read()
        sig_offset = cls.find_signature(data)
        if sig_offset == -1:
            print('No RAR signature found.')
            return
        print(f'SFX Stub Offset: {sig_offset}')
        props = cls.parse_rar3(data, sig_offset)
        for p in props:
            print(p)

    @classmethod
    def write(cls, input_sfx, output_sfx, new_comment=None):
        # Basic write: copy input, modify comment if provided (simplified; requires full RAR writer for changes)
        with open(input_sfx, 'rb') as f:
            data = bytearray(f.read())
        if new_comment:
            # Locate and update comment block (placeholder; implement full search/update)
            pass
        with open(output_sfx, 'wb') as f:
            f.write(data)
        print(f'Wrote modified SFX to {output_sfx}')


if __name__ == '__main__':
    if len(sys.argv) < 2:
        print('Usage: python SfxParser.py <sfx_file>')
    else:
        SfxParser.parse(sys.argv[1])
5. Java Class for .SFX Parsing
This Java class uses java.nio for binary I/O. Compile with javac SfxParser.java and run java SfxParser input.sfx to print properties. Supports read/decode; write reconstructs by copying (extend for modifications). RAR 3.x focused.
import java.io.IOException;
import java.io.PrintStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class SfxParser {
    private static final byte[] RAR_SIGNATURE_3X = {0x52, 0x61, 0x72, 0x21, 0x1A, 0x07, 0x00};

    private static int findSignature(byte[] data) {
        for (int i = 0; i <= data.length - RAR_SIGNATURE_3X.length; i++) {
            boolean match = true;
            for (int j = 0; j < RAR_SIGNATURE_3X.length; j++) {
                if (data[i + j] != RAR_SIGNATURE_3X[j]) {
                    match = false;
                    break;
                }
            }
            if (match) return i;
        }
        return -1;
    }

    private static void parseRar3(ByteBuffer buffer, int offset, PrintStream out) {
        buffer.order(ByteOrder.LITTLE_ENDIAN); // RAR stores integers little-endian
        int pos = offset + 7; // Skip the 7-byte marker block
        // Archive header: HEAD_CRC(2) HEAD_TYPE(1) HEAD_FLAGS(2) HEAD_SIZE(2) RESERVED(6)
        if (pos + 13 > buffer.limit() || buffer.get(pos + 2) != 0x73) {
            out.println("Invalid archive header.");
            return;
        }
        int headFlags = buffer.getShort(pos + 3) & 0xFFFF;
        int headSize = buffer.getShort(pos + 5) & 0xFFFF;
        out.println("Archive Flags: 0x" + Integer.toHexString(headFlags).toUpperCase());
        out.println("Archive Header Size: " + headSize + " bytes");
        out.println("\nFile Properties:");
        long next = (long) pos + headSize;
        while (next >= 0 && next + 7 <= buffer.limit()) {
            int start = (int) next;
            int headType = buffer.get(start + 2) & 0xFF;
            int flags = buffer.getShort(start + 3) & 0xFFFF;
            int size = buffer.getShort(start + 5) & 0xFFFF;
            // Stop on a corrupt or truncated block, or at the end-of-archive block
            if (size < 7 || headType == 0x7B || start + size > buffer.limit()) break;
            long addSize = 0;
            if (headType == 0x74) { // File
                boolean large = (flags & 0x0100) != 0; // 64-bit size fields present
                if (size < (large ? 40 : 32)) break;
                buffer.position(start + 7);
                long packSize = buffer.getInt() & 0xFFFFFFFFL;
                long unpSize = buffer.getInt() & 0xFFFFFFFFL;
                int hostOS = buffer.get() & 0xFF;
                int fileCrc = buffer.getInt();
                long ftime = buffer.getInt() & 0xFFFFFFFFL;
                int unpVer = buffer.get() & 0xFF;
                int method = buffer.get() & 0xFF;
                int nameSize = buffer.getShort() & 0xFFFF;
                int attr = buffer.getInt(); // ATTR precedes the file name
                if (large) {
                    packSize |= (buffer.getInt() & 0xFFFFFFFFL) << 32;
                    unpSize |= (buffer.getInt() & 0xFFFFFFFFL) << 32;
                }
                byte[] nameBytes = new byte[Math.min(nameSize, buffer.remaining())];
                buffer.get(nameBytes);
                String name = new String(nameBytes, StandardCharsets.UTF_8);
                boolean isDir = (flags & 0x00E0) == 0x00E0; // Bits 5-7 all set
                out.println("- Name: " + name);
                out.println("  Packed Size: " + packSize);
                out.println("  Unpacked Size: " + unpSize);
                out.println("  CRC32: 0x" + Integer.toHexString(fileCrc).toUpperCase());
                out.println("  Host OS: " + hostOS);
                out.println("  mtime (DOS): " + ftime);
                out.println("  Attributes: 0x" + Integer.toHexString(attr).toUpperCase());
                out.println("  Directory: " + isDir);
                out.println("  Method: " + method);
                out.println("  Unpack Ver: " + unpVer);
                addSize = packSize; // Packed data follows the header
            } else if ((flags & 0x8000) != 0 && size >= 11) {
                // Other long blocks store the size of their trailing data first
                addSize = buffer.getInt(start + 7) & 0xFFFFFFFFL;
            }
            next = start + size + addSize; // Skip header remainder and data
        }
    }

    public static void parse(String filename) throws IOException {
        byte[] data = Files.readAllBytes(Paths.get(filename));
        int sigOffset = findSignature(data);
        if (sigOffset == -1) {
            System.out.println("No RAR signature found.");
            return;
        }
        System.out.println("SFX Stub Offset: " + sigOffset);
        parseRar3(ByteBuffer.wrap(data), sigOffset, System.out);
    }

    public static void write(String inputSfx, String outputSfx) throws IOException {
        // Basic copy for write (extend for modifications)
        Files.copy(Paths.get(inputSfx), Paths.get(outputSfx), StandardCopyOption.REPLACE_EXISTING);
        System.out.println("Wrote SFX to " + outputSfx);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("Usage: java SfxParser <sfx_file>");
            return;
        }
        parse(args[0]);
    }
}
6. JavaScript Class for .SFX Parsing
This Node.js class uses fs for file I/O. Run with node sfxParser.js input.sfx to print properties. Supports read/decode; write copies file. RAR 3.x focused. For browser, adapt to File API.
const fs = require('fs');

class SfxParser {
  static RAR_SIGNATURE_3X = Buffer.from([0x52, 0x61, 0x72, 0x21, 0x1A, 0x07, 0x00]);

  static findSignature(data) {
    return data.indexOf(this.RAR_SIGNATURE_3X);
  }

  static parseRar3(data, offset) {
    let pos = offset + 7; // Skip the 7-byte marker block
    const props = [];
    // Archive header: HEAD_CRC(2) HEAD_TYPE(1) HEAD_FLAGS(2) HEAD_SIZE(2) RESERVED(6)
    if (pos + 13 > data.length || data[pos + 2] !== 0x73) {
      props.push('Invalid archive header.');
      return props;
    }
    const headFlags = data.readUInt16LE(pos + 3);
    const headSize = data.readUInt16LE(pos + 5);
    props.push(`Archive Flags: 0x${headFlags.toString(16).padStart(4, '0')}`);
    props.push(`Archive Header Size: ${headSize} bytes`);
    pos += headSize;
    props.push('\nFile Properties:');
    while (pos + 7 <= data.length) {
      const start = pos;
      const headType = data[start + 2];
      const flags = data.readUInt16LE(start + 3);
      const size = data.readUInt16LE(start + 5);
      // Stop on a corrupt or truncated block, or at the end-of-archive block
      if (size < 7 || headType === 0x7B || start + size > data.length) break;
      let addSize = 0;
      if (headType === 0x74) { // File
        const large = (flags & 0x0100) !== 0; // 64-bit size fields present
        if (size < (large ? 40 : 32)) break;
        let packSize = data.readUInt32LE(start + 7);
        let unpSize = data.readUInt32LE(start + 11);
        const hostOS = data[start + 15];
        const fileCrc = data.readUInt32LE(start + 16);
        const ftime = data.readUInt32LE(start + 20);
        const unpVer = data[start + 24];
        const method = data[start + 25];
        const nameSize = data.readUInt16LE(start + 26);
        const attr = data.readUInt32LE(start + 28); // ATTR precedes the file name
        let namePos = start + 32;
        if (large) {
          // Numbers lose precision above 2^53; switch to BigInt if exact sizes matter
          packSize += data.readUInt32LE(namePos) * 2 ** 32;
          unpSize += data.readUInt32LE(namePos + 4) * 2 ** 32;
          namePos += 8;
        }
        const name = data.toString('utf8', namePos, namePos + nameSize);
        const isDir = (flags & 0x00E0) === 0x00E0; // Bits 5-7 all set
        props.push(`- Name: ${name}`);
        props.push(`  Packed Size: ${packSize}`);
        props.push(`  Unpacked Size: ${unpSize}`);
        props.push(`  CRC32: 0x${fileCrc.toString(16).padStart(8, '0')}`);
        props.push(`  Host OS: ${hostOS}`);
        props.push(`  mtime (DOS): ${ftime}`);
        props.push(`  Attributes: 0x${attr.toString(16).padStart(8, '0')}`);
        props.push(`  Directory: ${isDir}`);
        props.push(`  Method: ${method}`);
        props.push(`  Unpack Ver: ${unpVer}`);
        addSize = packSize; // Packed data follows the header
      } else if ((flags & 0x8000) !== 0 && size >= 11) {
        // Other long blocks store the size of their trailing data first
        addSize = data.readUInt32LE(start + 7);
      }
      pos = start + size + addSize; // Skip header remainder and data
    }
    return props;
  }

  static parse(filename) {
    const data = fs.readFileSync(filename);
    const sigOffset = this.findSignature(data);
    if (sigOffset === -1) {
      console.log('No RAR signature found.');
      return;
    }
    console.log(`SFX Stub Offset: ${sigOffset}`);
    const props = this.parseRar3(data, sigOffset);
    props.forEach(p => console.log(p));
  }

  static write(inputSfx, outputSfx) {
    fs.copyFileSync(inputSfx, outputSfx);
    console.log(`Wrote SFX to ${outputSfx}`);
  }
}

if (require.main === module) {
  if (process.argv.length < 3) {
    console.log('Usage: node sfxParser.js <sfx_file>');
  } else {
    SfxParser.parse(process.argv[2]);
  }
}

module.exports = SfxParser;
7. C Implementation for .SFX Parsing
This C implementation uses standard I/O and manual binary reading. Compile with gcc -o sfx_parser sfx_parser.c and run ./sfx_parser input.sfx. Supports read/decode; write copies via cp-like logic (uses fread/fwrite). RAR 3.x focused.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

#define RAR_SIGNATURE_3X_LEN 7
static const uint8_t RAR_SIGNATURE_3X[RAR_SIGNATURE_3X_LEN] = {0x52, 0x61, 0x72, 0x21, 0x1A, 0x07, 0x00};

typedef struct {
    uint8_t* data;
    size_t size;
    size_t pos;
} Buffer;

/* Read little-endian integers byte by byte: no alignment or host-endianness assumptions. */
static uint16_t read_le16(const uint8_t* p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t* p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static long find_signature(const Buffer* buf) {
    if (buf->size < RAR_SIGNATURE_3X_LEN) return -1;
    for (size_t i = 0; i <= buf->size - RAR_SIGNATURE_3X_LEN; i++) {
        if (memcmp(buf->data + i, RAR_SIGNATURE_3X, RAR_SIGNATURE_3X_LEN) == 0) return (long)i;
    }
    return -1;
}

static void parse_rar3(Buffer* buf, size_t offset, FILE* out) {
    buf->pos = offset + RAR_SIGNATURE_3X_LEN; /* Skip the marker block */
    /* Archive header: HEAD_CRC(2) HEAD_TYPE(1) HEAD_FLAGS(2) HEAD_SIZE(2) RESERVED(6) */
    if (buf->pos + 13 > buf->size || buf->data[buf->pos + 2] != 0x73) {
        fprintf(out, "Invalid archive header.\n");
        return;
    }
    uint16_t head_flags = read_le16(buf->data + buf->pos + 3);
    uint16_t head_size = read_le16(buf->data + buf->pos + 5);
    fprintf(out, "Archive Flags: 0x%04x\n", (unsigned)head_flags);
    fprintf(out, "Archive Header Size: %u bytes\n", (unsigned)head_size);
    buf->pos += head_size;
    fprintf(out, "\nFile Properties:\n");
    while (buf->pos <= buf->size && buf->size - buf->pos >= 7) {
        const uint8_t* p = buf->data + buf->pos;
        uint8_t head_type = p[2];
        uint16_t flags = read_le16(p + 3);
        uint16_t size = read_le16(p + 5);
        uint64_t add_size = 0;
        /* Stop on a corrupt or truncated block, or at the end-of-archive block */
        if (size < 7 || head_type == 0x7B || size > buf->size - buf->pos) break;
        if (head_type == 0x74) { /* File */
            int large = (flags & 0x0100) != 0; /* 64-bit size fields present */
            size_t fixed_len = large ? 40 : 32;
            if (size < fixed_len) break;
            uint64_t pack_size = read_le32(p + 7);
            uint64_t unp_size = read_le32(p + 11);
            uint8_t host_os = p[15];
            uint32_t file_crc = read_le32(p + 16);
            uint32_t ftime = read_le32(p + 20);
            uint8_t unp_ver = p[24];
            uint8_t method = p[25];
            size_t name_size = read_le16(p + 26);
            uint32_t attr = read_le32(p + 28); /* ATTR precedes the file name */
            if (large) {
                pack_size |= (uint64_t)read_le32(p + 32) << 32;
                unp_size |= (uint64_t)read_le32(p + 36) << 32;
            }
            char name[1024] = {0};
            if (name_size > size - fixed_len) name_size = size - fixed_len;
            if (name_size > sizeof(name) - 1) name_size = sizeof(name) - 1;
            memcpy(name, p + fixed_len, name_size);
            int is_dir = (flags & 0x00E0) == 0x00E0; /* Bits 5-7 all set */
            fprintf(out, "- Name: %s\n", name);
            fprintf(out, "  Packed Size: %llu\n", (unsigned long long)pack_size);
            fprintf(out, "  Unpacked Size: %llu\n", (unsigned long long)unp_size);
            fprintf(out, "  CRC32: 0x%08lx\n", (unsigned long)file_crc);
            fprintf(out, "  Host OS: %u\n", (unsigned)host_os);
            fprintf(out, "  mtime (DOS): %lu\n", (unsigned long)ftime);
            fprintf(out, "  Attributes: 0x%08lx\n", (unsigned long)attr);
            fprintf(out, "  Directory: %s\n", is_dir ? "true" : "false");
            fprintf(out, "  Method: %u\n", (unsigned)method);
            fprintf(out, "  Unpack Ver: %u\n", (unsigned)unp_ver);
            add_size = pack_size; /* Packed data follows the header */
        } else if ((flags & 0x8000) && size >= 11) {
            /* Other long blocks store the size of their trailing data first */
            add_size = read_le32(p + 7);
        }
        if (add_size > buf->size - buf->pos - size) break; /* Data runs past end of file */
        buf->pos += size + (size_t)add_size; /* Skip header remainder and data */
    }
}

static void parse_file(const char* filename) {
    FILE* fp = fopen(filename, "rb");
    if (!fp) {
        perror("Error opening file");
        return;
    }
    fseek(fp, 0, SEEK_END);
    long file_size = ftell(fp);
    fseek(fp, 0, SEEK_SET);
    if (file_size <= 0) {
        fprintf(stderr, "Empty or unreadable file.\n");
        fclose(fp);
        return;
    }
    size_t size = (size_t)file_size;
    uint8_t* data = malloc(size);
    if (!data || fread(data, 1, size, fp) != size) {
        fprintf(stderr, "Error reading file.\n");
        free(data);
        fclose(fp);
        return;
    }
    fclose(fp);
    Buffer buf = {data, size, 0};
    long sig_offset = find_signature(&buf);
    if (sig_offset == -1) {
        printf("No RAR signature found.\n");
        free(data);
        return;
    }
    printf("SFX Stub Offset: %ld\n", sig_offset);
    parse_rar3(&buf, (size_t)sig_offset, stdout);
    free(data);
}

static void write_file(const char* input, const char* output) {
    FILE* in = fopen(input, "rb");
    FILE* out = fopen(output, "wb");
    if (!in || !out) {
        perror("Error in write");
        if (in) fclose(in);
        if (out) fclose(out);
        return;
    }
    uint8_t buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        fwrite(buf, 1, n, out);
    }
    fclose(in);
    fclose(out);
    printf("Wrote SFX to %s\n", output);
}

int main(int argc, char** argv) {
    if (argc < 2) {
        printf("Usage: %s <sfx_file>\n", argv[0]);
        return 1;
    }
    parse_file(argv[1]);
    /* For write: write_file(argv[1], "output.sfx"); */
    return 0;
}
These implementations provide core functionality for decoding and basic writing. For production use, consider edge cases like multi-volume archives or RAR 5.0 vint parsing.
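One of those edge cases, multi-volume archives, can be flagged before any file header is read by decoding the RAR 3.x archive flags. The sketch below is illustrative only: `RAR3_MAIN_FLAGS` and `describe_archive_flags` are hypothetical names, and the bit values reflect the RAR 3.x main header layout as commonly documented, so they should be checked against the format notes before being relied on.

```python
# Assumed RAR 3.x main-header flag bits (verify against the format notes).
RAR3_MAIN_FLAGS = {
    0x0001: "volume",
    0x0002: "comment present",
    0x0004: "locked",
    0x0008: "solid",
    0x0010: "new volume naming",
    0x0020: "authenticity info",
    0x0040: "recovery record",
    0x0080: "encrypted headers",
    0x0100: "first volume",
}


def describe_archive_flags(flags: int) -> list:
    """Return the names of all known bits set in a 16-bit archive flags value."""
    return [name for bit, name in RAR3_MAIN_FLAGS.items() if flags & bit]


# Example: 0x0101 marks the first part of a multi-volume set.
print(describe_archive_flags(0x0101))
```

A parser could refuse to continue, or warn, when "volume" or "encrypted headers" appears, since the file headers that follow are then incomplete or unreadable without further input.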