Task 820: .WS File Format
1. Properties of the .WS File Format Intrinsic to Its File System
The .WS file format refers to the WordStar document format, a proprietary plain text-based structure used by the WordStar word processor. The following is a comprehensive list of intrinsic properties derived from its specifications, focusing on structural, encoding, and formatting elements that define the format:
- Encoding: Utilizes 7-bit ASCII for printable characters. In versions prior to 5.0, the high bit (8th bit) is repurposed for formatting information, such as marking the last character of words for microjustification, rather than extending the character set.
- Line Termination: Normal lines (hard returns) end with the sequence 0Dh 0Ah (carriage return followed by line feed). Soft returns for word-wrapped lines use 8Dh 0Ah. Trailing spaces at line ends are preserved to maintain formatting fidelity.
- Soft Space: The byte A0h is employed for tabbing, text justification, and left-margin indentation, distinguishing it from standard spaces (20h).
- Extended Sequences: From version 3.4 onward, 3-byte sequences are used for advanced formatting, starting with lead-in 1Bh, followed by a middle byte (00h to FFh), and ending with trailer 1Ch.
- Magic Numbers (Header Signifiers): Specific byte sequences at the file beginning indicate the version, such as 1D 7D 00 00 50 for version 5, 1D 7D 00 00 60 for version 6, and 1D 7D 00 00 70 for version 7. For WordStar 2000 (Windows version 2), the sequence is 57 53 32 30 30 30 (ASCII "WS2000").
- Dot Commands: Formatting instructions embedded as lines starting with a period (.) in the first column, followed by two characters (e.g., .PO for page offset, .PA for new page, .PN for page number). These commands occupy space in the editor but not in printed output.
- Control Characters for Formatting: Specific low-ASCII bytes toggle inline formatting, including 02h for bold on/off and 13h for underline on/off. Nested formatting is supported by sequential application without immediate clearing.
- File Padding: Files are padded to multiples of 128 bytes with repeated 1Ah (EOF marker) bytes in unused space at the end.
- Non-Document Mode: An optional mode for plain text files (e.g., for programming), adhering strictly to 7-bit ASCII without high-bit formatting or control characters, ensuring compatibility with other applications.
- Backward Compatibility: Pre-version 5.0 files may use high-bit settings for printer-specific features like microjustification, which can render files unreadable in standard text viewers without processing.
These properties are fundamental to the format's structure and behavior within file systems, ensuring consistent parsing, rendering, and editing in compatible software.
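As an illustration of how the encoding, line-termination, soft-space, and extended-sequence rules above interact, here is a minimal Python sketch that reduces pre-5.0 document-mode bytes to plain text. The function name `ws_to_text` is hypothetical, and two choices are assumptions rather than part of the format: soft spaces are dropped, and inline control bytes are discarded instead of being interpreted. Dot-command lines are left in the output.

```python
def ws_to_text(data: bytes) -> str:
    """Reduce pre-5.0 WordStar document bytes to plain text (illustrative sketch)."""
    data = data.rstrip(b"\x1a")  # drop the 1Ah padding at the end of the file
    out = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        # Extended sequence: 1Bh, one middle byte, 1Ch. Skipped here (assumption).
        if b == 0x1B and i + 2 < n and data[i + 2] == 0x1C:
            i += 3
            continue
        # Soft return (8Dh 0Ah): a word-wrap break, so join the lines with one space.
        if b == 0x8D and i + 1 < n and data[i + 1] == 0x0A:
            if out and out[-1] not in (" ", "\n"):
                out.append(" ")
            i += 2
            continue
        # Hard return (0Dh 0Ah): a real line break.
        if b == 0x0D and i + 1 < n and data[i + 1] == 0x0A:
            out.append("\n")
            i += 2
            continue
        # Soft space (A0h): formatting filler, dropped in this sketch (assumption).
        if b == 0xA0:
            i += 1
            continue
        b &= 0x7F  # clear the high bit used as a formatting flag
        if b == 0x09 or 0x20 <= b <= 0x7E:
            out.append(chr(b))
        # Any other control byte (02h bold, 13h underline, ...) is discarded.
        i += 1
    return "".join(out)
```

For files from version 5.0 onward, the header and any other binary sequences would need to be removed first; this sketch does not attempt that.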
2. Two Direct Download Links for .WS Files
- https://raw.githubusercontent.com/kevinboone/ws2txt/main/sample1.ws
- https://www.goatley.com/wordstar/samples/oldtimes.ws (Note: This link points to a sample file within the WordStar archive context; direct access may require extraction from the associated ZIP if not hosted independently.)
3. Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .WS File Dumping
The following is a self-contained HTML page with embedded JavaScript that can be placed in a Ghost blog post. It lets users drag and drop a .WS file, then parses the file and displays the properties listed in section 1.
Drag and Drop .WS File Analyzer
4. Python Class for .WS File Handling
The following Python class can open, decode, read, write, and print the properties of a .WS file.
class WSFile:
    """Reads a .WS file and reports the properties listed in section 1."""

    def __init__(self, filepath):
        self.filepath = filepath
        self.bytes = None
        self.properties = {}

    def read(self):
        with open(self.filepath, 'rb') as f:
            self.bytes = f.read()
        self.decode_properties()

    def decode_properties(self):
        if not self.bytes:
            raise ValueError("No data loaded.")
        # Magic Number: the first five bytes, shown as hex
        magic = ' '.join(f'{b:02x}' for b in self.bytes[:5])
        self.properties['Magic Number'] = magic
        # High Bit Usage
        self.properties['High Bit Usage'] = 'Detected' if any(b > 127 for b in self.bytes) else 'Not detected'
        # Line Endings: 0D 0A is a hard return, 8D 0A is a soft return
        hard, soft = 0, 0
        for i in range(len(self.bytes) - 1):
            if self.bytes[i] == 0x0D and self.bytes[i+1] == 0x0A:
                hard += 1
            if self.bytes[i] == 0x8D and self.bytes[i+1] == 0x0A:
                soft += 1
        self.properties['Line Endings'] = f'Hard Returns: {hard}, Soft Returns: {soft}'
        # Soft Spaces Count
        self.properties['Soft Spaces Count'] = sum(1 for b in self.bytes if b == 0xA0)
        # Extended Sequences: 1B <middle byte> 1C
        seqs = []
        for i in range(len(self.bytes) - 2):
            if self.bytes[i] == 0x1B and self.bytes[i+2] == 0x1C:
                seqs.append(f'{self.bytes[i+1]:02x}')
        self.properties['Extended Sequences'] = seqs
        # Dot Commands: split on LF only, because str.splitlines() also breaks
        # on control characters such as 1Ch and 1Dh that occur in .WS data
        text = self.bytes.decode('ascii', errors='ignore')
        lines = text.split('\n')
        dot_cmds = [line[:3] for line in lines if line.startswith('.')]
        self.properties['Dot Commands'] = dot_cmds
        # Formatting Controls: counts of toggle bytes, not of formatted spans
        bold = sum(1 for b in self.bytes if b == 0x02)
        underline = sum(1 for b in self.bytes if b == 0x13)
        self.properties['Formatting Controls'] = {'Bold': bold, 'Underline': underline}
        # Padding EOF Count: trailing 1Ah bytes
        eof_count = 0
        for b in reversed(self.bytes):
            if b == 0x1A:
                eof_count += 1
            else:
                break
        self.properties['Padding EOF Count'] = eof_count
        # Non-Document Mode: printable ASCII plus tab, CR, and LF; the trailing
        # EOF padding is ignored so that ordinary text files are not rejected
        body = self.bytes[:len(self.bytes) - eof_count]
        plain = all(32 <= b <= 126 or b in (0x09, 0x0A, 0x0D) for b in body)
        self.properties['Non-Document Mode'] = 'Likely' if plain else 'Unlikely'

    def print_properties(self):
        for key, value in self.properties.items():
            print(f'{key}: {value}')

    def write(self, new_filepath, content=b''):
        # Simple write: signature bytes, then content, then 1Ah padding to a
        # multiple of 128 bytes. Only the 5-byte example signature is written,
        # not a complete WordStar header.
        header = b'\x1D\x7D\x00\x00\x70'  # Example v7 magic
        padding_len = (128 - (len(header + content) % 128)) % 128
        padding = b'\x1A' * padding_len
        with open(new_filepath, 'wb') as f:
            f.write(header + content + padding)

# Example usage:
# ws = WSFile('sample1.ws')
# ws.read()
# ws.print_properties()
# ws.write('new.ws', b'Test content\x0D\x0A')
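The signature check and the 128-byte padding rule from section 1 can also be sketched as two small standalone helpers. The names `detect_version` and `pad_to_record` are hypothetical, and the signature table is copied from the "Magic Numbers" entry in section 1 rather than independently verified.

```python
# Signatures as listed under "Magic Numbers" in section 1 (not independently verified).
SIGNATURES = {
    bytes.fromhex("1d7d000050"): "WordStar 5",
    bytes.fromhex("1d7d000060"): "WordStar 6",
    bytes.fromhex("1d7d000070"): "WordStar 7",
    b"WS2000": "WordStar 2000",
}


def detect_version(data: bytes) -> str:
    """Return a version label if the data starts with a known signature."""
    for signature, label in SIGNATURES.items():
        if data.startswith(signature):
            return label
    # Pre-5.0 and non-document files carry no signature in this table.
    return "unknown"


def pad_to_record(data: bytes, record_size: int = 128) -> bytes:
    """Append 1Ah bytes until the length is a multiple of record_size."""
    return data + b"\x1a" * (-len(data) % record_size)


if __name__ == "__main__":
    sample = pad_to_record(bytes.fromhex("1d7d000070") + b"Test content\r\n")
    print(detect_version(sample), len(sample))
```

A file with no recognized signature is not necessarily invalid; under the list in section 1 it may simply predate version 5.0 or have been saved in non-document mode.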
5. Java Class for .WS File Handling
The following Java class can open, decode, read, write, and print the properties of a .WS file.
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.*;

public class WSFile {
    private String filepath;
    private byte[] bytes;
    // LinkedHashMap keeps the properties in insertion order when printed
    private Map<String, Object> properties = new LinkedHashMap<>();

    public WSFile(String filepath) {
        this.filepath = filepath;
    }

    public void read() throws IOException {
        // readAllBytes() requires Java 9 or later
        try (FileInputStream fis = new FileInputStream(filepath)) {
            bytes = fis.readAllBytes();
        }
        decodeProperties();
    }

    private void decodeProperties() {
        if (bytes == null) throw new IllegalStateException("No data loaded.");
        // Magic Number: the first five bytes, shown as hex
        StringBuilder magic = new StringBuilder();
        for (int i = 0; i < 5 && i < bytes.length; i++) {
            magic.append(String.format("%02x ", bytes[i] & 0xFF));
        }
        properties.put("Magic Number", magic.toString().trim());
        // High Bit Usage
        boolean highBit = false;
        for (byte b : bytes) {
            if ((b & 0xFF) > 127) {
                highBit = true;
                break;
            }
        }
        properties.put("High Bit Usage", highBit ? "Detected" : "Not detected");
        // Line Endings: 0D 0A is a hard return, 8D 0A is a soft return
        int hard = 0, soft = 0;
        for (int i = 0; i < bytes.length - 1; i++) {
            if ((bytes[i] & 0xFF) == 0x0D && (bytes[i+1] & 0xFF) == 0x0A) hard++;
            if ((bytes[i] & 0xFF) == 0x8D && (bytes[i+1] & 0xFF) == 0x0A) soft++;
        }
        properties.put("Line Endings", "Hard Returns: " + hard + ", Soft Returns: " + soft);
        // Soft Spaces Count
        int softSpaces = 0;
        for (byte b : bytes) {
            if ((b & 0xFF) == 0xA0) softSpaces++;
        }
        properties.put("Soft Spaces Count", softSpaces);
        // Extended Sequences: 1B <middle byte> 1C
        List<String> seqs = new ArrayList<>();
        for (int i = 0; i < bytes.length - 2; i++) {
            if ((bytes[i] & 0xFF) == 0x1B && (bytes[i+2] & 0xFF) == 0x1C) {
                seqs.add(String.format("%02x", bytes[i+1] & 0xFF));
            }
        }
        properties.put("Extended Sequences", seqs);
        // Dot Commands: decode with a fixed single-byte charset so the result
        // does not depend on the platform default encoding
        String text = new String(bytes, StandardCharsets.ISO_8859_1);
        String[] lines = text.split("\\r?\\n");
        List<String> dotCmds = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith(".")) dotCmds.add(line.substring(0, Math.min(3, line.length())));
        }
        properties.put("Dot Commands", dotCmds);
        // Formatting Controls: counts of toggle bytes, not of formatted spans
        int bold = 0, underline = 0;
        for (byte b : bytes) {
            if ((b & 0xFF) == 0x02) bold++;
            if ((b & 0xFF) == 0x13) underline++;
        }
        Map<String, Integer> controls = new LinkedHashMap<>();
        controls.put("Bold", bold);
        controls.put("Underline", underline);
        properties.put("Formatting Controls", controls);
        // Padding EOF Count: trailing 1Ah bytes
        int eofCount = 0;
        for (int i = bytes.length - 1; i >= 0; i--) {
            if ((bytes[i] & 0xFF) == 0x1A) eofCount++;
            else break;
        }
        properties.put("Padding EOF Count", eofCount);
        // Non-Document Mode: printable ASCII plus tab, CR, and LF; the trailing
        // EOF padding is ignored so that ordinary text files are not rejected
        boolean nonDoc = true;
        for (int i = 0; i < bytes.length - eofCount; i++) {
            int val = bytes[i] & 0xFF;
            boolean printable = val >= 32 && val <= 126;
            boolean whitespace = val == 0x09 || val == 0x0A || val == 0x0D;
            if (!printable && !whitespace) {
                nonDoc = false;
                break;
            }
        }
        properties.put("Non-Document Mode", nonDoc ? "Likely" : "Unlikely");
    }

    public void printProperties() {
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }

    public void write(String newFilepath, byte[] content) throws IOException {
        // Only the 5-byte example signature is written, not a complete WordStar header
        byte[] header = new byte[]{0x1D, 0x7D, 0x00, 0x00, 0x70}; // Example v7 magic
        byte[] combined = new byte[header.length + content.length];
        System.arraycopy(header, 0, combined, 0, header.length);
        System.arraycopy(content, 0, combined, header.length, content.length);
        // Pad with 1Ah to a multiple of 128 bytes
        int paddingLen = (128 - (combined.length % 128)) % 128;
        byte[] padding = new byte[paddingLen];
        Arrays.fill(padding, (byte) 0x1A);
        try (FileOutputStream fos = new FileOutputStream(newFilepath)) {
            fos.write(combined);
            fos.write(padding);
        }
    }

    // Example usage:
    // public static void main(String[] args) throws IOException {
    //     WSFile ws = new WSFile("sample1.ws");
    //     ws.read();
    //     ws.printProperties();
    //     ws.write("new.ws", "Test content\r\n".getBytes(StandardCharsets.US_ASCII));
    // }
}
6. JavaScript Class for .WS File Handling
The following JavaScript class can open (via FileReader), decode, read, write (using Blob), and print the properties of a .WS file to the console.
class WSFile {
  constructor() {
    this.bytes = null;
    this.properties = {};
  }

  // Reads a File or Blob (for example, from a file input) into memory
  async read(file) {
    return new Promise((resolve, reject) => {
      const reader = new FileReader();
      reader.onload = (e) => {
        this.bytes = new Uint8Array(e.target.result);
        this.decodeProperties();
        resolve();
      };
      reader.onerror = reject;
      reader.readAsArrayBuffer(file);
    });
  }

  decodeProperties() {
    if (!this.bytes) throw new Error("No data loaded.");
    // Magic Number: the first five bytes, shown as hex
    let magic = Array.from(this.bytes.slice(0, 5)).map(b => b.toString(16).padStart(2, '0')).join(' ');
    this.properties['Magic Number'] = magic;
    // High Bit Usage
    this.properties['High Bit Usage'] = this.bytes.some(b => b > 127) ? 'Detected' : 'Not detected';
    // Line Endings: 0D 0A is a hard return, 8D 0A is a soft return
    let hard = 0, soft = 0;
    for (let i = 0; i < this.bytes.length - 1; i++) {
      if (this.bytes[i] === 0x0D && this.bytes[i+1] === 0x0A) hard++;
      if (this.bytes[i] === 0x8D && this.bytes[i+1] === 0x0A) soft++;
    }
    this.properties['Line Endings'] = `Hard Returns: ${hard}, Soft Returns: ${soft}`;
    // Soft Spaces Count
    this.properties['Soft Spaces Count'] = this.bytes.filter(b => b === 0xA0).length;
    // Extended Sequences: 1B <middle byte> 1C
    let seqs = [];
    for (let i = 0; i < this.bytes.length - 2; i++) {
      if (this.bytes[i] === 0x1B && this.bytes[i+2] === 0x1C) {
        seqs.push(this.bytes[i+1].toString(16).padStart(2, '0'));
      }
    }
    this.properties['Extended Sequences'] = seqs;
    // Dot Commands: lines that start with a period in the first column
    let text = new TextDecoder('ascii').decode(this.bytes);
    let lines = text.split(/\r?\n/);
    let dotCmds = lines.filter(line => line.startsWith('.')).map(line => line.substring(0, 3));
    this.properties['Dot Commands'] = dotCmds;
    // Formatting Controls: counts of toggle bytes, not of formatted spans
    let bold = this.bytes.filter(b => b === 0x02).length;
    let underline = this.bytes.filter(b => b === 0x13).length;
    this.properties['Formatting Controls'] = { Bold: bold, Underline: underline };
    // Padding EOF Count: trailing 1Ah bytes
    let eofCount = 0;
    for (let i = this.bytes.length - 1; i >= 0; i--) {
      if (this.bytes[i] === 0x1A) eofCount++;
      else break;
    }
    this.properties['Padding EOF Count'] = eofCount;
    // Non-Document Mode: printable ASCII plus tab, CR, and LF; the trailing
    // EOF padding is ignored so that ordinary text files are not rejected
    const body = this.bytes.subarray(0, this.bytes.length - eofCount);
    const plain = body.every(b => (b >= 32 && b <= 126) || b === 0x09 || b === 0x0A || b === 0x0D);
    this.properties['Non-Document Mode'] = plain ? 'Likely' : 'Unlikely';
  }

  printProperties() {
    console.log(this.properties);
  }

  write(filename, content) {
    // Only the 5-byte example signature is written, not a complete WordStar header
    const header = new Uint8Array([0x1D, 0x7D, 0x00, 0x00, 0x70]); // Example v7 magic
    const contentBytes = new TextEncoder().encode(content);
    const combined = new Uint8Array(header.length + contentBytes.length);
    combined.set(header);
    combined.set(contentBytes, header.length);
    // Pad with 1Ah to a multiple of 128 bytes
    const paddingLen = (128 - (combined.length % 128)) % 128;
    const padding = new Uint8Array(paddingLen).fill(0x1A);
    const full = new Uint8Array(combined.length + paddingLen);
    full.set(combined);
    full.set(padding, combined.length);
    // Offer the result as a browser download
    const blob = new Blob([full], { type: 'application/octet-stream' });
    const url = URL.createObjectURL(blob);
    const a = document.createElement('a');
    a.href = url;
    a.download = filename;
    a.click();
    URL.revokeObjectURL(url);
  }
}

// Example usage:
// const ws = new WSFile();
// const input = document.createElement('input');
// input.type = 'file';
// input.onchange = async (e) => {
//   await ws.read(e.target.files[0]);
//   ws.printProperties();
//   ws.write('new.ws', 'Test content\r\n');
// };
// input.click();
7. C++ Class for .WS File Handling
The following C++ class can open, decode, read, write, and print the properties of a .WS file to the console.
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <map>
#include <iomanip>
#include <sstream>
#include <stdexcept>

class WSFile {
private:
    std::string filepath;
    std::vector<unsigned char> bytes;
    std::map<std::string, std::string> properties;

public:
    WSFile(const std::string& fp) : filepath(fp) {}

    void read() {
        std::ifstream file(filepath, std::ios::binary);
        if (file) {
            file.seekg(0, std::ios::end);
            size_t size = file.tellg();
            file.seekg(0, std::ios::beg);
            bytes.resize(size);
            file.read(reinterpret_cast<char*>(bytes.data()), size);
            decodeProperties();
        } else {
            throw std::runtime_error("Failed to open file.");
        }
    }

    void decodeProperties() {
        if (bytes.empty()) throw std::runtime_error("No data loaded.");
        // Magic Number: the first five bytes, shown as hex
        std::stringstream magicSs;
        for (size_t i = 0; i < 5 && i < bytes.size(); ++i) {
            magicSs << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(bytes[i]) << " ";
        }
        properties["Magic Number"] = magicSs.str();
        // High Bit Usage
        bool highBit = false;
        for (auto b : bytes) {
            if (b > 127) highBit = true;
        }
        properties["High Bit Usage"] = highBit ? "Detected" : "Not detected";
        // Line Endings: 0D 0A is a hard return, 8D 0A is a soft return.
        // The bound is written as i + 1 < size() to avoid unsigned underflow.
        int hard = 0, soft = 0;
        for (size_t i = 0; i + 1 < bytes.size(); ++i) {
            if (bytes[i] == 0x0D && bytes[i+1] == 0x0A) ++hard;
            if (bytes[i] == 0x8D && bytes[i+1] == 0x0A) ++soft;
        }
        std::stringstream leSs;
        leSs << "Hard Returns: " << hard << ", Soft Returns: " << soft;
        properties["Line Endings"] = leSs.str();
        // Soft Spaces Count
        int softSpaces = 0;
        for (auto b : bytes) if (b == 0xA0) ++softSpaces;
        properties["Soft Spaces Count"] = std::to_string(softSpaces);
        // Extended Sequences: 1B <middle byte> 1C.
        // size() - 2 would wrap around for a 1-byte file, so test i + 2 instead.
        std::stringstream seqSs;
        for (size_t i = 0; i + 2 < bytes.size(); ++i) {
            if (bytes[i] == 0x1B && bytes[i+2] == 0x1C) {
                seqSs << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(bytes[i+1]) << " ";
            }
        }
        properties["Extended Sequences"] = seqSs.str();
        // Dot Commands: lines that start with a period in the first column
        std::string text(bytes.begin(), bytes.end());
        std::stringstream textSs(text);
        std::string line, dotCmds;
        while (std::getline(textSs, line)) {
            if (line.rfind('.', 0) == 0 && line.size() >= 3) {
                dotCmds += line.substr(0, 3) + " ";
            }
        }
        properties["Dot Commands"] = dotCmds;
        // Formatting Controls: counts of toggle bytes, not of formatted spans
        int bold = 0, underline = 0;
        for (auto b : bytes) {
            if (b == 0x02) ++bold;
            if (b == 0x13) ++underline;
        }
        std::stringstream fcSs;
        fcSs << "Bold: " << bold << ", Underline: " << underline;
        properties["Formatting Controls"] = fcSs.str();
        // Padding EOF Count: trailing 1Ah bytes
        size_t eofCount = 0;
        for (auto it = bytes.rbegin(); it != bytes.rend(); ++it) {
            if (*it == 0x1A) ++eofCount;
            else break;
        }
        properties["Padding EOF Count"] = std::to_string(eofCount);
        // Non-Document Mode: printable ASCII plus tab, CR, and LF; the trailing
        // EOF padding is ignored so that ordinary text files are not rejected
        bool nonDoc = true;
        for (size_t i = 0; i < bytes.size() - eofCount; ++i) {
            unsigned char b = bytes[i];
            bool printable = b >= 32 && b <= 126;
            bool whitespace = b == 0x09 || b == 0x0A || b == 0x0D;
            if (!printable && !whitespace) {
                nonDoc = false;
                break;
            }
        }
        properties["Non-Document Mode"] = nonDoc ? "Likely" : "Unlikely";
    }

    void printProperties() const {
        for (const auto& prop : properties) {
            std::cout << prop.first << ": " << prop.second << std::endl;
        }
    }

    void write(const std::string& newFilepath, const std::string& content) {
        // Only the 5-byte example signature is written, not a complete WordStar header
        std::vector<unsigned char> header = {0x1D, 0x7D, 0x00, 0x00, 0x70}; // Example v7 magic
        std::vector<unsigned char> contentBytes(content.begin(), content.end());
        std::vector<unsigned char> combined;
        combined.reserve(header.size() + contentBytes.size());
        combined.insert(combined.end(), header.begin(), header.end());
        combined.insert(combined.end(), contentBytes.begin(), contentBytes.end());
        // Pad with 1Ah to a multiple of 128 bytes
        size_t paddingLen = (128 - (combined.size() % 128)) % 128;
        std::vector<unsigned char> padding(paddingLen, 0x1A);
        combined.insert(combined.end(), padding.begin(), padding.end());
        std::ofstream file(newFilepath, std::ios::binary);
        if (file) {
            file.write(reinterpret_cast<const char*>(combined.data()), combined.size());
        } else {
            throw std::runtime_error("Failed to write file.");
        }
    }
};

// Example usage:
// int main() {
//     try {
//         WSFile ws("sample1.ws");
//         ws.read();
//         ws.printProperties();
//         ws.write("new.ws", "Test content\r\n");
//     } catch (const std::exception& e) {
//         std::cerr << e.what() << std::endl;
//     }
//     return 0;
// }