Task 132: .DBF File Format
File Format Specifications for the .DBF File Format
The .DBF file format, originally developed for the dBASE database management system, consists of a 32-byte file header, an array of 32-byte field descriptors closed by a terminator byte (0x0D), the data records, and an end-of-file marker (0x1A). Details vary slightly across versions (e.g., dBASE III, IV, V, or FoxPro extensions), but the core structure is consistent: the header carries metadata about the file, the field descriptors define the table structure, and the records follow. Multi-byte integers are stored in little-endian byte order. The specifications below follow the Level 5 DOS dBASE format (compatible with dBASE III+ to V), a widely adopted variant.
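The layout described above implies a simple size relationship that can serve as a sanity check: header length, plus record count times record length, plus one byte for the 0x1A marker. The sketch below is a minimal illustration that assumes the whole file has been read into memory. The function name `check_dbf_layout` is hypothetical, and because some writers omit the end-of-file marker, a mismatch is a hint rather than proof of corruption.

```python
import struct


def check_dbf_layout(raw: bytes) -> dict:
    """Compare the size implied by the header with the actual byte count."""
    if len(raw) < 32:
        raise ValueError("need at least the 32-byte file header")
    # Bytes 4-7: record count; bytes 8-9: header length; bytes 10-11: record length.
    num_records, header_length, record_length = struct.unpack_from("<IHH", raw, 4)
    data_end = header_length + num_records * record_length
    return {
        "expected_size": data_end + 1,  # +1 for the 0x1A end-of-file marker
        "actual_size": len(raw),
        # The last byte of the header area should be the 0x0D terminator.
        "has_terminator": 1 <= header_length <= len(raw)
        and raw[header_length - 1] == 0x0D,
        # The byte right after the last record should be 0x1A.
        "has_eof_marker": len(raw) > data_end and raw[data_end] == 0x1A,
    }
```

A result in which `expected_size` and `actual_size` agree and both flags are true suggests the header values are at least self-consistent.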
1. List of All Properties Intrinsic to the File Format
The properties are the metadata fields in the file header and the field descriptor array. Together they define the table structure and are required to interpret the data records. The following table lists each one with its byte position, size, and description:
Property | Byte Position | Size (Bytes) | Description |
---|---|---|---|
Version Signature | 0 | 1 | Indicates the dBASE version and flags for memo files or SQL tables (e.g., 0x03 for dBASE III without memo, 0x83 with memo). Bits 0-2 denote version; bit 3 indicates memo presence; bits 4-6 indicate SQL table; bit 7 indicates any memo file. |
Last Update Year | 1 | 1 | Year of last update minus 1900 (binary value). |
Last Update Month | 2 | 1 | Month of last update (1-12, binary). |
Last Update Day | 3 | 1 | Day of last update (1-31, binary). |
Number of Records | 4-7 | 4 | Total number of data records (unsigned 32-bit integer, little-endian). |
Header Length | 8-9 | 2 | Total bytes in the header, including field descriptors (unsigned 16-bit integer, little-endian). |
Record Length | 10-11 | 2 | Bytes per data record, including deletion flag (unsigned 16-bit integer, little-endian). |
Reserved (1) | 12-13 | 2 | Reserved; typically filled with zeros. |
Incomplete Transaction Flag | 14 | 1 | 0x01 if transaction is incomplete; 0x00 otherwise. |
Encryption Flag | 15 | 1 | 0x01 if encrypted; 0x00 otherwise. |
Multi-User Reserved | 16-27 | 12 | Reserved for multi-user environments (e.g., dBASE in LAN setups); typically zeros in single-user files. |
MDX Production Flag | 28 | 1 | 0x01 if a production .MDX index file exists; 0x00 otherwise. |
Language Driver ID | 29 | 1 | Identifier for the code page or language driver (e.g., 0x01 for DOS USA). |
Reserved (2) | 30-31 | 2 | Reserved; typically filled with zeros. |
Field Descriptors Array | 32 to (Header Length - 2) | Variable (32 bytes per field) | Array of field definitions, each 32 bytes: - Bytes 0-10: Field name (ASCII, null-padded). - Byte 11: Field type (e.g., 'C' for character, 'N' for numeric). - Bytes 12-15: Reserved. - Byte 16: Field length (1-254). - Byte 17: Decimal places (for numeric fields). - Bytes 18-19: Work area ID. - Byte 20: Reserved (example flag in some versions). - Bytes 21-30: Reserved. - Byte 31: MDX index flag (0x01 if indexed). |
Field Descriptor Terminator | Header Length - 1 | 1 | Always 0x0D to mark the end of the field descriptors. |
Following the header are the data records (each of Record Length bytes, starting with a deletion flag: ' ' for active, '*' for deleted), and the file ends with 0x1A.
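Given the header length, the record length, and the field lengths, each record can be sliced directly out of the file. The following is a minimal sketch, assuming the field list has already been parsed into (name, length) pairs. `iter_records` is a hypothetical helper, values are returned as stripped text without any type conversion, and the `latin-1` default is an assumption, since the actual code page depends on the Language Driver ID.

```python
import struct


def iter_records(raw: bytes, fields, encoding="latin-1"):
    """Yield (deleted, values) for each record in a DBF byte string.

    fields is a list of (name, length) pairs in descriptor order.
    """
    num_records, header_length, record_length = struct.unpack_from("<IHH", raw, 4)
    for i in range(num_records):
        start = header_length + i * record_length
        record = raw[start:start + record_length]
        if len(record) < record_length:
            break  # truncated file: stop rather than yield partial data
        deleted = record[0:1] == b"*"  # ' ' marks active, '*' marks deleted
        values = {}
        pos = 1  # field data starts right after the deletion flag
        for name, length in fields:
            values[name] = record[pos:pos + length].decode(encoding).strip()
            pos += length
        yield deleted, values
```

Deleted records are yielded with their flag set instead of being skipped, so the caller can decide whether to show them.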
2. Two Direct Download Links for .DBF Files
- https://raw.githubusercontent.com/infused/dbf/master/spec/fixtures/dbase_03.dbf (dBASE III sample without memo file)
- https://raw.githubusercontent.com/infused/dbf/master/spec/fixtures/dbase_8b.dbf (dBASE IV sample with memo file)
3. Ghost Blog Embedded HTML/JavaScript for Drag-and-Drop .DBF File Dump
The following is a self-contained HTML page with embedded JavaScript that can be embedded in a Ghost blog post (e.g., via the HTML card). It allows users to drag and drop a .DBF file, parses the header, and displays all properties listed above on the screen.
4. Python Class for .DBF File Handling
The following Python class can open a .DBF file, decode and read the properties, print them to the console, and write modifications (e.g., update the last update date) back to a new file.
import struct
import datetime


class DBFHandler:
    """Read the header and field descriptors of a .DBF file."""

    def __init__(self, filepath):
        self.filepath = filepath
        self.header = {}
        self.fields = []
        self._read_header()

    def _read_header(self):
        with open(self.filepath, 'rb') as f:
            data = f.read(32)
            if len(data) < 32:
                raise ValueError('File is too short to contain a DBF header')
            self.header['version'] = data[0]
            self.header['year'] = 1900 + data[1]
            self.header['month'] = data[2]
            self.header['day'] = data[3]
            self.header['num_records'] = struct.unpack('<I', data[4:8])[0]
            self.header['header_length'] = struct.unpack('<H', data[8:10])[0]
            self.header['record_length'] = struct.unpack('<H', data[10:12])[0]
            self.header['transaction_flag'] = data[14]
            self.header['encryption_flag'] = data[15]
            self.header['mdx_flag'] = data[28]
            self.header['language_id'] = data[29]
            # Read 32-byte field descriptors until the 0x0D terminator
            # (or until the file ends, if the terminator is missing).
            while True:
                field_data = f.read(32)
                if len(field_data) < 32 or field_data[0] == 0x0D:
                    break
                # The name is NUL-terminated; ignore anything after the first NUL.
                raw_name = field_data[:11].split(b'\x00', 1)[0]
                self.fields.append({
                    'name': raw_name.decode('ascii', errors='replace'),
                    'type': chr(field_data[11]),
                    'length': field_data[16],
                    'decimals': field_data[17],
                    'indexed': bool(field_data[31]),
                })

    def print_properties(self):
        print("DBF Properties:")
        for key, value in self.header.items():
            print(f"{key.capitalize().replace('_', ' ')}: {value}")
        print("Field Descriptors:")
        for field in self.fields:
            print(field)

    def write(self, output_path, update_date=False):
        # Copy the file byte for byte, optionally refreshing the date in bytes 1-3.
        with open(self.filepath, 'rb') as infile:
            data = bytearray(infile.read())
        if update_date:
            now = datetime.date.today()
            data[1] = now.year - 1900
            data[2] = now.month
            data[3] = now.day
        with open(output_path, 'wb') as outfile:
            outfile.write(data)


# Example usage:
# handler = DBFHandler('sample.dbf')
# handler.print_properties()
# handler.write('output.dbf', update_date=True)
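For trying out a reader without downloading a sample, a tiny table can be assembled by hand from the layout in section 1. The sketch below is an illustration under stated assumptions, not a full writer: `build_minimal_dbf` is a hypothetical helper that supports only character ('C') fields, uses version byte 0x03, leaves all reserved bytes zero, and pads values with spaces.

```python
import datetime
import struct


def build_minimal_dbf(fields, rows, today=None):
    """Return the bytes of a small dBASE III-style table.

    fields: list of (name, length) pairs; every field is written as type 'C'.
    rows:   list of tuples of ASCII strings, one value per field.
    """
    today = today or datetime.date.today()
    header_length = 32 + 32 * len(fields) + 1  # header + descriptors + 0x0D
    record_length = 1 + sum(length for _, length in fields)  # +1 deletion flag

    header = bytearray(32)
    header[0] = 0x03                # dBASE III, no memo file
    header[1] = today.year - 1900   # year is stored as an offset from 1900
    header[2] = today.month
    header[3] = today.day
    struct.pack_into("<IHH", header, 4, len(rows), header_length, record_length)

    out = bytearray(header)
    for name, length in fields:
        desc = bytearray(32)
        encoded = name.encode("ascii")[:10]  # leave room for a NUL byte
        desc[0:len(encoded)] = encoded
        desc[11] = ord("C")                  # field type
        desc[16] = length                    # field length
        out += desc
    out.append(0x0D)                         # field descriptor terminator

    for row in rows:
        out += b" "                          # active (not deleted) record
        for (_, length), value in zip(fields, row):
            out += value.encode("ascii")[:length].ljust(length, b" ")
    out.append(0x1A)                         # end-of-file marker
    return bytes(out)
```

Writing the returned bytes to disk, for example with `open('tiny.dbf', 'wb')`, should give a file whose header and single field the class above can print.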
5. Java Class for .DBF File Handling
The following Java class can open a .DBF file, decode and read the properties, print them to the console, and write modifications (e.g., update the last update date) back to a new file.
import java.io.*;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;
public class DBFHandler {
private String filepath;
// LinkedHashMap keeps insertion order, so properties print in header order.
private Map<String, Object> header = new LinkedHashMap<>();
private List<Map<String, Object>> fields = new ArrayList<>();
public DBFHandler(String filepath) {
this.filepath = filepath;
readHeader();
}
private void readHeader() {
try {
byte[] data = Files.readAllBytes(Paths.get(filepath));
ByteBuffer buffer = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
header.put("version", Byte.toUnsignedInt(buffer.get(0)));
header.put("year", 1900 + Byte.toUnsignedInt(buffer.get(1)));
header.put("month", Byte.toUnsignedInt(buffer.get(2)));
header.put("day", Byte.toUnsignedInt(buffer.get(3)));
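// The record count is unsigned, so widen it to long instead of keeping a signed int.
header.put("num_records", Integer.toUnsignedLong(buffer.getInt(4)));
int headerLength = Short.toUnsignedInt(buffer.getShort(8));
header.put("header_length", headerLength);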
header.put("record_length", Short.toUnsignedInt(buffer.getShort(10)));
header.put("transaction_flag", Byte.toUnsignedInt(buffer.get(14)));
header.put("encryption_flag", Byte.toUnsignedInt(buffer.get(15)));
header.put("mdx_flag", Byte.toUnsignedInt(buffer.get(28)));
header.put("language_id", Byte.toUnsignedInt(buffer.get(29)));
// Read fields
int offset = 32;
while (offset + 32 <= data.length && offset < headerLength && data[offset] != 0x0D) {
Map<String, Object> field = new HashMap<>();
StringBuilder name = new StringBuilder();
for (int i = 0; i < 11; i++) {
if (data[offset + i] == 0) break;
name.append((char) data[offset + i]);
}
field.put("name", name.toString());
field.put("type", (char) data[offset + 11]);
field.put("length", Byte.toUnsignedInt(data[offset + 16]));
field.put("decimals", Byte.toUnsignedInt(data[offset + 17]));
field.put("indexed", data[offset + 31] != 0);
fields.add(field);
offset += 32;
}
} catch (IOException e) {
e.printStackTrace();
}
}
public void printProperties() {
System.out.println("DBF Properties:");
header.forEach((key, value) -> System.out.println(key.substring(0, 1).toUpperCase() + key.substring(1).replace("_", " ") + ": " + value));
System.out.println("Field Descriptors:");
fields.forEach(System.out::println);
}
public void write(String outputPath, boolean updateDate) throws IOException {
byte[] data = Files.readAllBytes(Paths.get(filepath));
if (updateDate) {
Calendar now = Calendar.getInstance();
data[1] = (byte) (now.get(Calendar.YEAR) - 1900);
data[2] = (byte) (now.get(Calendar.MONTH) + 1);
data[3] = (byte) now.get(Calendar.DAY_OF_MONTH);
}
Files.write(Paths.get(outputPath), data);
}
// Example usage:
// public static void main(String[] args) throws IOException {
// DBFHandler handler = new DBFHandler("sample.dbf");
// handler.printProperties();
// handler.write("output.dbf", true);
// }
}
6. JavaScript Class for .DBF File Handling
The following JavaScript class (for Node.js) can open a .DBF file, decode and read the properties, print them to the console, and write modifications (e.g., update the last update date) back to a new file. It uses only the built-in 'fs' module.
const fs = require('fs');
class DBFHandler {
constructor(filepath) {
this.filepath = filepath;
this.header = {};
this.fields = [];
this.readHeader();
}
readHeader() {
const data = fs.readFileSync(this.filepath);
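// A Node.js Buffer may be a view into a larger shared ArrayBuffer, so pass its offset and length explicitly.
const view = new DataView(data.buffer, data.byteOffset, data.byteLength);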
this.header.version = view.getUint8(0);
this.header.year = 1900 + view.getUint8(1);
this.header.month = view.getUint8(2);
this.header.day = view.getUint8(3);
this.header.num_records = view.getUint32(4, true);
const headerLength = view.getUint16(8, true);
this.header.header_length = headerLength;
this.header.record_length = view.getUint16(10, true);
this.header.transaction_flag = view.getUint8(14);
this.header.encryption_flag = view.getUint8(15);
this.header.mdx_flag = view.getUint8(28);
this.header.language_id = view.getUint8(29);
// Read fields
let offset = 32;
while (offset + 32 <= view.byteLength && offset < headerLength && view.getUint8(offset) !== 0x0D) {
let name = '';
for (let i = 0; i < 11; i++) {
const char = view.getUint8(offset + i);
if (char === 0) break;
name += String.fromCharCode(char);
}
const type = String.fromCharCode(view.getUint8(offset + 11));
const length = view.getUint8(offset + 16);
const decimals = view.getUint8(offset + 17);
const indexed = view.getUint8(offset + 31) !== 0;
this.fields.push({ name, type, length, decimals, indexed });
offset += 32;
}
}
printProperties() {
console.log('DBF Properties:');
for (const [key, value] of Object.entries(this.header)) {
console.log(`${key.replace(/_/g, ' ').replace(/\b\w/g, c => c.toUpperCase())}: ${value}`);
}
console.log('Field Descriptors:');
this.fields.forEach(field => console.log(field));
}
write(outputPath, updateDate = false) {
let data = fs.readFileSync(this.filepath);
if (updateDate) {
const now = new Date();
data[1] = now.getFullYear() - 1900;
data[2] = now.getMonth() + 1;
data[3] = now.getDate();
}
fs.writeFileSync(outputPath, data);
}
}
// Example usage:
// const handler = new DBFHandler('sample.dbf');
// handler.printProperties();
// handler.write('output.dbf', true);
7. C++ Class for .DBF File Handling
The following C++ class can open a .DBF file, decode and read the properties, print them to the console, and write modifications (e.g., update the last update date) back to a new file. Compile with a C++ compiler (e.g., g++).
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <ctime>
#include <cstdint>
struct FieldDescriptor {
std::string name;
char type;
uint8_t length;
uint8_t decimals;
bool indexed;
};
class DBFHandler {
private:
std::string filepath;
std::vector<uint8_t> data;
struct Header {
uint8_t version;
uint16_t year;
uint8_t month;
uint8_t day;
uint32_t num_records;
uint16_t header_length;
uint16_t record_length;
uint8_t transaction_flag;
uint8_t encryption_flag;
uint8_t mdx_flag;
uint8_t language_id;
} header;
std::vector<FieldDescriptor> fields;
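uint16_t readUint16(size_t offset) const {
return static_cast<uint16_t>(data[offset] | (data[offset + 1] << 8));
}
uint32_t readUint32(size_t offset) const {
// Widen each byte before shifting so that bit 31 is never shifted into a signed int.
return static_cast<uint32_t>(data[offset])
| (static_cast<uint32_t>(data[offset + 1]) << 8)
| (static_cast<uint32_t>(data[offset + 2]) << 16)
| (static_cast<uint32_t>(data[offset + 3]) << 24);
}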
public:
DBFHandler(const std::string& filepath) : filepath(filepath) {
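std::ifstream file(filepath, std::ios::binary | std::ios::ate);
if (!file) {
throw std::ios_base::failure("Cannot open " + filepath);
}
const std::streamsize size = file.tellg();
data.resize(static_cast<size_t>(size));
file.seekg(0);
file.read(reinterpret_cast<char*>(data.data()), size);
if (data.size() < 32) {
throw std::ios_base::failure("File is too short to contain a DBF header");
}
readHeader();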
}
void readHeader() {
header.version = data[0];
header.year = 1900 + data[1];
header.month = data[2];
header.day = data[3];
header.num_records = readUint32(4);
header.header_length = readUint16(8);
header.record_length = readUint16(10);
header.transaction_flag = data[14];
header.encryption_flag = data[15];
header.mdx_flag = data[28];
header.language_id = data[29];
// Read fields
size_t offset = 32;
while (offset + 32 <= data.size() && offset < header.header_length && data[offset] != 0x0D) {
FieldDescriptor field;
field.name = "";
for (int i = 0; i < 11; ++i) {
if (data[offset + i] == 0) break;
field.name += static_cast<char>(data[offset + i]);
}
field.type = static_cast<char>(data[offset + 11]);
field.length = data[offset + 16];
field.decimals = data[offset + 17];
field.indexed = data[offset + 31] != 0;
fields.push_back(field);
offset += 32;
}
}
void printProperties() {
std::cout << "DBF Properties:" << std::endl;
std::cout << "Version: " << static_cast<int>(header.version) << std::endl;
std::cout << "Year: " << header.year << std::endl;
std::cout << "Month: " << static_cast<int>(header.month) << std::endl;
std::cout << "Day: " << static_cast<int>(header.day) << std::endl;
std::cout << "Num Records: " << header.num_records << std::endl;
std::cout << "Header Length: " << header.header_length << std::endl;
std::cout << "Record Length: " << header.record_length << std::endl;
std::cout << "Transaction Flag: " << static_cast<int>(header.transaction_flag) << std::endl;
std::cout << "Encryption Flag: " << static_cast<int>(header.encryption_flag) << std::endl;
std::cout << "MDX Flag: " << static_cast<int>(header.mdx_flag) << std::endl;
std::cout << "Language ID: " << static_cast<int>(header.language_id) << std::endl;
std::cout << "Field Descriptors:" << std::endl;
for (const auto& field : fields) {
std::cout << "Name: " << field.name << ", Type: " << field.type
<< ", Length: " << static_cast<int>(field.length)
<< ", Decimals: " << static_cast<int>(field.decimals)
<< ", Indexed: " << (field.indexed ? "Yes" : "No") << std::endl;
}
}
void write(const std::string& outputPath, bool updateDate) {
if (updateDate) {
time_t now = time(nullptr);
tm* local = localtime(&now);
data[1] = static_cast<uint8_t>(local->tm_year);
data[2] = static_cast<uint8_t>(local->tm_mon + 1);
data[3] = static_cast<uint8_t>(local->tm_mday);
}
std::ofstream outfile(outputPath, std::ios::binary);
outfile.write(reinterpret_cast<const char*>(data.data()), data.size());
}
};
// Example usage:
// int main() {
// DBFHandler handler("sample.dbf");
// handler.printProperties();
// handler.write("output.dbf", true);
// return 0;
// }