Task 090: .CLASS File Format
1. List of all the properties of the .CLASS file format intrinsic to its file system
The .CLASS file format is the binary format for Java bytecode, as defined in the Java Virtual Machine Specification. It is a stream of 8-bit bytes, with multi-byte values in big-endian order. The intrinsic properties are the fixed and variable structure elements that define the file's content, including headers, pools, flags, indices, and substructures for fields, methods, and attributes. Below is a hierarchical list of all properties, including types (u1 = unsigned 8-bit, u2 = unsigned 16-bit, u4 = unsigned 32-bit, etc.), descriptions, and dependencies. This is based on the standard structure, where the constant pool is used to resolve string and reference indices throughout the file.
magic (u4)
- Description: Fixed magic number identifying the file as a Java class file. Value must be 0xCAFEBABE.
minor_version (u2)
- Description: Minor version of the class file format (e.g., 0 for most modern versions).
major_version (u2)
- Description: Major version of the class file format (e.g., 65 for Java 21, 69 for Java 25). Determines supported features and JVM compatibility.
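As a rough illustration of the three header fields above, the following sketch unpacks them with Python's struct module. The helper name read_header and the sample bytes are hypothetical (hand-built, not taken from a real file). The rule "Java release = major_version - 44" is an assumption that holds for Java 5 (major 49) and later, for example 52 for Java 8 and 65 for Java 21.

```python
import struct

def read_header(data: bytes) -> dict:
    """Decode magic, minor_version, and major_version from the first 8 bytes."""
    magic, minor, major = struct.unpack_from(">IHH", data, 0)  # big-endian u4, u2, u2
    if magic != 0xCAFEBABE:
        raise ValueError(f"not a class file: magic is 0x{magic:08X}")
    # Assumption: major - 44 gives the Java release for major >= 49
    return {"magic": magic, "minor": minor, "major": major, "java_release": major - 44}

# Hand-built header: magic, minor 0, major 65 (0x0041)
sample = bytes.fromhex("CAFEBABE" "0000" "0041")
header = read_header(sample)
print(header["major"], header["java_release"])  # 65 21
```

Reading all three fields with one format string keeps the byte order explicit in a single place.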
constant_pool_count (u2)
- Description: One more than the number of constant pool slots; valid indices run from 1 to constant_pool_count - 1. A CONSTANT_Long or CONSTANT_Double entry takes two slots, so the table can hold fewer than constant_pool_count - 1 cp_info structures.
constant_pool (cp_info[constant_pool_count - 1])
- Description: Table of literal constants, symbolic references, and other data used throughout the class file. Each entry is variable-length and starts with a tag. Used to resolve names, descriptors, classes, etc. Specific sub-properties (cp_info structures, all big-endian):
- tag (u1)
- Description: Type identifier. Defined values are 1, 3-12, and 15-20; the values 2, 13, and 14 are not used.
- Sub-types based on tag:
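- Tag 1 (CONSTANT_Utf8): length (u2), bytes (u1[length]) - Modified UTF-8 string: U+0000 is encoded as the two bytes 0xC0 0x80 so that no byte is zero, and supplementary characters are encoded as a surrogate pair of two three-byte sequences.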
- Tag 3 (CONSTANT_Integer): bytes (u4) - Signed 32-bit integer.
- Tag 4 (CONSTANT_Float): bytes (u4) - IEEE 754 single-precision float.
- Tag 5 (CONSTANT_Long): high_bytes (u4), low_bytes (u4) - Signed 64-bit integer (occupies two pool slots).
- Tag 6 (CONSTANT_Double): high_bytes (u4), low_bytes (u4) - IEEE 754 double-precision float (occupies two pool slots).
- Tag 7 (CONSTANT_Class): name_index (u2) - Index to CONSTANT_Utf8 for binary class/interface name (slashes instead of dots).
- Tag 8 (CONSTANT_String): string_index (u2) - Index to CONSTANT_Utf8 for string value.
- Tag 9 (CONSTANT_Fieldref): class_index (u2, to CONSTANT_Class), name_and_type_index (u2, to CONSTANT_NameAndType).
- Tag 10 (CONSTANT_Methodref): class_index (u2, to CONSTANT_Class), name_and_type_index (u2, to CONSTANT_NameAndType).
- Tag 11 (CONSTANT_InterfaceMethodref): class_index (u2, to CONSTANT_Class), name_and_type_index (u2, to CONSTANT_NameAndType).
- Tag 12 (CONSTANT_NameAndType): name_index (u2, to CONSTANT_Utf8 for name), descriptor_index (u2, to CONSTANT_Utf8 for descriptor).
- Tag 15 (CONSTANT_MethodHandle): reference_kind (u1, 1-9 for reference type), reference_index (u2, to field/method ref).
- Tag 16 (CONSTANT_MethodType): descriptor_index (u2, to CONSTANT_Utf8 for method descriptor).
- Tag 17 (CONSTANT_Dynamic): bootstrap_method_attr_index (u2), name_and_type_index (u2).
- Tag 18 (CONSTANT_InvokeDynamic): bootstrap_method_attr_index (u2), name_and_type_index (u2).
- Tag 19 (CONSTANT_Module): name_index (u2, to CONSTANT_Utf8 for module name).
- Tag 20 (CONSTANT_Package): name_index (u2, to CONSTANT_Utf8 for package name).
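The entry sizes listed above are enough to step through the pool without interpreting every entry. The sketch below is one possible way to do that; the names PAYLOAD_SIZE and walk_constant_pool are hypothetical, and the sample pool is hand-built. The point it illustrates is that the index advances by two after a CONSTANT_Long or CONSTANT_Double.

```python
import struct

# Payload sizes in bytes (after the tag) for every tag except CONSTANT_Utf8
PAYLOAD_SIZE = {3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4, 11: 4,
                12: 4, 15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2}

def walk_constant_pool(data: bytes, offset: int, count: int):
    """Return ({index: (tag, payload)}, offset just past the pool)."""
    pool = {}
    index = 1                                   # index 0 is never stored in the file
    while index < count:
        tag = data[offset]
        offset += 1
        if tag == 1:                            # CONSTANT_Utf8: u2 length, then the bytes
            (size,) = struct.unpack_from(">H", data, offset)
            offset += 2
        elif tag in PAYLOAD_SIZE:
            size = PAYLOAD_SIZE[tag]
        else:
            raise ValueError(f"unknown tag {tag} at index {index}")
        pool[index] = (tag, data[offset:offset + size])
        offset += size
        # Long and Double take two slots, so the following index is skipped
        index += 2 if tag in (5, 6) else 1
    return pool, offset

# Hand-built pool with count 5: Utf8 "Hi" (#1), Long 7 (#2 and #3), Class -> #1 (#4)
raw = (b"\x01\x00\x02Hi"
       + b"\x05" + (7).to_bytes(8, "big")
       + b"\x07\x00\x01")
pool, end = walk_constant_pool(raw, 0, 5)
print(sorted(pool), end)  # [1, 2, 4] 17
```

Keying the result by pool index, rather than by list position, means references such as name_index can be looked up directly.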
access_flags (u2)
- Description: Bitmask of flags for the class/interface (e.g., ACC_PUBLIC = 0x0001, ACC_FINAL = 0x0010, ACC_ABSTRACT = 0x0400, ACC_MODULE = 0x8000 for modules in Java 9+).
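A bitmask like this can be turned into flag names with a simple table lookup. The sketch below assumes the class-level flag values from the JVM specification (the four named above plus ACC_SUPER, ACC_INTERFACE, ACC_SYNTHETIC, ACC_ANNOTATION, and ACC_ENUM); the names CLASS_FLAGS and decode_class_flags are hypothetical.

```python
# Class-level access flags (bit value -> name)
CLASS_FLAGS = {
    0x0001: "ACC_PUBLIC", 0x0010: "ACC_FINAL", 0x0020: "ACC_SUPER",
    0x0200: "ACC_INTERFACE", 0x0400: "ACC_ABSTRACT", 0x1000: "ACC_SYNTHETIC",
    0x2000: "ACC_ANNOTATION", 0x4000: "ACC_ENUM", 0x8000: "ACC_MODULE",
}

def decode_class_flags(access_flags: int) -> list:
    """Return the names of all flags whose bit is set, in ascending bit order."""
    return [name for bit, name in CLASS_FLAGS.items() if access_flags & bit]

print(decode_class_flags(0x0021))  # ['ACC_PUBLIC', 'ACC_SUPER']
```

Field and method flags reuse some of the same bit values with different meanings, so each context would need its own table.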
this_class (u2)
- Description: Index to CONSTANT_Class in constant pool for the current class name.
super_class (u2)
- Description: Index to CONSTANT_Class in constant pool for the superclass name (0 if no superclass, e.g., for java.lang.Object).
interfaces_count (u2)
- Description: Number of direct superinterfaces.
interfaces (u2[interfaces_count])
- Description: Array of indices to CONSTANT_Class in constant pool for each interface name.
fields_count (u2)
- Description: Number of fields in the class.
fields (field_info[fields_count])
- Description: Array of field definitions. Each field_info:
- access_flags (u2) - Bitmask (e.g., ACC_PUBLIC = 0x0001, ACC_STATIC = 0x0008, ACC_FINAL = 0x0010).
- name_index (u2) - Index to CONSTANT_Utf8 for field name.
- descriptor_index (u2) - Index to CONSTANT_Utf8 for field descriptor (e.g., "I" for int).
- attributes_count (u2) - Number of attributes for this field.
- attributes (attribute_info[attributes_count]) - See attributes below.
methods_count (u2)
- Description: Number of methods in the class.
methods (method_info[methods_count])
- Description: Array of method definitions. Each method_info (similar to field_info):
- access_flags (u2) - Bitmask (e.g., ACC_PUBLIC = 0x0001, ACC_STATIC = 0x0008, ACC_ABSTRACT = 0x0400).
- name_index (u2) - Index to CONSTANT_Utf8 for method name.
- descriptor_index (u2) - Index to CONSTANT_Utf8 for method descriptor (e.g., "()V" for a method that takes no arguments and returns void; main(String[]) has "([Ljava/lang/String;)V").
- attributes_count (u2) - Number of attributes for this method.
- attributes (attribute_info[attributes_count]) - See attributes below.
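Field and method descriptors use a compact grammar: one letter per primitive type, L<binary name>; for object types, and a leading [ per array dimension. As a minimal sketch of how they can be made readable, the hypothetical helpers parse_type and parse_method_descriptor below handle well-formed descriptors only and do no validation beyond that.

```python
BASE_TYPES = {"B": "byte", "C": "char", "D": "double", "F": "float",
              "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void"}

def parse_type(desc: str, pos: int = 0):
    """Parse one type starting at desc[pos]; return (readable name, next position)."""
    dims = 0
    while desc[pos] == "[":              # each '[' adds one array dimension
        dims += 1
        pos += 1
    if desc[pos] == "L":                 # object type: L<binary name>;
        end = desc.index(";", pos)
        name = desc[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:                                # primitive type or void
        name = BASE_TYPES[desc[pos]]
        pos += 1
    return name + "[]" * dims, pos

def parse_method_descriptor(desc: str):
    """Split '(params)return' into (list of parameter types, return type)."""
    if not desc.startswith("("):
        raise ValueError("method descriptors start with '('")
    params, pos = [], 1
    while desc[pos] != ")":
        name, pos = parse_type(desc, pos)
        params.append(name)
    ret, _ = parse_type(desc, pos + 1)
    return params, ret

print(parse_type("I")[0])                                # int
print(parse_method_descriptor("()V"))                    # ([], 'void')
print(parse_method_descriptor("([Ljava/lang/String;)V")) # (['java.lang.String[]'], 'void')
```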
attributes_count (u2)
- Description: Number of attributes for the class.
attributes (attribute_info[attributes_count])
- Description: Array of class-level attributes. Each attribute_info (variable-length, used in class, fields, methods, code):
- attribute_name_index (u2) - Index to CONSTANT_Utf8 for attribute name (e.g., "Code", "SourceFile").
- attribute_length (u4) - Length of the info bytes (not including name/length).
- info (u1[attribute_length]) - Variable data depending on name. Common sub-properties:
- "Code": max_stack (u2), max_locals (u2), code_length (u4), code (u1[code_length] - bytecode), exception_table_length (u2), exception_table (variable exception handlers), attributes_count (u2), code_attributes (attribute_info[] - e.g., LineNumberTable, LocalVariableTable).
- "ConstantValue": constantvalue_index (u2) - Index to constant pool for default value.
- "Exceptions": number_of_exceptions (u2), exception_index_table (u2[number_of_exceptions]) - Indices to CONSTANT_Class for thrown exceptions.
- "InnerClasses": number_of_classes (u2), array of {inner_class_info_index (u2), outer_class_info_index (u2), inner_name_index (u2), inner_class_access_flags (u2)}.
- "EnclosingMethod": class_index (u2), method_index (u2) - For local/anonymous classes.
- "Signature": signature_index (u2) - For generic types (Java 5+).
- "SourceFile": sourcefile_index (u2) - Index to CONSTANT_Utf8 for source file name.
- "StackMapTable": Variable stack map frames for verification (Java 6+).
- Other attributes (e.g., "Deprecated", "RuntimeVisibleAnnotations", "Module" for Java 9+) have specific formats.
These properties are intrinsic to the format's "file system" in the sense that they define the self-contained binary structure, with no external dependencies beyond the constant pool for resolution. The total file size is variable based on pool, fields, methods, and attributes.
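Because every attribute_info starts with a name index and a length, a reader can step over attributes it does not understand. The sketch below illustrates this with a hypothetical walk_attributes helper. The utf8 lookup table and the sample bytes are hand-built for the example and stand in for a real, already-parsed constant pool.

```python
import struct

def walk_attributes(data: bytes, offset: int, count: int, utf8: dict):
    """Return ([(name, info bytes)], offset just past the last attribute).

    `utf8` maps constant pool indices to already-decoded CONSTANT_Utf8 strings.
    """
    attributes = []
    for _ in range(count):
        name_index, length = struct.unpack_from(">HI", data, offset)
        offset += 6                               # u2 name index + u4 length
        info = data[offset:offset + length]
        offset += length                          # step over the body without decoding it
        attributes.append((utf8.get(name_index, f"#{name_index}"), info))
    return attributes, offset

# Hand-built example: SourceFile (body is a u2 index, here 9) and an empty Deprecated
utf8 = {8: "SourceFile", 9: "Hello.java", 10: "Deprecated"}
raw = struct.pack(">HIH", 8, 2, 9) + struct.pack(">HI", 10, 0)
attrs, end = walk_attributes(raw, 0, 2, utf8)
for name, info in attrs:
    print(name, len(info))
```

The same loop applies at class, field, method, and Code level, since all four use the same attribute_info layout.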
2. Two direct download links for files of format .CLASS
- https://filesamples.com/samples/code/class/sample1.class (A sample .class file for testing, approximately 1KB, containing basic class structure.)
- https://filesamples.com/samples/code/class/sample1.class (Second sample using the same source, as additional unique public direct links to distinct .class files were not readily available in search results; in practice, generate a second via javac on a simple Java source for variety.)
3. Ghost blog embedded HTML JavaScript for drag n drop .CLASS file and dump properties
This is an HTML snippet with embedded JavaScript that can be pasted into a Ghost blog post (using the HTML card or code block). It creates a drag-and-drop zone. When a .CLASS file is dropped, it reads the binary data using FileReader, parses the structure using DataView, and dumps the properties to an output element on the page. Parsing covers the main structure and the constant pool (strings are resolved where possible). Attributes are reported only at the top level; advanced attributes such as Code are noted but not decoded, for brevity.
4. Python class to open, decode, read, write, and print properties
This Python class uses the struct module to parse .CLASS files. It reads the binary, parses the main properties (resolving constant pool strings), prints them to the console, and can write the file back. The write is a byte-for-byte copy for demonstration; a full writer would have to rebuild the variable-length parts.
import json
import struct
import sys


class ClassFileParser:
    def __init__(self, filename=None, data=None):
        self.filename = filename
        self.data = data
        self.offset = 0
        self.constant_pool = []
        if filename:
            with open(filename, 'rb') as f:
                self.data = f.read()
        if not self.data:
            raise ValueError("Provide filename or data")

    def u1(self):
        val = self.data[self.offset]
        self.offset += 1
        return val

    def u2(self):
        val = struct.unpack_from('>H', self.data, self.offset)[0]
        self.offset += 2
        return val

    def u4(self):
        val = struct.unpack_from('>I', self.data, self.offset)[0]
        self.offset += 4
        return val

    def read_magic(self):
        return self.u4()

    def read_versions(self):
        return {'minor': self.u2(), 'major': self.u2()}

    def read_constant_pool_count(self):
        return self.u2()

    def read_constant_pool(self, count):
        # A while loop is needed: reassigning the variable of a for loop
        # would not skip the second slot of a Long or Double.
        i = 1
        while i < count:
            tag = self.u1()
            entry = {'tag': tag}
            if tag == 1:  # Utf8
                length = self.u2()
                bytes_data = self.data[self.offset:self.offset + length]
                # Standard UTF-8 decoding only approximates modified UTF-8
                entry['string'] = bytes_data.decode('utf-8', errors='replace')
                self.offset += length
            elif tag in (3, 4):  # Integer, Float (raw 32-bit value)
                entry['value'] = self.u4()
            elif tag in (5, 6):  # Long, Double (raw halves)
                entry['high'] = self.u4()
                entry['low'] = self.u4()
            elif tag in (7, 8, 16, 19, 20):  # Class, String, MethodType, Module, Package
                entry['index'] = self.u2()
            elif tag in (9, 10, 11, 12, 17, 18):  # refs, NameAndType, Dynamic, InvokeDynamic
                entry['class_or_kind'] = self.u2()
                entry['name_and_type_or_ref'] = self.u2()
            elif tag == 15:  # MethodHandle: u1 reference_kind, then u2 reference_index
                entry['kind'] = self.u1()
                entry['reference_index'] = self.u2()
            else:
                raise ValueError(f"Unknown constant pool tag {tag} at index {i}")
            self.constant_pool.append(entry)
            i += 1
            if tag in (5, 6):
                # Long and Double occupy two slots; keep a placeholder for the
                # unusable second slot so that later indices stay aligned.
                self.constant_pool.append({'tag': 0})
                i += 1

    def get_constant_string(self, index):
        if index == 0:
            return 'none'
        entry = self.constant_pool[index - 1]
        if 'string' in entry:
            return entry['string']
        if 'index' in entry:
            # Class, String, and similar entries point at a Utf8 entry
            return self.get_constant_string(entry['index'])
        return f'CONSTANT[{index}]'

    def skip_attributes(self, count):
        # Attribute bodies are not decoded, but they must be stepped over
        # so that the next structure is read from the right offset.
        for _ in range(count):
            self.u2()              # attribute_name_index
            length = self.u4()     # attribute_length
            self.offset += length  # info bytes

    def read_access_flags(self):
        return self.u2()

    def read_this_class(self):
        return self.u2()

    def read_super_class(self):
        return self.u2()

    def read_interfaces_count(self):
        return self.u2()

    def read_interfaces(self, count):
        return [self.u2() for _ in range(count)]

    def read_fields_count(self):
        return self.u2()

    def read_member(self):
        # field_info and method_info share the same layout
        member = {
            'access_flags': self.u2(),
            'name_index': self.u2(),
            'descriptor_index': self.u2(),
            'attributes_count': self.u2(),
        }
        self.skip_attributes(member['attributes_count'])
        member['name'] = self.get_constant_string(member['name_index'])
        member['descriptor'] = self.get_constant_string(member['descriptor_index'])
        return member

    def read_fields(self, count):
        return [self.read_member() for _ in range(count)]

    def read_methods_count(self):
        return self.u2()

    def read_methods(self, count):
        return [self.read_member() for _ in range(count)]

    def read_attributes_count(self):
        return self.u2()

    def parse(self):
        magic = self.read_magic()
        if magic != 0xCAFEBABE:
            raise ValueError("Invalid magic")
        versions = self.read_versions()
        cp_count = self.read_constant_pool_count()
        self.read_constant_pool(cp_count)
        access_flags = self.read_access_flags()
        this_class = self.read_this_class()
        super_class = self.read_super_class()
        interfaces_count = self.read_interfaces_count()
        interfaces = self.read_interfaces(interfaces_count)
        fields_count = self.read_fields_count()
        fields = self.read_fields(fields_count)
        methods_count = self.read_methods_count()
        methods = self.read_methods(methods_count)
        attributes_count = self.read_attributes_count()
        self.skip_attributes(attributes_count)  # Class-level attributes are not decoded
        return {
            'magic': hex(magic),
            'versions': versions,
            'constant_pool_count': cp_count,
            'constant_pool': self.constant_pool,
            'access_flags': hex(access_flags),
            'this_class': self.get_constant_string(this_class),
            'super_class': self.get_constant_string(super_class),
            'interfaces_count': interfaces_count,
            'interfaces': [self.get_constant_string(idx) for idx in interfaces],
            'fields_count': fields_count,
            'fields': fields,
            'methods_count': methods_count,
            'methods': methods,
            'attributes_count': attributes_count,
        }

    def print_properties(self, props):
        print(json.dumps(props, indent=2, default=repr))

    def write(self, filename):
        with open(filename, 'wb') as f:
            f.write(self.data)  # Simple write-back; full write would repack structure


# Usage example
if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python parser.py <file.class>")
        sys.exit(1)
    parser = ClassFileParser(sys.argv[1])
    props = parser.parse()
    parser.print_properties(props)
    # To write: parser.write('output.class')
Run with python parser.py example.class to print properties.
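The parser in this section decodes CONSTANT_Utf8 entries with the standard UTF-8 codec, which only approximates modified UTF-8 (it mishandles the 0xC0 0x80 encoding of U+0000 and surrogate pairs). A more faithful decoder can rebuild the UTF-16 code units by hand, as in the following sketch. The function name decode_modified_utf8 is hypothetical, and the code assumes well-formed input without checking continuation bytes.

```python
def decode_modified_utf8(raw: bytes) -> str:
    """Decode JVM modified UTF-8 by rebuilding the UTF-16 code units."""
    units = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if b < 0x80:                       # 1 byte: U+0001..U+007F
            units.append(b)
            i += 1
        elif (b & 0xE0) == 0xC0:           # 2 bytes: U+0080..U+07FF and U+0000
            units.append(((b & 0x1F) << 6) | (raw[i + 1] & 0x3F))
            i += 2
        elif (b & 0xF0) == 0xE0:           # 3 bytes: U+0800..U+FFFF, including surrogates
            units.append(((b & 0x0F) << 12) | ((raw[i + 1] & 0x3F) << 6)
                         | (raw[i + 2] & 0x3F))
            i += 3
        else:                              # 4-byte sequences do not occur in this encoding
            raise ValueError(f"invalid modified UTF-8 byte 0x{b:02X} at offset {i}")
    utf16 = b"".join(u.to_bytes(2, "big") for u in units)
    # The UTF-16 codec joins surrogate pairs; surrogatepass tolerates unpaired ones
    return utf16.decode("utf-16-be", errors="surrogatepass")

print(repr(decode_modified_utf8(b"A\xC0\x80B")))                          # 'A\x00B'
print(decode_modified_utf8(b"\xED\xA0\xBD\xED\xB8\x80") == "\U0001F600")  # True
```

Going through UTF-16 code units mirrors how the encoding is defined, so surrogate pairs need no special case in the byte loop.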
5. Java class to open, decode, read, write, and print properties
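This Java class reads the file into a byte array with Files.readAllBytes and decodes it with manual big-endian reads (the same values a DataInputStream would return). It parses the properties (resolving constant pool strings), prints them to the console using System.out, and can write the file back.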
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;
import java.util.stream.Collectors;

public class ClassFileParser {
    private final byte[] data;
    private int offset = 0;
    private final List<Map<String, Object>> constantPool = new ArrayList<>();

    public ClassFileParser(String filename) throws IOException {
        this.data = Files.readAllBytes(Paths.get(filename));
    }

    private int u1() {
        return data[offset++] & 0xFF;
    }

    private int u2() {
        return ((data[offset++] & 0xFF) << 8) | (data[offset++] & 0xFF);
    }

    private long u4() {
        return ((long) (data[offset++] & 0xFF) << 24) | ((data[offset++] & 0xFF) << 16)
                | ((data[offset++] & 0xFF) << 8) | (data[offset++] & 0xFF);
    }

    private long readMagic() {
        return u4();
    }

    private Map<String, Integer> readVersions() {
        Map<String, Integer> versions = new LinkedHashMap<>();
        versions.put("minor", u2());
        versions.put("major", u2());
        return versions;
    }

    private int readConstantPoolCount() {
        return u2();
    }

    private void readConstantPool(int count) {
        for (int i = 1; i < count; i++) {
            int tag = u1();
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("tag", tag);
            switch (tag) {
                case 1: { // Utf8
                    int length = u2();
                    // Standard UTF-8 decoding only approximates modified UTF-8
                    entry.put("string", new String(data, offset, length, StandardCharsets.UTF_8));
                    offset += length;
                    break;
                }
                case 3: case 4: // Integer, Float (raw 32-bit value)
                    entry.put("value", u4());
                    break;
                case 5: case 6: // Long, Double (raw halves)
                    entry.put("high", u4());
                    entry.put("low", u4());
                    break;
                case 7: case 8: case 16: case 19: case 20: // Class, String, MethodType, Module, Package
                    entry.put("index", u2());
                    break;
                case 9: case 10: case 11: case 12: case 17: case 18:
                    entry.put("class_or_kind", u2());
                    entry.put("name_and_type_or_ref", u2());
                    break;
                case 15: // MethodHandle: u1 reference_kind, then u2 reference_index
                    entry.put("kind", u1());
                    entry.put("reference_index", u2());
                    break;
                default:
                    throw new IllegalStateException("Unknown constant pool tag " + tag + " at index " + i);
            }
            constantPool.add(entry);
            if (tag == 5 || tag == 6) {
                // Long and Double occupy two slots; add a placeholder for the
                // unusable second slot so that later indices stay aligned.
                Map<String, Object> unused = new LinkedHashMap<>();
                unused.put("tag", 0);
                constantPool.add(unused);
                i++;
            }
        }
    }

    private String getConstantString(int index) {
        if (index == 0) return "none";
        Map<String, Object> entry = constantPool.get(index - 1);
        if (entry.containsKey("string")) return (String) entry.get("string");
        // Class, String, and similar entries point at a Utf8 entry
        if (entry.containsKey("index")) return getConstantString((Integer) entry.get("index"));
        return "CONSTANT[" + index + "]";
    }

    // Attribute bodies are not decoded, but they must be stepped over
    // so that the next structure is read from the right offset.
    private void skipAttributes(int count) {
        for (int i = 0; i < count; i++) {
            u2();                   // attribute_name_index
            long length = u4();     // attribute_length
            offset += (int) length; // info bytes
        }
    }

    private int readAccessFlags() {
        return u2();
    }

    private int readThisClass() {
        return u2();
    }

    private int readSuperClass() {
        return u2();
    }

    private int readInterfacesCount() {
        return u2();
    }

    private List<Integer> readInterfaces(int count) {
        List<Integer> interfaces = new ArrayList<>();
        for (int i = 0; i < count; i++) interfaces.add(u2());
        return interfaces;
    }

    private int readFieldsCount() {
        return u2();
    }

    // field_info and method_info share the same layout
    private Map<String, Object> readMember() {
        Map<String, Object> member = new LinkedHashMap<>();
        int accessFlags = u2();
        int nameIndex = u2();
        int descriptorIndex = u2();
        int attributesCount = u2();
        skipAttributes(attributesCount);
        member.put("access_flags", accessFlags);
        member.put("name_index", nameIndex);
        member.put("descriptor_index", descriptorIndex);
        member.put("attributes_count", attributesCount);
        member.put("name", getConstantString(nameIndex));
        member.put("descriptor", getConstantString(descriptorIndex));
        return member;
    }

    private List<Map<String, Object>> readFields(int count) {
        List<Map<String, Object>> fields = new ArrayList<>();
        for (int i = 0; i < count; i++) fields.add(readMember());
        return fields;
    }

    private int readMethodsCount() {
        return u2();
    }

    private List<Map<String, Object>> readMethods(int count) {
        List<Map<String, Object>> methods = new ArrayList<>();
        for (int i = 0; i < count; i++) methods.add(readMember());
        return methods;
    }

    private int readAttributesCount() {
        return u2();
    }

    public Map<String, Object> parse() {
        long magic = readMagic();
        // The L suffix matters: as an int literal, 0xCAFEBABE is negative and
        // would never equal the unsigned value held in the long.
        if (magic != 0xCAFEBABEL) throw new RuntimeException("Invalid magic");
        Map<String, Integer> versions = readVersions();
        int cpCount = readConstantPoolCount();
        readConstantPool(cpCount);
        int accessFlags = readAccessFlags();
        int thisClass = readThisClass();
        int superClass = readSuperClass();
        int interfacesCount = readInterfacesCount();
        List<Integer> interfaces = readInterfaces(interfacesCount);
        int fieldsCount = readFieldsCount();
        List<Map<String, Object>> fields = readFields(fieldsCount);
        int methodsCount = readMethodsCount();
        List<Map<String, Object>> methods = readMethods(methodsCount);
        int attributesCount = readAttributesCount();
        skipAttributes(attributesCount); // Class-level attributes are not decoded

        Map<String, Object> props = new LinkedHashMap<>();
        props.put("magic", "0x" + Long.toHexString(magic).toUpperCase());
        props.put("versions", versions);
        props.put("constant_pool_count", cpCount);
        props.put("constant_pool", constantPool);
        props.put("access_flags", "0x" + Integer.toHexString(accessFlags).toUpperCase());
        props.put("this_class", getConstantString(thisClass));
        props.put("super_class", getConstantString(superClass));
        props.put("interfaces_count", interfacesCount);
        props.put("interfaces", interfaces.stream().map(this::getConstantString).collect(Collectors.toList()));
        props.put("fields_count", fieldsCount);
        props.put("fields", fields);
        props.put("methods_count", methodsCount);
        props.put("methods", methods);
        props.put("attributes_count", attributesCount);
        return props;
    }

    public void printProperties(Map<String, Object> props) {
        System.out.println(props); // Simple print; use JSON lib for pretty
    }

    public void write(String filename) throws IOException {
        Files.write(Paths.get(filename), data); // Simple write-back
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("Usage: java ClassFileParser <file.class>");
            return;
        }
        ClassFileParser parser = new ClassFileParser(args[0]);
        Map<String, Object> props = parser.parse();
        parser.printProperties(props);
        // parser.write("output.class");
    }
}
Compile with javac ClassFileParser.java and run with java ClassFileParser example.class.
6. JavaScript class to open, decode, read, write, and print properties
This JavaScript class (Node.js compatible, using fs for file I/O) parses .CLASS files using Buffer. It reads/decodes properties, prints to console, and can write back. For browser, adapt with FileReader as in part 3.
const fs = require('fs');

class ClassFileParser {
  constructor(filename = null, buffer = null) {
    this.filename = filename;
    this.buffer = buffer;
    this.offset = 0;
    this.constantPool = [];
    if (filename) {
      this.buffer = fs.readFileSync(filename);
    }
    if (!this.buffer) {
      throw new Error('Provide filename or buffer');
    }
  }

  u1() {
    const val = this.buffer[this.offset];
    this.offset++;
    return val;
  }

  u2() {
    const val = (this.buffer[this.offset] << 8) | this.buffer[this.offset + 1];
    this.offset += 2;
    return val;
  }

  u4() {
    const val = (this.buffer[this.offset] << 24) | (this.buffer[this.offset + 1] << 16) |
      (this.buffer[this.offset + 2] << 8) | this.buffer[this.offset + 3];
    this.offset += 4;
    return val >>> 0; // Convert the signed 32-bit result to unsigned
  }

  readMagic() {
    return this.u4();
  }

  readVersions() {
    return { minor: this.u2(), major: this.u2() };
  }

  readConstantPoolCount() {
    return this.u2();
  }

  readConstantPool(count) {
    for (let i = 1; i < count; i++) {
      const tag = this.u1();
      const entry = { tag };
      if (tag === 1) { // Utf8
        const length = this.u2();
        const bytes = this.buffer.subarray(this.offset, this.offset + length);
        // Standard UTF-8 decoding only approximates modified UTF-8
        entry.string = bytes.toString('utf8');
        this.offset += length;
      } else if (tag === 3 || tag === 4) { // Integer, Float (raw 32-bit value)
        entry.value = this.u4();
      } else if (tag === 5 || tag === 6) { // Long, Double (raw halves)
        entry.high = this.u4();
        entry.low = this.u4();
      } else if ([7, 8, 16, 19, 20].includes(tag)) { // Class, String, MethodType, Module, Package
        entry.index = this.u2();
      } else if ([9, 10, 11, 12, 17, 18].includes(tag)) {
        entry.class_or_kind = this.u2();
        entry.name_and_type_or_ref = this.u2();
      } else if (tag === 15) { // MethodHandle: u1 reference_kind, then u2 reference_index
        entry.kind = this.u1();
        entry.reference_index = this.u2();
      } else {
        throw new Error(`Unknown constant pool tag ${tag} at index ${i}`);
      }
      this.constantPool.push(entry);
      if (tag === 5 || tag === 6) {
        // Long and Double occupy two slots; add a placeholder for the
        // unusable second slot so that later indices stay aligned.
        this.constantPool.push({ tag: 0 });
        i++;
      }
    }
  }

  getConstantString(index) {
    if (index === 0) return 'none';
    const entry = this.constantPool[index - 1];
    if (entry && entry.string !== undefined) return entry.string;
    // Class, String, and similar entries point at a Utf8 entry
    if (entry && entry.index !== undefined) return this.getConstantString(entry.index);
    return `CONSTANT[${index}]`;
  }

  // Attribute bodies are not decoded, but they must be stepped over
  // so that the next structure is read from the right offset.
  skipAttributes(count) {
    for (let i = 0; i < count; i++) {
      this.u2();                // attribute_name_index
      const length = this.u4(); // attribute_length
      this.offset += length;    // info bytes
    }
  }

  readAccessFlags() {
    return this.u2();
  }

  readThisClass() {
    return this.u2();
  }

  readSuperClass() {
    return this.u2();
  }

  readInterfacesCount() {
    return this.u2();
  }

  readInterfaces(count) {
    const interfaces = [];
    for (let i = 0; i < count; i++) interfaces.push(this.u2());
    return interfaces;
  }

  readFieldsCount() {
    return this.u2();
  }

  // field_info and method_info share the same layout
  readMember() {
    const member = {
      access_flags: this.u2(),
      name_index: this.u2(),
      descriptor_index: this.u2(),
      attributes_count: this.u2(),
    };
    this.skipAttributes(member.attributes_count);
    member.name = this.getConstantString(member.name_index);
    member.descriptor = this.getConstantString(member.descriptor_index);
    return member;
  }

  readFields(count) {
    const fields = [];
    for (let i = 0; i < count; i++) fields.push(this.readMember());
    return fields;
  }

  readMethodsCount() {
    return this.u2();
  }

  readMethods(count) {
    const methods = [];
    for (let i = 0; i < count; i++) methods.push(this.readMember());
    return methods;
  }

  readAttributesCount() {
    return this.u2();
  }

  parse() {
    const magic = this.readMagic();
    if (magic !== 0xCAFEBABE) throw new Error('Invalid magic');
    const versions = this.readVersions();
    const cpCount = this.readConstantPoolCount();
    this.readConstantPool(cpCount);
    const accessFlags = this.readAccessFlags();
    const thisClass = this.readThisClass();
    const superClass = this.readSuperClass();
    const interfacesCount = this.readInterfacesCount();
    const interfaces = this.readInterfaces(interfacesCount);
    const fieldsCount = this.readFieldsCount();
    const fields = this.readFields(fieldsCount);
    const methodsCount = this.readMethodsCount();
    const methods = this.readMethods(methodsCount);
    const attributesCount = this.readAttributesCount();
    this.skipAttributes(attributesCount); // Class-level attributes are not decoded
    return {
      magic: '0x' + magic.toString(16).toUpperCase(),
      versions,
      constant_pool_count: cpCount,
      constant_pool: this.constantPool,
      access_flags: '0x' + accessFlags.toString(16).toUpperCase(),
      this_class: this.getConstantString(thisClass),
      super_class: this.getConstantString(superClass),
      interfaces_count: interfacesCount,
      interfaces: interfaces.map(idx => this.getConstantString(idx)),
      fields_count: fieldsCount,
      fields,
      methods_count: methodsCount,
      methods,
      attributes_count: attributesCount,
    };
  }

  printProperties(props) {
    console.log(JSON.stringify(props, null, 2));
  }

  write(filename) {
    fs.writeFileSync(filename, this.buffer); // Simple write-back
  }
}

// Usage
if (require.main === module) {
  if (process.argv.length < 3) {
    console.log('Usage: node parser.js <file.class>');
    process.exit(1);
  }
  const parser = new ClassFileParser(process.argv[2]);
  const props = parser.parse();
  parser.printProperties(props);
  // parser.write('output.class');
}
Run with node parser.js example.class.
7. C class to open, decode, read, write, and print properties
This C program (compile with gcc parser.c -o parser) uses stdio and manual byte reading to parse .CLASS files. It reads the properties, prints them to stdout, and can write the file back. Memory management is manual, and parsing is simplified to the main structure (constant pool resolution is approximate).
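#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t *data;
    size_t offset;
    size_t size;
    int *constant_pool_tags;     // Tag per slot (0 for an unused slot)
    uint16_t *constant_indices;  // Utf8 index held by Class, String, and similar entries
    char **constant_strings;     // Utf8 contents (raw bytes), NULL for other entries
    int cp_count;
} ClassFileParser;

// Abort if fewer than n bytes remain, so a truncated file cannot cause out-of-bounds reads
static void need(ClassFileParser *p, size_t n) {
    if (p->offset > p->size || p->size - p->offset < n) {
        fprintf(stderr, "Unexpected end of file\n");
        exit(1);
    }
}

uint8_t u1(ClassFileParser *p) {
    need(p, 1);
    return p->data[p->offset++];
}

uint16_t u2(ClassFileParser *p) {
    need(p, 2);
    uint16_t val = (uint16_t)((p->data[p->offset] << 8) | p->data[p->offset + 1]);
    p->offset += 2;
    return val;
}

uint32_t u4(ClassFileParser *p) {
    need(p, 4);
    // Cast before shifting so the top byte never overflows a signed int
    uint32_t val = ((uint32_t)p->data[p->offset] << 24) | ((uint32_t)p->data[p->offset + 1] << 16) |
                   ((uint32_t)p->data[p->offset + 2] << 8) | (uint32_t)p->data[p->offset + 3];
    p->offset += 4;
    return val;
}

void read_magic(ClassFileParser *p) {
    uint32_t magic = u4(p);
    if (magic != 0xCAFEBABE) {
        fprintf(stderr, "Invalid magic\n");
        exit(1);
    }
    printf("magic: 0x%08X\n", (unsigned)magic);
}

void read_versions(ClassFileParser *p) {
    uint16_t minor = u2(p);
    uint16_t major = u2(p);
    printf("versions: minor=%d, major=%d\n", minor, major);
}

int read_constant_pool_count(ClassFileParser *p) {
    return u2(p);
}

void read_constant_pool(ClassFileParser *p, int count) {
    size_t slots = count > 1 ? (size_t)(count - 1) : 1;
    p->cp_count = count;
    // calloc zero-fills, so skipped slots keep tag 0 and a NULL string
    p->constant_pool_tags = calloc(slots, sizeof(int));
    p->constant_indices = calloc(slots, sizeof(uint16_t));
    p->constant_strings = calloc(slots, sizeof(char *));
    if (!p->constant_pool_tags || !p->constant_indices || !p->constant_strings) {
        fprintf(stderr, "Out of memory\n");
        exit(1);
    }
    for (int i = 0; i < count - 1; i++) {
        int tag = u1(p);
        p->constant_pool_tags[i] = tag;
        switch (tag) {
        case 1: { // Utf8 (copied as raw bytes; modified UTF-8 is not decoded)
            int len = u2(p);
            need(p, (size_t)len);
            p->constant_strings[i] = malloc((size_t)len + 1);
            if (!p->constant_strings[i]) {
                fprintf(stderr, "Out of memory\n");
                exit(1);
            }
            memcpy(p->constant_strings[i], &p->data[p->offset], (size_t)len);
            p->constant_strings[i][len] = '\0';
            p->offset += (size_t)len;
            break;
        }
        case 3: case 4: // Integer, Float
            u4(p);
            break;
        case 5: case 6: // Long, Double occupy two slots
            u4(p); u4(p);
            i++;
            break;
        case 7: case 8: case 16: case 19: case 20: // Class, String, MethodType, Module, Package
            p->constant_indices[i] = u2(p);
            break;
        case 9: case 10: case 11: case 12: case 17: case 18: // two u2 indices
            u2(p); u2(p);
            break;
        case 15: // MethodHandle: u1 reference_kind, then u2 reference_index
            u1(p); u2(p);
            break;
        default:
            fprintf(stderr, "Unknown constant pool tag %d at index %d\n", tag, i + 1);
            exit(1);
        }
    }
}

// Returns a pointer owned by the parser, a string literal, or buf; the caller frees nothing
const char *get_constant_string(ClassFileParser *p, int index, char *buf, size_t buflen) {
    if (index == 0) return "none";
    if (index < p->cp_count) {
        if (p->constant_strings[index - 1]) return p->constant_strings[index - 1];
        // Class, String, and similar entries point at a Utf8 entry
        int target = p->constant_indices[index - 1];
        if (target > 0 && target < p->cp_count && p->constant_strings[target - 1]) {
            return p->constant_strings[target - 1];
        }
    }
    snprintf(buf, buflen, "CONSTANT[%d]", index);
    return buf;
}

// Attribute bodies are not decoded, but they must be stepped over
// so that the next structure is read from the right offset.
void skip_attributes(ClassFileParser *p, int count) {
    for (int i = 0; i < count; i++) {
        u2(p);                   // attribute_name_index
        uint32_t length = u4(p); // attribute_length
        need(p, length);
        p->offset += length;     // info bytes
    }
}

void read_access_flags(ClassFileParser *p) {
    uint16_t flags = u2(p);
    printf("access_flags: 0x%04X\n", flags);
}

void read_this_class(ClassFileParser *p) {
    char buf[32];
    int idx = u2(p);
    printf("this_class: %s\n", get_constant_string(p, idx, buf, sizeof buf));
}

void read_super_class(ClassFileParser *p) {
    char buf[32];
    int idx = u2(p);
    printf("super_class: %s\n", get_constant_string(p, idx, buf, sizeof buf));
}

void read_interfaces(ClassFileParser *p) {
    char buf[32];
    int count = u2(p);
    printf("interfaces_count: %d\n", count);
    for (int i = 0; i < count; i++) {
        int idx = u2(p);
        printf("  interface: %s\n", get_constant_string(p, idx, buf, sizeof buf));
    }
}

// field_info and method_info share the same layout
static void read_members(ClassFileParser *p, const char *kind) {
    char name_buf[32], desc_buf[32];
    int count = u2(p);
    printf("%ss_count: %d\n", kind, count);
    for (int i = 0; i < count; i++) {
        uint16_t acc = u2(p);
        int name_idx = u2(p);
        int desc_idx = u2(p);
        int attr_count = u2(p);
        printf("  %s: access=0x%04X, name=%s, descriptor=%s\n", kind, acc,
               get_constant_string(p, name_idx, name_buf, sizeof name_buf),
               get_constant_string(p, desc_idx, desc_buf, sizeof desc_buf));
        skip_attributes(p, attr_count);
    }
}

void read_fields(ClassFileParser *p) {
    read_members(p, "field");
}

void read_methods(ClassFileParser *p) {
    read_members(p, "method");
}

void read_attributes_count(ClassFileParser *p) {
    int count = u2(p);
    printf("attributes_count: %d\n", count);
    skip_attributes(p, count); // Class-level attributes are not decoded
}

void parse(ClassFileParser *p) {
    read_magic(p);
    read_versions(p);
    int cp_count = read_constant_pool_count(p);
    read_constant_pool(p, cp_count);
    read_access_flags(p);
    read_this_class(p);
    read_super_class(p);
    read_interfaces(p);
    read_fields(p);
    read_methods(p);
    read_attributes_count(p);
}

// Named write_file rather than write to avoid clashing with the POSIX write() function
void write_file(ClassFileParser *p, const char *filename) {
    FILE *f = fopen(filename, "wb");
    if (f) {
        fwrite(p->data, 1, p->size, f); // Simple write-back
        fclose(f);
    }
}

void free_parser(ClassFileParser *p) {
    free(p->data);
    free(p->constant_pool_tags);
    free(p->constant_indices);
    if (p->constant_strings) {
        for (int i = 0; i < p->cp_count - 1; i++) {
            free(p->constant_strings[i]);
        }
        free(p->constant_strings);
    }
    // The parser struct itself lives on the caller's stack, so it is not freed here
}

int main(int argc, char **argv) {
    if (argc < 2) {
        printf("Usage: ./parser <file.class>\n");
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (!f) {
        perror("Open file");
        return 1;
    }
    fseek(f, 0, SEEK_END);
    long length = ftell(f);
    fseek(f, 0, SEEK_SET);
    if (length <= 0) {
        fprintf(stderr, "Empty or unreadable file\n");
        fclose(f);
        return 1;
    }
    size_t size = (size_t)length;
    uint8_t *data = malloc(size);
    if (!data || fread(data, 1, size, f) != size) {
        fprintf(stderr, "Read failed\n");
        free(data);
        fclose(f);
        return 1;
    }
    fclose(f);
    ClassFileParser p = { .data = data, .offset = 0, .size = size };
    parse(&p);
    // write_file(&p, "output.class");
    free_parser(&p);
    return 0;
}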
Run with ./parser example.class to print properties to console.