Task 506: .PACK File Format

1. List of All Properties of the .PACK File Format Intrinsic to Its File System

Based on my research, the .PACK file format here refers to the Git packfile format, used by Git's object database (the on-disk storage layer of its version-control file system). It is a binary format that packs Git objects (commits, trees, blobs, tags) together to save space and enable efficient transfer. Its intrinsic properties are as follows:

  • Signature: 4 bytes, always 'P' 'A' 'C' 'K' (ASCII).
  • Version: 4 bytes, big-endian unsigned 32-bit integer (network byte order). Currently 2 or 3 (Git generates 2).
  • Number of Objects: 4 bytes, big-endian unsigned 32-bit integer (network byte order). The number of packed objects in the file (limited to <4G).
  • Object Entries: A sequence of variable-length entries (exactly the number specified above). Each entry has:
      • Object Header: Variable-length (1+ bytes) encoding type and uncompressed size.
          • First byte: MSB (bit 7) = continuation flag (1 if more size bytes follow), bits 6-4 = object type (1: commit, 2: tree, 3: blob, 4: tag, 6: ofs_delta, 7: ref_delta), bits 3-0 = lower 4 bits of size.
          • Subsequent bytes (while MSB=1): MSB = continuation flag, bits 6-0 = the next 7 bits of size (shifted left and combined with the value so far).
          • Size is the uncompressed size of the object or delta data.
      • Base Reference (for deltified objects, type 6 or 7):
          • For ref_delta (type 7): Object ID of the base (20 bytes for SHA-1 or 32 bytes for SHA-256).
          • For ofs_delta (type 6): Variable-length negative offset back to the base object's position in the pack (1+ bytes, MSB continuation, big-endian 7-bit groups; for n bytes, 2^7 + 2^14 + ... + 2^(7*(n-1)) is added to the concatenated value).
      • Compressed Data: Zlib-deflated bytes (variable length). For undeltified objects, it is the full object content. For deltified objects, it is delta data:
          • Delta header: Variable-length uncompressed base size followed by uncompressed target size (little-endian 7-bit size encoding).
          • Delta instructions: A sequence of copy instructions (MSB=1; the low 7 bits are a bitmask selecting which little-endian offset/size bytes follow) and insert instructions (MSB=0; the low 7 bits give the number of literal bytes that follow). A decoding sketch follows this list.
  • Trailer: Hash checksum of the entire file except the trailer (20 bytes for SHA-1, 32 bytes for SHA-256, depending on repository).
  • Other Intrinsic Properties:
      • Byte order: Big-endian for the multi-byte integers in the pack header (version and object count).
      • Hash algorithm: SHA-1 or SHA-256, determined by the repository's object format.
      • Object types: Restricted to 1-4 for base objects and 6-7 for deltas (5 is reserved).
      • Compression: Zlib deflate for object data.
      • Delta chaining: Deltas can chain (a base can itself be a delta), but every chain must resolve to a base object of type 1-4.
      • File size: Variable; the file is self-contained (no external dependencies except for thin packs, whose ref_delta entries may reference objects outside the pack).
      • Associated files: Often paired with a .idx (index) file for random access and a .rev (reverse index) file, but a .pack file is self-sufficient for sequential parsing.
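
The delta payload described in the list above can be decoded in a few lines. The following is a minimal JavaScript sketch (the helper name applyDelta is illustrative), assuming base and delta are Uint8Array values whose zlib layer has already been inflated:

// Apply a Git delta (already zlib-inflated) to its base object.
// base and delta are Uint8Array values; returns the reconstructed target object.
function applyDelta(base, delta) {
  let pos = 0;

  // The delta header stores base size and target size as little-endian
  // base-128 varints: 7 data bits per byte, MSB set means "more bytes follow".
  function readVarint() {
    let value = 0, shift = 0, byte;
    do {
      byte = delta[pos++];
      value += (byte & 0x7f) * 2 ** shift;
      shift += 7;
    } while (byte & 0x80);
    return value;
  }

  const baseSize = readVarint();
  const targetSize = readVarint();
  if (baseSize !== base.length) throw new Error('base size mismatch');

  const target = new Uint8Array(targetSize);
  let out = 0;

  while (pos < delta.length) {
    const op = delta[pos++];
    if (op & 0x80) {
      // Copy instruction: bits 0-3 say which little-endian offset bytes follow,
      // bits 4-6 say which little-endian size bytes follow.
      let offset = 0, size = 0;
      for (let i = 0; i < 4; i++) if (op & (1 << i)) offset += delta[pos++] * 2 ** (8 * i);
      for (let i = 0; i < 3; i++) if (op & (0x10 << i)) size += delta[pos++] * 2 ** (8 * i);
      if (size === 0) size = 0x10000; // an encoded size of zero means 64 KiB
      target.set(base.subarray(offset, offset + size), out);
      out += size;
    } else if (op !== 0) {
      // Insert instruction: the low 7 bits give the number of literal bytes that follow.
      target.set(delta.subarray(pos, pos + op), out);
      pos += op;
      out += op;
    } else {
      throw new Error('delta opcode 0 is reserved');
    }
  }
  if (out !== targetSize) throw new Error('target size mismatch');
  return target;
}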

2. Links to Example .PACK Files

I was unable to find safe, public direct-download links for .PACK files that are not part of exposed .git directories (which would be risky and unethical to access). Sample .pack files are easy to produce locally instead (e.g., git pack-objects --stdout > example.pack < object-list), and any full clone already contains packs of its own under .git/objects/pack/. For testing, create one from a small Git repository or reuse one Git has already written (see the snippet below).
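
For quick local testing, this throwaway Node.js snippet (illustrative; run it from the top of a repository) lists the pack files Git already keeps on disk:

// List the pack files Git keeps in a local repository (standard location).
const fs = require('fs');
const path = require('path');

const packDir = path.join('.git', 'objects', 'pack');
const packs = fs.readdirSync(packDir).filter(name => name.endsWith('.pack'));
for (const name of packs) {
  const { size } = fs.statSync(path.join(packDir, name));
  console.log(`${name}  ${size} bytes`);
}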

3. Ghost Blog Embedded HTML JavaScript for Drag and Drop .PACK File Dump

Here is an HTML page with embedded JavaScript that accepts a dragged-and-dropped .PACK file and dumps its properties to the screen, using FileReader and DataView for binary parsing. It prints the pack header, the first object's header, and the trailer checksum; without an inflate step it cannot skip past compressed data to enumerate every entry (see the note below).

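A minimal sketch of such a page, assuming a SHA-1 repository (20-byte trailer): it dumps the fixed 12-byte header, the first object's entry header, and the trailer checksum.

<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>PACK File Dumper</title></head>
<body>
<h1>PACK File Dumper</h1>
<div id="drop" style="border: 2px dashed #888; padding: 2em; text-align: center;">
  Drag and drop a .PACK file here
</div>
<pre id="out"></pre>
<script>
const drop = document.getElementById('drop');
const out = document.getElementById('out');

drop.addEventListener('dragover', e => e.preventDefault());
drop.addEventListener('drop', e => {
  e.preventDefault();
  const file = e.dataTransfer.files[0];
  if (!file) return;
  const reader = new FileReader();
  reader.onload = () => dump(new DataView(reader.result));
  reader.readAsArrayBuffer(file);
});

function dump(view) {
  const lines = [];

  // Fixed 12-byte header: 4-byte signature, 4-byte version, 4-byte object count (big-endian).
  const sig = String.fromCharCode(view.getUint8(0), view.getUint8(1), view.getUint8(2), view.getUint8(3));
  lines.push('Signature: ' + sig);
  lines.push('Version: ' + view.getUint32(4));            // DataView reads big-endian by default
  lines.push('Number of objects: ' + view.getUint32(8));

  // First object entry header: type in bits 6-4, uncompressed size as a base-128 varint.
  // (For delta objects a base reference precedes the compressed data; not handled here.)
  let offset = 12;
  let byte = view.getUint8(offset++);
  const type = (byte >> 4) & 0x07;
  let size = byte & 0x0f, shift = 4;
  while (byte & 0x80) {
    byte = view.getUint8(offset++);
    size += (byte & 0x7f) * 2 ** shift;
    shift += 7;
  }
  lines.push('First object: type ' + type + ', uncompressed size ' + size + ', compressed data at offset ' + offset);

  // Trailer: the last 20 bytes (SHA-1 repositories) are the checksum of the rest of the file.
  let checksum = '';
  for (let i = view.byteLength - 20; i < view.byteLength; i++) {
    checksum += view.getUint8(i).toString(16).padStart(2, '0');
  }
  lines.push('Trailer checksum (assuming SHA-1): ' + checksum);

  out.textContent = lines.join('\n');
}
</script>
</body>
</html>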

Note: This is a basic dumper; walking all object entries requires an inflate implementation (such as pako or zlib.js) to skip or decompress each object's data and find the exact start of the next entry.

4. Python Class for .PACK File

Here is a Python class that can open, decode, read, write, and print the properties of a .PACK file. It uses struct for binary parsing and zlib for decompression (for full parsing). Writing is basic (copies the file, as modifying is complex).

import struct
import zlib

class PackFile:
    def __init__(self, filename):
        self.filename = filename
        self.data = None
        self.offset = 0
        self.properties = {}

    def open(self):
        with open(self.filename, 'rb') as f:
            self.data = f.read()
        self.offset = 0

    def decode(self):
        # Signature
        sig = self.data[self.offset:self.offset+4].decode('ascii')
        self.properties['signature'] = sig
        self.offset += 4

        # Version
        version = struct.unpack('>I', self.data[self.offset:self.offset+4])[0]
        self.properties['version'] = version
        self.offset += 4

        # Number of objects
        num_objects = struct.unpack('>I', self.data[self.offset:self.offset+4])[0]
        self.properties['num_objects'] = num_objects
        self.offset += 4

        # Object entries
        self.properties['objects'] = []
        for i in range(num_objects):
            obj = {}
            # Parse header
            byte = self.data[self.offset]
            type_ = (byte >> 4) & 0x07
            size = byte & 0x0F
            shift = 4
            self.offset += 1
            while byte & 0x80:
                byte = self.data[self.offset]
                size |= (byte & 0x7F) << shift
                shift += 7
                self.offset += 1
            obj['type'] = type_
            obj['size'] = size

            # Base for deltified
            if type_ == 6 or type_ == 7:
                if type_ == 7:
                    base_id = self.data[self.offset:self.offset+20].hex() # assume SHA1
                    obj['base_id'] = base_id
                    self.offset += 20
                else:
                    # Negative offset back to this object's own header:
                    # big-endian 7-bit groups with a +1 bias per continuation byte.
                    byte = self.data[self.offset]
                    ofs = byte & 0x7F
                    self.offset += 1
                    while byte & 0x80:
                        byte = self.data[self.offset]
                        ofs = ((ofs + 1) << 7) | (byte & 0x7F)
                        self.offset += 1
                    obj['base_offset'] = -ofs

            # Compressed data (decompress to verify, but store offset)
            obj['data_offset'] = self.offset
            decompressor = zlib.decompressobj()
            compressed_length = 0
            # Feed 1 KiB chunks until zlib reports end of stream, then subtract
            # whatever the final chunk contained past the stream end.
            while not decompressor.eof:
                chunk = self.data[self.offset + compressed_length:self.offset + compressed_length + 1024]
                if not chunk:
                    break
                decompressor.decompress(chunk)
                compressed_length += len(chunk)
            compressed_length -= len(decompressor.unused_data)
            obj['compressed_length'] = compressed_length
            self.offset += compressed_length
            self.properties['objects'].append(obj)

        # Trailer: 20 bytes for SHA-1 repositories, 32 for SHA-256
        self.properties['checksum'] = self.data[self.offset:].hex()

    def print_properties(self):
        print('Signature:', self.properties['signature'])
        print('Version:', self.properties['version'])
        print('Number of Objects:', self.properties['num_objects'])
        for i, obj in enumerate(self.properties['objects']):
            print(f'Object {i}:')
            print('  Type:', obj['type'], f'({get_type_name(obj["type"])})')
            print('  Size:', obj['size'])
            if 'base_id' in obj:
                print('  Base ID:', obj['base_id'])
            if 'base_offset' in obj:
                print('  Base Offset:', obj['base_offset'])
            print('  Data Offset:', obj['data_offset'])
            print('  Compressed Length:', obj['compressed_length'])
        print('Checksum:', self.properties['checksum'])

    def write(self, new_filename):
        with open(new_filename, 'wb') as f:
            f.write(self.data)  # Basic copy; for full write, reconstruct from properties

def get_type_name(type_):
    types = {1: 'commit', 2: 'tree', 3: 'blob', 4: 'tag', 6: 'ofs_delta', 7: 'ref_delta'}
    return types.get(type_, 'unknown')

# Example usage
if __name__ == '__main__':
    pack = PackFile('example.pack')
    pack.open()
    pack.decode()
    pack.print_properties()
    pack.write('copy.pack')

Note: Decompression is only used to find each object's compressed length; for large files, tune the chunk size or use the companion .idx file, which records every object's offset.

5. Java Class for .PACK File

Here is a Java class that does the same. Uses ByteBuffer for parsing and Inflater for zlib.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.io.*;
import java.util.zip.Inflater;

public class PackFile {
    private String filename;
    private byte[] data;
    private ByteBuffer buffer;
    private Properties properties = new Properties();

    public void open() throws IOException {
        try (FileInputStream fis = new FileInputStream(filename)) {
            data = fis.readAllBytes();
        }
        buffer = ByteBuffer.wrap(data);
        buffer.order(ByteOrder.BIG_ENDIAN);
    }

    public void decode() throws IOException {
        // Signature
        byte[] sigBytes = new byte[4];
        buffer.get(sigBytes);
        properties.signature = new String(sigBytes);
        // Version
        properties.version = buffer.getInt();
        // Number of objects
        properties.numObjects = buffer.getInt();

        // Object entries
        properties.objects = new ObjectEntry[properties.numObjects];
        for (int i = 0; i < properties.numObjects; i++) {
            ObjectEntry obj = new ObjectEntry();
            // Header
            int byteVal = buffer.get() & 0xFF;
            obj.type = (byteVal >> 4) & 0x07;
            long size = byteVal & 0x0F;
            long shift = 4;
            while (byteVal >= 128) {
                byteVal = buffer.get() & 0xFF;
                size |= ((long) (byteVal & 0x7F)) << shift;
                shift += 7;
            }
            obj.size = size;

            // Base for deltified
            if (obj.type == 6 || obj.type == 7) {
                if (obj.type == 7) {
                    byte[] baseId = new byte[20];
                    buffer.get(baseId);
                    obj.baseId = bytesToHex(baseId);
                } else {
                    byteVal = buffer.get() & 0xFF;
                    long ofs = byteVal & 0x7F;
                    while (byteVal >= 128) {
                        byteVal = buffer.get() & 0xFF;
                        // Big-endian 7-bit groups with a +1 bias per continuation byte.
                        ofs = ((ofs + 1) << 7) | (byteVal & 0x7F);
                    }
                    obj.baseOffset = -ofs;
                }
            }

            // Compressed data
            int dataOffset = buffer.position();
            obj.dataOffset = dataOffset;
            Inflater inflater = new Inflater();
            inflater.setInput(data, dataOffset, data.length - dataOffset);
            long decompressed = 0;
            byte[] temp = new byte[1024];
            while (!inflater.finished()) {
                int count = 0;
                try {
                    count = inflater.inflate(temp);
                } catch (Exception e) {
                    break;
                }
                if (count == 0) break;
                decompressed += count;
            }
            int compressed = inflater.getTotalIn(); // compressed bytes actually consumed by the inflater
            inflater.end();
            obj.compressedLength = compressed;
            buffer.position(dataOffset + compressed);
            properties.objects[i] = obj;
        }

        // Trailer
        byte[] checksum = new byte[data.length - buffer.position()];
        buffer.get(checksum);
        properties.checksum = bytesToHex(checksum);
    }

    public void printProperties() {
        System.out.println("Signature: " + properties.signature);
        System.out.println("Version: " + properties.version);
        System.out.println("Number of Objects: " + properties.numObjects);
        for (int i = 0; i < properties.objects.length; i++) {
            ObjectEntry obj = properties.objects[i];
            System.out.println("Object " + i + ":");
            System.out.println("  Type: " + obj.type + " (" + getTypeName(obj.type) + ")");
            System.out.println("  Size: " + obj.size);
            if (obj.baseId != null) {
                System.out.println("  Base ID: " + obj.baseId);
            }
            if (obj.baseOffset != 0) {
                System.out.println("  Base Offset: " + obj.baseOffset);
            }
            System.out.println("  Data Offset: " + obj.dataOffset);
            System.out.println("  Compressed Length: " + obj.compressedLength);
        }
        System.out.println("Checksum: " + properties.checksum);
    }

    public void write(String newFilename) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(newFilename)) {
            fos.write(data); // Basic copy
        }
    }

    private static String bytesToHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    private static String getTypeName(int type) {
        switch (type) {
            case 1: return "commit";
            case 2: return "tree";
            case 3: return "blob";
            case 4: return "tag";
            case 6: return "ofs_delta";
            case 7: return "ref_delta";
            default: return "unknown";
        }
    }

    static class Properties {
        String signature;
        int version;
        int numObjects;
        ObjectEntry[] objects;
        String checksum;
    }

    static class ObjectEntry {
        int type;
        long size;
        String baseId;
        long baseOffset;
        int dataOffset;
        int compressedLength;
    }

    public PackFile(String filename) {
        this.filename = filename;
    }

    public static void main(String[] args) throws IOException {
        PackFile pack = new PackFile("example.pack");
        pack.open();
        pack.decode();
        pack.printProperties();
        pack.write("copy.pack");
    }
}

6. JavaScript Class for .PACK File

Here is a JavaScript class that does the same. The version below targets Node.js (it uses fs and zlib); in a browser you would read the file with FileReader and save output with a Blob instead.

const fs = require('fs');
const zlib = require('zlib');

class PackFile {
  constructor(filename) {
    this.filename = filename;
    this.data = null;
    this.offset = 0;
    this.properties = {};
  }

  open() {
    this.data = fs.readFileSync(this.filename); // returns a Buffer with the whole file
  }

  decode() {
    // Signature
    this.properties.signature = String.fromCharCode(this.data[this.offset], this.data[this.offset+1], this.data[this.offset+2], this.data[this.offset+3]);
    this.offset += 4;

    // Version
    this.properties.version = this.readBigUInt32();
    // Number of objects
    this.properties.numObjects = this.readBigUInt32();

    // Object entries
    this.properties.objects = [];
    for (let i = 0; i < this.properties.numObjects; i++) {
      let obj = {};
      // Header
      let byte = this.data[this.offset];
      obj.type = (byte >> 4) & 0x07;
      let size = byte & 0x0F;
      let shift = 4;
      this.offset += 1;
      while (byte & 0x80) {
        byte = this.data[this.offset];
        size |= (byte & 0x7F) << shift;
        shift += 7;
        this.offset += 1;
      }
      obj.size = size;

      // Base
      if (obj.type == 6 || obj.type == 7) {
        if (obj.type == 7) {
          let baseId = '';
          for (let j = 0; j < 20; j++) {
            baseId += this.data[this.offset + j].toString(16).padStart(2, '0');
          }
          obj.baseId = baseId;
          this.offset += 20;
        } else {
          // Negative offset back to this object's own header:
          // big-endian 7-bit groups with a +1 bias per continuation byte.
          byte = this.data[this.offset];
          let ofs = byte & 0x7F;
          this.offset += 1;
          while (byte & 0x80) {
            byte = this.data[this.offset];
            ofs = (ofs + 1) * 128 + (byte & 0x7F);
            this.offset += 1;
          }
          obj.baseOffset = -ofs;
        }
      }

      // Compressed data: Node's high-level zlib API does not report how many
      // compressed bytes a sync inflate consumed, so grow the slice until the
      // zlib stream decompresses cleanly (slow but exact; fine for small packs).
      obj.dataOffset = this.offset;
      let compressedLength = 0;
      for (let end = this.offset + 1; end <= this.data.length; end++) {
        try {
          zlib.inflateSync(this.data.subarray(this.offset, end));
          compressedLength = end - this.offset;
          break;
        } catch (e) {
          // Stream still truncated; keep extending the slice.
        }
      }
      obj.compressedLength = compressedLength;
      this.offset += compressedLength;
      this.properties.objects.push(obj);
    }

    // Trailer
    this.properties.checksum = '';
    for (let j = this.offset; j < this.data.length; j++) {
      this.properties.checksum += this.data[j].toString(16).padStart(2, '0');
    }
  }

  readBigUInt32() {
    const val = this.data.readUInt32BE(this.offset); // Buffer helper for big-endian 32-bit reads
    this.offset += 4;
    return val;
  }

  printProperties() {
    console.log('Signature:', this.properties.signature);
    console.log('Version:', this.properties.version);
    console.log('Number of Objects:', this.properties.numObjects);
    this.properties.objects.forEach((obj, i) => {
      console.log(`Object ${i}:`);
      console.log('  Type:', obj.type, `(${this.getTypeName(obj.type)})`);
      console.log('  Size:', obj.size);
      if (obj.baseId) console.log('  Base ID:', obj.baseId);
      if (obj.baseOffset) console.log('  Base Offset:', obj.baseOffset);
      console.log('  Data Offset:', obj.dataOffset);
      console.log('  Compressed Length:', obj.compressedLength);
    });
    console.log('Checksum:', this.properties.checksum);
  }

  write(newFilename) {
    fs.writeFileSync(newFilename, this.data);
  }

  getTypeName(type) {
    const types = {1: 'commit', 2: 'tree', 3: 'blob', 4: 'tag', 6: 'ofs_delta', 7: 'ref_delta'};
    return types[type] || 'unknown';
  }
}

// Example
const pack = new PackFile('example.pack');
pack.open();
pack.decode();
pack.printProperties();
pack.write('copy.pack');

Note: Pack object data is a zlib stream (not raw deflate), so zlib.inflateRawSync is not the right call; for exact object boundaries without scanning, use the companion .idx file or an inflate implementation that reports how many input bytes it consumed.
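
Independently of the class above, the pack trailer can be verified with Node's built-in crypto module. A minimal sketch, assuming a SHA-1 repository (20-byte trailer) and an illustrative file name example.pack:

// Verify a pack trailer: the last 20 bytes must equal the SHA-1 of everything before them.
const fs = require('fs');
const crypto = require('crypto');

function verifyPackChecksum(filename) {
  const data = fs.readFileSync(filename);
  const body = data.subarray(0, data.length - 20);        // use 32 instead of 20 for SHA-256 repositories
  const stored = data.subarray(data.length - 20).toString('hex');
  const computed = crypto.createHash('sha1').update(body).digest('hex');
  return { stored, computed, ok: stored === computed };
}

console.log(verifyPackChecksum('example.pack'));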

7. C Class for .PACK File

Here is a C++ class (interpreting "C class" as C++, since C itself has no classes). It uses std::ifstream for reading and zlib for decompression.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <cstdint>
#include <vector>
#include <iomanip>
#include <zlib.h>

struct ObjectEntry {
  int type = 0;
  long size = 0;
  std::string baseId;
  long baseOffset = 0;           // 0 means "not a delta"; real offsets are negative
  size_t dataOffset = 0;
  size_t compressedLength = 0;
};

class PackFile {
private:
  std::string filename;
  std::vector<char> data;
  size_t offset = 0;
  std::string signature;
  uint32_t version;
  uint32_t numObjects;
  std::vector<ObjectEntry> objects;
  std::string checksum;

public:
  PackFile(const std::string& fn) : filename(fn) {}

  void open() {
    std::ifstream file(filename, std::ios::binary | std::ios::ate);
    size_t size = file.tellg();
    data.resize(size);
    file.seekg(0);
    file.read(data.data(), size);
  }

  void decode() {
    // Signature
    signature = std::string(&data[offset], 4);
    offset += 4;

    // Version
    version = readBigUInt32();
    // Number of objects
    numObjects = readBigUInt32();

    // Objects
    for (uint32_t i = 0; i < numObjects; i++) {
      ObjectEntry obj;
      unsigned char byte = data[offset];
      obj.type = (byte >> 4) & 0x07;
      long size = byte & 0x0F;
      long shift = 4;
      offset += 1;
      while (byte & 0x80) {
        byte = data[offset];
        size |= (long)(byte & 0x7F) << shift;
        shift += 7;
        offset += 1;
      }
      obj.size = size;

      // Base
      if (obj.type == 6 || obj.type == 7) {
        if (obj.type == 7) {
          obj.baseId = "";
          for (int j = 0; j < 20; j++) {
            std::stringstream ss;
            ss << std::hex << std::setfill('0') << std::setw(2) << (int)(unsigned char)data[offset + j];
            obj.baseId += ss.str();
          }
          offset += 20;
        } else {
          // Negative offset back to this object's own header:
          // big-endian 7-bit groups with a +1 bias per continuation byte.
          byte = data[offset];
          long ofs = byte & 0x7F;
          offset += 1;
          while (byte & 0x80) {
            byte = data[offset];
            ofs = ((ofs + 1) << 7) | (byte & 0x7F);
            offset += 1;
          }
          obj.baseOffset = -ofs;
        }
      }

      // Compressed data
      obj.dataOffset = offset;
      // Inflate only to learn how many compressed bytes this object occupies.
      z_stream strm;
      strm.zalloc = Z_NULL;
      strm.zfree = Z_NULL;
      strm.opaque = Z_NULL;
      strm.avail_in = (uInt)(data.size() - offset);
      strm.next_in = (Bytef*)&data[offset];
      inflateInit(&strm);
      char temp[1024];
      int ret = Z_OK;
      while (ret != Z_STREAM_END) {
        strm.avail_out = sizeof(temp);
        strm.next_out = (Bytef*)temp;
        ret = inflate(&strm, Z_NO_FLUSH);
        if (ret != Z_OK && ret != Z_STREAM_END) break; // corrupt stream: stop instead of looping forever
      }
      obj.compressedLength = strm.total_in; // compressed input consumed up to the end of the zlib stream
      offset += obj.compressedLength;
      inflateEnd(&strm);
      objects.push_back(obj);
    }

    // Trailer
    checksum = "";
    for (size_t j = offset; j < data.size(); j++) {
      std::stringstream ss;
      ss << std::hex << std::setfill('0') << std::setw(2) << (int)(unsigned char)data[j];
      checksum += ss.str();
    }
  }

  uint32_t readBigUInt32() {
    uint32_t val = ( (uint32_t)(unsigned char)data[offset] << 24 ) | ( (uint32_t)(unsigned char)data[offset+1] << 16 ) | ( (uint32_t)(unsigned char)data[offset+2] << 8 ) | (uint32_t)(unsigned char)data[offset+3];
    offset += 4;
    return val;
  }

  void printProperties() {
    std::cout << "Signature: " << signature << std::endl;
    std::cout << "Version: " << version << std::endl;
    std::cout << "Number of Objects: " << numObjects << std::endl;
    for (size_t i = 0; i < objects.size(); i++) {
      const auto& obj = objects[i];
      std::cout << "Object " << i << ":" << std::endl;
      std::cout << "  Type: " << obj.type << " (" << getTypeName(obj.type) << ")" << std::endl;
      std::cout << "  Size: " << obj.size << std::endl;
      if (!obj.baseId.empty()) std::cout << "  Base ID: " << obj.baseId << std::endl;
      if (obj.baseOffset != 0) std::cout << "  Base Offset: " << obj.baseOffset << std::endl;
      std::cout << "  Data Offset: " << obj.dataOffset << std::endl;
      std::cout << "  Compressed Length: " << obj.compressedLength << std::endl;
    }
    std::cout << "Checksum: " << checksum << std::endl;
  }

  void write(const std::string& newFilename) {
    std::ofstream file(newFilename, std::ios::binary);
    file.write(data.data(), data.size());
  }

  std::string getTypeName(int type) {
    switch (type) {
      case 1: return "commit";
      case 2: return "tree";
      case 3: return "blob";
      case 4: return "tag";
      case 6: return "ofs_delta";
      case 7: return "ref_delta";
      default: return "unknown";
    }
  }
};

int main() {
  PackFile pack("example.pack");
  pack.open();
  pack.decode();
  pack.printProperties();
  pack.write("copy.pack");
  return 0;
}