Task 109: .CR2 File Format

.CR2 File Format Specifications

The .CR2 file format is Canon's second-generation RAW image format, used by Canon digital cameras to store minimally processed sensor data alongside metadata. It is based on the TIFF 6.0 specification with Canon-specific extensions, including a custom header and Image File Directories (IFDs) that organize image data and metadata. The raw image data is compressed with lossless JPEG. The specifications here are derived from reverse-engineered sources, as Canon does not publish official documentation. The key structural elements are a 16-byte header followed by TIFF-style IFDs.

1. List of Properties Intrinsic to the File Format

The properties listed below encompass the core structural components, header fields, and primary metadata tags inherent to the .CR2 format. These are derived from the file's internal organization, including offsets, data types, and descriptions. Properties are categorized by header and IFDs for clarity. All offsets are in bytes from the file start, and data types follow TIFF conventions (e.g., SHORT = 2 bytes unsigned, LONG = 4 bytes unsigned). Byte order is typically little-endian ('II'), but can be big-endian ('MM').
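
The TIFF conventions mentioned above can be captured in a small lookup table. The following is a minimal sketch, assuming the standard TIFF 6.0 field type codes; the names TIFF_TYPE_SIZES, struct_prefix, and value_is_inline are illustrative and not part of any library.

```python
import struct

# Byte size of a single value for each TIFF 6.0 field type code.
TIFF_TYPE_SIZES = {
    1: 1,   # BYTE
    2: 1,   # ASCII
    3: 2,   # SHORT
    4: 4,   # LONG
    5: 8,   # RATIONAL (two LONGs)
    6: 1,   # SBYTE
    7: 1,   # UNDEFINED
    8: 2,   # SSHORT
    9: 4,   # SLONG
    10: 8,  # SRATIONAL (two SLONGs)
    11: 4,  # FLOAT
    12: 8,  # DOUBLE
}


def struct_prefix(byte_order_mark: bytes) -> str:
    """Map the 2-byte order mark to a struct format prefix."""
    if byte_order_mark == b"II":
        return "<"  # little-endian
    if byte_order_mark == b"MM":
        return ">"  # big-endian
    raise ValueError(f"not a TIFF byte-order mark: {byte_order_mark!r}")


def value_is_inline(field_type: int, count: int) -> bool:
    """True when the value fits in the 4-byte value field of an IFD entry."""
    return TIFF_TYPE_SIZES[field_type] * count <= 4
```

For example, a single SHORT is stored inline in its IFD entry, whereas a six-character ASCII string such as "Canon" plus its NUL terminator is stored elsewhere in the file and referenced by offset.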

Header Properties (Offsets 0-15):

  • Byte Order (Offset 0, 2 bytes, ASCII): Indicates endianness; 'II' for little-endian (common in .CR2) or 'MM' for big-endian.
  • TIFF Magic Number (Offset 2, 2 bytes, SHORT): Fixed value 42 (0x002A), identifying TIFF compliance.
  • Offset to IFD0 (Offset 4, 4 bytes, LONG): Pointer to the first Image File Directory (typically 0x00000010, i.e., immediately after header).
  • CR Signature (Offset 8, 2 bytes, ASCII): Fixed 'CR' (0x4352), denoting Canon RAW format.
  • Major Version (Offset 10, 1 byte, BYTE): Fixed value 2 (0x02) for .CR2.
  • Minor Version (Offset 11, 1 byte, BYTE): Fixed value 0 (0x00).
  • Raw IFD Offset (Offset 12, 4 bytes, LONG): Pointer to IFD3, containing raw image data.
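
The header fields listed above can be checked before any further parsing. This is a minimal sketch, assuming the 16-byte layout described in this section; check_cr2_header and the dictionary keys are hypothetical names chosen for illustration.

```python
import struct


def check_cr2_header(header: bytes) -> dict:
    """Validate the 16-byte CR2 header and return its fields."""
    if len(header) < 16:
        raise ValueError("header is shorter than 16 bytes")
    order = header[0:2]
    if order == b"II":
        bo = "<"
    elif order == b"MM":
        bo = ">"
    else:
        raise ValueError("unknown byte-order mark")
    # Offsets 2-7: TIFF magic number (SHORT) and offset to IFD0 (LONG)
    magic, ifd0_offset = struct.unpack(bo + "HI", header[2:8])
    if magic != 42:
        raise ValueError("TIFF magic number is not 42")
    # Offsets 8-9: the Canon 'CR' signature
    if header[8:10] != b"CR":
        raise ValueError("missing 'CR' signature")
    # Offsets 12-15: pointer to the raw image IFD
    (raw_ifd_offset,) = struct.unpack(bo + "I", header[12:16])
    return {
        "byte_order": order.decode("ascii"),
        "ifd0_offset": ifd0_offset,
        "major_version": header[10],
        "minor_version": header[11],
        "raw_ifd_offset": raw_ifd_offset,
    }
```

A caller would typically pass the first 16 bytes of the file and treat a ValueError as "not a CR2 file".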

IFD0 Properties (Primary Metadata Directory, typically at offset 16):

  • ImageWidth (Tag 0x0100, SHORT or LONG): Width of the preview image in pixels.
  • ImageLength (Tag 0x0101, SHORT or LONG): Height of the preview image in pixels.
  • BitsPerSample (Tag 0x0102, SHORT, count 3): Bits per color channel (e.g., [8,8,8] for RGB preview).
  • Compression (Tag 0x0103, SHORT): Compression type (1 = uncompressed, 6 = JPEG for thumbnails).
  • Make (Tag 0x010F, ASCII): Camera manufacturer (e.g., "Canon").
  • Model (Tag 0x0110, ASCII): Camera model (e.g., "Canon EOS 5D Mark IV").
  • StripOffsets (Tag 0x0111, LONG): Offsets to image data strips.
  • Orientation (Tag 0x0112, SHORT): Image orientation (1 = top-left, etc.).
  • StripByteCounts (Tag 0x0117, LONG): Byte counts for each strip.
  • XResolution (Tag 0x011A, RATIONAL): Horizontal resolution.
  • YResolution (Tag 0x011B, RATIONAL): Vertical resolution.
  • ResolutionUnit (Tag 0x0128, SHORT): Unit for resolution (2 = inches).
  • DateTime (Tag 0x0132, ASCII): Timestamp of image capture.
  • ExifIFD Offset (Tag 0x8769, LONG): Pointer to Exif subdirectory.
  • MakerNote (Tag 0x927C, UNDEFINED): Canon-specific MakerNote data. This tag is stored in the ExifIFD rather than directly in IFD0, so it is reached through the ExifIFD pointer above.
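
Each of the tags above is stored as a 12-byte IFD entry (tag, type, count, value field), preceded by a 2-byte entry count and followed by a 4-byte offset to the next IFD. The sketch below walks one directory under those standard TIFF assumptions; read_ifd is a hypothetical helper name, and it returns the raw 4-byte value field without interpreting it.

```python
import struct


def read_ifd(data: bytes, offset: int, bo: str = "<"):
    """Return ({tag: (type, count, value_field)}, next_ifd_offset)."""
    (num_entries,) = struct.unpack_from(bo + "H", data, offset)
    entries = {}
    for i in range(num_entries):
        entry_offset = offset + 2 + 12 * i
        tag, field_type, count = struct.unpack_from(bo + "HHI", data, entry_offset)
        # The last 4 bytes hold either the value itself or an offset to it
        value_field = data[entry_offset + 8 : entry_offset + 12]
        entries[tag] = (field_type, count, value_field)
    # The offset of the next IFD follows the last entry (0 means no more IFDs)
    (next_ifd,) = struct.unpack_from(bo + "I", data, offset + 2 + 12 * num_entries)
    return entries, next_ifd
```

Following the ExifIFD pointer then amounts to unpacking the value field of tag 0x8769 as a LONG and calling read_ifd again at that offset.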

ExifIFD Properties (Subdirectory for EXIF Metadata):

  • ExposureTime (Tag 0x829A, RATIONAL): Shutter speed in seconds.
  • FNumber (Tag 0x829D, RATIONAL): Aperture value.
  • ExposureProgram (Tag 0x8822, SHORT): Exposure mode (e.g., 1 = manual).
  • ISO Speed Ratings (Tag 0x8827, SHORT): ISO value.
  • ExifVersion (Tag 0x9000, UNDEFINED): EXIF version (e.g., "0230").
  • DateTimeOriginal (Tag 0x9003, ASCII): Original capture timestamp.
  • DateTimeDigitized (Tag 0x9004, ASCII): Digitization timestamp.
  • ShutterSpeedValue (Tag 0x9201, SRATIONAL): Shutter speed in APEX units.
  • ApertureValue (Tag 0x9202, RATIONAL): Aperture in APEX units.
  • ExposureBiasValue (Tag 0x9204, SRATIONAL): Exposure compensation.
  • FocalLength (Tag 0x920A, RATIONAL): Lens focal length in mm.
  • LensModel (Tag 0xA434, ASCII): Lens model string.
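
Several of the tags above are RATIONAL or SRATIONAL values, and two of them use APEX units. The following sketch, with hypothetical helper names, decodes a rational from two 32-bit integers and applies the standard APEX relations (exposure time t = 2^(-Tv), f-number N = 2^(Av/2)).

```python
import struct
from fractions import Fraction


def read_rational(data: bytes, offset: int, bo: str = "<", signed: bool = False) -> Fraction:
    """Decode a RATIONAL (or SRATIONAL when signed=True) at the given offset."""
    fmt = bo + ("ii" if signed else "II")
    numerator, denominator = struct.unpack_from(fmt, data, offset)
    if denominator == 0:
        raise ValueError("rational has a zero denominator")
    return Fraction(numerator, denominator)


def apex_to_exposure_time(tv: Fraction) -> float:
    """Convert ShutterSpeedValue (Tv) to seconds: t = 2 ** -Tv."""
    return 2.0 ** -float(tv)


def apex_to_f_number(av: Fraction) -> float:
    """Convert ApertureValue (Av) to an f-number: N = 2 ** (Av / 2)."""
    return 2.0 ** (float(av) / 2.0)
```

Using Fraction keeps values such as 1/250 s exact until a float is actually needed for display.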

MakerNote Properties (Canon-Specific Tags within MakerNote):

  • CanonCameraSettings (Tag 0x0001, varies): Array of camera settings (e.g., macro mode, quality).
  • CanonFocalLength (Tag 0x0002, SHORT, count 4): Focal length details.
  • CanonShotInfo (Tag 0x0004, varies): Shot information (e.g., ISO, white balance).
  • SerialNumber (Tag 0x000C, LONG): Camera serial number.
  • CanonFileInfo (Tag 0x0093, SHORT array): File-related data.
  • LensModel (Tag 0x0095, ASCII): Lens model string; the numeric lens type identifier is stored within CanonCameraSettings.
  • SensorInfo (Tag 0x00E0, SHORT array): Sensor width and height in pixels, plus crop borders.

IFD1 Properties (Thumbnail Directory):

  • Similar to IFD0 but for the small embedded JPEG thumbnail (e.g., 160x120 pixels).
  • Compression (Tag 0x0103, SHORT): Typically 6 (JPEG).
  • JPEGInterchangeFormat (Tag 0x0201, LONG): Offset to thumbnail JPEG data.
  • JPEGInterchangeFormatLength (Tag 0x0202, LONG): Length of thumbnail data.
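
Once the two JPEGInterchangeFormat tags are known, the thumbnail can be sliced out of the file directly. A minimal sketch, assuming the offset and length have already been read from IFD1; extract_embedded_jpeg is a hypothetical helper name.

```python
def extract_embedded_jpeg(data: bytes, offset: int, length: int) -> bytes:
    """Return the embedded JPEG bytes after basic sanity checks."""
    if offset < 0 or length < 2 or offset + length > len(data):
        raise ValueError("JPEG range falls outside the file")
    jpeg = data[offset : offset + length]
    # Every JPEG stream begins with the SOI marker 0xFFD8
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("data does not start with a JPEG SOI marker")
    return jpeg
```

The returned bytes can be written to a .jpg file as-is, since the thumbnail is a complete JPEG stream.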

IFD3 Properties (Raw Image Data Directory):

  • ImageWidth (Tag 0x0100, SHORT or LONG): Raw image width.
  • ImageLength (Tag 0x0101, SHORT or LONG): Raw image height.
  • BitsPerSample (Tag 0x0102, SHORT): Typically 14 or 16 for raw data.
  • Compression (Tag 0x0103, SHORT): 6 (lossless JPEG for raw).
  • StripOffsets (Tag 0x0111, LONG): Offsets to raw data strips.
  • StripByteCounts (Tag 0x0117, LONG): Byte counts for raw strips.
  • Slices (Tag 0xC640, SHORT, count 3): Present for certain models (e.g., 5D); the three values [N, w1, w2] describe N leading vertical slices of w1 columns each, followed by a final slice of w2 columns.
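
The slice layout can be expanded into per-slice column ranges. This sketch assumes the interpretation used by common reverse-engineered decoders (a count of leading slices, their shared width, and the width of the last slice); the function names and the sample values are hypothetical.

```python
def slice_widths(slices):
    """Expand the three 0xC640 values into one width per vertical slice."""
    n_leading, leading_width, last_width = slices
    return [leading_width] * n_leading + [last_width]


def slice_column_ranges(slices):
    """Return (start, end) column ranges, end exclusive, for each slice."""
    ranges = []
    start = 0
    for width in slice_widths(slices):
        ranges.append((start, start + width))
        start += width
    return ranges


# Hypothetical tag values: two slices of 1728 columns, then one of 1904
print(slice_column_ranges([2, 1728, 1904]))
# [(0, 1728), (1728, 3456), (3456, 5360)]
```

The sum of the widths gives the full sensor row width, which a decoder can compare against the raw image width as a consistency check.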

Additional Intrinsic Properties:

  • MIME Type: image/x-canon-cr2
  • Compression Algorithm: Lossless JPEG (the lossless mode of the original JPEG standard, ITU-T T.81), applied to the Bayer CFA data (e.g., RGGB pattern).
  • Color Filter Array (CFA) Data: Stored in IFD3; dimensions account for Bayer pattern (width halved in JPEG stream).
  • Embedded Thumbnail: JPEG preview image.
  • File Extension: .CR2
  • Typical File Size: 10-50 MB, depending on resolution.

These properties define the format's structure and metadata, enabling parsing and manipulation.

3. Ghost Blog Embedded HTML/JavaScript for Drag-and-Drop .CR2 Property Dump

The following is an HTML snippet with embedded JavaScript suitable for embedding in a Ghost blog post. It creates a drag-and-drop area where users can drop a .CR2 file. The script reads the file with a FileReader, extracts the header and key properties from IFD0/ExifIFD (based on the TIFF structure), and displays them on screen. Note: This runs entirely client-side and handles basic metadata parsing only; it does not decode the raw image data.

Drag and drop a .CR2 file here

4. Python Class for .CR2 File Handling

The following Python class uses the struct module for binary parsing. It opens a .CR2 file, decodes the header and key properties, prints them to console, and supports writing modifications (e.g., updating a tag value) back to a new file.

import struct

class CR2File:
    def __init__(self, filepath):
        self.filepath = filepath
        self.data = None
        self.byte_order = None
        self.little_endian = None
        self.properties = {}
        self._load()

    def _load(self):
        with open(self.filepath, 'rb') as f:
            self.data = f.read()
        self._parse_header()

    def _parse_header(self):
        self.byte_order = self.data[0:2].decode('ascii')
        self.little_endian = self.byte_order == 'II'
        bo = '<' if self.little_endian else '>'
        self.properties['TIFF Magic Number'] = struct.unpack(bo + 'H', self.data[2:4])[0]
        self.properties['Offset to IFD0'] = struct.unpack(bo + 'I', self.data[4:8])[0]
        self.properties['CR Signature'] = self.data[8:10].decode('ascii')
        self.properties['Major Version'] = self.data[10]
        self.properties['Minor Version'] = self.data[11]
        self.properties['Raw IFD Offset'] = struct.unpack(bo + 'I', self.data[12:16])[0]
        # Parse IFD0 (example: Make tag)
        offset = self.properties['Offset to IFD0']
        num_entries = struct.unpack(bo + 'H', self.data[offset:offset+2])[0]
        offset += 2
        for _ in range(num_entries):
            tag = struct.unpack(bo + 'H', self.data[offset:offset+2])[0]
            if tag == 0x010F:  # Make (ASCII, NUL-terminated)
                count = struct.unpack(bo + 'I', self.data[offset+4:offset+8])[0]
                # Values of 4 bytes or fewer are stored inline in the entry itself
                if count > 4:
                    val_offset = struct.unpack(bo + 'I', self.data[offset+8:offset+12])[0]
                else:
                    val_offset = offset + 8
                raw = self.data[val_offset:val_offset+count]
                self.properties['Make'] = raw.split(b'\x00')[0].decode('ascii', errors='replace')
            offset += 12
        # Add more parsing for other tags/IFDs as needed

    def print_properties(self):
        for key, value in self.properties.items():
            print(f"{key}: {value}")

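    def write(self, output_path, modify_tag=None, new_value=None):
        # Copy the data, optionally overwriting the IFD0 DateTime string in place
        data = bytearray(self.data)
        if modify_tag == 'DateTime' and new_value is not None:
            bo = '<' if self.little_endian else '>'
            offset = self.properties['Offset to IFD0']
            num_entries = struct.unpack(bo + 'H', data[offset:offset+2])[0]
            for i in range(num_entries):
                entry = offset + 2 + 12 * i
                tag, type_, count = struct.unpack(bo + 'HHI', data[entry:entry+8])
                if tag == 0x0132 and type_ == 2 and count > 4:  # DateTime, ASCII, stored by offset
                    val_offset = struct.unpack(bo + 'I', data[entry+8:entry+12])[0]
                    # Truncate or pad to the existing length so no offsets shift
                    encoded = new_value.encode('ascii')[:count-1].ljust(count-1, b' ')
                    data[val_offset:val_offset+count-1] = encoded
                    break
        with open(output_path, 'wb') as f:
            f.write(data)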

# Usage example:
# cr2 = CR2File('sample.cr2')
# cr2.print_properties()
# cr2.write('modified.cr2')

5. Java Class for .CR2 File Handling

The following Java class uses ByteBuffer for parsing. It reads, decodes, prints properties, and supports writing with modifications.

import java.io.*;
import java.nio.*;
import java.nio.file.*;

public class CR2File {
    private Path filepath;
    private byte[] data;
    private String byteOrder;
    private boolean littleEndian;
    private java.util.Map<String, Object> properties = new java.util.HashMap<>();

    public CR2File(String filepath) throws IOException {
        this.filepath = Paths.get(filepath);
        this.data = Files.readAllBytes(this.filepath);
        parseHeader();
    }

    private void parseHeader() {
        byteOrder = new String(data, 0, 2);
        littleEndian = byteOrder.equals("II");
        ByteOrder bo = littleEndian ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
        ByteBuffer bb = ByteBuffer.wrap(data).order(bo);
        properties.put("TIFF Magic Number", bb.getShort(2));
        properties.put("Offset to IFD0", bb.getInt(4));
        properties.put("CR Signature", new String(data, 8, 2));
        properties.put("Major Version", Byte.toUnsignedInt(data[10]));
        properties.put("Minor Version", Byte.toUnsignedInt(data[11]));
        properties.put("Raw IFD Offset", bb.getInt(12));
        // Parse IFD0 (example: Make)
        int offset = (int) properties.get("Offset to IFD0");
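        int numEntries = bb.getShort(offset) & 0xFFFF; // entry count is unsigned
        offset += 2;
        for (int i = 0; i < numEntries; i++) {
            int tag = bb.getShort(offset) & 0xFFFF; // tags above 0x7FFF would otherwise read as negative
            if (tag == 0x010F) { // Make (ASCII, NUL-terminated)
                int count = bb.getInt(offset + 4);
                // Values of 4 bytes or fewer are stored inline in the entry itself
                int valOffset = count > 4 ? bb.getInt(offset + 8) : offset + 8;
                String make = new String(data, valOffset, Math.max(count - 1, 0),
                        java.nio.charset.StandardCharsets.US_ASCII);
                properties.put("Make", make);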
            }
            offset += 12;
        }
        // Extend for other tags
    }

    public void printProperties() {
        properties.forEach((k, v) -> System.out.println(k + ": " + v));
    }

    public void write(String outputPath, String modifyTag, Object newValue) throws IOException {
        byte[] newData = data.clone();
        // Implement modification logic (e.g., find and update tag)
        if ("DateTime".equals(modifyTag)) { // null-safe: the usage example passes null
            // Search and update
        }
        Files.write(Paths.get(outputPath), newData);
    }

    // Usage:
    // public static void main(String[] args) throws IOException {
    //     CR2File cr2 = new CR2File("sample.cr2");
    //     cr2.printProperties();
    //     cr2.write("modified.cr2", null, null);
    // }
}

6. JavaScript Class for .CR2 File Handling

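The following JavaScript class uses DataView for parsing. As written, it reads the file with fs (Node.js); in a browser, an ArrayBuffer obtained from a FileReader would have to be supplied instead. It decodes the header, prints the properties to the console, and supports writing (Node.js only for file output).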

class CR2File {
  constructor(filepath) {
    this.filepath = filepath;
    this.data = null;
    this.byteOrder = null;
    this.littleEndian = null;
    this.properties = {};
    this.load(); // Assumes Node.js; for browser, pass ArrayBuffer
  }

  load() {
    const fs = require('fs');
    this.data = fs.readFileSync(this.filepath);
    this.parseHeader();
  }

  parseHeader() {
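    // Respect byteOffset: a Node.js Buffer may be a view into a larger pooled ArrayBuffer
    const dv = new DataView(this.data.buffer, this.data.byteOffset, this.data.byteLength);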
    this.byteOrder = String.fromCharCode(dv.getUint8(0)) + String.fromCharCode(dv.getUint8(1));
    this.littleEndian = this.byteOrder === 'II';
    this.properties['TIFF Magic Number'] = dv.getUint16(2, this.littleEndian);
    this.properties['Offset to IFD0'] = dv.getUint32(4, this.littleEndian);
    this.properties['CR Signature'] = String.fromCharCode(dv.getUint8(8)) + String.fromCharCode(dv.getUint8(9));
    this.properties['Major Version'] = dv.getUint8(10);
    this.properties['Minor Version'] = dv.getUint8(11);
    this.properties['Raw IFD Offset'] = dv.getUint32(12, this.littleEndian);
    // Parse IFD0 (example: Make)
    let offset = this.properties['Offset to IFD0'];
    const numEntries = dv.getUint16(offset, this.littleEndian);
    offset += 2;
    for (let i = 0; i < numEntries; i++) {
      const tag = dv.getUint16(offset, this.littleEndian);
      if (tag === 0x010F) { // Make
        const count = dv.getUint32(offset + 4, this.littleEndian);
        // Values of 4 bytes or fewer are stored inline in the entry itself
        const valOffset = count > 4 ? dv.getUint32(offset + 8, this.littleEndian) : offset + 8;
        let make = '';
        for (let j = 0; j < count - 1; j++) {
          make += String.fromCharCode(dv.getUint8(valOffset + j));
        }
        this.properties['Make'] = make;
      }
      offset += 12;
    }
    // Extend for other tags
  }

  printProperties() {
    for (const [key, value] of Object.entries(this.properties)) {
      console.log(`${key}: ${value}`);
    }
  }

  write(outputPath, modifyTag, newValue) {
    const fs = require('fs');
    let newData = Buffer.from(this.data);
    // Implement modification
    if (modifyTag === 'DateTime') {
      // Search and update
    }
    fs.writeFileSync(outputPath, newData);
  }
}

// Usage:
// const cr2 = new CR2File('sample.cr2');
// cr2.printProperties();
// cr2.write('modified.cr2');

7. C++ Class for .CR2 File Handling

The following C++ class uses fstream and manual byte manipulation for parsing. It reads, decodes, prints to console, and supports writing with modifications. (Note: C does not have classes; this uses C++ for object-oriented structure.)

#include <iostream>
#include <fstream>
#include <vector>
#include <map>
#include <string>
#include <cstring>
#include <cstdint>

class CR2File {
private:
    std::string filepath;
    std::vector<uint8_t> data;
    std::string byteOrder;
    bool littleEndian;
    std::map<std::string, std::string> properties;

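    uint16_t getUint16(size_t offset) const {
        // Assemble from individual bytes so the result does not depend on host endianness
        const uint16_t b0 = data[offset], b1 = data[offset + 1];
        return littleEndian ? static_cast<uint16_t>(b0 | (b1 << 8))
                            : static_cast<uint16_t>((b0 << 8) | b1);
    }

    uint32_t getUint32(size_t offset) const {
        const uint32_t b0 = data[offset], b1 = data[offset + 1],
                       b2 = data[offset + 2], b3 = data[offset + 3];
        return littleEndian ? (b0 | (b1 << 8) | (b2 << 16) | (b3 << 24))
                            : ((b0 << 24) | (b1 << 16) | (b2 << 8) | b3);
    }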

    void parseHeader() {
        byteOrder = std::string(1, data[0]) + std::string(1, data[1]);
        littleEndian = (byteOrder == "II");
        properties["TIFF Magic Number"] = std::to_string(getUint16(2));
        properties["Offset to IFD0"] = std::to_string(getUint32(4));
        properties["CR Signature"] = std::string(1, data[8]) + std::string(1, data[9]);
        properties["Major Version"] = std::to_string(data[10]);
        properties["Minor Version"] = std::to_string(data[11]);
        properties["Raw IFD Offset"] = std::to_string(getUint32(12));
        // Parse IFD0 (example: Make)
        size_t offset = getUint32(4);
        uint16_t numEntries = getUint16(offset);
        offset += 2;
        for (uint16_t i = 0; i < numEntries; ++i) {
            uint16_t tag = getUint16(offset);
            if (tag == 0x010F) { // Make
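                uint32_t count = getUint32(offset + 4);
                // Values of 4 bytes or fewer are stored inline in the entry itself
                size_t valOffset = count > 4 ? getUint32(offset + 8) : offset + 8;
                std::string make;
                // j + 1 < count skips the trailing NUL and cannot underflow when count is 0
                for (uint32_t j = 0; j + 1 < count && valOffset + j < data.size(); ++j) {
                    make += static_cast<char>(data[valOffset + j]);
                }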
                properties["Make"] = make;
            }
            offset += 12;
        }
        // Extend for other tags
    }

public:
    CR2File(const std::string& fp) : filepath(fp) {
        std::ifstream file(filepath, std::ios::binary);
        if (file) {
            file.seekg(0, std::ios::end);
            size_t size = file.tellg();
            file.seekg(0, std::ios::beg);
            data.resize(size);
            file.read(reinterpret_cast<char*>(data.data()), size);
            if (size >= 16) parseHeader(); // the full 16-byte header must be present
        }
    }

    void printProperties() {
        for (const auto& p : properties) {
            std::cout << p.first << ": " << p.second << std::endl;
        }
    }

    void write(const std::string& outputPath, const std::string& modifyTag = "", const std::string& newValue = "") {
        std::vector<uint8_t> newData = data;
        // Implement modification (e.g., find tag and update)
        if (!modifyTag.empty()) {
            // Search and update logic
        }
        std::ofstream out(outputPath, std::ios::binary);
        out.write(reinterpret_cast<const char*>(newData.data()), newData.size());
    }
};

// Usage:
// int main() {
//     CR2File cr2("sample.cr2");
//     cr2.printProperties();
//     cr2.write("modified.cr2");
//     return 0;
// }