Task 179: .EML File Format

File Format Specifications for .EML

The .EML file format stores a single email message as plain text. Its structure follows the Internet Message Format defined in RFC 5322 (which obsoletes RFC 2822, itself the successor to RFC 822), extended by MIME (Multipurpose Internet Mail Extensions) as defined in RFC 2045 through RFC 2049. An .EML file consists of header fields, a blank line, and then the message body; the body may be a single part or a multipart structure carrying plain text, HTML, and attachments. Header fields include From, To, and Subject, along with MIME-specific fields such as Content-Type and Content-Transfer-Encoding.
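The header/blank-line/body layout described above can be illustrated with a short sketch using Python's standard-library email package. The message text and addresses here are made up for illustration; they are not taken from any real file.

```python
from email import message_from_string, policy

# A hypothetical minimal message: header fields, one blank line, then the body.
RAW_EML = (
    "From: alice@example.com\r\n"
    "To: bob@example.com\r\n"
    "Subject: Hello\r\n"
    "MIME-Version: 1.0\r\n"
    "Content-Type: text/plain; charset=utf-8\r\n"
    "\r\n"
    "This is the body.\r\n"
)

msg = message_from_string(RAW_EML, policy=policy.default)

subject = msg["Subject"]                 # header lookup is case-insensitive
content_type = msg.get_content_type()    # media type without parameters
body = msg.get_content()                 # decoded text of a single-part message

print(subject)
print(content_type)
print(body.strip())
```

Everything before the first empty line is parsed as header fields; everything after it is the body.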

List of all properties of this file format intrinsic to its structure (the header fields a message may carry):

  • Accept-Language
  • Also-Control
  • Alternate-Recipient
  • Approved
  • ARC-Authentication-Results
  • ARC-Message-Signature
  • ARC-Seal
  • Archive
  • Archived-At
  • Article-Names
  • Article-Updates
  • Authentication-Results
  • Auto-Submitted
  • Autoforwarded
  • Autosubmitted
  • Base
  • Bcc
  • Body
  • Cancel-Key
  • Cancel-Lock
  • Cc
  • Comments
  • Content-Alternative
  • Content-Base
  • Content-Description
  • Content-Disposition
  • Content-Duration
  • Content-features
  • Content-ID
  • Content-Identifier
  • Content-Language
  • Content-Location
  • Content-MD5
  • Content-Return
  • Content-Transfer-Encoding
  • Content-Translation-Type
  • Content-Type
  • Control
  • Conversion
  • Conversion-With-Loss
  • DL-Expansion-History
  • Date
  • Date-Received
  • Deferred-Delivery
  • Delivery-Date
  • Discarded-X400-IPMS-Extensions
  • Discarded-X400-MTS-Extensions
  • Disclose-Recipients
  • Disposition-Notification-Options
  • Disposition-Notification-To
  • Distribution
  • DKIM-Signature
  • Downgraded-Bcc
  • Downgraded-Cc
  • Downgraded-Disposition-Notification-To
  • Downgraded-Final-Recipient
  • Downgraded-From
  • Downgraded-In-Reply-To
  • Downgraded-Mail-From
  • Downgraded-Message-Id
  • Downgraded-Original-Recipient
  • Downgraded-Rcpt-To
  • Downgraded-References
  • Downgraded-Reply-To
  • Downgraded-Resent-Bcc
  • Downgraded-Resent-Cc
  • Downgraded-Resent-From
  • Downgraded-Resent-Reply-To
  • Downgraded-Resent-Sender
  • Downgraded-Resent-To
  • Downgraded-Return-Path
  • Downgraded-Sender
  • Downgraded-To
  • Encoding
  • Encrypted
  • Expires
  • Expiry-Date
  • Followup-To
  • From
  • Generate-Delivery-Report
  • HP-Outer
  • Importance
  • In-Reply-To
  • Incomplete-Copy
  • Injection-Date
  • Injection-Info
  • Keywords
  • Language
  • Latest-Delivery-Time
  • Lines
  • List-Archive
  • List-Help
  • List-ID
  • List-Owner
  • List-Post
  • List-Subscribe
  • List-Unsubscribe
  • List-Unsubscribe-Post
  • Message-Context
  • Message-ID
  • Message-Type
  • MIME-Version
  • MMHS-Exempted-Address
  • MMHS-Extended-Authorisation-Info
  • MMHS-Subject-Indicator-Codes
  • MMHS-Handling-Instructions
  • MMHS-Message-Instructions
  • MMHS-Codress-Message-Indicator
  • MMHS-Originator-Reference
  • MMHS-Primary-Precedence
  • MMHS-Copy-Precedence
  • MMHS-Message-Type
  • MMHS-Other-Recipients-Indicator-To
  • MMHS-Other-Recipients-Indicator-CC
  • MMHS-Acp127-Message-Identifier
  • MMHS-Originator-PLAD
  • MT-Priority
  • Newsgroups
  • NNTP-Posting-Date
  • NNTP-Posting-Host
  • Obsoletes
  • Organization
  • Original-Encoded-Information-Types
  • Original-From
  • Original-Message-ID
  • Original-Recipient
  • Original-Sender
  • Originator-Return-Address
  • Original-Subject
  • Path
  • PICS-Label
  • Posting-Version
  • Prevent-NonDelivery-Report
  • Priority
  • Received
  • Received-SPF
  • References
  • Relay-Version
  • Reply-By
  • Reply-To
  • Require-Recipient-Valid-Since
  • Resent-Bcc
  • Resent-Cc
  • Resent-Date
  • Resent-From
  • Resent-Message-ID
  • Resent-Reply-To
  • Resent-Sender
  • Resent-To
  • Return-Path
  • See-Also
  • Sender
  • Sensitivity
  • Solicitation
  • Subject
  • Summary
  • Supersedes
  • TLS-Report-Domain
  • TLS-Report-Submitter
  • TLS-Required
  • To
  • User-Agent
  • VBR-Info
  • X400-Content-Identifier
  • X400-Content-Return
  • X400-Content-Type
  • X400-MTS-Identifier
  • X400-Originator
  • X400-Received
  • X400-Recipients
  • X400-Trace
  • Xref
    (Note: These are the permanent message header field names registered with IANA. Beyond the headers, the format includes the message body, which may hold plain text, HTML, or multipart content whose parts carry their own sub-headers such as Content-Type and Content-Disposition. Which fields appear varies from message to message, and most are optional.)
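To make the point about per-part sub-headers concrete, the following sketch builds a small multipart message in memory and lists the Content-Type and Content-Disposition of each part. The addresses, file name, and contents are hypothetical, and the message is constructed rather than read from disk so the example stays self-contained.

```python
from email.message import EmailMessage

# Build a hypothetical message: text + HTML alternative + one attachment.
msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Report"
msg.set_content("Plain-text version.")
msg.add_alternative("<p>HTML version.</p>", subtype="html")
msg.add_attachment(
    b"col1,col2\n1,2\n",
    maintype="application",
    subtype="octet-stream",
    filename="data.csv",
)

# walk() visits the container parts as well as the leaf parts, depth first.
summary = [
    (part.get_content_type(), part.get_content_disposition(), part.get_filename())
    for part in msg.walk()
]

for content_type, disposition, filename in summary:
    print(content_type, disposition, filename)
```

Each leaf part has its own Content-Type, and only the attachment carries a Content-Disposition with a file name.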

Two direct download links for .EML files:

Ghost blog embedded HTML JavaScript for drag-and-drop .EML file dumper:

Drag and drop a .EML file here

(This can be embedded in a Ghost blog post. It parses headers, decodes simple bodies, and dumps matching properties from the list plus a truncated body. For multipart, it notes the part count but doesn't fully extract attachments.)

Python class for .EML handling:

import email
import email.policy
import sys

class EMLHandler:
    def __init__(self, filepath):
        self.filepath = filepath
        self.message = None
        self.load()

    def load(self):
        with open(self.filepath, 'rb') as fp:
            self.message = email.message_from_binary_file(fp, policy=email.policy.default)

    def decode_and_print_properties(self, properties_list):
        print(f"Properties for {self.filepath}:")
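        for prop in properties_list:
            # get_all() returns every occurrence of a header (e.g. several
            # Received lines), or None when the header is absent.
            for value in self.message.get_all(prop) or []:
                print(f"{prop}: {value}")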
        # Print body
        body = self.get_decoded_body()
        print(f"\nBody: {body[:500]}... (truncated)")

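    def get_decoded_body(self):
        if self.message.is_multipart():
            # walk() also reaches text parts nested inside inner multiparts,
            # which iter_parts() (direct children only) would miss.
            parts = [part.get_content() for part in self.message.walk()
                     if part.get_content_type() == 'text/plain' and not part.is_attachment()]
            return '\n'.join(parts)
        # Single-part message: undo the transfer encoding, then decode the
        # bytes with the declared charset (falling back to UTF-8).
        payload = self.message.get_payload(decode=True) or b''
        charset = self.message.get_content_charset() or 'utf-8'
        return payload.decode(charset, errors='replace')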

    def write(self, new_filepath, changes=None):
        if changes:
            for key, value in changes.items():
                # Replace an existing header in place, or append a new one.
                if key in self.message:
                    self.message.replace_header(key, value)
                else:
                    self.message.add_header(key, value)
        with open(new_filepath, 'wb') as fp:
            fp.write(self.message.as_bytes())

# Example usage
if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python script.py path/to/file.eml")
        sys.exit(1)
    properties_list = [
        'Accept-Language', 'Also-Control', 'Alternate-Recipient', 'Approved', 'ARC-Authentication-Results',
        # ... (omit for brevity; include full list from above in actual code)
        'Xref'
    ]
    handler = EMLHandler(sys.argv[1])
    handler.decode_and_print_properties(properties_list)
    # Example write: handler.write('output.eml', {'Subject': 'Modified Subject'})

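(Uses Python's standard-library email package with policy.default for parsing and decoding. The class reads an .EML file, prints whichever listed properties are present along with the first 500 characters of the plain-text body, and write() saves the message, optionally with changed headers, to a new file.)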
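The read-modify-write cycle described above can be checked with a small round trip. This is a sketch only: the raw message, the helper name set_header, and the X-Example header are all hypothetical, and the message is kept in memory instead of being written to a file.

```python
from email import message_from_bytes, policy

# Hypothetical raw message standing in for the contents of an .EML file.
RAW = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: Original\r\n"
    b"\r\n"
    b"Body text.\r\n"
)

def set_header(msg, name, value):
    """Replace an existing header in place, or append it if absent."""
    if name in msg:
        msg.replace_header(name, value)
    else:
        msg[name] = value

msg = message_from_bytes(RAW, policy=policy.default)
set_header(msg, "Subject", "Modified Subject")
set_header(msg, "X-Example", "added")  # hypothetical header name

# Serialize and parse again, as a later reader of the saved file would.
reparsed = message_from_bytes(msg.as_bytes(), policy=policy.default)
header_names = reparsed.keys()

print(header_names)
print(reparsed["Subject"])
```

Replacing a header in place keeps its position among the other headers, while a header that was not present before is appended at the end.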

Java class for .EML handling:

import jakarta.mail.*;
import jakarta.mail.internet.*;
import java.io.*;
import java.util.*;

public class EMLHandler {
    private MimeMessage message;
    private String filepath;

    public EMLHandler(String filepath) throws Exception {
        this.filepath = filepath;
        load();
    }

    private void load() throws Exception {
        Properties props = new Properties();
        Session session = Session.getDefaultInstance(props);
        try (InputStream is = new FileInputStream(filepath)) {
            message = new MimeMessage(session, is);
        }
    }

    public void decodeAndPrintProperties(List<String> propertiesList) throws Exception {
        System.out.println("Properties for " + filepath + ":");
        for (String prop : propertiesList) {
            // getHeader() matches header names case-insensitively and joins
            // repeated headers (e.g. Received) with the given delimiter.
            String value = message.getHeader(prop, ", ");
            if (value != null) {
                System.out.println(prop + ": " + value);
            }
        }
        // Print body
        String body = getDecodedBody();
        System.out.println("\nBody: " + body.substring(0, Math.min(500, body.length())) + "... (truncated)");
    }

    private String getDecodedBody() throws Exception {
        Object content = message.getContent();
        if (content instanceof Multipart) {
            Multipart mp = (Multipart) content;
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < mp.getCount(); i++) {
                BodyPart bp = mp.getBodyPart(i);
                // isMimeType() ignores case and parameters such as charset.
                if (bp.isMimeType("text/plain")) {
                    sb.append((String) bp.getContent());
                }
            }
            return sb.toString();
        }
        // Non-text single-part content is returned as a stream or other
        // object rather than a String, so guard the cast.
        return (content instanceof String) ? (String) content : "";
    }

    public void write(String newFilepath, Map<String, String> changes) throws Exception {
        if (changes != null) {
            for (Map.Entry<String, String> entry : changes.entrySet()) {
                message.setHeader(entry.getKey(), entry.getValue());
            }
        }
        try (OutputStream os = new FileOutputStream(newFilepath)) {
            message.writeTo(os);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("Usage: java EMLHandler path/to/file.eml");
            System.exit(1);
        }
        List<String> propertiesList = Arrays.asList(
            "Accept-Language", "Also-Control", /* ... full list omitted for brevity ... */ "Xref"
        );
        EMLHandler handler = new EMLHandler(args[0]);
        handler.decodeAndPrintProperties(propertiesList);
        // Example: handler.write("output.eml", Map.of("Subject", "Modified Subject"));
    }
}

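(Uses Jakarta Mail for parsing and decoding, so the jakarta.mail dependency must be on the classpath. The class reads an .EML file, prints whichever listed properties are present along with the first 500 characters of the plain-text body, and write() saves the message, optionally with changed headers, to a new file.)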

JavaScript class for .EML handling:

const fs = require('fs'); // For Node.js

class EMLHandler {
    constructor(filepath) {
        this.filepath = filepath;
        this.headers = {};
        this.body = '';
        this.load();
    }

    load() {
        const content = fs.readFileSync(this.filepath, 'utf8');
        this.parse(content);
    }

    parse(content) {
        const lines = content.split(/\r?\n/);
        let headerMode = true;
        let currentHeader = '';
        let i = 0;

        while (i < lines.length) {
            const line = lines[i];
            if (headerMode) {
                if (line.trim() === '') {
                    headerMode = false;
                    i++;
                    continue;
                }
                if (line.startsWith(' ') || line.startsWith('\t')) {
                    if (currentHeader) {
                        this.headers[currentHeader] += ' ' + line.trim();
                    }
                } else {
                    const colonIndex = line.indexOf(':');
                    if (colonIndex > -1) {
                        currentHeader = line.substring(0, colonIndex).trim();
                        this.headers[currentHeader] = line.substring(colonIndex + 1).trim();
                    }
                }
            } else {
                this.body += line + '\n';
            }
            i++;
        }

        const contentType = this.headers['Content-Type'] || '';
        if (contentType.includes('multipart/')) {
            // Basic multipart note
            this.body = 'Multipart body detected (full parsing omitted).';
        } else {
            const transferEncoding = this.headers['Content-Transfer-Encoding'] || '';
            if (transferEncoding.toLowerCase() === 'base64') {
                this.body = Buffer.from(this.body.trim(), 'base64').toString('utf8');
            } else if (transferEncoding.toLowerCase() === 'quoted-printable') {
                this.body = this.decodeQuotedPrintable(this.body);
            }
        }
    }

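    decodeQuotedPrintable(text) {
        // Remove soft line breaks first; otherwise a decoded "=" (from =3D) at
        // the end of a line would be mistaken for one. Bytes are mapped
        // one-to-one to characters, so multi-byte UTF-8 sequences are not reassembled.
        return text.replace(/=\r?\n/g, '')
                   .replace(/=([0-9A-F]{2})/gi, (match, hex) => String.fromCharCode(parseInt(hex, 16)));
    }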

    decodeAndPrintProperties(propertiesList) {
        console.log(`Properties for ${this.filepath}:`);
        propertiesList.forEach(prop => {
            if (this.headers.hasOwnProperty(prop)) {
                console.log(`${prop}: ${this.headers[prop]}`);
            }
        });
        console.log(`\nBody: ${this.body.trim().substring(0, 500)}... (truncated)`);
    }

    write(newFilepath, changes) {
        if (changes) {
            Object.assign(this.headers, changes);
        }
        let content = '';
        for (const [key, value] of Object.entries(this.headers)) {
            content += `${key}: ${value}\n`;
        }
        content += '\n' + this.body;
        fs.writeFileSync(newFilepath, content);
    }
}

// Example usage (Node.js)
const propertiesList = [
    'Accept-Language', 'Also-Control', /* ... full list omitted ... */ 'Xref'
];
if (process.argv.length < 3) {
    console.log('Usage: node script.js path/to/file.eml');
    process.exit(1);
}
const handler = new EMLHandler(process.argv[2]);
handler.decodeAndPrintProperties(propertiesList);
// Example: handler.write('output.eml', { Subject: 'Modified Subject' });

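(Custom parser for Node.js with no external dependencies. It reads an .EML file, decodes base64 and quoted-printable single-part bodies, and prints the listed properties plus the first 500 characters of the body. Header names are matched case-sensitively, and only the last occurrence of a repeated header is kept. Multipart bodies are detected but not parsed. write() serializes the headers followed by the body as held in memory, that is, the decoded text or the multipart placeholder, so it is not a faithful round trip for encoded or multipart messages.)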
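The header-splitting and unfolding steps that the custom parser performs can be sketched in Python as well. This variant is an illustration under stated assumptions, not a port of the class above: parse_headers and get_all are hypothetical helper names, headers are kept as an ordered list of pairs so repeated fields survive, and lookup ignores case.

```python
def parse_headers(raw: str):
    """Split a raw message into ([(name, value), ...], body)."""
    # Normalize CRLF, then split at the first blank line.
    head, _, body = raw.replace("\r\n", "\n").partition("\n\n")
    headers = []
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and headers:
            # Continuation line: unfold it onto the previous header.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        elif ":" in line:
            name, _, value = line.partition(":")
            headers.append((name.strip(), value.strip()))
    return headers, body


def get_all(headers, name):
    """Return every value for a header name, matched case-insensitively."""
    return [value for key, value in headers if key.lower() == name.lower()]


# Hypothetical sample with a folded, repeated header and a lowercase name.
sample = (
    "Received: from a.example\r\n"
    " by b.example\r\n"
    "Received: from c.example\r\n"
    "content-type: text/plain\r\n"
    "\r\n"
    "Hi\r\n"
)
headers, body = parse_headers(sample)
print(get_all(headers, "Received"))
print(get_all(headers, "Content-Type"))
```

Keeping a list of pairs instead of a dictionary preserves both the original order and repeated fields such as Received.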

C++ class for .EML handling:

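#include <iostream>
#include <fstream>
#include <string>
#include <map>
#include <vector>
#include <algorithm>
#include <cctype>
#include <cstdlib>    // strtol
#include <iterator>   // std::istreambuf_iterator
#include <stdexcept>  // std::runtime_error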

class EMLHandler {
private:
    std::string filepath;
    std::map<std::string, std::string> headers;
    std::string body;

    void parse(const std::string& content) {
        std::vector<std::string> lines;
        std::string line;
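        for (char c : content) {
            if (c == '\n') {
                // Drop the CR of a CRLF line ending so the blank line that
                // separates headers from body is truly empty.
                if (!line.empty() && line.back() == '\r') line.pop_back();
                lines.push_back(line);
                line.clear();
            } else {
                line += c;
            }
        }
        if (!line.empty()) lines.push_back(line);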

        bool headerMode = true;
        std::string currentHeader;
        size_t i = 0;

        while (i < lines.size()) {
            line = lines[i];
            if (headerMode) {
                if (line.empty()) {
                    headerMode = false;
                    ++i;
                    continue;
                }
                if (std::isspace(static_cast<unsigned char>(line[0]))) {
                    if (!currentHeader.empty()) {
                        headers[currentHeader] += " " + line.substr(1);
                    }
                } else {
                    size_t colonPos = line.find(':');
                    if (colonPos != std::string::npos) {
                        currentHeader = line.substr(0, colonPos);
                        // Trim spaces
                        currentHeader.erase(0, currentHeader.find_first_not_of(" \t"));
                        currentHeader.erase(currentHeader.find_last_not_of(" \t") + 1);
                        std::string value = line.substr(colonPos + 1);
                        value.erase(0, value.find_first_not_of(" \t"));
                        headers[currentHeader] = value;
                    }
                }
            } else {
                body += line + "\n";
            }
            ++i;
        }

        auto it = headers.find("Content-Type");
        std::string contentType = (it != headers.end()) ? it->second : "";
        if (contentType.find("multipart/") != std::string::npos) {
            body = "Multipart body detected (full parsing omitted).";
        } else {
            it = headers.find("Content-Transfer-Encoding");
            std::string transferEncoding = (it != headers.end()) ? it->second : "";
            std::transform(transferEncoding.begin(), transferEncoding.end(), transferEncoding.begin(), ::tolower);
            if (transferEncoding == "base64") {
                // Simple base64 decode (implement if needed; omitted for brevity)
                body = "Base64 decoded body (implementation omitted).";
            } else if (transferEncoding == "quoted-printable") {
                body = decodeQuotedPrintable(body);
            }
        }
    }

    std::string decodeQuotedPrintable(const std::string& text) {
        std::string result;
        for (size_t i = 0; i < text.size(); ++i) {
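            if (text[i] == '=' && i + 2 < text.size()
                    && std::isxdigit(static_cast<unsigned char>(text[i+1]))
                    && std::isxdigit(static_cast<unsigned char>(text[i+2]))) {
                char hex[3] = {text[i+1], text[i+2], '\0'};
                result += static_cast<char>(strtol(hex, nullptr, 16));
                i += 2;
            } else if (text[i] == '=' && i + 1 < text.size() && (text[i+1] == '\r' || text[i+1] == '\n')) {
                // Soft line break: consume "=\n" or "=\r\n" in full.
                if (text[i+1] == '\r' && i + 2 < text.size() && text[i+2] == '\n') ++i;
                ++i;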
            } else {
                result += text[i];
            }
        }
        return result;
    }

public:
    EMLHandler(const std::string& fp) : filepath(fp) {
        std::ifstream file(filepath);
        if (!file) {
            throw std::runtime_error("Failed to open file");
        }
        std::string content((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());
        parse(content);
    }

    void decodeAndPrintProperties(const std::vector<std::string>& propertiesList) {
        std::cout << "Properties for " << filepath << ":" << std::endl;
        for (const auto& prop : propertiesList) {
            auto it = headers.find(prop);
            if (it != headers.end()) {
                std::cout << prop << ": " << it->second << std::endl;
            }
        }
        std::cout << "\nBody: " << body.substr(0, 500) << "... (truncated)" << std::endl;
    }

    void write(const std::string& newFilepath, const std::map<std::string, std::string>& changes) {
        for (const auto& change : changes) {
            headers[change.first] = change.second;
        }
        std::ofstream file(newFilepath);
        if (!file) {
            throw std::runtime_error("Failed to write file");
        }
        for (const auto& header : headers) {
            file << header.first << ": " << header.second << "\n";
        }
        file << "\n" << body;
    }
};

int main(int argc, char* argv[]) {
    if (argc < 2) {
        std::cout << "Usage: ./program path/to/file.eml" << std::endl;
        return 1;
    }
    std::vector<std::string> propertiesList = {
        "Accept-Language", "Also-Control", /* ... full list omitted for brevity ... */ "Xref"
    };
    try {
        EMLHandler handler(argv[1]);
        handler.decodeAndPrintProperties(propertiesList);
        // Example: std::map<std::string, std::string> changes = {{"Subject", "Modified Subject"}};
        // handler.write("output.eml", changes);
    } catch (const std::exception& e) {
        std::cerr << e.what() << std::endl;
    }
    return 0;
}

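(Custom parser in C++ using only the standard library. It reads an .EML file, decodes quoted-printable single-part bodies, and prints the listed properties plus the first 500 characters of the body. Base64 decoding is omitted for brevity and multipart bodies are only detected; in both cases the body is replaced by a placeholder string. Because headers are stored in a std::map, write() emits them sorted by name with one value per name, followed by the body as held in memory, so it is not a faithful round trip.)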
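For reference, the transfer-decoding step that the custom parsers implement by hand (and that the C++ version leaves out for base64) can be sketched with Python's standard library. The function name decode_transfer and the sample payloads are hypothetical; the sketch works on bytes and leaves charset decoding to the caller.

```python
import base64
import quopri


def decode_transfer(body: bytes, encoding: str) -> bytes:
    """Undo a Content-Transfer-Encoding; unknown values pass through."""
    enc = encoding.strip().lower()
    if enc == "base64":
        # Line breaks inside the encoded text are ignored by default.
        return base64.b64decode(body)
    if enc == "quoted-printable":
        # Handles =XX escapes and "=" soft line breaks.
        return quopri.decodestring(body)
    return body  # 7bit, 8bit, binary: nothing to undo


b64_body = b"SGVsbG8s\r\nIHdvcmxkIQ==\r\n"
qp_body = b"na=C3=AFve, 1+1=3D2, soft=\nwrap\n"

decoded_b64 = decode_transfer(b64_body, "Base64").decode("utf-8")
decoded_qp = decode_transfer(qp_body, "quoted-printable").decode("utf-8")

print(decoded_b64)
print(decoded_qp)
```

Decoding to bytes first and applying the charset afterwards keeps multi-byte UTF-8 sequences intact.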