.MP4

Task 415: .MP4 File Format

thinkmelt@protonmail.com

Oct 5, 2025

File Format Specifications for .MP4

The .MP4 file format (MPEG-4 Part 14) is defined in ISO/IEC 14496-14 as an application of the more general ISO base media file format, ISO/IEC 14496-12. It is a container for multimedia data, structured as a series of nested boxes (also called atoms). Each box begins with a 32-bit big-endian size and a four-character type, followed by its payload; a size of 1 means a 64-bit size follows the type, and a size of 0 means the box runs to the end of the file. The format carries video, audio, subtitles, and metadata, and is designed for streaming, editing, and playback. The official specifications are published by ISO; the MP4 Registration Authority (MP4RA) lists registered box and codec identifiers, and Apple's QuickTime File Format documentation describes the format from which the ISO base media file format was derived.
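The box layout described above can be sketched in a few lines of Python. This is a minimal illustration rather than a full parser: `iter_boxes` and `make_box` are hypothetical helper names introduced here, and the sample bytes are synthetic, not taken from a real file.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (type, payload_start, box_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    offset = start
    while offset + 8 <= end:
        size, box_type = struct.unpack_from('>I4s', data, offset)
        header = 8
        if size == 1:  # a 64-bit "largesize" follows the type field
            if offset + 16 > end:
                break
            size = struct.unpack_from('>Q', data, offset + 8)[0]
            header = 16
        elif size == 0:  # box extends to the end of the enclosing data
            size = end - offset
        if size < header or offset + size > end:
            break  # malformed or truncated box: stop rather than loop forever
        yield box_type.decode('latin-1'), offset + header, offset + size
        offset += size

def make_box(box_type, payload):
    """Build a box with a 32-bit size field (hypothetical helper for the demo)."""
    return struct.pack('>I4s', 8 + len(payload), box_type) + payload

# Synthetic data: an 'ftyp' box followed by a 'moov' box holding an empty 'mvhd'.
sample = (make_box(b'ftyp', b'isom' + struct.pack('>I', 512) + b'isomiso2')
          + make_box(b'moov', make_box(b'mvhd', bytes(100))))

for box_type, payload_start, box_end in iter_boxes(sample):
    print(box_type, box_end - payload_start)
```

Container boxes such as 'moov' are walked by calling `iter_boxes` again on the payload range that the first pass yields.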

List of All Properties Intrinsic to the .MP4 File Format
Based on the ISO base media file format structure, the following are key intrinsic properties extracted from the file's box hierarchy (e.g., ftyp, mvhd, tkhd, mdhd, hdlr, stsd). These are structural and metadata properties essential to the format for playback and processing, not user-added tags like title or author. They are derived from common boxes and fields in MP4 files:

Major Brand: The primary file type identifier (from 'ftyp' box).
Minor Version: The version of the major brand (from 'ftyp' box).
Compatible Brands: List of compatible file types/brands (from 'ftyp' box).
Creation Time: Timestamp when the movie was created (from 'mvhd' box, in seconds since 1904-01-01).
Modification Time: Timestamp when the movie was last modified (from 'mvhd' box, in seconds since 1904-01-01).
Timescale: Units per second for timing (from 'mvhd' box).
Duration: Total movie duration in seconds (calculated as duration / timescale from 'mvhd' box).
Number of Tracks: Count of media tracks (derived from 'trak' boxes under 'moov').
Video Codec: Codec identifier for the video track (from 'stsd' box in video 'trak', e.g., 'avc1' for H.264).
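Video Width: Presentation width of the video in pixels (from 'tkhd' box in video 'trak', stored as a 16.16 fixed-point number).
Video Height: Presentation height of the video in pixels (from 'tkhd' box in video 'trak', stored as a 16.16 fixed-point number).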
Audio Codec: Codec identifier for the audio track (from 'stsd' box in audio 'trak', e.g., 'mp4a' for AAC).
Audio Channels: Number of audio channels (from audio sample entry in 'stsd' box).
Audio Sample Rate: Sampling rate in Hz (from the 'mdhd' timescale in the audio 'trak', which usually equals the sample rate; the audio sample entry in 'stsd' also stores the rate as a 16.16 fixed-point number).

These properties are intrinsic as they are part of the core box structure required for the file to be valid and playable.
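Several of the properties above need a small conversion before they are readable. The sketch below shows the three conversions used throughout this post: the 1904-based timestamps, the duration divided by the timescale, and the 16.16 fixed-point sizes. The function names are illustrative assumptions, and `decode_mvhd_v0` only handles version 0 of the 'mvhd' box (32-bit times and duration).

```python
import datetime
import struct

# MP4 timestamps count seconds from 1904-01-01 00:00:00 UTC.
MP4_EPOCH = datetime.datetime(1904, 1, 1, tzinfo=datetime.timezone.utc)

def mp4_time(seconds):
    """Convert an MP4 timestamp to a timezone-aware datetime."""
    return MP4_EPOCH + datetime.timedelta(seconds=seconds)

def fixed_16_16(raw):
    """Convert a 16.16 fixed-point integer (as used by 'tkhd' width/height)."""
    return raw / 65536

def decode_mvhd_v0(payload):
    """Decode the leading fields of a version 0 'mvhd' payload."""
    if payload[0] != 0:
        raise ValueError('only version 0 is handled in this sketch')
    # Skip 1 byte of version and 3 bytes of flags, then read four uint32 fields.
    creation, modification, timescale, duration = struct.unpack_from('>IIII', payload, 4)
    return {
        'creation': mp4_time(creation),
        'modification': mp4_time(modification),
        'timescale': timescale,
        'duration_seconds': duration / timescale,
    }

# Synthetic payload: created at the Unix epoch, timescale 600, duration 6000 units.
payload = struct.pack('>IIIII', 0, 2082844800, 2082844800 + 86400, 600, 6000)
info = decode_mvhd_v0(payload)
print(info['creation'].isoformat(), info['duration_seconds'])
print(fixed_16_16(1920 << 16))
```

The constant 2082844800 that appears in the parsers below is the number of seconds between 1904-01-01 and 1970-01-01, which is why subtracting it yields a Unix timestamp.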

Two Direct Download Links for .MP4 Files

https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4 (Sample video of an animated bunny, ~60MB).
https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ElephantsDream.mp4 (Sample video of an animated story, ~150MB).

Ghost Blog Embedded HTML/JavaScript for Drag-and-Drop .MP4 Property Dump
This is an embeddable HTML snippet with JavaScript for a Ghost blog post. It creates a drag-and-drop zone. When an .MP4 file is dropped, it reads the file as an ArrayBuffer, parses the boxes to extract the properties from the list above, and dumps them to the screen in a readable format. It handles basic box recursion but assumes a standard MP4 structure (no error handling for malformed files).

Drag and drop an .MP4 file here

Python Class for .MP4 Parsing
This Python class opens an .MP4 file, decodes the boxes, reads and extracts the properties from the list above, prints them to console, and can write the file back (as is, or modified if properties are set, though modification is basic here for demonstration).

import struct
import datetime

class MP4Parser:
    def __init__(self, file_path):
        with open(file_path, 'rb') as f:
            self.data = f.read()
        self.offset = 0
        self.properties = {}
        self.tracks = []
        self.parse()

    def read_uint32(self):
        val = struct.unpack('>I', self.data[self.offset:self.offset+4])[0]
        self.offset += 4
        return val

    def read_uint16(self):
        val = struct.unpack('>H', self.data[self.offset:self.offset+2])[0]
        self.offset += 2
        return val

    def read_string(self, len_):
        # Four-character codes are raw bytes and are not always valid UTF-8.
        str_ = self.data[self.offset:self.offset+len_].decode('latin-1')
        self.offset += len_
        return str_

    # MP4 timestamps count seconds from 1904-01-01 00:00:00 UTC.
    EPOCH = datetime.datetime(1904, 1, 1, tzinfo=datetime.timezone.utc)
    current_track = None  # dict for the 'trak' box being parsed, otherwise None

    def read_uint64(self):
        high = self.read_uint32()
        return (high << 32) | self.read_uint32()

    def read_version(self):
        version = self.data[self.offset]
        self.offset += 4  # 1-byte version + 3-byte flags
        return version

    def read_time(self, version=0):
        # Version 1 boxes store 64-bit times; version 0 boxes store 32-bit times.
        seconds = self.read_uint64() if version == 1 else self.read_uint32()
        try:
            return (self.EPOCH + datetime.timedelta(seconds=seconds)).isoformat()
        except OverflowError:
            return 'Unknown'

    def parse_box(self):
        start = self.offset
        if len(self.data) - start < 8:  # trailing bytes too short for a box header
            self.offset = len(self.data)
            return
        size = self.read_uint32()
        type_ = self.read_string(4)
        if size == 1 and len(self.data) - self.offset >= 8:
            size = self.read_uint64()  # 64-bit "largesize" follows the type
        elif size == 0:
            size = len(self.data) - start  # box runs to the end of the file
        end = start + size
        if size < self.offset - start or end > len(self.data):
            self.offset = len(self.data)  # malformed or truncated box: stop parsing
            return
        if type_ == 'ftyp':
            self.properties['Major Brand'] = self.read_string(4)
            self.properties['Minor Version'] = self.read_uint32()
            self.properties['Compatible Brands'] = []
            while self.offset + 4 <= end:
                self.properties['Compatible Brands'].append(self.read_string(4))
        elif type_ in ('moov', 'mdia', 'minf', 'stbl'):  # pure containers
            self.parse_container(end)
        elif type_ == 'trak':
            self.parse_trak(end)
        elif type_ == 'mvhd':
            self.parse_mvhd()
        elif self.current_track is not None:  # leaf boxes that describe a track
            if type_ == 'tkhd':
                self.current_track['tkhd'] = self.parse_tkhd()
            elif type_ == 'mdhd':
                self.current_track['mdhd'] = self.parse_mdhd()
            elif type_ == 'hdlr':
                self.current_track['handler'] = self.parse_hdlr()
            elif type_ == 'stsd':
                self.current_track['stsd'] = self.parse_stsd(self.current_track['handler'])
        self.offset = end  # always resume at the next sibling box

    def parse_container(self, end):
        while self.offset + 8 <= end:
            self.parse_box()

    def parse_mvhd(self):
        version = self.read_version()
        self.properties['Creation Time'] = self.read_time(version)
        self.properties['Modification Time'] = self.read_time(version)
        timescale = self.read_uint32()
        duration = self.read_uint64() if version == 1 else self.read_uint32()
        self.properties['Timescale'] = timescale
        self.properties['Duration'] = f"{duration / timescale:.2f} seconds" if timescale else 'Unknown'

    def parse_tkhd(self):
        version = self.read_version()
        self.offset += 16 if version == 1 else 8  # creation + modification times
        track_id = self.read_uint32()
        self.offset += 4  # reserved
        duration = self.read_uint64() if version == 1 else self.read_uint32()
        self.offset += 8 + 2 + 2 + 2 + 2 + 36  # reserved, layer, alternate_group, volume, reserved, matrix
        width = self.read_uint32() / 65536  # 16.16 fixed point
        height = self.read_uint32() / 65536
        return {'id': track_id, 'width': width, 'height': height, 'duration': duration}

    def parse_mdhd(self):
        version = self.read_version()
        self.offset += 16 if version == 1 else 8  # creation + modification times
        timescale = self.read_uint32()
        return {'timescale': timescale}  # duration and language are not needed

    def parse_hdlr(self):
        self.offset += 4  # version + flags
        self.offset += 4  # pre_defined
        return self.read_string(4)  # handler type, e.g. 'vide' or 'soun'

    def parse_stsd(self, handler_type):
        self.offset += 4  # version + flags
        if self.read_uint32() == 0:  # entry_count
            return {}
        self.offset += 4  # size of the first sample entry
        entry = {'codec': self.read_string(4)}  # sample entry type, e.g. 'avc1'
        self.offset += 6 + 2  # reserved + data_reference_index
        if handler_type == 'vide':
            self.offset += 16  # pre_defined, reserved, pre_defined[3]
            entry['width'] = self.read_uint16()
            entry['height'] = self.read_uint16()
        elif handler_type == 'soun':
            self.offset += 8  # reserved
            entry['channels'] = self.read_uint16()
            self.offset += 2 + 2 + 2  # samplesize, pre_defined, reserved
            entry['sample_rate'] = self.read_uint32() >> 16  # 16.16 fixed point
        return entry

    def parse_trak(self, end):
        # Child boxes fill this dict through parse_box while the track is open.
        track = self.current_track = {'handler': '', 'tkhd': {}, 'mdhd': {}, 'stsd': {}}
        self.parse_container(end)
        self.current_track = None
        tkhd, mdhd, stsd = track['tkhd'], track['mdhd'], track['stsd']
        if track['handler'] == 'vide':
            self.properties['Video Codec'] = stsd.get('codec', 'Unknown')
            self.properties['Video Width'] = tkhd.get('width') or stsd.get('width', 'Unknown')
            self.properties['Video Height'] = tkhd.get('height') or stsd.get('height', 'Unknown')
        elif track['handler'] == 'soun':
            self.properties['Audio Codec'] = stsd.get('codec', 'Unknown')
            self.properties['Audio Channels'] = stsd.get('channels', 'Unknown')
            self.properties['Audio Sample Rate'] = mdhd.get('timescale') or stsd.get('sample_rate', 'Unknown')
        self.tracks.append(track['handler'])

    def parse(self):
        while self.offset < len(self.data):
            self.parse_box()
        self.properties['Number of Tracks'] = len(self.tracks)

    def print_properties(self):
        for key, value in self.properties.items():
            print(f"{key}: {value}")

    def write(self, output_path):
        with open(output_path, 'wb') as f:
            f.write(self.data)  # Writes current data; modify self.data for changes

# Example usage:
# parser = MP4Parser('example.mp4')
# parser.print_properties()
# parser.write('output.mp4')
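
The write method above saves the bytes unchanged unless the data is modified first. As a minimal sketch of such a modification, the hypothetical helper below (`set_major_brand` is not part of the class) overwrites the major brand in place. It assumes the file starts with an 'ftyp' box that uses a 32-bit size field.

```python
import struct

def set_major_brand(data, brand):
    """Return a copy of data whose leading 'ftyp' box carries a new major brand."""
    if len(brand) != 4:
        raise ValueError('a brand is exactly four bytes')
    if len(data) < 16:
        raise ValueError('data is too short to hold an ftyp box')
    size, box_type = struct.unpack_from('>I4s', data, 0)
    if box_type != b'ftyp' or size == 1:
        raise ValueError('expected a leading ftyp box with a 32-bit size')
    patched = bytearray(data)
    patched[8:12] = brand  # major_brand sits right after the 8-byte box header
    return bytes(patched)

# Synthetic 20-byte ftyp box: major brand 'isom', minor version 512, one compatible brand.
original = struct.pack('>I4s4sI4s', 20, b'ftyp', b'isom', 512, b'mp41')
patched = set_major_brand(original, b'mp42')
print(patched[8:12])
```

Overwriting a field with a value of the same length keeps every box size and every chunk offset in the file valid. Edits that change the length of a box would also require updating the sizes of its parent boxes and, when 'mdat' follows 'moov', the sample offset tables.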

Java Class for .MP4 Parsing
This Java class does the same: opens, decodes, reads, prints properties, and can write the file.

import java.io.*;
import java.nio.*;
import java.nio.channels.FileChannel;
import java.util.*;

public class MP4Parser {
    private ByteBuffer buffer;
    private Map<String, Object> properties = new HashMap<>();
    private List<String> tracks = new ArrayList<>();

    public MP4Parser(String filePath) throws IOException {
        // Load the whole file into memory; this limits the parser to files under 2 GB.
        byte[] bytes = java.nio.file.Files.readAllBytes(java.nio.file.Paths.get(filePath));
        buffer = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN);
        parse();
    }

    // Values collected for the 'trak' box currently being parsed; null outside a track.
    private Map<String, Object> currentTrack = null;

    private long readUInt32() {
        return buffer.getInt() & 0xFFFFFFFFL; // mask so the value stays unsigned
    }

    private int readUInt16() {
        return buffer.getShort() & 0xFFFF;
    }

    private long readUInt64() {
        return buffer.getLong(); // read as signed; real sizes and times fit easily
    }

    private String readString(int len) {
        byte[] bytes = new byte[len];
        buffer.get(bytes);
        // Four-character codes are raw bytes, so decode them one byte per character.
        return new String(bytes, java.nio.charset.StandardCharsets.ISO_8859_1);
    }

    private void skip(int count) {
        buffer.position(buffer.position() + count);
    }

    private int readVersion() {
        int version = buffer.get() & 0xFF;
        skip(3); // flags
        return version;
    }

    private String readTime(int version) {
        // Times count seconds from 1904-01-01 UTC; 2082844800 shifts them to the Unix epoch.
        long seconds = (version == 1 ? readUInt64() : readUInt32()) - 2082844800L;
        try {
            return java.time.Instant.ofEpochSecond(seconds).toString();
        } catch (java.time.DateTimeException e) {
            return "Unknown";
        }
    }

    private void parseBox() {
        int start = buffer.position();
        if (buffer.remaining() < 8) { // trailing bytes too short for a box header
            buffer.position(buffer.limit());
            return;
        }
        long size = readUInt32();
        String type = readString(4);
        if (size == 1 && buffer.remaining() >= 8) {
            size = readUInt64(); // 64-bit "largesize" follows the type
        } else if (size == 0) {
            size = buffer.limit() - start; // box runs to the end of the file
        }
        if (size < buffer.position() - start || size > buffer.limit() - start) {
            buffer.position(buffer.limit()); // malformed or truncated box: stop parsing
            return;
        }
        int end = (int) (start + size);
        if ("ftyp".equals(type)) {
            properties.put("Major Brand", readString(4));
            properties.put("Minor Version", readUInt32());
            List<String> compat = new ArrayList<>();
            while (buffer.position() + 4 <= end) {
                compat.add(readString(4));
            }
            properties.put("Compatible Brands", compat);
        } else if ("moov".equals(type) || "mdia".equals(type) || "minf".equals(type) || "stbl".equals(type)) {
            parseContainer(end);
        } else if ("trak".equals(type)) {
            parseTrak(end);
        } else if ("mvhd".equals(type)) {
            parseMvhd();
        } else if (currentTrack != null) { // leaf boxes that describe a track
            switch (type) {
                case "tkhd": parseTkhd(); break;
                case "mdhd": parseMdhd(); break;
                case "hdlr": parseHdlr(); break;
                case "stsd": parseStsd(); break;
                default: break;
            }
        }
        buffer.position(end); // always resume at the next sibling box
    }

    private void parseContainer(int end) {
        while (buffer.position() + 8 <= end) {
            parseBox();
        }
    }

    private void parseMvhd() {
        int version = readVersion();
        properties.put("Creation Time", readTime(version));
        properties.put("Modification Time", readTime(version));
        long timescale = readUInt32();
        long duration = version == 1 ? readUInt64() : readUInt32();
        properties.put("Timescale", timescale);
        properties.put("Duration", timescale == 0 ? "Unknown"
                : String.format(Locale.ROOT, "%.2f seconds", (double) duration / timescale));
    }

    private void parseTkhd() {
        int version = readVersion();
        skip(version == 1 ? 16 : 8); // creation + modification times
        currentTrack.put("id", readUInt32());
        skip(4); // reserved
        currentTrack.put("duration", version == 1 ? readUInt64() : readUInt32());
        skip(8 + 2 + 2 + 2 + 2 + 36); // reserved, layer, alternate_group, volume, reserved, matrix
        currentTrack.put("width", readUInt32() / 65536.0); // 16.16 fixed point
        currentTrack.put("height", readUInt32() / 65536.0);
    }

    private void parseMdhd() {
        int version = readVersion();
        skip(version == 1 ? 16 : 8); // creation + modification times
        currentTrack.put("timescale", readUInt32());
    }

    private void parseHdlr() {
        skip(4); // version + flags
        skip(4); // pre_defined
        currentTrack.put("handler", readString(4)); // e.g. "vide" or "soun"
    }

    private void parseStsd() {
        skip(4); // version + flags
        if (readUInt32() == 0) { // entry_count
            return;
        }
        skip(4); // size of the first sample entry
        currentTrack.put("codec", readString(4)); // sample entry type, e.g. "avc1"
        skip(6 + 2); // reserved + data_reference_index
        Object handler = currentTrack.get("handler");
        if ("vide".equals(handler)) {
            skip(16); // pre_defined, reserved, pre_defined[3]
            currentTrack.put("coded_width", readUInt16());
            currentTrack.put("coded_height", readUInt16());
        } else if ("soun".equals(handler)) {
            skip(8); // reserved
            currentTrack.put("channels", readUInt16());
            skip(2 + 2 + 2); // samplesize, pre_defined, reserved
            currentTrack.put("sample_rate", readUInt32() >>> 16); // 16.16 fixed point
        }
    }

    // Returns a unless it is missing or zero, then b, then "Unknown".
    private static Object firstNonZero(Object a, Object b) {
        boolean missing = a == null || (a instanceof Number && ((Number) a).doubleValue() == 0);
        return !missing ? a : (b != null ? b : "Unknown");
    }

    private void parseTrak(int end) {
        // Child boxes fill this map through parseBox while the track is open.
        Map<String, Object> track = new HashMap<>();
        currentTrack = track;
        parseContainer(end);
        currentTrack = null;
        String handlerType = (String) track.getOrDefault("handler", "");
        if ("vide".equals(handlerType)) {
            properties.put("Video Codec", track.getOrDefault("codec", "Unknown"));
            properties.put("Video Width", firstNonZero(track.get("width"), track.get("coded_width")));
            properties.put("Video Height", firstNonZero(track.get("height"), track.get("coded_height")));
        } else if ("soun".equals(handlerType)) {
            properties.put("Audio Codec", track.getOrDefault("codec", "Unknown"));
            properties.put("Audio Channels", track.getOrDefault("channels", "Unknown"));
            properties.put("Audio Sample Rate", firstNonZero(track.get("timescale"), track.get("sample_rate")));
        }
        tracks.add(handlerType);
    }

    private void parse() {
        while (buffer.hasRemaining()) {
            parseBox();
        }
        properties.put("Number of Tracks", tracks.size());
    }

    public void printProperties() {
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }

    public void write(String outputPath) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(outputPath)) {
            buffer.position(0);
            fos.getChannel().write(buffer);
        }
    }

    // Example usage:
    // public static void main(String[] args) throws IOException {
    //     MP4Parser parser = new MP4Parser("example.mp4");
    //     parser.printProperties();
    //     parser.write("output.mp4");
    // }
}

JavaScript Class for .MP4 Parsing
This JS class takes a buffer (e.g., from FileReader), decodes, reads, prints properties to console, and can write to a new Blob for download.

class MP4Parser {
  constructor(buffer) {
    this.view = new DataView(buffer);
    this.offset = 0;
    this.properties = {};
    this.tracks = [];
    this.buffer = buffer;
    this.parse();
  }

  readUInt32() {
    const val = this.view.getUint32(this.offset);
    this.offset += 4;
    return val;
  }

  readUInt16() {
    const val = this.view.getUint16(this.offset);
    this.offset += 2;
    return val;
  }

  readString(len) {
    let str = '';
    for (let i = 0; i < len; i++) {
      str += String.fromCharCode(this.view.getUint8(this.offset++));
    }
    return str;
  }

  readUInt64() {
    const high = this.readUInt32();
    return high * 2 ** 32 + this.readUInt32(); // exact for values below 2^53
  }

  readVersion() {
    const version = this.view.getUint8(this.offset);
    this.offset += 4; // 1-byte version + 3-byte flags
    return version;
  }

  readTime(version = 0) {
    // Times count seconds from 1904-01-01 UTC; 2082844800 shifts them to the Unix epoch.
    const seconds = version === 1 ? this.readUInt64() : this.readUInt32();
    const date = new Date((seconds - 2082844800) * 1000);
    return Number.isNaN(date.getTime()) ? 'Unknown' : date.toISOString();
  }

  parseBox() {
    const start = this.offset;
    const total = this.view.byteLength;
    if (total - start < 8) { // trailing bytes too short for a box header
      this.offset = total;
      return;
    }
    let size = this.readUInt32();
    const type = this.readString(4);
    if (size === 1 && total - this.offset >= 8) {
      size = this.readUInt64(); // 64-bit "largesize" follows the type
    } else if (size === 0) {
      size = total - start; // box runs to the end of the file
    }
    const end = start + size;
    if (size < this.offset - start || end > total) {
      this.offset = total; // malformed or truncated box: stop parsing
      return;
    }
    if (type === 'ftyp') {
      this.properties['Major Brand'] = this.readString(4);
      this.properties['Minor Version'] = this.readUInt32();
      this.properties['Compatible Brands'] = [];
      while (this.offset + 4 <= end) {
        this.properties['Compatible Brands'].push(this.readString(4));
      }
    } else if (['moov', 'mdia', 'minf', 'stbl'].includes(type)) {
      this.parseContainer(end);
    } else if (type === 'trak') {
      this.parseTrak(end);
    } else if (type === 'mvhd') {
      this.parseMvhd();
    } else if (this.currentTrack) { // leaf boxes that describe a track
      if (type === 'tkhd') this.currentTrack.tkhd = this.parseTkhd();
      else if (type === 'mdhd') this.currentTrack.mdhd = this.parseMdhd();
      else if (type === 'hdlr') this.currentTrack.handler = this.parseHdlr();
      else if (type === 'stsd') this.currentTrack.stsd = this.parseStsd(this.currentTrack.handler);
    }
    this.offset = end; // always resume at the next sibling box
  }

  parseContainer(end) {
    while (this.offset + 8 <= end) {
      this.parseBox();
    }
  }

  parseMvhd() {
    const version = this.readVersion();
    this.properties['Creation Time'] = this.readTime(version);
    this.properties['Modification Time'] = this.readTime(version);
    const timescale = this.readUInt32();
    const duration = version === 1 ? this.readUInt64() : this.readUInt32();
    this.properties['Timescale'] = timescale;
    this.properties['Duration'] = timescale ? (duration / timescale).toFixed(2) + ' seconds' : 'Unknown';
  }

  parseTkhd() {
    const version = this.readVersion();
    this.offset += version === 1 ? 16 : 8; // creation + modification times
    const trackId = this.readUInt32();
    this.offset += 4; // reserved
    const duration = version === 1 ? this.readUInt64() : this.readUInt32();
    this.offset += 8 + 2 + 2 + 2 + 2 + 36; // reserved, layer, alternate_group, volume, reserved, matrix
    const width = this.readUInt32() / 65536; // 16.16 fixed point
    const height = this.readUInt32() / 65536;
    return { id: trackId, width, height, duration };
  }

  parseMdhd() {
    const version = this.readVersion();
    this.offset += version === 1 ? 16 : 8; // creation + modification times
    const timescale = this.readUInt32();
    return { timescale }; // duration and language are not needed
  }

  parseHdlr() {
    this.offset += 4; // version + flags
    this.offset += 4; // pre_defined
    return this.readString(4); // handler type, e.g. 'vide' or 'soun'
  }

  parseStsd(handlerType) {
    this.offset += 4; // version + flags
    if (this.readUInt32() === 0) return {}; // entry_count
    this.offset += 4; // size of the first sample entry
    const entry = { codec: this.readString(4) }; // sample entry type, e.g. 'avc1'
    this.offset += 6 + 2; // reserved + data_reference_index
    if (handlerType === 'vide') {
      this.offset += 16; // pre_defined, reserved, pre_defined[3]
      entry.width = this.readUInt16();
      entry.height = this.readUInt16();
    } else if (handlerType === 'soun') {
      this.offset += 8; // reserved
      entry.channels = this.readUInt16();
      this.offset += 2 + 2 + 2; // samplesize, pre_defined, reserved
      entry.sampleRate = this.readUInt32() >>> 16; // 16.16 fixed point
    }
    return entry;
  }

  parseTrak(end) {
    // Child boxes fill this object through parseBox while the track is open.
    const track = { handler: '', tkhd: {}, mdhd: {}, stsd: {} };
    this.currentTrack = track;
    this.parseContainer(end);
    this.currentTrack = null;
    const { handler, tkhd, mdhd, stsd } = track;
    if (handler === 'vide') {
      this.properties['Video Codec'] = stsd.codec || 'Unknown';
      this.properties['Video Width'] = tkhd.width || stsd.width || 'Unknown';
      this.properties['Video Height'] = tkhd.height || stsd.height || 'Unknown';
    } else if (handler === 'soun') {
      this.properties['Audio Codec'] = stsd.codec || 'Unknown';
      this.properties['Audio Channels'] = stsd.channels || 'Unknown';
      this.properties['Audio Sample Rate'] = mdhd.timescale || stsd.sampleRate || 'Unknown';
    }
    this.tracks.push(handler);
  }

  parse() {
    while (this.offset < this.view.byteLength) {
      this.parseBox();
    }
    this.properties['Number of Tracks'] = this.tracks.length;
  }

  printProperties() {
    for (const [key, value] of Object.entries(this.properties)) {
      console.log(`${key}: ${value}`);
    }
  }

  write() {
    return new Blob([this.buffer]); // Return blob for download; modify buffer for changes
  }
}

// Example usage:
// const reader = new FileReader();
// reader.onload = () => {
//   const parser = new MP4Parser(reader.result);
//   parser.printProperties();
//   const blob = parser.write();
//   // Download blob
// };
// reader.readAsArrayBuffer(file);

C "Class" for .MP4 Parsing
Since C is not object-oriented, this is a struct with functions for init, parse, print, and write. It opens, decodes, reads, prints properties, and can write the file.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

typedef struct {
    uint8_t *data;
    size_t size;
    size_t offset;
    char *properties[14][2]; // [i][0] is the key, [i][1] the value; each is a 256-byte string
    char *tracks[10]; // Max 10 tracks
    int track_count;
} MP4Parser;

#define VALUE_LEN 256 // capacity of each string allocated by init_properties

// Values collected while walking the boxes inside one 'trak'.
typedef struct {
    char handler[5];
    char codec[5];
    int channels;
    uint32_t sample_rate, timescale;
    double tk_width, tk_height; // presentation size from 'tkhd'
    double st_width, st_height; // coded size from the sample entry in 'stsd'
} TrackInfo;

// True when at least n bytes remain before end, so reads stay inside the box.
static int fits(const MP4Parser *p, size_t end, size_t n) {
    return end >= p->offset && end - p->offset >= n;
}

uint32_t read_uint32(MP4Parser *p) {
    const uint8_t *b = p->data + p->offset;
    p->offset += 4;
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) | ((uint32_t)b[2] << 8) | b[3];
}

uint16_t read_uint16(MP4Parser *p) {
    const uint8_t *b = p->data + p->offset;
    p->offset += 2;
    return (uint16_t)((b[0] << 8) | b[1]);
}

uint64_t read_uint64(MP4Parser *p) {
    uint64_t high = read_uint32(p);
    return (high << 32) | read_uint32(p);
}

void read_string(MP4Parser *p, int len, char *dest) {
    memcpy(dest, p->data + p->offset, len);
    dest[len] = '\0';
    p->offset += len;
}

int read_version(MP4Parser *p) {
    int version = p->data[p->offset];
    p->offset += 4; // 1-byte version + 3-byte flags
    return version;
}

void read_time(MP4Parser *p, int version, char *dest) {
    // Times count seconds from 1904-01-01 UTC; 2082844800 shifts them to the Unix epoch.
    int64_t seconds = (int64_t)(version == 1 ? read_uint64(p) : read_uint32(p)) - 2082844800LL;
    time_t t = (time_t)seconds;
    struct tm *tm = gmtime(&t);
    if (tm == NULL || strftime(dest, VALUE_LEN, "%Y-%m-%dT%H:%M:%SZ", tm) == 0) {
        strcpy(dest, "Unknown");
    }
}

static void parse_box_in(MP4Parser *p, TrackInfo *t);
void parse_box(MP4Parser *p);

void parse_container(MP4Parser *p, size_t end, TrackInfo *t) {
    while (fits(p, end, 8)) {
        parse_box_in(p, t);
    }
}

void parse_mvhd(MP4Parser *p, size_t end) {
    if (!fits(p, end, 4)) return;
    int version = read_version(p);
    if (!fits(p, end, version == 1 ? 28 : 16)) return;
    read_time(p, version, p->properties[3][1]); // Creation Time
    read_time(p, version, p->properties[4][1]); // Modification Time
    uint32_t timescale = read_uint32(p);
    uint64_t duration = version == 1 ? read_uint64(p) : read_uint32(p);
    snprintf(p->properties[5][1], VALUE_LEN, "%u", (unsigned)timescale); // Timescale
    if (timescale != 0) {
        snprintf(p->properties[6][1], VALUE_LEN, "%.2f seconds", (double)duration / timescale); // Duration
    }
}

void parse_tkhd(MP4Parser *p, size_t end, TrackInfo *t) {
    if (!fits(p, end, 4)) return;
    int version = read_version(p);
    // creation, modification, track_id, reserved, duration ...
    size_t skip = (version == 1 ? 8 + 8 + 4 + 4 + 8 : 4 + 4 + 4 + 4 + 4)
                + 8 + 2 + 2 + 2 + 2 + 36; // ... reserved, layer, alternate_group, volume, reserved, matrix
    if (!fits(p, end, skip + 8)) return;
    p->offset += skip;
    t->tk_width = read_uint32(p) / 65536.0; // 16.16 fixed point
    t->tk_height = read_uint32(p) / 65536.0;
}

void parse_mdhd(MP4Parser *p, size_t end, TrackInfo *t) {
    if (!fits(p, end, 4)) return;
    int version = read_version(p);
    if (!fits(p, end, version == 1 ? 20 : 12)) return;
    p->offset += version == 1 ? 16 : 8; // creation + modification times
    t->timescale = read_uint32(p);
}

void parse_hdlr(MP4Parser *p, size_t end, TrackInfo *t) {
    if (!fits(p, end, 12)) return;
    p->offset += 4 + 4; // version + flags, pre_defined
    read_string(p, 4, t->handler); // e.g. "vide" or "soun"
}

void parse_stsd(MP4Parser *p, size_t end, TrackInfo *t) {
    if (!fits(p, end, 8 + 16 + 20)) return; // stsd header, sample entry header, fields read below
    p->offset += 4; // version + flags
    if (read_uint32(p) == 0) return; // entry_count
    p->offset += 4; // size of the first sample entry
    read_string(p, 4, t->codec); // sample entry type, e.g. "avc1"
    p->offset += 6 + 2; // reserved + data_reference_index
    if (strcmp(t->handler, "vide") == 0) {
        p->offset += 16; // pre_defined, reserved, pre_defined[3]
        t->st_width = read_uint16(p);
        t->st_height = read_uint16(p);
    } else if (strcmp(t->handler, "soun") == 0) {
        p->offset += 8; // reserved
        t->channels = read_uint16(p);
        p->offset += 2 + 2 + 2; // samplesize, pre_defined, reserved
        t->sample_rate = read_uint32(p) >> 16; // 16.16 fixed point
    }
}

void parse_trak(MP4Parser *p, size_t end) {
    TrackInfo t;
    memset(&t, 0, sizeof t);
    parse_container(p, end, &t); // child boxes fill t through parse_box_in
    if (strcmp(t.handler, "vide") == 0) {
        snprintf(p->properties[8][1], VALUE_LEN, "%s", t.codec[0] ? t.codec : "Unknown"); // Video Codec
        snprintf(p->properties[9][1], VALUE_LEN, "%.0f", t.tk_width ? t.tk_width : t.st_width); // Video Width
        snprintf(p->properties[10][1], VALUE_LEN, "%.0f", t.tk_height ? t.tk_height : t.st_height); // Video Height
    } else if (strcmp(t.handler, "soun") == 0) {
        snprintf(p->properties[11][1], VALUE_LEN, "%s", t.codec[0] ? t.codec : "Unknown"); // Audio Codec
        snprintf(p->properties[12][1], VALUE_LEN, "%d", t.channels); // Audio Channels
        snprintf(p->properties[13][1], VALUE_LEN, "%u",
                 (unsigned)(t.timescale ? t.timescale : t.sample_rate)); // Audio Sample Rate
    }
    // Keep a heap copy of the handler type; free_mp4parser releases it.
    if (p->track_count < (int)(sizeof p->tracks / sizeof p->tracks[0])) {
        char *copy = malloc(sizeof t.handler);
        if (copy != NULL) {
            memcpy(copy, t.handler, sizeof t.handler);
            p->tracks[p->track_count++] = copy;
        }
    }
}

static void parse_box_in(MP4Parser *p, TrackInfo *t) {
    size_t start = p->offset;
    if (start > p->size || p->size - start < 8) { // too short for a box header
        p->offset = p->size;
        return;
    }
    uint64_t size = read_uint32(p);
    char type[5];
    read_string(p, 4, type);
    if (size == 1 && p->size - p->offset >= 8) {
        size = read_uint64(p); // 64-bit "largesize" follows the type
    } else if (size == 0) {
        size = p->size - start; // box runs to the end of the file
    }
    if (size < p->offset - start || size > p->size - start) {
        p->offset = p->size; // malformed or truncated box: stop parsing
        return;
    }
    size_t end = start + (size_t)size;
    if (strcmp(type, "ftyp") == 0 && fits(p, end, 8)) {
        read_string(p, 4, p->properties[0][1]); // Major Brand
        snprintf(p->properties[1][1], VALUE_LEN, "%u", (unsigned)read_uint32(p)); // Minor Version
        char *compat = p->properties[2][1]; // Compatible Brands, space separated
        size_t used = 0;
        compat[0] = '\0';
        while (fits(p, end, 4) && used + 6 <= VALUE_LEN) {
            read_string(p, 4, compat + used);
            used += 4;
            compat[used++] = ' ';
            compat[used] = '\0';
        }
    } else if (strcmp(type, "moov") == 0 || strcmp(type, "mdia") == 0 ||
               strcmp(type, "minf") == 0 || strcmp(type, "stbl") == 0) {
        parse_container(p, end, t);
    } else if (strcmp(type, "trak") == 0) {
        parse_trak(p, end);
    } else if (strcmp(type, "mvhd") == 0) {
        parse_mvhd(p, end);
    } else if (t != NULL) { // leaf boxes that describe a track
        if (strcmp(type, "tkhd") == 0) parse_tkhd(p, end, t);
        else if (strcmp(type, "mdhd") == 0) parse_mdhd(p, end, t);
        else if (strcmp(type, "hdlr") == 0) parse_hdlr(p, end, t);
        else if (strcmp(type, "stsd") == 0) parse_stsd(p, end, t);
    }
    p->offset = end; // always resume at the next sibling box
}

void parse_box(MP4Parser *p) {
    parse_box_in(p, NULL); // top-level boxes are not inside any track
}

void init_properties(MP4Parser *p) {
    const char *keys[] = {"Major Brand", "Minor Version", "Compatible Brands", "Creation Time", "Modification Time", "Timescale", "Duration", "Number of Tracks", "Video Codec", "Video Width", "Video Height", "Audio Codec", "Audio Channels", "Audio Sample Rate"};
    for (int i = 0; i < 14; i++) {
        p->properties[i][0] = malloc(256);
        strcpy(p->properties[i][0], keys[i]);
        p->properties[i][1] = malloc(256);
        strcpy(p->properties[i][1], "Unknown");
    }
}

void parse(MP4Parser *p) {
    init_properties(p);
    while (p->offset < p->size) {
        parse_box(p);
    }
    char tc_str[256];
    sprintf(tc_str, "%d", p->track_count);
    strcpy(p->properties[7][1], tc_str); // Number of Tracks
}

void print_properties(MP4Parser *p) {
    for (int i = 0; i < 14; i++) {
        printf("%s: %s\n", p->properties[i][0], p->properties[i][1]);
    }
}

void write(MP4Parser *p, const char *output_path) {
    FILE *f = fopen(output_path, "wb");
    if (f) {
        fwrite(p->data, 1, p->size, f);
        fclose(f);
    }
}

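void init_mp4parser(MP4Parser *p, const char *file_path) {
    p->data = NULL;
    p->size = 0;
    p->offset = 0;
    p->track_count = 0;
    FILE *f = fopen(file_path, "rb");
    if (f) {
        fseek(f, 0, SEEK_END);
        long len = ftell(f);
        fseek(f, 0, SEEK_SET);
        if (len > 0 && (p->data = malloc((size_t)len)) != NULL) {
            p->size = fread(p->data, 1, (size_t)len, f);
        }
        fclose(f);
    } else {
        perror(file_path);
    }
    parse(p); // with no data loaded, every property stays "Unknown"
}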

void free_mp4parser(MP4Parser *p) {
    free(p->data);
    for (int i = 0; i < 14; i++) {
        free(p->properties[i][0]);
        free(p->properties[i][1]);
    }
    for (int i = 0; i < p->track_count; i++) {
        free(p->tracks[i]);
    }
}

// Example usage:
// int main() {
//     MP4Parser parser;
//     init_mp4parser(&parser, "example.mp4");
//     print_properties(&parser);
//     write(&parser, "output.mp4");
//     free_mp4parser(&parser);
//     return 0;
// }