Task 370: .M4A File Format
File Format Specifications for .M4A
The .M4A file format is an audio-only variant of the MPEG-4 Part 14 (MP4) container, which is standardized as ISO/IEC 14496-14. It typically encapsulates audio encoded with AAC (Advanced Audio Coding) or ALAC (Apple Lossless Audio Codec), but can carry other codecs. The structure is a series of nested "boxes" (also called atoms). Each box starts with a 4-byte big-endian size field, which counts the whole box including its header, followed by a 4-byte type identifier. If the size field is 1, an 8-byte large size follows the type; if it is 0, the box extends to the end of the file. Boxes can be containers (holding child boxes) or leaf boxes (holding data). Key top-level boxes include:
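The header rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a full parser; the function name `read_box_header` and the hand-built sample box are assumptions made for the example.

```python
import struct

def read_box_header(buf: bytes, offset: int = 0):
    """Return (box_type, box_size, header_size) for the box at `offset`."""
    size, box_type = struct.unpack_from(">I4s", buf, offset)
    header_size = 8
    if size == 1:
        # A size of 1 means a 64-bit size follows the type field
        (size,) = struct.unpack_from(">Q", buf, offset + 8)
        header_size = 16
    elif size == 0:
        # A size of 0 means the box runs to the end of the data
        size = len(buf) - offset
    # latin-1 maps every byte to one character, so no type can fail to decode
    return box_type.decode("latin-1"), size, header_size

# A hand-built 'free' box: 8-byte header plus 4 bytes of padding
sample = struct.pack(">I4s", 12, b"free") + b"\x00" * 4
print(read_box_header(sample))  # ('free', 12, 8)
```

The returned size always covers the header, so the next sibling box starts at `offset + box_size`.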
- ftyp: File type and compatibility (e.g., brands like 'M4A ' or 'mp42').
- moov: Movie header, containing metadata and track info (sub-boxes like mvhd for header, trak for track details, udta for user data).
- mdat: Media data (raw audio frames).
- Optional: free or skip for padding.
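To see how the top-level boxes sit next to each other, the following sketch walks a small synthetic buffer and lists each box with its offset and size. The helpers `make_box` and `list_top_level` are hypothetical names for this example, and the demo bytes are made up rather than taken from a real file.

```python
import struct

def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Hypothetical helper: wrap `payload` in an 8-byte box header."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def list_top_level(buf: bytes):
    """Return (type, offset, size) for each top-level box in `buf`."""
    found, offset = [], 0
    while offset + 8 <= len(buf):
        size, box_type = struct.unpack_from(">I4s", buf, offset)
        if size == 1:
            (size,) = struct.unpack_from(">Q", buf, offset + 8)
        elif size == 0:
            size = len(buf) - offset
        if size < 8:
            break  # malformed header; stop instead of looping forever
        found.append((box_type.decode("latin-1"), offset, size))
        offset += size
    return found

# ftyp payload: major brand, minor version, two compatible brands
demo = (
    make_box(b"ftyp", b"M4A " + b"\x00" * 4 + b"M4A mp42")
    + make_box(b"moov")
    + make_box(b"mdat", b"\xff" * 10)
)
print(list_top_level(demo))
# [('ftyp', 0, 24), ('moov', 24, 8), ('mdat', 32, 18)]
```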
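Metadata is primarily stored under moov/udta/meta. The meta box is a full box: its 8-byte header is followed by 4 bytes of version and flags (normally all zero) before its children, which are hdlr (handler type 'mdir') followed by ilst (item list). Each metadata item in ilst is a box whose type is a 4-character code (e.g., '©nam', where '©' is the single byte 0xA9), containing a 'data' sub-box. The 'data' payload starts with a 1-byte version (usually 0), a 3-byte type indicator (e.g., 1 for UTF-8 text, 21 for a big-endian integer, 13 for JPEG, 14 for PNG), and a 4-byte locale (usually 0), followed by the value. Data types vary by tag (e.g., UTF-8 strings, integers, images). The format inherits from QuickTime, with Apple-specific extensions for iTunes compatibility.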
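As a rough illustration of the item layout just described, the sketch below decodes a single ilst item by reading the type indicator in its 'data' sub-box. It assumes the item holds exactly one 'data' box; `decode_item` and the two sample items are made up for this example.

```python
import struct

def decode_item(item: bytes):
    """Decode one ilst item into (key, value) using the 'data' type indicator."""
    _size, key = struct.unpack_from(">I4s", item, 0)
    data_size, data_type = struct.unpack_from(">I4s", item, 8)
    if data_type != b"data":
        raise ValueError("expected a 'data' sub-box")
    # Low 24 bits of the version/flags word carry the type indicator
    type_code = struct.unpack_from(">I", item, 16)[0] & 0x00FFFFFF
    # Value starts after item header (8), data header (8), type (4), locale (4)
    value = item[24:8 + data_size]
    if type_code == 1:
        decoded = value.decode("utf-8")
    elif type_code == 21:
        decoded = int.from_bytes(value, "big", signed=True)
    else:
        decoded = value  # leave other types as raw bytes
    return key.decode("latin-1"), decoded

# Item size 26 = 24 bytes of headers + 2 value bytes; data box size 18
title = struct.pack(">I4sI4sII", 26, b"\xa9nam", 18, b"data", 1, 0) + b"Hi"
tempo = struct.pack(">I4sI4sII", 26, b"tmpo", 18, b"data", 21, 0) + (120).to_bytes(2, "big")
print(decode_item(title))  # ('©nam', 'Hi')
print(decode_item(tempo))  # ('tmpo', 120)
```

Decoding the key as latin-1 keeps the 0xA9 byte as '©', which an ASCII decode would drop or reject.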
List of all properties intrinsic to the .M4A file format (metadata fields in the 'ilst' box, with their codes, types, and purposes):
- ©nam (Text): Track title.
- ©alb (Text): Album.
- ©ART (Text): Artist.
- aART (Text): Album artist.
- ©wrt (Text): Composer.
- ©day (Text): Year.
- ©cmt (Text): Comment.
- desc (Text): Description (often for podcasts).
- purd (Text): Purchase date.
- ©grp (Text): Grouping.
- ©gen (Text): Genre.
- ©lyr (Text): Lyrics.
- purl (Text): Podcast URL.
- egid (Text): Podcast episode GUID.
- catg (Text): Podcast category.
- keyw (Text): Podcast keywords.
- ©too (Text): Encoded by.
- cprt (Text): Copyright.
- soal (Text): Album sort order.
- soaa (Text): Album artist sort order.
- soar (Text): Artist sort order.
- sonm (Text): Title sort order.
- soco (Text): Composer sort order.
- sosn (Text): Show sort order.
- tvsh (Text): Show name.
- ©wrk (Text): Work.
- ©mvn (Text): Movement.
- cpil (Boolean): Part of a compilation.
- pgap (Boolean): Part of a gapless album.
- pcst (Boolean): Podcast (iTunes reads on import).
- trkn (Tuple of ints): Track number and total tracks.
- disk (Tuple of ints): Disc number and total discs.
- tmpo (Integer): Tempo/BPM.
- ©mvc (Integer): Movement count.
- ©mvi (Integer): Movement index.
- shwm (Integer): Work/movement show mode.
- stik (Integer): Media kind.
- hdvd (Integer): HD video flag.
- rtng (Integer): Content rating.
- tves (Integer): TV episode.
- tvsn (Integer): TV season.
- plID (Integer): iTunes playlist ID.
- cnID (Integer): iTunes catalog ID.
- geID (Integer): iTunes genre ID.
- atID (Integer): iTunes artist ID.
- sfID (Integer): iTunes store front ID.
- cmID (Integer): iTunes composer ID.
- akID (Integer): iTunes account kind ID.
- covr (List of images): Cover artwork (JPEG/PNG).
- gnre (Integer): ID3v1 genre index plus one (deprecated, use ©gen).
- ---- (Text, custom): Freeform metadata (key format like 'com.apple.iTunes:CustomKey').
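The trkn and disk entries in the list above are pairs rather than single integers. A minimal sketch of decoding their value bytes follows, assuming the common layout of two reserved bytes followed by two 16-bit big-endian numbers; `decode_pair` is a hypothetical name and the sample bytes are hand-written.

```python
import struct

def decode_pair(value: bytes):
    """Decode a trkn/disk value into (number, total)."""
    if len(value) < 6:
        raise ValueError("pair value needs at least 6 bytes")
    _reserved, number, total = struct.unpack_from(">HHH", value, 0)
    return number, total

# trkn values are usually 8 bytes (with 2 trailing reserved bytes), disk values 6
print(decode_pair(bytes.fromhex("00000003000c0000")))  # (3, 12)
print(decode_pair(bytes.fromhex("000000010002")))      # (1, 2)
```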
Two direct download links for .M4A files:
- https://files.testfile.org/AUDIO/C/M4A/sample1.m4a
- https://files.testfile.org/AUDIO/C/M4A/sample2.m4a
Ghost blog embedded HTML/JavaScript for drag-and-drop .M4A file to dump properties:
- Python class for .M4A (reads, decodes, prints properties; write method to update and save):
import json
import struct

TEXT_KEYS = {'©nam', '©alb', '©ART', 'aART', '©wrt', '©day', '©cmt', 'desc', 'purd', '©grp', '©gen', '©lyr', 'purl', 'egid', 'catg', 'keyw', '©too', 'cprt', 'soal', 'soaa', 'soar', 'sonm', 'soco', 'sosn', 'tvsh', '©wrk', '©mvn'}
BOOL_KEYS = {'cpil', 'pgap', 'pcst'}
INT_KEYS = {'tmpo', '©mvc', '©mvi', 'shwm', 'stik', 'hdvd', 'rtng', 'tves', 'tvsn', 'plID', 'cnID', 'geID', 'atID', 'sfID', 'cmID', 'akID', 'gnre'}
PAIR_KEYS = {'trkn', 'disk'}


class M4AHandler:
    def __init__(self, filepath):
        self.filepath = filepath
        self.properties = {}
        self.data = None
        self._load()

    def _load(self):
        with open(self.filepath, 'rb') as f:
            self.data = f.read()
        self._parse_metadata()

    def _read_uint32(self, offset):
        return struct.unpack('>I', self.data[offset:offset+4])[0]

    def _find_box(self, types, start=0, end=None):
        """Descend through the box types in `types`.

        Returns (payload_start, box_end) of the last box, or (None, None).
        """
        if end is None:
            end = len(self.data)
        for target in types:
            offset, found = start, False
            while offset + 8 <= end:
                size = self._read_uint32(offset)
                # latin-1 maps every byte to a character, so 0xA9 becomes '©'
                box_type = self.data[offset+4:offset+8].decode('latin-1')
                header = 8
                if size == 1:  # 64-bit size follows the type
                    size = struct.unpack('>Q', self.data[offset+8:offset+16])[0]
                    header = 16
                elif size == 0:  # box extends to the end of the enclosing range
                    size = end - offset
                if size < header:
                    break  # malformed box; stop rather than loop forever
                if box_type == target:
                    # 'meta' is a full box: 4 bytes of version/flags precede its children
                    start = offset + header + (4 if target == 'meta' else 0)
                    end = min(offset + size, end)
                    found = True
                    break
                offset += size
            if not found:
                return None, None
        return start, end

    def _parse_metadata(self):
        offset, ilst_end = self._find_box(['moov', 'udta', 'meta', 'ilst'])
        if offset is None:
            return
        while offset + 8 <= ilst_end:
            size = self._read_uint32(offset)
            if size < 8:
                break
            key = self.data[offset+4:offset+8].decode('latin-1')
            # Item layout: item header (8), 'data' header (8),
            # version + type indicator (4), locale (4), then the value
            raw = self.data[offset+24:offset+size]
            if key in TEXT_KEYS:
                value = raw.decode('utf-8', errors='replace')
            elif key in BOOL_KEYS:
                value = any(raw)
            elif key in INT_KEYS:
                value = int.from_bytes(raw, 'big')  # width varies by tag
            elif key in PAIR_KEYS:
                # 2 reserved bytes, then number and total as 16-bit integers
                value = struct.unpack('>HH', raw[2:6]) if len(raw) >= 6 else None
            elif key == 'covr':
                value = 'Cover art present (binary data)'
            elif key == '----':
                value = 'Custom freeform'
            else:
                value = 'Unknown'
            self.properties[key] = value
            offset += size

    def print_properties(self):
        print(json.dumps(self.properties, indent=2, ensure_ascii=False))

    def write_property(self, key, value):
        self.properties[key] = value
        # To fully implement write, we'd need to rebuild the file with updated boxes.
        # For demo, print updated and save original (stub; real impl would serialize boxes).
        print("Updated properties (write stub):")
        self.print_properties()
        with open(self.filepath + '.new', 'wb') as f:
            f.write(self.data)  # Placeholder; actual write would modify data
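The write method above is only a stub. As a hedged sketch of the first step a real writer would need, the code below serializes text tags into a fresh ilst box, building each box bottom-up so the size fields come out right. The helper names `box`, `text_item`, and `serialize_ilst` are hypothetical, and only UTF-8 text items (type indicator 1) are covered.

```python
import struct

def box(box_type: bytes, payload: bytes) -> bytes:
    """Prefix `payload` with an 8-byte header (32-bit size + type)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def text_item(key: str, text: str) -> bytes:
    """Build one ilst item holding UTF-8 text."""
    # version 0 + type indicator 1 (UTF-8), then a zero locale, then the value
    data = box(b"data", struct.pack(">II", 1, 0) + text.encode("utf-8"))
    # latin-1 turns '©' back into the single byte 0xA9
    return box(key.encode("latin-1"), data)

def serialize_ilst(tags: dict) -> bytes:
    """Wrap all text items in an 'ilst' box."""
    return box(b"ilst", b"".join(text_item(k, v) for k, v in tags.items()))

ilst = serialize_ilst({"©nam": "Example Title", "©ART": "Example Artist"})
print(len(ilst), ilst[4:8])  # 83 b'ilst'
```

Splicing the new ilst into a file is the harder part. If its length changes, the size fields of the enclosing meta, udta, and moov boxes must be updated too, and when moov precedes mdat the absolute chunk offsets stored in the stco or co64 tables shift as well.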
- Java class for .M4A (similar functionality):
import java.io.*;
import java.nio.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.util.*;

public class M4AHandler {
    private static final List<String> TEXT_KEYS = Arrays.asList("©nam", "©alb", "©ART", "aART", "©wrt", "©day", "©cmt", "desc", "purd", "©grp", "©gen", "©lyr", "purl", "egid", "catg", "keyw", "©too", "cprt", "soal", "soaa", "soar", "sonm", "soco", "sosn", "tvsh", "©wrk", "©mvn");
    private static final List<String> BOOL_KEYS = Arrays.asList("cpil", "pgap", "pcst");
    private static final List<String> INT_KEYS = Arrays.asList("tmpo", "©mvc", "©mvi", "shwm", "stik", "hdvd", "rtng", "tves", "tvsn", "plID", "cnID", "geID", "atID", "sfID", "cmID", "akID", "gnre");

    private final String filepath;
    private final Map<String, Object> properties = new LinkedHashMap<>();
    private ByteBuffer buffer;

    public M4AHandler(String filepath) {
        this.filepath = filepath;
        load();
    }

    private void load() {
        try {
            // readAllBytes reads the whole file; a single FileChannel.read call may not
            buffer = ByteBuffer.wrap(Files.readAllBytes(Paths.get(filepath))).order(ByteOrder.BIG_ENDIAN);
            parseMetadata();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // ISO-8859-1 maps every byte to a character, so 0xA9 becomes '©'
    private String readType(int offset) {
        byte[] b = new byte[4];
        for (int i = 0; i < 4; i++) b[i] = buffer.get(offset + i);
        return new String(b, StandardCharsets.ISO_8859_1);
    }

    /** Descends through the given box types; returns {payloadStart, boxEnd} of the last one, or null. */
    private int[] findBox(String[] types, int start, int end) {
        for (String target : types) {
            int offset = start;
            boolean found = false;
            while (offset + 8 <= end) {
                long size = buffer.getInt(offset) & 0xFFFFFFFFL;
                String boxType = readType(offset + 4);
                int header = 8;
                if (size == 1 && offset + 16 <= end) { // 64-bit size follows the type
                    size = buffer.getLong(offset + 8);
                    header = 16;
                } else if (size == 0) { // box extends to the end of the enclosing range
                    size = end - offset;
                }
                if (size < header || offset + size > end) break; // malformed or truncated
                if (boxType.equals(target)) {
                    // 'meta' is a full box: 4 bytes of version/flags precede its children
                    start = offset + header + (target.equals("meta") ? 4 : 0);
                    end = (int) (offset + size);
                    found = true;
                    break;
                }
                offset += (int) size;
            }
            if (!found) return null;
        }
        return new int[]{start, end};
    }

    private void parseMetadata() {
        int[] ilst = findBox(new String[]{"moov", "udta", "meta", "ilst"}, 0, buffer.limit());
        if (ilst == null) return;
        int offset = ilst[0];
        int ilstEnd = ilst[1];
        while (offset + 8 <= ilstEnd) {
            int size = buffer.getInt(offset);
            if (size < 8 || offset + size > ilstEnd) break;
            String key = readType(offset + 4);
            if (size < 24 && !key.equals("covr") && !key.equals("----")) {
                offset += size; // no room for a 'data' box with a value
                continue;
            }
            // Item layout: item header (8), 'data' header (8), version + type (4), locale (4), value
            int dataOffset = offset + 24;
            int dataSize = Math.max(0, size - 24);
            Object value = null;
            if (TEXT_KEYS.contains(key)) {
                byte[] strBytes = new byte[dataSize];
                buffer.position(dataOffset);
                buffer.get(strBytes);
                value = new String(strBytes, StandardCharsets.UTF_8);
            } else if (BOOL_KEYS.contains(key)) {
                value = dataSize > 0 && buffer.get(dataOffset) != 0;
            } else if (INT_KEYS.contains(key)) {
                long n = 0; // width varies by tag, so accumulate every value byte
                for (int i = 0; i < dataSize && i < 8; i++) n = (n << 8) | (buffer.get(dataOffset + i) & 0xFF);
                value = n;
            } else if (key.equals("trkn") || key.equals("disk")) {
                if (dataSize >= 6) {
                    int num = buffer.getShort(dataOffset + 2) & 0xFFFF;
                    int total = buffer.getShort(dataOffset + 4) & 0xFFFF;
                    value = Arrays.asList(num, total); // prints as [num, total]
                }
            } else if (key.equals("covr")) {
                value = "Cover art present (binary data)";
            } else if (key.equals("----")) {
                value = "Custom freeform";
            }
            if (value != null) {
                properties.put(key, value);
            }
            offset += size;
        }
    }

    public void printProperties() {
        System.out.println(properties);
    }

    public void writeProperty(String key, Object value) {
        properties.put(key, value);
        // Stub: To write, rebuild buffer with updated boxes; here save original
        try (FileOutputStream fos = new FileOutputStream(filepath + ".new")) {
            fos.write(buffer.array());
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("Updated (stub): " + properties);
    }
}
- JavaScript class for .M4A (Node.js, uses fs; read/print, write stub):
const fs = require('fs');

const TEXT_KEYS = ['©nam', '©alb', '©ART', 'aART', '©wrt', '©day', '©cmt', 'desc', 'purd', '©grp', '©gen', '©lyr', 'purl', 'egid', 'catg', 'keyw', '©too', 'cprt', 'soal', 'soaa', 'soar', 'sonm', 'soco', 'sosn', 'tvsh', '©wrk', '©mvn'];
const BOOL_KEYS = ['cpil', 'pgap', 'pcst'];
const INT_KEYS = ['tmpo', '©mvc', '©mvi', 'shwm', 'stik', 'hdvd', 'rtng', 'tves', 'tvsn', 'plID', 'cnID', 'geID', 'atID', 'sfID', 'cmID', 'akID', 'gnre'];

class M4AHandler {
  constructor(filepath) {
    this.filepath = filepath;
    this.properties = {};
    this.data = fs.readFileSync(filepath);
    // A Buffer can be a view into a larger ArrayBuffer, so pass its offset and length
    this.view = new DataView(this.data.buffer, this.data.byteOffset, this.data.byteLength);
    this.parseMetadata();
  }

  // latin1 maps every byte to a character, so 0xA9 becomes '©'
  readType(offset) {
    return this.data.toString('latin1', offset, offset + 4);
  }

  // Descends through the given box types; returns [payloadStart, boxEnd] of the last one, or null.
  findBox(types, start = 0, end = this.data.length) {
    for (const target of types) {
      let offset = start;
      let found = false;
      while (offset + 8 <= end) {
        let size = this.view.getUint32(offset);
        const type = this.readType(offset + 4);
        let header = 8;
        if (size === 1 && offset + 16 <= end) { // 64-bit size follows the type
          size = Number(this.view.getBigUint64(offset + 8));
          header = 16;
        } else if (size === 0) { // box extends to the end of the enclosing range
          size = end - offset;
        }
        if (size < header || offset + size > end) break; // malformed or truncated
        if (type === target) {
          // 'meta' is a full box: 4 bytes of version/flags precede its children
          start = offset + header + (target === 'meta' ? 4 : 0);
          end = offset + size;
          found = true;
          break;
        }
        offset += size;
      }
      if (!found) return null;
    }
    return [start, end];
  }

  parseMetadata() {
    const ilst = this.findBox(['moov', 'udta', 'meta', 'ilst']);
    if (!ilst) return;
    let [offset, ilstEnd] = ilst;
    while (offset + 8 <= ilstEnd) {
      const size = this.view.getUint32(offset);
      if (size < 8 || offset + size > ilstEnd) break;
      const key = this.readType(offset + 4);
      // Item layout: item header (8), 'data' header (8), version + type (4), locale (4), value
      const valueStart = offset + 24;
      const dataLen = Math.max(0, size - 24);
      let value;
      if (key === 'covr') {
        value = 'Cover art present (binary data)';
      } else if (key === '----') {
        value = 'Custom freeform';
      } else if (size < 24) {
        value = undefined; // no room for a 'data' box with a value
      } else if (TEXT_KEYS.includes(key)) {
        value = this.data.toString('utf8', valueStart, valueStart + dataLen);
      } else if (BOOL_KEYS.includes(key)) {
        value = dataLen > 0 && this.view.getUint8(valueStart) !== 0;
      } else if (INT_KEYS.includes(key)) {
        // Width varies by tag, so accumulate every value byte
        let n = 0;
        for (let i = 0; i < dataLen; i++) n = n * 256 + this.data[valueStart + i];
        value = n;
      } else if (key === 'trkn' || key === 'disk') {
        if (dataLen >= 6) {
          value = [this.view.getUint16(valueStart + 2), this.view.getUint16(valueStart + 4)];
        }
      }
      if (value !== undefined) {
        this.properties[key] = value;
      }
      offset += size;
    }
  }

  printProperties() {
    console.log(JSON.stringify(this.properties, null, 2));
  }

  writeProperty(key, value) {
    this.properties[key] = value;
    // Stub: full write would rebuild file
    fs.writeFileSync(this.filepath + '.new', this.data);
    console.log('Updated (stub):', JSON.stringify(this.properties, null, 2));
  }
}
- C++ class for .M4A (using std::ifstream; read/print, write stub):
#include <cstdint>
#include <fstream>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <vector>

class M4AHandler {
private:
    std::string filepath;
    std::map<std::string, std::string> properties; // Values are stored as strings for simplicity
    std::vector<uint8_t> data;

    // Reads a big-endian 32-bit value byte by byte, so host endianness does not matter
    uint32_t readUInt32(size_t offset) const {
        return (static_cast<uint32_t>(data[offset]) << 24) | (static_cast<uint32_t>(data[offset + 1]) << 16) |
               (static_cast<uint32_t>(data[offset + 2]) << 8) | static_cast<uint32_t>(data[offset + 3]);
    }

    std::string readString(size_t offset, size_t len) const {
        return std::string(data.begin() + offset, data.begin() + offset + len);
    }

    // Descends through the given box types; on success sets start/end to the last box's payload range
    bool findBox(const std::vector<std::string>& types, size_t& start, size_t& end) const {
        for (const auto& target : types) {
            size_t offset = start;
            bool found = false;
            while (offset + 8 <= end) {
                uint64_t size = readUInt32(offset);
                std::string type = readString(offset + 4, 4);
                size_t header = 8;
                if (size == 1 && offset + 16 <= end) { // 64-bit size follows the type
                    size = (static_cast<uint64_t>(readUInt32(offset + 8)) << 32) | readUInt32(offset + 12);
                    header = 16;
                } else if (size == 0) { // box extends to the end of the enclosing range
                    size = end - offset;
                }
                if (size < header || size > end - offset) break; // malformed or truncated
                if (type == target) {
                    // 'meta' is a full box: 4 bytes of version/flags precede its children
                    start = offset + header + (target == "meta" ? 4 : 0);
                    end = offset + static_cast<size_t>(size);
                    found = true;
                    break;
                }
                offset += static_cast<size_t>(size);
            }
            if (!found) return false;
        }
        return true;
    }

    void parseMetadata() {
        size_t offset = 0, ilstEnd = data.size();
        if (!findBox({"moov", "udta", "meta", "ilst"}, offset, ilstEnd)) return;
        while (offset + 8 <= ilstEnd) {
            size_t itemStart = offset;
            uint32_t size = readUInt32(itemStart);
            if (size < 8 || size > ilstEnd - itemStart) break;
            offset = itemStart + size;
            std::string key = readString(itemStart + 4, 4);
            // Re-encode the raw 0xA9 byte as UTF-8 so '©' keys print correctly
            if (static_cast<unsigned char>(key[0]) == 0xA9) key.replace(0, 1, "\xC2\xA9");
            if (key == "covr") { properties[key] = "Cover art present (binary data)"; continue; }
            if (key == "----") { properties[key] = "Custom freeform"; continue; }
            if (size < 24) continue; // no room for a 'data' box with a value
            // Item layout: item header (8), 'data' header (8), version + type (4), locale (4), value
            uint32_t typeCode = readUInt32(itemStart + 16) & 0x00FFFFFF;
            size_t valueStart = itemStart + 24;
            size_t dataLen = size - 24;
            std::string value;
            if (typeCode == 1) { // UTF-8 text
                value = readString(valueStart, dataLen);
            } else if ((key == "trkn" || key == "disk") && dataLen >= 6) {
                unsigned num = (data[valueStart + 2] << 8) | data[valueStart + 3];
                unsigned total = (data[valueStart + 4] << 8) | data[valueStart + 5];
                value = std::to_string(num) + "/" + std::to_string(total);
            } else { // Simplified: treat everything else as a big-endian integer
                uint64_t n = 0;
                for (size_t i = 0; i < dataLen && i < 8; ++i) n = (n << 8) | data[valueStart + i];
                value = std::to_string(n);
            }
            properties[key] = value;
        }
    }

public:
    M4AHandler(const std::string& fp) : filepath(fp) {
        std::ifstream file(filepath, std::ios::binary);
        data = std::vector<uint8_t>((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());
        parseMetadata();
    }

    void printProperties() {
        std::cout << "{" << std::endl;
        size_t remaining = properties.size();
        for (const auto& p : properties) {
            // Omit the comma after the last entry
            std::cout << "  \"" << p.first << "\": \"" << p.second << "\"" << (--remaining ? "," : "") << std::endl;
        }
        std::cout << "}" << std::endl;
    }

    void writeProperty(const std::string& key, const std::string& value) {
        properties[key] = value;
        // Stub: write original
        std::ofstream out(filepath + ".new", std::ios::binary);
        out.write(reinterpret_cast<const char*>(data.data()), data.size());
        std::cout << "Updated (stub):" << std::endl;
        printProperties();
    }
};