Task 480: .OGV File Format
File Format Specifications for .OGV
The .OGV extension denotes video files in the Ogg container format, typically carrying Theora video (and often Vorbis audio). Ogg is an open, royalty-free multimedia container developed by the Xiph.Org Foundation. The core specification is RFC 3533, which describes the Ogg encapsulation format version 0. The format supports streaming and multiplexing of multiple bitstreams (e.g., audio and video). Every page header has the same fixed field layout; the only variable-length part is the segment table, whose length is given by a count field in the header. All multi-byte header fields are stored in little-endian byte order.
An Ogg file (and thus .OGV) consists of a sequence of "pages," each representing a self-contained unit for framing, error detection, and synchronization. There is no global file header beyond the first page's beginning-of-stream (BOS) flag. Each page has a fixed 27-byte header followed by a segment table and then the actual data segments (payload). The format allows for chaining (appending bitstreams) and multiplexing (interleaving streams like video and audio).
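The fixed header layout described above can be checked with Python's struct module. This is a minimal sketch, assuming the field order given in RFC 3533; the names OGG_HEADER and parse_header and the sample bytes are illustrative and not part of any library.

```python
import struct

# Field order of the fixed page header, little-endian:
# capture pattern, version, header type, granule position,
# serial number, page sequence number, checksum, segment count.
OGG_HEADER = struct.Struct("<4sBBqIIIB")


def parse_header(buf, offset=0):
    """Return the fixed header fields of the page starting at `offset`."""
    (capture, version, flags, granule,
     serial, seq, crc, seg_count) = OGG_HEADER.unpack_from(buf, offset)
    if capture != b"OggS":
        raise ValueError(f"no capture pattern at offset {offset}")
    return {
        "version": version,
        "flags": flags,
        "granule": granule,   # read as signed so that -1 stays -1
        "serial": serial,
        "seq": seq,
        "crc": crc,
        "seg_count": seg_count,
    }


# Hand-built header for illustration only: a BOS page (flags 0x02) with
# one segment. The checksum is left at zero, so this is not a valid page.
sample = OGG_HEADER.pack(b"OggS", 0, 0x02, 0, 0x1234, 0, 0, 1)
header = parse_header(sample)
print(OGG_HEADER.size, header["serial"], header["flags"])  # 27 4660 2
```

The struct size confirms the 27-byte figure: 4 + 1 + 1 + 8 + 4 + 4 + 4 + 1.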
List of Properties Intrinsic to the File Format
These are the core structural properties defining the Ogg container format used in .OGV files. They are derived from the page header and overall file composition, as these are the intrinsic binary elements that identify and organize the file on disk or in streams. Properties are per-page, as the file is a stream of pages without a centralized index.
- Capture Pattern (Magic Number): A 4-byte ASCII string "OggS" (hex: 4F 67 67 53) at the start of every page header. Used for identification and resynchronization if data is corrupted.
- Stream Structure Version: A 1-byte unsigned integer (mandated to be 0 for the current format). Allows for future expansions.
- Header Type Flag: A 1-byte bitfield with flags:
- Bit 0 (0x01): Set if the page continues a packet from the previous page; unset if it starts a fresh packet.
- Bit 1 (0x02): Set if this is the first page of a logical bitstream (Beginning of Stream, BOS); unset otherwise.
- Bit 2 (0x04): Set if this is the last page of a logical bitstream (End of Stream, EOS); unset otherwise.
- Bits 3-7: Reserved (must be 0).
- Absolute Granule Position: An 8-byte (64-bit) integer representing a time marker or sample count. Its exact interpretation depends on the codec (e.g., for Theora in .OGV, it encodes frame timing). A value of -1 (all 64 bits set, 0xFFFFFFFFFFFFFFFF) indicates that no packet finishes on this page.
- Stream Serial Number: A 4-byte (32-bit) unsigned integer unique to each logical bitstream in a multiplexed file (e.g., separate for video and audio streams).
- Page Sequence Number: A 4-byte (32-bit) unsigned integer incrementing by 1 for each page in a logical bitstream (starting from 0).
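- Page Checksum: A 4-byte (32-bit) CRC of the entire page (header and data, with this field set to 0 during calculation). Used for error detection. The generator polynomial is 0x04C11DB7, applied most-significant bit first with an initial value of 0 and no final XOR, so the result differs from the common zlib/PNG CRC-32.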
- Number of Page Segments: A 1-byte unsigned integer (0-255) indicating how many lacing segments follow in the segment table.
- Segment Table: A variable-length array of 1-byte unsigned integers (one per segment, up to 255 bytes total). Each value is the length (0-255 bytes) of a data segment. The sum of lengths gives the payload size (up to 65,025 bytes per page). A value of 255 means the packet continues in the next segment; a value below 255 (including 0) ends the packet. A table that ends on 255 means the packet continues on the next page.
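The lacing rule in the Segment Table entry can be sketched in a few lines of Python. The function names lacing_values and packet_lengths are hypothetical helpers written for this illustration, not part of any Ogg library.

```python
def lacing_values(packet_len):
    """Lacing values that encode one packet of `packet_len` bytes."""
    full, rest = divmod(packet_len, 255)
    # A run of 255s followed by one terminating value below 255
    # (which is 0 when the length is an exact multiple of 255).
    return [255] * full + [rest]


def packet_lengths(seg_table):
    """Split a segment table into (finished packet lengths, unfinished bytes)."""
    lengths, current = [], 0
    for value in seg_table:
        current += value
        if value < 255:          # a value below 255 ends the packet
            lengths.append(current)
            current = 0
    # If the table ends on 255, the last packet continues on the next page;
    # `current` then holds the bytes of that packet stored on this page.
    return lengths, current


print(lacing_values(600))                       # [255, 255, 90]
print(lacing_values(510))                       # [255, 255, 0]
print(packet_lengths([255, 255, 90, 42, 255]))  # ([600, 42], 255)
print(27 + 255 + 255 * 255)                     # 65307, the largest possible page
```

Because a page holds at most 255 lacing values, a packet whose lacing run does not fit has to be split across pages, which is what the "continued packet" header flag signals.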
Additional format-level properties:
- File Extension: .ogv (for video-specific Ogg files).
- MIME Type: video/ogg.
- Byte Order: Little-endian for all multi-byte fields.
- Page Size Limit: Header (27 bytes) + segment table (up to 255 bytes) + payload (up to 255 * 255 = 65,025 bytes), for a maximum of 65,307 bytes per page.
- Multiplexing/Chaining: Files can interleave multiple streams (identified by serial numbers) or append bitstreams seamlessly.
- Metadata Handling: Not built into Ogg; delegated to codec-specific headers (e.g., VorbisComment for audio metadata in .OGV files).
These properties ensure the format's streamability and robustness, with no dependencies on file system metadata like timestamps or permissions.
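To illustrate the multiplexing property, the sketch below groups already-parsed page records by serial number and reports jumps in the page sequence number. The record layout (dicts with "serial" and "page_seq" keys) and the sample values are assumptions made for this example.

```python
from collections import defaultdict


def group_by_serial(pages):
    """Group page records by stream serial number, keeping file order."""
    streams = defaultdict(list)
    for page in pages:
        streams[page["serial"]].append(page)
    return dict(streams)


def sequence_gaps(stream_pages):
    """Return (expected, found) pairs where the page sequence number jumps."""
    gaps = []
    expected = stream_pages[0]["page_seq"] if stream_pages else 0
    for page in stream_pages:
        if page["page_seq"] != expected:
            gaps.append((expected, page["page_seq"]))
        expected = page["page_seq"] + 1
    return gaps


# Made-up records: two interleaved streams, with page 2 of stream 0xA missing.
pages = [
    {"serial": 0xA, "page_seq": 0},
    {"serial": 0xB, "page_seq": 0},
    {"serial": 0xA, "page_seq": 1},
    {"serial": 0xB, "page_seq": 1},
    {"serial": 0xA, "page_seq": 3},
]
streams = group_by_serial(pages)
for serial, stream_pages in streams.items():
    print(hex(serial), len(stream_pages), sequence_gaps(stream_pages))
# 0xa 3 [(2, 3)]
# 0xb 2 []
```

A gap in the sequence numbers is a hint that a page was lost or dropped; a real demultiplexer would also honor the BOS and EOS flags when streams are chained.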
Two Direct Download Links for .OGV Files
- https://filesamples.com/samples/video/ogv/sample_1280x720_surfing_with_audio.ogv (1280x720 resolution sample video with audio).
- https://filesamples.com/samples/video/ogv/sample_960x400_ocean_with_audio.ogv (960x400 resolution sample video with audio).
Ghost Blog Embedded HTML JavaScript for Drag-and-Drop .OGV Property Dump
This is a self-contained HTML snippet with embedded JavaScript that can be embedded in a Ghost blog post (or any HTML page). It creates a drag-and-drop area where users can drop a .OGV file. The script reads the file as an ArrayBuffer, parses all Ogg pages, extracts the properties listed above, and dumps them to the screen in a readable format. It handles multiple pages and validates the magic number.
Python Class for .OGV Handling
This Python class opens a .OGV file, decodes and reads all pages, prints the properties to console, and supports writing a modified copy (e.g., updating granule positions). It uses struct for binary parsing.
import struct

OGG_CRC_POLY = 0x04C11DB7  # generator polynomial used by the Ogg page checksum


def _build_crc_table():
    """Lookup table for Ogg's CRC: MSB-first, initial value 0, no final XOR."""
    table = []
    for i in range(256):
        r = i << 24
        for _ in range(8):
            r = ((r << 1) ^ OGG_CRC_POLY) if r & 0x80000000 else (r << 1)
            r &= 0xFFFFFFFF
        table.append(r)
    return table


_CRC_TABLE = _build_crc_table()


def ogg_crc(data):
    """CRC of a page whose checksum field has been zeroed (not zlib.crc32)."""
    crc = 0
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _CRC_TABLE[(crc >> 24) ^ byte]
    return crc


def _page_extent(data, offset):
    """Return (seg_count, payload_size) for the page at offset, or None."""
    if offset + 27 > len(data) or data[offset:offset + 4] != b'OggS':
        return None
    seg_count = data[offset + 26]
    if offset + 27 + seg_count > len(data):
        return None
    return seg_count, sum(data[offset + 27:offset + 27 + seg_count])


class OGVHandler:
    def __init__(self, filepath):
        self.filepath = filepath
        self.pages = []
        self._parse()

    def _parse(self):
        with open(self.filepath, 'rb') as f:
            data = f.read()
        offset = 0
        while offset < len(data):
            extent = _page_extent(data, offset)
            if extent is None:
                print(f"Invalid or truncated page at offset {offset}")
                break
            seg_count, payload_size = extent
            # Fixed fields after the 4-byte capture pattern. The granule
            # position is read as signed so that -1 prints as -1.
            version, header_type, granule, serial, page_seq, checksum = \
                struct.unpack_from('<BBqIII', data, offset + 4)
            self.pages.append({
                'capture': 'OggS',
                'version': version,
                'header_type': header_type,
                'granule': granule,
                'serial': serial,
                'page_seq': page_seq,
                'checksum': checksum,
                'seg_count': seg_count,
                'seg_table': list(data[offset + 27:offset + 27 + seg_count]),
            })
            offset += 27 + seg_count + payload_size

    def print_properties(self):
        for i, page in enumerate(self.pages, 1):
            ht = page['header_type']
            print(f"Page {i}:")
            print(f"  Capture Pattern: {page['capture']}")
            print(f"  Version: {page['version']}")
            print(f"  Header Type: 0x{ht:02x} (Continued: {bool(ht & 1)}, BOS: {bool(ht & 2)}, EOS: {bool(ht & 4)})")
            print(f"  Granule Position: {page['granule']}")
            print(f"  Serial Number: {page['serial']}")
            print(f"  Page Sequence: {page['page_seq']}")
            print(f"  Checksum: 0x{page['checksum']:08x}")
            print(f"  Segments Count: {page['seg_count']}")
            print(f"  Segment Table: {page['seg_table']}")

    def write_modified(self, output_path, modify_granule=None):
        """Copy the file, replacing the granule position of the first pages."""
        modify_granule = modify_granule or []
        with open(self.filepath, 'rb') as f_in:
            data = bytearray(f_in.read())
        offset = 0
        page_num = 0
        while offset < len(data):
            extent = _page_extent(data, offset)
            if extent is None:
                break
            seg_count, payload_size = extent
            end = offset + 27 + seg_count + payload_size
            if page_num < len(modify_granule):
                struct.pack_into('<q', data, offset + 6, modify_granule[page_num])
                # Recompute the checksum over the whole page with the
                # checksum field zeroed, then store it little-endian.
                data[offset + 22:offset + 26] = b'\x00\x00\x00\x00'
                struct.pack_into('<I', data, offset + 22, ogg_crc(data[offset:end]))
            offset = end
            page_num += 1
        with open(output_path, 'wb') as f_out:
            f_out.write(data)


# Example usage:
# handler = OGVHandler('sample.ogv')
# handler.print_properties()
# handler.write_modified('modified.ogv', modify_granule=[12345, 67890])  # Modify first two pages' granules
Java Class for .OGV Handling
This Java class opens a .OGV file, decodes and reads all pages, prints properties to console, and supports writing a modified copy. It uses ByteBuffer for parsing.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OGVHandler {
    private static final byte[] MAGIC = "OggS".getBytes(StandardCharsets.US_ASCII);
    private static final int[] CRC_TABLE = buildCrcTable();

    private final String filepath;
    private final List<Map<String, Object>> pages = new ArrayList<>();

    public OGVHandler(String filepath) {
        this.filepath = filepath;
        parse();
    }

    // Ogg CRC: polynomial 0x04C11DB7, MSB-first, initial value 0, no final XOR
    // (java.util.zip.CRC32 implements a different variant).
    private static int[] buildCrcTable() {
        int[] table = new int[256];
        for (int i = 0; i < 256; i++) {
            int r = i << 24;
            for (int j = 0; j < 8; j++) {
                r = (r & 0x80000000) != 0 ? (r << 1) ^ 0x04C11DB7 : r << 1;
            }
            table[i] = r;
        }
        return table;
    }

    private static int oggCrc(byte[] data, int start, int end) {
        int crc = 0;
        for (int i = start; i < end; i++) {
            crc = (crc << 8) ^ CRC_TABLE[((crc >>> 24) ^ data[i]) & 0xFF];
        }
        return crc;
    }

    // True if a complete 27-byte header starting with "OggS" begins at offset.
    private static boolean hasHeader(byte[] data, int offset) {
        if (offset + 27 > data.length) return false;
        for (int i = 0; i < 4; i++) {
            if (data[offset + i] != MAGIC[i]) return false;
        }
        return true;
    }

    private void parse() {
        try {
            byte[] data = Files.readAllBytes(Paths.get(filepath));
            // All multi-byte header fields are little-endian.
            ByteBuffer buffer = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
            int offset = 0;
            while (offset < data.length) {
                if (!hasHeader(data, offset)) {
                    System.out.println("Invalid or truncated page at offset " + offset);
                    break;
                }
                buffer.position(offset + 4);
                byte version = buffer.get();
                byte headerType = buffer.get();
                long granule = buffer.getLong();
                int serial = buffer.getInt();
                int pageSeq = buffer.getInt();
                int checksum = buffer.getInt();
                int segCount = buffer.get() & 0xFF; // mask: Java bytes are signed
                if (offset + 27 + segCount > data.length) {
                    System.out.println("Truncated segment table at offset " + offset);
                    break;
                }
                int[] segTable = new int[segCount];
                int payloadSize = 0;
                for (int i = 0; i < segCount; i++) {
                    segTable[i] = buffer.get() & 0xFF;
                    payloadSize += segTable[i];
                }
                Map<String, Object> page = new HashMap<>();
                page.put("capture", "OggS");
                page.put("version", version);
                page.put("header_type", headerType);
                page.put("granule", granule);
                page.put("serial", serial);
                page.put("page_seq", pageSeq);
                page.put("checksum", checksum);
                page.put("seg_count", segCount);
                page.put("seg_table", segTable);
                pages.add(page);
                offset += 27 + segCount + payloadSize;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void printProperties() {
        for (int i = 0; i < pages.size(); i++) {
            Map<String, Object> page = pages.get(i);
            System.out.println("Page " + (i + 1) + ":");
            System.out.println("  Capture Pattern: " + page.get("capture"));
            System.out.println("  Version: " + page.get("version"));
            byte ht = (byte) page.get("header_type");
            System.out.println("  Header Type: 0x" + Integer.toHexString(ht & 0xFF) + " (Continued: " + ((ht & 1) != 0) + ", BOS: " + ((ht & 2) != 0) + ", EOS: " + ((ht & 4) != 0) + ")");
            System.out.println("  Granule Position: " + page.get("granule"));
            System.out.println("  Serial Number: " + Integer.toUnsignedString((int) page.get("serial")));
            System.out.println("  Page Sequence: " + Integer.toUnsignedString((int) page.get("page_seq")));
            System.out.println("  Checksum: 0x" + Integer.toHexString((int) page.get("checksum")));
            System.out.println("  Segments Count: " + page.get("seg_count"));
            System.out.println("  Segment Table: " + Arrays.toString((int[]) page.get("seg_table")));
        }
    }

    public void writeModified(String outputPath, long[] modifyGranule) {
        try {
            byte[] data = Files.readAllBytes(Paths.get(filepath));
            ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
            int offset = 0;
            int pageNum = 0;
            while (hasHeader(data, offset)) {
                int segCount = data[offset + 26] & 0xFF;
                if (offset + 27 + segCount > data.length) break;
                int payloadSize = 0;
                for (int i = 0; i < segCount; i++) {
                    payloadSize += data[offset + 27 + i] & 0xFF;
                }
                int end = Math.min(offset + 27 + segCount + payloadSize, data.length);
                if (modifyGranule != null && pageNum < modifyGranule.length) {
                    buf.putLong(offset + 6, modifyGranule[pageNum]);
                    // Recompute the checksum over the page with the field zeroed.
                    buf.putInt(offset + 22, 0);
                    buf.putInt(offset + 22, oggCrc(data, offset, end));
                }
                offset = end;
                pageNum++;
            }
            Files.write(Paths.get(outputPath), data);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Example usage:
    // public static void main(String[] args) {
    //     OGVHandler handler = new OGVHandler("sample.ogv");
    //     handler.printProperties();
    //     handler.writeModified("modified.ogv", new long[]{12345, 67890});
    // }
}
JavaScript Class for .OGV Handling
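This JavaScript class (ES6) can open a .OGV file via the File API, decode and read pages, print properties to the console, and write a modified Blob (e.g., for download). It relies on File.arrayBuffer(), the DataView BigInt accessors, and Blob, all of which are available in current browsers; it does not use Node's fs module.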
// Ogg CRC table: polynomial 0x04C11DB7, MSB-first, initial value 0, no final XOR.
const OGG_CRC_TABLE = (() => {
  const table = new Uint32Array(256);
  for (let i = 0; i < 256; i++) {
    let r = i << 24;
    for (let j = 0; j < 8; j++) {
      r = (r & 0x80000000) ? ((r << 1) ^ 0x04C11DB7) : (r << 1);
    }
    table[i] = r >>> 0;
  }
  return table;
})();

function oggCrc(bytes, start, end) {
  let crc = 0;
  for (let i = start; i < end; i++) {
    crc = ((crc << 8) ^ OGG_CRC_TABLE[((crc >>> 24) ^ bytes[i]) & 0xFF]) >>> 0;
  }
  return crc;
}

class OGVHandler {
  constructor(file) {
    this.file = file;
    this.pages = [];
  }

  // Returns { segCount, payloadSize } for the page at offset, or null if no
  // complete header and segment table start there.
  static pageExtent(view, offset) {
    if (offset + 27 > view.byteLength) return null;
    if (view.getUint32(offset) !== 0x4F676753) return null; // 'OggS' read big-endian
    const segCount = view.getUint8(offset + 26);
    if (offset + 27 + segCount > view.byteLength) return null;
    let payloadSize = 0;
    for (let i = 0; i < segCount; i++) {
      payloadSize += view.getUint8(offset + 27 + i);
    }
    return { segCount, payloadSize };
  }

  async parse() {
    const buffer = await this.file.arrayBuffer();
    const view = new DataView(buffer);
    this.pages = [];
    let offset = 0;
    while (offset < buffer.byteLength) {
      const extent = OGVHandler.pageExtent(view, offset);
      if (extent === null) {
        console.log(`Invalid or truncated page at offset ${offset}`);
        break;
      }
      const { segCount, payloadSize } = extent;
      const segTable = [];
      for (let i = 0; i < segCount; i++) {
        segTable.push(view.getUint8(offset + 27 + i));
      }
      this.pages.push({
        capture: 'OggS',
        version: view.getUint8(offset + 4),
        headerType: view.getUint8(offset + 5),
        // Multi-byte fields are little-endian; the granule is read as signed
        // so that the "no position" value prints as -1.
        granule: view.getBigInt64(offset + 6, true),
        serial: view.getUint32(offset + 14, true),
        pageSeq: view.getUint32(offset + 18, true),
        checksum: view.getUint32(offset + 22, true),
        segCount,
        segTable
      });
      offset += 27 + segCount + payloadSize;
    }
  }

  printProperties() {
    this.pages.forEach((page, i) => {
      console.log(`Page ${i + 1}:`);
      console.log(`  Capture Pattern: ${page.capture}`);
      console.log(`  Version: ${page.version}`);
      console.log(`  Header Type: 0x${page.headerType.toString(16)} (Continued: ${!!(page.headerType & 1)}, BOS: ${!!(page.headerType & 2)}, EOS: ${!!(page.headerType & 4)})`);
      console.log(`  Granule Position: ${page.granule}`);
      console.log(`  Serial Number: ${page.serial}`);
      console.log(`  Page Sequence: ${page.pageSeq}`);
      console.log(`  Checksum: 0x${page.checksum.toString(16)}`);
      console.log(`  Segments Count: ${page.segCount}`);
      console.log(`  Segment Table: [${page.segTable.join(', ')}]`);
    });
  }

  async writeModified(modifyGranule = []) {
    const buffer = await this.file.arrayBuffer();
    const data = new Uint8Array(buffer);
    const view = new DataView(buffer);
    let offset = 0;
    let pageNum = 0;
    while (offset < data.length) {
      const extent = OGVHandler.pageExtent(view, offset);
      if (extent === null) break;
      const end = Math.min(offset + 27 + extent.segCount + extent.payloadSize, data.length);
      if (pageNum < modifyGranule.length) {
        view.setBigInt64(offset + 6, BigInt(modifyGranule[pageNum]), true);
        // Recompute the checksum over the page with the field zeroed.
        view.setUint32(offset + 22, 0, true);
        view.setUint32(offset + 22, oggCrc(data, offset, end), true);
      }
      offset = end;
      pageNum++;
    }
    return new Blob([data], { type: 'video/ogg' });
  }
}

// Example usage (browser):
// const input = document.querySelector('input[type="file"]');
// input.addEventListener('change', async (e) => {
//   const file = e.target.files[0];
//   const handler = new OGVHandler(file);
//   await handler.parse();
//   handler.printProperties();
//   const modifiedBlob = await handler.writeModified([12345n, 67890n]);
//   // Download modifiedBlob
// });
C++ Class for .OGV Handling
This C++ class opens a .OGV file, decodes and reads pages, prints properties to console, and supports writing a modified copy. It uses std::ifstream and structs for parsing.
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

struct OggPage {
    char capture[5] = "OggS";
    uint8_t version;
    uint8_t header_type;
    int64_t granule;  // signed so that the "no position" value prints as -1
    uint32_t serial;
    uint32_t page_seq;
    uint32_t checksum;
    uint8_t seg_count;
    std::vector<uint8_t> seg_table;
};

class OGVHandler {
private:
    std::string filepath;
    std::vector<OggPage> pages;

    static std::vector<uint8_t> readFile(const std::string& path) {
        std::ifstream in(path, std::ios::binary);
        return std::vector<uint8_t>(std::istreambuf_iterator<char>(in),
                                    std::istreambuf_iterator<char>());
    }

    // Little-endian access byte by byte, independent of host byte order.
    static uint64_t readLE(const std::vector<uint8_t>& d, size_t pos, int n) {
        uint64_t v = 0;
        for (int i = n - 1; i >= 0; --i) v = (v << 8) | d[pos + i];
        return v;
    }

    static void writeLE(std::vector<uint8_t>& d, size_t pos, int n, uint64_t v) {
        for (int i = 0; i < n; ++i) d[pos + i] = static_cast<uint8_t>(v >> (8 * i));
    }

    // Ogg CRC: polynomial 0x04C11DB7, MSB-first, initial value 0, no final XOR.
    static uint32_t oggCrc(const std::vector<uint8_t>& d, size_t start, size_t end) {
        uint32_t crc = 0;
        for (size_t i = start; i < end; ++i) {
            crc ^= static_cast<uint32_t>(d[i]) << 24;
            for (int b = 0; b < 8; ++b)
                crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u : (crc << 1);
        }
        return crc;
    }

    // Returns false if no complete header and segment table start at offset.
    static bool pageExtent(const std::vector<uint8_t>& d, size_t offset,
                           size_t& seg_count, size_t& payload_size) {
        if (offset + 27 > d.size() || std::memcmp(&d[offset], "OggS", 4) != 0) return false;
        seg_count = d[offset + 26];
        if (offset + 27 + seg_count > d.size()) return false;
        payload_size = 0;
        for (size_t i = 0; i < seg_count; ++i) payload_size += d[offset + 27 + i];
        return true;
    }

public:
    OGVHandler(const std::string& fp) : filepath(fp) {
        parse();
    }

    void parse() {
        const std::vector<uint8_t> data = readFile(filepath);
        size_t offset = 0;
        while (offset < data.size()) {
            size_t seg_count = 0, payload_size = 0;
            if (!pageExtent(data, offset, seg_count, payload_size)) {
                std::cout << "Invalid or truncated page at offset " << offset << std::endl;
                break;
            }
            OggPage page;
            page.version = data[offset + 4];
            page.header_type = data[offset + 5];
            page.granule = static_cast<int64_t>(readLE(data, offset + 6, 8));
            page.serial = static_cast<uint32_t>(readLE(data, offset + 14, 4));
            page.page_seq = static_cast<uint32_t>(readLE(data, offset + 18, 4));
            page.checksum = static_cast<uint32_t>(readLE(data, offset + 22, 4));
            page.seg_count = static_cast<uint8_t>(seg_count);
            page.seg_table.assign(data.begin() + offset + 27,
                                  data.begin() + offset + 27 + seg_count);
            pages.push_back(page);
            offset += 27 + seg_count + payload_size;
        }
    }

    void printProperties() const {
        for (size_t i = 0; i < pages.size(); ++i) {
            const auto& page = pages[i];
            std::cout << "Page " << (i + 1) << ":" << std::endl;
            std::cout << "  Capture Pattern: " << page.capture << std::endl;
            std::cout << "  Version: " << static_cast<int>(page.version) << std::endl;
            std::cout << "  Header Type: 0x" << std::hex << static_cast<int>(page.header_type) << " (Continued: " << ((page.header_type & 1) ? "Yes" : "No")
                      << ", BOS: " << ((page.header_type & 2) ? "Yes" : "No") << ", EOS: " << ((page.header_type & 4) ? "Yes" : "No") << ")" << std::dec << std::endl;
            std::cout << "  Granule Position: " << page.granule << std::endl;
            std::cout << "  Serial Number: " << page.serial << std::endl;
            std::cout << "  Page Sequence: " << page.page_seq << std::endl;
            std::cout << "  Checksum: 0x" << std::hex << page.checksum << std::dec << std::endl;
            std::cout << "  Segments Count: " << static_cast<int>(page.seg_count) << std::endl;
            std::cout << "  Segment Table: [";
            for (size_t j = 0; j < page.seg_table.size(); ++j) {
                if (j > 0) std::cout << ", ";
                std::cout << static_cast<int>(page.seg_table[j]);
            }
            std::cout << "]" << std::endl;
        }
    }

    void writeModified(const std::string& output_path, const std::vector<int64_t>& modify_granule) const {
        std::vector<uint8_t> data = readFile(filepath);
        if (data.empty()) return;
        size_t offset = 0, page_num = 0, seg_count = 0, payload_size = 0;
        while (pageExtent(data, offset, seg_count, payload_size)) {
            const size_t end = std::min(offset + 27 + seg_count + payload_size, data.size());
            if (page_num < modify_granule.size()) {
                writeLE(data, offset + 6, 8, static_cast<uint64_t>(modify_granule[page_num]));
                // Recompute the checksum over the page with the field zeroed.
                writeLE(data, offset + 22, 4, 0);
                writeLE(data, offset + 22, 4, oggCrc(data, offset, end));
            }
            offset = end;
            ++page_num;
        }
        std::ofstream fout(output_path, std::ios::binary);
        fout.write(reinterpret_cast<const char*>(data.data()),
                   static_cast<std::streamsize>(data.size()));
    }
};

// Example usage:
// int main() {
//     OGVHandler handler("sample.ogv");
//     handler.printProperties();
//     handler.writeModified("modified.ogv", {12345, 67890});
//     return 0;
// }