Task 855: .YAML File Format
1. List of Properties Intrinsic to the .YAML File Format
Based on the official YAML 1.2.2 specification, the .YAML file format (commonly using the extensions .yaml or .yml) is a text-based, human-readable data serialization format with no binary header or magic number. Its intrinsic properties are therefore its character encoding and the syntax elements that structure the data, all of which can be detected or inferred from the content of the plain text file. YAML does not enforce a rigid schema; it relies on syntax rules for portability. The key properties are:
- Presence of Byte Order Mark (BOM): An optional U+FEFF character at the start of the stream; when present, it identifies the Unicode encoding and its byte order.
- Encoding: The character encoding of the file (UTF-8, UTF-16, or UTF-32); a processor infers it from the first bytes of the stream and falls back to UTF-8 when no BOM or other distinguishing byte pattern is present.
- Presence of Document Start Marker: Use of "---" to denote the beginning of a YAML document; optional for single-document files.
- Presence of Document End Marker: Use of "..." to denote the end of a YAML document; optional but useful in multi-document streams.
- Presence of Directives: Lines starting with "%" (e.g., %YAML or %TAG) that give instructions to the processor; they precede the "---" marker of the document they apply to.
- Presence of Comments: Use of "#" to indicate comments, which are ignored during parsing.
- Presence of Block Sequence Entries: Use of "-" followed by space to define list items in block style.
- Presence of Mappings: Use of ":" followed by space to separate keys and values in mappings.
- Presence of Tags: Use of "!" to specify node types (e.g., !!int for integers, or !name for an application-specific local tag).
- Presence of Anchors: Use of "&" to define reusable node identifiers.
- Presence of Aliases: Use of "*" to reference anchored nodes.
- Presence of Literal Block Scalars: Use of "|" for block scalars that preserve line breaks.
- Presence of Folded Block Scalars: Use of ">" for block scalars that fold line breaks into spaces.
- Presence of Single-Quoted Scalars: Use of single quotes (') for scalar values with minimal escaping.
- Presence of Double-Quoted Scalars: Use of double quotes (") for scalar values supporting escape sequences.
- Use of Indentation for Structure: Reliance on consistent space-based indentation (no tabs) to denote nesting in block styles.
- Support for Multiple Documents: Ability to contain zero or more documents in a single file, separated by markers.
- JSON Compatibility: YAML 1.2 is designed as a superset of JSON, so a YAML processor can parse JSON documents; the reverse does not hold, because JSON processors cannot read YAML-specific syntax.
These properties are intrinsic as they define the file's parseable structure and storage conventions without external metadata.
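Several of these indicators only carry meaning at specific positions: document markers and directives must start a line, and "#" starts a comment only at the beginning of a line or after whitespace. The following is a minimal sketch of position-aware checks using regular expressions. The names PATTERNS and detect_properties are hypothetical, and the checks remain heuristics, since indicators inside quoted or block scalars can still match; reliable detection requires a full YAML parser.

```python
import re

# Line-anchored heuristics (illustrative only, not a YAML parser).
PATTERNS = {
    # "---" at the start of a line, followed by whitespace or end of line
    "document_start": re.compile(r"^---(?=\s|$)", re.MULTILINE),
    # "..." at the start of a line, followed by whitespace or end of line
    "document_end": re.compile(r"^\.\.\.(?=\s|$)", re.MULTILINE),
    # Only the two directives named in the list above are checked
    "directive": re.compile(r"^%(?:YAML|TAG)[ \t]", re.MULTILINE),
    # "#" at the start of a line or preceded by a space or tab
    "comment": re.compile(r"(?:^|[ \t])#", re.MULTILINE),
    # Optional space indentation, then "-" followed by whitespace or end of line
    "block_sequence_entry": re.compile(r"^[ ]*-(?=\s|$)", re.MULTILINE),
    # A line that begins with a tab; tabs are not allowed as YAML indentation
    "tab_indentation": re.compile(r"^\t", re.MULTILINE),
}


def detect_properties(text: str) -> dict:
    """Return a mapping of property name to True/False for the given text."""
    text = text.lstrip("\ufeff")  # a leading BOM is not part of the content
    return {name: bool(pattern.search(text)) for name, pattern in PATTERNS.items()}
```

With these patterns, a value such as "a---b" or a URL fragment such as "page#section" no longer counts as a document marker or a comment, which a plain substring search would report.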
2. Direct Download Links for .YAML Files
The following are two direct download links for sample .YAML files, sourced from a repository of example files:
- https://filesamples.com/samples/code/yaml/verify_apache.yaml
- https://filesamples.com/samples/code/yaml/invoice.yaml
3. Ghost Blog Embedded HTML/JavaScript for Drag-and-Drop .YAML File Analysis
The following is an embeddable HTML/JavaScript snippet suitable for a Ghost blog or similar platform. It creates a drop zone where users can drag and drop a .YAML file. Upon dropping, the script reads the file as text (assuming UTF-8 encoding), checks for the presence of each property from the list above (using approximate string searches, as full parsing requires a library like js-yaml), and displays the results on the screen. Note that this is a client-side browser script and does not handle complex parsing or edge cases like quoted strings containing indicators.
4. Python Class for Handling .YAML Files
The following Python class opens a .YAML file, reads and decodes it (assuming UTF-8), checks for the properties (using string searches for approximation), prints them to the console, and provides a method to write the content to a new file.
import re


class YamlHandler:
    """Reads a YAML file as text and reports approximate, substring-based properties."""

    def __init__(self, filename):
        # newline='' keeps the original line endings so write() reproduces them
        with open(filename, 'r', encoding='utf-8', newline='') as f:
            self.content = f.read()
        self.properties = self._get_properties()

    def _get_properties(self):
        # Plain substring checks: indicators inside quoted strings also count.
        return {
            'Presence of Byte Order Mark (BOM)': self.content.startswith('\uFEFF'),
            'Encoding': 'UTF-8 (assumed; BOM detection only)',
            'Presence of Document Start Marker': '---' in self.content,
            'Presence of Document End Marker': '...' in self.content,
            'Presence of Directives': '\n%' in self.content or self.content.startswith('%'),
            'Presence of Comments': '#' in self.content,
            'Presence of Block Sequence Entries': '- ' in self.content,
            'Presence of Mappings': ': ' in self.content,
            'Presence of Tags': '!' in self.content,
            'Presence of Anchors': '&' in self.content,
            'Presence of Aliases': '*' in self.content,
            'Presence of Literal Block Scalars': '|' in self.content,
            'Presence of Folded Block Scalars': '>' in self.content,
            'Presence of Single-Quoted Scalars': "'" in self.content,
            'Presence of Double-Quoted Scalars': '"' in self.content,
            # Two or more leading spaces followed by content (blank lines do not count)
            'Use of Indentation for Structure': bool(re.search(r'\n {2,}\S', self.content)),
            # Rough count: more than one "---" suggests several documents
            'Support for Multiple Documents': self.content.count('---') > 1,
            'JSON Compatibility': True,  # Intrinsic to format
        }

    def print_properties(self):
        for key, value in self.properties.items():
            print(f"{key}: {value}")

    def write(self, new_filename):
        with open(new_filename, 'w', encoding='utf-8', newline='') as f:
            f.write(self.content)


# Example usage:
# handler = YamlHandler('example.yaml')
# handler.print_properties()
# handler.write('output.yaml')
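The class above assumes UTF-8 and only reports whether a BOM is present. As a hedged illustration of how the BOM could instead select the encoding, the sketch below reads raw bytes and compares them with the BOM constants from Python's standard codecs module. The function name decode_yaml_bytes is hypothetical, and the UTF-8 fallback is a simplification: the YAML specification also infers UTF-16 and UTF-32 from null-byte patterns when no BOM is present, which this sketch does not attempt.

```python
import codecs

# UTF-32 BOMs are checked first because the UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_yaml_bytes(raw: bytes):
    """Return (text, encoding, had_bom) for the raw bytes of a YAML file."""
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            # Strip the BOM and decode the remainder with the matching codec.
            return raw[len(bom):].decode(encoding), encoding, True
    # Assumption: without a BOM, treat the bytes as UTF-8.
    return raw.decode("utf-8"), "utf-8", False
```

A caller would open the file in binary mode, for example `open(filename, 'rb').read()`, and pass the bytes to this function before running the property checks on the returned text.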
5. Java Class for Handling .YAML Files
The following Java class opens a .YAML file, reads and decodes it (using UTF-8), checks for the properties, prints them to the console, and provides a method to write the content to a new file.
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class YamlHandler {
    private String content;
    private Map<String, Object> properties;

    public YamlHandler(String filename) throws IOException {
        // Read the raw bytes so line endings and the final newline are preserved.
        this.content = new String(Files.readAllBytes(Paths.get(filename)), StandardCharsets.UTF_8);
        this.properties = getProperties();
    }

    private Map<String, Object> getProperties() {
        // LinkedHashMap keeps insertion order, so properties print in list order.
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("Presence of Byte Order Mark (BOM)", content.startsWith("\uFEFF"));
        props.put("Encoding", "UTF-8 (assumed; BOM detection only)");
        props.put("Presence of Document Start Marker", content.contains("---"));
        props.put("Presence of Document End Marker", content.contains("..."));
        props.put("Presence of Directives", content.contains("\n%") || content.startsWith("%"));
        props.put("Presence of Comments", content.contains("#"));
        props.put("Presence of Block Sequence Entries", content.contains("- "));
        props.put("Presence of Mappings", content.contains(": "));
        props.put("Presence of Tags", content.contains("!"));
        props.put("Presence of Anchors", content.contains("&"));
        props.put("Presence of Aliases", content.contains("*"));
        props.put("Presence of Literal Block Scalars", content.contains("|"));
        props.put("Presence of Folded Block Scalars", content.contains(">"));
        props.put("Presence of Single-Quoted Scalars", content.contains("'"));
        props.put("Presence of Double-Quoted Scalars", content.contains("\""));
        // Two or more leading spaces followed by content (blank lines do not count)
        props.put("Use of Indentation for Structure", Pattern.compile("\n {2,}\\S").matcher(content).find());
        // Rough count: a limit of -1 keeps trailing empty parts, so parts = occurrences + 1
        props.put("Support for Multiple Documents", content.split("---", -1).length > 2);
        props.put("JSON Compatibility", true);
        return props;
    }

    public void printProperties() {
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }

    public void write(String newFilename) throws IOException {
        Files.write(Paths.get(newFilename), content.getBytes(StandardCharsets.UTF_8));
    }

    // Example usage:
    // public static void main(String[] args) throws IOException {
    //     YamlHandler handler = new YamlHandler("example.yaml");
    //     handler.printProperties();
    //     handler.write("output.yaml");
    // }
}
6. JavaScript Class for Handling .YAML Files
The following JavaScript class (for Node.js) opens a .YAML file, reads and decodes it (assuming UTF-8), checks for the properties, prints them to the console, and provides a method to write the content to a new file.
const fs = require('fs');

class YamlHandler {
  constructor(filename) {
    // readFileSync with 'utf8' keeps a leading BOM as \uFEFF in the string
    this.content = fs.readFileSync(filename, 'utf8');
    this.properties = this.getProperties();
  }

  getProperties() {
    // Plain substring checks: indicators inside quoted strings also count.
    return {
      'Presence of Byte Order Mark (BOM)': this.content.startsWith('\uFEFF'),
      'Encoding': 'UTF-8 (assumed; BOM detection only)',
      'Presence of Document Start Marker': this.content.includes('---'),
      'Presence of Document End Marker': this.content.includes('...'),
      'Presence of Directives': this.content.includes('\n%') || this.content.startsWith('%'),
      'Presence of Comments': this.content.includes('#'),
      'Presence of Block Sequence Entries': this.content.includes('- '),
      'Presence of Mappings': this.content.includes(': '),
      'Presence of Tags': this.content.includes('!'),
      'Presence of Anchors': this.content.includes('&'),
      'Presence of Aliases': this.content.includes('*'),
      'Presence of Literal Block Scalars': this.content.includes('|'),
      'Presence of Folded Block Scalars': this.content.includes('>'),
      'Presence of Single-Quoted Scalars': this.content.includes("'"),
      'Presence of Double-Quoted Scalars': this.content.includes('"'),
      // Two or more leading spaces followed by content (blank lines do not count)
      'Use of Indentation for Structure': /\n {2,}\S/.test(this.content),
      // Rough count: more than one "---" suggests several documents
      'Support for Multiple Documents': (this.content.match(/---/g) || []).length > 1,
      'JSON Compatibility': true
    };
  }

  printProperties() {
    for (const [key, value] of Object.entries(this.properties)) {
      console.log(`${key}: ${value}`);
    }
  }

  write(newFilename) {
    fs.writeFileSync(newFilename, this.content, 'utf8');
  }
}

// Example usage:
// const handler = new YamlHandler('example.yaml');
// handler.printProperties();
// handler.write('output.yaml');
7. C Implementation for Handling .YAML Files
The following C code uses a struct to simulate a "class" for opening a .YAML file, reading and decoding it (assuming UTF-8), checking for the properties, printing them to the console, and providing a function to write the content to a new file. It uses standard library functions for file I/O and string operations.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

typedef struct {
    char *content;
} YamlHandler;

// Reads the whole file into a NUL-terminated buffer; returns NULL on failure.
YamlHandler *yaml_handler_init(const char *filename) {
    FILE *file = fopen(filename, "rb"); // binary mode so ftell matches the bytes read
    if (!file) {
        perror("Failed to open file");
        return NULL;
    }
    fseek(file, 0, SEEK_END);
    long length = ftell(file);
    fseek(file, 0, SEEK_SET);
    if (length < 0) {
        fclose(file);
        return NULL;
    }
    YamlHandler *handler = malloc(sizeof(YamlHandler));
    char *content = malloc((size_t)length + 1);
    if (!handler || !content) {
        free(handler);
        free(content);
        fclose(file);
        return NULL;
    }
    size_t bytes_read = fread(content, 1, (size_t)length, file);
    content[bytes_read] = '\0'; // terminate at the number of bytes actually read
    fclose(file);
    handler->content = content;
    return handler;
}

void yaml_handler_free(YamlHandler *handler) {
    if (!handler) return;
    free(handler->content);
    free(handler);
}

bool has_substring(const char *str, const char *sub) {
    return strstr(str, sub) != NULL;
}

// True if any line after the first starts with at least two spaces.
bool has_indentation(const char *str) {
    const char *p = str;
    while ((p = strchr(p, '\n')) != NULL) {
        p++;
        if (strncmp(p, "  ", 2) == 0) return true;
    }
    return false;
}

int count_occurrences(const char *str, const char *sub) {
    int count = 0;
    const char *p = str;
    while ((p = strstr(p, sub)) != NULL) {
        count++;
        p += strlen(sub);
    }
    return count;
}

void print_properties(YamlHandler *handler) {
    // The UTF-8 BOM bytes only count as a BOM at the very start of the file
    printf("Presence of Byte Order Mark (BOM): %s\n", strncmp(handler->content, "\xEF\xBB\xBF", 3) == 0 ? "true" : "false");
    printf("Encoding: UTF-8 (assumed; BOM detection only)\n");
    printf("Presence of Document Start Marker: %s\n", has_substring(handler->content, "---") ? "true" : "false");
    printf("Presence of Document End Marker: %s\n", has_substring(handler->content, "...") ? "true" : "false");
    printf("Presence of Directives: %s\n", (has_substring(handler->content, "\n%") || strncmp(handler->content, "%", 1) == 0) ? "true" : "false");
    printf("Presence of Comments: %s\n", has_substring(handler->content, "#") ? "true" : "false");
    printf("Presence of Block Sequence Entries: %s\n", has_substring(handler->content, "- ") ? "true" : "false");
    printf("Presence of Mappings: %s\n", has_substring(handler->content, ": ") ? "true" : "false");
    printf("Presence of Tags: %s\n", has_substring(handler->content, "!") ? "true" : "false");
    printf("Presence of Anchors: %s\n", has_substring(handler->content, "&") ? "true" : "false");
    printf("Presence of Aliases: %s\n", has_substring(handler->content, "*") ? "true" : "false");
    printf("Presence of Literal Block Scalars: %s\n", has_substring(handler->content, "|") ? "true" : "false");
    printf("Presence of Folded Block Scalars: %s\n", has_substring(handler->content, ">") ? "true" : "false");
    printf("Presence of Single-Quoted Scalars: %s\n", has_substring(handler->content, "'") ? "true" : "false");
    printf("Presence of Double-Quoted Scalars: %s\n", has_substring(handler->content, "\"") ? "true" : "false");
    printf("Use of Indentation for Structure: %s\n", has_indentation(handler->content) ? "true" : "false");
    printf("Support for Multiple Documents: %s\n", count_occurrences(handler->content, "---") > 1 ? "true" : "false");
    printf("JSON Compatibility: true\n");
}

void write_file(YamlHandler *handler, const char *new_filename) {
    FILE *file = fopen(new_filename, "wb"); // binary mode keeps line endings unchanged
    if (!file) {
        perror("Failed to write file");
        return;
    }
    fputs(handler->content, file);
    fclose(file);
}

// Example usage:
// int main(void) {
//     YamlHandler *handler = yaml_handler_init("example.yaml");
//     if (handler) {
//         print_properties(handler);
//         write_file(handler, "output.yaml");
//         yaml_handler_free(handler);
//     }
//     return 0;
// }