Task 410: .MOL2 File Format
File Format Specifications for .MOL2
The .MOL2 file format, also known as the Tripos MOL2 format, is an ASCII text-based format used to represent molecular structures, including atoms, bonds, coordinates, and additional metadata. It is commonly used in computational chemistry and drug design software such as SYBYL and DOCK. The format is section-based, with each section starting with a Record Type Indicator (RTI) of the form @<TRIPOS>SECTION_NAME, followed by data lines. Sections include mandatory ones like MOLECULE, ATOM, and BOND, and optional advanced sections for features like centroids, constraints, and annotations. The format supports single or multiple molecules in one file, with flexible free-format fields separated by whitespace.
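Because one file may hold several molecules, a reader usually has to cut the text into per-molecule blocks before parsing. The sketch below assumes every molecule begins at an @<TRIPOS>MOLECULE line; the function name split_mol2_blocks is hypothetical and not part of any library.

```python
RTI_MOLECULE = "@<TRIPOS>MOLECULE"


def split_mol2_blocks(text):
    """Split MOL2 text into one string per molecule.

    Assumption: each molecule starts at an @<TRIPOS>MOLECULE line.
    Anything before the first such line (e.g. header comments) is dropped.
    """
    blocks = []
    current = None
    for line in text.splitlines():
        if line.strip() == RTI_MOLECULE:
            if current is not None:
                blocks.append("\n".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        blocks.append("\n".join(current))
    return blocks
```

Each returned block can then be handed to a single-molecule parser.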
1. List of All Properties Intrinsic to the .MOL2 File Format
Based on the specification, the properties are the structural and data elements the format defines, that is, how the file is organized and parsed (sections, fields, data types). The format is text-based (ASCII), with no fixed byte order or magic number, and uses line-based sections delimited by RTIs. Fields are free-format, separated by spaces or tabs, and a data line may be continued with a backslash. Comments start with #, blank lines are ignored, and **** denotes an empty field. The list below groups the properties by section, focusing on the core and common ones (the full specification defines 32 sections, but many are optional or specialized):
General Format Properties:
- Encoding: ASCII (printable characters).
- Line Termination: End-of-line (platform-independent, Unix or DOS).
- Field Separation: Whitespace (spaces or tabs).
- Continuation: Backslash (\) for multi-line data.
- Comments: Lines starting with # in column 1.
- Blank Lines: Ignored.
- Empty Field Indicator: **** for non-optional strings.
- Section Delimiter: RTI starting with @<TRIPOS> (e.g., @<TRIPOS>MOLECULE).
- Data Types: Strings (no whitespace), integers, reals (floating-point).
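The general rules above (comments, blank lines, continuation, RTIs) can be handled in one pre-processing pass before any section is parsed. The following is a minimal sketch under those rules, not a full implementation of the specification; the generator name iter_records is hypothetical.

```python
RTI_PREFIX = "@<TRIPOS>"


def iter_records(text):
    """Yield (section_name, data_line) pairs from MOL2 text.

    Sketch of the general rules listed above:
    - lines with '#' in column 1 are comments and are skipped,
    - blank lines are skipped,
    - a trailing backslash joins the line with the next one,
    - an RTI line switches the current section.
    """
    section = None
    pending = ""  # text carried over from a continued line
    for raw in text.splitlines():
        if not pending and (raw.startswith("#") or not raw.strip()):
            continue
        line = pending + raw.rstrip()
        if line.endswith("\\"):
            pending = line[:-1] + " "
            continue
        pending = ""
        line = line.strip()
        if line.startswith(RTI_PREFIX):
            section = line[len(RTI_PREFIX):]
            continue
        yield section, line
    if pending.strip():
        # File ended in the middle of a continued line; emit what we have.
        yield section, pending.strip()
```

Callers can split each yielded data line on whitespace, since fields are free-format.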
@<TRIPOS>MOLECULE Section Properties:
- mol_name (string): Molecule name.
- num_atoms (integer): Number of atoms.
- num_bonds (integer, optional if 0): Number of bonds.
- num_subst (integer, optional): Number of substructures.
- num_feat (integer, optional): Number of features.
- num_sets (integer, optional): Number of sets.
- mol_type (string): Type (e.g., SMALL, BIOPOLYMER, PROTEIN).
- charge_type (string): Charge method (e.g., NO_CHARGES, GASTEIGER).
- status_bits (string, optional): Internal bits (e.g., system|invalid_charges).
- mol_comment (string, optional): Comment about the molecule.
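The MOLECULE fields are positional: the first data line is the name, the second holds the counts, and so on. As a rough illustration, assuming comments and blank lines have already been removed, a header could be decoded as follows (parse_molecule_header is a hypothetical helper name).

```python
COUNT_KEYS = ("num_atoms", "num_bonds", "num_subst", "num_feat", "num_sets")


def parse_molecule_header(data_lines):
    """Decode the data lines that follow @<TRIPOS>MOLECULE.

    Assumption: data_lines contains only data lines, in file order.
    Counts that are omitted default to 0; missing trailing lines to None.
    """
    if len(data_lines) < 4:
        raise ValueError("expected name, counts, mol_type and charge_type lines")
    counts = [int(token) for token in data_lines[1].split()]
    header = {"mol_name": data_lines[0].strip()}
    for index, key in enumerate(COUNT_KEYS):
        header[key] = counts[index] if index < len(counts) else 0
    header["mol_type"] = data_lines[2].strip()
    header["charge_type"] = data_lines[3].strip()
    header["status_bits"] = data_lines[4].strip() if len(data_lines) > 4 else None
    header["mol_comment"] = data_lines[5].strip() if len(data_lines) > 5 else None
    return header
```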
@<TRIPOS>ATOM Section Properties (per atom):
- atom_id (integer): Atom identifier (reference only).
- atom_name (string): Atom name.
- x (real): X-coordinate.
- y (real): Y-coordinate.
- z (real): Z-coordinate.
- atom_type (string): SYBYL atom type (e.g., C.3, O.2).
- subst_id (integer, optional): Substructure ID.
- subst_name (string, optional): Substructure name.
- charge (real, optional): Atomic charge.
- status_bit (string, optional): Bits (e.g., BACKBONE|DICT).
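To make the per-atom layout concrete, here is a small sketch that maps one ATOM data line onto a typed record. The Atom dataclass and parse_atom_line are illustrative names, and the sample line in the usage note is made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Atom:
    atom_id: int
    atom_name: str
    x: float
    y: float
    z: float
    atom_type: str
    subst_id: Optional[int] = None
    subst_name: Optional[str] = None
    charge: Optional[float] = None
    status_bit: Optional[str] = None


def parse_atom_line(line):
    """Parse one whitespace-separated ATOM data line into an Atom."""
    f = line.split()
    if len(f) < 6:
        raise ValueError("an ATOM line needs at least 6 fields: " + line)
    return Atom(
        atom_id=int(f[0]),
        atom_name=f[1],
        x=float(f[2]),
        y=float(f[3]),
        z=float(f[4]),
        atom_type=f[5],
        subst_id=int(f[6]) if len(f) > 6 else None,
        subst_name=f[7] if len(f) > 7 else None,
        charge=float(f[8]) if len(f) > 8 else None,
        status_bit=f[9] if len(f) > 9 else None,
    )
```

For example, parse_atom_line("1 C1 1.2070 0.6970 0.0000 C.ar 1 BNZ -0.0620") yields an Atom whose atom_type is "C.ar" and whose status_bit is None because the optional trailing field is absent.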
@<TRIPOS>BOND Section Properties (per bond):
- bond_id (integer): Bond identifier (reference only).
- origin_atom_id (integer): Starting atom ID.
- target_atom_id (integer): Ending atom ID.
- bond_type (string): Type (e.g., 1=single, 2=double, ar=aromatic, am=amide).
- status_bits (string, optional): Bits (e.g., BACKBONE|DICT).
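A bond line can be decoded the same way. The sketch below also translates the bond_type code into a readable name; the code table reflects the commonly documented Tripos bond types and should be checked against the specification, and parse_bond_line is a hypothetical helper name.

```python
# Bond type codes as commonly documented for Tripos MOL2 (verify against the spec).
BOND_TYPE_NAMES = {
    "1": "single",
    "2": "double",
    "3": "triple",
    "am": "amide",
    "ar": "aromatic",
    "du": "dummy",
    "un": "unknown",
    "nc": "not connected",
}


def parse_bond_line(line):
    """Parse one BOND data line into a dict; status bits are split on '|'."""
    f = line.split()
    if len(f) < 4:
        raise ValueError("a BOND line needs at least 4 fields: " + line)
    bond_type = f[3]
    return {
        "bond_id": int(f[0]),
        "origin_atom_id": int(f[1]),
        "target_atom_id": int(f[2]),
        "bond_type": bond_type,
        "bond_type_name": BOND_TYPE_NAMES.get(bond_type.lower(), "unrecognized"),
        "status_bits": f[4].split("|") if len(f) > 4 else [],
    }
```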
@<TRIPOS>SUBSTRUCTURE Section Properties (optional, per substructure):
- subst_id (integer): Substructure ID.
- subst_name (string): Name (e.g., residue name).
- root_atom (integer): Root atom ID.
- subst_type (string): Type (e.g., RESIDUE, TEMP).
- dict_type (integer): Dictionary type.
- chain (string): Chain identifier.
- sub_type (string): Subtype.
- inter_bonds (integer): Inter-substructure bonds.
- status (string, optional): Bits (e.g., ROOT).
- comment (string, optional): Comment.
Other optional sections (e.g., @<TRIPOS>SET, @<TRIPOS>CENTROID, @<TRIPOS>CRYSIN) carry properties such as set names, centroid coordinates, and crystallographic data, but are not always present.
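A SUBSTRUCTURE line has more optional fields than the other core sections, and its last field is a comment that may contain spaces. One possible way to handle this, shown here as a sketch, is to cap the number of splits so the comment stays in one piece and to treat **** as an empty field. The helper name parse_substructure_line is hypothetical.

```python
FIELD_NAMES = (
    "subst_id", "subst_name", "root_atom", "subst_type", "dict_type",
    "chain", "sub_type", "inter_bonds", "status", "comment",
)
INT_FIELDS = {"subst_id", "root_atom", "dict_type", "inter_bonds"}


def parse_substructure_line(line):
    """Parse one SUBSTRUCTURE data line into a dict.

    Assumptions: the comment is everything after the ninth field, and
    '****' marks an empty field, which is stored as None.
    """
    tokens = line.split(None, 9)  # at most 10 tokens; the last keeps its spaces
    if len(tokens) < 3:
        raise ValueError("a SUBSTRUCTURE line needs at least 3 fields: " + line)
    record = dict.fromkeys(FIELD_NAMES)  # every field starts as None
    for name, token in zip(FIELD_NAMES, tokens):
        if token == "****":
            continue
        record[name] = int(token) if name in INT_FIELDS else token.strip()
    return record
```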
2. Two Direct Download Links for .MOL2 Files
- https://raw.githubusercontent.com/choderalab/openmoltools/master/openmoltools/chemicals/benzene/benzene.mol2 (Sample benzene molecule).
- https://raw.githubusercontent.com/choderalab/ring-open-fep/master/openmm/examples/benzene-toluene/benzene.gaff.mol2 (Sample benzene with GAFF parameters).
3. Ghost Blog Embedded HTML/JavaScript for Drag-and-Drop .MOL2 Dumper
This is an embeddable HTML snippet with JavaScript for a Ghost blog post. It creates a drop zone where users can drag and drop a .MOL2 file. The script reads the file as text, parses the core sections (MOLECULE, ATOM, BOND, SUBSTRUCTURE), extracts the properties listed in part 1, and dumps them to the screen in a readable format.
4. Python Class for .MOL2 Handling
class Mol2Handler:
    def __init__(self):
        self.molecule = {}
        self.atoms = []
        self.bonds = []
        self.substructures = []

    def read(self, filename):
        with open(filename, 'r') as f:
            content = f.read()
        lines = content.split('\n')
        current_section = None
        for line in lines:
            line = line.strip()
            # An RTI line switches the current section.
            if line.startswith('@<TRIPOS>'):
                current_section = line[9:]
                continue
            # Skip comments and blank lines.
            if line.startswith('#') or not line:
                continue
            if current_section == 'MOLECULE':
                # MOLECULE data lines are positional: name, counts, type, charges, ...
                if not self.molecule.get('mol_name'):
                    self.molecule['mol_name'] = line
                elif 'num_atoms' not in self.molecule:
                    counts = [int(x) for x in line.split()]
                    self.molecule['num_atoms'] = counts[0]
                    self.molecule['num_bonds'] = counts[1] if len(counts) > 1 else 0
                    self.molecule['num_subst'] = counts[2] if len(counts) > 2 else 0
                    self.molecule['num_feat'] = counts[3] if len(counts) > 3 else 0
                    self.molecule['num_sets'] = counts[4] if len(counts) > 4 else 0
                elif not self.molecule.get('mol_type'):
                    self.molecule['mol_type'] = line
                elif not self.molecule.get('charge_type'):
                    self.molecule['charge_type'] = line
                elif not self.molecule.get('status_bits'):
                    self.molecule['status_bits'] = line
                else:
                    self.molecule['mol_comment'] = line
            elif current_section == 'ATOM':
                fields = line.split()
                atom = {
                    'atom_id': int(fields[0]),
                    'atom_name': fields[1],
                    'x': float(fields[2]),
                    'y': float(fields[3]),
                    'z': float(fields[4]),
                    'atom_type': fields[5],
                    'subst_id': int(fields[6]) if len(fields) > 6 else None,
                    'subst_name': fields[7] if len(fields) > 7 else None,
                    'charge': float(fields[8]) if len(fields) > 8 else None,
                    'status_bit': fields[9] if len(fields) > 9 else None
                }
                self.atoms.append(atom)
            elif current_section == 'BOND':
                fields = line.split()
                bond = {
                    'bond_id': int(fields[0]),
                    'origin_atom_id': int(fields[1]),
                    'target_atom_id': int(fields[2]),
                    'bond_type': fields[3],
                    'status_bits': fields[4] if len(fields) > 4 else None
                }
                self.bonds.append(bond)
            elif current_section == 'SUBSTRUCTURE':
                fields = line.split()
                sub = {
                    'subst_id': int(fields[0]),
                    'subst_name': fields[1],
                    'root_atom': int(fields[2]),
                    'subst_type': fields[3] if len(fields) > 3 else None,
                    'dict_type': int(fields[4]) if len(fields) > 4 else None,
                    'chain': fields[5] if len(fields) > 5 else None,
                    'sub_type': fields[6] if len(fields) > 6 else None,
                    'inter_bonds': int(fields[7]) if len(fields) > 7 else None,
                    'status': fields[8] if len(fields) > 8 else None,
                    'comment': fields[9] if len(fields) > 9 else None
                }
                self.substructures.append(sub)

    def print_properties(self):
        print("Molecule Properties:")
        for key, value in self.molecule.items():
            print(f"{key}: {value}")
        print("\nAtoms:")
        for atom in self.atoms:
            print(atom)
        print("\nBonds:")
        for bond in self.bonds:
            print(bond)
        print("\nSubstructures:")
        for sub in self.substructures:
            print(sub)

    def write(self, filename):
        with open(filename, 'w') as f:
            f.write('@<TRIPOS>MOLECULE\n')
            f.write(self.molecule.get('mol_name', '') + '\n')
            counts = [
                self.molecule.get('num_atoms', len(self.atoms)),
                self.molecule.get('num_bonds', len(self.bonds)),
                self.molecule.get('num_subst', len(self.substructures)),
                self.molecule.get('num_feat', 0),
                self.molecule.get('num_sets', 0)
            ]
            f.write(' '.join(map(str, counts)) + '\n')
            f.write(self.molecule.get('mol_type', 'SMALL') + '\n')
            f.write(self.molecule.get('charge_type', 'NO_CHARGES') + '\n')
            if 'status_bits' in self.molecule:
                f.write(self.molecule['status_bits'] + '\n')
            if 'mol_comment' in self.molecule:
                f.write(self.molecule['mol_comment'] + '\n')
            f.write('\n@<TRIPOS>ATOM\n')
            for atom in self.atoms:
                fields = [
                    str(atom['atom_id']),
                    atom['atom_name'],
                    f"{atom['x']:.4f}",
                    f"{atom['y']:.4f}",
                    f"{atom['z']:.4f}",
                    atom['atom_type']
                ]
                if atom['subst_id'] is not None:
                    fields.append(str(atom['subst_id']))
                if atom['subst_name']:
                    fields.append(atom['subst_name'])
                if atom['charge'] is not None:
                    fields.append(f"{atom['charge']:.3f}")
                if atom['status_bit']:
                    fields.append(atom['status_bit'])
                f.write(' '.join(fields) + '\n')
            f.write('@<TRIPOS>BOND\n')
            for bond in self.bonds:
                fields = [
                    str(bond['bond_id']),
                    str(bond['origin_atom_id']),
                    str(bond['target_atom_id']),
                    bond['bond_type']
                ]
                if bond['status_bits']:
                    fields.append(bond['status_bits'])
                f.write(' '.join(fields) + '\n')
            if self.substructures:
                f.write('@<TRIPOS>SUBSTRUCTURE\n')
                for sub in self.substructures:
                    fields = [
                        str(sub['subst_id']),
                        sub['subst_name'],
                        str(sub['root_atom'])
                    ]
                    if sub['subst_type']:
                        fields.append(sub['subst_type'])
                    if sub['dict_type'] is not None:
                        fields.append(str(sub['dict_type']))
                    if sub['chain']:
                        fields.append(sub['chain'])
                    if sub['sub_type']:
                        fields.append(sub['sub_type'])
                    if sub['inter_bonds'] is not None:
                        fields.append(str(sub['inter_bonds']))
                    if sub['status']:
                        fields.append(sub['status'])
                    if sub['comment']:
                        fields.append(sub['comment'])
                    f.write(' '.join(fields) + '\n')

# Example usage:
# handler = Mol2Handler()
# handler.read('input.mol2')
# handler.print_properties()
# handler.write('output.mol2')
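After reading a file it can be useful to check that the declared counts match what was actually parsed and that every bond points at a known atom. The following sketch works on plain dicts and lists shaped like the ones the class above stores; check_consistency is a hypothetical helper, not part of the class.

```python
def check_consistency(molecule, atoms, bonds):
    """Return a list of human-readable problems (empty if none were found).

    Assumes `molecule` has 'num_atoms'/'num_bonds' keys and that atoms and
    bonds are dicts with 'atom_id', 'bond_id', 'origin_atom_id' and
    'target_atom_id' keys.
    """
    problems = []
    if molecule.get("num_atoms") != len(atoms):
        problems.append(
            f"num_atoms is {molecule.get('num_atoms')} but {len(atoms)} atoms were read"
        )
    if molecule.get("num_bonds", 0) != len(bonds):
        problems.append(
            f"num_bonds is {molecule.get('num_bonds', 0)} but {len(bonds)} bonds were read"
        )
    atom_ids = {atom["atom_id"] for atom in atoms}
    for bond in bonds:
        for end in ("origin_atom_id", "target_atom_id"):
            if bond[end] not in atom_ids:
                problems.append(
                    f"bond {bond['bond_id']}: {end} {bond[end]} is not a known atom"
                )
    return problems
```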
5. Java Class for .MOL2 Handling
import java.io.*;
import java.util.*;
public class Mol2Handler {
private Map<String, Object> molecule = new HashMap<>();
private List<Map<String, Object>> atoms = new ArrayList<>();
private List<Map<String, Object>> bonds = new ArrayList<>();
private List<Map<String, Object>> substructures = new ArrayList<>();
public void read(String filename) throws IOException {
try (BufferedReader reader = new BufferedReader(new FileReader(filename))) {
String line;
String currentSection = null;
while ((line = reader.readLine()) != null) {
line = line.trim();
if (line.startsWith("@<TRIPOS>")) {
currentSection = line.substring(9);
continue;
}
if (line.startsWith("#") || line.isEmpty()) continue;
if ("MOLECULE".equals(currentSection)) {
if (!molecule.containsKey("mol_name")) {
molecule.put("mol_name", line);
} else if (!molecule.containsKey("num_atoms")) {
String[] counts = line.split("\\s+");
molecule.put("num_atoms", Integer.parseInt(counts[0]));
molecule.put("num_bonds", counts.length > 1 ? Integer.parseInt(counts[1]) : 0);
molecule.put("num_subst", counts.length > 2 ? Integer.parseInt(counts[2]) : 0);
molecule.put("num_feat", counts.length > 3 ? Integer.parseInt(counts[3]) : 0);
molecule.put("num_sets", counts.length > 4 ? Integer.parseInt(counts[4]) : 0);
} else if (!molecule.containsKey("mol_type")) {
molecule.put("mol_type", line);
} else if (!molecule.containsKey("charge_type")) {
molecule.put("charge_type", line);
} else if (!molecule.containsKey("status_bits")) {
molecule.put("status_bits", line);
} else {
molecule.put("mol_comment", line);
}
} else if ("ATOM".equals(currentSection)) {
String[] fields = line.split("\\s+");
Map<String, Object> atom = new HashMap<>();
atom.put("atom_id", Integer.parseInt(fields[0]));
atom.put("atom_name", fields[1]);
atom.put("x", Double.parseDouble(fields[2]));
atom.put("y", Double.parseDouble(fields[3]));
atom.put("z", Double.parseDouble(fields[4]));
atom.put("atom_type", fields[5]);
if (fields.length > 6) atom.put("subst_id", Integer.parseInt(fields[6]));
if (fields.length > 7) atom.put("subst_name", fields[7]);
if (fields.length > 8) atom.put("charge", Double.parseDouble(fields[8]));
if (fields.length > 9) atom.put("status_bit", fields[9]);
atoms.add(atom);
} else if ("BOND".equals(currentSection)) {
String[] fields = line.split("\\s+");
Map<String, Object> bond = new HashMap<>();
bond.put("bond_id", Integer.parseInt(fields[0]));
bond.put("origin_atom_id", Integer.parseInt(fields[1]));
bond.put("target_atom_id", Integer.parseInt(fields[2]));
bond.put("bond_type", fields[3]);
if (fields.length > 4) bond.put("status_bits", fields[4]);
bonds.add(bond);
} else if ("SUBSTRUCTURE".equals(currentSection)) {
String[] fields = line.split("\\s+");
Map<String, Object> sub = new HashMap<>();
sub.put("subst_id", Integer.parseInt(fields[0]));
sub.put("subst_name", fields[1]);
sub.put("root_atom", Integer.parseInt(fields[2]));
if (fields.length > 3) sub.put("subst_type", fields[3]);
if (fields.length > 4) sub.put("dict_type", Integer.parseInt(fields[4]));
if (fields.length > 5) sub.put("chain", fields[5]);
if (fields.length > 6) sub.put("sub_type", fields[6]);
if (fields.length > 7) sub.put("inter_bonds", Integer.parseInt(fields[7]));
if (fields.length > 8) sub.put("status", fields[8]);
if (fields.length > 9) sub.put("comment", fields[9]);
substructures.add(sub);
}
}
}
}
public void printProperties() {
System.out.println("Molecule Properties:");
molecule.forEach((key, value) -> System.out.println(key + ": " + value));
System.out.println("\nAtoms:");
atoms.forEach(System.out::println);
System.out.println("\nBonds:");
bonds.forEach(System.out::println);
System.out.println("\nSubstructures:");
substructures.forEach(System.out::println);
}
public void write(String filename) throws IOException {
try (PrintWriter writer = new PrintWriter(new FileWriter(filename))) {
writer.println("@<TRIPOS>MOLECULE");
writer.println(molecule.getOrDefault("mol_name", ""));
String counts = molecule.getOrDefault("num_atoms", atoms.size()) + " " +
molecule.getOrDefault("num_bonds", bonds.size()) + " " +
molecule.getOrDefault("num_subst", substructures.size()) + " " +
molecule.getOrDefault("num_feat", 0) + " " +
molecule.getOrDefault("num_sets", 0);
writer.println(counts);
writer.println(molecule.getOrDefault("mol_type", "SMALL"));
writer.println(molecule.getOrDefault("charge_type", "NO_CHARGES"));
if (molecule.containsKey("status_bits")) writer.println(molecule.get("status_bits"));
if (molecule.containsKey("mol_comment")) writer.println(molecule.get("mol_comment"));
writer.println();
writer.println("@<TRIPOS>ATOM");
for (Map<String, Object> atom : atoms) {
StringBuilder sb = new StringBuilder();
sb.append(atom.get("atom_id")).append(" ");
sb.append(atom.get("atom_name")).append(" ");
sb.append(String.format("%.4f", atom.get("x"))).append(" ");
sb.append(String.format("%.4f", atom.get("y"))).append(" ");
sb.append(String.format("%.4f", atom.get("z"))).append(" ");
sb.append(atom.get("atom_type"));
if (atom.containsKey("subst_id")) sb.append(" ").append(atom.get("subst_id"));
if (atom.containsKey("subst_name")) sb.append(" ").append(atom.get("subst_name"));
if (atom.containsKey("charge")) sb.append(" ").append(String.format("%.3f", atom.get("charge")));
if (atom.containsKey("status_bit")) sb.append(" ").append(atom.get("status_bit"));
writer.println(sb.toString());
}
writer.println("@<TRIPOS>BOND");
for (Map<String, Object> bond : bonds) {
StringBuilder sb = new StringBuilder();
sb.append(bond.get("bond_id")).append(" ");
sb.append(bond.get("origin_atom_id")).append(" ");
sb.append(bond.get("target_atom_id")).append(" ");
sb.append(bond.get("bond_type"));
if (bond.containsKey("status_bits")) sb.append(" ").append(bond.get("status_bits"));
writer.println(sb.toString());
}
if (!substructures.isEmpty()) {
writer.println("@<TRIPOS>SUBSTRUCTURE");
for (Map<String, Object> sub : substructures) {
StringBuilder sb = new StringBuilder();
sb.append(sub.get("subst_id")).append(" ");
sb.append(sub.get("subst_name")).append(" ");
sb.append(sub.get("root_atom"));
if (sub.containsKey("subst_type")) sb.append(" ").append(sub.get("subst_type"));
if (sub.containsKey("dict_type")) sb.append(" ").append(sub.get("dict_type"));
if (sub.containsKey("chain")) sb.append(" ").append(sub.get("chain"));
if (sub.containsKey("sub_type")) sb.append(" ").append(sub.get("sub_type"));
if (sub.containsKey("inter_bonds")) sb.append(" ").append(sub.get("inter_bonds"));
if (sub.containsKey("status")) sb.append(" ").append(sub.get("status"));
if (sub.containsKey("comment")) sb.append(" ").append(sub.get("comment"));
writer.println(sb.toString());
}
}
}
}
// Example usage:
// public static void main(String[] args) throws IOException {
// Mol2Handler handler = new Mol2Handler();
// handler.read("input.mol2");
// handler.printProperties();
// handler.write("output.mol2");
// }
}
6. JavaScript Class for .MOL2 Handling
class Mol2Handler {
constructor() {
this.molecule = {};
this.atoms = [];
this.bonds = [];
this.substructures = [];
}
read(content) { // Pass file content as string (e.g., from fs.readFileSync in Node.js)
const lines = content.split('\n').map(line => line.trim());
let currentSection = null;
for (let line of lines) {
if (line.startsWith('@<TRIPOS>')) {
currentSection = line.substring(9);
continue;
}
if (line.startsWith('#') || line === '') continue;
if (currentSection === 'MOLECULE') {
if (!this.molecule.mol_name) this.molecule.mol_name = line;
else if (!this.molecule.num_atoms) {
const counts = line.split(/\s+/).map(Number);
this.molecule.num_atoms = counts[0];
this.molecule.num_bonds = counts[1] || 0;
this.molecule.num_subst = counts[2] || 0;
this.molecule.num_feat = counts[3] || 0;
this.molecule.num_sets = counts[4] || 0;
} else if (!this.molecule.mol_type) this.molecule.mol_type = line;
else if (!this.molecule.charge_type) this.molecule.charge_type = line;
else if (!this.molecule.status_bits) this.molecule.status_bits = line;
else this.molecule.mol_comment = line;
} else if (currentSection === 'ATOM') {
const fields = line.split(/\s+/);
this.atoms.push({
atom_id: parseInt(fields[0]),
atom_name: fields[1],
x: parseFloat(fields[2]),
y: parseFloat(fields[3]),
z: parseFloat(fields[4]),
atom_type: fields[5],
subst_id: fields[6] ? parseInt(fields[6]) : null,
subst_name: fields[7] || null,
charge: fields[8] ? parseFloat(fields[8]) : null,
status_bit: fields[9] || null
});
} else if (currentSection === 'BOND') {
const fields = line.split(/\s+/);
this.bonds.push({
bond_id: parseInt(fields[0]),
origin_atom_id: parseInt(fields[1]),
target_atom_id: parseInt(fields[2]),
bond_type: fields[3],
status_bits: fields[4] || null
});
} else if (currentSection === 'SUBSTRUCTURE') {
const fields = line.split(/\s+/);
this.substructures.push({
subst_id: parseInt(fields[0]),
subst_name: fields[1],
root_atom: parseInt(fields[2]),
subst_type: fields[3] || null,
dict_type: fields[4] ? parseInt(fields[4]) : null,
chain: fields[5] || null,
sub_type: fields[6] || null,
inter_bonds: fields[7] ? parseInt(fields[7]) : null,
status: fields[8] || null,
comment: fields[9] || null
});
}
}
}
printProperties() {
console.log('Molecule Properties:');
console.log(this.molecule);
console.log('Atoms:');
console.log(this.atoms);
console.log('Bonds:');
console.log(this.bonds);
console.log('Substructures:');
console.log(this.substructures);
}
write() { // Returns string content for writing to file
let output = '@<TRIPOS>MOLECULE\n';
output += (this.molecule.mol_name || '') + '\n'; // parentheses needed: + binds tighter than ||
const counts = [
this.molecule.num_atoms || this.atoms.length,
this.molecule.num_bonds || this.bonds.length,
this.molecule.num_subst || this.substructures.length,
this.molecule.num_feat || 0,
this.molecule.num_sets || 0
];
output += counts.join(' ') + '\n';
output += (this.molecule.mol_type || 'SMALL') + '\n';
output += (this.molecule.charge_type || 'NO_CHARGES') + '\n';
if (this.molecule.status_bits) output += this.molecule.status_bits + '\n';
if (this.molecule.mol_comment) output += this.molecule.mol_comment + '\n';
output += '\n@<TRIPOS>ATOM\n';
this.atoms.forEach(atom => {
let fields = [
atom.atom_id,
atom.atom_name,
atom.x.toFixed(4),
atom.y.toFixed(4),
atom.z.toFixed(4),
atom.atom_type
];
if (atom.subst_id !== null) fields.push(atom.subst_id);
if (atom.subst_name) fields.push(atom.subst_name);
if (atom.charge !== null) fields.push(atom.charge.toFixed(3));
if (atom.status_bit) fields.push(atom.status_bit);
output += fields.join(' ') + '\n';
});
output += '@<TRIPOS>BOND\n';
this.bonds.forEach(bond => {
let fields = [
bond.bond_id,
bond.origin_atom_id,
bond.target_atom_id,
bond.bond_type
];
if (bond.status_bits) fields.push(bond.status_bits);
output += fields.join(' ') + '\n';
});
if (this.substructures.length > 0) {
output += '@<TRIPOS>SUBSTRUCTURE\n';
this.substructures.forEach(sub => {
let fields = [
sub.subst_id,
sub.subst_name,
sub.root_atom
];
if (sub.subst_type) fields.push(sub.subst_type);
if (sub.dict_type !== null) fields.push(sub.dict_type);
if (sub.chain) fields.push(sub.chain);
if (sub.sub_type) fields.push(sub.sub_type);
if (sub.inter_bonds !== null) fields.push(sub.inter_bonds);
if (sub.status) fields.push(sub.status);
if (sub.comment) fields.push(sub.comment);
output += fields.join(' ') + '\n';
});
}
return output;
}
}
// Example usage in Node.js:
// const fs = require('fs');
// const handler = new Mol2Handler();
// const content = fs.readFileSync('input.mol2', 'utf8');
// handler.read(content);
// handler.printProperties();
// const outputContent = handler.write();
// fs.writeFileSync('output.mol2', outputContent);
7. C++ Class for .MOL2 Handling
#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>
#include <map>
#include <string>
#include <iomanip>
class Mol2Handler {
private:
std::map<std::string, std::string> molecule;
std::vector<std::map<std::string, std::string>> atoms;
std::vector<std::map<std::string, std::string>> bonds;
std::vector<std::map<std::string, std::string>> substructures;
public:
void read(const std::string& filename) {
std::ifstream file(filename);
if (!file.is_open()) {
std::cerr << "Error opening file: " << filename << std::endl;
return;
}
std::string line, current_section;
while (std::getline(file, line)) {
std::istringstream iss(line);
std::string token;
iss >> token;
if (token.substr(0, 9) == "@<TRIPOS>") {
current_section = token.substr(9);
continue;
}
if (token.empty() || token[0] == '#') continue; // skip blank, whitespace-only, and comment lines
if (current_section == "MOLECULE") {
if (molecule.find("mol_name") == molecule.end()) {
molecule["mol_name"] = line;
} else if (molecule.find("num_atoms") == molecule.end()) {
std::istringstream count_iss(line);
std::string num;
// A failed extraction leaves num unchanged, so test the stream rather than num.empty().
count_iss >> num; molecule["num_atoms"] = num;
molecule["num_bonds"] = (count_iss >> num) ? num : "0";
molecule["num_subst"] = (count_iss >> num) ? num : "0";
molecule["num_feat"] = (count_iss >> num) ? num : "0";
molecule["num_sets"] = (count_iss >> num) ? num : "0";
} else if (molecule.find("mol_type") == molecule.end()) {
molecule["mol_type"] = line;
} else if (molecule.find("charge_type") == molecule.end()) {
molecule["charge_type"] = line;
} else if (molecule.find("status_bits") == molecule.end()) {
molecule["status_bits"] = line;
} else {
molecule["mol_comment"] = line;
}
} else if (current_section == "ATOM") {
std::map<std::string, std::string> atom;
std::istringstream atom_iss(line);
atom_iss >> atom["atom_id"] >> atom["atom_name"] >> atom["x"] >> atom["y"] >> atom["z"] >> atom["atom_type"];
std::string temp;
// Store an optional field only if its extraction succeeded; otherwise temp still holds the previous token.
if (atom_iss >> temp) atom["subst_id"] = temp;
if (atom_iss >> temp) atom["subst_name"] = temp;
if (atom_iss >> temp) atom["charge"] = temp;
if (atom_iss >> temp) atom["status_bit"] = temp;
atoms.push_back(atom);
} else if (current_section == "BOND") {
std::map<std::string, std::string> bond;
std::istringstream bond_iss(line);
bond_iss >> bond["bond_id"] >> bond["origin_atom_id"] >> bond["target_atom_id"] >> bond["bond_type"];
std::string temp;
if (bond_iss >> temp) bond["status_bits"] = temp;
bonds.push_back(bond);
} else if (current_section == "SUBSTRUCTURE") {
std::map<std::string, std::string> sub;
std::istringstream sub_iss(line);
sub_iss >> sub["subst_id"] >> sub["subst_name"] >> sub["root_atom"];
std::string temp;
// Store an optional field only if its extraction succeeded; otherwise temp still holds the previous token.
if (sub_iss >> temp) sub["subst_type"] = temp;
if (sub_iss >> temp) sub["dict_type"] = temp;
if (sub_iss >> temp) sub["chain"] = temp;
if (sub_iss >> temp) sub["sub_type"] = temp;
if (sub_iss >> temp) sub["inter_bonds"] = temp;
if (sub_iss >> temp) sub["status"] = temp;
if (sub_iss >> temp) sub["comment"] = temp;
substructures.push_back(sub);
}
}
file.close();
}
void print_properties() {
std::cout << "Molecule Properties:" << std::endl;
for (const auto& kv : molecule) {
std::cout << kv.first << ": " << kv.second << std::endl;
}
std::cout << "\nAtoms:" << std::endl;
for (const auto& atom : atoms) {
for (const auto& kv : atom) {
std::cout << kv.first << ": " << kv.second << " ";
}
std::cout << std::endl;
}
std::cout << "\nBonds:" << std::endl;
for (const auto& bond : bonds) {
for (const auto& kv : bond) {
std::cout << kv.first << ": " << kv.second << " ";
}
std::cout << std::endl;
}
std::cout << "\nSubstructures:" << std::endl;
for (const auto& sub : substructures) {
for (const auto& kv : sub) {
std::cout << kv.first << ": " << kv.second << " ";
}
std::cout << std::endl;
}
}
void write(const std::string& filename) {
std::ofstream file(filename);
if (!file.is_open()) {
std::cerr << "Error opening file for writing: " << filename << std::endl;
return;
}
file << "@<TRIPOS>MOLECULE" << std::endl;
file << molecule["mol_name"] << std::endl;
std::string counts = molecule["num_atoms"] + " " + molecule["num_bonds"] + " " + molecule["num_subst"] + " " + molecule["num_feat"] + " " + molecule["num_sets"];
file << counts << std::endl;
file << (molecule.find("mol_type") != molecule.end() ? molecule["mol_type"] : "SMALL") << std::endl;
file << (molecule.find("charge_type") != molecule.end() ? molecule["charge_type"] : "NO_CHARGES") << std::endl;
if (molecule.find("status_bits") != molecule.end()) file << molecule["status_bits"] << std::endl;
if (molecule.find("mol_comment") != molecule.end()) file << molecule["mol_comment"] << std::endl;
file << std::endl << "@<TRIPOS>ATOM" << std::endl;
for (const auto& atom : atoms) {
file << std::setw(5) << atom.at("atom_id") << " "
<< std::left << std::setw(5) << atom.at("atom_name") << " "
<< std::fixed << std::setprecision(4) << std::stof(atom.at("x")) << " "
<< std::stof(atom.at("y")) << " "
<< std::stof(atom.at("z")) << " "
<< atom.at("atom_type");
if (atom.find("subst_id") != atom.end()) file << " " << atom.at("subst_id");
if (atom.find("subst_name") != atom.end()) file << " " << atom.at("subst_name");
if (atom.find("charge") != atom.end()) file << " " << std::setprecision(3) << std::stof(atom.at("charge"));
if (atom.find("status_bit") != atom.end()) file << " " << atom.at("status_bit");
file << std::endl;
}
file << "@<TRIPOS>BOND" << std::endl;
for (const auto& bond : bonds) {
file << std::setw(5) << bond.at("bond_id") << " "
<< std::setw(5) << bond.at("origin_atom_id") << " "
<< std::setw(5) << bond.at("target_atom_id") << " "
<< bond.at("bond_type");
if (bond.find("status_bits") != bond.end()) file << " " << bond.at("status_bits");
file << std::endl;
}
if (!substructures.empty()) {
file << "@<TRIPOS>SUBSTRUCTURE" << std::endl;
for (const auto& sub : substructures) {
file << std::setw(5) << sub.at("subst_id") << " "
<< std::left << std::setw(10) << sub.at("subst_name") << " "
<< std::setw(5) << sub.at("root_atom");
if (sub.find("subst_type") != sub.end()) file << " " << sub.at("subst_type");
if (sub.find("dict_type") != sub.end()) file << " " << sub.at("dict_type");
if (sub.find("chain") != sub.end()) file << " " << sub.at("chain");
if (sub.find("sub_type") != sub.end()) file << " " << sub.at("sub_type");
if (sub.find("inter_bonds") != sub.end()) file << " " << sub.at("inter_bonds");
if (sub.find("status") != sub.end()) file << " " << sub.at("status");
if (sub.find("comment") != sub.end()) file << " " << sub.at("comment");
file << std::endl;
}
}
file.close();
}
};
// Example usage:
// int main() {
// Mol2Handler handler;
// handler.read("input.mol2");
// handler.print_properties();
// handler.write("output.mol2");
// return 0;
// }