Input files: Reading XML
Mechanics research codes are typically written by graduate students who aim to get their work done as quickly as possible. These codes are not meant to last beyond the publication of a few related papers.
As a result, typical input files have the form:
0.000000e+00 -1.000000e+00 0.000000e+00 100 1 50
6
1 0
1 -1 0 0 0 0 25
1 0
2 1 0 0 100 0 25
1 0
3 0 -1 0 50 -1 25
1 0
4 0 1 0 50 1 25
1 0
5 0 0 -1 50 0 0
1 0
6 0 0 1 50 0 50
These files have the advantage that they can be read in quickly using an input file stream and the code for doing that can be written in minutes.
The Problem
But sometimes these codes outlast the student's tenure, and someone else (or the student) has to work out the format of the input file. That process can take hours and sometimes days, especially if the format has not been documented and the next person has to dig into the code to understand what each of the numbers means.
The XML Approach
One potential approach is to mark up the input file with XML. The file above then becomes:
<?xml version='1.0' encoding='ISO-8859-1' ?>
<Boundary>
  <!-- Container limits -->
  <containerMin> [0.0, -1.0, 0.0] </containerMin>
  <containerMax> [100.0, 1.0, 50.0] </containerMax>
  <!-- Internal boundaries -->
  <boundary type="plane" id="1">
    <direction> [-1.0, 0.0, 0.0] </direction>
    <position> [0.0, 0.0, 25.0] </position>
  </boundary>
  <boundary type="plane" id="2">
    <direction> [1.0, 0.0, 0.0] </direction>
    <position> [100.0, 0.0, 25.0] </position>
  </boundary>
  <boundary type="plane" id="3">
    <direction> [0.0, -1.0, 0.0] </direction>
    <position> [50.0, -1.0, 25.0] </position>
  </boundary>
  <boundary type="plane" id="4">
    <direction> [0.0, 1.0, 0.0] </direction>
    <position> [50.0, 1.0, 25.0] </position>
  </boundary>
  <boundary type="plane" id="5">
    <direction> [0.0, 0.0, -1.0] </direction>
    <position> [50.0, 0.0, 0.0] </position>
  </boundary>
  <boundary type="plane" id="6">
    <direction> [0.0, 0.0, 1.0] </direction>
    <position> [50.0, 0.0, 50.0] </position>
  </boundary>
</Boundary>
Though this file is more verbose than the original, the user does not have to read the source code to understand the format of the file and what the various numbers mean. The challenge for mechanics researchers is parsing and reading that sort of format. Below I’ll show you how that can be done quickly and easily with modern C++.
Reading XML
For production codes such as Vaango we use full-featured libraries such as Xerces or LibXml2. However, for research codes, there are a number of simpler alternatives. I prefer the header-only style provided by ZenXml.
The reader code is quite straightforward. Let’s encapsulate it in the class BoundaryReader.
First, the header BoundaryReader.h
#ifndef BOUNDARY_READER_H
#define BOUNDARY_READER_H

#include <string>

class BoundaryReader
{
public:
  BoundaryReader() = default;
  ~BoundaryReader() = default;

  bool readXML(const std::string& inputFileName) const;

private:
  BoundaryReader(BoundaryReader const&) = delete; // don't implement
  void operator=(BoundaryReader const&) = delete; // don't implement
};

#endif
and then the implementation BoundaryReader.cc
#include <BoundaryReader.h>
#include <zenxml/xml.h>

#include <iostream>
#include <string>

// Vec and BoundaryId are project types; their headers are not shown here.

bool
BoundaryReader::readXML(const std::string& inputFileName) const
{
  // Read the input file
  zen::XmlDoc doc;
  try {
    std::cout << "Input file name= " << inputFileName << "\n";
    doc = zen::load(inputFileName);
  } catch (const zen::XmlFileError& err) {
    std::cout << "**ERROR** Could not read input file " << inputFileName << "\n";
    std::cout << " Error # = " << err.lastError << "\n";
    return false;
  } catch (const zen::XmlParsingError& err) {
    std::cout << "**ERROR** Could not read input file " << inputFileName << "\n";
    std::cout << " Parse Error in line: " << err.row + 1
              << " col: " << err.col << "\n";
    return false;
  }

  // Load the document into input proxy for easier element access
  zen::XmlIn ps(doc);

  // Read the boundary information.
  // Note: XmlIn(doc) refers to the document's root element, so this lookup
  // assumes <Boundary> is nested inside an enclosing root element.
  auto boundary_ps = ps["Boundary"];
  if (!boundary_ps) {
    std::cout << "**ERROR** <Boundary> tag not found. \n";
    return false;
  }

  // Read the container dimensions
  std::string vecStr;
  if (!boundary_ps["containerMin"](vecStr)) {
    std::cout
      << "**ERROR** Container min. position not found in boundary geometry\n";
    std::cout << " Add the <containerMin> [x, y, z] </containerMin> tag.\n";
    return false;
  }
  Vec boxMin = Vec::fromString(vecStr); // Convert the string [x, y, z]

  if (!boundary_ps["containerMax"](vecStr)) {
    ....
    return false;
  }
  Vec boxMax = Vec::fromString(vecStr); // Convert the string [x, y, z]

  // Read each boundary
  for (auto bound_ps = boundary_ps["boundary"]; bound_ps; bound_ps.next()) {
    std::string boundaryType;
    bound_ps.attribute("type", boundaryType);
    BoundaryId id;
    bound_ps.attribute("id", id);

    if (!bound_ps["direction"](vecStr)) {
      ....
      return false;
    }
    Vec direction = Vec::fromString(vecStr); // Convert the string into a vector

    if (!bound_ps["position"](vecStr)) {
      ....
      return false;
    }
    Vec position = Vec::fromString(vecStr); // Convert the string into a vector
  }
  return true;
}
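The reader relies on Vec::fromString to turn a string such as [x, y, z] into a vector, but that function is not shown here. The following is only a sketch of what such a conversion might look like; this `Vec` is a hypothetical stand-in with three double members and is not the actual class from my code. It assumes the string has exactly the bracketed, comma-separated form used in the XML file and throws if it does not.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the Vec type used by the reader.
struct Vec {
  double x = 0.0;
  double y = 0.0;
  double z = 0.0;

  // Parses a string of the form "[x, y, z]".
  // Whitespace around the brackets, commas, and numbers is skipped.
  static Vec fromString(const std::string& text) {
    std::istringstream in(text);
    Vec v;
    char open = 0, comma1 = 0, comma2 = 0, close = 0;
    in >> open >> v.x >> comma1 >> v.y >> comma2 >> v.z >> close;
    if (!in || open != '[' || comma1 != ',' || comma2 != ',' || close != ']') {
      throw std::invalid_argument("Expected \"[x, y, z]\" but got: " + text);
    }
    return v;
  }
};
```

Throwing on a malformed string is one design choice among several; returning a bool with an output argument would match the style of the reader more closely.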
Of course, if you want to use the data that you have read, you will have to store it somewhere. That’s left as an exercise for the reader. As you can see, reading an XML file can be quite straightforward, and you can check the validity of your file at each step of the reading process.
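As a starting point for that exercise, here is one possible container for the values read from the file. Everything in it is hypothetical: `Vec3`, `PlaneBoundary`, and `BoundaryData` are illustrative names, and the id is stored as a plain int rather than as the BoundaryId type used in the reader. The reader could fill an object like this through an extra output argument.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical plain 3-vector; stands in for the Vec type in the reader.
struct Vec3 {
  double x, y, z;
};

// One <boundary> element: its attributes and its two child elements.
struct PlaneBoundary {
  int id;
  std::string type;
  Vec3 direction;
  Vec3 position;
};

// Holds everything found inside the <Boundary> element.
class BoundaryData {
public:
  void setContainer(const Vec3& boxMin, const Vec3& boxMax) {
    boxMin_ = boxMin;
    boxMax_ = boxMax;
  }

  void addBoundary(PlaneBoundary boundary) {
    boundaries_.push_back(std::move(boundary));
  }

  const Vec3& containerMin() const { return boxMin_; }
  const Vec3& containerMax() const { return boxMax_; }
  const std::vector<PlaneBoundary>& boundaries() const { return boundaries_; }

private:
  Vec3 boxMin_{};
  Vec3 boxMax_{};
  std::vector<PlaneBoundary> boundaries_;
};
```

Inside the loop over boundaries, the reader would then call something like `data.addBoundary(...)` with the values it has just converted, instead of discarding them at the end of each iteration.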
Conclusion
I recommend shifting to XML or some other similar structured format for the input files in your research codes. However, you may think that the XML format is too verbose. In that case I’d suggest using JSON and in a future post I’ll show you how to read JSON in C++.
If you have questions/comments/corrections, please contact banerjee at parresianz dot com dot zen (without the dot zen).