DOM guide: Importing documents

From COLLADA Public Wiki

Be sure to read the section on creating documents first. It covers some important topics relevant to this section.

A simple example

Let's begin with a simple example of reading some information from a Collada document. We'll open the document and print the ID of the first <node> we find.

DAE dae;
daeElement* root = dae.open("simpleImport.dae");
if (!root) {
    cout << "Document import failed.\n";
    return 0;
}

We create the DAE object then call DAE::open to open a file called "simpleImport.dae". If there is no file of that name in the current directory, or the file failed to open for some other reason, then the DAE::open method will return null. We check for that and print an error message if opening the document failed.

daeElement* node = root->getDescendant("node");
if (!node)
    cout << "No nodes found\n";
else
    cout << "node id: " << node->getAttribute("id") << endl;

Here we use the daeElement::getDescendant method to do a breadth-first search through the xml element tree for an element with the given name. This method will return null if it couldn't find an element with a matching name, which we check for. If it did find a matching element we use the daeElement::getAttribute method to print the value of the 'id' attribute.

The complete code.

#include <iostream>
#include <dae.h>
#include <dom/domCOLLADA.h>
using namespace std;

int main() {
	DAE dae;
	daeElement* root = dae.open("simpleImport.dae");
	if (!root) {
		cout << "Document import failed.\n";
		return 0;
	}

	daeElement* node = root->getDescendant("node");
	if (!node)
		cout << "No nodes found\n";
	else
		cout << "node id: " << node->getAttribute("id") << endl;

	return 0;
}

The simpleImport.dae document.

<?xml version="1.0" encoding="UTF-8"?>
<COLLADA xmlns="http://www.collada.org/2005/11/COLLADASchema" version="1.4.1">
	<asset>
		<contributor/>
		<created>2008-04-08T13:07:52-08:00</created>
		<modified>2008-04-08T13:07:52-08:00</modified>
	</asset>
  <library_nodes>
    <node id="hello"/>
  </library_nodes>
</COLLADA>

And the results of running the program.

node id: hello

Collada versions

The above example refers to Collada version 1.4.1. As of July 2013, the released version of the DOM package reads and writes Collada version 1.5.0 by default. According to the release notes it can be configured to use version 1.4.1, which is still written by other programs such as the Blender animation system. However, the configuration process described in the release notes is complex and may not be successful.

Reading data from elements

Any individual xml element has four types of data you might need: the element name, the element's attributes, the element's character data, and the element's child elements. The DOM provides easy access to all of this data via the daeElement interface.

Element name

Use the daeElement::getElementName method to get an element's name.

daeString getElementName() const; // Function signature

cout << elt->getElementName() << endl; // Example: print an element's name

Element attributes

To get the value of an attribute given the attribute's name, use the daeElement::getAttribute method.

std::string getAttribute(daeString name);

We've already seen an example of daeElement::getAttribute usage in the simple import example.

cout << "node id: " << node->getAttribute("id") << endl;

If you don't know what attributes an element has, you can iterate over its attribute list using the following methods of daeElement.

size_t getAttributeCount();
std::string getAttributeName(size_t i);
std::string getAttribute(size_t i);

This code snippet prints all the attribute names and values of the root element.

for (size_t i = 0; i < root->getAttributeCount(); i++) {
	cout << "attr " << i << " name: " << root->getAttributeName(i) << endl;
	cout << "attr " << i << " value: " << root->getAttribute(i) << endl;
}

Character data

You can retrieve an element's character data with the daeElement::getCharData method.

std::string getCharData();

For example, let's say you have an <asset> element and you want to tell if the <up_axis> setting is Z_UP. You could do that as follows.

daeElement* upAxis = asset->getDescendant("up_axis");
if (upAxis && upAxis->getCharData() == "Z_UP") {
    // We have a match!
}

Child elements

If you know the name of the child element you want, you can access it with daeElement::getChild.

daeElement* getChild(daeString eltName);

This will return null if the element with the given name doesn't exist. You might use this function to test for the existence of a particular child element.

if (root->getChild("asset") == NULL)
    cout << "Missing <asset> element!\n";

If you don't have a specific element in mind you can get a list of all the child elements instead with the daeElement::getChildren method.

daeTArray< daeSmartRef<daeElement> > getChildren();

It returns an array of smart pointers to daeElement objects, which you can simply treat like ordinary daeElement pointers. You can use daeElement::getChildren to print a list of all the child elements of root like this.

daeTArray<daeElementRef> children = root->getChildren();
for (size_t i = 0; i < children.getCount(); i++)
	cout << "child " << i << " name: " << children[i]->getElementName() << endl;

daeElementRef is just a typedef for daeSmartRef<daeElement> that's made available to DOM clients to keep code simpler.

The dom* classes

As was mentioned in the creating documents section, the dom* classes provide an alternative interface to working with elements in the DOM. All of the operations discussed so far can be done with the dom* classes instead of the daeElement interface. For example, the code to print the id attribute of the first <node> in the document could've been written like this instead:

domNode* node = (domNode*)root->getDescendant("node");
if (!node)
	cout << "No nodes found\n";
else
	cout << "node id: " << node->getId() << endl;

The dom* classes provide a more strongly typed interface to the Collada elements, and sometimes this can be convenient. Use your judgment to decide between the daeElement interface and a dom* class for a given task.

Element hierarchy traversal

An xml document contains a tree of elements. Each element has a list of children, and each child has its own list of children, and so on. The DOM provides several methods in the daeElement interface for easily navigating a document's element tree.

// Search downward
daeElement* getChild(daeString eltName);
daeElement* getDescendant(daeString eltName);
// Search upward
daeElement* getParent();
daeElement* getAncestor(daeString eltName);

The first two methods, getChild and getDescendant, are used for searching downward through the element tree. We've already seen these methods used in previous examples. getDescendant does a breadth-first search down the element tree, looking for a node with the given name. getChild works exactly the same, except that it only goes one level deep.

To search upward, use the daeElement::getParent and daeElement::getAncestor functions. getParent doesn't do a "search" exactly. Since an element only has one parent, getParent simply returns that element. getAncestor goes all the way up the element tree to the root searching for an element with the given name.

All the methods for element hierarchy traversal return null if a matching element isn't found.
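
To make the difference between these searches concrete, here is a minimal, self-contained sketch of the same kinds of traversal on a toy element tree. It is not the DOM's implementation: the Elt struct and the findChild, findDescendant, and findAncestor functions are names invented for this illustration, and the real daeElement methods may differ in details.

```cpp
#include <memory>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for an XML element; not a COLLADA DOM type.
struct Elt {
    std::string name;
    Elt* parent = nullptr;
    std::vector<std::unique_ptr<Elt>> children;

    // Append a child and return a non-owning pointer to it.
    Elt* add(const std::string& childName) {
        children.push_back(std::make_unique<Elt>());
        Elt* child = children.back().get();
        child->name = childName;
        child->parent = this;
        return child;
    }
};

// One level deep: only direct children are examined.
Elt* findChild(Elt* elt, const std::string& name) {
    for (auto& c : elt->children)
        if (c->name == name) return c.get();
    return nullptr;
}

// Breadth-first: every element at depth n is visited before depth n + 1.
Elt* findDescendant(Elt* elt, const std::string& name) {
    std::queue<Elt*> pending;
    for (auto& c : elt->children) pending.push(c.get());
    while (!pending.empty()) {
        Elt* cur = pending.front();
        pending.pop();
        if (cur->name == name) return cur;
        for (auto& c : cur->children) pending.push(c.get());
    }
    return nullptr;
}

// Walk the parent chain toward the root.
Elt* findAncestor(Elt* elt, const std::string& name) {
    for (Elt* p = elt->parent; p; p = p->parent)
        if (p->name == name) return p;
    return nullptr;
}
```

With a COLLADA root that contains a library_nodes child holding a node, findChild(root, "node") returns null because the node is two levels down, while findDescendant(root, "node") finds it.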

Using the database to get elements by type or ID

The DOM also comes with an efficient mechanism for finding elements by type or ID. This functionality is implemented by the daeDatabase class, but calling it a 'database' might be a bit misleading. Internally the DOM uses standard C++ multimaps to implement a cache to quickly find a daeElement given the element's ID or type.

Each DAE object has an associated daeDatabase that can be retrieved with the DAE::getDatabase method.

virtual daeDatabase* getDatabase();

Finding an element by ID

Retrieving a daeElement given the element's ID is a fairly common operation, and is performed frequently by the DOM internally when working with URIs and ID references. Sometimes you'll need to do it in your own code also. The method to use is daeDatabase::idLookup.

virtual std::vector<daeElement*> idLookup(const std::string& id) = 0;

You might be surprised to see that this method returns an array of elements via std::vector. After all, an ID must be unique within an entire Collada document, so how could there be multiple elements with a given ID? The answer is that the DOM can have multiple documents loaded at the same time. So for a given ID, there might be multiple matching elements in different documents, and each of these elements is returned by the idLookup method.

More commonly you'll want to find an element by ID in a specific document. For that, another version of the idLookup method is provided.

daeElement* idLookup(const std::string& id, daeDocument* doc);

This method is just like the previous idLookup method, except that it takes a daeDocument object as the second parameter. Since there can only be one element with the given ID in the specified document, this method returns a single daeElement instead of an array of daeElements.

You can get the daeDocument from any other element in the same document with the daeElement::getDocument method. For example, you might find the element with id 'myElement' in the same document as element 'root' like this.

daeElement* elt = dae.getDatabase()->idLookup("myElement", root->getDocument());
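
The following sketch shows one way a multimap-backed ID cache could support both forms of lookup. It only illustrates the idea described above. Doc, Elt, and IdIndex are hypothetical names, and the real daeDatabase implementation is likely to differ.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are not COLLADA DOM types.
struct Doc { std::string uri; };
struct Elt { std::string id; const Doc* doc; };

class IdIndex {
public:
    // Register an element under its ID. Duplicate IDs are allowed because
    // different documents may reuse the same ID.
    void insert(Elt* elt) { byId_.emplace(elt->id, elt); }

    // All elements with this ID, across every loaded document.
    std::vector<Elt*> lookup(const std::string& id) const {
        std::vector<Elt*> matches;
        auto range = byId_.equal_range(id);
        for (auto it = range.first; it != range.second; ++it)
            matches.push_back(it->second);
        return matches;
    }

    // The element with this ID in one specific document, or nullptr.
    Elt* lookup(const std::string& id, const Doc* doc) const {
        auto range = byId_.equal_range(id);
        for (auto it = range.first; it != range.second; ++it)
            if (it->second->doc == doc) return it->second;
        return nullptr;
    }

private:
    std::multimap<std::string, Elt*> byId_;
};
```

The one-argument lookup can return several matches when two loaded documents both define the same ID, while the two-argument form narrows the result to a single element.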

Element types in the DOM

So far we've discussed types in the DOM very little. I've explained that each type in the Collada schema gets mapped to a dom* class, and that each of these classes implements the daeElement interface. In the DOM, every dom* class has an associated type ID which can be queried at runtime using the 'ID' method. For example, to get the type ID of the domNode class (which corresponds to the <node> Collada element), you would write domNode::ID(), to get the domGeometry type ID you would write domGeometry::ID(), etc.

The daeElement interface provides a method typeID to query the type of any daeElement. This is useful when you want to confirm that a daeElement is of a particular type, for example to cast to a dom* class, like this.

daeElement* elt = root->getDescendant("surface");
// getDescendant returns null if no <surface> exists, so check before calling typeID
if (elt && elt->typeID() == domFx_surface_common::ID()) {
    // We have a match!
    domFx_surface_common* surface = (domFx_surface_common*)elt;
}

Checking the type of the returned element is especially important in this case because the Collada schema uses the element name "surface" with many different schema types. The getDescendant call could return an element of a type other than domFx_surface_common, in which case casting to domFx_surface_common would be invalid. By checking the type first we guard against any problems.

Type checking in this fashion is common enough that the DOM provides a cast operator daeSafeCast, which could be used to shorten the previous example.

domFx_surface_common* surface = daeSafeCast<domFx_surface_common>(root->getDescendant("surface"));
if (surface) {
    // We have a match!
}
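
The check-then-cast pattern can be sketched in a few lines. This is only an illustration of the technique, not the DOM's daeSafeCast source. Base, Node, Geometry, and safeCast are hypothetical names, and the integer type IDs are made up.

```cpp
// Hypothetical base and derived types; not COLLADA DOM classes.
struct Base {
    virtual ~Base() = default;
    virtual int typeID() const = 0;
};

struct Node : Base {
    static int ID() { return 1; }
    int typeID() const override { return ID(); }
};

struct Geometry : Base {
    static int ID() { return 2; }
    int typeID() const override { return ID(); }
};

// Returns a typed pointer only when the runtime type ID matches T's ID.
// A null input and a mismatched type both yield nullptr.
template <typename T>
T* safeCast(Base* elt) {
    if (elt && elt->typeID() == T::ID())
        return static_cast<T*>(elt);
    return nullptr;
}
```

Because the helper accepts a null pointer, the result of a search can be passed to it directly and checked once afterwards.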

Finding elements by type

Sometimes it's useful to perform an operation on all elements of a specific type. For example when writing a Collada conditioner you might want to find all the <geometry> elements and do some processing on them. The method daeDatabase::typeLookup is useful for these types of tasks.

std::vector<daeElement*> typeLookup(daeInt typeID, daeDocument* doc = NULL);
template<typename T> std::vector<T*> typeLookup(daeDocument* doc = NULL);

The first method returns an array of daeElements, while the second returns an array of dom* elements. For example, you could print the IDs of all nodes like this.

vector<daeElement*> nodes = dae.getDatabase()->typeLookup(domNode::ID());
for (size_t i = 0; i < nodes.size(); i++)
	cout << "node " << i << " id: " << nodes[i]->getAttribute("id") << endl;

You could also do it using the second typeLookup method instead.

vector<domNode*> nodes = dae.getDatabase()->typeLookup<domNode>();
for (size_t i = 0; i < nodes.size(); i++)
	cout << "node " << i << " id: " << nodes[i]->getId() << endl;

Note that the typeLookup methods search through all documents by default, but take an optional document argument to restrict the search to that document.

Working with URIs

URIs are used all throughout Collada to establish references to elements and external resources (such as texture files). The DOM represents URIs with the daeURI class. Wherever the schema uses a URI, the DOM creates a daeURI object. Detailed information about the daeURI class can be found in daeURI.h, but we'll cover some of the more common uses of the daeURI class here.

Retrieving the URI components

As is discussed in the URI spec, all URIs can be broken down into five component parts: scheme, authority, path, query, and fragment. Sometimes you need to access these components, and the daeURI class provides convenient accessor methods for that purpose.

const std::string& scheme() const;
const std::string& authority() const;
const std::string& path() const;
const std::string& query() const;
const std::string& fragment() const;
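
To see what the five components look like for a concrete URI, here is a small standalone sketch that splits a URI string using the regular expression given in Appendix B of RFC 3986. It does not use daeURI. UriParts and splitUri are hypothetical names chosen for this example.

```cpp
#include <regex>
#include <string>

// Hypothetical holder for the five components; not the daeURI class.
struct UriParts {
    std::string scheme, authority, path, query, fragment;
};

// Split a URI reference using the regular expression from RFC 3986, Appendix B.
UriParts splitUri(const std::string& uri) {
    static const std::regex re(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
    std::smatch m;
    UriParts parts;
    if (std::regex_match(uri, m, re)) {
        parts.scheme = m[2].str();
        parts.authority = m[4].str();
        parts.path = m[5].str();
        parts.query = m[7].str();
        parts.fragment = m[9].str();
    }
    return parts;
}
```

For "file:///home/user/model.dae#geom" this gives the scheme "file", an empty authority, the path "/home/user/model.dae", an empty query, and the fragment "geom".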

The DOM offers some utility functions to break the path component down further.

// Individual path component accessors. If you need access to multiple path
// components, calling pathComponents() will be faster.
std::string pathDir() const;      // daeURI("/folder/file.dae").pathDir() == "/folder/"
std::string pathFileBase() const; // daeURI("/folder/file.dae").pathFileBase() == "file"
std::string pathExt() const;      // daeURI("/folder/file.dae").pathExt() == ".dae"
std::string pathFile() const;     // daeURI("/folder/file.dae").pathFile() == "file.dae"

All of these functions should be fairly self explanatory.
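
As an illustration of how a path could be split to give the results shown in the comments above, here is a minimal sketch. PathParts and splitPath are hypothetical names, and the handling of edge cases (such as file names that begin with a dot) is an assumption that may not match the DOM.

```cpp
#include <string>

// Hypothetical helper struct; the field names mirror the accessors above.
struct PathParts {
    std::string dir, fileBase, ext, file;
};

PathParts splitPath(const std::string& path) {
    PathParts p;
    // Everything up to and including the last '/' is the directory.
    std::string::size_type slash = path.rfind('/');
    std::string::size_type fileStart =
        (slash == std::string::npos) ? 0 : slash + 1;
    p.dir = path.substr(0, fileStart);
    p.file = path.substr(fileStart);
    // The extension starts at the last '.' in the file name and keeps the dot.
    std::string::size_type dot = p.file.rfind('.');
    if (dot == std::string::npos) {
        p.fileBase = p.file;
    } else {
        p.fileBase = p.file.substr(0, dot);
        p.ext = p.file.substr(dot);
    }
    return p;
}
```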

Obtaining daeElements from URI element references

Many (but not all) of the URIs in Collada are element references. That is, they're meant to point to Collada elements. For these types of URIs, you can use the daeURI::getElement method to retrieve the daeElement referenced by a URI. Internally the DOM uses the daeDatabase to do a quick lookup of the element based on the URI's fragment, which is the element's ID.

daeElementRef getElement();

An example of an element reference URI is the 'url' attribute of the <instance_geometry> element. That attribute is a URI that points to a <geometry> element. Here's an example of finding an <instance_geometry> element in a document and then using the daeURI class to get the referenced <geometry> element.

domInstance_geometry* geomInst = dae.getDatabase()->typeLookup<domInstance_geometry>().at(0);
daeElement* geom = geomInst->getUrl().getElement();

External document references

URIs enable you to reference elements in external documents, which is an important feature of Collada. When the DOM loads a document that contains external references, the referenced documents are left unloaded at first. When you attempt to call daeURI::getElement to obtain a daeElement from another document, that document is loaded and the element is found in the other document and returned. This means that calling daeURI::getElement can trigger a document load and is therefore a potentially expensive operation. This is all handled behind the scenes for you and is one of the nice conveniences provided by the DOM.

In some cases though it might be useful to check if a URI is a local reference or an external reference. The daeURI class provides the isExternalReference method for this purpose.

daeBool isExternalReference() const;

This method returns true if the URI references a document other than the document the URI lives in (i.e. it's an external reference), and false if the URI is a normal local reference.
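
The idea behind the test can be sketched as a comparison of the document part of the reference (everything before the '#') with the URI of the containing document. This is a simplified illustration rather than the daeURI implementation: isExternal is a hypothetical function, and it assumes both strings are already in the same absolute form, so no relative URI resolution is attempted.

```cpp
#include <string>

// Returns true when `ref` names a document other than `containingDoc`.
// Assumes both strings are already in the same (e.g. absolute) form.
bool isExternal(const std::string& ref, const std::string& containingDoc) {
    // find() returns npos when there is no fragment, so substr keeps it all.
    std::string docPart = ref.substr(0, ref.find('#'));
    // A bare fragment such as "#geom" has an empty document part and is local.
    return !docPart.empty() && docPart != containingDoc;
}
```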

Converting a URI to a file path

In some cases it's necessary to convert a URI to/from a file path. It's important to note that only file scheme URIs can be converted to file paths. For other URIs (like an http URI) it makes no sense to convert to a file path. The DOM provides functions to convert in both directions.

namespace cdom {
    std::string nativePathToUri(const std::string& nativePath,
                                systemType type = getSystemType());
    std::string uriToNativePath(const std::string& uriRef,
                                systemType type = getSystemType());
}

The first function converts a native file system path to a URI, and the second function converts a URI to a native file system path. The type parameter lets you specify a path type other than the native system type; Posix (Linux, Mac) and Windows paths are supported. It can usually be left at its default.

An example of when you might want to use these functions is when you want to load a texture image. Like all external resources in Collada, textures are referenced using URIs. Most texture loading libraries don't understand URIs, though; they work with file paths. You can use the uriToNativePath function to convert the URI reference to a file path for loading.

domImage* image = dae.getDatabase()->typeLookup<domImage>().at(0);
string uri = image->getInit_from()->getValue().str();
string filePath = cdom::uriToNativePath(uri);
if (filePath.empty())
	cout << "The uri couldn't be represented as a file path. Perhaps an http scheme uri.\n";
else
	cout << filePath << endl;
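
For readers curious about what such a conversion involves, here is a deliberately simplified sketch for Posix paths only. It is not how cdom::uriToNativePath is implemented. fileUriToPosixPath is a hypothetical function that ignores Windows paths, percent-encoding, and relative references, and it returns an empty string for non-file schemes to mirror the check in the example above.

```cpp
#include <string>

// Convert a file-scheme URI to a Posix path. Returns an empty string for any
// other scheme. Simplified: no Windows paths, no percent-decoding, and no
// relative references.
std::string fileUriToPosixPath(const std::string& uri) {
    const std::string prefix = "file://";
    if (uri.compare(0, prefix.size(), prefix) != 0)
        return "";
    // Skip the (usually empty) authority; the path starts at the next '/'.
    std::string::size_type pathStart = uri.find('/', prefix.size());
    if (pathStart == std::string::npos)
        return "";
    std::string path = uri.substr(pathStart);
    // Drop any query or fragment that follows the path.
    return path.substr(0, path.find_first_of("?#"));
}
```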

Reading <extra> data

The daeElement interface provides a schema independent mechanism to work with xml data, and this works perfectly for reading <extra> data, for which there is no schema.

In the creating documents section I showed how you could use the DOM to create a document with <extra> data. The resulting document looked like this.

<node>
    <extra>
        <technique profile="steveT">
            <myElement myAttr="myValue">this is some text</myElement>
        </technique>
    </extra>
</node>

Now let's read that document back in and parse the <extra> content. Here's a complete annotated program that shows how you could do that.

#include <iostream>
#include <dae.h>
#include <dom/domCOLLADA.h>
using namespace std;

int main() {
    DAE dae;
    daeElement* root = dae.open("extra.dae");
    if (!root) {
        cout << "Document import failed.\n";
        return 0;
    }

    // Get a daeElement pointer to the <extra> element
    if (daeElement* extra = root->getDescendant("extra")) {

        // Check for a <technique> child element
        if (daeElement* technique = extra->getChild("technique")) {

            // Check the <technique>'s 'profile' attribute and make sure it matches what we expect.
            // This info could also be encoded in the 'type' attribute on the <extra> element.
            if (technique->getAttribute("profile") == "steveT") {

                // Get our custom element and print some info
                if (daeElement* elt = technique->getChild("myElement")) {
                    cout << "myAttr = " << elt->getAttribute("myAttr") << endl;
                    cout << "char data = " << elt->getCharData() << endl;
                }
            }
        }
    }

    return 0;
}

Note that at each step we're checking the return value to make sure we have an element where we expect it to be. Without the checks we could dereference a null pointer and crash if the document doesn't contain the exact elements we expect. The important thing to note is that when you use the daeElement interface <extra> data can be read and processed just like normal Collada data.

Working with sid references

Sid (scoped identifier) references are used in Collada's animation and effect systems. Sid reference syntax and usage is explained in the "Collada Target Addressing" section (chapter 3) of the Collada spec.

The most common thing you'll want to do with a sid reference is dereference it to get the element or float value it points to. The DOM makes this very easy with the daeSidRef class in daeSIDResolver.h. There are only two daeSidRef methods you need to concern yourself with. The first is the constructor.

daeSidRef(const std::string& sidRef, daeElement* referenceElt, const std::string& profile = "");

The sidRef string parameter in the constructor is the actual sid reference, and the third parameter (profile) is used to restrict the search to <technique> elements that match a specific profile. It can almost always be ignored. The referenceElt parameter requires a bit of explaining though.

Unfortunately the spec glosses over an important fact about sid references: there are actually two different "types" of sid references in Collada. One type is used in the animation system, which I call animation-style sid refs, and the other is used in the effect system, which I call effect-style sid refs.

<channel source="..." target="hip/rotateY.ANGLE"/> 
<texture texture="mySampler" texcoord="..."/> 

In animation-style sid refs (the "target" attribute in the <channel> element above), the first component of the sid ref ("hip") is the ID of an element in the document. In an effect-style sid ref, there's only one component ("mySampler" above) that specifies an element relative to a "reference" element specified in the spec. Within an <effect> element the reference is the <effect>.

This influences how you should use the referenceElt parameter in the daeSidRef constructor. When using an animation-style sid ref, you can pass in any element in the same document as the sid ref itself. Usually you would just use the element that contains the sid ref (the <channel> element in the example above). For an effect-style sid ref, the reference element to use is specified in the spec. For sid references within an <effect> you should pass in the <effect> as the reference element.
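
The structural difference between the two styles can be seen by splitting the reference strings apart. The sketch below is only an illustration. ParsedSidRef and parseSidRef are hypothetical names, and the parser makes simplifying assumptions: it ignores "./" prefixes and array index syntax, and it treats the last '.' after the last '/' as the start of a member selection, which would be wrong for sids that themselves contain dots.

```cpp
#include <string>
#include <vector>

// Hypothetical parsed form of a sid reference; not a COLLADA DOM type.
struct ParsedSidRef {
    std::string id;                 // empty for effect-style refs
    std::vector<std::string> sids;  // one or more sid path segments
    std::string member;             // e.g. "ANGLE"; empty if absent
};

ParsedSidRef parseSidRef(const std::string& ref) {
    const std::string::size_type npos = std::string::npos;
    ParsedSidRef out;
    std::string path = ref;

    // A trailing ".NAME" in the last segment selects a member of the target.
    std::string::size_type slash = path.rfind('/');
    std::string::size_type dot = path.rfind('.');
    if (dot != npos && (slash == npos || dot > slash)) {
        out.member = path.substr(dot + 1);
        path.erase(dot);
    }

    // With at least one '/', the first segment is an element ID
    // (animation-style). Without one, the whole string is a single sid
    // (effect-style).
    std::string::size_type start = 0;
    std::string::size_type firstSlash = path.find('/');
    if (firstSlash != npos) {
        out.id = path.substr(0, firstSlash);
        start = firstSlash + 1;
    }
    while (start <= path.size()) {
        std::string::size_type next = path.find('/', start);
        if (next == npos) next = path.size();
        if (next > start)  // skip empty segments
            out.sids.push_back(path.substr(start, next - start));
        start = next + 1;
    }
    return out;
}
```

Parsing "hip/rotateY.ANGLE" yields the ID "hip", the sid "rotateY", and the member "ANGLE", while "mySampler" yields an empty ID and the single sid "mySampler".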

We'll provide some examples to make all this more clear, but first I'll introduce the daeSidRef::resolve method.

resolveData resolve();

You create a daeSidRef using the constructor, then call resolve to dereference it and get the results. The resolveData structure is simple:

struct DLLSPEC resolveData {
    // ...
    daeElement* elt;
    daeDoubleArray* array;
    daeDouble* scalar;
};

A sid reference can point to an element, an array of scalar values, or a single scalar value. If resolving the sid reference fails then all values will be set to null.

Examples

Let's see some examples. First let's look at an animation-style sid ref. Suppose we have the following Collada data.

<node id="myNode">
    <rotate sid="rotateZ">0 0 1 30</rotate>
    <rotate sid="rotateY">0 1 0 45</rotate>
    <rotate sid="rotateX">1 0 0 60</rotate>
</node>

Let's say we're trying to resolve the sid ref "myNode/rotateY.ANGLE", which we may have read from an animation <channel> element. You could use the daeSidRef class to resolve this sid ref like this.

// node is a daeElement* pointing at the <node> element in the document above
daeElement* node = ...;
daeSidRef sidRef("myNode/rotateY.ANGLE", node);
daeDouble* scalar = sidRef.resolve().scalar;
if (!scalar)
    cout << "bad sid ref\n";
else
    // Do something with the scalar value
    cout << *scalar << endl;
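
The ".ANGLE" suffix in the sid ref above selects a single value out of the four numbers stored in the <rotate> element. The sketch below illustrates that selection step in isolation, assuming the member names X, Y, Z, and ANGLE map to the four values in order. The function names are hypothetical and this is not the daeSidRef implementation.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Parse whitespace-separated numbers from an element's character data.
std::vector<double> parseDoubles(const std::string& charData) {
    std::vector<double> values;
    std::istringstream in(charData);
    double v;
    while (in >> v) values.push_back(v);
    return values;
}

// Map a <rotate> member name to an index into its four values.
// Returns -1 for an unknown name.
int rotateMemberIndex(const std::string& member) {
    if (member == "X") return 0;
    if (member == "Y") return 1;
    if (member == "Z") return 2;
    if (member == "ANGLE") return 3;
    return -1;
}

// Returns a pointer to the selected value, or nullptr if the member name is
// unknown or the array is too short. This mirrors the "null on failure"
// convention of resolveData.
double* selectRotateMember(std::vector<double>& values,
                           const std::string& member) {
    int i = rotateMemberIndex(member);
    if (i < 0 || static_cast<std::size_t>(i) >= values.size())
        return nullptr;
    return &values[static_cast<std::size_t>(i)];
}
```

For the rotateY element in the data above, whose character data is "0 1 0 45", selecting "ANGLE" gives a pointer to the value 45.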

Now suppose we have the following Collada data.

<effect id="face-fx">
    <profile_COMMON>
        <newparam sid="mySampler">
            <sampler2D>
            ...
            </sampler2D>
        </newparam>
        <technique sid="common">
            <phong>
                <diffuse>
                    <texture texture="mySampler" texcoord="uv0"/>
                    ...
</effect>

The example isn't quite complete, but it demonstrates a basic effect with texturing, which uses sid references. In this example the 'texture' attribute of the <texture> element is a sid reference that refers to the <newparam> containing the <sampler2D> element. You can use the daeSidRef class to resolve the "mySampler" sid ref to get the element it points to.

daeElement* texture = ...; // Points to the <texture> element
daeSidRef sidRef(texture->getAttribute("texture"), texture->getAncestor("effect"));
daeElement* param = sidRef.resolve().elt;
if (!param) {
    cout << "bad sid ref\n";
} else {
    // Read the <sampler2D>, etc
}

Note the usage of the daeElement::getAncestor function to easily get the <effect> that contains the <texture>.

Custom data

In the DOM all xml elements have a pointer for a user to store some custom data. This data is accessed via the daeElement::setUserData and daeElement::getUserData functions.

// From daeElement.h
void setUserData(void* data);
void* getUserData();

These functions should be pretty self explanatory. The DOM makes no attempt to process the data you pass it via setUserData; from the DOM's perspective it's just an opaque pointer to memory. One issue to keep in mind is memory management. If you allocate a structure via new or malloc and attach it to a daeElement via setUserData, it's your job to make sure the memory gets freed. The integrationExample.cpp file in the DOM test suite demonstrates how you might use the setUserData/getUserData functions to help write a Collada importer.
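
Here is a small sketch of the ownership pattern described above, using a toy element class with the same two accessors. Elt, ImportInfo, attachInfo, and releaseInfo are hypothetical names invented for this example. With the real DOM you would call setUserData and getUserData on a daeElement instead.

```cpp
#include <string>

// Hypothetical element with an opaque user-data slot; not a COLLADA DOM type.
class Elt {
public:
    void setUserData(void* data) { userData_ = data; }
    void* getUserData() const { return userData_; }
private:
    void* userData_ = nullptr;
};

// Application-side data attached to an element.
struct ImportInfo {
    std::string engineName;
    int meshIndex;
};

// Attach freshly allocated data. The element stores only the raw pointer,
// so the application remains responsible for freeing it.
void attachInfo(Elt& elt, const std::string& name, int meshIndex) {
    elt.setUserData(new ImportInfo{name, meshIndex});
}

// Free the attached data and clear the slot so it cannot be freed twice.
void releaseInfo(Elt& elt) {
    delete static_cast<ImportInfo*>(elt.getUserData());
    elt.setUserData(nullptr);
}
```

Since the slot is a void pointer, the cast back to ImportInfo is only safe if the application attaches the same type everywhere.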

A complex example

Let's tie all these concepts together by writing a program that does something a little more sophisticated. In this program we'll open a Collada file (from a hard-coded path, for simplicity), then print out a listing of some basic information about all the textures used in the file. For each texture binding we'll print the <effect> ID, the image file name, and the parent element name (<diffuse>, <ambient>, etc). The parent element could be useful because it gives us information about how the texture is used (as a diffuse texture, a specular texture, etc).

The program is given below.

#include <cstring>   // strncmp
#include <iostream>
#include <string>
#include <vector>
#include <dae.h>
#include <dom/domCOLLADA.h>
#include <dom/domCommon_color_or_texture_type.h>
using namespace std;

// Given a <newparam>, search for child element <sampler1D>, <sampler2D>, etc
daeElement* getSampler(daeElement* param) {
   daeElementRefArray children = param->getChildren();
   for (size_t i = 0; i < children.getCount(); i++)
       if (strncmp(children[i]->getElementName(), "sampler", 7) == 0)
           return children[i];
   return NULL;
}

int main() {
   DAE dae;
   daeElement* root = dae.open("/home/sthomas/models/Seymour.dae");
   if (!root) {
       cout << "Document import failed.\n";
       return 0;
   }

   daeDatabase* db = dae.getDatabase();
   vector<daeElement*> textures = db->typeLookup(domCommon_color_or_texture_type::domTexture::ID());
   for (size_t i = 0; i < textures.size(); i++) {
       daeElement* tex = textures[i];
       daeElement* effect = tex->getAncestor("effect");
       daeElement* parent = tex->getParent();
       string imageFile;

       if (daeElement* samplerParam = daeSidRef(tex->getAttribute("texture"), effect).resolve().elt)
           if (daeElement* sampler = getSampler(samplerParam))

               // We have the <sampler*> element. Now read the <source>, which is a sid
               // ref to a <newparam> containing a <surface>.
               if (daeElement* source = sampler->getChild("source"))
                   if (daeElement* surfaceParam = daeSidRef(source->getCharData(), effect).resolve().elt)
                       if (daeElement* surface = surfaceParam->getChild("surface"))

                           // We have the <surface> element. It contains an <init_from> child
                           // whose character data contains the ID of the <image> element.
                           if (daeElement* initFrom = surface->getChild("init_from"))
                               if (daeElement* image = db->idLookup(initFrom->getCharData(), initFrom->getDocument()))

                                   // The <image> contains an <init_from> element which is a URI
                                   // that points to the image file.
                                   if (domImage::domInit_from* initFrom = daeSafeCast<domImage::domInit_from>(image->getChild("init_from")))
                                       imageFile = cdom::uriToNativePath(initFrom->getValue().str());

       cout << "effect: " << effect->getAttribute("id") << endl;
       cout << "parent: <" << parent->getElementName() << ">" << endl;
       cout << "image file: " << imageFile << endl << endl;
   }

   return 0;
}

The comments should help explain what's happening. Each of the functions and classes used has been explained in this guide. For each <texture> in the document, an entry is printed that looks like this:

effect: face-fx
parent: <diffuse>
image file: /home/sthomas/models/boy_10.tga


COLLADA DOM - Version 2.4 Historical Reference