The ZStr ownerhip transferring string class that I introduced in my first Xerces posting may have seemed like total overkill for the job. In C++98 it’s difficult to do correctly, and it requires some less-than-commonly available support machinery. And so if Xerces string handling was the only reason to do it, I wouldn’t have done it.
But once a string class like ZStr is available you (or at least I) find that it’s a natural first choice for many tasks. The beauty of it is that it allows you to defer the decision of trading efficiency for functionality, because the ownership can be transferred to an immutable sharing string type at any point. Or the string can be copied to a copying mutable string type like std::string, whatever.
With a type like ZStr, if you’re implementing library-like functionality, the decision of which “rich” string type does not have to be imposed on the client code. Instead of already trading away the efficiency you’re giving the client code the choice, including the choice of just using ZStr.
In part I of this series I discussed Xerces’ UTF16-based string representation, a common deallocation pitfall for such strings, and how to convert to and from such strings correctly by using C++ RAII techniques. For the RAII techniques I presented one solution using boost::shared_ptr, and one more efficient solution using a home-brewed ownership transfer class, cppx::Ownership. And I promised to next (i.e. in this installment) discuss how to do it even more efficiently by using wchar_t strings as the program’s native strings, and detecting the minimal amount of work needed for each conversion.
For Windows, where wchar_t corresponds in size and encoding to Xerces’ XMLCh, all that’s needed to convert from a wchar_t literal to Xerces XMLCh string is, at most, a little reinterpret_cast… But that won’t work on a system with 4 byte wchar_t or a system where wchar_t doesn’t imply Unicode. And although the non-Unicode systems are perhaps not your top porting priority, 4 byte wchar_t systems are common.
The wrapping that I start discussing in this posting, based on the cppx Ownership class, does a simple reinterpret_cast when it can (e.g. for a literal wchar_t string argument in Windows), and a dynamic allocation plus transcoding in other cases.
The Ownership wrapping of the result abstracts away the details of how the string is converted, and whether deallocation is needed. Thus, the calling code does not have to care. It gets raw null-operation efficiency where possible, and otherwise at least the same convenient notation + C++ RAII based safety. 🙂