[C++ basics] Five lesser known special cast operations

If you’re already familiar with casting to inaccessible base, disambiguation casts, casting to most derived object, casting to/from first member of a POD, and accessing the built-in address operator via a cast, then this blog entry is perhaps not for you. 🙂 It’s basics, but for some reason even seasoned C++ programmers are not always aware of all these special cast operations. This is about special case semantics, not syntax or general operations.

Assumptions

I assume that you are familiar with the five general type casting operations in C++, namely the old C style cast, which can be expressed as (T)e or as T(e); the const_cast<T>(e); the static_cast<T>(e); the dynamic_cast<T>(e); and the reinterpret_cast<T>(e).

Special cast #1: Casting to inaccessible base

The old C style cast (T)e, like (int)3.14, is known formally as cast notation (C++98 §5.4/2). In C++ it can also be expressed as T(e), like int(3.14). The latter syntax is formally an “explicit type conversion (functional notation)” – here I just refer to both notations as C style casts.

Using a C style cast for a non-class type T may do a static_cast or a reinterpret_cast, and it may add a const_cast. Thus, what it does in any particular case may be unclear, and it may change silently as a result of code maintenance. Also, a C style cast is more difficult to grep for than a C++ named cast. And since it’s not very much to type and effectively shuts up the compiler, quit yer bitching!, it’s much used by novices to make incorrect code compile. Especially for the last reason it’s generally considered Evil™.
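As a minimal sketch of that point, here are three C style casts that look alike but each stand in for a different named cast. The helper names (to_int, drop_const, as_bytes, check_c_style_casts) are made up for this illustration only.

```cpp
#include <cassert>

// Hypothetical helper names, chosen only for this illustration.

int to_int( double d )
{
    return (int) d;                     // Acts as static_cast<int>( d ).
}

char* drop_const( const char* s )
{
    return (char*) s;                   // Acts as const_cast<char*>( s ).
}

const unsigned char* as_bytes( const int* p )
{
    // Acts as reinterpret_cast<const unsigned char*>( p ).
    return (const unsigned char*) p;
}

void check_c_style_casts()
{
    char    buffer[]    = "abc";
    int     number      = 7;

    assert( to_int( 3.14 ) == static_cast<int>( 3.14 ) );
    assert( drop_const( buffer ) == buffer );
    assert( as_bytes( &number ) == reinterpret_cast<const unsigned char*>( &number ) );
}
```

Nothing in the source text of the three casts says which named cast is in effect; only the types involved decide that, which is why a change to a declaration elsewhere can silently change what the cast does.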

But a C style cast can do one thing that the named casts cannot do, namely casting to an inaccessible base as if a static_cast were used, if static_cast had supported this.

It can go like …

#include <iostream>

class Base
{
private:
    int x_;
public:
    explicit Base( int x ): x_( x ) {}
    int x() const { return x_; }
};

class Derived:
    private Base
{
public:
    explicit Derived( int x ): Base( x ) {}
    virtual ~Derived() {}
};

int main()
{
    Derived     d( 42 );
    Base&       b = (Base&) d;
    //Base&     b = reinterpret_cast<Base&>( d );   // Undefined Behavior

    std::cout << b.x() << std::endl;
}

Here the C style cast does not do a reinterpret_cast, or for that matter any other named cast. Due to the virtual destructor in Derived a reinterpret_cast would even in practice yield some nonsense arbitrary result. Instead the cast behaves as static_cast would have behaved if static_cast had supported this.

In the C++98 standard casting to an inaccessible base class is defined in §5.4/7 (emphasis added by me):

… the following static_cast and reinterpret_cast operations (optionally followed by a const_cast operation) may be performed using the cast notation of explicit type conversion, even if the base class type is not accessible
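The static_cast-like behavior is perhaps easiest to see with a second private base, where the pointer has to be adjusted. The following is only a sketch; the classes First, Second and Both and the two functions are hypothetical names invented for this example.

```cpp
#include <cassert>

// Hypothetical classes, invented for this example.
struct First  { int a; explicit First( int v ): a( v ) {} };
struct Second { int b; explicit Second( int v ): b( v ) {} };

class Both:
    private First,
    private Second
{
public:
    Both( int a, int b ): First( a ), Second( b ) {}
};

int second_value( Both& o )
{
    //Second* p = static_cast<Second*>( &o );   // Does not compile: inaccessible base.
    Second* p = (Second*) &o;                   // OK: converts to the Second sub-object.
    return p->b;
}

void check_inaccessible_base()
{
    Both o( 1, 2 );
    assert( second_value( o ) == 2 );
}
```

The C style cast finds the Second sub-object even though it is not at the start of the Both object. A reinterpret_cast would just reuse the address of the whole object, and reading through the result would be Undefined Behavior.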

Of course this can be viewed as yet another reason to avoid C style casts!

Special cast #2: Disambiguating a function reference

The following code does not compile, because the reference to foo is ambiguous:

#include <iostream>

void foo( int x )
{
    std::cout << "foo(int)" << std::endl;
}

void foo( double x )
{
    std::cout << "foo(double)" << std::endl;
}

template< class Func >
void bar( Func f )
{
    f( 42 );
}

int main()
{
    bar( &foo );        // !Ambiguous
}

Happily the function can be disambiguated because which function a name refers to depends on the usage context, as specified by the C++98 standard’s §13.4/1 …

The function selected is the one whose type matches the target type required in the context. The target can be…

… where the list of possible targets includes e.g. the left side of an assignment, but also an explicit type conversion, i.e. a C style cast or a static_cast.

Which means that you can “cast” the function name to the desired function pointer type. Except that for this special case there’s no type conversion. Instead what it does is simply to select the desired function among the overloads, i.e. to disambiguate the name:

#include <iostream>

void foo( int x )
{
    std::cout << "foo(int)" << std::endl;
}

void foo( double x )
{
    std::cout << "foo(double)" << std::endl;
}

template< class Func >
void bar( Func f )
{
    f( 42 );
}

int main()
{
    bar( static_cast< void(*)(double) >( &foo ) );      // OK
}

So, not every static_cast is a type conversion or constructor invocation. It can also be just a plain disambiguation. And sometimes that’s necessary.
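A cast is not the only context that provides a target type: initializing a function pointer variable works too. Here is a small sketch of both routes. It uses variants of foo and bar that return a string instead of printing, an assumption made only so that the selected overload can be checked; via_initialization, via_cast and check_overload_selection are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Variants of foo and bar that return a string, so the choice can be checked.
std::string foo( int )      { return "foo(int)"; }
std::string foo( double )   { return "foo(double)"; }

template< class Func >
std::string bar( Func f )
{
    return f( 42 );
}

std::string via_initialization()
{
    std::string (*f)( int ) = &foo;     // The declared type of f selects foo(int).
    return bar( f );
}

std::string via_cast()
{
    // The cast's target type selects foo(double).
    return bar( static_cast< std::string (*)( double ) >( &foo ) );
}

void check_overload_selection()
{
    assert( via_initialization() == "foo(int)" );
    assert( via_cast() == "foo(double)" );
}
```

The named variable costs an extra line, but some may find it easier to read than a function pointer type spelled out inside a cast.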

Special cast #3: Casting to the most derived object

Given a pointer or reference to some object o whose known static type is Base, the most derived object is the full object of type MostDerived where MostDerived is the type that o was originally instantiated from.

Converted to void*, a pointer to the Base sub-object of o may differ from a void* pointer to the MostDerived object. If you need a void* pointer that uniquely identifies the most derived object you therefore need either the latter, or a pointer to a unique topmost base class sub-object (and this pointer can be properly typed). And since you generally will have a unique top-most base class this is IMHO the most useless special cast in C++.

But, anyway, you can obtain a void* pointer to the most derived object via a dynamic_cast, per the C++98 standard’s §5.2.7/7:

If T is “pointer to cv void,” then the result is a pointer to the most derived object pointed to by v.

For example,

#include <iostream>

struct Base1 { virtual ~Base1(){} };
struct Base2 { virtual ~Base2(){} };

void foo( Base1& a, Base2& b )
{
    if( dynamic_cast< void* >( &a ) == dynamic_cast< void* >( &b ) )
    {
        std::cout << "foo: same objects" << std::endl;
    }
    else
    {
        std::cout << "foo: different object" << std::endl;
    }
}

int main()
{
    struct Derived: Base1, Base2 {};

    Base1       b1;
    Base2       b2;
    Derived     d;

    foo( b1, b2 );          // different
    foo( b1, d );           // different
    foo( d, d );            // same
}

Until a few years ago the only not-overly-implausible use case I knew of was a kind of object registry where duplicates were not allowed, and where some big conspiracy of multinationals, perhaps also involving the freemasons, Al Gore and the Norwegian politician Thorbjørn Jagland (who was instrumental in awarding the Nobel Peace Prize to Obama), had Thorbjørned you into a situation where you could not use a common base class for the objects, and had to deal with an unknown number of types.
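Such a registry could look roughly like the sketch below. The Registry class, its member names and the Left/Right/Both classes are hypothetical, invented here to illustrate the idea, and the sketch assumes that every registered type is polymorphic (otherwise the dynamic_cast does not compile).

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical classes with no common base class.
struct Left  { virtual ~Left() {} };
struct Right { virtual ~Right() {} };
struct Both: Left, Right {};

class Registry
{
private:
    std::set< const void* >  objects_;     // Addresses of most derived objects.
public:
    // Returns true if o was added, false if the same object is already registered.
    template< class Polymorphic >
    bool add( const Polymorphic& o )
    {
        const void* const id = dynamic_cast< const void* >( &o );
        return objects_.insert( id ).second;
    }

    std::size_t size() const { return objects_.size(); }
};

void check_registry()
{
    Both        both;
    Left        other;
    Left&       asLeft  = both;
    Right&      asRight = both;
    Registry    registry;

    assert( registry.add( asLeft ) );       // First sight of the Both object.
    assert( !registry.add( asRight ) );     // Same object, seen via the other base.
    assert( registry.add( other ) );        // A different object.
    assert( registry.size() == 2 );
}
```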

Then someone mentioned in [comp.lang.c++] a situation where a function foo received pointers to two objects that it should delete, but where as an optimization, in a special case, the someone wanted to allocate both objects within a single object. Just to avoid a dynamic allocation. I’m not sure if this is plausible, though: to my mind the design should be changed.

Special cast #4: Casting to or from first member of a POD

C does not support inheritance. So a C++ class like IntNode below,

#include <iostream>

struct Node
{
    Node*   next;

    Node(): next( 0 ) {}
    void linkInAs( Node*& aNextPointer )
    {
        next = aNextPointer;
        aNextPointer = this;
    }
};

struct IntNode: Node
{
    int     value;
    IntNode( int v ): value( v ) {}
};

void displayIntList( Node* p )
{
    while( p != 0 )
    {
        IntNode* const  pIntNode    = static_cast< IntNode* >( p );
        std::cout << pIntNode->value << std::endl;
        p = p->next;
    }
}

int main()
{
    Node*   pFirst  = 0;
    (new IntNode( 1 ))->linkInAs( pFirst );
    (new IntNode( 2 ))->linkInAs( pFirst );
    (new IntNode( 3 ))->linkInAs( pFirst );
    displayIntList( pFirst );       // 3 2 1
}

… must be sort of emulated in C. And a common C technique for that is to explicitly put the base class sub-object as a first member of the logically derived class:

#include <iostream>

//------------------------------ Node:

struct Node
{
    Node*   next;
};

Node* initNode( Node* p )
{
    p->next = 0;
    return p;
}

void linkInAs( Node** aNextPointer, Node* p )
{
    p->next = *aNextPointer;
    *aNextPointer = p;
}

//------------------------------ IntNode:

struct IntNode
{
    Node    node;               // "base class" sub-object
    int     value;
};

IntNode* initIntNode( IntNode* p, int v )
{
    initNode( &p->node );
    p->value = v;
    return p;
}

//------------------------------ main program:

void displayIntList( Node* p )
{
    while( p != 0 )
    {
        IntNode* const  pIntNode    = reinterpret_cast< IntNode* >( p );
        std::cout << pIntNode->value << std::endl;
        p = p->next;
    }
}

int main()
{
    Node*   pFirst  = 0;
    linkInAs( &pFirst, &(initIntNode( new IntNode, 1 )->node) );
    linkInAs( &pFirst, &(initIntNode( new IntNode, 2 )->node) );
    linkInAs( &pFirst, &(initIntNode( new IntNode, 3 )->node) );
    displayIntList( pFirst );       // 3 2 1
}

Now this is just C-like C++, it’s not proper C, but I think you get the picture.

And for compatibility with C, C++ supports the reinterpret_cast via the C++98 standard’s §9.2/17:

A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.

Since this is in flat contradiction to the standard’s better known general discussion of reinterpret_cast, where essentially the only thing you can do with the result of reinterpret_cast is to cast it back to the original type, the above is perhaps the least known special cast in C++. It’s useful for dealing with low level things. But when you have a choice, use proper C++ inheritance – or templating! 🙂
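The guarantee can be checked directly by comparing addresses. A minimal sketch, with hypothetical Header and Message structs and helper functions invented for the purpose:

```cpp
#include <cassert>

// Hypothetical POD structs; Header is the first member of Message.
struct Header  { int kind; };
struct Message { Header header; double payload; };

Header* header_of( Message* m )
{
    return reinterpret_cast< Header* >( m );    // Points to m->header.
}

// Only meaningful if h really is the header member of a Message object.
Message* message_of( Header* h )
{
    return reinterpret_cast< Message* >( h );
}

void check_first_member_casts()
{
    Message m = { { 1 }, 2.5 };

    assert( header_of( &m ) == &m.header );
    assert( message_of( &m.header ) == &m );
    assert( header_of( &m )->kind == 1 );
    assert( message_of( &m.header )->payload == 2.5 );
}
```

Note that the cast from first member back to the enclosing struct is only as good as the programmer's knowledge that the member really lives inside such a struct; the compiler cannot check it.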

Special cast #5: Accessing the built-in address operator

C++ (unfortunately, in my view) allows you to overload the address operator &. And this can make it difficult to obtain a pointer to an object, when all you have is a reference to the object! The user-defined address operator might do anything, and produce any result whatsoever…

#include <iostream>

struct Ungood
{
    int operator&() { return 0x666; }
    void* selfPointer() { return this; }
};

int main()
{
    Ungood  o;

    std::cout << "Real address (this): " << o.selfPointer() << std::endl;

    // Something like 0x666:
    std::cout << "Address from '&'   : " << (void*)&o << std::endl;

    void* p = &reinterpret_cast< char& >( o );
    std::cout << "Real address (cast): " << p << std::endl;
}

The C++98 standard supports this cast via §5.2.10/10 (emphasis added by me):

An lvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast. That is, a reference cast reinterpret_cast<T&>(x) has the same effect as the conversion *reinterpret_cast<T*>(&x) with the built-in & and * operators. The result is an lvalue that refers to the same object as the source lvalue, but with a different type. No temporary is created, no copy is made, and constructors (12.1) or conversion functions (12.3) are not called.

A general implementation of this little trick, such as the one in the Boost library, must also deal with constness. And since a C style cast (an explicit type conversion) removes constness as needed, some folks have argued that it is preferable to a very verbose combo of reinterpret_cast and const_cast. However, first of all this operation should be wrapped, as it is in the Boost library, and secondly the C style cast is not safe here: depending on the Ungood class it may invoke a type conversion operator, which the above guarantees does not happen.
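A wrapper that deals with constness might look like the sketch below. It is written in the spirit of such library helpers and is not a copy of any particular library's code; the name builtin_address_of and the test class Odd are hypothetical. Since C++11 the standard library offers std::addressof in <memory> for this job.

```cpp
#include <cassert>

// Hypothetical wrapper: yields the real address even if T overloads operator&.
template< class T >
T* builtin_address_of( T& o )
{
    // 1. View the object as a char lvalue, keeping any const/volatile of T.
    // 2. Strip const/volatile so the built-in & yields a plain char*.
    // 3. Convert back to T*, which restores T's own qualifiers.
    return reinterpret_cast< T* >(
        &const_cast< char& >(
            reinterpret_cast< const volatile char& >( o )
            )
        );
}

// Hypothetical class with a misleading address operator.
struct Odd
{
    int operator&() const { return 0x666; }
    const void* self() const { return this; }
};

void check_builtin_address_of()
{
    Odd         o;
    const Odd&  co = o;

    assert( builtin_address_of( o ) == o.self() );      // T is Odd.
    assert( builtin_address_of( co ) == co.self() );    // T is const Odd.
}
```

Only named casts are used here, so no user-defined conversion operator of the class can be invoked along the way.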

Other special casts

Are there any other special casts than the above 5?

Depending on what you regard as special, yes, there is e.g. the cast to void, and there are the special rules concerning casts to and from char types. But in my view they’re really not more than ordinary type conversions, just that they have special support. So I’d say that the list above is complete.

However, if you know of any more, please do comment! 🙂


3 comments on “[C++ basics] Five lesser known special cast operations”

  1. I stumbled upon yet another case where I had to use a C-style cast: I was writing a templated align function that was supposed to align integers AND pointers. That leads to difficulty, because reinterpret_cast does not work between integer types, and static_cast does not work between integers and pointers.

    For example (for simplicity, I use increment instead of aligning):

    template< class T >
    T f( T p )
    {
        long x = reinterpret_cast< long >( p );
        ++x;
        return reinterpret_cast< T >( x );
    }

    void g()
    {
        long x, *y = f( &x );
        // INVALID CAST ERROR: long z = f( 1 );
    }

  2. I’m wondering why a C-style cast can cast an object to an inaccessible base class. It makes the private keyword ineffective. For this kind of rule I assume there is some reason to define it in C++; for example, Special cast #4 exists for compatibility with the C language.

  3. Pingback: Why the standard explicate allow C-style cast to convert derived object to inaccessible base sub-object? | BlogoSfera
