[cppx] Hubris, significant spaces, g++

What’s the difference between …

#define CPPX_CSI_DIR( maindir )                 \
    CPPX_CONCAT3( maindir, /, CPPX_CSI_SUBDIR )

… and …

#define CPPX_CSI_DIR( maindir )                 \
    CPPX_CONCAT3_( maindir,/, CPPX_CSI_SUBDIR )

…? Well, the former works with MSVC (Microsoft Visual C++) but not with g++, while the latter works with both MSVC and g++, at least for the purpose of this macro. The macro is part of a scheme for automatically choosing a compiler-specific header, as I described in an earlier posting, and g++ treats spaces, such as the space before the slash, as Very Significant when the macro expansion ends up as the path specification of an #include directive!

Hubris… When I made that earlier posting I’d just streamlined the compiler-specific include scheme to be clearer and more reusable, but I’d only tested that new version with MSVC! Of course g++ would have none of that. Yer givin’ me a version that’s only been tested with MSVC? Ye got to be kidding! I’ll show you who’s G around here! I’ll stomp and spit on all them significant spaces!

To make a long story short (applause!), here’s the updated g++ and MSVC compatible version of the CSI scheme:

In [progrock/cppx/compiler_specific/include.h]:

#include    <progrock/cppx/macro_util.h>        // CPPX_CONCAT3_

#if     defined( _MSC_VER )
#   define  CPPX_CSI_SUBDIR     msvc
#elif   defined( __GNUC__ )
#   define  CPPX_CSI_SUBDIR     gnuc
#else
#   define CPPX_CSI_SUBDIR      _generic
#endif

// NOTE: The "missing" spaces before the slashes are important for the g++ compiler.
#define CPPX_CSI_DIR( maindir )                 \
    CPPX_CONCAT3_( maindir,/, CPPX_CSI_SUBDIR )
#define CPPX_CSI_PATH( maindir, path )          \
    CPPX_CONCAT3_( CPPX_CSI_DIR( maindir ),/, path )
#define CPPX_CSI_BRACKET_PATH( maindir, path )  \
    CPPX_CONCAT3_( <, CPPX_CSI_PATH( maindir, path ), > )

// To avoid macro replacement a component of 'path' should not be all uppercase
// and should not be any of 'assert', 'errno', 'offsetof', 'setjmp', 'va_arg',
// 'va_end' or 'va_start', which are macros  --  see C++98 §17.4.1.2/5.

// NOTE: The "missing" space before "progrock" is important for the g++ compiler.
// When invoking this macro do not have spaces around the macro argument!
#define CPPX_COMPILER_SPECIFIC_INCLUDE(path)  \
    CPPX_CSI_BRACKET_PATH(progrock/cppx/compiler_specific, path )

Of course, for this to be useful to you, you’ll also need the definition of the CPPX_CONCAT3_ macro. Sorry, I forgot that the first time around. But here it is:

In [progrock/cppx/macro_util.h]:

// Concatenate with expansion of arguments:
#define CPPX_CONCAT_RAW( a, b )         a ## b
#define CPPX_CONCAT( a, b )             CPPX_CONCAT_RAW( a, b )

#define CPPX_CONCAT3_RAW( a, b, c )     a ## b ## c
#define CPPX_CONCAT3( a, b, c )         CPPX_CONCAT3_RAW( a, b, c )

// NOTE: spaces may be significant (e.g. for the g++ compiler).
// This is just in support of <progrock/cppx/compiler_specific/include.h> with g++.
#define CPPX_TERM(x)x
#define CPPX_CONCAT_( a, b )CPPX_TERM(a)CPPX_TERM(b)
#define CPPX_CONCAT3_( a, b, c )CPPX_TERM(a)CPPX_TERM(b)CPPX_TERM(c)
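
To see what the extra level of indirection in CPPX_CONCAT buys, here’s a small sketch that is not part of cppx. The DEMO_ macros and the demo_ names are made up for the illustration, and the compile-time checks assume a C++11 compiler, whereas the headers themselves only need C++98. It stringizes the result of each macro so that the pasted token can be inspected:

```cpp
// Copied from the macro_util.h listing above.
#define CPPX_CONCAT_RAW( a, b )         a ## b
#define CPPX_CONCAT( a, b )             CPPX_CONCAT_RAW( a, b )

// Hypothetical helpers, for this illustration only.
#define DEMO_STRINGIZE_RAW( x )         #x
#define DEMO_STRINGIZE( x )             DEMO_STRINGIZE_RAW( x )
#define DEMO_N                          7

// Operands of ## are not macro-expanded, so the name DEMO_N is pasted as-is.
constexpr char demo_raw[]       = DEMO_STRINGIZE( CPPX_CONCAT_RAW( item_, DEMO_N ) );

// CPPX_CONCAT expands its arguments first, then forwards them to the raw paste.
constexpr char demo_expanded[]  = DEMO_STRINGIZE( CPPX_CONCAT( item_, DEMO_N ) );

// Compile-time string comparison (assumption: C++11 or later).
constexpr bool demo_equal( char const* a, char const* b )
{
    return *a == *b && ( *a == '\0' || demo_equal( a + 1, b + 1 ) );
}

static_assert( demo_equal( demo_raw, "item_DEMO_N" ), "raw paste keeps the macro name" );
static_assert( demo_equal( demo_expanded, "item_7" ), "expanding paste sees the value" );
```

The three-argument versions behave the same way. CPPX_CONCAT3_, by contrast, doesn’t paste anything: it only places the expanded arguments next to each other, so the result is still a sequence of separate tokens rather than one new token.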

I don’t know what the standard mandates here, i.e. whether g++ is within its rights or not. But while I fixed this I ran into a g++ bug: in some cases, such as within a struct or within a function template, it chokes on a typedef that defines the same type (in the same way) as an earlier typedef, contrary to C++98 §7.1.3/2:

In a given scope, a typedef specifier can be used to redefine the name of any type declared in that scope to refer to the type to which it already refers.

That g++ does not honor that paragraph meant that I had to fix up the CPPX_STATIC_ASSERT macro that I described in an earlier posting, solely to work around the g++ compiler bug:

In [progrock/cppx/devsupport/static_assert.h]:

// The __LINE__ is not formally necessary (see C++98 §7.1.3/2), but caters to g++ bug.
#define CPPX_STATIC_ASSERT( e ) \
    typedef char CPPX_CONCAT( CppxStaticAssertShouldBeTrue_, __LINE__ )[(e)? 1 : -1]

Ah, well, more hubris… I was sure that the original worked OK with both MSVC and g++ since it passed all unit testing and was standard-conforming. But the unit test didn’t test having two CPPX_STATIC_ASSERT invocations in the same scope, with g++! In my defense, it’s practically impossible to test for all possible compiler bugs. But still, I think I should have caught this one!
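
For the record, the case that the unit test missed is easy to reproduce. Here’s a minimal sketch, where DemoHeader and its members are made up for the illustration and the macros are the ones shown above:

```cpp
#include <climits>      // CHAR_BIT

// Copied from the listings above.
#define CPPX_CONCAT_RAW( a, b )         a ## b
#define CPPX_CONCAT( a, b )             CPPX_CONCAT_RAW( a, b )
#define CPPX_STATIC_ASSERT( e ) \
    typedef char CPPX_CONCAT( CppxStaticAssertShouldBeTrue_, __LINE__ )[(e)? 1 : -1]

// Hypothetical struct with two assertions in the same (class) scope.
struct DemoHeader
{
    unsigned char   tag;
    unsigned char   length;

    // Each invocation sits on its own line, so each typedef gets a distinct name.
    CPPX_STATIC_ASSERT( sizeof( char ) == 1 );
    CPPX_STATIC_ASSERT( CHAR_BIT >= 8 );
};
```

Two invocations on the same source line would still produce the same typedef name, so in that case the original problem would presumably come back. And as an aside, if I read C++98 §9.2/1 correctly (a member shall not be declared twice in the member-specification), the struct case may need distinct names regardless of §7.1.3/2.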

3 comments on “[cppx] Hubris, significant spaces, g++”

  1. Hi,

    since you mentioned that you couldn’t make this work reliably I
    was curious and finally got around to looking at it. Actually, I
    just stopped at the first line and then looked up the definition
    of CPPX_CONCAT3, but I guess all the problems come from there
    ([footnote]). The result of the ## operator must be a valid
    preprocessing token or the behavior is undefined.

    [footnote] Unless you also mention problems that don’t come from
    there 🙂 Seriously, I meant to look at it quickly, so once I
    found this I stopped.

    • Yes, that’s the difference between the original’s CPPX_CONCAT3 and the second version’s CPPX_CONCAT3_ (note the underscore), defined above in this posting. The latter avoids using ##, but I don’t know whether it’s standard-conforming. It works for generating #include paths, but not in general… For some reason Visual C++ was happy with ##, but g++ would have none of it! And I guess you’re right that all the problems come from this.

      I still haven’t looked at the details of how you do this in Breeze.

      My reasoning is that since different compilers do preprocessor things in slightly different ways (non-standard) what matters is whether something works, and how well it works, not absolute standard-conformance. The above works with Visual C++ and g++, at least, but as described it’s very brittle in that invocations can’t have extraneous spaces. However, since it’s all accessed via a single higher level macro I think it can easily be replaced with e.g. the Breeze scheme.

      Cheers, & thanks,

      – Alf

      • OK. I didn’t notice there was also a version that didn’t use ##
        (I didn’t notice the underscore (eh!); and thought the answer to
        “what’s the difference” was simply “there’s no space before the
        forward slash” 🙂 —that’s when people allow themselves to be
        influenced by the title; well, I guess, at least, IANF (I am not
        Freud)).

        I have never tried using the angle-bracket form (I never use it
        for user source files). I guess that g++ just “behaves better”
        with the quoted form.

        You might make a quick check by reducing the code to (warning:
        untested; may cause hair loss)

        // BTW:
        // ----
        // I'm reporting this unchanged; but I wouldn't use it. The idea
        // is that if you add support for a new compiler you don't have
        // to re-compile anything for the others. So this kind of
        // selection is IMHO excluded a priori. (The macro corresponding
        // to CPPX_CSI_SUBDIR, in my case, is defined at the build
        // level, eventually ending in a /D compiler option or
        // equivalent)
        //
        #if     defined( _MSC_VER )
        #   define  CPPX_CSI_SUBDIR     msvc
        #elif   defined( __GNUC__ )
        #   define CPPX_CSI_SUBDIR      gnuc
        #else
        #   define CPPX_CSI_SUBDIR      _generic
        #endif
        
        #define STRINGIZE_IMPL( x )     #x
        
        #define ENCLOSE( x )            /*<x> //*/ STRINGIZE_IMPL( x )
        
        #define CPPX_COMPILER_SPECIFIC_INCLUDE( path )  \
            ENCLOSE( progrock/cppx/compiler_specific/CPPX_CSI_SUBDIR/path )
        

        and then removing the leading “/*” where ENCLOSE is defined.
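
        A minimal sketch of what the quoted form boils down to, under
        some assumptions: the DEMO_ and demo_ names are made up for the
        illustration, the subdirectory defaults to gnuc only so that the
        snippet stands on its own (in the scheme described above it
        would come from the build), and the compile-time check needs
        C++11 or later:

```cpp
// Normally supplied by the build (e.g. a /D or -D option); this default is for the demo only.
#ifndef DEMO_CSI_SUBDIR
#   define DEMO_CSI_SUBDIR      gnuc
#endif

#define STRINGIZE_IMPL( x )     #x
#define ENCLOSE( x )            STRINGIZE_IMPL( x )     // Quoted form only.

// Hypothetical stand-in for CPPX_COMPILER_SPECIFIC_INCLUDE.
#define DEMO_CSI_INCLUDE( path )  \
    ENCLOSE( progrock/cppx/compiler_specific/DEMO_CSI_SUBDIR/path )

// The string that a directive such as  #include DEMO_CSI_INCLUDE(thing.h)  would name.
constexpr char demo_path[] = DEMO_CSI_INCLUDE(thing.h);

// Compile-time string comparison (assumption: C++11 or later).
constexpr bool demo_equal( char const* a, char const* b )
{
    return *a == *b && ( *a == '\0' || demo_equal( a + 1, b + 1 ) );
}

static_assert(
    demo_equal( demo_path, "progrock/cppx/compiler_specific/gnuc/thing.h" ),
    "the subdirectory macro is expanded before the whole path is stringized"
    );
```

        Since the argument of ENCLOSE is not itself an operand of #, it
        is macro-expanded before STRINGIZE_IMPL turns it into a string
        literal. How that string then maps to a file is still up to the
        implementation.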

        Generally speaking, every time you use this or similar code for
        a new compiler you have to check the documentation (assuming
        there is one) because the way in which the final sequence
        <tokentokentoken…blah blah> or "tokentokentoken…uhoh"
        resolves to a header name, and how that in turn resolves to a
        source file, is implementation defined.

        Note, too, that function-like macros aren’t really a problem,
        unless you manage to get the open-parenthesis in there, too (and
        at the “right” place; now, whoever really manages to do so…).

        I have no file named “setjmp” (!) but I do have one named
        assert.cpp; it’s not the subject of any selection macros,
        though, as it only involves portable stuff. If I really wanted
        to check this by other means than human review I’d have to write
        an inspection tool (which I intended to write anyway, for
        other reasons).

        Of course, for an unexpected macro expansion to really be a
        problem:

        – you should have an oddly named file; or name it as
        all-uppercase in source code

        – the macro should expand to something that makes the whole
        “header name” resolve to another source file that –uh– you
        happen to have as well; or invoke UB (on a compiler that
        doesn’t know much about diagnostics)

        – the contents of that other source file should be such that you
        don’t notice that you actually compiled something else.

        Did I forget anything?

        <digression>
        Incidentally, there seems to be yet-another-problem in the
        standard: the paragraph you cite says (in a note) that e.g.
        errno (but it’s not the only case in the list; see e.g. the now
        so popular setjmp :-)) is one of the names “defined as macros in
        C” but the C standard actually says it can be a macro or an
        identifier with external linkage (admittedly, it first says it’s
        a macro, then goes on to say it might not be. Anyhow, it’s not a case
        of “grants license for implementation as function”). IMHO the
        weak part here (in the C++ standard) is the wording “defined as
        macros in C”.
        </digression>

        I haven’t dwelt on why I think the above is OK, because an
        explanation (although focused on the quoted-include case) is in

          breeze/trunk/…/dependent_code.hpp

        and I’d like to see if it is clear on its own when the reader
        hasn’t already been exposed to the reasoning behind it (IOWs,
        I’m hoping for feedback :-)).

        P.S.:
        (the postscript of sadness)

        the syntax highlighting of line-continued macro definitions is
        terrible: it grays the macro name and parameters, and emphasizes
        the replacement list 😦 (I’d expect the suggested workaround
        from the Department of Dumb would be line-continuing immediately
        after #define:

        #define     \
            UH()    \
            "Wow... now it half-works"
        

        )
