Fast C++ Delegate: Boost.Function ‘drop-in’ replacement and multicast

Posted on March 8, 2014

Note: the code needs a small change to build against recent versions of Boost. Just replace `boost::ct_if` with `boost::mpl::if_c` (and `#include <boost/mpl/if.hpp>`) in Jae's Fast Delegate code.

Introduction

There have been several C++ delegates that declare themselves a 'fast' or 'fastest' delegate,
while Boost.Function and its siblings, Boost.Bind and Boost.Mem_fn,
were adopted as part of the C++ Standards Committee's Library Technical Report (TR1).
So what are these so-called 'fast' or 'fastest' delegates, and how much faster are they
than Boost.Function?

The prefix 'fast' in the term 'fast(est) delegate' means fast invocation, fast copy, or both. I believe, however, that of the two, the real issue with the 'non-fast' Boost.Function is its awful copy performance. This is due to the expensive heap memory allocation required to store the member function and the bound object on which the member function call is made. So a 'fast' delegate usually refers to a delegate that does not require heap memory allocation for storing the member function and the bound object. In C++, as an object-oriented language, using a delegate or closure for a member function and its bound object is one of the most frequent practices. Thus a 'fast' delegate can boost performance considerably in some situations.

The following four graphs show the results of the invocation speed comparison among three fast delegates and Boost.Function in
various function call scenarios. See '%FD_ROOT%/benchmark' for details.

Invocation speed benchmark #01

Invocation speed benchmark #02

Invocation speed benchmark #03

Invocation speed benchmark #04

The following two graphs show the results of the copy speed comparison among three fast delegates and Boost.Function.
For a bound member function call, Boost.Function was found to take up to 150 times longer than the fastest delegate. The
results may vary with the benchmark platform and environment, but it is clear that the copy performance of Boost.Function is
not acceptable in certain cases.

Copy speed benchmark - Debug mode

Copy speed benchmark - Release mode

Despite the prominent speed boost of the fast delegates in specific cases, many programmers are not comfortable switching to them. This is because their features are not as
rich as those that Boost.Function and its siblings provide, and we are already accustomed to using Boost.
These fast delegates support only a very limited set of callable entities and mostly do not support storing a function object, which is another frequent practice in C++.

I implemented a fast delegate some time ago, but it was neither as fast as other fast delegates nor as C++ Standard compliant as I thought it was. I later patched it to be C++ Standard compliant. This
is the second version, completely re-implemented from scratch; the old version is obsolete. It is another 'fast' delegate, but it is also a Boost.Function 'drop-in'
replacement and more. I say 'more' because it supports multicast, which is missing from most of the C++ delegates currently available. Multicast is not provided through an ancillary class; rather, one class instance acts as single cast or multicast
on demand, without any runtime performance penalty. FD.Delegate can be thought of as an aggregation of Boost.Function and
its siblings (Boost.Bind and Boost.Mem_fn)
plus some features from Boost.Signals. See the 'Delegates Comparison Chart' at the end of the article for particulars.

Using the code

As stated previously, FD.Delegate is a Boost.Function 'drop-in'
replacement. So it is reasonable to refer to the online documentation of Boost.Function and especially Boost.Function tutorial for
features of FD.Delegate. Just make sure to add '%FD_ROOT%/include' as a system include directory.

- Example #1 from Boost.Function.

#include <iostream>
#include <fd/delegate.hpp>

struct int_div
{
    float operator()(int x, int y) const { return ((float)x)/y; };
};

int main()
{
    fd::delegate<float (int, int)> f;
    f = int_div();

    std::cout << f(5, 3) << std::endl; // 1.66667

    return 0;
}

- Example #2 from Boost.Function.

#include <iostream>
#include <fd/delegate.hpp>

void do_sum_avg(int values[], int n, int& sum, float& avg)
{
    sum = 0;
    for (int i = 0; i < n; i++)
        sum += values[i];
    avg = (float)sum / n;
}

int main()
{
    // The first parameter should be int[], but some compilers (e.g., GCC)
    // complain about this
    fd::delegate<void (int*, int, int&, float&)> sum_avg;

    sum_avg = &do_sum_avg;

    int values[5] = { 1, 1, 2, 3, 5 };
    int sum;
    float avg;
    sum_avg(values, 5, sum, avg);

    std::cout << "sum = " << sum << std::endl;
    std::cout << "avg = " << avg << std::endl;
    return 0;
}

FD.Delegate supports multicast and uses C#'s multicast syntax, operator += and operator -=.

#include <iostream>
#include <fd/delegate/delegate2.hpp>

struct print_sum
{
    void operator()(int x, int y) const { std::cout << x+y << std::endl; }
};

struct print_product
{
    void operator()(int x, int y) const { std::cout << x*y << std::endl; }
};

int main()
{
    fd::delegate2<void, int, int> dg;

    dg += print_sum();
    dg += print_product();

    dg(3, 5); // prints 8 and 15

    return 0;
}
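
To make the += and -= syntax concrete, here is a minimal sketch of a multicast holder restricted to free function pointers, using only the standard library. The names simple_multicast, print_sum_fn and print_product_fn are hypothetical and are not part of FD.Delegate; the sketch only illustrates why removal by value works when targets are equality comparable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical minimal multicast holder for free functions only.
// Not FD.Delegate's implementation; for illustration.
class simple_multicast
{
public:
    typedef void (*target_type)(int, int);

    // Append a target to the invocation list.
    simple_multicast & operator+=(target_type fn)
    {
        assert(fn != 0);
        targets_.push_back(fn);
        return *this;
    }

    // Remove the first target that compares equal to fn, if any.
    // This relies on function pointers being equality comparable.
    simple_multicast & operator-=(target_type fn)
    {
        std::vector<target_type>::iterator it =
            std::find(targets_.begin(), targets_.end(), fn);
        if (it != targets_.end())
            targets_.erase(it);
        return *this;
    }

    // Invoke every target in the order in which it was added.
    void operator()(int x, int y) const
    {
        for (std::size_t i = 0; i < targets_.size(); ++i)
            targets_[i](x, y);
    }

    std::size_t size() const { return targets_.size(); }

private:
    std::vector<target_type> targets_;
};

void print_sum_fn(int x, int y) { std::cout << x + y << std::endl; }
void print_product_fn(int x, int y) { std::cout << x * y << std::endl; }
```

With this sketch, `dg += &print_sum_fn;` followed by `dg -= &print_sum_fn;` leaves the invocation list unchanged, because std::find can locate the stored pointer by comparing it with the argument.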

While a function pointer is equality comparable, it cannot in general be determined at compile time whether a function object is equality comparable. This makes operator -= pretty much useless for removing a function object from a multicast delegate. To remedy this, FD.Delegate provides the add() and remove() member
function pair. add() returns an instance of fd::multicast::token, which
can be used to remove the added delegate(s).

#include <iostream>
#include <fd/delegate.hpp>
#include <cassert>

struct print_sum
{
    void operator()(int x, int y) const { std::cout << x+y << std::endl; }
};

struct print_product
{
    void operator()(int x, int y) const { std::cout << x*y << std::endl; }
};

struct print_difference
{
    void operator()(int x, int y) const { std::cout << x-y << std::endl; }
};

struct print_quotient
{
    void operator()(int x, int y) const { std::cout << x/y << std::endl; }
};

int main()
{
    fd::delegate2<void, int, int> dg;

    dg += print_sum();
    dg += print_product();

    dg(3, 5);

    fd::multicast::token print_diff_tok = dg.add(print_difference());

    // print_diff_tok is still connected to dg
    assert(print_diff_tok.valid());

    dg(5, 3); // prints 8, 15, and 2

    print_diff_tok.remove(); // remove the print_difference delegate

    dg(5, 3);  // now prints 8 and 15, but not the difference

    assert(!print_diff_tok.valid()); // not connected anymore
    {
        fd::multicast::scoped_token t = dg.add(print_quotient());
        dg(5, 3); // prints 8, 15, and 1
    } // t falls out of scope, so print_quotient is not a member of dg

    dg(5, 3); // prints 8 and 15

    return 0;
}
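
One way to picture token-based removal is to give every added target a unique id and hand that id back to the caller. The sketch below assumes C++11 (std::function) and uses hypothetical names (token_multicast, add, remove, valid); unlike fd::multicast::token, the token here is a plain integer and removal goes through the container, so this is not how FD.Delegate itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>

// Hypothetical sketch of token-based removal; names are illustrative only.
class token_multicast
{
public:
    typedef std::function<void (int, int)> target_type;
    typedef std::size_t token;

    token_multicast() : next_id_(0) { }

    // Store any callable and return the id needed to remove it later.
    // No equality comparison on the callable is ever required.
    token add(target_type fn)
    {
        assert(static_cast<bool>(fn));
        token id = next_id_++;
        targets_[id] = fn;
        return id;
    }

    // Remove by id; returns true if the token was still connected.
    bool remove(token id) { return targets_.erase(id) != 0; }

    // True while the target identified by id is still in the list.
    bool valid(token id) const { return targets_.count(id) != 0; }

    // Invoke targets in ascending id order, which is insertion order.
    void operator()(int x, int y) const
    {
        for (std::map<token, target_type>::const_iterator it = targets_.begin();
             it != targets_.end(); ++it)
            it->second(x, y);
    }

private:
    token next_id_;
    std::map<token, target_type> targets_;
};
```

The design point is that the id, not the function object, identifies the entry, so function objects without operator== can still be removed.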

How to manage multiple return values has been one of the main concerns for a multicast delegate. FD.Delegate adopts the combiner
interface of Boost.Signals, but with slightly different usage and syntax. In Boost.Signals, the combiner type is part of the signal type,
given as a template parameter when declaring a signal variable; in FD.Delegate it is not part of the delegate type.
Instead, FD.Delegate has a special function
call operator which takes an instance of the combiner as the last function call argument.

#include <algorithm>
#include <cassert>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <vector>
#include <fd/delegate.hpp>

template<typename T>
struct maximum
{
    typedef T result_type;

    template<typename InputIterator>
    T operator()(InputIterator first, InputIterator last) const
    {
        if(first == last)
            throw std::runtime_error("Cannot compute maximum of zero elements!");
        return *std::max_element(first, last);
    }
};

template<typename Container>
struct aggregate_values
{
    typedef Container result_type;

    template<typename InputIterator>
    Container operator()(InputIterator first, InputIterator last) const
    {
        return Container(first, last);
    }
};

int main()
{
    fd::delegate2<int, int, int> dg_max;
    dg_max += std::plus<int>();
    dg_max += std::multiplies<int>();
    dg_max += std::minus<int>();
    dg_max += std::divides<int>();

    std::cout << dg_max(5, 3, maximum<int>()) << std::endl; // prints 15

    std::vector<int> vec_result = dg_max(5, 3, 
        aggregate_values<std::vector<int> >());
    assert(vec_result.size() == 4);

    std::cout << vec_result[0] << std::endl; // prints 8
    std::cout << vec_result[1] << std::endl; // prints 15
    std::cout << vec_result[2] << std::endl; // prints 2
    std::cout << vec_result[3] << std::endl; // prints 1 (integer division 5 / 3)

    return 0;
}
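
The idea of passing the combiner as the last call argument can be sketched with the standard library alone. The code below is a simplified illustration under stated assumptions: it requires C++11, the names combining_multicast and max_of_results are hypothetical, and it collects all results eagerly into a vector before calling the combiner, which may differ from what FD.Delegate or Boost.Signals actually do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical multicast whose call operator takes a combiner last.
class combining_multicast
{
public:
    typedef std::function<int (int, int)> target_type;

    combining_multicast & operator+=(target_type fn)
    {
        targets_.push_back(fn);
        return *this;
    }

    // Call every target, then let the combiner reduce the iterator range
    // of results to a single value of type Combiner::result_type.
    template<typename Combiner>
    typename Combiner::result_type
    operator()(int x, int y, Combiner combiner) const
    {
        std::vector<int> results;
        results.reserve(targets_.size());
        for (std::size_t i = 0; i < targets_.size(); ++i)
            results.push_back(targets_[i](x, y));
        return combiner(results.begin(), results.end());
    }

private:
    std::vector<target_type> targets_;
};

// Combiner that returns the largest result; requires at least one target.
struct max_of_results
{
    typedef int result_type;

    template<typename InputIterator>
    int operator()(InputIterator first, InputIterator last) const
    {
        assert(first != last);
        return *std::max_element(first, last);
    }
};
```

Because the combiner is an ordinary call argument, the same delegate object can be invoked with different combiners at different call sites, which is the practical consequence of keeping the combiner out of the delegate type.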

Under the hood

Part A: storing a function pointer for later invocation without requiring heap memory allocation.

According to the C++ standard, a function pointer, whether a free function pointer or a member function pointer, cannot portably be converted to or stored in a void *. A function pointer may be converted to a function pointer of a different type; however, the result of such a conversion cannot be used to make a call, it can only be converted back. The size of a member function pointer varies across platforms from
4 bytes to 16 bytes. To avoid heap allocation when storing the member function pointer, some well-known template metaprogramming techniques have been adapted. These permit a member function pointer whose size is less than or equal to the size of the predefined generic
member function pointer to be stored without heap memory allocation. The stored generic member function pointer is converted back to its original member function pointer type before use.

typedef void generic_fxn();

class alignment_dummy_base1 { };
class alignment_dummy_base2 { };

// Single inheritance.
class alignment_dummy_s : alignment_dummy_base1 { };
// Multiple inheritance.
class alignment_dummy_m : alignment_dummy_base1, alignment_dummy_base2 { };
// Virtual inheritance.
class alignment_dummy_v : virtual alignment_dummy_base1 { };
// Unknown (incomplete).
class alignment_dummy_u;

// Member function pointer of single inheritance class.
typedef void (alignment_dummy_s::*mfn_ptr_s)();
// Member function pointer of multiple inheritance class.
typedef void (alignment_dummy_m::*mfn_ptr_m)();
// Member function pointer of virtual inheritance class.
typedef void (alignment_dummy_v::*mfn_ptr_v)();
// Member function pointer of unknown (incomplete) class.
typedef void (alignment_dummy_u::*mfn_ptr_u)();

typedef void (alignment_dummy_m::*generic_mfn_ptr)();

union max_align_for_funtion_pointer
{
    void const * dummy_vp;
    generic_fxn * dummy_fp;
    boost::ct_if<( sizeof( generic_mfn_ptr ) < sizeof( mfn_ptr_s ) ),
      generic_mfn_ptr, mfn_ptr_s>::type dummy_mfp1;
    boost::ct_if<( sizeof( generic_mfn_ptr ) < sizeof( mfn_ptr_m ) ),
      generic_mfn_ptr, mfn_ptr_m>::type dummy_mfp2;
    boost::ct_if<( sizeof( generic_mfn_ptr ) < sizeof( mfn_ptr_v ) ),
      generic_mfn_ptr, mfn_ptr_v>::type dummy_mfp3;
    boost::ct_if<( sizeof( generic_mfn_ptr ) < sizeof( mfn_ptr_u ) ),
      generic_mfn_ptr, mfn_ptr_u>::type dummy_mfp4;
};

BOOST_STATIC_CONSTANT( unsigned, 
    any_fxn_size = sizeof( max_align_for_funtion_pointer ) );

union any_fxn_pointer
{
    void const * obj_ptr;
    generic_fxn * fxn_ptr;
    generic_mfn_ptr mfn_ptr;
    max_align_for_funtion_pointer m_;
};

A member function pointer whose size is less than or equal to any_fxn_size is
stored in any_fxn_pointer. any_fxn_pointer is
implemented so that it can store one of three different pointer types (a void data pointer, a function pointer,
or a member function pointer) whose size is no greater than that of a pointer to a member function of a multiply inherited class. Only one pointer type is stored at any given time. Care has been taken over misalignment, which may cause undefined behavior
according to the C++ standard, by applying a specialized version of the well-known max alignment union trick.
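
The size variation is easy to observe directly. The following sketch, which uses only the standard library, declares member function pointer types for single, multiple and virtual inheritance and checks at compile time whether each fits into a union slot sized like a pointer to a member function of a multiply inherited class. All names here (pointer_slot, fits_in_slot, print_sizes and the helper classes) are hypothetical stand-ins, and the printed sizes depend on the compiler and ABI.

```cpp
#include <cstddef>
#include <iostream>

struct base_a { };
struct base_b { };
struct single_inh : base_a { };
struct multi_inh : base_a, base_b { };
struct virtual_inh : virtual base_a { };

typedef void (single_inh::*mfp_single)();
typedef void (multi_inh::*mfp_multi)();
typedef void (virtual_inh::*mfp_virtual)();

// Hypothetical stand-in for any_fxn_pointer: one slot, three pointer kinds.
union pointer_slot
{
    void const * obj_ptr;
    void (*fxn_ptr)();
    mfp_multi mfn_ptr;
};

// value is true when a pointer of type P can be stored in the slot,
// that is, without falling back to heap allocation.
template<typename P>
struct fits_in_slot
{
    static const bool value = sizeof(P) <= sizeof(pointer_slot);
};

// Print the sizes seen on the current platform; results vary by ABI.
void print_sizes()
{
    std::cout << "slot:    " << sizeof(pointer_slot) << '\n'
              << "single:  " << sizeof(mfp_single) << '\n'
              << "multi:   " << sizeof(mfp_multi) << '\n'
              << "virtual: " << sizeof(mfp_virtual) << '\n';
}
```

On ABIs where all member function pointers have the same size, every type fits; on ABIs where the virtual inheritance case is larger, fits_in_slot<mfp_virtual>::value is false and a fallback path is needed.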

void hello(int, float) { }
typedef void (*MyFxn)(int, float);

struct foobar
{
    void foo(int, float) { }
};
typedef void (foobar::*MyMfn)(int, float);


void test1(any_fxn_pointer any)
{
    ( *reinterpret_cast<MyFxn>( any.fxn_ptr ) )( 1, 1.0f );
}

void test2(any_fxn_pointer any, foobar * pfb)
{
    ( pfb->*reinterpret_cast<MyMfn>( any.mfn_ptr ) )( 1, 1.0f );
}

int main()
{
    any_fxn_pointer any;
    any.fxn_ptr = reinterpret_cast<generic_fxn *>( &hello );

    test1( any );

    foobar fb;
    any.mfn_ptr = reinterpret_cast<generic_mfn_ptr>( &foobar::foo );
    test2( any, &fb );
}

When the size of a member function pointer is greater than any_fxn_size,
for example when storing a pointer to a member function of a virtually inherited class in MSVC, it is stored by allocating
heap memory in the same way that Boost.Function does, as a non-fast delegate. In real-world practice, however, virtual inheritance
is rarely used.

template<typename UR, typename U, typename T>
void bind(UR (U::*fxn)(int, float), T t)
{
    // Choose the storage strategy at compile time from the pointer size.
    struct select_stub
    {
        typedef UR (U::*TFxn)(int, float);
        typedef typename boost::ct_if<( sizeof( TFxn ) <= any_fxn_size ),
            typename impl_class::fast_mfn_delegate,   // fits: no heap allocation
            typename impl_class::normal_mfn_delegate  // too large: heap allocation
        >::type type;
    };

    select_stub::type::bind( *this, fxn, t );
}
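
The same compile-time selection can be written without Boost using std::conditional from C++11, which is the standard library analogue of boost::mpl::if_c. The sketch below is self-contained and makes several assumptions: slot_size is an arbitrary inline capacity chosen for the example (FD.Delegate derives any_fxn_size from its alignment union instead), and fast_storage, heap_storage, select_storage, widget and oversized are hypothetical names.

```cpp
#include <cstddef>
#include <type_traits>

// Assumed inline capacity for this sketch only.
const std::size_t slot_size = 2 * sizeof(void *);

// Tag types standing in for the two storage strategies.
struct fast_storage { };   // pointer is kept inline, no heap allocation
struct heap_storage { };   // pointer is too large, allocate on the heap

// Pick a strategy at compile time from the size of the stored type.
template<typename Stored>
struct select_storage
{
    typedef typename std::conditional<
        (sizeof(Stored) <= slot_size),
        fast_storage,
        heap_storage
    >::type type;
};

// A member function pointer type of a plain class, for illustration.
struct widget { void on_event(int, float) { } };
typedef void (widget::*widget_mfn)(int, float);

// A deliberately large type that can never fit in the slot.
struct oversized { char pad[8 * sizeof(void *)]; };
```

Here select_storage<oversized>::type is heap_storage on any platform, while the result for widget_mfn depends on the size the ABI gives to member function pointers. The selection costs nothing at run time because it is resolved entirely by the type system.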

Part B: using any_fxn_pointer

any_fxn_pointer is used together with a class template to store an arbitrary function pointer. This erases the type while the function is stored and allows the original type to be restored safely when required
later. Sergey Ryazanov demonstrated in his article that a C++ standard compliant fast delegate for member
functions can be implemented using a class template with a non-type member function template parameter. The sample below gives a rough idea of how it was implemented.

class delegate
{
    typedef void (*invoke_stub)(void const *, int);

    void const * obj_ptr_;
    invoke_stub stub_ptr_;

    template<typename T, void (T::*Fxn)(int)>
    struct mem_fn_stub
    {
        static void invoke(void const * obj_ptr, int a0)
        {
            T * obj = static_cast<T *>( const_cast<void *>( obj_ptr ) );
            (obj->*Fxn)( a0 );
        }
    };

    template<typename T, void (T::*Fxn)(int) const>
    struct mem_fn_const_stub
    {
        static void invoke(void const * obj_ptr, int a0)
        {
            T const * obj = static_cast<T const *>( obj_ptr );
            (obj->*Fxn)( a0 );
        }
    };

    template<void (*Fxn)(int)>
    struct function_stub
    {
        static void invoke(void const *, int a0)
        {
            (*Fxn)( a0 );
        }
    };

public:
    delegate() : obj_ptr_( 0 ), stub_ptr_( 0 ) { }

    template<typename T, void (T::*Fxn)(int)>
    void from_function(T * obj)
    {
        obj_ptr_ = const_cast<T const *>( obj );
        stub_ptr_ = &mem_fn_stub<T, Fxn>::invoke;
    }

    template<typename T, void (T::*Fxn)(int)