C++11 – A Glance [part 2 of n]: move semantics and perfect forwarding

Posted on 5 November 2017

21 Jan 2012

Introduction

We covered a few features in the first part, "C++11 – A Glance [part 1 of n]". In this part, let's glance at the following features:

Feature                                  Intent                               VS2010 status
Strongly typed enums                     Type safety improvement              no
Rvalue references                        Performance improvement              yes
Move semantics and Perfect forwarding    Performance improvement              yes
long long                                Performance improvement              yes
Override control                         Usability improvement                yes
Preventing narrowing                     Performance improvement              no
Range based for-loops                    Usability & Performance improvement  no

    Strongly typed enums:

    As Stroustrup put it, “C enumerations constitute a curiously half-baked concept”, and very few modifications have been made to rectify their shortfalls, so silent behavioral changes can creep in. Let's check an example to support this:

    #include <iostream>
    using namespace std;

    namespace DefenceNetwork
    { 
       namespace WeatherMonitor
       { 
         enum CycloneWarningLevels
         { 
            GREY, // Medium winds
            RED, // High speed winds
            YELLOW, // Very High speed winds
            GREEN, // Severe Cyclone
            BLUE // Very Severe Cyclone
         };
       }
       namespace ThreatMonitor
       { 
         enum AlertConditions
         { 
            BLUE, // Lowest readiness
            GREEN, // Increased intelligence watch
            YELLOW, // Increase in force readiness
            RED, // Next step to nuclear war
            GREY // Nuclear war is imminent
         };
       }
     }
    using namespace DefenceNetwork;
    void SetDEFCONLevel(int value)
    {
       // ....before setting level ...let's check the weather once
       using namespace WeatherMonitor;
       // ...... 
       // .... OK, all is fine! Let's go ahead with setting the DEFCON level
       if(value >= RED)
       { 
          cout<<"Nuclear war is imminent...All Missiles GO...GO...GO"<<endl;
       }
       // Here, the confusion is that AlertConditions::GREEN and CycloneWarningLevels::RED have the
       // same value (1), and 'using namespace WeatherMonitor' made RED refer to the weather enum.
     }
    int main()
    { 
       using namespace ThreatMonitor;
       // Set level to Increased intelligence watch
       SetDEFCONLevel( AlertConditions::GREEN );
       
      // Oh no... what have I done? How did the war start?
      // Hopefully the real thing is coded better than this... but hey, can we take a chance?
    }

    The problems with enums so far are:

    1. They can be silently converted to int.
    In the above example, the SetDEFCONLevel( ) method needs an int, and when we pass an
    enumerator it is happily accepted.

    2. The enumerators of the enum are exported to the scope in which the enum is defined,
    thus causing name clashes and surprises.
    In the above case, see the surprises for yourself.

    3. They have an implementation-defined underlying type, and their type cannot be specified by
    the developer, leading to confusion, compatibility issues etc.
    Let's visualize this via another example:

    Take a case where we want to check if a number is present between certain pre-defined 
    intervals. As we have fixed intervals, we can go for an enum to define them.

    enum INTERVALS
    { 
       INTERVAL_1 = 10,
       INTERVAL_2 = 100,
       INTERVAL_3 = 1000,
       INTERVAL_4 = 0xFFFFFFFFU // we intend to specify a large unsigned number '4294967295'
    };
    
    unsigned long longValue = 0xFFFFFFFFU; // decimal '4294967295'
    INTERVALS enumMyInterval = (INTERVALS)0xFFFFFFFFU;
    if( enumMyInterval > longValue)
    { 
      cout<<"Invalid interval - Out of range"<<endl;
    }

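    I bet the validation in the above case will not be true: with a compiler that gives the enum an underlying type of int (the behavior described here for VS2010), enumMyInterval is never treated as the unsigned value 4294967295 and holds -1 instead. Before C++11 there is no portable way to force INTERVAL_4 to be an unsigned int, because the choice of underlying type is left to the compiler.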

    Keeping all this in mind, C++11 gives us what is called enum class, a strongly typed enum.
    NOTE: VS2010 does not support this feature... hopefully VS2011 will.

    Let us quickly glance at the new syntax:

    1. With C++11, the enumerators are no longer exported to the surrounding scope and require a scope identifier

    enum class CycloneWarningLevels // note the keyword class after enum keyword
    { GREY, RED, YELLOW, GREEN, BLUE };
    
    // To access BLUE enumerator of CycloneWarningLevels enum we should use CycloneWarningLevels::BLUE

    2. The underlying type is int by default, but C++11 gives us the option to specify the type

    enum class INTERVALS : unsigned long
    { 
       INTERVAL_1 = 10, INTERVAL_2 = 100, INTERVAL_3 = 1000, INTERVAL_4 = 0xFFFFFFFFU
    };
    //  INTERVAL_4 now keeps the unsigned value 4294967295 instead of becoming -1

    3. Enumerators are strongly typed and no longer implicitly convertible to int.

    SetDEFCONLevel( AlertConditions::GREEN ); // will not compile as this method requires int 
                                              // and we are passing enumerator 

    4. Forward declarations are now possible

    enum class INTERVALS : unsigned long; // (forward) declaration
    void foo(INTERVALS* IntervalsEnum_in) // use of forward declaration
    { /* ...*/ }
    
    // Actual definition enum class INTERVALS : unsigned long { INTERVAL_1, ..... }; 
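The four points above can be tied together in one small sketch. The enum names mirror the article's examples, while `ToNumber`, `IsImminent` and `DemoScopedEnums` are hypothetical helpers introduced only for illustration. It needs a compiler with enum class support, which VS2010 lacks.

```cpp
#include <cassert>
#include <type_traits>

// Scoped enum with an explicitly chosen underlying type.
enum class INTERVALS : unsigned long
{
    INTERVAL_1 = 10,
    INTERVAL_2 = 100,
    INTERVAL_3 = 1000,
    INTERVAL_4 = 0xFFFFFFFFU
};

// Two scoped enums may reuse the same enumerator names without clashing.
enum class CycloneWarningLevels { GREY, RED, YELLOW, GREEN, BLUE };
enum class AlertConditions { BLUE, GREEN, YELLOW, RED, GREY };

// The underlying type is exactly the one that was requested.
static_assert(std::is_same<std::underlying_type<INTERVALS>::type, unsigned long>::value,
              "INTERVALS is stored as unsigned long");

// Hypothetical helper: a conversion to an integer must now be spelled out.
unsigned long ToNumber(INTERVALS value)
{
    return static_cast<unsigned long>(value);
}

// Hypothetical helper: only an AlertConditions value is accepted, so passing
// CycloneWarningLevels::RED (or a plain int) would be a compile-time error.
bool IsImminent(AlertConditions level)
{
    return level == AlertConditions::GREY;
}

void DemoScopedEnums()
{
    assert(ToNumber(INTERVALS::INTERVAL_4) == 4294967295UL);  // stays unsigned
    assert(!IsImminent(AlertConditions::GREEN));
    assert(IsImminent(AlertConditions::GREY));
}
```

Calling `DemoScopedEnums()` from `main` exercises all three checks. Uncommenting a call such as `IsImminent(CycloneWarningLevels::RED)` should make the build fail, which is exactly the safety net the unscoped enums lacked.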

    Rvalue references:

    If you are not familiar with lvalues and rvalues, please have a glance at "The Notion of Lvalues and Rvalues". It will surely help you in understanding this feature.

    To handle certain scenarios, C++ compilers at times silently create temporaries that can seriously hurt the performance of the code. As compilers evolved, a few of these temporaries were eliminated, but many more slipped through, leading to relatively inefficient programs. Let's see what I mean:

    vector<double> GetNDimensionalVector()
    { 
       vector<double> vecTransMatrix;
       // Do some computations ....Populate vecTransMatrix
       vecTransMatrix.push_back(10.5);
       vecTransMatrix.push_back(1.3);
       //......
       return vecTransMatrix;
    } 
    
    int _tmain(int argc, _TCHAR* argv[])
    { 
       vector<double> vecNewMatrix = GetNDimensionalVector();
       // work upon this new matrix
       size_t size = vecNewMatrix.size();
    } 

    If we analyze this code, the GetNDimensionalVector( ) method creates a vector of, say, 10 doubles, which requires 10*sizeof(double) bytes. While returning, a compiler (prior to VS03, for example) will create another copy. Recent compilers close this hole via Return Value Optimization (RVO). Where the copy cannot be elided, however (for example when the result is assigned to a vector that already exists), the call to GetNDimensionalVector( ) copies all of its content again into the destination vector, upon which further operations are done, and the result of the call evaporates after the ';' because it is a temporary. What a pity! We waste a lot of memory and time instead of just pilfering the data inside the temporary vector.

    A smart language should allow this. And this is exactly what they have provided us through Rvalue reference.

    The '&&' token identifies a reference as an "rvalue reference" and distinguishes it from the existing (lvalue) reference '&'.

    Let's see a function with an rvalue reference:

    void PrintData(string& str) { }  // function with lvalue ref 
    void PrintData(string&& str) { } // function with rvalue ref 
    
    string str = "Hello C++11 world";
    PrintData(str);                   // calls the lvalue ref overload as the argument is an lvalue 
    PrintData( "Hello C++11 world" ); // calls the rvalue ref overload: the temporary string built from the
                                      // literal is an rvalue, so its data can be transferred efficiently

    This feature resulted in the possibility of supporting 'Move semantics' and 'Perfect forwarding'.
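To make the overload selection visible, here is a minimal sketch in which each overload reports which one was chosen. `WhichOverload` and `DemoOverloads` are hypothetical names, not part of the original example.

```cpp
#include <cassert>
#include <string>

// Chosen when the argument is an lvalue (a named, addressable object).
std::string WhichOverload(std::string&)  { return "lvalue"; }

// Chosen when the argument is an rvalue (a temporary about to vanish).
std::string WhichOverload(std::string&&) { return "rvalue"; }

void DemoOverloads()
{
    std::string str = "Hello C++11 world";

    assert(WhichOverload(str) == "lvalue");                  // named object
    assert(WhichOverload(std::string("temp")) == "rvalue");  // explicit temporary
    assert(WhichOverload(str + "!") == "rvalue");            // result of an expression
    assert(WhichOverload("literal") == "rvalue");            // temporary string built from the literal
}
```

Note that the string literal itself is not what binds to `string&&`. The compiler first builds a temporary `std::string` from it, and that temporary is the rvalue.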

    Move semantics and Perfect forwarding:

    Implementing move semantics significantly increases performance, because the resources of a temporary object (which cannot be referenced elsewhere in the program as it is about to evaporate) can be pilfered instead of copied.
    To get a better understanding, take the case where a vector needs more capacity and no contiguous memory is available next to it. The vector then finds a memory location large enough to hold its old contents plus the required (new) capacity, and copies all the old contents to this new location. This call to the copy constructor is expensive if the contents are a 'string' or a heavy-duty class/structure. The pity is that all the contents at the old location then evaporate. How nice it would be if this operation just stole the old contents instead of copying them.

    I hope that makes the idea clear.
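The vector growth scenario can be observed with a small counting type. This is only a sketch: `Tracked` and `DemoVectorGrowth` are hypothetical names, and the `noexcept` on the move constructor (which lets `std::vector` prefer moving during reallocation) needs a compiler newer than VS2010.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts how it is transferred.
struct Tracked
{
    static int copies;
    static int moves;

    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    // noexcept tells std::vector that moving is safe during reallocation.
    Tracked(Tracked&&) noexcept { ++moves; }
};

int Tracked::copies = 0;
int Tracked::moves = 0;

void DemoVectorGrowth()
{
    Tracked::copies = 0;
    Tracked::moves = 0;

    std::vector<Tracked> items;
    for (int i = 0; i < 100; ++i)
        items.push_back(Tracked());  // the temporary is moved in; growth moves the old elements

    assert(Tracked::copies == 0);    // nothing was copied
    assert(Tracked::moves >= 100);   // at least one move per inserted element
}
```

Removing the move constructor from `Tracked` turns every one of those transfers back into a copy, which is the cost the paragraph above describes.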

    Please note that a copy operation leaves the source unchanged, while a move operation may leave the source either unchanged or radically different. A developer who chooses to move from an object should no longer rely on the state of the source object [its state has been disturbed and it is no longer useful as a source of data]. 
    If the intention is to keep using the source along with the duplicate, then copy (as has been done till now) instead of moving.

    Before going to the implementation part, just check these points:

    1. To implement move semantics we typically provide a MOVE constructor and (optionally) a
    MOVE assignment operator.

    2. VS2010 won't provide a default move constructor if we don't write one. (The final C++11
    standard generates one implicitly only when the class declares no copy constructor, copy
    assignment operator, move assignment operator or destructor.)

    3. Declaring a move constructor stops the compiler from generating the default constructor.

    Now let's go to the implementation part:

    class MyFileStream
    { 
       unsigned char* m_uchBuffer;
       unsigned int m_uiLength;
    
      public:
      // constructor
      MyFileStream(unsigned int Len) : m_uiLength(Len), m_uchBuffer(new unsigned char[Len]) {}
    
     // Copy constructor
      MyFileStream(const MyFileStream& FileStream_in) { /* ....*/ }
    
     // Assignment operator
      MyFileStream& operator =(const MyFileStream& FileStream_in)
      { 
         /* ....*/
         return *this;
      }
     
     // Move constructor
      MyFileStream(MyFileStream&& FileStream_in) : m_uiLength( FileStream_in.m_uiLength ),  
       /* assign source data to the current object*/ m_uchBuffer( FileStream_in.m_uchBuffer )
      { 
         // Set the source data to default
         // This is necessary to avoid crashes from multiple deletions
         FileStream_in.m_uiLength = 0;
         FileStream_in.m_uchBuffer = NULL;
    
         // Ha ha ha ....We have successfully stolen source data
      }
    
      // Move assignment operator
      MyFileStream& operator =(MyFileStream&& FileStream_in)
      { 
         // A good developer always checks for self-assignment
         if( this != &FileStream_in)
         {  
            // Release the old data
            delete [] m_uchBuffer; // calling delete over NULL ptr is fine
       
            // Assign source data to the current object
            m_uiLength = FileStream_in.m_uiLength;
            m_uchBuffer = FileStream_in.m_uchBuffer;
      
            // Set the source data to default
            // This is necessary to avoid crashes from multiple deletions
            FileStream_in.m_uiLength = 0;
            FileStream_in.m_uchBuffer = NULL;
         } 
         // We have successfully pilfered the source data
         return *this;
      }
    
     //Destructor
     ~MyFileStream()
     { 
        if( NULL != m_uchBuffer) delete [] m_uchBuffer;
        m_uchBuffer = NULL;
     }
    
    };
    
    MyFileStream GetMyStream(MyFileStream FileStream) // not efficient to take argument 
                                              //as value ... but hey just for explanation sake
    { 
       return FileStream;
    }
    
    int _tmain(int argc, _TCHAR* argv[])
    { 
       MyFileStream objMyStream(100);
       MyFileStream objMyStream2 = GetMyStream( objMyStream ); 
       // Above, while initializing objMyStream2 from the return of GetMyStream(..), which is an rvalue,
       // the MyFileStream move constructor is invoked and the data is pilfered
    }
     

    The comments in the sample code above are pretty much self-explanatory. Just note that the move constructor and the move assignment operator take their argument without const, as we intend to modify it (we reset the source to default once its content has been moved to the target).
    There are many more points to grasp about this feature, but they are out of scope for this introductory part. I will cover just one more scenario and wind up this section.
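Because the copy operations in the sample above are elided, here is a compact, compilable variant that can be run as-is. It is a sketch under assumptions: `Buffer`, `Length`, `Data` and `DemoBufferMove` are hypothetical names, and the copy operations show one reasonable implementation rather than the author's.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical, compilable counterpart of MyFileStream.
class Buffer
{
    unsigned char* m_data;
    std::size_t    m_length;

public:
    explicit Buffer(std::size_t length)
        : m_data(new unsigned char[length]()), m_length(length) {}

    // Copy constructor: allocates new storage and duplicates every byte.
    Buffer(const Buffer& other)
        : m_data(new unsigned char[other.m_length]), m_length(other.m_length)
    {
        std::copy(other.m_data, other.m_data + other.m_length, m_data);
    }

    // Move constructor: takes over the pointer and leaves the source empty.
    Buffer(Buffer&& other) : m_data(other.m_data), m_length(other.m_length)
    {
        other.m_data = nullptr;
        other.m_length = 0;
    }

    // Copy assignment: copy first, then swap the copy's storage into *this.
    Buffer& operator=(const Buffer& other)
    {
        if (this != &other)
        {
            Buffer copy(other);
            std::swap(m_data, copy.m_data);
            std::swap(m_length, copy.m_length);
        }
        return *this;
    }

    // Move assignment: release our storage, then steal the source's.
    Buffer& operator=(Buffer&& other)
    {
        if (this != &other)
        {
            delete[] m_data;
            m_data = other.m_data;
            m_length = other.m_length;
            other.m_data = nullptr;
            other.m_length = 0;
        }
        return *this;
    }

    ~Buffer() { delete[] m_data; }  // deleting a null pointer is harmless

    std::size_t Length() const { return m_length; }
    const unsigned char* Data() const { return m_data; }
};

void DemoBufferMove()
{
    Buffer source(100);
    const unsigned char* original = source.Data();

    // Casting to an rvalue reference selects the move constructor.
    Buffer target(static_cast<Buffer&&>(source));

    assert(target.Data() == original);  // same allocation, nothing was copied
    assert(target.Length() == 100);
    assert(source.Data() == nullptr);   // the source was reset
    assert(source.Length() == 0);
}
```

The pointer comparison in `DemoBufferMove` is the interesting part: the target ends up owning the very same allocation, which shows that no bytes were duplicated.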

    In the above example, the MyFileStream class has members of built-in types only. What if a member is of some other class type, say MyString?

    class MyString
    { 
     public:
       // constructor
       MyString(){}
    
       // Copy constructor
       MyString(const MyString& String_in){ }
    
      // Move constructor
      MyString(MyString&& String_in){ }
      
     // Assignment operator
      MyString& operator=(const MyString& String_in){ return *this; }
    
      // Move assignment operator
      MyString& operator=(MyString&& String_in){ /* ......*/  return *this; }
    };
     

    And our MyFileStream class has this as a member

    class MyFileStream
    { 
       unsigned char* m_uchBuffer;
       unsigned int m_uiLength;
       MyString m_strFileName; // new member of type MyString 
    
       // ....... 
    };

    Now, how do we steal this MyString object's data efficiently? Or, to rephrase it, how do we fit this class into our move culture?

    Will initializing the MyString member from MyFileStream's move constructor automatically call the MyString MOVE constructor? Of course NOT.

    Can you see why not? It's simple: FileStream_in has a name, so inside the move constructor it (and its m_strFileName member) is an lvalue, and hence the copy constructor is called.

    So what is the work-around? Simple. Convert this lvalue to an rvalue!!! Bingo.
    Now how to do this?
    We can convert an lvalue to an rvalue by using static_cast:

    m_strFileName = static_cast<MyString&&>( FileStream_in.m_strFileName );

    Or another way is to use std::move (again a new standard library function provided in C++11):

    m_strFileName = std::move( FileStream_in.m_strFileName );

    The move constructor then becomes:

    // Move constructor
      MyFileStream(MyFileStream&& FileStream_in) : m_uiLength( FileStream_in.m_uiLength ),  
      /* assign source data to the current object*/ m_uchBuffer( FileStream_in.m_uchBuffer ),
                 m_strFileName( std::move( FileStream_in.m_strFileName) ) // std::move usage
                 // or 
                 // m_strFileName( static_cast<MyString&&>(FileStream_in.m_strFileName) )
                                                                // static_cast usage
      { 
         // Set the source data to default
         // This is necessary to avoid crashes from multiple deletions
         FileStream_in.m_uiLength = 0;
         FileStream_in.m_uchBuffer = NULL;
    
         // No need to reset 'FileStream_in.m_strFileName' here: std::move( ) only casts to an
         // rvalue, and MyString's move constructor is responsible for emptying its source
    
         // Ha ha ha ....We have successfully stolen source data
      }
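The difference between forgetting and using std::move on a member can be measured directly. In this sketch `CountingString` stands in for MyString, and `CopiesMember`, `MovesMember` and `DemoMemberMove` are hypothetical names used only for the demonstration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for MyString that records which constructor ran.
struct CountingString
{
    static int copies;
    static int moves;

    CountingString() {}
    CountingString(const CountingString&) { ++copies; }
    CountingString(CountingString&&) { ++moves; }
};

int CountingString::copies = 0;
int CountingString::moves = 0;

// Forgets std::move: 'other.name' is an lvalue, so the member is copied.
struct CopiesMember
{
    CountingString name;
    CopiesMember() {}
    CopiesMember(CopiesMember&& other) : name(other.name) {}
};

// Uses std::move: the member's move constructor is selected.
struct MovesMember
{
    CountingString name;
    MovesMember() {}
    MovesMember(MovesMember&& other) : name(std::move(other.name)) {}
};

void DemoMemberMove()
{
    CountingString::copies = 0;
    CountingString::moves = 0;

    CopiesMember a;
    CopiesMember b(std::move(a));  // outer move, but the member is still copied
    (void)b;
    assert(CountingString::copies == 1 && CountingString::moves == 0);

    MovesMember c;
    MovesMember d(std::move(c));   // outer move, and the member is moved too
    (void)d;
    assert(CountingString::copies == 1 && CountingString::moves == 1);
}
```

Both outer objects are move-constructed, yet only the second one moves its member. The value category has to be restored at every level, not just at the call site.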

    Perfect forwarding:
    Another nice effect of rvalue references is a solution to the forwarding problem.
    Before going any further, let's grasp this forwarding problem. 
    Suppose we have two structures handling licence operations, OpenLicence and ClosedLicence, and suppose we want to do some master check before creating an object of either structure. We can then use a wrapper template function that performs this master check and simply passes (forwards) the arguments to the structures.

    struct OpenLicence
    { 
       OpenLicence(int& Key1, int& Key2){}
     };
     struct ClosedLicence
     { 
        ClosedLicence(int& Key1, int& Key2){}
     }; 
    
     template<typename T, typename X, typename Y>
     T* Licence_Wrapper(X& x, Y& y)
     { 
        // Do some master check and if all is well forward arguments to appropriate objects
        return new T(x, y);
     } 
     
     int main()
     { 
       int key1 = 232; int key2 = 007;
       Licence_Wrapper<OpenLicence>( key1, key2 ); // Fine. 
                                               // This will pass as both arguments are lvalues
     
       Licence_Wrapper<OpenLicence>( key1, 007 ); // Error.
                                                  //As the second argument is an Rvalue
     }
    

    Now, to solve this we have to overload our wrapper function and also our structures:

    struct OpenLicence
    { 
       OpenLicence(int& Key1, int& Key2){}
    
       OpenLicence(int& Key1, const int& Key2){} // second argument is const ref
     };
     struct ClosedLicence
     { 
        ClosedLicence(int& Key1, int& Key2){}
    
        ClosedLicence(int& Key1, const int& Key2){} // second argument is const ref
     }; 
    
     template<typename T, typename X, typename Y> // existing function
     T* Licence_Wrapper(X& x, Y& y)
     { 
        // Do some master check and if all is well forward arguments to appropriate objects
        return new T(x, y);
     } 
    
     template<typename T, typename X, typename Y> // overload function
     T* Licence_Wrapper(X& x, const Y& y)   // second argument is const ref
     { 
        // Do some master check and if all is well forward arguments to appropriate objects
        return new T(x, y);
     } 
    

    Now what if the first argument is an rvalue ( Licence_Wrapper<OpenLicence>( 007, key2 ) ), or what if both are rvalues ( Licence_Wrapper<OpenLicence>( 006, 007 ) )? 
    To handle these we need that many more overloads, and more arguments lead to even more overloads. Our code gets pumped full of overloads to handle all these cases 
    ..... Welcome to the forwarding problem.
    Rvalue references, together with std::forward, solve this in one stroke:

    template<typename T, typename X, typename Y>
     T* Licence_Wrapper(X&& x, Y&& y)
     { 
        // Do some master check and if all is well forward arguments to appropriate objects.
        // std::forward (from <utility>) passes each argument on as the lvalue or rvalue it
        // originally was; T still needs a constructor that accepts that combination.
        return new T(std::forward<X>(x), std::forward<Y>(y));
     } 

    That's it. No more wrapper overloads are needed anywhere. This is called PERFECT FORWARDING. Really perfect, isn't it? There is much more to discuss on this topic, but as this is an introductory article I won't cover it here.
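As a rough illustration of what "perfect" means here, the sketch below forwards a single argument and lets the target report what it received. `Licence`, `MakeChecked`, `MakeWithoutForward` and `DemoForwarding` are hypothetical names, and std::forward from `<utility>` is the standard tool that restores the original value category.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical target type whose constructors record what they received.
struct Licence
{
    std::string received;
    explicit Licence(int&)       : received("lvalue") {}
    explicit Licence(const int&) : received("const lvalue") {}
    explicit Licence(int&&)      : received("rvalue") {}
};

// One wrapper for every case: X&& on a deduced parameter binds to lvalues and
// rvalues alike, and std::forward passes the argument on unchanged.
template<typename T, typename X>
T MakeChecked(X&& x)
{
    // A master check would go here.
    return T(std::forward<X>(x));
}

// Same wrapper without std::forward: 'x' has a name, so it is always an lvalue.
template<typename T, typename X>
T MakeWithoutForward(X&& x)
{
    return T(x);
}

void DemoForwarding()
{
    int key = 232;
    const int fixedKey = 7;

    assert(MakeChecked<Licence>(key).received == "lvalue");
    assert(MakeChecked<Licence>(fixedKey).received == "const lvalue");
    assert(MakeChecked<Licence>(7).received == "rvalue");

    // The rvalue-ness of 7 is lost when std::forward is left out.
    assert(MakeWithoutForward<Licence>(7).received == "lvalue");
}
```

The last assertion is the reason std::forward exists: a wrapper that merely takes `X&&` compiles for every argument, but only the forwarding version hands the target the same kind of value the caller supplied.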

    long long:

    long long is an integer type of at least 64 bits. Prior to C++11, the largest standard integer type was long, and its size is platform dependent (32 or 64 bits). long long, however, is guaranteed to be at least 64 bits wide. This type was actually accepted in C99, and as many compilers already supported it, the C++11 committee gave a thumbs up to this new integral type.
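A short sketch of the guarantee in practice. `MillisecondsInDays` and `DemoLongLong` are hypothetical helpers chosen only because the result no longer fits in a 32-bit int.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// The standard guarantees at least 64 bits for long long.
static_assert(sizeof(long long) * CHAR_BIT >= 64, "long long has at least 64 bits");

// Hypothetical helper: the LL suffix makes the whole multiplication happen in long long.
long long MillisecondsInDays(int days)
{
    return 86400000LL * days;  // 24 * 60 * 60 * 1000 milliseconds per day
}

void DemoLongLong()
{
    // 2592000000 exceeds the largest 32-bit signed int (2147483647).
    assert(MillisecondsInDays(30) == 2592000000LL);
    assert(std::numeric_limits<long long>::max() >= 9223372036854775807LL);
}
```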

    Override control:

    Suppose we have a base class with a virtual function. In any of its derived classes this function can be overridden, and no special keyword or annotation is needed on the function to do so. To add clarity and state that we are overriding a base class function, C++11 introduced a new specifier called override. A declaration marked 'override' is only valid if there is a function to override. This feature ships in VS2010. Let's see an example:

     class Base
    {
      public:
        virtual void Draw(){}
        void SomeFunction(){}
    };
    
    class Derived : public Base
    {
     public:
        void Draw() override {} // Fine. With the override specifier
                                // we clearly state that we are overriding a 
                                // base virtual function
    
        void SomeFunction() override {}  // Error. Not possible as SomeFunction() 
                                         // is not a virtual function in Base
    }; 
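A common mistake that override catches is a signature that almost matches. The sketch below uses hypothetical `Shape` and `Circle` classes, with the failing declaration left commented out so the block still compiles.

```cpp
#include <cassert>
#include <string>

class Shape
{
public:
    virtual ~Shape() {}
    virtual std::string Draw() const { return "Shape"; }
};

class Circle : public Shape
{
public:
    // Matches the base signature exactly, so this really overrides.
    std::string Draw() const override { return "Circle"; }

    // Dropping 'const' would declare a new, unrelated function. With
    // 'override' the compiler rejects it instead of silently hiding the bug:
    // std::string Draw() override { return "Circle"; }  // error: overrides nothing
};

// Calls through the base interface, so the virtual dispatch is exercised.
std::string DrawThroughBase(const Shape& shape)
{
    return shape.Draw();
}

void DemoOverride()
{
    Circle circle;
    Shape shape;
    assert(DrawThroughBase(circle) == "Circle");
    assert(DrawThroughBase(shape) == "Shape");
}
```

Without the override specifier, the non-const version would compile and `DrawThroughBase(circle)` would quietly return "Shape".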

    Preventing narrowing:

    Before C++11, an initialization like the following silently truncates the value:

    int main()
    {
       int pi = {3.14}; // Here 3.14 is truncated to 3
       
    } 

    To prevent this type of undesired conversion, C++11 defines that {} initialization does not allow truncation or narrowing. As per this rule:

    int main()
    {
       int pi = {3.14}; // Error. narrowing
       int i{5.112};    // Error. narrowing
    }

    Even a conversion from 3.0 to 3 is considered narrowing, and {} initialization reports an error for it. This feature too is omitted from VS2010.
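For contrast, here is a sketch of initializations that {} does accept, together with the explicit cast to use when truncation is really intended. `ToWholeNumber` and `DemoNarrowing` are hypothetical names, and the comparison with 'A' assumes an ASCII-compatible character set.

```cpp
#include <cassert>

// Hypothetical helper: truncation is written out, so the intent is visible.
int ToWholeNumber(double value)
{
    int whole{ static_cast<int>(value) };  // explicit cast, so nothing narrows implicitly
    return whole;
}

void DemoNarrowing()
{
    int  exact{ 3 };      // OK: int to int
    long wide{ exact };   // OK: long can hold every int value
    char letter{ 65 };    // OK: a constant that fits in char

    // int  bad{ 3.14 };  // error: double to int is narrowing
    // char big{ 300 };   // error: 300 does not fit in an 8-bit char

    assert(ToWholeNumber(3.14) == 3);
    assert(wide == 3);
    assert(letter == 'A');  // assumes an ASCII-compatible character set
}
```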

    Range based for-loops:

    Before 'auto', iterating over a vector required typing a lot of code:

    for(vector<int>::iterator itr = vec.begin(); itr != vec.end(); itr++) { }

    But after 'auto' life became easy

    for(auto itr = vec.begin(); itr != vec.end(); itr++) { }

    C++11 simplifies this kind of traversal even further by providing what is called range-for support. Now all we have to do is:

    for( auto val : vec ) { cout <<val<<endl; }

    This reads as: for each value val in vec, from begin to end.

    This feature works for C-style arrays and for all types that support iteration via begin and end functions. This feature is also omitted from VS2010.
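A few variations are worth seeing side by side: reading by value, modifying through a reference, and looping over a built-in array. The function names in this sketch are hypothetical.

```cpp
#include <cassert>
#include <vector>

// By-value loop variable: each element is copied, which is fine for reading ints.
int Sum(const std::vector<int>& values)
{
    int total = 0;
    for (auto value : values)
        total += value;
    return total;
}

// Reference loop variable: the elements are modified in place.
void DoubleAll(std::vector<int>& values)
{
    for (auto& value : values)
        value *= 2;
}

// Built-in arrays work too, because their size is known to the compiler.
int SumOfSmallPrimes()
{
    int primes[] = { 2, 3, 5, 7 };
    int total = 0;
    for (int prime : primes)
        total += prime;
    return total;
}

void DemoRangeFor()
{
    std::vector<int> vec;
    vec.push_back(1);
    vec.push_back(2);
    vec.push_back(3);

    assert(Sum(vec) == 6);
    DoubleAll(vec);
    assert(Sum(vec) == 12);
    assert(SumOfSmallPrimes() == 17);
}
```

For element types that are expensive to copy, `const auto&` is usually the better choice for a read-only loop, since `auto` alone copies each element.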

    Rest of the features will be covered in the next part.

    Thank you for reading this article. It would be helpful if you rate/send feedback, so that I can improve while working on the remaining parts or updating this part with new information.

    Acknowledgments

    Thanks again to Clement Emerson for his views and review.

    Other sources

    http://www2.research.att.com/~bs/C++0xFAQ.html
    http://www.open-std.org/jtc1/sc22/wg21/docs/papers/

    History

    January 13 2011 : Added Part-2 as a continuation to "C++11 – A Glance [part 1 of n]"

    January 21 2012 : Corrected few broken links [no additional information]
