[MSDN Digest] What You Need to Know to Move from C++ to C#


Posted July 12, 2012 under General

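[Note from 打不死的猫:
The test programs referenced in this article can be downloaded from:
http://download.microsoft.com/download/2/8/c/28c4ace3-f5ed-4e14-bc64-3d563b807dfb/CtoCsharp.exe
This article is reposted from MSDN. The original was written by the well-known Jesse Liberty and can be found at:
http://msdn.microsoft.com/msdnmag/issues/01/07/ctocsharp/default.aspx]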

What You Need to Know to Move from C++ to C#

Jesse Liberty

This article assumes you're familiar with C++

SUMMARY
C# builds on the syntax and semantics of C++, allowing C++ programmers to take advantage of .NET and the common language runtime. While the transition from C++ to C# should be a smooth one, there are a few things to watch out for, including changes to new, structs, constructors, and destructors. This article explores the language features that are new to C# such as garbage collection, properties, foreach loops, and interfaces. Following a discussion of interfaces, there's a discussion of properties, arrays, and the base class libraries. The article concludes with an exploration of asynchronous I/O, attributes and reflection, type discovery, and dynamic invocation.

Every 10 years or so, developers must devote time and energy to learning a new set of programming skills. In the early 1980s it was Unix and C; in the early 1990s it was Windows® and C++; and today it is the Microsoft® .NET Framework and C#. While this process takes work, the benefits far outweigh the costs. The good news is that with C# and .NET the analysis and design phases of most projects are virtually unchanged from what they were with C++ and Windows. That said, there are significant differences in how you will approach programming in the new environment. In this article I'll provide information about how to make the leap from programming in C++ to programming in C#.
Many articles (for example, Sharp New Language: C# Offers the Power of C++ and Simplicity of Visual Basic) have explained the overall improvements that C# implements, and I won't repeat that information here. Instead, I'll focus on what I see as the most significant change when moving from C++ to C#: going from an unmanaged to a managed environment. I'll also warn you about a few significant traps awaiting the unwary C++ programmer and I'll show some of the new features of the language that will affect how you implement your programs.

Moving to a Managed Environment
C++ was designed to be a low-level platform-neutral object-oriented programming language. C# was designed to be a somewhat higher-level component-oriented language. The move to a managed environment represents a sea change in the way you think about programming. C# is about letting go of precise control, and letting the framework help you focus on the big picture.
For example, in C++ you have tremendous control over the creation and even the layout of your objects. You can create an object on the stack, on the heap, or even in a particular place in memory using the placement operator new.
With the managed environment of .NET, you give up that level of control. When you choose the type of your object, the choice of where the object will be created is implicit. Simple types (ints, doubles, and longs) are always created on the stack (unless they are contained within other objects), and classes are always created on the heap. You cannot control where on the heap an object is created, you can't get its address, and you can't pin it down in a particular memory location. (There are ways around these restrictions, but they take you out of the mainstream.)
You no longer truly control the lifetime of your object. C# has no deterministic destructor. The garbage collector will take your item's storage back sometime after there are no longer any references to it, but finalization is nondeterministic.
The very structure of C# reflects the underlying framework. There is no multiple inheritance and there are no templates because multiple inheritance is terribly difficult to implement efficiently in a managed, garbage-collected environment, and because generics have not been implemented in the framework.
The C# simple types are nothing more than a mapping to the underlying common language runtime (CLR) types. For example, a C# int maps to a System.Int32. The types in C# are determined not by the language, but by the common type system. In fact, if you want to preserve the ability to derive C# objects from Visual Basic® objects, you must restrict yourself further, to the common language subset—those features shared by all .NET languages.
On the other hand, the managed environment and CLR bring a number of tangible benefits. In addition to garbage collection and a uniform type system across all .NET languages, you get a greatly enhanced component-based language, which fully supports versioning and provides extensible metadata, available at runtime through reflection. There is no need for special support for late binding; type discovery and late binding are built into the language. In C#, enums and properties are first-class members of the language, fully supported by the underlying engine, as are events and delegates (type-safe function pointers).
The key benefit of the managed environment, however, is the .NET Framework. While the framework is available to any .NET language, C# is a language that's well-designed for programming with the framework's rich set of classes, interfaces, and objects.

Traps
C# looks a lot like C++, and while this makes the transition easy, there are some traps along the way. Some code that looks perfectly legitimate in C++ won't compile in C#, or worse, will compile but won't behave as expected. Most of the syntactic changes from C++ to C# are trivial (no semicolon after a class declaration, Main is now capitalized). I'm building a Web page which lists these for easy reference, but most of these are easily caught by the compiler and I won't devote space to them here. I do want to point out a few significant changes that will cause problems, however.

Reference and Value Types
C# distinguishes between value types and reference types. Simple types (int, long, double, and so on) and structs are value types, while all classes are reference types, as are Objects. Value types hold their value on the stack, like variables in C++, unless they are embedded within a reference type. Reference type variables sit on the stack, but they hold the address of an object on the heap, much like pointers in C++. Value types are passed to methods by value (a copy is made). With reference types only the reference is copied, so the method works on the caller's object and the effect is that of passing by reference.
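The difference is easiest to see side by side. The following is a minimal sketch, not code from the article: PointValue, PointReference, and the Increment methods are hypothetical names invented for the illustration.

```csharp
using System;

// Hypothetical types for illustration; neither appears in the article.
struct PointValue            // value type: passing it copies the data
{
    public int X;
}

class PointReference         // reference type: passing it copies only the reference
{
    public int X;
}

class ValueVersusReferenceDemo
{
    // Receives a copy of the struct, so the caller's variable is untouched.
    public static void Increment(PointValue p) { p.X++; }

    // Receives a copy of the reference, so it changes the caller's object.
    public static void Increment(PointReference p) { p.X++; }

    static void Main()
    {
        PointValue v = new PointValue();
        PointReference r = new PointReference();

        Increment(v);
        Increment(r);

        // Prints: struct X = 0, class X = 1
        Console.WriteLine("struct X = {0}, class X = {1}", v.X, r.X);
    }
}
```

The two types have identical members; only the struct or class keyword decides which behavior you get.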

Structs
Structs are significantly different in C#. In C++ a struct is exactly like a class, except that the default inheritance and default access are public rather than private. In C# structs are very different from classes. Structs in C# are designed to encapsulate lightweight objects. They are value types (not reference types), so they're passed by value. In addition, they have limitations that do not apply to classes. For example, they are sealed, which means they cannot be derived from or have any base class other than System.ValueType, which is derived from Object. Structs cannot declare a default (parameterless) constructor.
On the other hand, structs are more efficient than classes so they're perfect for the creation of lightweight objects. If you don't mind that the struct is sealed and you don't mind value semantics, using a struct may be preferable to using a class, especially for very small objects.

Everything Derives from Object
In C# everything ultimately derives from Object. This includes classes you create, as well as value types such as int or structs. The Object class offers useful methods, such as ToString. An example of when you use ToString is with the System.Console.WriteLine method, which is the C# equivalent of cout. The method is overloaded to take a string and an array of objects.
To use WriteLine you provide substitution parameters, not unlike the old-fashioned printf. Assume for a moment that myEmployee is an instance of a user-defined Employee class and myCounter is an instance of a user-defined Counter class. If you write the following code

Console.WriteLine("The employee: {0}, the counter value: {1}", myEmployee, myCounter);

WriteLine will call the virtual method Object.ToString on each of the objects, substituting the strings they return for the parameters. If the Employee class does not override ToString, the default implementation (inherited from System.Object) will be called, which will return the name of the class as a string. Counter might override ToString to return its integer value as a string. If so, the output might be:

The employee: Employee, the counter value: 12

What happens if you pass integer values to WriteLine? An int is a value type, not a reference to an object on the heap, but the compiler will implicitly box the int in an instance of Object whose value will be set to the value of the integer. When WriteLine calls ToString, the object will return the string representation of the integer's value (see Figure 1).
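A short sketch of both behaviors follows. It is an illustration under stated assumptions rather than the article's Figure 1: the Employee and Counter classes here are bare-bones stand-ins, and the Counter constructor is invented for the example.

```csharp
using System;

// Stand-in classes for illustration only.
class Employee { }                       // does not override ToString

class Counter
{
    private int count;

    public Counter(int count) { this.count = count; }

    // Overrides Object.ToString to report the counter's value as a string.
    public override string ToString() { return count.ToString(); }
}

class ToStringDemo
{
    static void Main()
    {
        Employee myEmployee = new Employee();
        Counter myCounter = new Counter(12);

        // Employee falls back to Object.ToString, which returns the type name.
        Console.WriteLine("The employee: {0}, the counter value: {1}", myEmployee, myCounter);

        int i = 5;
        object boxed = i;                // the same implicit boxing WriteLine relies on

        // The boxed object still knows it holds a System.Int32.
        Console.WriteLine("The value {0} is boxed as {1}", boxed, boxed.GetType());
    }
}
```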

Reference Parameters and Out Parameters
In C#, as in C++, a method can only have one return value. You overcome this in C++ by passing pointers or references as parameters. The called method changes the parameters, and the new values are available to the calling method.
When you pass a reference into a method, you do have access to the original object in exactly the way that passing a reference or pointer provides you access in C++. With value types, however, this does not work. If you want to pass the value type by reference, you mark the value type parameter with the ref keyword.

public void GetStats(ref int age, ref int ID, ref int yearsServed)

Note that you need to use the ref keyword in both the method declaration and the actual call to the method.

Fred.GetStats(ref age, ref ID, ref yearsServed);

You can now declare age, ID, and yearsServed in the calling method and pass them into GetStats and get back the changed values.
C# requires definite assignment, which means that the local variables, age, ID, and yearsServed must be initialized before you call GetStats. This is unnecessarily cumbersome; you're just using them to get values out of GetStats. To address this problem, C# also provides the out keyword, which indicates that you may pass in uninitialized variables and they will be passed by reference. This is a way of stating your intentions explicitly:

public void GetStats(out int age, out int ID, out int yearsServed)

Again, the calling method must match.

Fred.GetStats(out age, out ID, out yearsServed);
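Putting the pieces together, here is a minimal sketch of a complete program. The field values and the AddYear helper are assumptions made up for the example; only the GetStats signature comes from the text above.

```csharp
using System;

class Employee
{
    // Hypothetical sample data, not taken from the article.
    private int age = 35;
    private int id = 1001;
    private int yearsServed = 7;

    // out: the caller may pass uninitialized variables; the method must assign each one.
    public void GetStats(out int age, out int ID, out int yearsServed)
    {
        age = this.age;
        ID = this.id;
        yearsServed = this.yearsServed;
    }

    // ref: the caller must initialize the variable; the method may read and change it.
    // (AddYear is a hypothetical helper added for this illustration.)
    public static void AddYear(ref int years)
    {
        years++;
    }
}

class RefOutDemo
{
    static void Main()
    {
        Employee Fred = new Employee();

        int age, ID, yearsServed;                         // deliberately left uninitialized
        Fred.GetStats(out age, out ID, out yearsServed);  // out satisfies definite assignment

        Employee.AddYear(ref yearsServed);                // legal: yearsServed is now assigned

        // Prints: Age: 35, ID: 1001, Years served: 8
        Console.WriteLine("Age: {0}, ID: {1}, Years served: {2}", age, ID, yearsServed);
    }
}
```

If GetStats were declared with ref instead of out, the call would not compile until the three locals were given initial values.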

Calling New
In C++, the new keyword instantiates an object on the heap. That is not always so in C#. With reference types, the new keyword does instantiate objects on the heap, but with value types such as structs, the object is created on the stack and a constructor is called.
You can, in fact, create a struct on the stack without using new, but be careful! New initializes the object. If you don't use new, you must initialize all the values in the struct by hand before you use it (before you pass it to a method) or it won't compile. Once again, definite assignment requires that every object be initialized (see Figure 2).
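The sketch below shows both ways of creating a struct. It is not the article's Figure 2; the Location struct is a hypothetical type chosen for the example.

```csharp
using System;

// Hypothetical struct for illustration.
struct Location
{
    public int X;
    public int Y;

    public Location(int x, int y)
    {
        X = x;
        Y = y;
    }
}

class StructInitDemo
{
    static void Main()
    {
        // With new: the constructor runs and every field is initialized.
        Location a = new Location(3, 4);

        // Without new: the struct exists, but its fields start out unassigned.
        Location b;
        b.X = 5;
        b.Y = 6;   // omit this line and the WriteLine below no longer compiles

        // Prints: a = (3, 4), b = (5, 6)
        Console.WriteLine("a = ({0}, {1}), b = ({2}, {3})", a.X, a.Y, b.X, b.Y);
    }
}
```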

Properties
Most C++ programmers try to keep member variables private. This data hiding promotes encapsulation and allows you to change your implementation of the class without breaking the interface your clients rely on. You typically want to allow the client to get and possibly set the value of these members, however, so C++ programmers create accessor methods whose job is to read or modify the value of the private member variables.
In C#, properties are first-class members of classes. To the client, a property looks like a member variable, but to the implementor of the class it looks like a method. This arrangement is perfect; it allows you total encapsulation and data hiding while giving your clients easy access to the members.
You can provide your Employee class with an Age property to allow clients to get and set the employee's age member.

public int Age
{
    get
    {
        return age;
    }
    set
    {
        age = value;
    }
}

The keyword value is implicitly available to the property. If you write

Fred.Age = 17;

the compiler will pass in the value 17 as value.
You can create a read-only property for YearsServed by implementing the get accessor and not the set accessor.

public int YearsServed
{
    get
    {
        return yearsServed;
    }
}

If you change your driver program to use these accessors, you can see how they work (see Figure 3).
You can get Fred's age through the property and then you can use that property to set the age. You can access the YearsServed property to obtain the value, but not to set it; if you uncomment the last line, the program will not compile.
If you decide later to retrieve the Employee's age from a database, you need change only the accessor implementation; the client will not be affected.
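As a rough illustration of such a driver (a sketch, not the article's Figure 3), the program below exercises both properties. The Employee constructor and the sample values are assumptions added for the example.

```csharp
using System;

class Employee
{
    private int age;
    private int yearsServed;

    // Hypothetical constructor, added so the example has data to work with.
    public Employee(int age, int yearsServed)
    {
        this.age = age;
        this.yearsServed = yearsServed;
    }

    // Read-write property.
    public int Age
    {
        get { return age; }
        set { age = value; }
    }

    // Read-only property: there is no set accessor.
    public int YearsServed
    {
        get { return yearsServed; }
    }
}

class PropertyDemo
{
    static void Main()
    {
        Employee Fred = new Employee(25, 3);

        Console.WriteLine("Fred's age: {0}", Fred.Age);       // calls get
        Fred.Age = 17;                                        // calls set with value == 17
        Console.WriteLine("Fred's new age: {0}", Fred.Age);
        Console.WriteLine("Years served: {0}", Fred.YearsServed);

        // Fred.YearsServed = 12;   // would not compile: the property is read-only
    }
}
```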

Arrays
C# provides an Array class which is a smarter version of the traditional C/C++ array. For example, it is not possible to write past the bounds of a C# array; the attempt throws an exception instead of corrupting memory. In addition, Array has an even smarter cousin, ArrayList, which can grow dynamically to manage the changing size requirements of your program.
Arrays in C# come in three flavors: single-dimensional, multidimensional rectangular arrays (like the C++ multidimensional arrays), and jagged arrays (arrays of arrays).
You can create a single-dimensional array like this:

int[] myIntArray = new int[5];

Or you can initialize it in place, like this:

int[] myIntArray = { 2, 4, 6, 8, 10 };

You can create a rectangular array like this (with rows set to 4 and columns set to 3, the result is a 4×3 array):

int[,] myRectangularArray = new int[rows, columns];

Alternatively, you can simply initialize it, like this:

int[,] myRectangularArray =  
{
    {0,1,2}, {3,4,5}, {6,7,8}, {9,10,11}
};

Since jagged arrays are arrays of arrays, you supply only one dimension

int[][] myJaggedArray = new int[4][];

and then create each of the internal arrays, like so:

myJaggedArray[0] = new int[5]; 
myJaggedArray[1] = new int[2]; 
myJaggedArray[2] = new int[3]; 
myJaggedArray[3] = new int[5];

Because arrays derive from the System.Array object, they come with a number of useful methods, including Sort and Reverse.
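A brief sketch of those methods, along with the bounds checking mentioned earlier, might look like this. The array contents are arbitrary sample values.

```csharp
using System;

class ArrayDemo
{
    static void Main()
    {
        int[] myIntArray = { 8, 2, 10, 4, 6 };

        Array.Sort(myIntArray);       // now 2, 4, 6, 8, 10
        Array.Reverse(myIntArray);    // now 10, 8, 6, 4, 2

        foreach (int i in myIntArray)
        {
            Console.Write("{0} ", i);
        }
        Console.WriteLine();

        // A rectangular array reports the length of each dimension separately.
        int[,] myRectangularArray = new int[4, 3];
        Console.WriteLine("{0} rows, {1} columns",
            myRectangularArray.GetLength(0), myRectangularArray.GetLength(1));

        // Writing past the end is caught at run time rather than corrupting memory.
        try
        {
            myIntArray[5] = 12;
        }
        catch (IndexOutOfRangeException)
        {
            Console.WriteLine("Index 5 is outside the bounds of the array");
        }
    }
}
```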

Indexers
It is possible to create your own array-like objects. For example, you might create a listbox which has a set of strings that it will display. It would be convenient to be able to access the contents of the box with an index, just as if it were an array.

string theFirstString = myListBox[0];
string theLastString = myListBox[Length-1];

This is accomplished with Indexers. An Indexer is much like a property, but supports the syntax of the index operator. Figure 4 shows a property whose name is followed by the index operator.
Figure 5 shows how to implement a very simple ListBox class and provide indexing for it.
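For readers without the figures at hand, here is a minimal sketch of what an indexed list box class might look like. It is not the article's Figure 4 or Figure 5: the params constructor, the Add method, the Length property, and the range checks are all assumptions made for the illustration.

```csharp
using System;

// Hypothetical stand-in for the article's ListBoxTest class.
class ListBoxTest
{
    private string[] myStrings = new string[8];
    private int count = 0;

    public ListBoxTest(params string[] initialStrings)
    {
        foreach (string s in initialStrings)
        {
            Add(s);
        }
    }

    public void Add(string s)
    {
        if (count >= myStrings.Length)
            throw new InvalidOperationException("The list box is full.");
        myStrings[count++] = s;
    }

    // Number of strings actually stored.
    public int Length
    {
        get { return count; }
    }

    // The indexer: declared like a property named "this", followed by the index operator.
    public string this[int index]
    {
        get
        {
            if (index < 0 || index >= count)
                throw new ArgumentOutOfRangeException("index");
            return myStrings[index];
        }
        set
        {
            if (index < 0 || index >= count)
                throw new ArgumentOutOfRangeException("index");
            myStrings[index] = value;
        }
    }
}

class IndexerDemo
{
    static void Main()
    {
        ListBoxTest myListBox = new ListBoxTest("Hello", "World");
        myListBox[1] = "Universe";                       // calls the indexer's set accessor

        string theFirstString = myListBox[0];            // calls the indexer's get accessor
        string theLastString = myListBox[myListBox.Length - 1];

        // Prints: Hello, Universe
        Console.WriteLine("{0}, {1}", theFirstString, theLastString);
    }
}
```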

Interfaces
A software interface is a contract for how two types will interact. When a type publishes an interface, it tells any potential client, "I guarantee I'll support the following methods, properties, events, and indexers."
C# is an object-oriented language, so these contracts are encapsulated in entities called interfaces. The interface keyword declares a reference type which encapsulates a contract.
Conceptually, an interface is similar to an abstract class. The difference is that an abstract class serves as the base class for a family of derived classes, while interfaces are meant to be mixed in with other inheritance trees.
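The distinction can be sketched as follows. All of the names here (IStorable, Shape, Square, Note) are hypothetical and chosen only to show an interface being mixed into two unrelated inheritance trees.

```csharp
using System;

// The contract: anything storable promises a Save method.
interface IStorable
{
    void Save();
}

// An abstract class is the root of one family of related classes.
abstract class Shape
{
    public abstract double Area();
}

// Square belongs to the Shape family and also mixes in IStorable.
class Square : Shape, IStorable
{
    private double side;

    public Square(double side) { this.side = side; }

    public override double Area() { return side * side; }

    public void Save()
    {
        Console.WriteLine("Saving a square with area {0}", Area());
    }
}

// Note is unrelated to Shape, yet it fulfills the same contract.
class Note : IStorable
{
    private string text;

    public Note(string text) { this.text = text; }

    public void Save()
    {
        Console.WriteLine("Saving a note: {0}", text);
    }
}

class InterfaceDemo
{
    static void Main()
    {
        // Clients depend only on the contract, not on the concrete types.
        IStorable[] items = { new Square(2), new Note("call the office") };
        foreach (IStorable item in items)
        {
            item.Save();
        }
    }
}
```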

The IEnumerable Interface
Returning to the previous example, it would be nice to be able to print the strings from the ListBoxTest class using a foreach loop, as you can with a normal array. You can accomplish this by implementing the IEnumerable interface in your class, which is used implicitly by the foreach construct. IEnumerable is implemented in any class that wants to support enumeration and foreach loops.
IEnumerable has only one method, GetEnumerator, whose job is to return a specialized implementation of IEnumerator. Thus the semantics of an Enumerable class allow it to provide an Enumerator.
The Enumerator must implement the IEnumerator methods. This can be implemented either directly by the container class or by a separate class. The latter approach is generally preferred because it encapsulates this responsibility in the Enumerator class rather than cluttering up the container.
I'll add an Enumerator to the ListBoxTest that you have already seen in Figure 5. Because the Enumerator class is specific to my container class (that is, because ListBoxEnumerator must know a lot about ListBoxTest) I will make it a private implementation, contained within ListBoxTest.
In this version, ListBoxTest is defined to implement the IEnumerable interface. The interface's GetEnumerator method must return an Enumerator.

public IEnumerator GetEnumerator()
{
    return (IEnumerator) new ListBoxEnumerator(this);
}

Notice that the method passes the current ListBoxTest object (this) to the enumerator. That will allow the enumerator to enumerate this particular ListBoxTest object.
The class to implement the Enumerator is implemented here as ListBoxEnumerator, which is a private class defined within ListBoxTest. Its work is fairly straightforward.
The ListBoxTest to be enumerated is passed in as an argument to the constructor, where it is assigned to the member variable myLBT. The constructor also sets the member variable index to -1, indicating that enumerating the object has not yet begun.

public ListBoxEnumerator(ListBoxTest theLB)
{
    myLBT = theLB;
    index = -1;
}

The MoveNext method increments the index and then checks to ensure that you have not run past the end of the object you're enumerating. If you have, you return false; otherwise, true is returned.

public bool MoveNext()
{
    index++;
    if (index >= myLBT.myStrings.Length)
        return false;
    else
        return true;
}

Reset does nothing but reset the index to -1.
The property Current is implemented to return the string at the current index. This is an arbitrary decision; in other classes Current will have whatever meaning the designer decides is appropriate. However it's defined, every enumerator must be able to return the current member, as accessing the current member is what enumerators are for.

public object Current
{
    get
    {
        return(myLBT[index]);
    }
}

That's all there is to it. The call to foreach fetches the enumerator and uses it to enumerate over the array. Since foreach will display every string whether or not you've added a meaningful value, I've changed the initialization of myStrings to eight items to keep the display manageable.

myStrings = new String[8];
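Assembled into one program, the fragments above might fit together as in the sketch below. The member names myLBT, index, and myStrings follow the text; the params constructor, the read-only indexer, and the null check in Main are assumptions added so the example is self-contained.

```csharp
using System;
using System.Collections;

class ListBoxTest : IEnumerable
{
    private string[] myStrings = new string[8];
    private int count = 0;

    // Hypothetical constructor used to load a few sample strings.
    public ListBoxTest(params string[] initialStrings)
    {
        foreach (string s in initialStrings)
        {
            myStrings[count++] = s;
        }
    }

    public string this[int index]
    {
        get { return myStrings[index]; }
    }

    // The single method required by IEnumerable.
    public IEnumerator GetEnumerator()
    {
        return new ListBoxEnumerator(this);
    }

    // Private nested class: it may touch ListBoxTest's private members.
    private class ListBoxEnumerator : IEnumerator
    {
        private ListBoxTest myLBT;
        private int index;

        public ListBoxEnumerator(ListBoxTest theLB)
        {
            myLBT = theLB;
            index = -1;                 // enumeration has not started yet
        }

        public bool MoveNext()
        {
            index++;
            return index < myLBT.myStrings.Length;
        }

        public void Reset()
        {
            index = -1;
        }

        public object Current
        {
            get { return myLBT[index]; }
        }
    }
}

class EnumerationDemo
{
    static void Main()
    {
        ListBoxTest lbt = new ListBoxTest("Hello", "World");

        // foreach calls GetEnumerator, then MoveNext and Current for each of the 8 slots.
        foreach (string s in lbt)
        {
            Console.WriteLine("Value: {0}", s == null ? "(empty)" : s);
        }
    }
}
```

As the text notes, this enumerator visits all eight slots, including the ones that were never filled.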

Using the Base Class Libraries
To get a better sense of how C# differs from C++ and how your approach to solving problems might change, let's examine a slightly less trivial example. I'll build a class to read a large text file and display its contents on the screen. I'd like to make this a multithreaded program so that while the data is being read from the disk, I can do other work.
In C++ you would create a thread to read the file, and another thread to do the other work. These threads would work independently, but they might need synchronization. You can do all of that in C# as well, but most of the time you won't need to write your own threading because .NET provides very powerful mechanisms for asynchronous I/O.
The asynchronous I/O support is built into the CLR and is nearly as easy to use as the normal I/O stream classes. You start by informing the compiler that you'll be using objects from a number of System namespaces:

using System;
using System.IO;
using System.Text;

When you include System, you do not automatically include all its subsidiary namespaces; each must be explicitly included with the using keyword. Since you'll be using the I/O stream classes, you'll need System.IO, and you want System.Text to support ASCII encoding of your byte stream, as you'll see shortly.
The steps involved in writing this program are surprisingly simple because .NET will do most of the work for you. I'll use the BeginRead method of the Stream class. This method provides asynchronous I/O, reading in a buffer full of data, and then calling your callback method when the buffer is ready for you to process.
You need to pass in a byte array as the buffer and a delegate for the callback method. You'll declare both of these as private member variables of your driver class.

public class AsynchIOTester
{
    private Stream inputStream;       
    private byte[] buffer;          
    private AsyncCallback myCallBack;

The member variable inputStream is of type Stream, and it is on this object that you will call the BeginRead method, passing in the buffer as well as the delegate (myCallBack). A delegate is very much like a type-safe pointer to member function. In C#, delegates are first-class elements of the language.
.NET will call your delegated method when the buffer has been filled from the file on disk so that you can process the data. While you're waiting you can do other work (in this case, incrementing an integer from 1 to 50,000, but in a production program you might be interacting with the user or doing other useful tasks).
The delegate in this case is declared to be of type AsyncCallback, which is what the BeginRead method of Stream expects. An AsyncCallback delegate is declared in the System namespace as follows:

public delegate void AsyncCallback (IAsyncResult ar);

Thus, this delegate may be associated with any method that returns void and takes an IAsyncResult interface as a parameter. The CLR will pass in the IAsyncResult interface object at runtime when the method is called; you only have to declare the method

void OnCompletedRead(IAsyncResult asyncResult)

and then to hook up the delegate in the constructor:

AsynchIOTester()
{
    •••
    myCallBack = new AsyncCallback(this.OnCompletedRead);
}

This assigns to the member variable myCallBack (which was previously defined to be of type AsyncCallback) the instance of the delegate created by calling the AsyncCallback constructor and passing in the method you want to associate with the delegate.
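To see the delegate hookup in isolation, consider the following sketch. It is not the article's AsynchIOTester: the class name AsyncCallbackSketch, the in-memory stream used in place of a file, the ManualResetEvent used to wait for the callback, and the Run method are all assumptions made so the example can run without any file on disk.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

public class AsyncCallbackSketch
{
    private Stream inputStream;
    private byte[] buffer;
    private AsyncCallback myCallBack;

    // Assumption for this sketch: an event that signals when the callback has run.
    private ManualResetEvent done = new ManualResetEvent(false);
    private string textRead;

    public AsyncCallbackSketch(string text)
    {
        // A MemoryStream stands in for the file so the sketch needs no file on disk.
        inputStream = new MemoryStream(Encoding.ASCII.GetBytes(text));
        buffer = new byte[256];

        // Hook the delegate up to the method that should run when the read completes.
        myCallBack = new AsyncCallback(this.OnCompletedRead);
    }

    public string Run()
    {
        // Start the read; the delegate is invoked once the buffer has been filled.
        inputStream.BeginRead(buffer, 0, buffer.Length, myCallBack, null);

        // Other work could be done here while the read is in progress.

        done.WaitOne();   // wait until OnCompletedRead has finished
        return textRead;
    }

    // Matches the AsyncCallback signature: returns void, takes an IAsyncResult.
    void OnCompletedRead(IAsyncResult asyncResult)
    {
        int bytesRead = inputStream.EndRead(asyncResult);
        textRead = Encoding.ASCII.GetString(buffer, 0, bytesRead);
        done.Set();
    }

    public static void Main()
    {
        AsyncCallbackSketch sketch = new AsyncCallbackSketch("Hello from the stream");
        Console.WriteLine(sketch.Run());
    }
}
```

Because the callback may run on a different thread from the caller, the sketch waits on the event before reading the result.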
Here's how the entire program works, step by step. In Main you create an instance of the class and tell it to run: