
An Explanation of "也做个比较" ("Another Comparison") (Draft)

February 4, 2013 ⁄ General

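  In "也做个比较" ("Another Comparison"), 思归 compares how Java and C# behave on a handful of short snippets. I spent some time going through several of those items, collecting the relevant passages from the two language specifications, and explaining why Java and C# behave differently in each case.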

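  Each item below has three parts: the code and results from 思归's article, quoted here for convenience; the relevant passages from the C# or Java specification, each followed by a short restatement; and my own explanation.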

1. Parity
Java:

    public static boolean isOdd(int i) {
        return i % 2 == 1;
    }

C#:
    public static bool isOdd(int i) {
        return i % 2 == 1;
    }

isOdd   i=-2    i=-1    i=0     i=1     i=2
Java    false   false   false   true    false
C#      False   False   False   True    False

C# Language Specification 14.7.3
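The result of x % y is the value produced by x - (x / y) * y.
In other words, x % y is defined as x - (x / y) * y, and since integer division truncates toward zero, a nonzero remainder takes the sign of x.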

  So -1 % 2 = -1 - (-1 / 2) * 2 = -1 - 0 = -1, which is not equal to 1.
  The function therefore reports every negative odd number as even, and should be changed as follows:
    public static bool isOdd(int i) {
        return i % 2 != 0;
    }
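
  The arithmetic above can be checked directly. The following is a minimal Java sketch; Java defines % by the same truncating identity (JLS 15.17.3), so the numbers match the C# derivation. The class and method names (ParityDemo, isOddBroken, isOddFixed) are made up for this illustration.

```java
class ParityDemo {
    // Original version: misses negative odd numbers because -1 % 2 is -1, not 1.
    static boolean isOddBroken(int i) {
        return i % 2 == 1;
    }

    // Corrected version: any nonzero remainder means odd, whatever its sign.
    static boolean isOddFixed(int i) {
        return i % 2 != 0;
    }

    public static void main(String[] args) {
        for (int i = -2; i <= 2; i++) {
            System.out.println(i + ": remainder=" + (i % 2)
                    + " broken=" + isOddBroken(i)
                    + " fixed=" + isOddFixed(i));
        }
    }
}
```

  Only the rows for negative odd inputs differ between the two versions, which is why the bug is easy to miss when testing with positive numbers alone.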

2. Floating-point subtraction
Java: System.out.println(2.00 - 1.10);

C#: System.Console.WriteLine(2.00 - 1.10);

The outputs differ

Java 0.8999999999999999
C# 0.9

I am a little suspicious of the C# result. It is probably a formatting effect, because ILDASM shows the following:

  IL_0000:  ldc.r8     0.89999999999999991
  IL_0009:  call       void [mscorlib]System.Console::WriteLine(float64)

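  Most decimal fractions cannot be stored exactly in binary floating point. Where exact decimal arithmetic is required, use decimal:
    decimal d1 = 2.00M;
    decimal d2 = 1.10M;
    System.Console.WriteLine(d1 - d2);
  The generated code decompiles to:
    decimal num1 = new decimal(200, 0, 0, false, 2);
    decimal num2 = new decimal(110, 0, 0, false, 2);
    Console.WriteLine((decimal) (num1 - num2));
  I had also wanted to find out why the earlier C# snippet prints exactly 0.9, but once I started tracing it, the chain of internal method calls turned out to be unusually convoluted, so I have given up for now.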
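
  The three behaviours in this item can be reproduced side by side in Java. This is only a sketch: BigDecimal stands in for C#'s decimal, and the rounding to 15 significant digits is an assumption about how the C# output might arise (it is not confirmed by the original post). The class and method names are invented for the example.

```java
import java.math.BigDecimal;
import java.math.MathContext;

class SubtractionDemo {
    // Plain double arithmetic: 1.10 and the difference are not exactly representable.
    static String asDouble() {
        return String.valueOf(2.00 - 1.10);
    }

    // The same double, rounded to 15 significant digits before printing.
    // ASSUMPTION: this mimics a formatter that keeps 15 digits by default.
    static String roundedTo15Digits() {
        BigDecimal exact = new BigDecimal(2.00 - 1.10); // exact binary value of the double
        return exact.round(new MathContext(15)).stripTrailingZeros().toPlainString();
    }

    // Exact decimal arithmetic, playing the role of C#'s decimal.
    static String asDecimal() {
        return new BigDecimal("2.00").subtract(new BigDecimal("1.10")).toString();
    }

    public static void main(String[] args) {
        System.out.println(asDouble());
        System.out.println(roundedTo15Digits());
        System.out.println(asDecimal());
    }
}
```

  Note that the BigDecimal values are built from strings. Building them from the double literals would carry the binary rounding error into the decimal value.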

3. Division of large integers

Java:
        final long MICROS_PER_DAY = 24 * 60 * 60 * 1000 * 1000;
        final long MILLIS_PER_DAY = 24 * 60 * 60 * 1000;

        System.out.println(MICROS_PER_DAY / MILLIS_PER_DAY);

C#:
        const long MICROS_PER_DAY = 24 * 60 * 60 * 1000 * 1000;
        const long MILLIS_PER_DAY = 24 * 60 * 60 * 1000;

        System.Console.WriteLine(MICROS_PER_DAY / MILLIS_PER_DAY);

In C# this fails to compile. With unchecked the output is the same as Java's, though it may not be the output you expect

Java 5
C# 5

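  It is not hard to see why. Every operand in 24 * 60 * 60 * 1000 * 1000 is an Int32, so the product is also computed as an Int32 and only then implicitly converted to long. MICROS_PER_DAY has therefore overflowed and holds an unexpected value, though not an arbitrary one:
    24 * 60 * 60 * 1000 * 1000 = 86400000000 = 0x141DD76000
    (int) 0x141DD76000 = 0x1DD76000 = 500654080
    500654080 / (24 * 60 * 60 * 1000) ≈ 5.79, which integer division truncates to 5
  The fix is to add an "L" suffix to at least one of the operands, for example const long MICROS_PER_DAY = 24L * 60 * 60 * 1000 * 1000;
  Hindsight is always easy. Would you spot this mistake while writing or reviewing code day to day?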
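
  The wraparound and the fix can be verified with a short Java sketch. The names OverflowDemo, MICROS_PER_DAY_WRAPPED, and overflowsAsInt are hypothetical. Math.multiplyExact is used here only as one way to make the overflow visible at run time, roughly comparable to a checked context in C#.

```java
class OverflowDemo {
    // All operands are int, so the product wraps around before it is widened to long.
    static final long MICROS_PER_DAY_WRAPPED = 24 * 60 * 60 * 1000 * 1000;
    // One long operand makes the whole product use 64-bit arithmetic.
    static final long MICROS_PER_DAY = 24L * 60 * 60 * 1000 * 1000;
    static final long MILLIS_PER_DAY = 24 * 60 * 60 * 1000;

    // Returns true if the last multiplication overflows when done in int arithmetic.
    static boolean overflowsAsInt() {
        try {
            Math.multiplyExact(24 * 60 * 60 * 1000, 1000);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(MICROS_PER_DAY_WRAPPED);                  // wrapped value
        System.out.println(MICROS_PER_DAY_WRAPPED / MILLIS_PER_DAY); // surprising quotient
        System.out.println(MICROS_PER_DAY / MILLIS_PER_DAY);         // intended quotient
        System.out.println(overflowsAsInt());
    }
}
```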

4. Hexadecimal addition
Java: System.out.println(Long.toHexString(0x100000000L + 0xcafebabe));
C#:  System.Console.WriteLine("{0:x}", 0x100000000L + 0xcafebabe);

The outputs differ
Java cafebabe
C# 1cafebabe

Java Language Specification 3.10.1
The largest positive hexadecimal and octal literals of type int are 0x7fffffff and 017777777777, respectively, which equal 2147483647 (2^31-1). The most negative hexadecimal and octal literals of type int are 0x80000000 and 020000000000, respectively, each of which represents the decimal value -2147483648 (-2^31).
In other words, a hexadecimal int literal whose top bit is set (0x80000000 and above) denotes a negative int.

C# Language Specification 9.4.4.2
The type of an integer literal is determined as follows:
· If the literal has no suffix, it has the first of these types in which its value can be represented: int, uint, long, ulong.
· If the literal is suffixed by U or u, it has the first of these types in which its value can be represented: uint, ulong.
· If the literal is suffixed by L or l, it has the first of these types in which its value can be represented: long, ulong.
· If the literal is suffixed by UL, Ul, uL, ul, LU, Lu, lU, or lu, it is of type ulong.
If the value represented by an integer literal is outside the range of the ulong type, a compile-time error occurs.
[Note: As a matter of style, it is suggested that “L” be used instead of “l” when writing literals of type long, since it is easy to confuse the letter “l” with the digit “1”. end note]
In short: an unsuffixed literal takes the first of int, uint, long, ulong that can represent its value, so a literal too large for int that still fits in 32 bits is a uint. A U suffix restricts the candidates to uint and ulong, an L suffix to long and ulong, and both suffixes together give ulong. A value outside the range of ulong is a compile-time error. The note recommends writing "L" rather than "l", because the letter "l" is easily confused with the digit "1".

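  So 0xcafebabe is a negative int in Java but a positive uint in C#, which makes the C# result easy to understand. But where does Java's cafebabe come from? It looks as if the leading "1" has been chopped off, yet the result is a long, which should have room for it. One more quotation: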

Java Language Specification 5.1.2 Widening Primitive Conversion
A widening conversion of a signed integer value to an integral type T simply sign-extends the two's-complement representation of the integer value to fill the wider format.
That is, widening a signed integer value to a wider integral type simply sign-extends its two's-complement representation.

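  That is, 0xcafebabe is first widened to the long value 0xffffffffcafebabe and only then takes part in the addition, which gives 0x100000000cafebabe. This is in fact another overflow: the sum needs 65 bits, so the top bit is discarded and 0xcafebabe is what remains.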
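
  Each step of this explanation can be printed in Java. The sketch below uses invented names (HexLiteralDemo and its methods). The last two methods show two ways to get the C#-style result in Java: an L suffix on the second literal, or Integer.toUnsignedLong (available since Java 8).

```java
class HexLiteralDemo {
    // The int literal 0xcafebabe is negative, so widening to long sign-extends it.
    static String widened() {
        return Long.toHexString((long) 0xcafebabe);
    }

    // The expression from the article: the carry out of bit 63 is silently dropped.
    static String javaSum() {
        return Long.toHexString(0x100000000L + 0xcafebabe);
    }

    // With a long literal there is no sign extension and no overflow.
    static String longLiteralSum() {
        return Long.toHexString(0x100000000L + 0xcafebabeL);
    }

    // Treating the int as unsigned gives the same value as C#'s uint literal.
    static String unsignedSum() {
        return Long.toHexString(0x100000000L + Integer.toUnsignedLong(0xcafebabe));
    }

    public static void main(String[] args) {
        System.out.println(widened());
        System.out.println(javaSum());
        System.out.println(longLiteralSum());
        System.out.println(unsignedSum());
    }
}
```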

5. Multiple casts

Java: System.out.println((int) (char) (byte) -1);

C#:

 unchecked
 {
         System.Console.WriteLine((int) (char) (byte) -1);
 }

The outputs differ

Java 65535
C# 255

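  The two statements only look the same. Java's byte is signed and corresponds to C#'s sbyte, so comparing them directly is not very meaningful. Try this in C#:
    unchecked
    {
        System.Console.WriteLine((int) (char) (sbyte) -1);
    }
  The result is 65535, the same as Java's.
  But why?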

C# Language Specification 13.2.1
In a checked context, the conversion succeeds if the value of the source operand is within the range of the destination type, but throws a System.OverflowException if the value of the source operand is outside the range of the destination type. In an unchecked context, the conversion always succeeds, and proceeds as follows.
· If the source type is larger than the destination type, then the source value is truncated by discarding its “extra” most significant bits. The result is then treated as a value of the destination type.
· If the source type is smaller than the destination type, then the source value is either sign-extended or zero-extended so that it is the same size as the destination type. Sign-extension is used if the source type is signed; zero-extension is used if the source type is unsigned. The result is then treated as a value of the destination type.
· If the source type is the same size as the destination type, then the source value is treated as a value of the destination type.
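In short: in a checked context an out-of-range value throws System.OverflowException, while in an unchecked context the conversion always succeeds. A narrowing conversion discards the extra high-order bits. A widening conversion sign-extends when the source type is signed and zero-extends when it is unsigned. A conversion between types of the same size reinterprets the bits as a value of the destination type.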

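  -1 is an int whose two's-complement representation is 0xFFFFFFFF. The first conversion discards the upper 24 bits and leaves 0xFF, which is -1 as an sbyte. The second sign-extends it to a char, giving 0xFFFF, which is 65535 because char is unsigned. The last conversion zero-extends, because the source type char is unsigned: the high bits are filled with 0, the value does not change, and the final result is 65535.
  Now try working through (int) (char) (byte) -1 in C# yourself to check that you have understood.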
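
  The chain of conversions can be traced step by step in Java. This is an illustrative sketch with invented names. Java has no unsigned byte type, so the C# byte case is emulated by masking with 0xFF, which is an assumption made for the comparison rather than anything taken from the original article.

```java
class CastChainDemo {
    // Java's byte is signed: (byte) -1 is -1, which sign-extends to the char 0xFFFF.
    static int signedByteChain() {
        return (int) (char) (byte) -1;
    }

    // Emulates an unsigned byte (like C#'s byte): 0xFF stays 255 all the way through.
    static int unsignedByteChain() {
        int asUnsignedByte = (byte) -1 & 0xFF; // 255
        return (int) (char) asUnsignedByte;
    }

    public static void main(String[] args) {
        byte b = (byte) -1;  // 0xFF, value -1
        char c = (char) b;   // sign-extended to 0xFFFF
        int i = c;           // zero-extended, value unchanged
        System.out.println(b + " -> " + (int) c + " -> " + i);
        System.out.println(signedByteChain());
        System.out.println(unsignedByteChain());
    }
}
```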

6. Swapping integers

Java:
        int x = 1984;
        int y = 2001;
        x ^= y ^= x ^= y;
        System.out.println("x = " + x + "; y = " + y);

C#:
        int x = 1984;
        int y = 2001;
        x ^= y ^= x ^= y;
        System.Console.WriteLine("x = " + x + "; y = " + y);

The outputs are the same

Java x = 0; y = 1984
C# x = 0; y = 1984

  This has already been discussed; see http://blog.joycode.com/saucer/archive/2005/08/21/62300.aspx
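
  The linked discussion is not reproduced here. As one possible reading for the Java side, the result follows from the rule that a compound assignment saves the value of its left operand before the right operand is evaluated (JLS 15.26.2). The sketch below, with invented names, contrasts the one-line form with the conventional three-statement XOR swap.

```java
class XorSwapDemo {
    // The one-liner from the article. The outer "x ^=" uses the value x had
    // before the inner assignment ran, so x ends up as 1984 ^ 1984 = 0.
    static int[] compound(int x, int y) {
        x ^= y ^= x ^= y;
        return new int[] { x, y };
    }

    // Three separate statements: each one sees the updated variables.
    static int[] stepwise(int x, int y) {
        x ^= y;
        y ^= x;
        x ^= y;
        return new int[] { x, y };
    }

    public static void main(String[] args) {
        int[] a = compound(1984, 2001);
        int[] b = stepwise(1984, 2001);
        System.out.println("compound: x = " + a[0] + "; y = " + a[1]);
        System.out.println("stepwise: x = " + b[0] + "; y = " + b[1]);
    }
}
```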

7. The conditional operator

Java:
        char x = 'X';
        int i = 0;
        System.out.print(true  ? x : 0);
        System.out.print(false ? i : x);

C#:
        char x = 'X';
        int i = 0;
        System.Console.Write(true  ? x : 0);
        System.Console.Write(false ? i : x);

The outputs differ

Java X88
C# 8888

Java Language Specification 15.25
The type of a conditional expression is determined as follows:
· If the second and third operands have the same type (which may be the null type), then that is the type of the conditional expression.
· If one of the second and third operands is of type boolean and the type of the other is of type Boolean, then the type of the conditional expression is boolean.
· If one of the second and third operands is of the null type and the type of the other is a reference type, then the type of the conditional expression is that reference type.
· Otherwise, if the second and third operands have types that are convertible (§5.1.8) to numeric types, then there are several cases:
  · If one of the operands is of type byte or Byte and the other is of type short or Short, then the type of the conditional expression is short.
  · If one of the operands is of type T where T is byte, short, or char, and the other operand is a constant expression of type int whose value is representable in type T, then the type of the conditional expression is T.
  · If one of the operands is of type Byte and the other operand is a constant expression of type int whose value is representable in type byte, then the type of the conditional expression is byte.
  · If one of the operands is of type Short and the other operand is a constant expression of type int whose value is representable in type short, then the type of the conditional expression is short.
  · If one of the operands is of type Character and the other operand is a constant expression of type int whose value is representable in type char, then the type of the conditional expression is char.
  · Otherwise, binary numeric promotion (§5.6.2) is applied to the operand types, and the type of the conditional expression is the promoted type of the second and third operands. Note that binary numeric promotion performs unboxing conversion (§5.1.8) and value set conversion (§5.1.13).
· Otherwise, the second and third operands are of types S1 and S2 respectively. Let T1 be the type that results from applying boxing conversion to S1, and let T2 be the type that results from applying boxing conversion to S2. The type of the conditional expression is the result of applying capture conversion (§5.1.10) to lub(T1, T2) (§15.12.2.7).
In short: identical operand types give that type; a boolean paired with a Boolean gives boolean; the null type paired with a reference type gives that reference type. For operands convertible to numeric types, a byte and a short give short, and when one operand has type T (byte, short, or char) and the other is an int constant expression whose value fits in T, the result is T, with analogous rules for Byte, Short, and Character. Otherwise binary numeric promotion is applied, which includes unboxing conversion and value set conversion. In all remaining cases both operand types are boxed, and the result type is obtained by applying capture conversion to lub(T1, T2).

[Note: if you know the accurate or customary Chinese translations of binary numeric promotion, value set conversion, and capture conversion, please let me know. Thanks!]

C# Language Specification 14.12
The second and third operands of the ?: operator control the type of the conditional expression. Let X and Y be the types of the second and third operands. Then,
· If X and Y are the same type, then this is the type of the conditional expression.
· Otherwise, if an implicit conversion (§13.1) exists from X to Y, but not from Y to X, then Y is the type of the conditional expression.
· Otherwise, if an implicit conversion (§13.1) exists from Y to X, but not from X to Y, then X is the type of the conditional expression.
· Otherwise, no expression type can be determined, and a compile-time error occurs.
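In short: if X and Y are the same type, that is the type of the conditional expression. Otherwise the result is whichever of the two types the other converts to implicitly, provided that an implicit conversion exists in one direction only. If it exists in neither direction or in both, no type can be determined and a compile-time error occurs.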
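
  The Java rules quoted above can be observed through overload resolution. In the sketch below (names invented for the example), String.valueOf picks its char or int overload according to the static type of the conditional expression. The variable i is deliberately not final: if it were a constant, the "constant expression of type int" rule would apply to the second expression as well.

```java
class ConditionalTypeDemo {
    // x is a char and 0 is an int constant that fits in char, so the type is char.
    static String charBranch() {
        char x = 'X';
        return String.valueOf(true ? x : 0);
    }

    // i is an int variable, not a constant, so binary numeric promotion gives int.
    static String promotedBranch() {
        char x = 'X';
        int i = 0;
        return String.valueOf(false ? i : x);
    }

    public static void main(String[] args) {
        System.out.print(charBranch());
        System.out.println(promotedBranch());
    }
}
```

  Under the C# rule, char converts implicitly to int but not the other way round, so both expressions have type int, which is consistent with the 8888 shown in the results.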
