The GNU C Reference Manual

Posted March 19, 2012 ⁄ General ⁄ 21,489 characters

This article is reposted from the web; copyright belongs to the GNU Project.

This is the GNU C reference manual.

Preface

This is a reference manual for the C programming language as implemented by the GNU Compiler Collection (GCC). Specifically, this manual aims to document:

The 1989 ANSI C standard, commonly known as “C89”
The 1999 ISO C standard, commonly known as “C99”, to the extent that C99 is implemented by GCC
The current state of GNU extensions to standard C

This manual describes C89 as its baseline. C99 features and GNU extensions are explicitly labeled as such.

By default, GCC will compile code as C89 plus GNU-specific extensions. Much of C99 is supported; once full support is available, the default compilation dialect will be C99 plus GNU-specific extensions. (Some of the GNU extensions to C89 ended up,
sometimes slightly modified, as standard language features in C99.)

The C language includes a set of preprocessor directives, which are used for things such as macro text replacement, conditional compilation, and file inclusion. Although normally described in a C language manual, the GNU C preprocessor has been
thoroughly documented in The C Preprocessor, a separate manual which covers preprocessing for C, C++, and Objective-C programs, so it is not included here.

Credits

Contributors who have helped with writing, editing, proofreading, ideas, typesetting, or administrative details include: Nelson H. F. Beebe, Karl Berry, Robert Chassell, Andreas Foerster, Denver Gingerich, Lisa Goldstein, Robert Hansen, Jean-Christophe
Helary, Teddy Hogeborn, Joe Humphries, J. Wren Hunt, Adam Johansen, Steve Morningthunder, Richard Stallman, J. Otto Tennant, Ole Tetlie, Keith Thompson, T.F. Torrey, and James Youngman. Trevis Rothwell wrote most of the text and serves as project maintainer.

Some example programs are based on algorithms in Donald Knuth's The Art of Computer Programming.

Please send bug reports and suggestions to gnu-c-manual@gnu.org.

1 Lexical Elements

This chapter describes the lexical elements that make up C source code after preprocessing. These elements are called tokens. There are five types of tokens: keywords, identifiers,
constants, operators, and separators. White space, sometimes required to separate tokens, is also described in this chapter.

1.1 Identifiers

Identifiers are sequences of characters used for naming variables, functions, new data types, and preprocessor macros. You can include letters, decimal digits, and the underscore character ‘_’
in identifiers.

The first character of an identifier cannot be a digit.

Lowercase letters and uppercase letters are distinct, such that foo and FOO are two different identifiers.

When using GNU extensions, you can also include the dollar sign character ‘$’ in identifiers.

1.2 Keywords

Keywords are special identifiers reserved for use as part of the programming language itself. You cannot use them for any other purpose.

Here is a list of keywords recognized by ANSI C89:

     auto break case char const continue default do double else enum extern
     float for goto if int long register return short signed sizeof static
     struct switch typedef union unsigned void volatile while

ISO C99 adds the following keywords:

     inline _Bool _Complex _Imaginary

and GNU extensions add these keywords:

     __FUNCTION__ __PRETTY_FUNCTION__ __alignof __alignof__ __asm
     __asm__ __attribute __attribute__ __builtin_offsetof __builtin_va_arg
     __complex __complex__ __const __extension__ __func__ __imag __imag__
     __inline __inline__ __label__ __null __real __real__
     __restrict __restrict__ __signed __signed__ __thread __typeof
     __volatile __volatile__

In both ISO C99 and C89 with GNU extensions, the following is also recognized as a keyword:

     restrict

1.3 Constants

A constant is a literal numeric or character value, such as 5 or 'm'. All constants are of a particular data type; you can use type casting to explicitly specify the type of a constant, or let the compiler
use the default type based on the value of the constant.

1.3.1 Integer Constants

An integer constant is a sequence of digits, with an optional prefix to denote a number base.

If the sequence of digits is preceded by 0x or 0X (zero x or zero X), then the constant is considered to be hexadecimal (base 16). Hexadecimal values may use the digits
from 0 to 9, as well as the letters a to f and A to F. Here are some examples:

     0x2f
     0x88
     0xAB43
     0xAbCd
     0x1

If the first digit is 0 (zero), and the next character is not ‘x’ or ‘X’, then the constant is considered to be octal (base 8). Octal values may only use the digits from
0 to 7; 8 and 9 are not allowed. For example, 057, 012, and 03 are octal constants.

In all other cases, the sequence of digits is assumed to be decimal (base 10). Decimal values may use the digits from 0 to 9. For example, 459, 23901, and 8 are decimal constants.

There are various integer data types, for short integers, long integers, signed integers, and unsigned integers. You can force an integer constant to be of a long and/or unsigned integer type by appending a sequence of one or more letters to the
end of the constant:

u or U: Unsigned integer type.
l or L: Long integer type.

For example, 45U is an unsigned int constant. You can also combine letters: 45UL is an unsigned long int constant.
(The letters may be used in any order.)

Both ISO C99 and GNU C extensions add the integer types long long int and unsigned long long int. You can use two ‘L’s to get a long long int constant; add a ‘U’ to that and you have an unsigned long long int constant. For example: 45ULL.

1.3.2 Character Constants

A character constant is usually a single character enclosed within single quotation marks, such as 'Q'. A character
constant is of type int by default.

Some characters, such as the single quotation mark character itself, cannot be represented using only one character. To represent such characters, there are several “escape sequences” that you can use:

\\: Backslash character.
\?: Question mark character.
\': Single quotation mark.
\": Double quotation mark.
\a: Audible alert.
\b: Backspace character.
\e: <ESC> character. (This is a GNU extension.)
\f: Form feed.
\n: Newline character.
\r: Carriage return.
\t: Horizontal tab.
\v: Vertical tab.
\o, \oo, \ooo: Octal number.
\xh, \xhh, \xhhh, ...: Hexadecimal number.

To use any of these escape sequences, enclose the sequence in single quotes, and treat it as if it were any other character. For example, the letter m is 'm' and the newline character is '\n'.

The octal number escape sequence is the backslash character followed by one, two, or three octal digits (0 to 7). For example, 101 is the octal equivalent of 65, which is the ASCII character 'A'. Thus, the character constant '\101' is
the same as the character constant 'A'.

The hexadecimal escape sequence is the backslash character, followed by x and an unlimited number of hexadecimal digits (0 to 9, and a to f or A to F).

While the length of possible hexadecimal digit strings is unlimited, the number of character constants in any given character set is not. (The much-used extended ASCII character set, for example, has only 256 characters in it.) If you try to use
a hexadecimal value that is outside the range of characters, the compiler will diagnose it at compile time.

1.3.3 Real Number Constants

A real number constant is a value that represents a fractional (floating point) number. It consists of a sequence of digits which represents the integer (or “whole”) part of the number, a decimal point, and a sequence of digits which represents the fractional part.

Either the integer part or the fractional part may be omitted, but not both. Here are some examples:

     double a, b, c, d, e, f;
     
     a = 4.7;
     
     b = 4.;
     
     c = 4;
     
     d = .7;
     
     e = 0.7;

(In the third assignment statement, the integer constant 4 is automatically converted from an integer value to a double value.)

Real number constants can also be followed by e or E, and an integer exponent. The exponent can be either positive or negative.

     double x, y;
     
     x = 5e2;   /* x is 5 * 100, or 500.0. */
     y = 5e-2;  /* y is 5 * (1/100), or 0.05. */

You can append a letter to the end of a real number constant to cause it to be of a particular type. If you append the letter F (or f) to a real number constant, then its type is float. If you append the letter L (or l),
then its type is long double. If you do not append any letters, then its type is double.

1.3.4 String Constants

A string constant is a sequence of zero or more characters, digits, and escape sequences enclosed within double quotation marks. A string constant
is of type “array of characters”. All string constants contain a null termination character (\0) as their last character. Strings are stored as arrays of characters, with no inherent size attribute. The null termination character lets string-processing
functions know where the string ends.

Adjacent string constants are concatenated (combined) into one string, with the null termination character added to the end of the final concatenated string.

A string cannot contain double quotation marks, as double quotation marks are used to enclose the string. To include the double quotation mark character in a string, use the \" escape sequence. You can use
any of the escape sequences that can be used as character constants in strings. Here are some examples of string constants:

     /* This is a single string constant. */
     "tutti frutti ice cream"
     
     /* These string constants will be concatenated, same as above. */
     "tutti " "frutti" " ice " "cream"
     
     /* This one uses two escape sequences. */
     "\"hello, world!\""

If a string is too long to fit on one line, you can use a backslash \ to break it up onto separate lines.

     "Today's special is a pastrami sandwich on rye bread with \
     a potato knish and a cherry soda."

Adjacent strings are automatically concatenated, so you can also have string constants span multiple lines by writing them as separate, adjacent, strings. For example:

     "Tomorrow's special is a corned beef sandwich on "
     "pumpernickel bread with a kasha knish and seltzer water."

is the same as

     "Tomorrow's special is a corned beef sandwich on \
     pumpernickel bread with a kasha knish and seltzer water."

To insert a newline character into the string, so that when the string is printed it will be printed on two different lines, you can use the newline escape sequence ‘\n’.

     printf ("potato\nknish");

prints

     potato
     knish

1.4 Operators

An operator is a special token that performs an operation, such as addition or subtraction, on either one, two, or three operands. Full coverage of operators can be found in a later
chapter. See Expressions and Operators.

1.5 Separators

A separator separates tokens. White space (see next section) is a separator, but it is not a token. The other separators are all single-character tokens themselves:

     ( ) [ ] { } ; , . :

1.6 White Space

White space is the collective term used for several characters: the space character, the tab character, the newline character, the vertical tab character, and the form-feed character. White space is ignored
(outside of string and character constants), and is therefore optional, except when it is used to separate tokens. This means that

     #include <stdio.h>
     
     int
     main()
     {
       printf( "hello, world\n" );
       return 0;
     }

and

     #include <stdio.h>
     int main(){printf("hello, world\n");
     return 0;}

are functionally the same program.

Although you must use white space to separate many tokens, no white space is required between operators and operands, nor is it required between other separators and that which they separate.

     /* All of these are valid. */
     
     x++;
     x ++ ;
     x=y+z;
     x = y + z ;
     x=array[2];
     x = array [ 2 ] ;
     fraction=numerator / *denominator_ptr;
     fraction = numerator / * denominator_ptr ;

Furthermore, wherever one space is allowed, any amount of white space is allowed.

     /* These two statements are functionally identical. */
     x++;
     
     x
            ++       ;

In string constants, spaces and tabs are not ignored; rather, they are part of the string. Therefore,

     "potato knish"

is not the same as

     "potato                        knish"

2 Data Types

2.1 Primitive Data Types

2.1.1 Integer Types

The integer data types range in size from at least 8 bits to at least 32 bits. The C99 standard
extends this range to include integer sizes of at least 64 bits. You should use integer types for storing whole number values (and the char data type for storing characters). The sizes and ranges listed for these types
are minimums; depending on your computer platform, these sizes and ranges may be larger.

While these ranges provide a natural ordering, the standard does not require that any two types have a different range. For example, it is common for int and long to have the same range.
The standard even allows signed char and long to have the same range, though such platforms are very unusual.

signed char
The 8-bit signed char data type can hold integer values in the range of −128 to 127.
unsigned char
The 8-bit unsigned char data type can hold integer values in the range of 0 to 255.
char
Depending on your system, the char data type is defined as having the same range as either the signed char or the unsigned char data type (they
are three distinct types, however). By convention, you should use the char data type specifically for storing ASCII characters (such as 'm'), including escape sequences (such as '\n').
short int
The 16-bit short int data type can hold integer values in the range of −32,768 to 32,767. You may also refer to this data type as short, signed short int, or signed short.
unsigned short int
The 16-bit unsigned short int data type can hold integer values in the range of 0 to 65,535. You may also refer to this data type as unsigned short.
int
The 32-bit int data type can hold integer values in the range of −2,147,483,648 to 2,147,483,647. You may also refer to this data type as signed int or signed.
unsigned int
The 32-bit unsigned int data type can hold integer values in the range of 0 to 4,294,967,295. You may also refer to this data type simply as unsigned.
long int
The 32-bit long int data type can hold integer values in the range of at least −2,147,483,648 to 2,147,483,647. (Depending on your system, this data type might be 64-bit, in which case its range is identical to that of
the long long int data type.) You may also refer to this data type as long, signed long int, or signed long.
unsigned long int
The 32-bit unsigned long int data type can hold integer values in the range of at least 0 to 4,294,967,295. (Depending on your system, this data type might be 64-bit, in which case its range is identical to that of the unsigned long long int data type.) You may also refer to this data type as unsigned long.
long long int
The 64-bit long long int data type can hold integer values in the range of −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. You may also refer to this data type as long long, signed long long int or signed long long. This type is not part of C89, but is both part of C99 and a GNU C extension.
unsigned long long int
The 64-bit unsigned long long int data type can hold integer values in the range of at least 0 to 18,446,744,073,709,551,615. You may also refer to this data type as unsigned long long. This type
is not part of C89, but is both part of C99 and a GNU C extension.

Here are some examples of declaring and defining integer variables:

     int foo;
     unsigned int bar = 42;
     char quux = 'a';

The first line declares an integer named foo but does not define its value; it is left uninitialized, and its value should not be assumed to be anything in particular.

2.1.2 Real Number Types

There are three data types that represent fractional numbers. While the sizes and ranges of these types are consistent across most computer systems in use today, historically the sizes of these types varied from system to system. As such, the minimum and maximum values are stored in macro definitions in the library header file float.h. In this section, we include the names of the macro definitions in place of their possible values; check your system's float.h for specific numbers.

float
The float data type is the smallest of the three floating point types, if they differ in size at all. Its minimum (smallest positive normalized) value is stored in FLT_MIN, and should be no greater than 1e-37.
Its maximum value is stored in FLT_MAX, and should be no less than 1e37.
double
The double data type is at least as large as the float type, and it may be larger. Its minimum value is stored in DBL_MIN, and its maximum value is stored
in DBL_MAX.
long double
The long double data type is at least as large as the double type, and it may be larger. Its minimum value is stored in LDBL_MIN, and its maximum value is
stored in LDBL_MAX.

All floating point data types are signed; trying to use unsigned float, for example, will cause a compile-time error.

Here are some examples of declaring and defining real number variables:

     float foo;
     double bar = 114.3943;

The first line declares a float named foo but does not define its value; it is left uninitialized, and its value should not be assumed to be anything in particular.

The real number types provided in C are of finite precision, and accordingly, not all real numbers can be represented exactly. Most computer systems that GCC compiles for use a binary representation for real numbers, which is unable to precisely represent numbers such as, for example, 4.2. For this reason, we recommend that you avoid comparing real numbers for exact equality with the == operator, and instead check that they are within an acceptable tolerance of each other.

There are other more subtle implications of these imprecise representations; for more details, see David Goldberg's paper What Every Computer Scientist Should Know About Floating-Point Arithmetic and section
4.2.2 of Donald Knuth's The Art of Computer Programming.

2.1.3 Complex Number Types

GCC introduced some complex number types as an extension to C89. Similar
features were introduced in C99¹, but there were a number of differences. We describe the standard complex number types first.

2.1.3.1 Standard Complex Number Types

Complex types were introduced in C99. There are three complex types:

float _Complex
double _Complex
long double _Complex

The names here begin with an underscore and an uppercase letter in order to avoid conflicts with existing programs' identifiers. However, the C99 standard header file <complex.h> introduces some macros which
make using complex types easier.

complex
Expands to _Complex. This allows a variable to be declared as double complex which seems more natural.
I
A constant of type const float _Complex having the value of the imaginary unit normally referred to as i.

The <complex.h> header file also declares a number of functions for performing computations on complex numbers, for example the creal and cimag functions
which respectively return the real and imaginary parts of a double complex number. Other functions are also provided, as shown in this example:

     #include <complex.h>
     #include <stdio.h>
     
     void example (void)
     {
       complex double z = 1.0 + 3.0*I;
       printf ("Phase is %f, modulus is %f\n", carg (z), cabs (z));
     }

2.1.3.2 GNU Extensions for Complex Number Types

GCC also introduced complex types as a GNU extension to C89, but the spelling is different. The floating-point complex types in GCC's C89 extension are:

__complex__ float
__complex__ double
__complex__ long double

GCC's extension allows for complex types other than floating-point, so that you can declare complex character types and complex integer types; in fact, __complex__ can be used with any of the primitive data types. We won't
give you a complete list of all possibilities, but here are some examples:

__complex__ float
The __complex__ float data type has two components: a real part and an imaginary part, both of which are of the float data type.
__complex__ int
The __complex__ int data type also has two components: a real part and an imaginary part, both of which are of the int data type.

To extract the real part of a complex-valued expression, use the keyword __real__, followed by the expression. Likewise, use __imag__ to extract the imaginary part.

     __complex__ float a = 4 + 3i;
     
     float b = __real__ a;          /* b is now 4. */
     float c = __imag__ a;          /* c is now 3. */

This example creates a complex floating point variable a, and defines its real part as 4 and its imaginary part as 3. Then, the real part is assigned to the floating point variable b, and the imaginary
part is assigned to the floating point variable c.
