This article is reproduced from the web; copyright belongs to the GNU organization.
The GNU C Reference Manual
Table of Contents
- The GNU C Reference Manual
- Preface
- 1 Lexical Elements
- 2 Data Types
- 3 Expressions and Operators
- 3.1 Expressions
- 3.2 Assignment Operators
- 3.3 Incrementing and Decrementing
- 3.4 Arithmetic Operators
- 3.5 Complex Conjugation
- 3.6 Comparison Operators
- 3.7 Logical Operators
- 3.8 Bit Shifting
- 3.9 Bitwise Logical Operators
- 3.10 Pointer Operators
- 3.11 The sizeof Operator
- 3.12 Type Casts
- 3.13 Array Subscripts
- 3.14 Function Calls as Expressions
- 3.15 The Comma Operator
- 3.16 Member Access Expressions
- 3.17 Conditional Expressions
- 3.18 Statements and Declarations in Expressions
- 3.19 Operator Precedence
- 3.20 Order of Evaluation
- 4 Statements
- 4.1 Labels
- 4.2 Expression Statements
- 4.3 The if Statement
- 4.4 The switch Statement
- 4.5 The while Statement
- 4.6 The do Statement
- 4.7 The for Statement
- 4.8 Blocks
- 4.9 The Null Statement
- 4.10 The goto Statement
- 4.11 The break Statement
- 4.12 The continue Statement
- 4.13 The return Statement
- 4.14 The typedef Statement
- 5 Functions
- 6 Program Structure and Scope
- 7 A Sample Program
- Appendix A Overflow
- GNU Free Documentation License
- Index
The GNU C Reference Manual
This is the GNU C reference manual.
Preface
This is a reference manual for the C programming language as implemented by the GNU Compiler Collection (GCC). Specifically, this manual aims to document:
- The 1989 ANSI C standard, commonly known as “C89”
- The 1999 ISO C standard, commonly known as “C99”, to the extent that C99 is implemented by GCC
- The current state of GNU extensions to standard C
This manual describes C89 as its baseline. C99 features and GNU extensions are explicitly labeled as such.
By default, GCC will compile code as C89 plus GNU-specific extensions. Much of C99 is supported; once full support is available, the default compilation dialect will be C99 plus GNU-specific extensions. (Some of the GNU extensions to C89 ended up,
sometimes slightly modified, as standard language features in C99.)
The C language includes a set of preprocessor directives, which are used for things such as macro text replacement, conditional compilation, and file inclusion. Although normally described in a C language manual, the GNU C preprocessor has been
thoroughly documented in The C Preprocessor, a separate manual which covers preprocessing for C, C++, and Objective-C programs, so it is not included here.
Credits
Contributors who have helped with writing, editing, proofreading, ideas, typesetting, or administrative details include: Nelson H. F. Beebe, Karl Berry, Robert Chassell, Andreas Foerster, Denver Gingerich, Lisa Goldstein, Robert Hansen, Jean-Christophe
Helary, Teddy Hogeborn, Joe Humphries, J. Wren Hunt, Adam Johansen, Steve Morningthunder, Richard Stallman, J. Otto Tennant, Ole Tetlie, Keith Thompson, T.F. Torrey, and James Youngman. Trevis Rothwell wrote most of the text and serves as project maintainer.
Some example programs are based on algorithms in Donald Knuth's The Art of Computer Programming.
Please send bug reports and suggestions to gnu-c-manual@gnu.org.
1 Lexical Elements
This chapter describes the lexical elements that make up C source code after preprocessing. These elements are called tokens. There are five types of tokens: keywords, identifiers,
constants, operators, and separators. White space, sometimes required to separate tokens, is also described in this chapter.
1.1 Identifiers
Identifiers are sequences of characters used for naming variables, functions, new data types, and preprocessor macros. You can include letters, decimal digits, and the underscore character ‘_’
in identifiers.
The first character of an identifier cannot be a digit.
Lowercase letters and uppercase letters are distinct, such that foo and FOO are two different identifiers.
When using GNU extensions, you can also include the dollar sign character ‘$’ in identifiers.
1.2 Keywords
Keywords are special identifiers reserved for use as part of the programming language itself. You cannot use them for any other purpose.
Here is a list of keywords recognized by ANSI C89:
auto break case char const continue default do double else enum extern float for goto if int long register return short signed sizeof static struct switch typedef union unsigned void volatile while
ISO C99 adds the following keywords:
inline _Bool _Complex _Imaginary
and GNU extensions add these keywords:
__FUNCTION__ __PRETTY_FUNCTION__ __alignof __alignof__ __asm __asm__ __attribute __attribute__ __builtin_offsetof __builtin_va_arg __complex __complex__ __const __extension__ __func__ __imag __imag__ __inline __inline__ __label__ __null __real __real__ __restrict __restrict__ __signed __signed__ __thread __typeof __volatile __volatile__
In both ISO C99 and C89 with GNU extensions, the following is also recognized as a keyword:
restrict
1.3 Constants
A constant is a literal numeric or character value, such as 5 or 'm'. All constants are of a particular data type; you can use type casting to explicitly specify the type of a constant, or let the compiler use the default type based on the value of the constant.
1.3.1 Integer Constants
An integer constant is a sequence of digits, with an optional prefix to denote a number base.
If the sequence of digits is preceded by 0x or 0X (zero x or zero X), then the constant is considered to be hexadecimal (base 16). Hexadecimal values may use the digits from 0 to 9, as well as the letters a to f and A to F. Here are some examples:
0x2f 0x88 0xAB43 0xAbCd 0x1
If the first digit is 0 (zero), and the next character is not ‘x’ or ‘X’, then the constant is considered to be octal (base 8). Octal values may only use the digits from
0 to 7; 8 and 9 are not allowed. Here are some examples:
057 012 03 0241
In all other cases, the sequence of digits is assumed to be decimal (base 10). Decimal values may use the digits from 0 to 9. Here are some examples:
459 23901 8 12
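As a small sketch of how the three notations relate, the function below checks that the same value written in hexadecimal, octal, and decimal compares equal. The function name bases_agree is our own illustrative choice and does not come from any library.

```c
/* Returns 1 if the same values written in hexadecimal, octal, and
   decimal compare equal, and 0 otherwise.  The name is illustrative. */
int
bases_agree (void)
{
  int hex = 0x2f;   /* 2*16 + 15 = 47 */
  int oct = 057;    /* 5*8 + 7 = 47 */
  int dec = 47;
  int ten = 012;    /* 1*8 + 2 = 10: the leading zero makes this octal */

  return hex == dec && oct == dec && ten == 10;
}
```

Note in particular that 012 is ten, not twelve; a stray leading zero silently changes the base of a constant.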
There are various integer data types, for short integers, long integers, signed integers, and unsigned integers. You can force an integer constant to be of a long and/or unsigned integer type by appending a sequence of one or more letters to the
end of the constant:
u, U
- Unsigned integer type.
l, L
- Long integer type.
For example, 45U is an unsigned int constant. You can also combine letters: 45UL is an unsigned long int constant. (The letters may be used in any order.)
Both ISO C99 and GNU C extensions add the integer types long long int and unsigned long long int. You can use two ‘L’s to get a long long int constant; add a ‘U’ to that and you have an unsigned long long int constant. For example: 45ULL.
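One way to see which type a suffix selects is to apply sizeof to the constant, since sizeof yields the size of the constant's type. The following is a minimal sketch; both function names are hypothetical, and the 45ULL line assumes C99 or GNU C.

```c
/* Returns 1 if each suffixed constant has the size of the type the
   suffix is meant to select.  The name is illustrative only. */
int
suffixes_select_types (void)
{
  return sizeof 45 == sizeof (int)
         && sizeof 45U == sizeof (unsigned int)
         && sizeof 45L == sizeof (long int)
         && sizeof 45UL == sizeof (unsigned long int)
         && sizeof 45LU == sizeof (unsigned long int)  /* either order */
         && sizeof 45ULL == sizeof (unsigned long long int);
}

/* With an unsigned operand the subtraction wraps around instead of
   going negative, so the result is greater than zero. */
int
unsigned_suffix_wraps (void)
{
  return 45U - 46 > 0;
}
```

Comparing sizes cannot tell int apart from unsigned int, which is why the second function checks the unsigned behavior directly.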
1.3.2 Character Constants
A character constant is usually a single character enclosed within single quotation marks, such as 'Q'. A character constant is of type int by default.
Some characters, such as the single quotation mark character itself, cannot be represented using only one character. To represent such characters, there are several “escape sequences” that you can use:
\\
- Backslash character.
\?
- Question mark character.
\'
- Single quotation mark.
\"
- Double quotation mark.
\a
- Audible alert.
\b
- Backspace character.
\e
- <ESC> character. (This is a GNU extension.)
\f
- Form feed.
\n
- Newline character.
\r
- Carriage return.
\t
- Horizontal tab.
\v
- Vertical tab.
\o, \oo, \ooo
- Octal number.
\xh, \xhh, \xhhh, ...
- Hexadecimal number.
To use any of these escape sequences, enclose the sequence in single quotes, and treat it as if it were any other character. For example, the letter m is 'm' and the newline character is '\n'.
The octal number escape sequence is the backslash character followed by one, two, or three octal digits (0 to 7). For example, 101 is the octal equivalent of 65, which is the ASCII code for the character 'A'. Thus, the character constant '\101' is the same as the character constant 'A'.
The hexadecimal escape sequence is the backslash character, followed by x and an unlimited number of hexadecimal digits (0 to 9, and a to f or A to F).
While the length of possible hexadecimal digit strings is unlimited, the number of character constants in any given character set is not. (The much-used extended ASCII character set, for example, has only 256 characters in it.) If you try to use
a hexadecimal value that is outside the range of characters, you will get a compile-time error.
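The equivalences described above can be checked directly. This sketch assumes an ASCII execution character set, as the examples in this section do; the function name escapes_match is hypothetical.

```c
/* Returns 1 if the escape sequences below denote the expected ASCII
   codes.  Assumes ASCII; the function name is illustrative only. */
int
escapes_match (void)
{
  return '\101' == 'A'    /* octal 101 is decimal 65 */
         && '\x41' == 'A' /* hexadecimal 41 is also decimal 65 */
         && '\n' == 10    /* newline is code 10 in ASCII */
         && '\\' == 92    /* backslash is code 92 in ASCII */
         && sizeof 'Q' == sizeof (int);  /* character constants are int */
}
```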
1.3.3 Real Number Constants
A real number constant is a value that represents a fractional (floating point) number. It consists of a sequence of digits which represents the integer (or “whole”) part of the number, a decimal point, and a sequence of digits which represents the fractional part.
Either the integer part or the fractional part may be omitted, but not both. Here are some examples:
double a, b, c, d, e, f;
a = 4.7;
b = 4.;
c = 4;
d = .7;
e = 0.7;
(In the third assignment statement, the integer constant 4 is automatically converted from an integer value to a double value.)
Real number constants can also be followed by e or E, and an integer exponent. The exponent can be either positive or negative.
double x, y;
x = 5e2;   /* x is 5 * 100, or 500.0. */
y = 5e-2;  /* y is 5 * (1/100), or 0.05. */
You can append a letter to the end of a real number constant to cause it to be of a particular type. If you append the letter F (or f) to a real number constant, then its type is float. If you append the letter L (or l), then its type is long double. If you do not append any letters, then its type is double.
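As with integer constants, sizeof offers a quick way to confirm which type a real number constant has. The sketch below uses a hypothetical function name and only compares values that are exactly representable in binary floating point.

```c
/* Returns 1 if the constants below have the expected types and
   values.  The function name is illustrative only. */
int
real_constant_types (void)
{
  return sizeof 4.7 == sizeof (double)          /* no suffix: double */
         && sizeof 4.7f == sizeof (float)       /* f or F: float */
         && sizeof 4.7L == sizeof (long double) /* l or L: long double */
         && 5e2 == 500.0                        /* exponent notation */
         && 4. == 4.0                           /* fractional part omitted */
         && .5 == 0.5;                          /* integer part omitted */
}
```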
1.3.4 String Constants
A string constant is a sequence of zero or more characters, digits, and escape sequences enclosed within double quotation marks. A string constant is of type “array of characters”. All string constants contain a null termination character (\0) as their last character. Strings are stored as arrays of characters, with no inherent size attribute. The null termination character lets string-processing functions know where the string ends.
Adjacent string constants are concatenated (combined) into one string, with the null termination character added to the end of the final concatenated string.
A string cannot contain double quotation marks, as double quotation marks are used to enclose the string. To include the double quotation mark character in a string, use the \" escape sequence. You can use any of the escape sequences that can be used as character constants in strings. Here are some examples of string constants:
/* This is a single string constant. */
"tutti frutti ice cream"
/* These string constants will be concatenated, same as above. */
"tutti " "frutti" " ice " "cream"
/* This one uses two escape sequences. */
"\"hello, world!\""
If a string is too long to fit on one line, you can use a backslash \ to break it up onto separate lines.
"Today's special is a pastrami sandwich on rye bread with \
a potato knish and a cherry soda."
Adjacent strings are automatically concatenated, so you can also have string constants span multiple lines by writing them as separate, adjacent strings. For example:
"Tomorrow's special is a corned beef sandwich on "
"pumpernickel bread with a kasha knish and seltzer water."
is the same as
"Tomorrow's special is a corned beef sandwich on \
pumpernickel bread with a kasha knish and seltzer water."
To insert a newline character into the string, so that when the string is printed it will be printed on two different lines, you can use the newline escape sequence ‘\n’.
printf ("potato\nknish");
prints
potato
knish
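The concatenation and null termination rules described in this section can be sketched with the standard strcmp and strlen functions from <string.h>. The two function names below are hypothetical.

```c
#include <string.h>

/* Returns 1 if adjacent string constants form the same string as a
   single constant.  The function name is illustrative only. */
int
concatenation_is_equivalent (void)
{
  const char *single = "tutti frutti ice cream";
  const char *joined = "tutti " "frutti" " ice " "cream";

  return strcmp (single, joined) == 0;
}

/* sizeof counts the null termination character; strlen does not. */
int
null_terminator_is_counted (void)
{
  return sizeof "knish" == 6 && strlen ("knish") == 5;
}
```

An escape sequence such as \n occupies a single character in the stored string, so strlen ("potato\nknish") is 12.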
1.4 Operators
An operator is a special token that performs an operation, such as addition or subtraction, on either one, two, or three operands. Full coverage of operators can be found in a later
chapter. See Expressions and Operators.
1.5 Separators
A separator separates tokens. White space (see next section) is a separator, but it is not a token. The other separators are all single-character tokens themselves:
( ) [ ] { } ; , . :
1.6 White Space
White space is the collective term used for several characters: the space character, the tab character, the newline character, the vertical tab character, and the form-feed character. White space is ignored
(outside of string and character constants), and is therefore optional, except when it is used to separate tokens. This means that
#include <stdio.h>
int
main()
{
  printf( "hello, world\n" );
  return 0;
}
and
#include <stdio.h>
int main(){printf("hello, world\n");
return 0;}
are functionally the same program.
Although you must use white space to separate many tokens, no white space is required between operators and operands, nor is it required between other separators and that which they separate.
/* All of these are valid. */
x++;
x ++ ;
x=y+z;
x = y + z ;
x=array[2];
x = array [ 2 ] ;
fraction=numerator / *denominator_ptr;
fraction = numerator / * denominator_ptr ;
Furthermore, wherever one space is allowed, any amount of white space is allowed.
/* These two statements are functionally identical. */
x++;
x
++ ;
In string constants, spaces and tabs are not ignored; rather, they are part of the string. Therefore,
"potato knish"
is not the same as
"potato     knish"
2 Data Types
2.1 Primitive Data Types
2.1.1 Integer Types
The integer data types range in size from at least 8 bits to at least 32 bits. The C99 standard extends this range to include integer sizes of at least 64 bits. You should use integer types for storing whole number values (and the char data type for storing characters). The sizes and ranges listed for these types are minimums; depending on your computer platform, these sizes and ranges may be larger.
While these ranges provide a natural ordering, the standard does not require that any two types have a different range. For example, it is common for int and long to have the same range. The standard even allows signed char and long to have the same range, though such platforms are very unusual.
signed char
- The 8-bit signed char data type can hold integer values in the range of −128 to 127.
unsigned char
- The 8-bit unsigned char data type can hold integer values in the range of 0 to 255.
char
- Depending on your system, the char data type is defined as having the same range as either the signed char or the unsigned char data type (they are three distinct types, however). By convention, you should use the char data type specifically for storing ASCII characters (such as 'm'), including escape sequences (such as '\n').
short int
- The 16-bit short int data type can hold integer values in the range of −32,768 to 32,767. You may also refer to this data type as short, signed short int, or signed short.
unsigned short int
- The 16-bit unsigned short int data type can hold integer values in the range of 0 to 65,535. You may also refer to this data type as unsigned short.
int
- The 32-bit int data type can hold integer values in the range of −2,147,483,648 to 2,147,483,647. You may also refer to this data type as signed int or signed.
unsigned int
- The 32-bit unsigned int data type can hold integer values in the range of 0 to 4,294,967,295. You may also refer to this data type simply as unsigned.
long int
- The 32-bit long int data type can hold integer values in the range of at least −2,147,483,648 to 2,147,483,647. (Depending on your system, this data type might be 64-bit, in which case its range is identical to that of the long long int data type.) You may also refer to this data type as long, signed long int, or signed long.
unsigned long int
- The 32-bit unsigned long int data type can hold integer values in the range of at least 0 to 4,294,967,295. (Depending on your system, this data type might be 64-bit, in which case its range is identical to that of the unsigned long long int data type.) You may also refer to this data type as unsigned long.
long long int
- The 64-bit long long int data type can hold integer values in the range of −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. You may also refer to this data type as long long, signed long long int, or signed long long. This type is not part of C89, but is both part of C99 and a GNU C extension.
unsigned long long int
- The 64-bit unsigned long long int data type can hold integer values in the range of at least 0 to 18,446,744,073,709,551,615. You may also refer to this data type as unsigned long long. This type is not part of C89, but is both part of C99 and a GNU C extension.
Here are some examples of declaring and defining integer variables:
int foo;
unsigned int bar = 42;
char quux = 'a';
The first line declares an integer named foo but does not define its value; it is left uninitialized, and its value should not be assumed to be anything in particular.
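Because the listed sizes and ranges may differ on your platform, it can help to check the actual limits, which the standard header <limits.h> provides as macros. The following sketch tests whether the platform it is compiled on meets the figures listed above; the function names are hypothetical, and LLONG_MAX assumes C99 or GNU C.

```c
#include <limits.h>

/* Returns 1 if this platform meets the ranges listed in this section,
   and 0 otherwise.  The function name is illustrative only. */
int
meets_listed_ranges (void)
{
  return CHAR_BIT >= 8
         && SCHAR_MIN <= -127 && SCHAR_MAX >= 127
         && UCHAR_MAX >= 255
         && SHRT_MAX >= 32767 && USHRT_MAX >= 65535
         && INT_MAX >= 2147483647
         && LONG_MAX >= INT_MAX
         && LLONG_MAX >= LONG_MAX;
}

/* Returns 1 if plain char has the range of signed char on this
   platform, and 0 if it has the range of unsigned char. */
int
char_is_signed (void)
{
  return CHAR_MIN < 0;
}
```

Checking the macros rather than assuming a width keeps code correct on systems where, for example, long int is 64 bits wide.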
2.1.2 Real Number Types
There are three data types that represent fractional numbers. While the sizes and ranges of these types are consistent across most computer systems in use today, historically the sizes of these types varied from system to system. As such, the minimum and maximum values are stored in macro definitions in the library header file float.h. In this section, we include the names of the macro definitions in place of their possible values; check your system's float.h for specific numbers.
float
- The float data type is the smallest of the three floating point types, if they differ in size at all. Its minimum positive normalized value is stored in FLT_MIN, and should be no greater than 1e-37. Its maximum value is stored in FLT_MAX, and should be no less than 1e37.
double
- The double data type is at least as large as the float type, and it may be larger. Its minimum positive normalized value is stored in DBL_MIN, and its maximum value is stored in DBL_MAX.
long double
- The long double data type is at least as large as the double type, and it may be larger. Its minimum positive normalized value is stored in LDBL_MIN, and its maximum value is stored in LDBL_MAX.
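The macros named above can be inspected directly. This sketch checks the bounds stated in this section against whatever values your system's float.h defines; the function name is hypothetical.

```c
#include <float.h>

/* Returns 1 if the limits in float.h satisfy the bounds described in
   this section, and 0 otherwise.  The name is illustrative only. */
int
float_limits_are_consistent (void)
{
  return FLT_MIN > 0                 /* the minimum is a positive value */
         && FLT_MIN <= 1e-37 && FLT_MAX >= 1e37
         && DBL_MIN <= 1e-37 && DBL_MAX >= FLT_MAX
         && LDBL_MIN <= 1e-37L && LDBL_MAX >= DBL_MAX;
}
```

Note that the minimum macros hold small positive values, not large negative ones; the most negative float is -FLT_MAX.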
All floating point data types are signed; trying to use unsigned float, for example, will cause a compile-time error.
Here are some examples of declaring and defining real number variables:
float foo;
double bar = 114.3943;
The first line declares a float named foo but does not define its value; it is left uninitialized, and its value should not be assumed to be anything in particular.
The real number types provided in C are of finite precision, and accordingly, not all real numbers can be represented exactly. Most computer systems that GCC compiles for use a binary representation for real numbers, which is unable to precisely represent numbers such as, for example, 4.2. For this reason, we recommend that you consider not comparing real numbers for exact equality with the == operator, but rather check that real numbers are within an acceptable tolerance.
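A tolerance check might be sketched as follows. The helper name nearly_equal and the use of an absolute tolerance are our own illustrative choices, not something the language prescribes; a relative tolerance is often more appropriate when the values involved vary widely in magnitude.

```c
/* Returns 1 if a and b differ by no more than tolerance, and 0
   otherwise.  Hypothetical helper using an absolute tolerance. */
int
nearly_equal (double a, double b, double tolerance)
{
  double diff = a - b;

  if (diff < 0)
    diff = -diff;   /* absolute value, without needing math.h */
  return diff <= tolerance;
}
```

For example, nearly_equal (0.1 + 0.2, 0.3, 1e-9) returns 1, even though the two values are not exactly equal in IEEE 754 double arithmetic.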
There are other more subtle implications of these imprecise representations; for more details, see David Goldberg's paper What Every Computer Scientist Should Know About Floating-Point Arithmetic and section
4.2.2 of Donald Knuth's The Art of Computer Programming.
2.1.3 Complex Number Types
GCC introduced some complex number types as an extension to C89. Similar features were introduced in C99, but there were a number of differences. We describe the standard complex number types first.
2.1.3.1 Standard Complex Number Types
Complex types were introduced in C99. There are three complex types:
float _Complex
double _Complex
long double _Complex
The names here begin with an underscore and an uppercase letter in order to avoid conflicts with existing programs' identifiers. However, the C99 standard header file <complex.h> introduces some macros which make using complex types easier.
complex
- Expands to _Complex. This allows a variable to be declared as double complex which seems more natural.
I
- A constant of type const float _Complex having the value of the imaginary unit normally referred to as i.
The <complex.h> header file also declares a number of functions for performing computations on complex numbers, for example the creal and cimag functions which respectively return the real and imaginary parts of a double complex number. Other functions are also provided, as shown in this example:
#include <complex.h>
#include <stdio.h>
void example (void)
{
  complex double z = 1.0 + 3.0*I;
  printf ("Phase is %f, modulus is %f\n", carg (z), cabs (z));
}
2.1.3.2 GNU Extensions for Complex Number Types
GCC also introduced complex types as a GNU extension to C89, but the spelling is different. The floating-point complex types in GCC's C89 extension are:
__complex__ float
__complex__ double
__complex__ long double
GCC's extension allows for complex types other than floating-point, so that you can declare complex character types and complex integer types; in fact __complex__ can be used with any of the primitive data types. We won't give you a complete list of all possibilities, but here are some examples:
__complex__ float
- The __complex__ float data type has two components: a real part and an imaginary part, both of which are of the float data type.
__complex__ int
- The __complex__ int data type also has two components: a real part and an imaginary part, both of which are of the int data type.
To extract the real part of a complex-valued expression, use the keyword __real__, followed by the expression. Likewise, use __imag__ to extract the imaginary part.
__complex__ float a = 4 + 3i;
float b = __real__ a;  /* b is now 4. */
float c = __imag__ a;  /* c is now 3. */
This example creates a complex floating point variable a, and defines its real part as 4 and its imaginary part as 3. Then, the real part is assigned to the floating point variable b, and the imaginary part is assigned to the floating point variable c.
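The same keywords work with the complex integer types. The following sketch relies entirely on GNU extensions (__complex__ int, the i suffix, __real__, and __imag__), so it is not portable to compilers that lack them; the function name is hypothetical.

```c
/* Adds a complex integer to itself and returns the sum of the real
   and imaginary parts of the result.  GNU extensions only; the
   function name is illustrative. */
int
complex_int_parts_sum (void)
{
  __complex__ int z = 4 + 3i;
  __complex__ int w = z + z;       /* component-wise: 8 + 6i */

  return __real__ w + __imag__ w;  /* 8 + 6 */
}
```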