In the article "A Guide to Understanding Even the Most Complex C Declarations", Greg Comeau presents a set of rules that can be applied to interpret any C declaration, however complex it may seem. While the rules are intuitive and might appeal to most advanced C programmers, the beginner may find them difficult to grasp. He does, however, start the article by presenting a simple rule set for reading and writing declarations in the style of Kernighan and Ritchie (authors of the famous book The C Programming Language, 1978). In this Primer I will present these rules and elaborate on them with examples.
Here is the standard syntax for C declarations. A C declaration is of the form:
    storage-class type qualifier declarator = initializer;
where storage-class is only one of the following:
    typedef, extern, static, auto, register
A type could be one or more of the following:
    void, char, short, int, long, float, double, signed, unsigned, struct ..., union ..., typedef type
A qualifier could be one or more of the following:
    const, volatile
A declarator contains an identifier and one or more, or none at all, of the following in a variety of combinations:
    *  ()  []
possibly grouped within parentheses to create different bindings.
The term storage-class refers to the method by which an object is assigned space in memory. Chapter 4 of the C Primer gives detailed descriptions of what each storage class means. Suffice it to say that the storage class has no bearing on understanding what is being declared: it only tells you where the object is declared and how it is assigned space in memory. Likewise, the qualifiers const and volatile indicate, respectively, that an entity may not be modified and that the entity may be modified elsewhere. Henceforth we will therefore ignore these two pieces of information.
The above definition simply says what a declaration ought to look like (that is, its syntax). The key phrase in it is "to create different bindings": parenthesizing a declaration differently gives it a different interpretation. To understand any complex C declaration, then, all one has to know is that declarations follow the C operator precedence chart, the same one you use to evaluate expressions in C:

Precedence   Operators                                                                            Associativity
highest      () [] . -> ++(postfix) --(postfix)                                                   left to right
             ++(prefix) --(prefix) ! ~ sizeof (type) +(unary) -(unary) &(address) *(dereference)  right to left
             * / %                                                                                left to right
             + -                                                                                  left to right
             << >>                                                                                left to right
             < <= > >=                                                                            left to right
             == !=                                                                                left to right
             &                                                                                    left to right
             ^                                                                                    left to right
             |                                                                                    left to right
             &&                                                                                   left to right
             ||                                                                                   left to right
             ? :                                                                                  right to left
             = += -= *= /= %= <<= >>= |= &= ^=                                                    right to left
lowest       ,                                                                                    left to right
Example 1: uint16_t i;
The parenthesization of the above declaration is:
    uint16_t (i); {1}
Applying the rules (see above) to the parenthesized expression can be done as follows:
    The innermost set of parentheses is (i) {2}
    i is the variable name, therefore we say "i is ..." {3}
    No more parentheses left so we say "an unsigned 16-bit integer". {4,5,6}
That is, "i is an unsigned 16-bit integer"
Example 2: uint16_t *i;
The parenthesization of the above declaration is:
    uint16_t (*(i)); {1}
Applying the rules (see above) to the parenthesized expression can be done as follows:
    The innermost set of parentheses is (i) {2}
    i is the variable name, therefore we say "i is ..." {3}
    Move to the next set of parentheses: (*(i)) {4}
    Go back to step 3. {5}
    We say "a pointer to ..." since we see a * {3.b}
    No more parentheses left so we say "an unsigned 16-bit integer". {4,5,6}
That is, "i is a pointer to an unsigned 16-bit integer"
Example 3: uint16_t *i[3];
The parenthesization of the above declaration is:
    uint16_t (*((i)[3])); {1}
    // Note that () and [] have the same precedence but we deal with them from left to right.
    // Both bind more tightly than *.
Applying the rules (see above) to the parenthesized expression can be done as follows:
    The innermost set of parentheses is (i) {2}
    i is the variable name, therefore we say "i is ..." {3}
    Move to the next set of parentheses: ((i)[3]) {4}
    Go back to step 3. {5}
    We say "an array of 3 ..." since we see a [3] {3.a}
    Move to the next set of parentheses: (*((i)[3])) {4}
    Go back to step 3. {5}
    We say "pointers to ..." since we see a * {3.b}
    No more parentheses left so we say "unsigned 16-bit integers". {4,5,6}
That is, "i is an array of 3 pointers to unsigned 16-bit integers"
Example 4: uint16_t (*i)[3];
The parenthesization of the above declaration is:
    uint16_t ((*(i))[3]); {1}
    // Note that parentheses are valid tokens in a declaration and therefore
    // must be left in place when finding the final parenthesization.
Applying the rules (see above) to the parenthesized expression can be done as follows:
    The innermost set of parentheses is (i) {2}
    i is the variable name, therefore we say "i is ..." {3}
    Move to the next set of parentheses: (*(i)) {4}
    Go back to step 3. {5}
    We say "a pointer to ..." since we see a * {3.b}
    Move to the next set of parentheses: ((*(i))[3]) {4}
    Go back to step 3. {5}
    We say "an array of 3 ..." since we see a [3] {3.a}
    No more parentheses left so we say "unsigned 16-bit integers". {4,5,6}
That is, "i is a pointer to an array of 3 unsigned 16-bit integers"
Example 5: uint16_t *i();
The parenthesization of the above declaration is:
    uint16_t (*((i)())); {1}
    // Note that * has a lower precedence than parentheses, ()
Applying the rules (see above) to the parenthesized expression can be done as follows:
    The innermost set of parentheses is (i) {2}
    // One could argue that () is also the innermost set of parentheses, but
    // it does not contain anything so we know it must indicate a function.
    i is the variable name, therefore we say "i is ..." {3}
    Move to the next set of parentheses: ((i)()) {4}
    Go back to step 3. {5}
    We say "a function returning ..." since we see a () {3.c}
    Move to the next set of parentheses: (*((i)())) {4}
    Go back to step 3. {5}
    We say "a pointer to ..." since we see a * {3.b}
    No more parentheses left so we say "an unsigned 16-bit integer". {4,5,6}
That is, "i is a function returning a pointer to an unsigned 16-bit integer"
Example 6: uint16_t (*i)();
The parenthesization of the above declaration is:
    uint16_t ((*(i))()); {1}
    // Note that parentheses are valid tokens in a declaration and therefore
    // must be left in place when finding the final parenthesization.
Applying the rules (see above) to the parenthesized expression can be done as follows:
    The innermost set of parentheses is (i) {2}
    i is the variable name, therefore we say "i is ..." {3}
    Move to the next set of parentheses: (*(i)) {4}
    Go back to step 3. {5}
    We say "a pointer to ..." since we see a * {3.b}
    Move to the next set of parentheses: ((*(i))()) {4}
    Go back to step 3. {5}
    We say "a function returning ..." since we see a () {3.c}
    No more parentheses left so we say "an unsigned 16-bit integer". {4,5,6}
That is, "i is a pointer to a function returning an unsigned 16-bit integer"