Issue
I use gcc to compile some simple test code for an ARM Cortex-M4, and the way it optimizes accesses to a global variable confuses me. What rules does gcc follow when optimizing the use of global variables?
GCC compiler: gcc-arm-none-eabi-8-2019-q3-update/bin/arm-none-eabi-gcc
Optimization level: -Os
My test code:
The following code is in "foo.c". The functions foo1() and foo2() are called in task A, and the function global_cnt_add() is called in task B.
int g_global_cnt = 0;
void dummy_func(void);
void global_cnt_add(void)
{
    g_global_cnt++;
}
int foo1(void)
{
    while (g_global_cnt == 0) {
        // do nothing
    }
    return 0;
}
int foo2(void)
{
    while (g_global_cnt == 0) {
        dummy_func();
    }
    return 0;
}
The function dummy_func() is implemented in bar.c as follows:
void dummy_func(void)
{
    // do nothing
}
The assembly code of function foo1() is shown below:
int foo1(void)
{
while (g_global_cnt == 0) {
201218: 4b02 ldr r3, [pc, #8] ; (201224 <foo1+0xc>)
20121a: 681b ldr r3, [r3, #0]
20121c: b903 cbnz r3, 201220 <foo1+0x8>
20121e: e7fe b.n 20121e <foo1+0x6>
// do nothing
}
return 0;
}
201220: 2000 movs r0, #0
201222: 4770 bx lr
201224: 00204290 .word 0x00204290
The assembly code of function foo2() is shown below:
int foo2(void)
{
201228: b510 push {r4, lr}
while (g_global_cnt == 0) {
20122a: 4c04 ldr r4, [pc, #16] ; (20123c <foo2+0x14>)
20122c: 6823 ldr r3, [r4, #0]
20122e: b10b cbz r3, 201234 <foo2+0xc>
dummy_func();
}
return 0;
}
201230: 2000 movs r0, #0
201232: bd10 pop {r4, pc}
dummy_func();
201234: f1ff fcb8 bl 400ba8 <dummy_func>
201238: e7f8 b.n 20122c <foo2+0x4>
20123a: bf00 nop
20123c: 00204290 .word 0x00204290
In the assembly code of function foo1(), the global variable "g_global_cnt" is loaded only once, so the while loop can never exit. The compiler has optimized the accesses to "g_global_cnt", and I know I can add volatile to prevent this optimization.
In the assembly code of function foo2(), the global variable "g_global_cnt" is loaded and checked on every iteration, so the while loop can exit.
Which gcc optimization rules cause this difference?
Solution
To understand this behaviour, you have to think about side effects and sequence points.
For the compiler, a side effect is a result of an operator, expression, statement, or function that persists even after the operator, expression, statement, or function has finished being evaluated.
A sequence point, in turn, defines any point in a computer program's execution at which it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed.
The main rule tied to sequence points is that, between two consecutive sequence points, an object's stored value may be modified at most once, and its prior value may be read only to determine the value to be stored.
Citing the C standard:
In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
In your code
int foo1(void)
{
    while (g_global_cnt == 0) {
        // do nothing
    }
    return 0;
}
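After g_global_cnt has been read, nothing in the loop has a side effect that could change its value. The compiler cannot see that the variable is modified outside the function, so it concludes that reading it once is enough; there are no further sequence points within the function's scope. In foo2(), by contrast, the loop calls dummy_func(), which is defined in another file. The compiler cannot see its body, has to assume the call may modify g_global_cnt, and therefore reloads the variable after every call.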
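As a rough illustration, the single load seen in the foo1() disassembly corresponds to C code like the sketch below. The function name foo1_as_compiled and the local variable cached are hypothetical and used only for this illustration; the comments map each step to the instructions from the listing in the question.

```c
/* Sketch only: foo1_as_compiled and cached are hypothetical names. */
int g_global_cnt = 0;

/* Roughly what the -Os output of foo1() amounts to: the global is
   loaded once, and the loop then spins without touching memory. */
int foo1_as_compiled(void)
{
    int cached = g_global_cnt;   /* ldr r3, [r3, #0]  (the only load)   */
    if (cached == 0) {           /* cbnz r3, 201220                     */
        for (;;) {               /* b.n 20121e  (branches to itself)    */
        }
    }
    return 0;                    /* movs r0, #0 ; bx lr                 */
}
```

If g_global_cnt is already non-zero on entry, the function returns 0 immediately; otherwise a later increment from another task is never observed.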
The way to tell the compiler that each read has side effects is to declare the variable with the type qualifier volatile.
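A minimal sketch of that change, reusing the names from the question: only the declaration differs. Note that volatile only forces a fresh load on every check; it does not by itself make g_global_cnt++ atomic, so this sketch assumes a single writer, as in the question's two-task setup.

```c
/* Same code as in the question, with volatile added to the declaration. */
volatile int g_global_cnt = 0;

void global_cnt_add(void)
{
    g_global_cnt++;              /* read-modify-write, not atomic */
}

int foo1(void)
{
    /* Every evaluation of the condition is now a real load from memory,
       so an update made by another task or an interrupt is seen. */
    while (g_global_cnt == 0) {
        /* do nothing */
    }
    return 0;
}
```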
With int g_global_cnt = 0; the compiler emits (AArch64 output):
adrp x0, g_global_cnt
add x0, x0, :lo12:g_global_cnt
ldr w0, [x0]
cmp w0, 0
beq .L3
mov w0, 0
ret
With volatile int g_global_cnt = 0; the compiler emits (AArch64 output):
adrp x0, g_global_cnt
add x0, x0, :lo12:g_global_cnt
ldr w0, [x0]
cmp w0, 0
cset w0, eq
and w0, w0, 255
cmp w0, 0
bne .L3
mov w0, 0
ret
Answered By - Fra93 Answer Checked By - Gilberto Lyons (WPSolving Admin)