Sunday, January 9, 2022

[SOLVED] vectorization fails with GCC

Issue

I am trying to understand vectorization, but to my surprise this very simple code is not being vectorized:

#define n 1024
int main () {
  int i, a[n], b[n], c[n];

  for(i=0; i<n; i++) { a[i] = i; b[i] = i*i; }
  for(i=0; i<n; i++) c[i] = a[i]+b[i];
}

The Intel compiler, for some reason, vectorizes only the initialization loop (line 5):

> icc -vec-report a.c
a.c(5): (col. 3) remark: LOOP WAS VECTORIZED

With GCC, I seem to get nothing:

> gcc -ftree-vectorize -ftree-vectorizer-verbose=2 a.c

Am I doing something wrong? Shouldn't this be a very simple vectorizable loop? All the same operations, contiguous memory, etc. My CPU supports SSE1/2/3/4.

--- update ---

Following the answer below, this example works for me.

#include <stdio.h>
#define n 1024

int main () {
  int i, a[n], b[n], c[n];

  for(i=0; i<n; i++) { a[i] = i; b[i] = i*i; }
  for(i=0; i<n; i++) c[i] = a[i]+b[i];

  printf("%d\n", c[1023]);  
}

With icc

> icc -vec-report a.c
a.c(7): (col. 3) remark: LOOP WAS VECTORIZED
a.c(8): (col. 3) remark: LOOP WAS VECTORIZED

And gcc

> gcc -ftree-vectorize -fopt-info-vec -O a.c
a.c:8:3: note: loop vectorized
a.c:7:3: note: loop vectorized

Solution

I've slightly modified your source code to be sure that GCC couldn't remove the loops:

#include <stdio.h>
#define n 1024

int main () {
  int i, a[n], b[n], c[n];

  for(i=0; i<n; i++) { a[i] = i; b[i] = i*i; }
  for(i=0; i<n; i++) c[i] = a[i]+b[i];

  printf("%d\n", c[1023]);  
}

GCC (v4.8.2) can vectorize both loops, but only with optimization enabled (an -O flag); at the default -O0, -ftree-vectorize alone has no effect:

gcc -ftree-vectorize -ftree-vectorizer-verbose=1 -O2 a.c

and I get:

Analyzing loop at a.c:8

Vectorizing loop at a.c:8

a.c:8 note: LOOP VECTORIZED.
Analyzing loop at a.c:7

Vectorizing loop at a.c:7

a.c:7 note: LOOP VECTORIZED.
a.c: note: vectorized 2 loops in function.
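On the point about GCC removing the loops: printing a result is one way to keep them alive. Another way, sketched here as an assumption rather than anything from the original code, is to move the loops into a function with external linkage that writes through caller-supplied pointers. The name fill_and_add and its signature are hypothetical. Since the compiler cannot know who will call the function or read the arrays, it has to keep both loops, and the file can be compiled on its own (for example with -c and the same vectorizer flags as above) to see what the vectorizer reports.

```c
/* Hypothetical helper, not part of the original post.
   It has external linkage and writes through pointers owned by the
   caller, so the stores are observable and the loops cannot be
   discarded as dead code. */
void fill_and_add(int *a, int *b, int *c, int n)
{
    int i;

    /* Same initialization loop as in the question. */
    for (i = 0; i < n; i++) { a[i] = i; b[i] = i * i; }

    /* Same element-wise addition as in the question. */
    for (i = 0; i < n; i++) c[i] = a[i] + b[i];
}
```

Because a, b and c are plain pointers here, the compiler may have to guard a vectorized version with runtime overlap checks, so the report can differ from the one for the local arrays in main.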

With the -fdump-tree-vect switch, GCC dumps more information into the a.c.##t.vect file (quite useful for getting an idea of what is happening "inside").
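To make "what is happening inside" a little more concrete, here is a hand-written sketch of roughly what vectorizing the addition loop amounts to on an SSE2 target: four ints are loaded, added and stored per iteration, and a scalar loop handles any leftover elements. This is an illustration only, not GCC's actual output. The function name add_arrays is hypothetical, and the code falls back to the plain scalar loop when SSE2 is not available.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 integer intrinsics */
#endif

/* Hypothetical illustration of a manually vectorized c[i] = a[i] + b[i]. */
void add_arrays(const int *a, const int *b, int *c, size_t n)
{
    size_t i = 0;

#if defined(__SSE2__)
    /* Main loop: process 4 ints (128 bits) per iteration.
       Unaligned loads/stores are used so no alignment is assumed. */
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(c + i), _mm_add_epi32(va, vb));
    }
#endif

    /* Tail loop: remaining elements (or all of them without SSE2). */
    for (; i < n; i++) c[i] = a[i] + b[i];
}
```

With n fixed at 1024, as in the question, the tail loop would never run; it is kept here so the sketch works for any length.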

Also consider that:



Answered By - manlio