I am developing embedded software on an ARM Cortex-A53 (armv7/v8) for image processing, and I need some help with a specific algorithm.

The latter involves **inversion of large complex matrices** (real/imaginary parts, up to 24x24).
The most significant constraint is **execution time**.

I am currently looking for a library optimized for ARM (using NEON SIMD?) that offers this kind of operation (inversion of large matrices of complex numbers).

I tried the **Eigen** library: it works fine, but it is **too slow**.
Our processor runs at 1 GHz, and I measured an average duration of 0.82 ms for the inversion of a 24x24 complex double matrix.
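
To make the operation concrete, here is a minimal, dependency-free sketch of the kind of inversion being timed: plain Gauss-Jordan elimination with partial pivoting on `std::complex<double>`. It is not what Eigen does internally and it is not optimized (no NEON, no blocking). The helper name `invertComplex` and the row-major layout are assumptions made only for this illustration.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cd = std::complex<double>;

// Hypothetical helper (not a library API): inverts the n x n row-major
// matrix `a` in place with Gauss-Jordan elimination and partial pivoting.
// Returns false if a zero pivot is met (matrix singular to working precision).
bool invertComplex(std::vector<cd>& a, std::size_t n)
{
    // `inv` starts as the identity and receives the same row operations as `a`.
    std::vector<cd> inv(n * n, cd(0.0, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        inv[i * n + i] = cd(1.0, 0.0);

    for (std::size_t col = 0; col < n; ++col) {
        // Partial pivoting: pick the row with the largest magnitude in this column.
        std::size_t piv = col;
        double best = std::abs(a[col * n + col]);
        for (std::size_t r = col + 1; r < n; ++r) {
            const double v = std::abs(a[r * n + col]);
            if (v > best) { best = v; piv = r; }
        }
        if (best == 0.0)
            return false;
        if (piv != col) {
            for (std::size_t k = 0; k < n; ++k) {
                std::swap(a[col * n + k], a[piv * n + k]);
                std::swap(inv[col * n + k], inv[piv * n + k]);
            }
        }

        // Scale the pivot row so that the pivot becomes 1.
        const cd scale = cd(1.0, 0.0) / a[col * n + col];
        for (std::size_t k = 0; k < n; ++k) {
            a[col * n + k] *= scale;
            inv[col * n + k] *= scale;
        }

        // Eliminate this column from every other row.
        for (std::size_t r = 0; r < n; ++r) {
            if (r == col)
                continue;
            const cd f = a[r * n + col];
            for (std::size_t k = 0; k < n; ++k) {
                a[r * n + k] -= f * a[col * n + k];
                inv[r * n + k] -= f * inv[col * n + k];
            }
        }
    }

    a.swap(inv);  // `a` now holds the inverse
    return true;
}
```

For a 24x24 matrix this would be called as `invertComplex(m, 24)` with `m` holding 576 row-major elements; it only serves as a reference for checking the results of an optimized library.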

I also had a look at:

- Ne10 library: it does not offer matrix inversion for complex numbers (it is not templated), and it is limited to small matrices (maximum size is 4x4).
- ARM Compute Library: I cannot find a matrix inversion function. I am not sure it is implemented.
- ARM Performance Libraries: I am still looking for the matrix inversion routine. Does anyone know which function/class I should look at?

**Could you suggest any C++ libraries, optimized for ARM, that can do this (ideally templated, so they can be used with complex numbers)?**

Thanking you in advance!

Laurent.

**UPDATE**

I am currently testing the C interface to LAPACK (LAPACKE) to perform the matrix inversion.

I have a problem linking my code.
Here is the C++ code (a very simple example found on the internet):

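```
#include "test_lapacke.hh"

#include <cstdio>
#include <vector>

extern "C"
{
#include <lapacke.h>
}

// Inverts the n x n column-major matrix A in place.
// Returns 0 on success, or the LAPACK info code on failure.
lapack_int matInv(double *A, unsigned n)
{
    // Pivot indices produced by the LU factorization
    // (std::vector instead of a variable-length array, which is not standard C++).
    std::vector<lapack_int> ipiv(n);

    // Step 1: LU factorization of A.
    lapack_int ret = LAPACKE_dgetrf(LAPACK_COL_MAJOR, n, n, A, n, ipiv.data());
    if (ret != 0)
        return ret;

    // Step 2: compute the inverse from the LU factors.
    ret = LAPACKE_dgetri(LAPACK_COL_MAJOR, n, A, n, ipiv.data());
    return ret;
}

/**********************************************************************************************************************/
void Test_lapacke()
{
    double A[] = {
        0.378589, 0.971711, 0.016087, 0.037668, 0.312398,
        0.756377, 0.345708, 0.922947, 0.846671, 0.856103,
        0.732510, 0.108942, 0.476969, 0.398254, 0.507045,
        0.162608, 0.227770, 0.533074, 0.807075, 0.180335,
        0.517006, 0.315992, 0.914848, 0.460825, 0.731980
    };

    // Print the input matrix, 5 values per line.
    for (int i = 0; i < 25; i++) {
        if ((i % 5) == 0) putchar('\n');
        printf("%+12.8f ", A[i]);
    }
    putchar('\n');

    matInv(A, 5);

    // Print the inverted matrix, 5 values per line.
    for (int i = 0; i < 25; i++) {
        if ((i % 5) == 0) putchar('\n');
        printf("%+12.8f ", A[i]);
    }
    putchar('\n');
}
```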

I added all the BLAS/LAPACK libraries to the C++ linker options of my project: `-llapack -llapacke -lrefblas -lcblas -ltmglib`

But I still get the following error:

```
/path/lib/liblapacke.a(lapacke_dgetrf_work.o): In function `LAPACKE_dgetrf_work':
lapacke_dgetrf_work.c:(.text+0xd8): undefined reference to `dgetrf_'
lapacke_dgetrf_work.c:(.text+0x160): undefined reference to `dgetrf_'
/path/lib/liblapacke.a(lapacke_dgetri_work.o): In function `LAPACKE_dgetri_work':
lapacke_dgetri_work.c:(.text+0xe4): undefined reference to `dgetri_'
lapacke_dgetri_work.c:(.text+0x174): undefined reference to `dgetri_'
collect2: error: ld returned 1 exit status
make: *** [Test_matrice] Erreur 1
```

**Any ideas on how to solve this?**

Many thanks!
Regards,
Laurent.