Inconsistent MEX file Output using Armadillo Interpolation
I am trying to convert MATLAB code to a C++ MEX file in order to run a few computations more efficiently. I am using the Armadillo library with BLAS and LAPACK for a few matrix operations, which involve interpolating data to apply a delay.
However, I am receiving inconsistent output from my MEX file. If I run the same MEX file with the same input, sometimes I receive the correct output, and occasionally it will output a HUGE number (i.e. instead of being on the order of 100, it is on the order of 10^246).
I am very new to C++ coding and have exhausted my general knowledge base. I believe the problem is in my interpolation step, because I am able to consistently output the correct delay matrix, which is the preceding step.
Does anyone have any idea what I am doing to produce this?
In Matlab I call:
mex test.cpp -lblas -llapack
[outData] = test( squeeze(inData(:,:,ang,:)) , params, angles(ang),1);
My mex file is generally:
#include <math.h>
#include <mex.h>
#include <armadillo>
#include "armaMex.hpp"
using namespace std;  // avoid having to scope with std:: before commands
using namespace arma; // avoid having to scope with arma:: before commands
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]){
    // ============== INITIALIZE =============
    // Initialize Data
    const mwSize *dims;
    int cDim,dDim,aDim,numDims; // Dimension variables
    int m, n, a; // Loop variables
    mxArray *fs_p, *f0_p, *prf_p, *pval_p, *c_p; // Parameter pointers
    const double *fs,*f0,*prf,*pval, *c, *ang; // Parameter variables
    const int *nthreads;
    // Initialize pointers for param variables
    pval_p = mxGetField(prhs[1],0,"pval"); //note that your parameters need these exact names
    fs_p = mxGetField(prhs[1],0,"fs");
    f0_p = mxGetField(prhs[1],0,"f0");
    prf_p = mxGetField(prhs[1],0,"prf");
    c_p = mxGetField(prhs[1],0,"c");
    // Initialize parameters
    pval = mxGetPr(pval_p);
    fs = mxGetPr(fs_p);
    f0 = mxGetPr(pval_p);
    prf = mxGetPr(prf_p);
    c = mxGetPr(c_p);
    ang = (double*)mxGetData(prhs[2]);
    nthreads = (int*)mxGetData(prhs[3]);
    dims = mxGetDimensions(prhs[0]);
    numDims = (int)mxGetNumberOfDimensions(prhs[0]);
    dDim=(int)dims[0];cDim=(int)dims[1];aDim=(int)dims[2];
    //Read in channel Data
    cube data_in = armaGetCubePr(prhs[0]);
    (....... simple calculations that look okay ... )
    cube data_out(dDim, bDim, aDim);
    cube delayedData(dDim, aDim, bDim);
    vec delayArray(dDim); //need to define these tmp variables bc subcube fcn otherwise gives me errors idk
    vec tmpIN(dDim);
    vec tmpOut(dDim);
    vec tmpOUTdata(dDim);
    for(m=0;m<bDim;m++){
        for(n=0;n<cDim;n++){
            for (a=0;a<aDim;a++){
                delayArray = tdelays.subcube(0,n,m,dDim-1,n,m);
                tmpIN = data_in.subcube(0,n,a,dDim-1,n,a);
                tmpOUTdata = data_out.subcube(0,m,a,dDim-1,m,a);
                interp1(timeArray, tmpIN , delayArray, tmpOut, "linear",0);
                data_out.subcube(0,m,a,dDim-1,m,a) = tmpOUTdata +tmpOut;
            }
        }
    }
    // Define output data
    plhs[0] = armaCreateMxMatrix(data_out.n_rows, data_out.n_cols, data_out.n_slices);
    armaSetCubePr(plhs[0], data_out);
    return;
}
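One thing that may be worth checking (an assumption, not a confirmed diagnosis): the loop reads data_out before anything has been written to it, and depending on the Armadillo version, cube data_out(dDim, bDim, aDim) may leave its elements uninitialized. Garbage in the accumulator would be consistent with occasional huge values. In Armadillo, calling data_out.zeros() right after construction should rule this out. The sketch below shows the same interpolate-then-accumulate pattern with plain std::vector; the helper names interp_linear and accumulate are hypothetical and are not part of Armadillo.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: linear interpolation of samples (x, y) at query points xq.
// x must be strictly increasing; queries outside [x.front(), x.back()] get `fill`.
std::vector<double> interp_linear(const std::vector<double>& x,
                                  const std::vector<double>& y,
                                  const std::vector<double>& xq,
                                  double fill) {
    assert(x.size() == y.size() && x.size() >= 2);
    std::vector<double> out(xq.size(), fill);
    for (std::size_t i = 0; i < xq.size(); ++i) {
        const double q = xq[i];
        if (q < x.front() || q > x.back()) continue;   // keep the fill value
        std::size_t k = 0;
        while (k + 2 < x.size() && x[k + 1] < q) ++k;  // find the bracketing interval
        const double t = (q - x[k]) / (x[k + 1] - x[k]);
        out[i] = y[k] + t * (y[k + 1] - y[k]);
    }
    return out;
}

// Add one interpolated trace to a running sum. The sum must start from a
// known value (zeros here); an uninitialized buffer would carry garbage along.
void accumulate(std::vector<double>& sum, const std::vector<double>& trace) {
    assert(sum.size() == trace.size());
    for (std::size_t i = 0; i < sum.size(); ++i) sum[i] += trace[i];
}
```

The accumulator here is created as std::vector<double> sum(n, 0.0), so every run starts from the same state regardless of what the memory held before.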
See also questions close to this topic

C++ Linked List, bad_alloc
My teacher gave us the main.cpp (mostly just data to test the program) and a header file for a linked list. I'm pretty sure I have most of it okay, but I'm having issues turning the list into a vector with the method/parameters he gave us:
vector<string> LinkedList::toVector(bool sorted) const {
    vector<string> result;
    Node *current = mFront->mNext;
    while (current != mRear) {
        result.push_back(string(current->mData));
        current = current->mNext;
    }
    if (sorted) {
        sort(result.begin(), result.end());
    }
    return result;
}
I keep getting an error saying that there is bad memory allocation on the push_back line. Any advice?
Edit: I didn't put enough of the code in
The Node class was created in the private section of linkedList.h; the code for that is:
private:
    class Node {
        friend class LinkedList;
    public:
        inline Node *getNext() const { return mNext; };
        inline string getData() { return mData; }
    private:
        inline Node(string data) { mPrevious = mNext = 0; mData = data; }
        string mData;
        Node *mPrevious;
        Node *mNext;
    };
I attempted debugging and that was actually how I found out exactly what line the exception was being thrown from. I'm not sure how to test whether it is push_back that is the issue or whether the string is throwing, as someone suggested.
current->mData should be giving back a string as long as I'm looking at it right; that's what the main.cpp is using to populate everything.
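A guess at one common cause, offered as an assumption since the rest of the class is not shown: if the chain of mNext pointers never reaches mRear (for example because the sentinels were never linked, or an insert left a pointer unset), the while loop keeps pushing strings until memory runs out, and the exception then surfaces on the push_back line. The minimal sketch below uses the member names from the question; the constructor, destructor and pushBack() are hypothetical and written only to show a list where the walk is guaranteed to terminate.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Minimal list with two sentinel nodes, mFront and mRear (names from the question).
// The constructor, destructor and pushBack() are assumptions made for this sketch.
class LinkedList {
    struct Node {
        std::string mData;
        Node* mPrevious = nullptr;
        Node* mNext = nullptr;
    };
    Node* mFront;
    Node* mRear;

public:
    LinkedList() : mFront(new Node), mRear(new Node) {
        mFront->mNext = mRear;      // an empty list links the sentinels directly
        mRear->mPrevious = mFront;
    }
    LinkedList(const LinkedList&) = delete;
    LinkedList& operator=(const LinkedList&) = delete;
    ~LinkedList() {
        for (Node* n = mFront; n != nullptr;) {
            Node* next = n->mNext;
            delete n;
            n = next;
        }
    }
    void pushBack(const std::string& data) {
        assert(mRear->mPrevious != nullptr);  // sentinels must already be linked
        Node* node = new Node;
        node->mData = data;
        node->mPrevious = mRear->mPrevious;
        node->mNext = mRear;
        mRear->mPrevious->mNext = node;
        mRear->mPrevious = node;
    }
    std::vector<std::string> toVector(bool sorted) const {
        std::vector<std::string> result;
        // Terminates only because every mNext chain ends at mRear.
        for (Node* cur = mFront->mNext; cur != mRear; cur = cur->mNext) {
            result.push_back(cur->mData);
        }
        if (sorted) std::sort(result.begin(), result.end());
        return result;
    }
};
```

Stepping through the real list in a debugger and checking whether current ever equals mRear would confirm or rule out this guess.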

LAPACK matrix inversion: unable to program
I am developing embedded software on an ARM Cortex-A53 (armv7/v8) for image processing, and I need some help with a specific algorithm.
The latter involves inversion of large complex (real/imaginary parts) matrices (up to 24x24). The most significant constraint is execution time.
I am currently looking for a library, optimized for ARM (using NEON SIMD?), which offers this kind of operation (inversion of large matrices with complex numbers).
I did try the Eigen library: it works fine, but it is too slow. Our processor runs at 1 GHz, and I got an average duration of 0.82 ms for a complex double matrix inversion of size 24x24.
I also had a look at:
- Ne10 library: it does not offer matrix inversion with complex numbers (no templates are used) or large sizes (the maximum size is 4x4).
- ARM Compute Library: I cannot find a matrix inversion function. I am not sure it is implemented.
- ARM Performance Libraries: I am still looking for the matrix inversion; does anyone know which function/class I should have a look at?
Could you please suggest any C++ libraries (templated, so they can be used with complex numbers) that are optimized for ARM?
Thanking you in advance!
Laurent.
UPDATE
I am currently testing the C LAPACK (= LAPACKE) library to perform a matrix inversion.
I have a problem linking my code. Here is the C++ code (a very simple example taken from the internet):
#include "test_lapacke.hh"
extern "C" {
#include <lapacke.h>
}

lapack_int matInv(double *A, unsigned n) {
    int ipiv[n+1];
    lapack_int ret;
    ret = LAPACKE_dgetrf(LAPACK_COL_MAJOR, n, n, A, n, ipiv);
    if (ret !=0) return ret;
    ret = LAPACKE_dgetri(LAPACK_COL_MAJOR, n, A, n, ipiv);
    return ret;
}

/**********************************************************************************************************************/
void Test_lapacke() {
    double A[] = {
        0.378589, 0.971711, 0.016087, 0.037668, 0.312398,
        0.756377, 0.345708, 0.922947, 0.846671, 0.856103,
        0.732510, 0.108942, 0.476969, 0.398254, 0.507045,
        0.162608, 0.227770, 0.533074, 0.807075, 0.180335,
        0.517006, 0.315992, 0.914848, 0.460825, 0.731980
    };
    for (int i=0; i<25; i++) {
        if ((i%5) == 0) putchar('\n');
        printf("%+12.8f ",A[i]);
    }
    putchar('\n');
    matInv(A,5);
    for (int i=0; i<25; i++) {
        if ((i%5) == 0) putchar('\n');
        printf("%+12.8f ",A[i]);
    }
    putchar('\n');
}
I added all the BLAS/LAPACK libraries to the C++ linker settings of my project:
-llapack -llapacke -lrefblas -lcblas -ltmglib
But I still get the error:
/path/lib/liblapacke.a(lapacke_dgetrf_work.o): In function `LAPACKE_dgetrf_work':
lapacke_dgetrf_work.c:(.text+0xd8): undefined reference to `dgetrf_'
lapacke_dgetrf_work.c:(.text+0x160): undefined reference to `dgetrf_'
/path/lib/liblapacke.a(lapacke_dgetri_work.o): In function `LAPACKE_dgetri_work':
lapacke_dgetri_work.c:(.text+0xe4): undefined reference to `dgetri_'
lapacke_dgetri_work.c:(.text+0x174): undefined reference to `dgetri_'
collect2: error: ld returned 1 exit status
make: *** [Test_matrice] Erreur 1
Any ideas to solve this?
Many thanks! Regards, Laurent.

Understanding delete in C++ with
I have the following code snippet from a C++ book I am reading.
int* operator=(const int& rhs, int *x) {
    int *tmpx=x;     //line 1
    x = new int(2);  //line 2
    delete tmpx;     //line 3
    return x;        //line 4
}
My doubt is this: if I delete tmpx on line 3, which holds the address of the memory location that x points to, and deleting invalidates that memory address, wouldn't it be wrong to return x, which points to the memory that was freed on line 3?
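A small sketch of the same four steps as an ordinary function may help to see which allocation is freed. The function name replace_value is hypothetical; the body follows the snippet line by line. After line 2, x no longer holds the address stored in tmpx, so line 3 frees only the old allocation and line 4 returns a pointer that is still valid.

```cpp
#include <cassert>

// Same steps as the book's snippet, written as an ordinary function
// (the name replace_value is hypothetical).
int* replace_value(int* x) {
    assert(x != nullptr);
    int* tmpx = x;      // line 1: tmpx remembers the OLD allocation
    x = new int(2);     // line 2: x now points to a NEW allocation
    delete tmpx;        // line 3: frees the old allocation only
    return x;           // line 4: returns the new, still-valid pointer
}
```

Note that the caller's original pointer is the one left dangling: after int* q = replace_value(p), only q may be dereferenced, and q must eventually be deleted by the caller.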

Set the depth of image in Matlab
How can I set the depth of an image to 1 Byte? I import an image with the help of the Matlab Imaging toolbox the following way:
UT = imread('ut.jpg');

Is there any demonstration of standard ICA packages to solve 'cocktail party' problem?
Though ICA (independent component analysis) is developed under the umbrella of BSS (blind source separation) and 'cocktail party' problem is often introduced, why is there no good example of separating mixed audio voices separated using ICA in R or matlab? I am looking for a tutorial dataset that uses mixed audio recording or images to demonstrate how ICA works. If it is implemented using standard packages (fastICA, ica, JADE, etc.) in R or matlab that would be great.
Thanks a lot in advance.

Convert excel to array in Laravel
I'm using matlab excel plugin for convert excel to array. My code is:
public static function ImportExcel($table="", $path=""){
    $data = array();
    Excel::load($path, function ($reader) use($data) {
        $data = $reader->toArray(); //here data has all excel data in array.
    });
    return $data; //but here it shows empty array.
}
Check my comments in code. Inside the Excel::load data has array of all data. But its scope is only inside the Excel::load. I need it outside.

Constant sample based background subtraction and interpolation
I need your expertise for a problem. I have the following matrix with different columns.
Time         subsamp  raw  filt_BG  filter
230.5382060  1        1    1        0
230.5382176  2        1    1        0
230.5382292  1        4    0        1
230.5382407  2        4    0        1
230.5382523  1        3    0        1
230.5382639  2        3    0        1
230.5382755  1        2    1        0
230.5382870  2        2    1        0
230.5382986  1        2    0        1
230.5383102  2        2    0        1
230.5383218  1        1    0        1
230.5383333  2        1    0        1
230.5383449  1        3    1        0
230.5383565  2        3    1        0
It contains two different types of raw data (column 3). Each type of raw data has the same number of subsamples (column 2). The 'filt_BG' and 'filter' columns separate the two types of data. I will try to explain the problem and the required solution with the following diagram. In the diagram (left side), the raw data is plotted as a function of time. The colors represent the different windows (subsamples) 'sig' and 'BG', based on 'filter' and 'filt_BG', respectively.
I want to interpolate from window 'BG1' to window 'BG2' (subsample to subsample), so that the interpolated data can be subtracted (subsample to subsample) from 'sig1' and 'sig2'. Similarly, I want to interpolate from 'BG2' to 'BG3' and subtract the result from 'sig3' and 'sig4'. This gives me 'solution1' (upper right). If 'BG3' does not exist, I want to use only 'BG2' for the subtraction from 'sig3' and 'sig4' ('solution2', lower right).
Please keep in mind that I will have millions of data points, not data as simple as described here. Also, I do not have any toolboxes, only base MATLAB, although a toolbox-based solution is also fine. Thank you for your ideas/solutions. The solution matrix would look something like this:
Time         subsamp  solution1  solution2
230.5382060  1        NaN        NaN
230.5382176  2        NaN        NaN
230.5382292  1        2.5        2.5
230.5382407  2        2.5        2.5
230.5382523  1        1.5        1.5
230.5382639  2        1.5        1.5
230.5382755  1        NaN        NaN
230.5382870  2        NaN        NaN
230.5382986  1        0.5        0
230.5383102  2        0.5        0
230.5383218  1        1.5        1
230.5383333  2        1.5        1
230.5383449  1        NaN        NaN
230.5383565  2        NaN        NaN
Best Regards

Errors in bilinear interpolation using scipy LinearNDInterpolator
I am trying to do simple bilinear interpolation of a 2D dataset, but the results are unexpected. When I plot my dataset as a contour it looks fine, but when I generate a LinearNDInterpolator and sample on a denser mesh, artifacts in the interpolation become obvious. The contour plot itself is, of course, interpolating between the actual data points given, and all I want is the same result that is displayed on the contour plot, but I can't seem to get it with LinearNDInterpolator or interp2d. In fact interp2d results in a nearly nonsense function, even for kind='linear'. I think I must be missing something fundamental here, but I can't understand why I am having so much trouble doing simple linear interpolation between points. Here is an example of what I am doing; hopefully I am making some obvious mistake.
import numpy as np
from scipy.interpolate import LinearNDInterpolator
import matplotlib.pyplot as plt

x = np.linspace(0, 10, num=10)
y = np.linspace(0, 10, num=10)
z = np.array([[ 76.075  , 78.125  , 80.15   , 81.46875, 82.53125, 83.69375, 84.25   , 84.59375, 85.275  , 85.9375 ],
              [ 72.4625 , 74.95625, 76.45   , 77.98125, 79.36875, 80.1875 , 81.09375, 81.94375, 82.56875, 83.04375],
              [ 65.20625, 70.425  , 72.45625, 73.9625 , 75.74375, 76.7    , 77.90625, 78.4    , 79.4875 , 80.00625],
              [ 0.      , 65.4875 , 67.66875, 69.93125, 71.70625, 72.7    , 73.98125, 74.95625, 75.6    , 76.6875 ],
              [ 0.      , 0.      , 55.6875 , 64.8125 , 66.16875, 67.91875, 69.35   , 70.3    , 71.64375, 72.00625],
              [ 0.      , 0.      , 0.      , 42.6625 , 59.5625 , 61.15   , 62.10625, 64.19375, 65.1125 , 66.2    ],
              [ 0.      , 0.      , 0.      , 0.      , 28.49375, 53.14375, 54.3875 , 55.89375, 57.1    , 58.275  ],
              [ 0.      , 0.      , 0.      , 0.      , 0.      , 11.84375, 44.5125 , 46.91875, 47.975  , 48.66875],
              [ 0.      , 0.      , 0.      , 0.      , 0.      , 0.      , 3.16875 , 32.50625, 36.96875, 37.7625 ],
              [ 0.      , 0.      , 0.      , 0.      , 0.      , 0.      , 0.      , 0.175   , 17.8125 , 26.6875 ]])
plt.contour(x,y,z,20)
X, Y = np.meshgrid(x,y)
pts = list(zip(X.flatten(),Y.flatten()))
interp = LinearNDInterpolator(pts,z.flatten())
x2 = np.linspace(0, 10, num=100)
y2 = np.linspace(0, 10, num=100)
X2, Y2 = np.meshgrid(x2, y2)
Z2= interp(X2, Y2)
plt.contour(X2,Y2,Z2, 20, linestyles='dashed')
plt.show()
Solid lines are data, dotted are interpolation

Retrieve index of nearest surfacepoints returned from CGAL's surface_neighbor_coordinates_3
I (relatively new to CGAL and not a C++ expert) am trying to extract the index of the nearest-neighbor 3D points returned from CGAL's surface_neighbor_coordinates_3 (which searches a 2D mesh comprised of 3D points to find natural neighbors of a provided query point) in this CGAL example. In other examples (3D interpolation with 3D meshes), I have been able to do this by adding info to vertex handles in the triangulation data structure. In the linked example, I simply wish to retrieve the indices of the returned coords with respect to the input list of points.
The other call options for surface_neighbor_coordinates_3 seem to suggest this may be possible by passing in an existing triangulation (with perhaps its info-augmented triangulation data structure). However, I'm not sure how to specify the info-augmented Delaunay_triangulation_3 for the case of a 2D mesh consisting of 3D points. I'm experimenting with it (using advancing-front triangulations to 2D-mesh my 3D points) but would like to know if there's some easier way to use the native capabilities of surface_neighbor_coordinates_3 if one only seeks to also have an info field associated with the returned points.
Any help would be greatly appreciated ... this has stumped me for a week.

Mex Matlab: mxArray * unknown size when trying to increment the value
I am trying to implement a MATLAB function. I think my logic is really close, but I always run into an error that I can't seem to figure out.
What I am trying to do:
I am trying to implement a MATLAB function that returns the number of elements that are greater than a certain value (10 in this case).
#include "mex.h"
void mexFunction(int nargout, mxArray *varargout[], int nargin, const mxArray* varargin[]) {
    varargout[0] = 0;
    size_t N = mxGetNumberOfElements(varargin[0]);
    double *p;
    p = (double*)mxGetData(varargin[0]);
    for (size_t i = 0; i < N; i++) {
        if (*p >= 10) {
            //*varargout[0] ++; <-- Error
        }
        p++;
    }
}
The error was "mxArray * unknown size". The line that gives the error was *varargout[0] ++.
I am trying to set varargout[0] to 0 so it becomes the first output. Then whenever I find a value that fits a certain condition, I will increment it by 1.
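For what it's worth, varargout[0] is a pointer to an mxArray, which is an opaque type, so it cannot be dereferenced and incremented like a number. A possible approach is to count into an ordinary integer and create the output once at the end. The sketch below is plain C++ so that it can be compiled without MATLAB; the helper name count_at_least is hypothetical. Inside mexFunction, the result could then be returned with mxCreateDoubleScalar, as noted in the comment.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: count the elements of data[0..n) that are >= threshold.
// In a MEX file, data would come from mxGetData/mxGetPr and n from
// mxGetNumberOfElements, and the result could be returned with
//   varargout[0] = mxCreateDoubleScalar(static_cast<double>(count));
std::size_t count_at_least(const double* data, std::size_t n, double threshold) {
    assert(data != nullptr || n == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (data[i] >= threshold) ++count;
    }
    return count;
}
```

The comparison uses >= to match the posted code; for "strictly greater than 10", as the description says, it would be > instead.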
mex matlab function c++ gives wrong answer relative to R c++ function
I've got an issue: I cannot get the correct answer after mex'ing a C++ function. It always gives zeros, no matter what.
code:
https://github.com/jstriaukas/codequestions/tree/master/code
main.m gives the answer in matlab. linSolver.cpp contains mexFunction that I think I made a mistake in but cannot find it. Any ideas?

How can I specify a minimum compute capability to the mexcuda compiler to compile a mexfunction?
I have a CUDA project in a .cu file that I would like to compile to a .mex file using mexcuda. Because my code makes use of the 64-bit floating point atomic operation atomicAdd(double *, double), which is only supported for GPU devices of compute capability 6.0 or higher, I need to specify this as a flag when I am compiling.
In my standard IDE, this works fine, but when compiling with mexcuda, this is not working as I would like. In this post on MathWorks, it was suggested to use the following command (edited from the comment by Joss Knight):
mexcuda('-v', 'mexGPUExample.cu', 'NVCCFLAGS=-gencode=arch=compute_60,code=sm_60')
but when I use this command on my file, the verbose option spits out the following line last:
Building with 'NVIDIA CUDA Compiler'.
nvcc -c --compiler-options=/Zp8,/GR,/W3,/EHs,/nologo,/MD -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=\"sm_70,compute_70\"
(and so on), which signals to me that the specified flag was not passed to nvcc properly. And indeed, compilation fails with the following error:
C:/path/mexGPUExample.cu(35): error: no instance of overloaded function "atomicAdd" matches the argument list. Argument types are: (double *, double)
The only other post I could find on this topic was this post on SO, but it is almost three years old and seemed to me more like a workaround (one which I do not understand even after some research, otherwise I would have tried it) rather than a true solution to the problem.
Is there a setting I missed, or can this simply not be done without a workaround?

Extract every kth column of arma::mat matrix in rcpp
I was struggling with subsetting columns of a matrix of class arma::mat.
Let's say arma::mat X is given, and I tried to create a vector of indices IDX in order to do X.cols(IDX). Specifically, the index vector has every kth integer from 1 to p (the dimension of X). For example, one may be interested in every even column, IDX=[2,4,6,8, ...].
Based on this documentation, contiguous indices such as [0, 1, 2, ..., m-1] can be extracted easily using X.cols(0, m - 1) if m <= p. However, I couldn't find a good way to subset a matrix with the index vector IDX described above.
I wonder how I can complete this code to give the desired output.
My "subset_armamat.cpp" file looks like
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
using namespace Rcpp;
using namespace arma;

// [[Rcpp::export]]
mat subset_armamat(mat X, int k){
    uvec IDX = "every kth integer from 0 to X.ncols";
    return X.cols(IDX);
}
and R code to execute the defined function is
library("Rcpp")
sourceCpp("subset_armamat.cpp")
subset_armamat(matrix(1:10, 2, 5, byrow = T), 2)
This is expected to produce a 2-by-3 matrix, as the following R code would give:
> matrix(1:10, 2, 5, byrow = T)[,seq(1, 5, by = 2)]
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    6    8   10
Any input would be much appreciated.
p.s. I've tried to
- generate a sequence vector seq(1,m) * 2 manually, but this does not work with X.cols().
- or find an index using find(seq(1,p) % 2 == 0), but the % operator does not work well between seq(1,p) and 2.
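As a sketch of the index sequence being asked for: the zero-based column indices 0, k, 2k, ... can be generated with a simple loop. The helper name every_kth_index is hypothetical and the code is plain C++ so it compiles without Armadillo. To my knowledge, Armadillo's regspace<uvec>(0, k, X.n_cols - 1) produces the same sequence as a uvec that X.cols() accepts, but please check that against the Armadillo documentation for your version.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: zero-based indices 0, k, 2k, ... that are below n_cols.
// Assumed Armadillo counterpart: regspace<uvec>(0, k, n_cols - 1).
std::vector<std::size_t> every_kth_index(std::size_t n_cols, std::size_t k) {
    assert(k > 0);
    std::vector<std::size_t> idx;
    for (std::size_t j = 0; j < n_cols; j += k) idx.push_back(j);
    return idx;
}
```

For the 2-by-5 example with k = 2, this gives the zero-based indices 0, 2, 4, which correspond to R's one-based columns 1, 3, 5.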

The import libraries problem "Recognised but unhandled machine type (0x8664) in Import Library Format archive"
I tried to implement my work using NetBeans, Armadillo and their libraries. When the code was built, the compiler gave the message below:
mkdir -p dist/Debug/GNU-Linux
g++ -o dist/Debug/GNU-Linux/boussinesqv2 build/Debug/GNU-Linux/main.o -L. blas_win64_MT.dll lapack_win64_MT.dll blas_win64_MT.a lapack_win64_MT.a
/usr/bin/ld: blas_win64_MT.a(blas_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: blas_win64_MT.a(blas_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: lapack_win64_MT.a(lapack_win64_MT.dll): Recognised but unhandled machine type (0x8664) in Import Library Format archive
/usr/bin/ld: dist/Debug/GNU-Linux/boussinesqv2: Unable to sort relocs - they are of an unknown size
make[2]: Leaving directory '/home/nutthavuth/NetBeansProjects/BoussinesqV2'
make[1]: Leaving directory '/home/nutthavuth/NetBeansProjects/BoussinesqV2'
BUILD SUCCESSFUL (total time: 2s)
After that, I decided to run the code. A segfault occurred:
RUN FINISHED; Segmentation fault; core dumped; real time: 4s; user: 80ms; system: 4s
The following picture may help.
Do you have any idea how to solve this problem?

How to convert std::vector<std::vector<float>> to arma::mat<float>?
I am trying to convert a vector of vectors of float to an Armadillo Mat of float in the most efficient way possible.
I am a beginner, so pardon me if there are any mistakes in the code.
This is my implementation.
template<typename T>
arma::Mat<T> vector_to_arma_mat(std::vector<std::vector<T>> &vec) {
    arma::Mat<T> mat(vec.size(),vec[0].size());
    int row=0;
    for (std::size_t i=0; i<vec.size(); ++i) {
        auto &colvec = vec[i];
        std::copy(colvec.begin(),colvec.end(),mat.begin_row(row));
        ++row;
    }
    return std::move(mat);
}
Is there any efficient way to construct directly from vector of vector to armadillo Matrix format?
Is there any better way to copy data other than above implementation?
Is there any way to avoid the for loop call just by taking vector of vector of float iterators?
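One relevant detail is that Armadillo stores matrices in column-major order, while a vector of row vectors is laid out row by row in separately allocated blocks, so some element-wise copy seems hard to avoid. The sketch below, in plain C++ with a hypothetical helper name to_column_major, flattens the nested vectors into one contiguous column-major buffer. As far as I know, such a buffer could then be handed to the arma::Mat<T>(ptr, n_rows, n_cols) constructor, but that step is an assumption and is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: flatten rows[r][c] into a column-major buffer,
// i.e. element (r, c) lands at index c * n_rows + r.
template <typename T>
std::vector<T> to_column_major(const std::vector<std::vector<T>>& rows) {
    const std::size_t n_rows = rows.size();
    const std::size_t n_cols = n_rows ? rows[0].size() : 0;
    std::vector<T> buf(n_rows * n_cols);
    for (std::size_t r = 0; r < n_rows; ++r) {
        assert(rows[r].size() == n_cols);  // all rows must have equal length
        for (std::size_t c = 0; c < n_cols; ++c) {
            buf[c * n_rows + r] = rows[r][c];
        }
    }
    return buf;
}
```

For the rows {1, 2, 3} and {4, 5, 6}, the buffer comes out as 1, 4, 2, 5, 3, 6, which is how a 2-by-3 column-major matrix is laid out in memory.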