# Efficiently accessing columns of a matrix in C

I have an `Nx`-by-`Ny` matrix `U` stored as a one-dimensional array of length `Nx*Ny`. In the application, each entry is the value of the solution of some differential equation at the grid point `(x_i, y_j)`, although I don't think that's important.

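I am not very proficient in C, but I know that it is row-major, so to avoid too many cache misses it is better to put the row index `i` in the outer loop and the column index `j` in the inner loop, so that consecutive iterations touch adjacent memory: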

```c
#define U(i,j) U[(j) + Ny*(i)]   /* row-major: row i, column j */

for (int i = 0; i < Nx; ++i)
    for (int j = 0; j < Ny; ++j)
        U(i,j) = i*j; // example operation
```
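To make the layout concrete: with this indexing, entry `(i,j)` sits at flat offset `j + Ny*i`, so stepping along a row moves by one element and stepping down a column moves by `Ny` elements. A minimal sketch, assuming the entries are `double` (the element type is not stated above) and using the made-up helper names `idx` and `fill_example`:

```c
#include <assert.h>
#include <stddef.h>

/* Flat offset of entry (i, j) in a row-major array with Ny columns.
   Hypothetical helper, equivalent to the U(i,j) macro's index. */
static size_t idx(size_t i, size_t j, size_t Ny)
{
    assert(j < Ny);            /* column index must stay inside the row */
    return j + Ny * i;
}

/* Fill U with the example values i*j, walking memory contiguously.
   Element type double is an assumption. */
static void fill_example(double *U, size_t Nx, size_t Ny)
{
    for (size_t i = 0; i < Nx; ++i) {
        double *row = U + Ny * i;          /* start of row i */
        for (size_t j = 0; j < Ny; ++j)
            row[j] = (double)(i * j);      /* consecutive j: stride 1 */
    }
}
```

Here `idx(i+1, j, Ny) - idx(i, j, Ny)` is `Ny`, while `idx(i, j+1, Ny) - idx(i, j, Ny)` is `1`, which is why the inner loop runs over `j`.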

My algorithm requires me to do two different types of operations:

1. For row `i` of `U`, do some computation that outputs row `i` of another array `F`
2. For column `j` of `U`, do some computation that outputs column `j` of another array `G`

where `F` and `G` have the same length and "shape" as `U`. The goal is a computational step like this:

```c
#define U(i,j) U[(j) + Ny*(i)]
#define F(i,j) F[(j) + Ny*(i)]
#define G(i,j) G[(j) + Ny*(i)]

for (int i = 0; i < Nx; ++i) {
    /* use U(i,:) to compute F(i,:); the : is just pseudocode shorthand
       for an entire row or column */
}

for (int j = 0; j < Ny; ++j) {
    /* use U(:,j) to compute G(:,j) */
}

for (int i = 0; i < Nx; ++i)
    for (int j = 0; j < Ny; ++j)
        U(i,j) += F(i,j) + G(i,j); // example computation
```

I am struggling a bit to see how to do this computation efficiently. The steps that operate on rows of `U` seem fine, but the operations on columns of `U` will be quite slow, because consecutive entries of a column are `Ny` elements apart in memory. Writing values into `G` column by column will be slow for the same reason.
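For illustration, here is the slow pattern next to a reordered version. This is only a sketch under a strong assumption: that each `G(i,j)` depends on a few neighbouring entries in the same column, here a second difference in `i`. The actual column computation is not specified above, and the function names are hypothetical. If that assumption holds, the loops can be interchanged so the inner loop still runs over `j`:

```c
#include <assert.h>
#include <stddef.h>

/* Column-at-a-time: for each column j, walk down the rows.
   Every inner-loop step jumps Ny elements (the slow pattern).
   Rows 0 and Nx-1 of G are left untouched. */
static void col_op_strided(const double *U, double *G, size_t Nx, size_t Ny)
{
    assert(U != G);   /* output must not alias the input */
    for (size_t j = 0; j < Ny; ++j)
        for (size_t i = 1; i + 1 < Nx; ++i)
            G[j + Ny*i] = U[j + Ny*(i-1)] - 2.0*U[j + Ny*i] + U[j + Ny*(i+1)];
}

/* Same arithmetic with the loops interchanged: the inner loop now
   runs over j, so U and G are read and written with stride 1. */
static void col_op_interchanged(const double *U, double *G, size_t Nx, size_t Ny)
{
    assert(U != G);   /* output must not alias the input */
    for (size_t i = 1; i + 1 < Nx; ++i)
        for (size_t j = 0; j < Ny; ++j)
            G[j + Ny*i] = U[j + Ny*(i-1)] - 2.0*U[j + Ny*i] + U[j + Ny*(i+1)];
}
```

Both functions produce the same `G`. The interchange does not carry over directly to column computations with a sequential dependence down the column (for example a tridiagonal solve), although in that case one can often still advance all columns together, one row at a time.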

One method I thought of is to store both `U` and its transpose `UT`, so that operations on columns of `U` can be done on rows of `UT`. But I have to do the computational steps many thousands of times, and it seems like explicitly computing a transpose would be even slower. Likewise, I could assemble the transpose of `G` so that I only ever write values in row-major order, but then the step `U(i,j) += F(i,j) + G(j,i)` has to read `G` column-wise.
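For reference, an explicit out-of-place transpose is commonly written with tiling, so that both the reads and the writes stay inside a small block that fits in cache. A minimal sketch, assuming `double` entries; the function name `transpose_blocked` and the tile size `TILE = 32` are illustrative choices, not tuned values:

```c
#include <assert.h>
#include <stddef.h>

enum { TILE = 32 };   /* tile edge length: an assumed value, tune per machine */

/* Write the transpose of the Nx-by-Ny row-major array U into UT,
   which is Ny-by-Nx row-major, so that UT(j,i) = U(i,j). */
static void transpose_blocked(const double *U, double *UT, size_t Nx, size_t Ny)
{
    assert(U != UT);   /* out-of-place only */
    for (size_t ib = 0; ib < Nx; ib += TILE) {
        size_t imax = (ib + TILE < Nx) ? ib + TILE : Nx;
        for (size_t jb = 0; jb < Ny; jb += TILE) {
            size_t jmax = (jb + TILE < Ny) ? jb + TILE : Ny;
            /* copy one tile; both arrays touch at most TILE rows here */
            for (size_t i = ib; i < imax; ++i)
                for (size_t j = jb; j < jmax; ++j)
                    UT[i + Nx*j] = U[j + Ny*i];
        }
    }
}
```

Whether the cost of such a transpose per step is acceptable compared with strided column access is exactly what I am unsure about; it would need to be measured for the actual `Nx`, `Ny`, and column computation.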

How should I deal with this situation in an efficient way?