Missing points in polar plot after interpolation
I have 17 measurement points, and I would like to interpolate between them within a circle. Here is what I have tried:
import numpy as np
from scipy.interpolate import griddata
import matplotlib.pyplot as plt
import math
# Measurement data
x1 = np.array([[-144.00, -101.80, -101.80, -75.00, -53.00, -53.00, 0.00, 0.00, 0.00, 0.00, 0.00, 53.00, 53.00, 75.00, 101.80, 101.80, 144.00]])
y1 = np.array([[0.00, 101.80, -101.80, 0.00, 53.00, -53.00, 144.00, 75.00, 0.00, -75.00, -144.00, 53.00, -53.00, 0.00, 101.80, -101.80, 0.00]])
z1 = np.array([148.3861807, 148.9051447, 148.0415147, 147.9976293, 147.98485, 147.9579673, 148.89261, 148.0217707, 147.9312247, 147.7952, 147.3225247, 148.2489567, 148.2120013, 148.3169953, 149.578092, 147.9356893, 148.556672])
# Convert to polar coords
r1 = np.sqrt(x1**2 + y1**2)
t1 = np.arctan2(y1, x1)
# Add cyclic points
for i in range(0, x1.shape[1]):
    if np.arctan2(y1[0, i], x1[0, i]) == math.pi:
        print(np.sqrt(x1[0, i]**2 + y1[0, i]**2), np.arctan2(y1[0, i], x1[0, i]))
        r1 = np.append(r1, [np.sqrt(x1[0, i]**2 + y1[0, i]**2)])
        t1 = np.append(t1, [-np.arctan2(y1[0, i], x1[0, i])])
        z1 = np.append(z1, [z1[i]])
# New points
r2, t2 = np.meshgrid(np.linspace(np.min(r1), np.max(r1), num=50),
                     np.linspace(np.min(t1), np.max(t1), num=50))
# Griddata function used to interpolate between scattered data
z2 = griddata(np.concatenate((np.array([t1]), np.array([r1])), axis=0).T,
              np.array([z1]).T, (t2, r2), method='linear')
# Surface plots
fig, ax = plt.subplots(figsize=(10,10), subplot_kw=dict(projection='polar'))
ax.contourf(t2, r2, np.squeeze(z2), 50, cmap='jet')
for i in range(len(t1)):
    plt.text(t1[i], r1[i], "{:.2f}".format(z1[i]), ha="center", va="center", color="k")
plt.show()
The resultant polar plot, however, has missing points in the middle. How could I get rid of them?
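For what it's worth, a hedged guess at the cause: griddata with method='linear' returns NaN for query points outside the convex hull of the samples, and here the triangulation happens in (theta, r) space, where the single r = 0 measurement exists at only one theta, so much of the small-radius region falls outside the hull and contourf leaves it blank. A sketch of one possible fix, inserted just before building r2 and t2: replicate the center measurement across many angles so the hull covers the disk down to r = 0.

# Sketch: griddata(method='linear') triangulates in (theta, r) space and
# returns NaN outside the convex hull of the samples. The r = 0 point exists
# at only one theta, so replicate it around the circle before interpolating.
center_idx = int(np.argmax(np.isclose(r1, 0)))   # index of the (0, 0) sample
theta_fill = np.linspace(-np.pi, np.pi, 25)
r1 = np.append(r1, np.zeros_like(theta_fill))
t1 = np.append(t1, theta_fill)
z1 = np.append(z1, np.full_like(theta_fill, z1[center_idx]))
# ...then build r2, t2 and call griddata exactly as before.

This is the same trick as the angle-pi duplication already in the code: any part of the (theta, r) rectangle the hull does not reach will show up as a blank region.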
See also questions close to this topic
-
Sparse Matrix Creation : KeyError: 579 for text datasets
I am trying to use the make_sparse_matrix function to create a sparse matrix for my text dataset, and I get KeyError: 579. Does anyone have any leads on the root of this error?
def make_sparse_matrix(df, indexed_words, labels):
    """
    Returns sparse matrix as dataframe.
    df: A dataframe with words in the columns with a document id as an index (X_train or X_test)
    indexed_words: index of words ordered by word id
    labels: category as a series (y_train or y_test)
    """
    nr_rows = df.shape[0]
    nr_cols = df.shape[1]
    word_set = set(indexed_words)
    dict_list = []

    for i in range(nr_rows):
        for j in range(nr_cols):
            word = df.iat[i, j]
            if word in word_set:
                doc_id = df.index[i]
                word_id = indexed_words.get_loc(word)
                category = labels.at[doc_id]

                item = {'LABEL': category, 'DOC_ID': doc_id,
                        'OCCURENCE': 1, 'WORD_ID': word_id}
                dict_list.append(item)

    return pd.DataFrame(dict_list)

make_sparse_matrix(X_train, word_index, y_test)
X_train is a DataFrame that contains a single word in each cell, word_index contains the index of all words, and y_test stores all the labels.

The KeyError I am facing is:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~\New folder\envs\geo_env\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3079             try:
-> 3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 579

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input> in <module>

<ipython-input> in make_sparse_matrix(df, indexed_words, labels)
     20                 doc_id = df.index[i]
     21                 word_id = indexed_words.get_loc(word)
---> 22                 category = labels.at[doc_id]
     23
     24                 item = {'LABEL': category, 'DOC_ID': doc_id,

~\New folder\envs\geo_env\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
   2154             return self.obj.loc[key]
   2155
-> 2156         return super().__getitem__(key)
   2157
   2158     def __setitem__(self, key, value):

~\New folder\envs\geo_env\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
   2101
   2102         key = self._convert_key(key)
-> 2103         return self.obj._get_value(*key, takeable=self._takeable)
   2104
   2105     def __setitem__(self, key, value):

~\New folder\envs\geo_env\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
    959
    960         # Similar to Index.get_value, but we do not fall back to positional
--> 961         loc = self.index.get_loc(label)
    962         return self.index._get_values_for_loc(self, loc, label)
    963

~\New folder\envs\geo_env\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3080             return self._engine.get_loc(casted_key)
   3081         except KeyError as err:
-> 3082             raise KeyError(key) from err
   3083
   3084         if tolerance is not None:

KeyError: 579
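One thing worth checking, offered only as a guess from the call shown: the function is invoked as make_sparse_matrix(X_train, word_index, y_test), so doc_id values taken from X_train's index are looked up in y_test's index, where 579 may simply not exist. A sketch of the check, and of the pairing that was probably intended (assuming the training labels live in a y_train Series):

# Sketch: every DOC_ID comes from df.index, so `labels` must cover that index.
missing = X_train.index.difference(y_test.index)
print(missing)  # if 579 appears here, the wrong label Series was passed

# probably intended: features and labels from the same split
train_matrix = make_sparse_matrix(X_train, word_index, y_train)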
-
Finding part of string in list of strings
GCM = ([519,520,521,522,533], [534,525], [526,527,530,531], [4404])

slice = int(str(df["CGM"][row_count])[:3])
I am looking through a row in a CSV file and taking out the number I want. I want the numbers that start with the numbers I have in GCM, since they represent info I want in other columns. This worked fine with the slice approach, because all the numbers I wanted started with the same 3 digits. Now that I need to look for any number that starts with 4404, and later on will probably need to look for 57052, the slice approach no longer works. Is there a way I can, instead of slicing and comparing to the list, take a 5-digit number and see if part of it is in the list, preferably matching on 3 or more leading digits? The real point of this part of the code is finding out which list inside GCM the number belongs to. It needs to take the number 44042 and know that the part I care about is in GCM[3], but on the other hand it should not say that 32519 is in GCM[0], since I only care about numbers that start with 519, not ones that end with it.

PS. I am Norwegian and have been teaching myself programming; it has been some long nights, so something here may be lost in translation.
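A sketch of one way to do this without slicing, assuming GCM stays a tuple of lists as above: compare string prefixes with str.startswith, which works even when the prefixes have different lengths.

GCM = ([519, 520, 521, 522, 533], [534, 525], [526, 527, 530, 531], [4404])

def find_group(number, groups=GCM):
    """Return the index of the sub-list containing a prefix of `number`."""
    s = str(number)
    for i, group in enumerate(groups):
        if any(s.startswith(str(prefix)) for prefix in group):
            return i
    return None  # no prefix matched

print(find_group(44042))  # 3, because 44042 starts with 4404
print(find_group(32519))  # None, because 519 only appears at the end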
-
How to forecast a time series out-of-sample using an ARIMA model in Python?
I have seen similar questions on Stack Overflow, but either those questions were different enough, or, where similar, they have not actually been answered. I gather it is something that modelers run into often and find challenging to solve.

In my case I am using two variables, one Y and one X, with 50 sequential time-series observations. They are both random numbers representing % changes (they could be anything you want; their true values do not matter, this is just to set up an example of my coding problem). Here is the basic code to build this ARIMAX(1,0,0) model.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_excel('/Users/gaetanlion/Google Drive/Python/Arima/df.xlsx', sheet_name='final')

from statsmodels.tsa.arima_model import ARIMA

endo = df['y']
exo = df['x']
Next, I build the ARIMA model, using the first 41 observations
modelho = sm.tsa.arima.ARIMA(endo.loc[0:40], exo.loc[0:40], order=(1, 0, 0)).fit()
print(modelho.summary())
So far everything works just fine.
Next, I attempt to forecast or predict the next 9 observations out-of-sample. Here I want to use the X values over these 9 observations to predict Y, and I just can't do it. I am showing below the one attempt that I think gets me closest to where I need to go.
modelho.predict(exo.loc[41:49], start = 41, end = 49, dynamic = False)

TypeError: predict() got multiple values for argument 'start'
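A hedged sketch of what may fix this, assuming the newer sm.tsa.arima.ARIMA results object used above: predict()'s first positional parameter is start, so passing exo.loc[41:49] positionally collides with the start=41 keyword; the out-of-sample regressors must go in by keyword instead.

# Sketch: pass the out-of-sample exog by keyword, not positionally.
pred = modelho.predict(start=41, end=49, exog=exo.loc[41:49], dynamic=False)

# Equivalently, forecast() steps forward from the end of the sample:
fcast = modelho.forecast(steps=9, exog=exo.loc[41:49])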
-
How to get the first Dataframe Python Pandas
I have a DataFrame named "s_copy". It contains information about ships (for example: name, speed, latitude, longitude and mmsi).
I want to group this database by mmsi (vessel identification number), at the moment I am using this code:
vectors = dict(tuple(s_copy.groupby('mmsi')))
Once the database is grouped, I want to be able to use the information for each mmsi. I have tried using indices as if it were a vector, but it doesn't work.
first_vector = vectors[0]

KeyError: 0
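A guess at what is happening, based only on the code shown: dict(tuple(s_copy.groupby('mmsi'))) produces a dict keyed by the actual mmsi values, not by positions 0, 1, 2, ..., so vectors[0] only works if some ship literally has mmsi 0. A sketch:

# Sketch: the dict is keyed by mmsi values, so pick a key, not a position.
first_mmsi = list(vectors)[0]        # first group key, in groupby order
first_vector = vectors[first_mmsi]   # sub-DataFrame for that vessel

# or skip the dict and iterate the groups directly:
for mmsi, group in s_copy.groupby('mmsi'):
    ...  # work with each vessel's rows here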
-
Ctypes function not found
I try to use ctypes to run some CUDA code in Python. After compiling and loading the .so file, I run into an error telling me that the CUDA function does not exist. I tried the same approach with an example in plain C before, and that worked. Is there something wrong with how I compile it?

The CUDA code:
#include <stdio.h>
#include <stdlib.h>

#define BLOCK_SIZE 16

struct Matrix {
    int width;
    int height;
    float *elements;
};

__global__ void MatMulKernel(Matrix A, Matrix B, Matrix C){
    // runs for each col - row pair
    float tmpVal = 0;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    for (int i = 0; i < A.width; ++i)
        tmpVal += A.elements[row * A.width + i] * B.elements[i * B.width + col];
    C.elements[row * C.width + col] = tmpVal;
}

void mMul( Matrix *A, Matrix *B, Matrix *C ){
    Matrix d_A, d_B, d_C;

    // Matrix d_A
    d_A.width = A->width;
    d_A.height = A->height;
    size_t sizeA = A->width * A->height * sizeof(float);
    // dynamically allocate cudaMemory for elements array
    cudaMalloc(&d_A.elements, sizeA);
    cudaMemcpy(d_A.elements, A->elements, sizeA, cudaMemcpyHostToDevice);

    // Matrix d_B
    d_B.width = B->width;
    d_B.height = B->height;
    size_t sizeB = B->width * B->height * sizeof(float);
    // dynamically allocate cudaMemory for elements array
    cudaMalloc(&d_B.elements, sizeB);
    cudaMemcpy(d_B.elements, B->elements, sizeB, cudaMemcpyHostToDevice);

    // Matrix d_C
    d_C.width = C->width;
    d_C.height = C->height;
    size_t sizeC = C->width * C->height * sizeof(float);
    // dynamically allocate cudaMemory for elements array
    cudaMalloc(&d_C.elements, sizeC);

    // 16 * 16 = 256 threads per block
    dim3 dimBlock(BLOCK_SIZE, BLOCK_SIZE);
    // Blocks per grid
    dim3 dimGrid(B->width / dimBlock.x, A->height / dimBlock.y);

    // calling the Kernel
    MatMulKernel<<<dimGrid, dimBlock>>>(d_A, d_B, d_C);

    // copy results from result matrix C to the host again
    cudaMemcpy(C->elements, d_C.elements, sizeC, cudaMemcpyDeviceToHost);

    // free the cuda memory
    cudaFree(d_A.elements);
    cudaFree(d_B.elements);
    cudaFree(d_C.elements);
}
Then I compile it into Sequential_Cuda_Python.so:
nvcc --shared --compiler-options '-fPIC' -o Sequential_Cuda_Python.so Sequential_Cuda_Python.cu
The Python ctypes code:
import numpy as np
from numpy.ctypeslib import ndpointer
from ctypes import *

class Matrix(Structure):
    _fields_ = [("width", c_int),
                ("height", c_int),
                ("elements", POINTER(c_float))]

libc = CDLL("./Sequential_Cuda_Python.so")
libc.mMul.argtypes = [POINTER(Matrix), POINTER(Matrix), POINTER(Matrix)]
The error; it seems the function has not been found:
Traceback (most recent call last):
  File "cuda_arr.py", line 17, in <module>
    libc.mMul.argtypes = [ POINTER(Matrix), POINTER(Matrix), POINTER(Matrix) ]
  File "/usr/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: ... /Sequential_Cuda_Python.so: undefined symbol: mMul
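A likely cause, though the post does not confirm it: nvcc compiles .cu files as C++, so mMul is exported under a mangled C++ name that ctypes cannot look up. The usual fix is to give the host entry point C linkage; a sketch against the code above:

// Sketch: wrap the host entry point in extern "C" so nvcc (a C++ compiler)
// does not mangle the symbol name that ctypes looks up at load time.
extern "C" void mMul( Matrix *A, Matrix *B, Matrix *C ){
    // ... body exactly as above ...
}

Whether the symbol is actually exported can be checked with nm -D Sequential_Cuda_Python.so | grep mMul.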
-
How to condense multiple np.array_split calls
I would like some help. I have a dataframe with 28 columns and 3M rows. I need to split out each currency; after that I will split each type of price into an array, and then I need to split those into x elements. My question is: where I have a bunch of np.array_split() calls, can I make it shorter? I would appreciate some help.
@staticmethod
def pandas_to_array_big(data, comp):
    array = np.array([
        np.transpose(np.array(data[['EURCHF_Open','EURCHF_High','EURCHF_Low','EURCHF_Close']])),
        np.transpose(np.array(data[['EURGBP_Open','EURGBP_High','EURGBP_Low','EURGBP_Close']])),
        np.transpose(np.array(data[['EURJPY_Open','EURJPY_High','EURJPY_Low','EURJPY_Close']])),
        np.transpose(np.array(data[['EURNZD_Open','EURNZD_High','EURNZD_Low','EURNZD_Close']])),
        np.transpose(np.array(data[['EURUSD_Open','EURUSD_High','EURUSD_Low','EURUSD_Close']])),
        np.transpose(np.array(data[['EURAUD_Open','EURAUD_High','EURAUD_Low','EURAUD_Close']])),
        np.transpose(np.array(data[['EURCAD_Open','EURCAD_High','EURCAD_Low','EURCAD_Close']])),
        np.transpose(np.array(data[['GBPAUD_Open','GBPAUD_High','GBPAUD_Low','EURGBP_Close']])),
        np.transpose(np.array(data[['GBPCHF_Open','GBPCHF_High','GBPCHF_Low','EURGBP_Close']])),
        np.transpose(np.array(data[['GBPJPY_Open','GBPJPY_High','GBPJPY_Low','EURGBP_Close']])),
        np.transpose(np.array(data[['GBPCAD_Open','GBPCAD_High','GBPCAD_Low','EURGBP_Close']])),
        np.transpose(np.array(data[['GBPUSD_Open','GBPUSD_High','GBPUSD_Low','EURGBP_Close']])),
        np.transpose(np.array(data[['GBPNZD_Open','GBPNZD_High','GBPNZD_Low','EURGBP_Close']])),
        np.transpose(np.array(data[['USDCHF_Open','USDCHF_High','USDCHF_Low','EURUSD_Close']])),
        np.transpose(np.array(data[['USDJPY_Open','USDJPY_High','USDJPY_Low','EURUSD_Close']])),
        np.transpose(np.array(data[['AUDUSD_Open','AUDUSD_High','AUDUSD_Low','EURAUD_Close']])),
        np.transpose(np.array(data[['NZDUSD_Open','NZDUSD_High','NZDUSD_Low','EURNZD_Close']])),
        np.transpose(np.array(data[['USDCAD_Open','USDCAD_High','USDCAD_Low','EURUSD_Close']])),
        np.transpose(np.array(data[['AUDJPY_Open','AUDJPY_High','AUDJPY_Low','EURAUD_Close']])),
        np.transpose(np.array(data[['CADJPY_Open','CADJPY_High','CADJPY_Low','EURCAD_Close']])),
        np.transpose(np.array(data[['CHFJPY_Open','CHFJPY_High','CHFJPY_Low','EURCHF_Close']])),
        np.transpose(np.array(data[['NZDJPY_Open','NZDJPY_High','NZDJPY_Low','EURNZD_Close']])),
        np.transpose(np.array(data[['AUDCHF_Open','AUDCHF_High','AUDCHF_Low','EURAUD_Close']])),
        np.transpose(np.array(data[['CADCHF_Open','CADCHF_High','CADCHF_Low','EURCAD_Close']])),
        np.transpose(np.array(data[['NZDCHF_Open','NZDCHF_High','NZDCHF_Low','EURNZD_Close']])),
        np.transpose(np.array(data[['AUDNZD_Open','AUDNZD_High','AUDNZD_Low','EURAUD_Close']])),
        np.transpose(np.array(data[['NZDCAD_Open','NZDCAD_High','NZDCAD_Low','EURNZD_Close']])),
        np.transpose(np.array(data[['AUDCAD_Open','EURAUD_High','AUDCAD_Low','AUDCAD_Close']])),
    ], dtype=np.float64)

    massive = np.array([
        [np.array_split(array[0][0], comp), np.array_split(array[0][1], comp), np.array_split(array[0][2], comp), np.array_split(array[0][3], comp)],
        [np.array_split(array[1][0], comp), np.array_split(array[1][1], comp), np.array_split(array[1][2], comp), np.array_split(array[1][3], comp)],
        [np.array_split(array[2][0], comp), np.array_split(array[2][1], comp), np.array_split(array[2][2], comp), np.array_split(array[2][3], comp)],
        [np.array_split(array[3][0], comp), np.array_split(array[3][1], comp), np.array_split(array[3][2], comp), np.array_split(array[3][3], comp)],
        [np.array_split(array[4][0], comp), np.array_split(array[4][1], comp), np.array_split(array[4][2], comp), np.array_split(array[4][3], comp)],
        [np.array_split(array[5][0], comp), np.array_split(array[5][1], comp), np.array_split(array[5][2], comp), np.array_split(array[5][3], comp)],
        [np.array_split(array[6][0], comp), np.array_split(array[6][1], comp), np.array_split(array[6][2], comp), np.array_split(array[6][3], comp)],
        [np.array_split(array[7][0], comp), np.array_split(array[7][1], comp), np.array_split(array[7][2], comp), np.array_split(array[7][3], comp)],
        [np.array_split(array[8][0], comp), np.array_split(array[8][1], comp), np.array_split(array[8][2], comp), np.array_split(array[8][3], comp)],
        [np.array_split(array[9][0], comp), np.array_split(array[9][1], comp), np.array_split(array[9][2], comp), np.array_split(array[9][3], comp)],
        [np.array_split(array[10][0], comp), np.array_split(array[10][1], comp), np.array_split(array[10][2], comp), np.array_split(array[10][3], comp)],
        [np.array_split(array[11][0], comp), np.array_split(array[11][1], comp), np.array_split(array[11][2], comp), np.array_split(array[11][3], comp)],
        [np.array_split(array[12][0], comp), np.array_split(array[12][1], comp), np.array_split(array[12][2], comp), np.array_split(array[12][3], comp)],
        [np.array_split(array[13][0], comp), np.array_split(array[13][1], comp), np.array_split(array[13][2], comp), np.array_split(array[13][3], comp)],
        [np.array_split(array[14][0], comp), np.array_split(array[14][1], comp), np.array_split(array[14][2], comp), np.array_split(array[14][3], comp)],
        [np.array_split(array[15][0], comp), np.array_split(array[15][1], comp), np.array_split(array[15][2], comp), np.array_split(array[15][3], comp)],
        [np.array_split(array[16][0], comp), np.array_split(array[16][1], comp), np.array_split(array[16][2], comp), np.array_split(array[16][3], comp)],
        [np.array_split(array[17][0], comp), np.array_split(array[17][1], comp), np.array_split(array[17][2], comp), np.array_split(array[17][3], comp)],
        [np.array_split(array[18][0], comp), np.array_split(array[18][1], comp), np.array_split(array[18][2], comp), np.array_split(array[18][3], comp)],
        [np.array_split(array[19][0], comp), np.array_split(array[19][1], comp), np.array_split(array[19][2], comp), np.array_split(array[19][3], comp)],
        [np.array_split(array[20][0], comp), np.array_split(array[20][1], comp), np.array_split(array[20][2], comp), np.array_split(array[20][3], comp)],
        [np.array_split(array[21][0], comp), np.array_split(array[21][1], comp), np.array_split(array[21][2], comp), np.array_split(array[21][3], comp)],
        [np.array_split(array[22][0], comp), np.array_split(array[22][1], comp), np.array_split(array[22][2], comp), np.array_split(array[22][3], comp)],
        [np.array_split(array[23][0], comp), np.array_split(array[23][1], comp), np.array_split(array[23][2], comp), np.array_split(array[23][3], comp)],
        [np.array_split(array[24][0], comp), np.array_split(array[24][1], comp), np.array_split(array[24][2], comp), np.array_split(array[24][3], comp)],
        [np.array_split(array[25][0], comp), np.array_split(array[25][1], comp), np.array_split(array[25][2], comp), np.array_split(array[25][3], comp)],
        [np.array_split(array[26][0], comp), np.array_split(array[26][1], comp), np.array_split(array[26][2], comp), np.array_split(array[26][3], comp)],
        [np.array_split(array[27][0], comp), np.array_split(array[27][1], comp), np.array_split(array[27][2], comp), np.array_split(array[27][3], comp)],
    ])
    return massive
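Since every row follows the same template, one hedged way to condense this, keeping the exact column combinations above (including the rows that reuse another pair's Close column, and the EURAUD_High in the last row, in case those are intentional), is a lookup table plus comprehensions; inside the class it would keep its @staticmethod decorator. A sketch:

import numpy as np

# (price pair, pair whose Close column is used), exactly as in the original
PAIRS = [
    ('EURCHF', 'EURCHF'), ('EURGBP', 'EURGBP'), ('EURJPY', 'EURJPY'),
    ('EURNZD', 'EURNZD'), ('EURUSD', 'EURUSD'), ('EURAUD', 'EURAUD'),
    ('EURCAD', 'EURCAD'), ('GBPAUD', 'EURGBP'), ('GBPCHF', 'EURGBP'),
    ('GBPJPY', 'EURGBP'), ('GBPCAD', 'EURGBP'), ('GBPUSD', 'EURGBP'),
    ('GBPNZD', 'EURGBP'), ('USDCHF', 'EURUSD'), ('USDJPY', 'EURUSD'),
    ('AUDUSD', 'EURAUD'), ('NZDUSD', 'EURNZD'), ('USDCAD', 'EURUSD'),
    ('AUDJPY', 'EURAUD'), ('CADJPY', 'EURCAD'), ('CHFJPY', 'EURCHF'),
    ('NZDJPY', 'EURNZD'), ('AUDCHF', 'EURAUD'), ('CADCHF', 'EURCAD'),
    ('NZDCHF', 'EURNZD'), ('AUDNZD', 'EURAUD'), ('NZDCAD', 'EURNZD'),
    ('AUDCAD', 'AUDCAD'),
]

def pandas_to_array_big(data, comp):
    cols = [[f'{p}_Open', f'{p}_High', f'{p}_Low', f'{c}_Close'] for p, c in PAIRS]
    cols[-1][1] = 'EURAUD_High'  # the original's last row also swaps in EURAUD_High
    # one (4, n_rows) block per pair, same as np.transpose(np.array(data[cols]))
    array = np.array([data[c].to_numpy().T for c in cols], dtype=np.float64)
    # split every (pair, O/H/L/C) series into `comp` chunks
    massive = np.array([[np.array_split(series, comp) for series in block]
                        for block in array])
    return massive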
-
Pandas interpolate - "Time"
So, I went through the pandas documentation on interpolation. I noticed there are multiple methods that can handle missing data: linear, time, pad, nearest, spline, and so on.

I would like to learn how time interpolation works. Imagine we are getting sensor data every 5 minutes, and there is a pattern from morning till night that is almost consistent daily, with slight variations from time to time. How do I use time interpolation, and how will it behave if for two days we have no data, then one day we have data, then a day is filled for only half the day, and then there is continuous data again for many days?

How does time interpolation handle this? Imagine a scenario where I am capturing data from an API onto GCP or AWS. If I use time interpolation, will it use the same time of day from an earlier day to pattern the missing spots? Or the same day's earlier data up to the gap? What is the formula or methodology used here to fill missing values?
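For what it is worth, method='time' does not look at daily patterns or earlier days at all: it is plain linear interpolation that uses the timestamps themselves as the x-axis, so each gap is filled along the straight line between the nearest valid points before and after it, weighted by elapsed time. A minimal sketch:

import pandas as pd
import numpy as np

idx = pd.to_datetime(['2021-01-01 00:00', '2021-01-01 00:05',
                      '2021-01-01 00:20', '2021-01-01 00:25'])
s = pd.Series([0.0, np.nan, np.nan, 5.0], index=idx)

# 'time' interpolates linearly in elapsed time between the surrounding
# valid points, whatever the spacing: value = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
print(s.interpolate(method='time'))
# 00:05 -> 1.0 (5/25 of the way) and 00:20 -> 4.0 (20/25 of the way)

So a two-day gap becomes one long straight ramp between the last value before the gap and the first value after it; nothing pattern-aware happens. Reproducing daily structure would need something beyond interpolate, such as a seasonal model or a groupby over time-of-day.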
-
R: Inter- and extrapolate values in dataframe by matching column of another dataframe
I have two dataframes:
df1 <- data.frame(levels = c(1, 3, 5, 7, 9),
                  values = c(2.2, 5.3, 7.9, 5.4, 8.7))
df2 <- data.frame(levels = c(1, 4, 8, 12))  # other columns not necessary
I want the df1$values to be interpolated to the df2$levels, based on what the numbers in df1$levels are. So there is some interpolation, but also extrapolation to level 12 in the second dataframe.
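To illustrate the required behavior in Python (the language used elsewhere on this page), a hedged sketch: linear interpolation inside the known levels, with linear extrapolation of the last segment beyond them. In R, approxfun() covers the interpolation part, while the extrapolation to level 12 needs a linear extension of the last segment.

# Sketch of the desired behavior, shown in Python for illustration only:
from scipy.interpolate import interp1d

levels = [1, 3, 5, 7, 9]
values = [2.2, 5.3, 7.9, 5.4, 8.7]

f = interp1d(levels, values, kind='linear', fill_value='extrapolate')
print(f([1, 4, 8, 12]))  # level 12 is extrapolated along the 7-to-9 segment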
-
Matplotlib Scatter Interpolation line
I have the following scatter plot with two dataframes (users and customers).
x = [users]
y = [customers]

plt.scatter(x, y)
plt.show()
The scatter plot is working, but what is the right way to add an interpolation line through the plotted points?
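A sketch of one common approach, assuming users and customers are one-dimensional numeric sequences: sort the points by x, then either connect them directly or evaluate an interpolator on a dense grid for a smooth curve.

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d

x = np.asarray(users)        # assumed 1-D numeric
y = np.asarray(customers)

order = np.argsort(x)        # a connecting line needs x in ascending order
plt.scatter(x, y)

# piecewise-linear line through the points:
plt.plot(x[order], y[order], '-')

# or a smooth cubic curve on a dense grid (needs >= 4 distinct x values):
f = interp1d(x[order], y[order], kind='cubic')
xs = np.linspace(x.min(), x.max(), 200)
plt.plot(xs, f(xs), '--')

plt.show()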
-
Determining what your f(x,y) is for double integrals
How do you determine the integrand for double integrals? Sometimes the expression is given, and sometimes you have to determine it yourself. How do you find what your f(x,y) will be?
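As general background rather than anything specific to this post: the integrand is the density of whatever quantity the integral accumulates, per unit area. When no expression is given, ask what a small patch $dA$ of the region contributes to the total. For example:

$$\text{Volume under } z = g(x,y):\ \iint_R g(x,y)\,dA, \qquad \text{Mass of a plate with density } \rho:\ \iint_R \rho(x,y)\,dA, \qquad \text{Area of } R:\ \iint_R 1\,dA.$$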
-
How to measure distance on linear polar transformed data using OpenCV
I am working with some image data on which I am doing a polar transform, where I want to measure the width of bright rings in a circular object.
Example image:
So far I have something like this using faux data:
import cv2
import numpy as np

img = cv2.imread('testimg.tif')
img_gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# threshold image to calculate center of object
ret, thresh = cv2.threshold(img_gry, 254, 255, cv2.THRESH_BINARY_INV)
M = cv2.moments(thresh)
cX = int(M["m10"] / M["m00"])
cY = int(M["m01"] / M["m00"])

# convert white space around object to 0 intensity
img_gry[img_gry == 255] = 0

# calculate radius of image to be used for polar transform
radius = np.sqrt(((img_gry.shape[0]/2.0)**2.0) + ((img_gry.shape[1]/2.0)**2.0))

# transform using center coordinates and radius
polar_image = cv2.linearPolar(img_gry, (cX, cY), radius, cv2.WARP_FILL_OUTLIERS)
polar_image = polar_image.astype(np.uint8)

# add gaussian smoothing
polar_blurred = cv2.GaussianBlur(polar_image, (3, 3), 0)
This transform looks something like this:
And I will be looking at slices of the data that show intensity, like so:

My question from here is what formula to use to calculate the width of the bright peaks in the image. I don't really know what kind of axes this transformation is displayed on, which is the root of my problem. For example, my non-transformed peaks have a width of ~3 px, but the transformed data has a peak width of 8 units (radians? no clue). I am wondering how exactly I can estimate the actual width in my non-transformed data from the "distance" in this polar-transformed data.
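For reference, and worth checking against the OpenCV docs: cv2.linearPolar maps radius linearly onto the output's x-axis (0 at the left edge, maxRadius at the right edge) and the 0-to-2π angle onto the y-axis, so a horizontal slice is in scaled radius, not radians. Under that assumption, a peak width measured in transformed columns converts back to source-image pixels like this:

# Sketch, assuming linearPolar's output geometry: column x corresponds to
# radius r = x * maxRadius / output_width, row y to angle a = y * 2*pi / output_height.
width_cols = 8                       # measured peak width in transformed columns
out_w = polar_blurred.shape[1]       # output width matches the input width here
radius_per_col = radius / out_w      # source pixels of radius per output column
width_px = width_cols * radius_per_col
print(f"~{width_px:.1f} px of radial width in the original image")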