'Series' object is not callable: linspace function
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
twitter_data = pd.read_csv('result.csv')
#histogram of x axis followers and y axis retweets
plt.figure()
hist1,edges1 = np.histogram(twitter_data.friends)
plt.bar(edges1[:-1], hist1, width=edges1[1:] - edges1[:-1])
plt.scatter(twitter_data.followers,twitter_data.retwc)
#results of x axis retwx and y axis followers
y = twitter_data.retwc
X = twitter_data.followers
X = sm.add_constant(X)
lr_model = sm.OLS(y,X).fit()
print(lr_model.summary())
#scatter plot full for retc and followers
X_prime = np.linspace(X.followers.min(), X.followers().max(),100)
X_prime = sm.add_constant(X_prime)
I am getting the error
'Series' object is not callable
and it points to the line X_prime = np.linspace(X.followers.min(), X.followers().max(),100).
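For reference, a minimal sketch of what seems to be going on, assuming the DataFrame has a followers column as in the question (the sample values below are made up and stand in for result.csv). X.followers is a column, i.e. a Series, so putting parentheses after it tries to call the Series. Dropping the extra parentheses before .max() avoids the error:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for result.csv; only the column name comes from the question.
twitter_data = pd.DataFrame({"followers": [10, 250, 40, 1200, 75]})
X = twitter_data[["followers"]]

# X.followers is a Series (a column), not a method, so calling it fails.
try:
    X.followers().max()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Attribute access without the call parentheses works; min() and max() are the methods.
X_prime = np.linspace(X.followers.min(), X.followers.max(), 100)
```

The same applies after sm.add_constant(X), since that keeps followers as a column of the returned DataFrame.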
See also questions close to this topic

Trouble operating on my pandas dataframe in a loop?
I am trying to loop through a list that contains two pandas data frames:
dataset = [df_train, df_test]
for df in dataset:
    df = pd.get_dummies(df, columns=['A', 'B','C'])
I was expecting this to give me updated versions of df_train and df_test with the dummy variables included, however it leaves them unchanged. When I check df, it is the expected updated df_test with the dummy variables. I am guessing this has something to do with Python's memory allocation and only referencing the variable, or something like that?
I also tried the following, with the same result:
for df in dataset:
    df = pd.get_dummies(df.copy(), columns=['A', 'B','C'])
I also tried without success:
for i in range(len(dataset)):
    df = dataset[i]
    dataset[i] = pd.get_dummies(df, columns=['A', 'B','C'])
I currently have a workaround which is:
df_train = pd.get_dummies(df_train, columns=['A', 'B','C'])
df_test = pd.get_dummies(df_test, columns=['A', 'B','C'])
This is fine because I only have two dataframes, but I would like to know what I am missing about what Python is doing that lets me overwrite df but not df_train and df_test.
I had no problem doing similar operations elsewhere in my code so long as I had inplace=True set, e.g.
for df in dataset:
    df.drop('A', axis=1, inplace=True)
The code above ran fine, which I can only assume is because it works in place. This seems like a Python memory thing? Can anyone explain please?
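A small sketch of the difference, using made-up frames (only the column names A, B, C come from the question). Assigning to df inside the loop just points the name df at a new object, while inplace=True mutates the object that df_train and df_test also refer to. One way to keep the results is to rebind the outer names explicitly:

```python
import pandas as pd

# Hypothetical frames; only the column names A, B, C come from the question.
df_train = pd.DataFrame({"A": ["x", "y"], "B": ["p", "q"], "C": ["m", "n"], "v": [1, 2]})
df_test = pd.DataFrame({"A": ["y", "x"], "B": ["q", "p"], "C": ["n", "m"], "v": [3, 4]})

original_train = df_train
for df in [df_train, df_test]:
    # This rebinds the loop variable only; df_train and df_test are untouched.
    df = pd.get_dummies(df, columns=["A", "B", "C"])
unchanged = df_train is original_train and "A" in df_train.columns

# Rebinding the outer names keeps the transformed frames.
df_train, df_test = [
    pd.get_dummies(df, columns=["A", "B", "C"]) for df in (df_train, df_test)
]
```

The third attempt in the question (dataset[i] = ...) does update the list, but df_train and df_test still name the old frames, so the new ones would have to be read back out of dataset.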

Interactive pie chart in bokeh; reactively swap in variables to plot
I am fairly new to Python and Bokeh so I am still trying to grasp how to interactively swap in variables to plot. At the moment I am trying to create a pie chart with bokeh widgets.
Here is what the pie chart looks like :
from operator import index
from bokeh.models.widgets.markups import Div
import numpy as np
from numpy.lib import source
import pandas as pd
from bokeh.io import curdoc,show
from bokeh.layouts import column, row, gridplot
from bokeh.models import ColumnDataSource, Select, Slider, BoxSelectTool, LassoSelectTool, Tabs, Panel, LinearColorMapper, ColorBar, BasicTicker, PrintfTickFormatter, MultiSelect, DataTable, TableColumn
from bokeh.plotting import figure, curdoc
from bokeh.palettes import viridis, gray, cividis, Category20, Category20c
from bokeh.transform import factor_cmap,cumsum
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import classification_report, confusion_matrix, mean_squared_error, r2_score, recall_score, f1_score
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from math import pi
from bokeh.transform import cumsum

np.random.seed(42)
print("step 1")
#define the categorical variable
category_a = ['A','B','C']
category_b = ['X','Y','Z']
print("step 2")
df_random = pd.DataFrame({
    'id': np.arange(0, 100),
    'date': pd.date_range(start='1/1/2021', periods=100, freq='D'),
    'month':np.random.randint(1, 12, 100),
    'sensor_1': np.random.uniform(0, 1,100),
    'sensor_2': np.random.uniform(10, 150, 100),
    'sensor_3': np.random.randint(0, 90, 100),
    'sensor_4': np.random.randint(0, 450, 100),
    'sensor_5': np.random.randint(0, 352, 100),
    'categorya': np.random.choice(category_a, 100, p=[0.2, 0.4, 0.4]),
    'categoryb': np.random.choice(category_b, 100, p=[0.6, 0.2, 0.2]),
})
column_choices = {
    "Sensor 1": "sensor_1",
    "Sensor 2": "sensor_2",
    "Sensor 3": "sensor_3",
    "Sensor 4": "sensor_4",
    "Sensor 5": "sensor_5"
}
column_choices_list = list(column_choices.values())
groupeddf = df_random.groupby('categorya')[column_choices_list].mean()
print(groupeddf)
groupeddf['angle'] = groupeddf['sensor_1']/groupeddf['sensor_1'].sum() * 2*pi
groupeddf['color'] = Category20c[len(groupeddf)]
source_pie = ColumnDataSource(data=groupeddf)
p_pie = figure(plot_height=350, title="Pie Chart", toolbar_location=None, tools="")
p_pie.wedge(x=0, y=1, radius=0.4,
            start_angle=cumsum('angle', include_zero=True), end_angle=cumsum('angle'),
            line_color="white", fill_color='color', legend='categorya', source=source_pie)
show(p_pie)
However, I am trying to add a widget to change which column in df_random is plotted, which is where I am tripping up:
...
selecty_pie_var = Select(title="Select variable:", options=list(column_choices_list))

def callback(attr,old,new):
    df_platform['angle'] = df_platform[selecty_pie_var.value]/df_platform[selecty_pie_var.value].sum() * 2*pi
    df_platform['color'] = Category20c[len(df_platform)]
    source_pie.data = df_platform
    return source_pie

selecty_pie_var.js_on_change('value', callback)
layoutwithwidgets = row(selecty_pie_var,p_pie)
show(layoutwithwidgets)
And I imagine that the above, to any competent Python user, just looks like a mess. Would someone be able to help me link the widget and the pie chart together so that it updates whenever I change the column? For example, sensor_1 is being plotted now, but I would like to be able to change it to sensor_2, sensor_3, etc. Any help is greatly appreciated :)
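Not a full answer, but the data side of the update can be sketched on its own. The helper name pie_angles and the sample values below are hypothetical; the idea is that whatever callback ends up being wired to the Select widget would recompute the angle column from the chosen sensor column like this:

```python
from math import pi

import pandas as pd


def pie_angles(grouped: pd.DataFrame, column: str) -> pd.Series:
    """Return wedge angles in radians, proportional to grouped[column]."""
    return grouped[column] / grouped[column].sum() * 2 * pi


# Hypothetical grouped means, standing in for groupeddf in the question.
grouped = pd.DataFrame(
    {"sensor_1": [1.0, 1.0, 2.0], "sensor_2": [10.0, 30.0, 60.0]},
    index=["A", "B", "C"],
)
angles_1 = pie_angles(grouped, "sensor_1")
angles_2 = pie_angles(grouped, "sensor_2")
```

Two things worth checking in the posted callback: js_on_change expects a CustomJS callback rather than a Python function (a Python function goes with on_change and needs the app to run under a Bokeh server), and df_platform is not defined in the code shown, so presumably groupeddf was meant.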

Different result pandas Groupby in Python for DF based on Excel and CSV
To make my code faster, I want to switch from Excel input to CSV input data. First, I create two df's that are exactly the same.
demand_data = pd.ExcelFile(r"Input Data\Historical Demand.xlsx")
FY20 = pd.read_excel(demand_data, 'Data FY20')
FY20b = pd.read_csv(r"Input Data\Historical Demand FY20.csv")
The resulting df's are: [image: Based on Excel] [image: Based on CSV]
Next, I want to group my df by some columns using pandas groupby and sum over some colums. I use the following code:
FY20 = FY20.groupby(['SKU', 'Material', 'Plant'])[["OrderQuantity","DeliveredQuantity"]].sum().reset_index()
FY20b = FY20b.groupby(['SKU', 'Material', 'Plant'])[["OrderQuantity","DeliveredQuantity"]].sum().reset_index()
This is the result: [image: Result based on Excel DF] [image: Result based on CSV DF]
This does not make any sense to me, since the two dataframes are exactly the same, but the result is not. How do I get the same groupby result from the CSV based dataframe?
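One thing that might be worth ruling out, purely as an assumption since the actual data is not shown: two frames can look identical when printed and still have different dtypes in the grouping columns, which changes how rows are grouped. The frames and values below are invented to illustrate the check:

```python
import pandas as pd

# Hypothetical miniature frames; the values are invented for illustration.
from_excel = pd.DataFrame(
    {"SKU": [1, 1, 2], "Plant": ["P1", "P1", "P2"], "OrderQuantity": [5, 7, 3]}
)
from_csv = pd.DataFrame(
    {"SKU": ["1", "01", "2"], "Plant": ["P1", "P1", "P2"], "OrderQuantity": [5, 7, 3]}
)

# Columns whose dtypes differ between the two frames.
dtype_diff = {
    col: (from_excel[col].dtype, from_csv[col].dtype)
    for col in from_excel.columns
    if from_excel[col].dtype != from_csv[col].dtype
}

# Numeric keys 1 and 1 fall in one group; string keys "1" and "01" do not.
groups_excel = from_excel.groupby(["SKU", "Plant"])["OrderQuantity"].sum()
groups_csv = from_csv.groupby(["SKU", "Plant"])["OrderQuantity"].sum()
```

Comparing FY20.dtypes with FY20b.dtypes before the groupby would show quickly whether this applies here.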

Code is returning a list of numbers when it should only return 1
Sorry for bad terminology here, I'm new to python.
I have code that produces a long list of integers (180 of them) containing only a single '1'. I want to count the number of 1s in this list and then print that. I know it should only print 1, but instead it prints a 180-number-long list with 179 '0's and one '1'. I'm very confused and can't find anything like this online.
for i in range(0, 179):
    imgr = ndimage.rotate(img, i)
    start_row, start_col = int(0), int(0)
    end_row, end_col = int(Y), int(width)
    cropped_top = imgr[start_row:end_row , start_col:end_col]
    start_row, start_col = int(Y), int(0)
    end_row, end_col = int(height), int(width)
    cropped_bot = imgr[start_row:end_row , start_col:end_col]
    D1 = cv2.matchShapes(cropped_top, cropped_bot, cv2.CONTOURS_MATCH_I2, 0)
    D2 = D1*100
    D3 = math.ceil(D2)
    D4 = np.array(D3)
    D5 = (D4 == 1).sum()
    print(D5)
Edit: included the code as text instead of an image (my mistake). I expect print(D5) to print just the number 1, since that is the actual number of '1's, but it doesn't.
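A sketch of what is probably happening, assuming the print sits inside the loop: D4 holds a single value on each pass, so (D4 == 1).sum() is 0 or 1 every time and gets printed once per rotation. Collecting the per-rotation values first and counting after the loop gives one number. The scores list below is a made-up stand-in for the cv2.matchShapes results:

```python
import math

import numpy as np

# Hypothetical stand-in for the per-rotation cv2.matchShapes results (D1).
scores = [0.0] * 180
scores[42] = 0.004  # the only rotation whose scaled score rounds up to 1

results = []
for d1 in scores:
    # Same scaling and rounding as in the question, stored instead of printed.
    results.append(math.ceil(d1 * 100))

# Count once, after the loop has finished.
count_of_ones = int((np.array(results) == 1).sum())
print(count_of_ones)
```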

Dividing a country, geographically, into equidistant points (Python)
I am trying to split the map of panama into equidistant lat,long coordinates and am using numpy, to create the linspace that will separate them out, based off the lat longs at the bottomLeft, bottomRight, topLeft, and topRight of the map. To do so I'm using a nested for loop.
The problem is that, despite getting the correct number of unique latitudes in the final dictionary, I'm only getting one unique longitude. Can you help me correct the mistake in the loop so that the final dictionary fed into the final dataframe gives 646 in response to grid_centroid.long.nunique()?
Thanks a bunch!
import pandas as pd
import numpy as np

bottomLeft = (7.239013, 82.94546842973114)
bottomRight = (7.239013, 77.177479)
topLeft = (9.62079503922844, 82.94546842973114)
topRight = (9.62079503922844, 77.177479)

df = pd.read_csv('sites_unique.csv')
cols = np.linspace(bottomLeft[1], bottomRight[1], num=276)
rows = np.linspace(bottomLeft[0], topLeft[0], num=646)
df['col'] = np.searchsorted(cols, df['Average of lat'], 'right')
df['row'] = np.searchsorted(rows, df['Average of long'], 'right')

grid_dict = {'lat':[],'long':[]}
for l in range(275):
    lat_index_init = l
    for i in range(645):
        lon_init_index = i
        grid_dict['lat'].append(rows[lat_index_init])
        grid_dict['long'].append(cols[lon_index_init])
        i + 1
    l + 1
    i = 0
grid_centroid = pd.DataFrame(grid_dict)
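As posted, the inner loop assigns lon_init_index but reads lon_index_init. If that second name still holds a value from an earlier run, every row would get the same longitude, which would match the symptom. A loop-free sketch using np.meshgrid is below. It reuses the corner coordinates and the num values from the question, and the snake_case names are mine:

```python
import numpy as np
import pandas as pd

# Corner coordinates as (lat, long), copied from the question.
bottom_left = (7.239013, 82.94546842973114)
bottom_right = (7.239013, 77.177479)
top_left = (9.62079503922844, 82.94546842973114)

lons = np.linspace(bottom_left[1], bottom_right[1], num=276)
lats = np.linspace(bottom_left[0], top_left[0], num=646)

# meshgrid pairs every latitude with every longitude.
lon_grid, lat_grid = np.meshgrid(lons, lats)
grid_centroid = pd.DataFrame({"lat": lat_grid.ravel(), "long": lon_grid.ravel()})
```

Note that with these num values the grid has 646 unique latitudes and 276 unique longitudes, so getting 646 from grid_centroid.long.nunique() would need the two num arguments swapped.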

How to iterate over a numpy matrix based on a condition of another numpy array?
I have a matrix X, and the labels to each vector of that matrix as a np array Y. If the value of Y is +1 I want to multiply the vector of matrix X by another vector W. If the value of Y is -1 I want to multiply the vector of matrix X by another vector Z.
I tried the following loop:
for i in X:
    if Y[i] == 1:
        np.sum(np.multiply(i, np.log(w)) + np.multiply((1 - i), np.log(1 - w)))
    elif Y[i] == -1:
        np.sum(np.multiply(i, np.log(z)) + np.multiply((1 - i), np.log(1 - z)))
IndexError: arrays used as indices must be of integer (or boolean) type
i is the index of X, but I'm not sure how to align the index of X to the value in that index of Y.
How can I do this?
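A sketch of two ways to line the rows up with their labels, using made-up data (the arrays and the helper name log_score are hypothetical). Iterating over X yields rows rather than indices, which is why Y[i] fails. zip pairs each row with its label, and np.where can pick w or z per row without a loop:

```python
import numpy as np

# Hypothetical data; shapes and values are invented for illustration.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Y = np.array([1, -1, 1])
w = np.array([0.9, 0.2])
z = np.array([0.3, 0.6])


def log_score(x, p):
    """Sum of x*log(p) + (1 - x)*log(1 - p), as in the question's loop body."""
    return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))


# Option 1: zip pairs each row of X with its label (any label other than +1 uses z).
scores = np.array([log_score(x, w if y == 1 else z) for x, y in zip(X, Y)])

# Option 2: select w or z per row by broadcasting, then reduce along each row.
P = np.where((Y == 1)[:, None], w, z)
scores_vec = np.sum(X * np.log(P) + (1 - X) * np.log(1 - P), axis=1)
```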