Get labels for horizontal bar plot
I'm trying to plot a few horizontal bar charts in a for loop. For each bar, I want to add a label at the end of the bar showing its count (the same values that the x axis shows).
import matplotlib.pyplot as plt
import seaborn as sns
for i in my_list:
    sns.set_palette("husl")
    ax = sns.barplot(x=data.sort_values(ascending=False), y=data.sort_values(ascending=False).index)
    ax.set_xlabel('number of clicks')
    # for each of the bars, try to get a label
    rects = ax.patches
    for rect in rects:
        y_value = rect.get_x() + rect.get_width() / 2
        x_value = rect.get_height()
        label = "{:.1f}".format(x_value)
        plt.annotate(
            label,
            (x_value, y_value))
    plt.show()
I got all my bar plots, but the labels are out of place and their values are not calculated correctly.
I am following this Stack Overflow post: Adding value labels on a matplotlib bar chart
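A likely cause, going by the loop in the question: on a horizontal bar chart the count is each bar's width, and the bar's vertical centre is get_y() plus half of get_height(), whereas the loop reads get_height() for the value and get_x() for the position. The sketch below swaps those calls. It is only an illustration under assumptions: it uses plain matplotlib barh with made-up counts (the `counts` dict is hypothetical) instead of the seaborn call and `data` from the question, and it selects the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt

# Hypothetical click counts standing in for `data` in the question.
counts = {"home": 120.0, "search": 75.0, "about": 30.0}

fig, ax = plt.subplots()
ax.barh(list(counts.keys()), list(counts.values()))
ax.set_xlabel("number of clicks")

label_positions = []
for rect in ax.patches:
    # On a horizontal bar the value is the bar's width (its length along x).
    x_value = rect.get_width()
    # The label is anchored at the vertical middle of the bar.
    y_value = rect.get_y() + rect.get_height() / 2
    ax.annotate(
        "{:.1f}".format(x_value),
        (x_value, y_value),
        xytext=(3, 0),               # shift the text 3 points right of the bar end
        textcoords="offset points",
        va="center",
    )
    label_positions.append((x_value, y_value))
```

The same loop body should carry over to the Axes returned by sns.barplot for a horizontal plot. For interactive use, drop the Agg line and call plt.show() after the loop.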
See also questions close to this topic

Divide each value in list array
I am trying to divide each value in each list by 80. What I have tried is:
import pandas as pd
dfs = pd.read_excel('ff1.xlsx', sheet_name=None)
dfs1 = {i: x.groupby(pd.to_datetime(x['date']).dt.strftime('%Y%m%d'))['duration'].sum() for i, x in dfs.items()}
d = pd.concat(dfs1).groupby(level=1).apply(list).to_dict()
print(d)
Output:
{'20170506': [197, 250], '20170507': [188, 80], '20170508': [138, 138], '20170509': [216, 222], '20170609': [6]}
Expected output:
1: Divide each value by 80
{'20170506': [2, 3], '20170507': [2, 1], '20170508': [2, 2], '20170509': [2, 2], '20170609': [0]}
2: Take the total of each list and subtract each value from it (for example, 3+2 = 5, giving 5-3 and 5-2)
{'20170506': [3, 2], '20170507': [1, 2], '20170508': [2, 2], '20170509': [2, 2], '20170609': [0]}
How can I do this in Python?
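Both steps can be sketched with dictionary comprehensions. This is a minimal sketch, not a definitive answer: the question does not say how the division should be rounded, so rounding to the nearest integer with round() is an assumption here. Under that assumption some dates differ from the expected output listed above (216/80 = 2.7 rounds to 3, for instance), so swap in `v // 80` or another rule if that is what is intended.

```python
# Values copied from the printed output in the question.
d = {
    "20170506": [197, 250],
    "20170507": [188, 80],
    "20170508": [138, 138],
    "20170509": [216, 222],
    "20170609": [6],
}

# Step 1: divide every value by 80 (assumption: round to the nearest integer).
divided = {day: [round(v / 80) for v in values] for day, values in d.items()}

# Step 2: subtract each value from the total of its own list.
remainder = {
    day: [sum(values) - v for v in values]
    for day, values in divided.items()
}

print(divided)
print(remainder)
```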

Export list of data frames to CSVs in python
I've got a list of data frames that I'm trying to perform a function on and export the results. The function produces a result, which I then want to turn into a data frame and export to a .csv file. Here's what I currently have:
for df, filename in zip(df_list, filename_list):
    function(df)
    results_df = pd.DataFrame(function_results)
    results_df.to_csv(filename)
The error occurs when I try to export the .csv. If I just run the loop with the function and print the results to the console, like so:
for df in df_list:
    function(df)
it works fine. When I try to loop the .csv export, though, I get:
AttributeError: 'list' object has no attribute 'close'
Any ideas?

Tips on saving distribution shape parameters to sqlite3
I have been playing around with distributions in Python using scipy. I have some code which produces a dataframe containing fit parameters for my various data fits. An example is shown below:
fit = pd.DataFrame({'domain': ['T1', 'T1', 'T1'],
                    'type': ['A1', 'A2', 'A3'],
                    'dist': ['triang', 'triang', 'trapz'],
                    'shp': [(0.1, None), (0.1, None), (0.2, 0.8)],
                    'loc': [3, 100, 85],
                    'scale': [60, 50, 95]})
I have a range of these inputs using different distributions, so the shape parameters change and need to be kept in the right order. That way, when a row is retrieved from the database, its parameters fit the order the distribution expects. For example, using the Triangle distribution: (c, d, loc, scale).
I have tried to convert the tuple to a string (keeping the brackets) so it can be viewed in the database. For example, for the 3rd row,
a = str(fit['shp'][2])
produces
'(0.2, 0.8)'
This can be saved to the sqlite3 database. However, I am not too sure how to convert it back to a tuple.
I did try,
b = tuple(a)
But this does not seem to produce the desired result, as shown below.
('(', '0', '.', '2', ',', ' ', '0', '.', '8', ')')
What would you guys recommend about saving the shape tuple to a database?
BJR
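tuple(a) iterates over the string one character at a time, which is why it returns the individual characters. One option, sketched below under assumptions, is to store repr() of the tuple as TEXT and read it back with ast.literal_eval, which parses Python literals (tuples, numbers, None) without executing arbitrary code. The table name `fit`, the column name `shp`, and the in-memory database are illustrative choices, not taken from the question.

```python
import ast
import sqlite3

# Shape tuples as they appear in the 'shp' column of the question.
shapes = [(0.1, None), (0.1, None), (0.2, 0.8)]

conn = sqlite3.connect(":memory:")  # illustrative: an in-memory database
conn.execute("CREATE TABLE fit (id INTEGER PRIMARY KEY, shp TEXT)")

# Store each tuple as its literal text, e.g. '(0.2, 0.8)'.
conn.executemany("INSERT INTO fit (shp) VALUES (?)", [(repr(s),) for s in shapes])

# Read the text back and parse it into a real tuple again.
rows = conn.execute("SELECT shp FROM fit ORDER BY id").fetchall()
restored = [ast.literal_eval(text) for (text,) in rows]
conn.close()

print(restored)
```

The stored text stays readable when browsing the database, and the element order inside each tuple is preserved on the round trip.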

How to fix issue of variables being referenced before assignment
I'm having trouble with this code; the final line seems to be causing the problem:
from astropy.io import fits
import matplotlib.pyplot as plt
import numpy as np
import glob
import pdb  # to debug
filelist = glob.glob('5/img/*.fit')  # reads in all fits files
hdu = fits.open('MCtest_C1.fits')
hdu[0].header
coeff = hdu[0].data  # added in [1].data
print(coeff.shape)
hdu.close()
def make_cube(filelist, x0, x1, y0, y1):
    exptime = np.zeros(len(filelist))
    n = np.zeros(len(filelist))
    jd = np.zeros(len(filelist))
    # pdb.set_trace()  # used to debug
    for i, name in enumerate(filelist):
        hdu = fits.open(name)
        img = hdu[1].data
        nonlin = coeff[:,:,0]*img**3 + coeff[:,:,1]*img**2 + coeff[:,:,2]*img**1 + coeff[:,:,3]*img**0
        exptime[i] = hdu[0].header['EXPTIME']
        n[i] = int((hdu[0].header['RUNSET'].split(':'))[0])
        jd[i] = hdu[0].header['JD']
        hdu.close()
        img = img[x0:x1, y0:y1]  # makes cutout
        idx = (n > 3)  # added 13/10/17
        if (i == 0):
            cube = img.copy()
        else:
            cube = np.dstack((cube, img))
        print(n[i], i, cube.shape)
    print(n, cube.shape)
    idx = (n > 3)
    hdu = fits.PrimaryHDU(cube[:,:,idx])
    hdu.writeto(('cube_corrected_%i_%i_%i_%i.fits' % (x0, x1, y0, y1)), overwrite=True)
    hdu = fits.PrimaryHDU(exptime[idx])
    hdu.writeto(('exptime_corrected_%i_%i_%i_%i.fits' % (x0, x1, y0, y1)), overwrite=True)
    hdu = fits.PrimaryHDU(jd[idx])
    hdu.writeto(('jd_corrected_%i_%i_%i_%i.fits' % (x0, x1, y0, y1)), overwrite=True)
    hdu = fits.PrimaryHDU(nonlin[idx])
    hdu.writeto(('nonlin_corrected_%i_%i_%i_%i.fits' % (x0, x1, y0, y1)), overwrite=True)
make_cube(filelist, 0, 512, 0, 512)
The error is as follows:
UnboundLocalError: local variable 'cube' referenced before assignment
I have tried different indentation of the final line, and it runs with different indentation but it doesn't run the code properly.
I've also looked up other issues associated with this error but I haven't seen anything that could help me.
Any help would be great! Thanks :)
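One plausible cause, offered as a guess since the original indentation is not known: `cube` is only assigned inside the for loop, so if glob.glob('5/img/*.fit') matches no files the loop body never runs and the first use of `cube` after the loop raises UnboundLocalError. The sketch below reproduces that pattern with plain lists and shows a guard; `stack_names` and `stack_names_checked` are hypothetical stand-ins for make_cube, not part of the question's code.

```python
def stack_names(names):
    """Mimic make_cube: `stacked` is only ever assigned inside the loop."""
    for i, name in enumerate(names):
        if i == 0:
            stacked = [name]
        else:
            stacked = stacked + [name]
    return stacked  # raises UnboundLocalError when `names` is empty


def stack_names_checked(names):
    """Same idea, but fail early with a clear message on empty input."""
    if not names:
        raise ValueError("no input files matched; check the glob pattern")
    stacked = []
    for name in names:
        stacked.append(name)
    return stacked


try:
    stack_names([])  # an empty list plays the role of an empty `filelist`
except UnboundLocalError as exc:
    error_message = str(exc)

print(error_message)
print(stack_names_checked(["a.fit", "b.fit"]))
```

Printing `len(filelist)` before calling make_cube would confirm or rule out this explanation.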

Dates do not appear in a plot (using matplotlib)
I would like to plot the following data:
>>> AllSummary
  upload_date  Gross Loan Amount  NewPerfColumn
0    20180219       1.532472e+11   2.624765e+08
1    20180301       1.475863e+11   1.361267e+08
2    20180312       1.376221e+11   1.133450e+08
As shown below, the column "upload_date" is definitely recognized as a date:
>>> AllSummary.dtypes
upload_date          datetime64[ns]
Gross Loan Amount           float64
NewPerfColumn               float64
dtype: object
But when I plot the data, the x-axis shows numbers rather than dates. The code is below:
x = AllSummary['upload_date']
y1 = AllSummary['Gross Loan Amount']
y2 = AllSummary['NewPerfColumn']
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.plot(x, y1, 'g')
ax2.plot(x, y2, 'b')
ax1.set_xlabel('Date')
ax1.set_ylabel('Gross Loan Amount', color='g')
ax2.set_ylabel('Performance', color='b')
plt.show()
Why is that the case?

extrapolation of the curve
I am trying to extrapolate a fitted curve. There are many questions posted about similar problems, but I don't understand any of them. I tried the simplest of the solutions, but it is not working. Could anyone review the code, please?
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
old_settings = np.seterr(all='ignore')
x = np.array([0.21, 0.43, 0.50, 0.65, 0.86, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0])
y = [43, 33, 30, 24, 18, 16, 14, 13, 14, 13, 13]
yerr = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
xerr = [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]
def func(x, a, b, c, d):
    return a * np.exp(b * x) + c * np.exp(d * x)
def func1(x, m, n):
    return m * np.exp(n * x)
def func2(x, s, t):
    return s * np.exp(t * x)
# Here you give the initial parameters for a,b,c which Python then iterates over
# to find the best fit
popt, pcov = curve_fit(func, x, y, p0=(1, 1e6, 0.5, 1))
print(popt)  # This contains your three best fit parameters
p1 = popt[0]  # This is your a
p2 = popt[1]  # This is your b
p3 = popt[2]  # This is your c
p4 = popt[3]  # This is your d
residuals = y - func(x, p1, p2, p3, p4)
fres = sum((residuals**2) / func(x, p1, p2, p3, p4))  # The chi-square of your fit
print(fres)
""" Now if you need to plot, perform the code below """
curvey = func(x, p1, p2, p3, p4)  # This is your y axis fitline
curvey1 = func1(x, p1, p2)  # This is your y axis fitline
curvey2 = [func2(i, p3, p4) for i in x]  # This is the curve I want to extrapolate to x=0
plt.plot(x, curvey, 'red', label=r"Fit: $A \cdot e^{a \cdot x}+ B \cdot e^ {b \cdot x}$" "\n" r"$\chi ^2: 0.52 $")
plt.plot(x, curvey1, 'blue', label=r"Fit: $A \cdot e^{a \cdot x}$")
plt.plot([0]+x, curvey2, 'green', label=r"Fit: $B \cdot e^{b \cdot x}$")
plt.errorbar(x, y, yerr=yerr, xerr=xerr, fmt='.', label='experimental data')
plt.legend(loc='best')
plt.ylim(0, 45)
plt.xlabel('x')
plt.ylabel('y')
plt.show()
The curve I want to extrapolate is curvey2, the green line (marked with a comment in the code).
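One thing worth noting: because x is a NumPy array, the expression [0]+x adds zero to every element instead of prepending a point at 0, so the green curve is still drawn over the original x values only. A common way to extrapolate is to evaluate the fitted function on a new grid that starts at 0. The sketch below shows that idea for func2; the values of `s` and `t` are hypothetical placeholders for popt[2] and popt[3], not results of the fit.

```python
import numpy as np

def func2(x, s, t):
    # Single exponential term, as defined in the question.
    return s * np.exp(t * x)

x = np.array([0.21, 0.43, 0.50, 0.65, 0.86, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0])

# [0] + x broadcasts: it returns x unchanged, with no extra point at 0.
same_as_x = [0] + x

# Hypothetical parameters standing in for popt[2] and popt[3].
s, t = 13.0, 0.01

# Evaluate the curve on a dense grid that begins at x = 0.
x_ext = np.linspace(0.0, x.max(), 200)
curvey2_ext = func2(x_ext, s, t)
```

Plotting would then use plt.plot(x_ext, curvey2_ext, 'green', ...) in place of the [0]+x call.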

How to adjust x axis labels in a seaborn plot?
I am trying to plot a histogram similar to this: Actual plot
However, I am unable to customize the x-axis labels to match the figure above. My seaborn plot looks something like this: my plot
I want the same x-axis labels, ranging from 0 to 25000 at equal intervals of 5000. It would be great if anyone could point me in the right direction.
Code for my figure:
sns.set_style('darkgrid')
kws = dict(linewidth=.3, edgecolor='k')
g = sns.FacetGrid(college, hue='Private', size=6, aspect=2, palette='coolwarm')
g = g.map(plt.hist, 'Outstate', bins=24, alpha=0.7, **kws).add_legend()
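Tick positions can be set explicitly on the underlying matplotlib Axes. The sketch below is a matplotlib-only illustration under assumptions: the `outstate` list is randomly generated stand-in data (the `college` frame is not available here), and the Agg backend is chosen so it runs without a display.

```python
import random

import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt

random.seed(0)
# Hypothetical stand-in for the 'Outstate' column of `college`.
outstate = [random.uniform(2000, 22000) for _ in range(500)]

fig, ax = plt.subplots()
ax.hist(outstate, bins=24, alpha=0.7, linewidth=0.3, edgecolor="k")

# Ticks from 0 to 25000 in steps of 5000, with matching axis limits.
ticks = list(range(0, 25001, 5000))
ax.set_xticks(ticks)
ax.set_xlim(0, 25000)
```

With the FacetGrid from the question, g.set(xticks=ticks, xlim=(0, 25000)) should apply the same settings to every subplot, since FacetGrid.set forwards its keyword arguments to each Axes.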

Add Second Colorbar to a Seaborn Heatmap / Clustermap
I was trying to help someone add a colorbar for the vertical blue bar in the image below. We tried many variations of plt.colorbar(row_colors) (like above and below sns.clustermap()) and looked around online for 2 hours, but no luck. We just want to add a colorbar for the blues, please help!
import pickle
import numpy as np
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
feat_mat, freq, label = pickle.load(open('file.pkl', 'rb'))
feat_mat_df = pd.DataFrame(feat_mat[4])
freq_df = pd.DataFrame(freq)
freq_df_transposed = freq_df.transpose()
my_palette = dict(zip(set(freq_df_transposed[int('4')]), sns.color_palette("PuBu", len(set(freq_df_transposed[int('4')])))))
row_colors = freq_df_transposed[int('4')].map(my_palette)
sns.clustermap(feat_mat_df, metric="euclidean", standard_scale=1, method="complete", cmap="coolwarm", row_colors=row_colors)
plt.show()

Changing color scale/gradient vertically in bar like plot using seaborn
I want a vertical gradient on each bar of a seaborn barplot/countplot.
# to reproduce above plot
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="whitegrid", color_codes=True)
np.random.seed(sum(map(ord, "categorical")))
titanic = sns.load_dataset("titanic")
sns.countplot(x="deck", data=titanic, palette="Greens_d")
plt.show()
This image has a horizontal gradient, but I want the gradient to be vertical, like the linear down or linear up gradient in Excel: https://support.office.com/enus/article/addagradientcolortoashape11cf6392723c4be8840ab2dab4b2ba3e
See the example at https://matplotlib.org/gallery/lines_bars_and_markers/gradient_bar.html for a vertical gradient. Ignore the background; the colour is immaterial.
P.S. I am new to seaborn.