Python/MatPlotLib: Set the Labels to Months
I have been trying to figure this out for hours. I am working with a dataset on Trump's approval ratings from FiveThirtyEight; the data is specifically from Gallup polls, with 12 polls a month. I cannot find a way to set the x-axis labels to months instead of the entry index. For example, entries 0 through 11 should be labeled Jan, 12 through 23 should be Feb, and so on. [Screenshot of the IPython Notebook I am currently working in] This is the code I currently have:
import matplotlib.pyplot as plt
import seaborn as sns

sns.set()
xcoords = list(range(len(trumpGallup)))
plt.plot(xcoords, trumpGallup.adjusted_approve, 'g-', label="Adjusted Approval (Trump)")
plt.plot(xcoords, trumpGallup.adjusted_disapprove, 'r-', label="Adjusted Disapproval (Trump)")
plt.legend(loc=2)  # enable the legend and place it in the upper-left corner
plt.xlabel("Trump's Ratings from January 2017 to November 2017")
plt.ylabel("Percentage")
plt.show()
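One possible approach, sketched under the assumption stated in the question that every month has exactly 12 entries and that entry 0 is the first January poll: put a tick at the first entry of each month and label it with the month abbreviation. The helper names month_ticks and apply_month_ticks are hypothetical; set_xticks and set_xticklabels are regular matplotlib Axes methods.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def month_ticks(n_entries, per_month=12):
    """Return (positions, labels) with one tick at the first entry of each month.

    Assumes entry 0 is the first January poll and that every month has
    exactly `per_month` entries (an assumption taken from the question).
    """
    positions = list(range(0, n_entries, per_month))
    labels = [MONTHS[i % 12] for i in range(len(positions))]
    return positions, labels


def apply_month_ticks(ax, n_entries, per_month=12):
    """Label the x-axis of a matplotlib Axes `ax` with month names."""
    positions, labels = month_ticks(n_entries, per_month)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels)


# Hypothetical usage with the plot from the question (needs matplotlib and the data):
#     apply_month_ticks(plt.gca(), len(trumpGallup))
#     plt.show()
positions, labels = month_ticks(30)
print(positions, labels)
```

If the dataset has a real date column, plotting against the dates and using matplotlib.dates.MonthLocator with DateFormatter('%b') is likely more robust, since it does not depend on a fixed number of polls per month.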
See also questions close to this topic
what's the difference between list and list[1:] for python3
In a Python 3 script, a branch function is defined to get the branches of a tree:
def branch(tree):
    return tree[1:]
Then a variable t is set to:
t = [3, , [2, , ]]
The printed output of these variables is:
>>> t
[3, , [2, , ]]
>>> branch(t)
[, [2, , ]]
>>> branch(t)

>>> branch(t)
[2, , ]
>>> branch(t)[1:]
[[2, , ]]
You see that 'branch(t)' and 'branch(t)[1:]' are different. Why?
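A minimal sketch of the distinction, using a hypothetical tree (these are not the values from the question): indexing with [1] returns the element itself, while slicing with [1:] returns a new list containing that element and everything after it.

```python
def branch(tree):
    # Everything after the root label: a list of subtrees.
    return tree[1:]


# Hypothetical tree for illustration; not the values from the question.
t = [3, [1], [2, [1], [1]]]

branches = branch(t)      # [[1], [2, [1], [1]]]
second = branches[1]      # indexing returns the element itself
tail = branches[1:]       # slicing returns a new list of elements

print(second)  # [2, [1], [1]]
print(tail)    # [[2, [1], [1]]]
```

So the slice always has one more level of nesting than the indexed element, even when it contains only a single item.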
How to use pd.cut to bin data in a natural way?
Say I have a pandas Series of 100 float data points and I need to put them into 10 equally wide bins, and I need to access, say, the indices of the data in the fourth bin. Then what I tried is:
import pandas as pd
import numpy as np

np.random.seed(1)
s = pd.Series(np.random.randn(100))
cut = pd.cut(s, bins=10, labels=range(10))
fourth_bin = s[cut == 4]
fourth_bin

Out:
9    -0.249370
12   -0.322417
13   -0.384054
16   -0.172428
26   -0.122890
28   -0.267888
31   -0.396754
40   -0.191836
51   -0.352250
53   -0.349343
54   -0.208894
63   -0.298093
65   -0.075572
71   -0.504466
76   -0.306204
80   -0.222328
81   -0.200758
92   -0.375285
96   -0.343854
dtype: float64
which isn't quite natural and looks even a bit clumsy. For example, can I avoid manually setting the labels and just start from pd.cut(s, bins=10)? This way I want to do something like
s[s in pd.cut(s, bins=10).categories]
categories is a list of Intervals, but this doesn't work out.
Is there a more natural way to do this so I don't have to manually set labels?
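One way that avoids passing labels, offered as a sketch rather than the definitive answer: the result of pd.cut is categorical, and its .cat.codes accessor gives the zero-based bin number of every entry. The helper select_bin and the demo series are hypothetical.

```python
import pandas as pd


def select_bin(s, bin_number, bins=10):
    """Return the entries of `s` that fall into the given equal-width bin.

    `bin_number` is zero-based, so 3 selects the fourth bin.
    """
    cut = pd.cut(s, bins=bins)          # categories are Interval objects
    return s[cut.cat.codes == bin_number]


# Deterministic demo data (hypothetical, not the series from the question).
s = pd.Series([float(v) for v in range(100)])
fourth_bin = select_bin(s, 3)
print(fourth_bin.index.tolist())
```

If the bin edges themselves are needed, pd.cut(s, bins=10).cat.categories[3] should give the corresponding Interval.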
Why did python-3.x remove ROT-13 as an encoding?
With python-2.7, you can pretty easily implement a rot-13 Caesar cipher using
>>> 'abcdefghijklmnopqrstuvwxyz'.encode('rot-13')
'nopqrstuvwxyzabcdefghijklm'
You'll even find it in the Zen of Python code in the CPython repository.
However, the same code on python3.6 gives:
>>> 'abcdefghijklmnopqrstuvwxyz'.encode('rot-13')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: 'rot-13' is not a text encoding; use codecs.encode() to handle arbitrary codecs
If I want to use the rot-13 encoding in python3.x, I'll need to import codecs:
>>> import codecs
>>> codecs.encode('abcdefghijklmnopqrstuvwxyz', 'rot-13')
'nopqrstuvwxyzabcdefghijklm'
Of course, this is really a minor issue; I don't mind importing codecs to implement a Caesar cipher (it's a builtin anyway). I'm just curious to know if there was any underlying rationale behind this design decision. Maybe the reason is as simple as "rot-13 isn't really an encoding", I don't know.
If someone can shed some light on this, I'd love to hear it!
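A small sketch that may help frame the question: in Python 3, str.encode() is reserved for text encodings, which turn str into bytes, whereas rot-13 maps str to str and is therefore reached through codecs.encode() and codecs.decode(). This only illustrates the observable behaviour; it does not claim to document the designers' full reasoning.

```python
import codecs

text = "abcdefghijklmnopqrstuvwxyz"

# str.encode() always produces bytes in Python 3.
as_bytes = text.encode("utf-8")

# rot_13 maps str to str, so it is reached through codecs.encode()/decode().
rotated = codecs.encode(text, "rot_13")
restored = codecs.decode(rotated, "rot_13")

print(type(as_bytes).__name__, rotated, restored == text)
```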
Pandas Dataframe - How to check if the string value in column A is available in the list of string items in column B
Here is my dataframe which has two columns: Column A contains string and column B contains list of strings.
import pandas as pd

df = pd.DataFrame(columns=['A', 'B'])
df.loc[0] = ['apple', ['orange', 'banana', 'blueberry']]
df.loc[1] = ['orange', ['orange', 'banana', 'avocado']]
df.loc[2] = ['blueberry', ['apple', 'banana', 'blueberry']]
df.loc[3] = ['cherry', ['apple', 'orange', 'banana']]
print(df)

           A                            B
0      apple  [orange, banana, blueberry]
1     orange    [orange, banana, avocado]
2  blueberry   [apple, banana, blueberry]
3     cherry      [apple, orange, banana]
I want to check for each row to see if the value in column A is listed in the list in column B of the same row. So, the expected output should be:
0    False
1     True
2     True
3    False
There is isin, which works to check against a static list:
df.A.isin(['orange','banana','blueberry'])
0    False
1     True
2    False
3    False
However, when I try to use it to check the list items in the dataframe, it does not work:
df.A.isin(df.B)
TypeError: unhashable type: 'list'
I would like to avoid a for loop or lambda if there is a solution available using Pandas.
Any help is greatly appreciated.
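As far as I know, a column of Python lists holds generic objects, so the check ends up being done row by row in one form or another. A minimal sketch using a list comprehension over the paired columns; the helper name in_own_list is hypothetical, and the data is the same as in the DataFrame above, written as plain lists.

```python
def in_own_list(values, lists):
    """For each pair, report whether the value occurs in its own list."""
    return [value in items for value, items in zip(values, lists)]


# Same data as the DataFrame above, as plain Python lists.
col_a = ["apple", "orange", "blueberry", "cherry"]
col_b = [["orange", "banana", "blueberry"],
         ["orange", "banana", "avocado"],
         ["apple", "banana", "blueberry"],
         ["apple", "orange", "banana"]]

result = in_own_list(col_a, col_b)
print(result)  # [False, True, True, False]

# With pandas available, the same call would be (hypothetical usage):
#     df["A_in_B"] = in_own_list(df["A"], df["B"])
```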
Setting with Copy Warning
The above statement gives the following warning:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
Any idea what's wrong?
How to match conditions in an R dataframe
I have a dataframe which looks like:
Count_ID  Stats  Date
123       A      10-01-2017
123       A      12-01-2017
123       B      15-01-2017
456       B      18-01-2017
456       C      17-01-2017
789       A      20-01-2017
486       A      25-01-2017
486       A      28-01-2017
I want to add Status and Count columns to the dataframe, filled in according to the rules below.
- Match the oldest Count_ID as per date having Stats as "A"; compare whether any Count_ID with the same value (i.e. 123) has a date later than the previous one with the same Stats "A", then show "False" in the Status column.
- If there are multiple Count_ID with the same value (i.e. 123), then check Stats "A"; if any with the same Stats other than "A", or with "A", have a date later than those having Stats "A", then show the status as "False".
- If there are multiple same Count_ID (i.e. 123) having Stats as "A" with a date difference < 30 days (w.r.t. the previous Count_ID as per Date), show the status as "False-B".
- In the Count column, show the difference in days between the same Count_ID created from previous
- Where no condition applies, show "-".
Count_ID  Stats  Date        Status   Count
123       A      10-01-2017  False-B  0
123       A      12-01-2017  False-B  2
123       B      15-01-2017  False    3
456       B      18-01-2017  -        0
456       C      17-01-2017  False    1
789       A      20-01-2017  -        0
486       A      25-01-2017  False-B  0
486       A      28-01-2017  False-B  3
Add text with PdfPages - matplotlib
Following this example from the official documentation, I can create a PDF file with the plots that I want on different pages. But I would like to add some text to the page (not inside the plot), and I've tried it this way without success:
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

with PdfPages('multipage_pdf.pdf') as pdf:
    fig = plt.figure(figsize=(11.69, 8.27))
    x = df1.index
    y1 = df1[col1]
    y2 = df1[col2]
    plt.plot(x, y1, label=col1)
    plt.plot(x, y2, label=col2)
    plt.legend(loc='best')
    plt.grid(True)
    plt.title('Title')
    txt = 'this is an example'
    plt.text(1, 1, txt)  # positioned in data coordinates
    pdf.savefig()
    plt.close()
How can I also show the text "this is an example"? Is it also possible to create a first page with only text? Thanks in advance.
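A possible sketch, not necessarily the only way: plt.text positions text in data coordinates, so it can land outside the visible axes, while Figure.text uses figure coordinates from 0 to 1 and places text on the page itself. A figure without axes can serve as a text-only first page. The output path and the demo data below are hypothetical.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Hypothetical output location for the demo.
pdf_path = os.path.join(tempfile.mkdtemp(), "multipage_pdf.pdf")

with PdfPages(pdf_path) as pdf:
    # Page 1: text only, no axes.
    cover = plt.figure(figsize=(11.69, 8.27))
    cover.text(0.5, 0.5, "this is an example", ha="center", va="center", size=24)
    pdf.savefig(cover)
    plt.close(cover)

    # Page 2: a plot plus a note placed in figure coordinates (0 to 1),
    # so it sits on the page rather than inside the data area.
    fig, ax = plt.subplots(figsize=(11.69, 8.27))
    ax.plot([0, 1, 2], [0, 1, 4], label="demo data")
    ax.legend(loc="best")
    fig.text(0.05, 0.02, "this is an example")
    pdf.savefig(fig)
    plt.close(fig)

print(os.path.getsize(pdf_path) > 0)
```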
Python: plotting an exponential on an axis
I'm currently working on a piece of code to model the evolution of the dark energy equation of state parameter w with the scale factor a. In order to do this I am solving a system of three coupled ODEs; however, the derivative used is with respect to e-foldings N = ln(a) (in the code x = w and ln(a) = t for simplicity). I have the following code:
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
import math

plt.rc('text', usetex=True)
plt.rc('font', family='serif')

def f(s, t):
    p = 1.0
    G = 1.0 + (1.0/p)
    xm = 0
    # unpack the state vector: x = w, plus the two other coupled variables
    x = s[0]
    y = s[1]
    z = s[2]
    dxdt = (x - 1.0)*(3.0*(1.0 + x) - z*math.sqrt(3.0*(1.0 + x)*y))
    dydt = -3.0*(x - xm)*y*(1.0 - y)
    dzdt = -math.sqrt(3.0*(1.0 + x)*y)*(G - 1.0)*(z**2)
    return [dxdt, dydt, dzdt]

t = np.linspace(0.0001, 1, 10000)
s0 = [-0.667, 0.01, 0.45]
s = odeint(f, s0, t)

plt.plot(t, s[:, 0], 'b-')
plt.grid(True)
plt.xlabel('e-foldings, N = ln(a)')
plt.ylabel('Equation of state parameter w')
plt.show()
which gives me this plot.
This works fine; however, I want the x-axis in units of N = ln(a) but I can't figure out how to make it work. I've tried changing the plot line to plt.plot(math.exp(t),s[:,0],'b-') but I get the following error:
Traceback (most recent call last):
  File "/Users/bradleyaldous/propr2.py", line 26, in <module>
    plt.plot(math.exp(t),s[:,0],'b-')
TypeError: only size-1 arrays can be converted to Python scalars
[Finished in 6.0s]
Any help is greatly appreciated.
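The TypeError comes from math.exp, which only accepts a single number. A minimal sketch of the elementwise alternative, np.exp, on a short stand-in grid (the five-point grid is hypothetical, chosen only to keep the demo small):

```python
import math

import numpy as np

t = np.linspace(0.0001, 1, 5)   # stand-in for the N = ln(a) grid above

# math.exp accepts a single number; on a multi-element array it raises TypeError.
try:
    math.exp(t)
    scalar_call_failed = False
except TypeError:
    scalar_call_failed = True

# np.exp is applied elementwise, giving a = exp(N) for every grid point.
a = np.exp(t)

print(scalar_call_failed, a.shape)
```

With the code from the question, the plot line would then presumably become plt.plot(np.exp(t), s[:,0], 'b-').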
How to configure Jupyter notebook to use the correct paths associated with its kernel?
My goal is to launch jupyter notebook for python using python 2.7. The reason is that all of the python modules I need are installed under 2.7.
So, I launch jupyter notebook just by running jupyter notebook from the terminal, and I get this in the upper right-hand corner: [image]
That tells me that it is launching using python 2, I assume?
So, I assume it has access to all of the modules that I need for doing dev work in python2.7. When I launch python from the terminal and run import matplotlib, it works perfectly fine (and that is using python2.7).
But then, when I try import matplotlib in the notebook the way I always do it in python 2.7, it tells me that there is No module named matplotlib.
So, I print out !echo $PATH, only to discover that I see all paths related to 3.5, and nothing relating to python 2.7.
How do I change the path that jupyter notebook is launching with? And does the picture I posted mean anything about the path it is using? I assumed that it was launching using the same interpreter as when I run python from the terminal, as I mentioned before.
Other info - MacOSX
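A quick diagnostic that may help narrow this down, as a sketch: run the following both in a notebook cell and in the python started from the terminal, then compare the output. It uses only the standard sys module and shows which interpreter is actually running and where it looks for modules.

```python
import sys

# Path of the Python interpreter running this code (the notebook kernel
# when executed in a notebook cell).
interpreter = sys.executable

# Major.minor version of that interpreter, e.g. "2.7" or "3.5".
version = "%d.%d" % (sys.version_info[0], sys.version_info[1])

print(interpreter)
print(version)
# sys.path lists the directories searched by import.
print(sys.path[:3])
```

If the two environments report different interpreters, the kernel is not the terminal's python; running jupyter kernelspec list from the terminal shows which kernels are registered.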
How to adjust x axis labels in a seaborn plot?
I am trying to plot a histogram similar to this: [image: actual plot]
However, I am unable to customize the x-axis labels to match the figure above. My seaborn plot looks something like this: [image: my plot]
I want the same x-axis labels, ranging from 0 to 25000 at equal intervals of 5000. It would be great if anyone could guide me in the right direction.
Code for my figure:
sns.set_style('darkgrid')
kws = dict(linewidth=.3, edgecolor='k')
g = sns.FacetGrid(college, hue='Private', size=6, aspect=2, palette='coolwarm')
g = g.map(plt.hist, 'Outstate', bins=24, alpha=0.7, **kws).add_legend()
Add Second Colorbar to a Seaborn Heatmap / Clustermap
I was trying to help someone add a colorbar for the vertical blue bar in the image below. We tried many variations of plt.colorbar(row_colors) (like above and below sns.clustermap()) and looked around online for 2 hours, but no luck. We just want to add a colorbar for the blues, please help!
import pickle
import numpy as np
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

feat_mat, freq, label = pickle.load(open('file.pkl', 'rb'))
feat_mat_df = pd.DataFrame(feat_mat)
freq_df = pd.DataFrame(freq)
freq_df_transposed = freq_df.transpose()

# map each distinct value in column 4 to one shade of blue
my_palette = dict(zip(set(freq_df_transposed[int('4')]),
                      sns.color_palette("PuBu", len(set(freq_df_transposed[int('4')])))))
row_colors = freq_df_transposed[int('4')].map(my_palette)

sns.clustermap(feat_mat_df, metric="euclidean", standard_scale=1,
               method="complete", cmap="coolwarm", row_colors=row_colors)
plt.show()
Changing color scale/gradient vertically in bar like plot using seaborn
I want a vertical gradient for each bar of a seaborn barplot/countplot.
# to reproduce the above plot
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

sns.set(style="whitegrid", color_codes=True)
np.random.seed(sum(map(ord, "categorical")))
titanic = sns.load_dataset("titanic")
sns.countplot(x="deck", data=titanic, palette="Greens_d")
plt.show()
This image has a horizontal gradient, but I want the gradient to be vertical, like the linear down or linear up gradient in Excel: https://support.office.com/en-us/article/add-a-gradient-color-to-a-shape-11cf6392-723c-4be8-840a-b2dab4b2ba3e
See the example at https://matplotlib.org/gallery/lines_bars_and_markers/gradient_bar.html for a vertical gradient. Ignore the background; the colour is immaterial.
p.s. Newbie to seaborn