In pandas, set_index is not creating a hierarchical index
I have a data frame that I am trying to index hierarchically by two columns, State and RegionName. However, whenever I set the index, I get what I can only describe as parallel indexing rather than hierarchical indexing. I tried the same code on a different data set and did not run into this issue.
df = pd.read_csv('City_Zhvi_AllHomes.csv')
df.set_index(["State","RegionName"], inplace = True)
The results look like this:
I have looked around Stack Overflow but have been unable to find an answer to this, or even a similar question. Any and all help will be appreciated. Thanks!
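A minimal sketch of what may be going on, assuming the CSV rows are not grouped by State (the screenshot is not available, so this is a guess). set_index with two columns does build a MultiIndex, but the printed output only blanks out a repeated outer label when the repeats sit on adjacent rows, so unsorted data can look "parallel". The small frame below is a hypothetical stand-in for City_Zhvi_AllHomes.csv.

```python
import pandas as pd

# Hypothetical stand-in for City_Zhvi_AllHomes.csv; the real file is not shown.
df = pd.DataFrame({
    "State": ["NY", "CA", "NY", "CA"],
    "RegionName": ["New York", "Los Angeles", "Buffalo", "San Diego"],
    "Price": [500, 600, 150, 550],
})

indexed = df.set_index(["State", "RegionName"])

# The index is already hierarchical, even if the printout repeats every State.
print(type(indexed.index).__name__)  # MultiIndex
print(indexed.index.nlevels)         # 2

# Sorting groups equal State labels together, so the display nests them.
sorted_df = indexed.sort_index()
print(sorted_df)

# Selecting on the outer level returns the rows for that State.
ca_rows = sorted_df.loc["CA"]
print(ca_rows)
```

If the index type printed here is MultiIndex for the real data as well, the difference from the other data set is likely only in row order, not in the index itself.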
See also questions close to this topic

How to programmatically get parameter names and values in scipy
Is there any way to get the parameters of a distribution? I know almost every distribution has "loc" and "scale", but there are differences between them: for example, alpha has "a", and beta has "a" and "b".
What I want to do is programmatically print (after fitting a distribution) key-value pairs of parameter and value.
But I don't want to write a print routine for every possible distribution.
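One possible approach, sketched under the assumption that only continuous scipy.stats distributions are involved: each distribution object exposes a shapes attribute (a comma-separated string such as "a, b", or None when there are no shape parameters), and fit returns the shape values followed by loc and scale. The helper names param_names and fitted_params are made up for this example.

```python
from scipy import stats


def param_names(dist):
    """Names of the shape parameters of a continuous distribution, then loc and scale."""
    # dist.shapes is a string like "a, b", or None for distributions such as norm.
    shape_names = [s.strip() for s in dist.shapes.split(",")] if dist.shapes else []
    return shape_names + ["loc", "scale"]


def fitted_params(dist, data):
    """Fit dist to data and pair each fitted value with its parameter name."""
    values = dist.fit(data)  # tuple: shape parameters first, then loc, scale
    return dict(zip(param_names(dist), values))


# Synthetic sample, only for illustration.
data = stats.beta.rvs(2.0, 5.0, size=500, random_state=0)
result = fitted_params(stats.beta, data)
for name, value in result.items():
    print(name, value)
```

The same two helpers work unchanged for stats.alpha, stats.norm, and other continuous distributions, since the names are read from the distribution object rather than hard-coded.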

Getting global variables using globals() calling from a module
I want a list of the "global" variables given by globals() (see "How to get the list of all initialized objects and function definitions alive in python?"). Now I want to call it from inside a module. How do I do that?
To state it precisely, suppose that in an IDE I enter
x = globals()
How do I get the same x within a module?
Edit:
When I am doing data analysis, all my large objects are at the global level. I have a function to find the object with the highest memory usage.
def foo():
    objs = globals()  # line (1)
    object_size = get_object_size(objs)
    ...
    return name_of_largest_size_object
It works if foo is defined at the global level. When I try to wrap it in a module, line (1) does not work: it gives only the variables within the module.
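A sketch of two ways this could be handled, not a definitive answer. globals() always returns the namespace of the module where the function is defined, so the function can either accept the namespace as an argument or fall back to the __main__ module, which is where interactive and top-level script variables live. The name largest_object_name is hypothetical, and sys.getsizeof is used here as a stand-in for get_object_size; it measures shallow size only.

```python
import sys


def largest_object_name(namespace=None):
    """Return the name of the object with the largest shallow size in namespace."""
    if namespace is None:
        # Fall back to the interactive / top-level script namespace.
        import __main__
        namespace = vars(__main__)
    sizes = {
        name: sys.getsizeof(obj)
        for name, obj in list(namespace.items())
        if not name.startswith("__")
    }
    return max(sizes, key=sizes.get)


# Explicitly passing a namespace; from an IDE this could be globals().
big = list(range(10000))
small = 1
print(largest_object_name({"big": big, "small": small}))  # big
```

From the interactive session, calling largest_object_name(globals()) passes the caller's namespace explicitly, which avoids depending on __main__ at all.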

Django: How to retrieve a particular item for specific user from a model?
How can I retrieve a specified field for a particular user in Django? I have users A, B, and C, and user A has 10 url records in the database. The problem is how to retrieve the 10 url records of user A. Thank you so much.
class Bookmark(models.Model):
    url = models.CharField(max_length=500)
    shortlisted_by = models.ForeignKey(User, related_name='bookmark')
    shortlisted_date = models.DateTimeField(auto_now_add=True)

Python DataFrame shape difference
I have DataFrames with shape (143,) and (143, 1).
Are they the same? Both are 1-column DataFrames with 143 rows.
Thanks,
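A small sketch that may help compare the two, using made-up data: in pandas a shape of (143,) belongs to a one-dimensional Series, while (143, 1) belongs to a two-dimensional DataFrame that happens to have one column. The column name "value" is an assumption for illustration.

```python
import pandas as pd

frame = pd.DataFrame({"value": range(143)})  # two-dimensional, one column
series = frame["value"]                      # one-dimensional

print(series.shape)  # (143,)
print(frame.shape)   # (143, 1)

# Converting between the two forms.
back_to_frame = series.to_frame()   # Series -> one-column DataFrame
first_column = frame.iloc[:, 0]     # DataFrame -> Series
```

The values are the same in both, but the types differ, so methods and indexing behave differently (for example, frame["value"] works while series["value"] looks up a row label).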

How to split values in a data column and add them to a new column with a condition in pandas
I have a df,
name                      Value
Sri is a cricketer        Sri,is
Ram player                Ram
Ravi is a singer          is
cricket and foot is ball  and,is,foot
and a list,
my_list=["is", "foot"]
I am trying to split df["Value"] by (,) and add each value to a new column if it exists in my_list. My expected output is
name                      Value  my_list
Sri is a cricketer        Sri    is
Ram player                Ram
Ravi is a singer                 is
cricket and foot is ball  and    is,foot
Please help me achieve this. Thanks in advance.
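One way this could be done, as a sketch that assumes every Value cell is a comma-separated string with no surrounding spaces: split each cell once, then rebuild two columns from the pieces, depending on whether each piece appears in my_list.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sri is a cricketer", "Ram player", "Ravi is a singer",
             "cricket and foot is ball"],
    "Value": ["Sri,is", "Ram", "is", "and,is,foot"],
})
my_list = ["is", "foot"]

# Split each cell into a list of tokens.
tokens = df["Value"].str.split(",")

# Tokens found in my_list go to the new column; the rest stay in Value.
df["my_list"] = tokens.apply(lambda parts: ",".join(p for p in parts if p in my_list))
df["Value"] = tokens.apply(lambda parts: ",".join(p for p in parts if p not in my_list))

print(df)
```

Rows with no matching token end up with an empty string in my_list, and rows where every token matches end up with an empty string in Value.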

Losing float precision when converting dataframe with tuples to excel
Here's the dataframe:
   0          1        2
A  (0.2,.1)   (.1,.3)  (.4,.5)
B  (0.2,.14)  (.1,.3)  (.42,.5)
C  (.5,.12)   (.1,.5)  (.74,.5)
When I try to export it to excel using:
df.to_excel('hello.xlsx',float_format='%.2f')
It shows floating-point values like .10000000001. How do I show only two decimals? (The precision seems to be lost when you zip two values that were previously rounded.)
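A possible workaround, sketched on the assumption that float_format only applies to cells that are plain floats and therefore skips tuple cells: format each tuple into a string with two decimals before exporting. The helper name format_pair is made up for this example, and the export call is left commented out because it needs an Excel writer such as openpyxl.

```python
import pandas as pd

df = pd.DataFrame(
    {
        0: [(0.2, 0.10000000001), (0.2, 0.14), (0.5, 0.12)],
        1: [(0.1, 0.3), (0.1, 0.3), (0.1, 0.5)],
        2: [(0.4, 0.5), (0.42, 0.5), (0.74, 0.5)],
    },
    index=["A", "B", "C"],
)


def format_pair(pair, ndigits=2):
    """Render a tuple of floats as text with a fixed number of decimals."""
    return "(" + ", ".join(f"{x:.{ndigits}f}" for x in pair) + ")"


# Apply the formatter to every cell, column by column.
formatted = df.apply(lambda col: col.map(format_pair))
print(formatted)

# formatted.to_excel('hello.xlsx')  # requires an Excel writer to be installed
```

The cells become text in the spreadsheet, which is usually acceptable for display but means they can no longer be used directly in numeric formulas.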

Remove values from all columns and rows of pandas dataframe
I have a pandas dataframe that looks like this
It is a large dataset with 1500 rows and 200 columns.
I was wondering how I can remove the number before each value in every row and column. For example, the values look like this: 1: 0.345 2: 0.467
I want only the values, like this: 0.345 0.467
How can I do that?
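A sketch of one approach, assuming every cell is a string of the form "<number>: <value>" (the original frame is not shown, so the column names and data below are invented): strip the leading number and colon with a regular expression, then convert what remains to float.

```python
import pandas as pd

# Invented sample in the "<number>: <value>" layout described above.
df = pd.DataFrame({
    "f1": ["1: 0.345", "1: 0.120"],
    "f2": ["2: 0.467", "2: 0.900"],
})

# Remove the leading "<digits>:" prefix from every cell, then cast to float.
cleaned = df.apply(
    lambda col: col.str.replace(r"^\s*\d+:\s*", "", regex=True).astype(float)
)
print(cleaned)
```

Because the same function is applied to each column, this scales to the 200 columns without listing them by name.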

Python: How do you make a variable which can be used as an index call to slice another variable?
I am coming from a MATLAB background and moving over to Python. I am trying to set up a variable holding a vector of indices that can then be used to slice another array.
In MATLAB I would do this:
A = [2,3,4,5,6; 9,4,3,2,1; 5,4,3,2,5]; %some arbitrary matrix
begin = 2; %the first index I want to pull
end = 4; %the last index I want to pull
idx = 2:4; %the vector of indices I want
A(:,idx) %results in me pulling out the 2nd, 3rd and 4th column of A
Now in Python, what is the equivalent?
import numpy as np
A = np.array([[2,3,4,5,6],[9,4,3,2,1],[5,4,3,2,5]]) #some arbitrary matrix
begin = 1 #first index
end = 3 #last index
idx = ??? #This is the part I don't know! <<<
A[:,idx] #I want the same result as the Matlab example above
Obviously for this trivial example I could just write idx = [1,2,3], but in real life I have a much more complicated scenario where I cannot write out the indices manually. I have tried using the range and np.arange functions, but they give the error that the object is not callable. When I look at some MATLAB-to-NumPy conversions such as here, they suggest that the idx = 2:4 command in MATLAB is equivalent to idx = range(1,3) in Python, but this is apparently not quite true? Any help is appreciated.
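A short sketch of two options that should give the MATLAB result. Python ranges exclude the stop value, so MATLAB's 2:4 (columns 2, 3, 4, one-based) corresponds to indices 1, 2, 3, which is np.arange(1, 4) or slice(1, 4), not range(1, 3).

```python
import numpy as np

A = np.array([[2, 3, 4, 5, 6], [9, 4, 3, 2, 1], [5, 4, 3, 2, 5]])
begin = 1  # first index (zero-based)
end = 3    # last index to include

# Option 1: an explicit index vector; the stop value is exclusive, hence end + 1.
idx = np.arange(begin, end + 1)
print(A[:, idx])
# [[3 4 5]
#  [4 3 2]
#  [4 3 2]]

# Option 2: a reusable slice object, equivalent to writing A[:, begin:end + 1].
sl = slice(begin, end + 1)
print(A[:, sl])
```

Indexing with the array returns a copy, while indexing with the slice returns a view of A, which matters if the result is modified afterwards.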

Populate a sparse matrix by a matrix of indices
I have a symmetric sparse matrix, initialized to zero:
library(Matrix)
set.seed(1)
mat <- Matrix(0,5,5)
Then I have a matrix (idx.mat) which specifies, for each row in mat, three column indices which should be filled with values given by another matrix (val.mat):
idx.mat <- do.call(rbind,lapply(1:5,function(i) sample(1:5,3,replace=F)))
val.mat <- matrix(runif(15,1,10),5,3)
So I'm wondering if there's a faster way to populate mat according to idx.mat and val.mat than:
mat <- do.call(rbind,lapply(1:nrow(mat),function(i) {
  mat[i,idx.mat[i,]] <- val.mat[i,]
  return(mat[i,])
}))

Python - List becoming empty when it's not supposed to
I am trying to code a program for playing the card game War; however, I get this error when running the program:
File "____________", line 65, in playWar
    userCard = userCurrent.pop()
IndexError: pop from empty list
I'm using the pop function to compare the two cards of each player and determine which is greater, but I'm not sure why the userCurrent list would be empty, because the while loop shouldn't repeat if the userCurrent or computerCurrent lists are empty. Any help appreciated.
from math import *
import random

DECK_SIZE = 52
HALF_DECK = 26

def main():
    userGamesWon = 0
    compGamesWon = 0
    playGame = input(("Welcome to the game of war! Type X to begin a game. "))
    while (playGame == "X" or playGame == "x"):
        deck = []
        for i in range(int(DECK_SIZE/13)):
            deck.extend([2,3,4,5,6,7,8,9,10,11,12,13,14])
        shuffledDeck = shuffle(deck)
        userDeck, compDeck = split(shuffledDeck)
        game = playWar(userDeck,compDeck)
        if (game == True):
            userGamesWon = userGamesWon + 1
        else:
            compGamesWon = compGamesWon + 1
        playGame = input(("Type X to begin another game, or press any other key to stop playing. "))
    print("Thanks for playing!")
    print("You won: ", userGamesWon, "time(s). Computer won: ", compGamesWon, "time(s)." )

def playWar(userDeck,compDeck):
    gameWon = 0
    userCurrent = userDeck
    print("Your hand: ", userCurrent)
    userWinnings = []
    compCurrent = compDeck
    print("Computer hand: ", compCurrent)
    compWinnings = []
    while (gameWon == 0):
        #Need to figure out how to tell if game is over
        while (userCurrent != [] or compCurrent != []):
            compCard = compCurrent.pop()
            userCard = userCurrent.pop()
            if (userCard > compCard):
                userWinnings = userWinnings + [userCard + compCard]
            elif (compCard > userCard):
                compWinnings = compWinnings + [userCard + compCard]
            else:
                userCurrent = userCurrent
        userCurrent = userCurrent + userWinnings
        userCurrent = shuffle(userCurrent)
        userWinnings = []
        compCurrent = compCurrent + compWinnings
        compCurrent = shuffle(compCurrent)
        compWinnings = []
        if (userCurrent == [] or compCurrent == []):
            gameWon = 1
    if (len(userCurrent) == DECK_SIZE):
        print("YOU WIN.\nGood Game!")
        return True
    else:
        print("COMPUTER WINS.\nGood Game!")
        return False

def shuffle(deck):
    #write a randomizing program that takes a list as a parameter
    shuffledDeck = []
    for i in deck:
        rand = random.randint(0, len(deck)-1)
        shuffledDeck.insert(rand,i)
    return shuffledDeck

def split(shuffledDeck):
    return shuffledDeck[0:HALF_DECK], shuffledDeck[HALF_DECK:]

main()
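A likely cause, offered as a sketch rather than a full fix: the inner loop condition uses or, so it keeps running while either hand still has cards, and pop is then called on the hand that is already empty. Using and stops the loop as soon as one hand runs out. The function play_round below is a hypothetical, simplified version of the inner loop; it also stores both cards in the winnings list instead of their sum, and it simply drops tied cards.

```python
def play_round(user_cards, comp_cards):
    """Compare cards until either hand is empty; return both winnings piles."""
    user_winnings, comp_winnings = [], []
    # "and" means: continue only while BOTH hands still have a card to pop.
    while user_cards and comp_cards:
        user_card = user_cards.pop()
        comp_card = comp_cards.pop()
        if user_card > comp_card:
            user_winnings += [user_card, comp_card]
        elif comp_card > user_card:
            comp_winnings += [user_card, comp_card]
        # On a tie both cards are dropped in this sketch.
    return user_winnings, comp_winnings


user_hand = [2, 14]
comp_hand = [3, 5, 9]
print(play_round(user_hand, comp_hand))  # ([14, 9], [2, 5])
```

With unequal hands, as in the call above, the loop ends without an IndexError and the leftover card stays in the longer hand.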

How can I load a 4-level MultiIndex row hierarchy from Excel into a pandas DataFrame?
I have spent hours trying to find a solution, but nothing so far. I have the following table (see screenshot). Can anybody please shed some light and explain how to load it into a DataFrame? Many thanks, Andrea

Multiindex and timestamps not recognised in csv export (pandas 0.13)
I have written some data manipulation scripts that read in simulation results and export csv files with a multi-index and time series.
These scripts run nicely on my PC (pandas 0.21); however, when I run them on a simulation server where pandas 0.13 is installed, the exported index looks like a concatenated string. For instance, I am expecting results such as:
first index  second index       column1
level1       20170501 00:00:00  x
but I am getting instead:
index                                                 column1
('level1', Timestamp('20170501 00:00:00', tz=None))  x
Any idea why this happens and whether it can be fixed without upgrading pandas?

How to build a MultiIndex Pandas DataFrame from a nested dictionary with lists
I have the following dictionary.
d= {'key1': {'subkey1': ['a','b','c','d','e']}, 'key2': {'subkey2': ['1','2','3','5','8','9','10']}}
With the help of this post, I managed to successfully convert this dictionary to a DataFrame.
df = pd.DataFrame.from_dict({(i,j): d[i][j] for i in d.keys() for j in d[i].keys()}, orient='index')
However, my DataFrame takes the following form:
                 0  1  2  3  4  5     6
(key1, subkey1)  a  b  c  d  e  None  None
(key2, subkey2)  1  2  3  5  8  9     10
I can work with tuples as index values; however, I think it's better to work with a multilevel DataFrame. Posts such as this one have helped me create it in two steps, but I am struggling to do it in one step (i.e. at the initial creation), as the lists within the dictionary, as well as the tuples afterwards, add a level of complication.
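One way to get the MultiIndex at creation time, as a sketch: flatten the nested dictionary into (key, subkey) pairs, build the index from those pairs with pd.MultiIndex.from_tuples, and pass it straight to the DataFrame constructor. The level names "key" and "subkey" are assumptions chosen for illustration.

```python
import pandas as pd

d = {'key1': {'subkey1': ['a', 'b', 'c', 'd', 'e']},
     'key2': {'subkey2': ['1', '2', '3', '5', '8', '9', '10']}}

# Flatten to {(outer key, inner key): list of values}.
flat = {(i, j): values for i, inner in d.items() for j, values in inner.items()}

# Build the hierarchical index from the tuple keys and use it at construction.
index = pd.MultiIndex.from_tuples(list(flat.keys()), names=["key", "subkey"])
df = pd.DataFrame(list(flat.values()), index=index)

print(df)
print(df.loc[("key1", "subkey1"), 0])  # a
```

Shorter lists are padded with missing values on the right, as in the tuple-indexed version, so the two rows can keep different lengths.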