Getting Information about Optimal Solution from Multidimensional Knapsack Algorithm
I am building a multidimensional knapsack algorithm to optimize fantasy NASCAR lineups. I have the code thanks to another author and am now trying to piece back together the drivers the optimal solution consists of. I have written code to do this in the standard case, but am struggling to figure it out with the added dimension. Here's my code:
import pandas as pd

# open csv file
df = pd.read_csv('roster_kentucky_july18.csv')
print(df.head())

def knapsack2(n, weight, count, values, weights):
    dp = [[[0] * (weight + 1) for _ in range(n + 1)] for _ in range(count + 1)]
    for z in range(1, count + 1):
        for y in range(1, n + 1):
            for x in range(weight + 1):
                if weights[y - 1] <= x:
                    dp[z][y][x] = max(dp[z][y - 1][x],
                                      dp[z - 1][y - 1][x - weights[y - 1]] + values[y - 1])
                else:
                    dp[z][y][x] = dp[z][y - 1][x]
    return dp[-1][-1][-1]
w = 50000
k = 6
values = df['total_pts']
weights = df['cost']
n = len(values)
limit_fmt = 'Max value for weight limit {}, item limit {}: {}'
print(limit_fmt.format(w, k, knapsack2(n, w, k, values, weights)))
And my output:
Driver total_pts cost
0 A.J. Allmendinger 29.030000 6400
1 Alex Bowman 39.189159 7600
2 Aric Almirola 53.746988 8800
3 Austin Dillon 32.476250 7000
4 B.J. McLeod 14.000000 4700
Max value for weight limit 50000, item limit 6: 325.00072048
I'm looking to at least get the "cost" associated with each "total_pts" in the optimal solution, though it would be nice if I could have it draw out the "Driver" column of the dataframe instead (which I guess could be accessed by indices). Thanks.
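One way to get this (a sketch added here, not the original author's code): fill the same 3-D table, then trace back from the final cell. Whenever dp[z][y][x] differs from dp[z][y - 1][x], item y - 1 must have been taken, so record its index, subtract its weight from x, and drop to level z - 1. Plain lists are used below so the example is self-contained; a pandas Series with the default RangeIndex works the same way.

```python
# Sketch: the post's DP plus a traceback that recovers which rows
# (dataframe indices) the optimal solution uses.
def knapsack2_items(n, weight, count, values, weights):
    dp = [[[0] * (weight + 1) for _ in range(n + 1)] for _ in range(count + 1)]
    for z in range(1, count + 1):
        for y in range(1, n + 1):
            for x in range(weight + 1):
                if weights[y - 1] <= x:
                    dp[z][y][x] = max(dp[z][y - 1][x],
                                      dp[z - 1][y - 1][x - weights[y - 1]] + values[y - 1])
                else:
                    dp[z][y][x] = dp[z][y - 1][x]
    chosen = []
    z, x = count, weight
    for y in range(n, 0, -1):                  # walk items from last to first
        if dp[z][y][x] != dp[z][y - 1][x]:     # value changed -> item y-1 was taken
            chosen.append(y - 1)
            x -= weights[y - 1]
            z -= 1
    return dp[count][n][weight], chosen

best, idx = knapsack2_items(3, 50, 2, [60, 100, 120], [10, 20, 30])
print(best, sorted(idx))   # 220 [1, 2]
```

With the post's data, something like df.loc[chosen, ['Driver', 'cost', 'total_pts']] (an assumption based on the column names in the head() output) would then list the drivers in the optimal lineup.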
See also questions close to this topic

Regex lookbehind and lookahead doesn't find any match
I have a lot of data that I need to parse and output in different format. The data looks something like this:
tag="001">utb20181009818< tag="003">CZ PrNK< ...
And now, I want to extract 'utb20181009818' after 'tag="001">' and before the last '<'.
This is my code in python:
regex_pattern = re.compile(r'''(?=(tag="001(.*?)">)).*?(?<=[<])''')
ID = regex_pattern.match(one_line)
print(ID)
My variable one_line already contains the necessary data and I just need to extract the value, but it doesn't seem to match no matter what I do. I looked at it for hours, but doesn't seem to find out what I'm doing wrong.
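A minimal sketch of a simpler approach: a single capture group avoids the lookarounds entirely, and re.search scans anywhere in the string, whereas the original re.match only tries position 0 (one likely reason nothing matched).

```python
import re

# Capture whatever sits between tag="001"> and the next '<'.
one_line = 'tag="001">utb20181009818< tag="003">CZ PrNK<'
m = re.search(r'tag="001">(.*?)<', one_line)
print(m.group(1))  # utb20181009818
```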

Python: show type inheritance
I'm trying to look under the hood in idle to wrap my head around python custom classes and how they are stored in memory. Suppose I have the following code:
class Point:
    pass

x = Point()
print(x)
Given the following output:
<__main__.Point object at 0x000002A3A071DF60>
I know that since my class consists of no code, when I create an object of type Point, an object of type object is implicitly created, from which the Point object x inherits such methods as __str__ etc. However, I can't seem to see the connection, i.e. when I type dir(x), I don't see any attribute that stores a reference to an object of type object. Am I misunderstanding how it works, or is there some attribute that I am unaware of?
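A short sketch of where the link actually lives: it is not an instance attribute, so dir(x) won't show a reference to a separate object instance. The instance points to its class via __class__, and the class records its bases and full method-resolution order.

```python
class Point:
    pass

x = Point()
print(x.__class__)      # the instance's class: <class '__main__.Point'>
print(Point.__bases__)  # the class's base classes: (<class 'object'>,)
print(Point.__mro__)    # lookup order for methods like __str__
```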
TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U78') dtype('<U78') dtype('<U78')
I am reading in images from directories, and as I loop through the file names I get the error mentioned in the title. The variable 'imagePath' is the path to an image on my local machine. When 'np.fromfile(imagePath)' is removed the code runs, and it will even print each image's path, but it blows up when I try to read them in with numpy.
def getTrainingDataFromFile():
    for subdir, dirs, images in os.walk(directory):
        for sub, dirs, images in os.walk(subdir):
            for currentImage in images:
                imagePath = str(os.getcwd() + "/" + sub.replace("./", "") + "/" + currentImage)
                if '.jpg' in imagePath:
                    face = np.fromfile(imagePath)
                    images.append(face)

Determing if a graph is acyclic in python
I am trying to use DFS in python to check if a graph is acyclic, but I am incurring problems on some hidden test cases.
The code should return 1 if there is a cycle and 0 if there is not.
def cycle(adj, i):
    graph = adj
    stack = []
    x = i
    visited = [False for _ in range(len(adj))]
    visited[x] = True
    while len(graph[x]) != 0 or len(stack) != 0:
        if len(graph[x]) != 0:
            stack.append(x)
            x = graph[x].pop(0)
            if visited[x] == True:
                return True
            else:
                visited[x] = True
        else:
            x = stack.pop()
    return False

def acyclic(adj):
    for i in range(len(adj)):
        if cycle(adj, i):
            return 1
    return 0
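For comparison, a standard sketch of directed-cycle detection (not the asker's code) uses three-color DFS: a node is white before it is visited, gray while its subtree is being explored, and black once finished. A cycle exists exactly when DFS meets a gray node again.

```python
# adj[u] is the list of successors of node u.
def has_cycle(adj):
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * len(adj)

    def dfs(u):
        color[u] = GRAY
        for v in adj[u]:
            if color[v] == GRAY:                 # back edge -> cycle
                return True
            if color[v] == WHITE and dfs(v):
                return True
        color[u] = BLACK                         # fully explored, never revisit
        return False

    return any(color[u] == WHITE and dfs(u) for u in range(len(adj)))
```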

What is the runtime of this code in Theta Notation?
I believe that the runtime of func2 is O(n * log(n)).
But some people have told me that it's not.

int func2(int* arr, int n) {
    int i, j;
    for (i = 1; i <= n; i *= 2)
        reverseArray(arr, i);
}

void reverseArray(int* arr, int n) {
    int left, right, temp;
    for (left = 0, right = n - 1; left <= right; left++, right--) {
        temp = arr[left];
        arr[left] = arr[right];
        arr[right] = temp;
    }
}
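For what it's worth, a sketch of the standard analysis: reverseArray(arr, i) does Θ(i) work, and i doubles each iteration, so the total cost is a geometric series dominated by its last term,

```latex
\sum_{k=0}^{\lfloor \log_2 n \rfloor} \Theta(2^k)
  = \Theta\!\left(2^{\lfloor \log_2 n \rfloor + 1}\right)
  = \Theta(n),
```

so the runtime is Θ(n), not Θ(n log n): there are Θ(log n) calls, but their costs shrink geometrically rather than each costing Θ(n).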

Tokens in a bag
We have n tokens. Every token is either red, blue, or green. These n tokens are in a bag
Repeat the following until the bag is empty:
1) If there are more than two tokens in the bag, take two random tokens out of the bag. Otherwise, empty the bag.
2) According to the two tokens we got in step 1), we do the following things:
∗ Case 1: If one of the tokens is red, do nothing.
∗ Case 2: If both tokens are green, we put one green token and 2 blue tokens back into the bag.
∗ Case 3: If we got one blue token, and the other token is not red, then we put 3 red tokens back into the bag.
Assume that we always have enough tokens to put back into the bag, prove via induction that this process always terminates.
So for my base case, I put n = 1 and since we have less than 2 tokens, we just empty the bag and the process terminates.
I don't know where to go from there.
This is what I've written down in my notebook just thinking about the problem:
R = red, B = blue, G = green
If we take out RR, we do nothing and the bag now contains n = n - 2 tokens
If we take out RB, we do nothing and the bag now contains n = n - 2 tokens
If we take out RG, we do nothing and the bag now contains n = n - 2 tokens
If we take out BB, we put 3 red tokens back in and now the bag contains +1 token (since we took out 2 and added 3 back)
If we take out BG, do the same as above
If we take out GG, 1 green and 2 blue goes back in and now the bag contains +1 token
What I think I can see from this is that eventually, the bag will be full or almost full with red tokens since there is only one situation where we put tokens back in that are not red and two situations where we put back 3 red tokens. And whenever we pull out a red token, we do nothing and just shrink the token size in the bag until the bag is empty.
The amount of green tokens will shrink relative to the amount of blue and red tokens. We want to pull a red or blue token, not so much with green.
I'm not sure how to prove this via induction. Any help would be much appreciated
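One standard way to turn the observation above into a proof (a sketch, assuming a well-ordering argument is acceptable for this exercise): track the triple (g, b, r) of green, blue, and red counts, ordered lexicographically with g most significant. The six draws change the triple as follows:

```latex
\begin{aligned}
RR &: (g,\; b,\; r-2) \\
RB &: (g,\; b-1,\; r-1) \\
RG &: (g-1,\; b,\; r-1) \\
BB &: (g,\; b-2,\; r+3) \\
BG &: (g-1,\; b-1,\; r+3) \\
GG &: (g-1,\; b+2,\; r)
\end{aligned}
```

Every draw either decreases g, or keeps g fixed and decreases b, or keeps both fixed and decreases r, so each step strictly decreases (g, b, r) in the lexicographic order on triples of natural numbers, which is a well-order. Hence there is no infinite sequence of draws, and induction on this order (rather than on n, which can grow) gives the formal termination proof.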

How to match csv values to log file?
Full Contents files: trend.log , sha1.csv
Task:
My task is to match the first column of the csv against the log file; if it matches, get the 11th column of the log file and write it into the csv file, else write Undetected. Example:
sha1.csv
SHA1 VSDT
002dfc56cf7770dbc7909f0035fd6d61d50de421 Microsoft RTF 60080
012b00a2eaae744ea2256e4dfb7920b3e44146ed WIN32 EXE 72
trend.log
1539944370 0 1 1 1539915569 1539915570 1539915569 8224 93 695296 002dfc56cf7770dbc7909f0035fd6d61d50de421 Troj.Win32.TRX.XXPE50FFF027 c:\users\administrator\desktop\downloader\download\ TRENDX 172.20.4.179 Administrator 012b00a2eaae744ea2256e4dfb7920b3e44146ed AABBSBKSBIiAAFCABAAAAAAAAAAgAgCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
1539944381 0 1 1 1539915579 1539915582 1539915579 8224 97 4655312 01b4e558c2bb8f99e13f52ad0c1a569a24e8d9b2 Troj.Win32.TRX.XXPE50FFF027 c:\users\administrator\desktop\downloader\download\ TRENDX 172.20.4.179 Administrator ALLOYMANYCUTS ALLOYMANYCUTS 7.5.0.0 01b4e558c2bb8f99e13f52ad0c1a569a24e8d9b2 https://www.FrapsCapture.com;https://www.FrapsCapture.com;1539071336;1570607336 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
The sha1
002dfc56cf7770dbc7909f0035fd6d61d50de421
in the csv matched in the log file, so get the value and write it into the csv's third column, like this:

sha1.csv

SHA1,VSDT,DESC
002dfc56cf7770dbc7909f0035fd6d61d50de421,Microsoft RTF 60080,Troj.Win32.TRX.XXPE50FFF027
012b00a2eaae744ea2256e4dfb7920b3e44146ed,WIN32 EXE 72,Undetected
Problem:
My problem is that I used pandas and selected the 11th column of the log file, so any space or tab is treated as the end of a column, as in this line:

1539948767 0 1 1 0 0 0 0 100 0 e4fc9cb0c9f5d6e943142fa3b3f39cd230796f85.crdownload Rapid Proliferation c:\users\administrator\desktop\downloader\download\ TRENDX 172.20.4.179 Administrator e4fc9cb0c9f5d6e943142fa3b3f39cd230796f85

Rapid Proliferation should be read as one column, but only Rapid gets written to my csv.

Code:
Here is my code
import numpy as np
import pandas as pd
import csv, time
import io, os
import shutil
from config import *

print "Log File Reading"
logtext = "C:\\filenames\\log\\trendx.log"
trim = ["(1).crdownload"]

# Log data into dataframe using genfromtxt
print "Matching SHA1 to Logfiles"
logdata = np.genfromtxt(logtext, invalid_raise=False, dtype=object,
                        comments=None, usecols=np.arange(16), encoding='utf_16le')
logframe = pd.DataFrame(logdata)
#print (logframe.head())

# Dataframe trimmed to use only SHA1, PRG and IP
df2 = (logframe[[10, 11]]).rename(columns={10: 'SHA1', 11: 'DESC'})
#print (df2.head())

# sha1_vsdt data into dataframe using read_csv
df1 = pd.read_csv("sha1_vsdt.csv", delimiter=",", error_bad_lines=False,
                  engine='python', quoting=3)

# Using merge to compare the two CSV
df1.__delitem__('DESC')
df = pd.merge(df1, df2, on='SHA1', how='outer').fillna('Undetected')
df1['DESC'] = df['DESC']
df1.to_csv("sha1_vsdt.csv", index=False)

df = pd.read_csv("sha1_vsdt.csv")
df.loc[df['VSDT'].isin([".crdownload"]), 'DESC'] = "Undetected"
df.to_csv('sha1_vsdt.csv', index=False)

df3 = pd.read_csv("sha1_vsdt.csv")
df3.loc[df3['DESC'].isin(trim), 'DESC'] = "Undetected"
df3.to_csv('sha1_vsdt.csv', index=False)

print "CSV Succesfully Created"
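A minimal sketch of one fix (this assumes the log's fields are tab-separated, which the post does not confirm): passing an explicit delimiter='\t' to genfromtxt means spaces inside a field such as "Rapid Proliferation" no longer split it across columns.

```python
import io
import numpy as np

# Hypothetical two-record, four-field tab-separated sample standing in for
# the real log; the third field contains a space.
sample = ("1539948767\t0\tRapid Proliferation\tc:\\users\\administrator\n"
          "1539944370\t0\tTroj.Win32.TRX.XXPE50FFF027\tc:\\users\\administrator\n")
data = np.genfromtxt(io.StringIO(sample), delimiter='\t', dtype=str)
print(data[0][2])  # Rapid Proliferation
```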

Down sampling in python
I'm trying to downsample my data, which is at minute resolution, and my index is a datetime. But when I call pandas.resample it returns only one column, while my data contains six columns.
import pandas as pd
from matplotlib import pyplot

dataset = pd.read_csv('household_power_consumption.txt', sep=';', header=0,
                      low_memory=False, infer_datetime_format=True,
                      parse_dates={'datetime': [0, 1]}, index_col=['datetime'])
# Date and time has been combined
dataset.head()
dataset = dataset.resample('H', how='mean', label='left')
a = dataset.head()
print(a)
dataset.to_csv('Downsampled_House_data.csv')
dataset.resample returns only one column.
Transpose a pandas dataframe with headers as column and not index
When I transpose a dataframe, the headers are considered as "index" by default. But I want it to be a column and not an index. How do I achieve this ?
import pandas as pd

dict = {'cola': [97, 98, 99], 'colb': [34, 35, 36], 'colc': [24, 25, 26]}
df = pd.DataFrame(dict)
print(df.T)

       0   1   2
cola  97  98  99
colb  34  35  36
colc  24  25  26
Desired Output:
      0   1   2   3
0  cola  97  98  99
1  colb  34  35  36
2  colc  24  25  26
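One way to get that shape is to follow the transpose with reset_index(), which demotes the index (the old headers) to an ordinary column. A sketch using the post's data:

```python
import pandas as pd

df = pd.DataFrame({'cola': [97, 98, 99],
                   'colb': [34, 35, 36],
                   'colc': [24, 25, 26]})
out = df.T.reset_index()   # old column names become a regular 'index' column
print(out)
```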

optimization problem, minimize from a matrix with constraints
I'm trying to find a minimum for a function which takes a 5x4 matrix shown below.
def objective(w):
    ...
    return value

# Random initial input matrix where rows sum up to 1.0
w = array([[0.33333333, 0.        , 0.2       , 0.46666667],
           [0.07142857, 0.42857143, 0.5       , 0.        ],
           [0.        , 0.42857143, 0.        , 0.57142857],
           [0.31034483, 0.27586207, 0.27586207, 0.13793103],
           [0.27272727, 0.22727273, 0.22727273, 0.27272727]])
The only constraint is to keep each row sum up to 1.0.
I've tried:

1) scipy.optimize.fmin(objective, w), which gives me back a converging result, but it is incorrect because I'm not sure how to apply the constraint.

2) scipy.optimize.minimize(objective, w), which isn't changing the initial matrix.

Any suggestions of what I can look at?
Thanks in advance.
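A sketch of how the row-sum constraints can be expressed (the real objective isn't shown in the post, so a placeholder is used here): minimize over the flattened 5x4 matrix with one equality constraint per row, using a method such as SLSQP that accepts constraints; fmin does not.

```python
import numpy as np
from scipy.optimize import minimize

def objective(w_flat):
    w = w_flat.reshape(5, 4)
    return np.sum((w - 0.25) ** 2)   # placeholder objective, not the poster's

w0 = np.full((5, 4), 0.25)

# One equality constraint per row: row sum minus 1 must be zero.
cons = [{'type': 'eq',
         'fun': lambda w_flat, i=i: w_flat.reshape(5, 4)[i].sum() - 1.0}
        for i in range(5)]

res = minimize(objective, w0.ravel(), method='SLSQP',
               bounds=[(0.0, 1.0)] * 20, constraints=cons)
w_opt = res.x.reshape(5, 4)
print(np.allclose(w_opt.sum(axis=1), 1.0))  # True
```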

identify whether an ELF binary is built with optimizations
I know we can use cmake or make to control CFLAGS or CXXFLAGS for building a release version manually. A "release" version to me means at least -O2 or -O3 is given; it doesn't matter whether -g or strip is used, whereas "debug" means -O0. However, sometimes I want to use scripts to decide whether an ELF binary was built with optimizations, so I can decide what to do next. I tried objdump, file and readelf, but found no answers. Are there any alternatives with which I can do this?
how to fit a complex model to complex data
I don't know which fit routine I should use, or which one is best. I have a 2-dimensional complex field (measurement data, about 300x400 pixels), and I'd like to fit a model with 6 degrees of freedom (6 parameters, complex and real). The function generating the complex field exists as Matlab code. I'd like to minimize the (quadratic) difference between the two, or maybe maximize the overlap integral; I'd like to check both. The question is: do I have to use fminsearch or lsqnonlin? Is a local algorithm sufficient, or do I need a global search? There are so many options, I'm a little overwhelmed. Regards, Andre

Solution to how to count the number of loop
I want to know the solution to the following problem: there are 5 paths from A to G inside the loop. If the loop is repeated 18 times, the number of all possible cases is 3814697265625. Explain how to solve this calculation.
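For reference, each of the 18 repetitions of the loop independently chooses one of the 5 paths, so by the multiplication principle the count is 5 to the 18th power, which matches the stated number:

```python
# 18 independent choices among 5 paths -> 5 ** 18 total sequences.
print(5 ** 18)  # 3814697265625
```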

Dividing Weights Equally in Javascript (bin sorting/knapsack sort of problem)
I have 100 bins and a random number of boxes (which each have weight and volume). I need to distribute the boxes as equally as possible into the bins, and no bin can have more than 100 volume. While the number of boxes is random each turn, I know the amount of boxes (boxCount), and the weight and volume of each box in an array.

Because weight needs to be distributed equally, I'm taking the total weight of all boxes and dividing it by 100 to get the average amount that needs to be in each bin. Below I'm simply trying to iterate over each box and assign it to a bin if the current bin is less than the average and the volume is less than 100. But the bin stays at zero; it's not increasing. Is something wrong in the if statement?

Also, any pointers on how this can be optimized so that every bin has the closest possible amount to the average would be much appreciated, as the current method might be pretty far off if the next box has a lot of weight or volume.
let bin = 0;
for (let i = 0; i < boxCount; i++) {
  w = 0;
  vol = 0;
  if (((w + allBoxes[i].weight) <= (average)) && ((vol + allBoxes[i].volume)) < 100) {
    w = w + allBoxes[i].weight;
    vol = vol + allBoxes[i].volume;
    allBoxes[i].bin = bin;
  } else {
    bin = ++bin;
  }
}

Knapsack dynamic in R
I am trying to implement the pseudocode from Wikipedia in order to solve the knapsack problem, but my code fails. My function takes as input a data frame X with 2 columns: the first is the weights and the second the values.
weights value
10      110
20      150
15      180
30      170
18      130

knapsnack_dyn <- function(X, W) {
  w <- c(0, X[,1])
  v <- c(0, X[,2])
  n <- nrow(X)
  m <- matrix(0, nrow = n+1, ncol = W+1)
  keep <- m
  res <- c()
  for (i in 1:n+1) {
    for (j in 0:W+1) {
      if (w[i] > j) {
        m[i,j] <- m[i-1,j]
        keep[i,j] <- 0
      } else {
        m[i,j] <- max(m[i-1, j], m[i-1, j-w[i]] + v[i])
        keep[i,j] <- 1
      }
    }
  }
  K = W+1
  for (i in n+1:1) {
    if (keep[i,K] == 1) {
      res[i] <- i
      K = K-w[i]
    }
  }
  return(c(res, m[n+1,W+1]))
}