Getting Information about Optimal Solution from Multidimensional Knapsack Algorithm
I am building a multidimensional knapsack algorithm to optimize fantasy NASCAR lineups. I have the code thanks to another author and am now trying to piece back together the drivers the optimal solution consists of. I have written code to do this in the standard case, but am struggling to figure it out with the added dimension. Here's my code:
#open csv file
df = pd.read_csv('roster_kentucky_july18.csv')
print(df.head())

def knapsack2(n, weight, count, values, weights):
    dp = [[[0] * (weight + 1) for _ in range(n + 1)] for _ in range(count + 1)]
    for z in range(1, count + 1):
        for y in range(1, n + 1):
            for x in range(weight + 1):
                if weights[y - 1] <= x:
                    dp[z][y][x] = max(dp[z][y - 1][x],
                                      dp[z - 1][y - 1][x - weights[y - 1]] + values[y - 1])
                else:
                    dp[z][y][x] = dp[z][y - 1][x]
    return dp[-1][-1][-1]
w = 50000
k = 6
values = df['total_pts']
weights = df['cost']
n = len(values)
limit_fmt = 'Max value for weight limit {}, item limit {}: {}'
print(limit_fmt.format(w, k, knapsack2(n, w, k, values, weights)))
And my output:
Driver total_pts cost
0 A.J. Allmendinger 29.030000 6400
1 Alex Bowman 39.189159 7600
2 Aric Almirola 53.746988 8800
3 Austin Dillon 32.476250 7000
4 B.J. McLeod 14.000000 4700
Max value for weight limit 50000, item limit 6: 325.00072048
I'm looking to at least get the "cost" associated with each "total_pts" in the optimal solution, though it would be nice if I could have it draw out the "Driver" column of the dataframe instead (which I guess could be accessed by indices). Thanks.
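One way to recover the chosen drivers is to backtrack through the filled 3D table. This is a sketch, not the question's exact code (it assumes `values`/`weights` are plain Python lists, e.g. via `.tolist()`): whenever `dp[z][y][x]` differs from `dp[z][y-1][x]`, item `y-1` must have been taken, so we step into the subproblem that produced it.

```python
def knapsack2_items(n, weight, count, values, weights):
    """Fill the 3D DP table, then backtrack to recover the chosen item indices."""
    dp = [[[0] * (weight + 1) for _ in range(n + 1)] for _ in range(count + 1)]
    for z in range(1, count + 1):
        for y in range(1, n + 1):
            for x in range(weight + 1):
                dp[z][y][x] = dp[z][y - 1][x]          # skip item y-1
                if weights[y - 1] <= x:
                    cand = dp[z - 1][y - 1][x - weights[y - 1]] + values[y - 1]
                    if cand > dp[z][y][x]:             # take item y-1
                        dp[z][y][x] = cand

    # walk back from the optimum: if the value differs from the row above,
    # item y-1 was taken, so move to the subproblem it came from
    chosen, z, y, x = [], count, n, weight
    while z > 0 and y > 0:
        if dp[z][y][x] != dp[z][y - 1][x]:
            chosen.append(y - 1)                       # index into the dataframe
            x -= weights[y - 1]
            z -= 1
        y -= 1
    return dp[count][n][weight], chosen
```

The returned indices can then be mapped back to the roster, e.g. `df.iloc[chosen]` to pull the Driver and cost columns for the optimal lineup.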
See also questions close to this topic

Selective Escaping in Python?
In print() I use "" to write my print statement:

print("Huzaifa is\"Good"")

After I use the escape, the rest of the print() statement also gets treated as part of the string. How do I end the escape without having to use single quotes ''?
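For reference, the usual fix is to escape each inner double quote individually; the string's closing quote must itself stay unescaped:

```python
# each inner double quote is escaped with a backslash;
# the final, unescaped quote closes the string
s = "Huzaifa is \"Good\""
print(s)  # Huzaifa is "Good"
```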
Error as I try to shift an image left by 10 pixels: (-215:Assertion failed) ifunc != 0 in function 'remap'
I am trying to shift an image left by 10 pixels using the following:
import cv2
import numpy as np  # needed for np.float32

image = cv2.imread("brazil.png")
transformed = cv2.warpAffine(image,
                             np.float32([[1, 0, -10], [0, 1, 0]]),
                             (image.shape[1], image.shape[0]),
                             borderMode=cv2.BORDER_REPLICATE)
cv2.imwrite("shifted.png", transformed)
But I get an error saying:
Traceback (most recent call last):
  File "/home/snippets/ad/CVResult.py", line 29, in func_wrapper
    ans = func(self)
  File "run.py", line 230, in test_shiftimageleft
    student = ps1.shift_image_left(np.copy(test_image), shift)
  File "/home/snippets/ad/workspace/ps1.py", line 196, in shift_image_left
    transformed = cv2.warpAffine(temp_image, np.float32([[1, 0, -1 * shift], [0, 1, 0]]), (temp_image.shape[1], temp_image.shape[0]), borderMode=cv2.BORDER_REPLICATE)
error: OpenCV(4.0.0) /io/opencv/modules/imgproc/src/imgwarp.cpp:1805: error: (-215:Assertion failed) ifunc != 0 in function 'remap'
What could be the reason for this?
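Whatever the cause of the assertion itself (it is often reported when the image array has a dtype that remap does not support), the intended result of the transform can be sketched without OpenCV: a left shift by `s` takes each output pixel from `s` columns to the right, replicating the border at the edge.

```python
def shift_row_left(row, s):
    """Shift a single row of pixels left by s, replicating the right border."""
    return row[s:] + [row[-1]] * min(s, len(row))

shift_row_left([1, 2, 3, 4, 5], 2)  # -> [3, 4, 5, 5, 5]
```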

Python: How to merge two columns according to a common parameter (Pandas)?
I have two different .csv files as follows:
movies.csv
movieId  movie       rating
1        StarWarsiv
2        Avengers
3        About time
4        MIfallout
5        It
ratings.csv
userId  movieId  rating
1       1        4
2       1        4.5
3       1        3.5
4       1        3
5       1        5
1       2        3.4
3       2        4.5
Now I want to calculate the average rating of each movie across the different users and add it to the rating column of movies.csv.
For example, the average rating of Star Wars is (4+4.5+3.5+3+5)/5 = 4, and its movieId is 1, so it should be matched on movieId and added to rating.
expected output:
movieId  movie       rating
1        StarWarsiv  4
2        Avengers    some number
3        About time  some number
4        MIfallout   some number
5        It          some number
Is there any easy way to do this using Pandas or Numpy? How to match rows according to the id?
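A sketch of one pandas approach (column names assumed from the samples above): compute the per-movie mean from the ratings table with groupby, then left-merge it onto the movies table on movieId.

```python
import pandas as pd

# hypothetical stand-ins for movies.csv / ratings.csv
movies = pd.DataFrame({"movieId": [1, 2], "movie": ["StarWarsiv", "Avengers"]})
ratings = pd.DataFrame({"movieId": [1, 1, 1, 1, 1, 2, 2],
                        "rating": [4, 4.5, 3.5, 3, 5, 3.4, 4.5]})

# average rating per movieId, then merge into the movies table
avg = ratings.groupby("movieId")["rating"].mean().reset_index()
movies = movies.merge(avg, on="movieId", how="left")
print(movies)
```

The `how="left"` keeps movies that have no ratings yet (their rating becomes NaN).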

Eclipse C/C++ - a problem caused the program to stop working correctly
I need to use DeltaMOCK. To build this source code I am using Eclipse C/C++ with gcc installed on my system (Windows 8, 64-bit). I cloned the source code into Eclipse, but when I run the project I get the error below. Note: I just cloned the project into Eclipse and then ran it.
Here is my Console output. When I use the debugger it says:

Can't find a source file at /home/gfortan/gcchome/workshop/gcc/objdir64/x86_64-w64-mingw32/libstdc++-v3/include/bits/basic_string.h
Multiple errors reported.
1) Failed to execute MI command: -var-create - * &((()._M_local_buf))
   Error message from debugger back end: -var-create: unable to create variable object
2) Failed to execute MI command: -data-evaluate-expression (()._M_local_buf)
   Error message from debugger back end: A syntax error in expression, near `)._M_local_buf)'.
3) Failed to execute MI command: -var-create - * &((()._M_local_buf))
   Error message from debugger back end: -var-create: unable to create variable object
4) Unable to create variable object

Confused with this HashMap algorithm interview question
I'm studying interview questions and came across this question that's really confusing me. I know how to do the basic O(n^2) solution, but the O(n) hash-table version is not making any sense.
static void printpairs(int arr[], int sum)
{
    HashSet<Integer> s = new HashSet<Integer>();
    for (int i = 0; i < arr.length; ++i)
    {
        int temp = sum - arr[i];

        // checking for condition
        if (temp >= 0 && s.contains(temp))
        {
            System.out.println("Pair with given sum " + sum +
                               " is (" + arr[i] + ", " + temp + ")");
        }
        s.add(arr[i]);
    }
}
The part that is confusing me is the condition check: it calls s.contains(temp) before anything has been put into the hash set. So how can it contain sum - arr[i]?
https://www.geeksforgeeks.org/given-an-array-a-and-a-number-x-check-for-pair-in-a-with-sum-as-x/
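The trick is that a pair is detected when its second element is visited: at that point the first element is already in the set. On the very first iteration the set really is empty, so nothing matches, and that is fine. The same technique in Python (a translation of the Java snippet, returning pairs instead of printing):

```python
def print_pairs(arr, total):
    """Find pairs summing to `total`: when visiting x, check whether its
    complement (total - x) was seen on an earlier iteration."""
    seen, pairs = set(), []
    for x in arr:
        if total - x in seen:
            pairs.append((x, total - x))
        seen.add(x)
    return pairs

print_pairs([1, 4, 45, 6, 10, 8], 16)
```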

Smallest to largest Time complexity w.r.t Order of Growth
My question is: which option lists the functions from smallest to largest time complexity in terms of order of growth?
1) n^2, n*log(n), 2^n
2) 2^n, n^2, n*log(n)
3) n*log(n), n^2, 2^n
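For sufficiently large n, n*log(n) grows slower than n^2, which grows slower than 2^n, so option 3 is the smallest-to-largest ordering. A quick numeric check at n = 20:

```python
import math

n = 20
nlogn = n * math.log(n)   # about 59.9
nsq = n ** 2              # 400
exp = 2 ** n              # 1048576
print(nlogn < nsq < exp)  # True
```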

Count unique values in a rolling window of 30 days - pandas
I have a dataframe like this (with > 2M lines):
Id  date        local
A   2018-01-01  web
B   2018-01-01  web
B   2018-01-01  web
C   2018-01-01  web
D   2018-01-01  web
E   2018-01-02  web
What is the best way to count unique Ids in a rolling window of 30 days? For example, at 2018-05-31 I want to know how many unique users I had in the last 30 days (2018-05-31 included).
My desired output would be like this:
UniqueIds  date
27890      2018-01-31
27690      2018-02-01
26990      2018-02-02
I tried pandas.DataFrame.rolling and groupby without success.
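One reason rolling/groupby alone falls short is that rolling() does not directly support a distinct count over an id column. A pure-Python sketch of the required logic (hypothetical sample rows mirroring the table above): collect (Id, date) pairs and count distinct ids per 30-day window.

```python
from datetime import date, timedelta

# hypothetical (Id, date) rows
rows = [("A", date(2018, 1, 1)), ("B", date(2018, 1, 1)),
        ("B", date(2018, 1, 1)), ("C", date(2018, 1, 1)),
        ("D", date(2018, 1, 1)), ("E", date(2018, 1, 2))]

def unique_ids_in_window(rows, end, days=30):
    """Count distinct ids whose date falls in the `days`-day window ending
    at `end`, with `end` itself included."""
    start = end - timedelta(days=days - 1)
    return len({i for i, d in rows if start <= d <= end})

print(unique_ids_in_window(rows, date(2018, 1, 2)))  # A..E -> 5
```

With millions of rows this per-date set comprehension is slow; the same windowing idea can be vectorised, but the counting logic stays the same.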

Cumulative grouping in pandas
I have a dataframe like this:
df =
   name  amount  date
0  A     10      1
1  B     15      1
2  A     5       2
3  C     7       3
4  A     8       4
5  B     10      4
6  C     11      4
and I would like to do a cumulative sum along names and dates. My desired result with this example would be:
df_result =
   name  amount  date
0  A     10      1
1  B     15      1
2  A     15      2
3  B     15      2
4  A     15      3
5  B     15      3
6  C     7       3
7  A     23      4
8  B     25      4
9  C     18      4
I want to show the accumulated value over the time periods represented by the date column. For example, for A, its value in period 1 is 10, in period 2 it is 5, in period 3 it is 0 (because it does not appear), and in period 4 it is 8, so df_result shows that accumulation. C does not appear until period 3 because it has no value until that period.
I've tried different combinations of groupby, cumsum, even stack, but I can't achieve anything close to that.
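One pandas sketch that gets close to the desired output: pivot to a date-by-name grid (missing name/date combinations become 0), take the cumulative sum down the dates, then stack back to long form. Note this keeps zero rows for names before their first appearance (C at dates 1 and 2), which would still need to be masked out.

```python
import pandas as pd

df = pd.DataFrame({"name": ["A", "B", "A", "C", "A", "B", "C"],
                   "amount": [10, 15, 5, 7, 8, 10, 11],
                   "date": [1, 1, 2, 3, 4, 4, 4]})

# date x name grid, absent combinations filled with 0, then cumulative sum
wide = (df.pivot_table(index="date", columns="name", values="amount",
                       aggfunc="sum", fill_value=0)
          .cumsum())

# back to long form: one (date, name, amount) row per cell
long = wide.stack().reset_index(name="amount")
print(wide)
```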

How to get best from first generation in genetic algorithm?
I created this function that runs the genetic algorithm in order to solve the backpack problem. At the last generation I can easily get the best chromosome from the output x, but how can I get the best chromosome from the first generation? Looking here I noticed that there is a vector Best that should keep the best element from each generation. How can I access that vector?

function [x,fval,exitflag,output,population,score] = evolutionaryGaSolution(ds, Wmax)
    nvars = 32;
    PopulationSize_Data = 50;
    MaxGenerations_Data = 80;
    FitnessLimit_Data = 207;
    f = @(x)fitnessBackpack(x, ds, Wmax);
    % Load default settings
    options = optimoptions('ga');
    % Modify options setting
    options = optimoptions(options,'PopulationType', 'bitstring');
    options = optimoptions(options,'PopulationSize', PopulationSize_Data);
    options = optimoptions(options,'MaxGenerations', MaxGenerations_Data);
    options = optimoptions(options,'FitnessLimit', FitnessLimit_Data);
    options = optimoptions(options,'MutationFcn', { @mutationuniform 0.1 });
    options = optimoptions(options,'Display', 'off');
    options = optimoptions(options,'PlotFcn', { @gaplotlogbestf });
    [x,fval,exitflag,output,population,score] = ...
        ga(f,nvars,[],[],[],[],[],[],[],[],options);
    fprintf("Starting point: %s\n", ????); % What should I write here?
    fprintf("Final point: %s\n", num2str(x));
    fprintf("Generations: %d\n", output.generations);
end

Optimize slow query with join and subquery
I have a slow query that I need to optimize but can't figure out how or what to do. Can someone please advise?
I'm building a web app that lets the user answer questions (QuestionGroup). An answer to a question has one or more AnswerTextMarkers, each of which is connected to a piece of text in a Chapter. A Chapter is contained in a ChapterGroup. When the user starts answering questions he starts a QuestionSession, and each answer is saved to AnswerQuestionSession.
The slow query is used to find the QuestionGroup.IDs for a new QuestionSession, with a list of ChapterGroup.IDs as input. The query takes 23 sec to run, which is too long considering the relatively small amount of data:
select count(*) from AnswerTextMarker;       -- 9 125
select count(*) from AnswerQuestionSession;  -- 3 488 725
select count(*) from QuestionSession;        -- 313 425
select count(*) from ChapterChapterGroup;    -- 31 673
I have indexes on all columns used in the query.
The execution plan:
'1','PRIMARY','<derived2>',NULL,'ALL',NULL,NULL,NULL,NULL,'832','100.00','Using temporary; Using filesort'
'2','DERIVED','CCG2',NULL,'range','uq_Chapter_ChapterGroup,fk_ChapterGroup_Chapter_ID,ix_chapterChapterGroup_chapterID','fk_ChapterGroup_Chapter_ID','8',NULL,'145','100.00','Using where; Using index; Using temporary'
'2','DERIVED','ATM3',NULL,'ref','fk_Chapter_AnswerTextMarker_ID,fk_QuestionGroup_AnswerTextMarker_ID','fk_Chapter_AnswerTextMarker_ID','8','Hypo.CCG2.ChapterID','5','100.00',NULL
'3','DEPENDENT SUBQUERY','QS',NULL,'ref','PRIMARY,fk_QSession_User_ID','fk_QSession_User_ID','8','const','291','100.00','Using index; Using temporary; Using filesort'
'3','DEPENDENT SUBQUERY','AQS',NULL,'ref','PRIMARY,fk_AQSession_AnswerTextMarker_ID,ix_answerQuestionSession_questionSessionID','fk_AQSession_AnswerTextMarker_ID','16','Hypo.ATM3.ID,Hypo.QS.ID','1','100.00',NULL
The query:
(SELECT DISTINCT ATM3.QuestionGroupID,
        IFNULL((SELECT AQS.Correct
                FROM AnswerQuestionSession AQS
                JOIN QuestionSession QS ON QS.ID = AQS.QuestionSessionID AND QS.UserID = 3
                WHERE AQS.AnswerTextMarkerID = ATM3.ID
                  AND QS.ChapterGroupID = CCG2.ChapterGroupID
                ORDER BY AQS.QuestionSessionID DESC
                LIMIT 1), -1) AS Correct
 FROM AnswerTextMarker ATM3
 JOIN ChapterChapterGroup CCG2 ON CCG2.ChapterID = ATM3.ChapterID
 WHERE CCG2.ChapterGroupID IN (94,255,288,332,356,358,376,381,394,397,5146,118,148,246,338,372,378,388,407,414,1020,90,256,300,339,377,390,391,411,412,1021,9,333,343,369,373,392,403,2537,6669,52,251,334,340,342,345,550,2165,2508,37,46,104,135,253,385,389,2163,84,249,344,348,382,399,410,2164,147,252,324,330,349,354,2166,115,149,271,346,350,363,3342,254,337,351,352,365,374,386,117,150,250,353,364,395,2509,331,355,360,370,404,408,341,359,368,379,398,413,329,362,366,367,380,315,396,405,400,1957)
 HAVING Correct = 0 OR Correct = -1 OR Correct = 1
) AS QGS ORDER BY RAND();

Optimize javascript number function
My goal is to round a number down with JavaScript, for example:

// 23   -> 20
// 234  -> 230
// 2345 -> 2300
// ...
I am using the following function:

var usersCount = 23;
var rounded;
if (usersCount < 100) {
    rounded = Math.floor(usersCount / 10) * 10;
} else if (usersCount < 1000) {
    rounded = Math.floor(usersCount / 100) * 100;
} else if (usersCount < 10000) {
    rounded = Math.floor(usersCount / 1000) * 1000;
} else if (usersCount < 100000) {
    rounded = Math.floor(usersCount / 10000) * 10000;
} else if (usersCount < 1000000) {
    rounded = Math.floor(usersCount / 100000) * 100000;
} else {
    rounded = usersCount;
}
I need to make this function dynamic in order to avoid piling up else-ifs or switches. How can I do that? Is there any better approach to achieve the wanted result?
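A dynamic version only needs the number of digits to pick the divisor. A sketch in Python (the document's language; the same idea ports directly to JavaScript with Math.floor and String(n).length). Note the question's examples keep two leading digits for larger numbers while the if-chain keeps only one; a `digits` parameter covers both readings:

```python
def floor_keep(n, digits=1):
    """Round a positive integer down, keeping `digits` leading digits
    and zeroing out the rest."""
    p = 10 ** max(len(str(n)) - digits, 0)
    return (n // p) * p

print(floor_keep(23))       # 20
print(floor_keep(234, 2))   # 230
print(floor_keep(2345, 2))  # 2300
```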

I am trying to determine the correct Knapsack algorithm
Problem summary: we have 3 container sizes: small containers carry 3 items, medium containers carry 5 items, and large containers carry 9 items. Given a required number of items N, find the combination of containers that yields exactly N items using the minimum number of containers.
So in equation terms: 3x + 5y + 9z = N, where x + y + z must be as small as possible; solve for x, y, z.
My question: I'm wondering if this can be solved using a knapsack algorithm? From the look of it, it looks like an unbounded knapsack problem?
Correct me if I am wrong, but if I were to convert it to a knapsack problem, it would be equivalent to:
W = N
weights = [3, 5, 9]
values = [1, 1, 1]
and we want to minimize the total value, subject to the total weight being exactly equal to W.
Is this correct?
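That formulation is essentially an unbounded knapsack (coin-change style) DP, with two twists versus the textbook version: the objective is minimized rather than maximized, and the capacity must be met exactly rather than at most. A sketch:

```python
import math

def min_containers(n, sizes=(3, 5, 9)):
    """Fewest containers whose capacities sum to exactly n
    (returns math.inf if n is unreachable)."""
    dp = [0] + [math.inf] * n          # dp[t] = min containers summing to t
    for total in range(1, n + 1):
        for s in sizes:
            if s <= total and dp[total - s] + 1 < dp[total]:
                dp[total] = dp[total - s] + 1
    return dp[n]

print(min_containers(14))  # 9 + 5 -> 2 containers
```

Because capacities are unlimited in supply, each dp cell may reuse any size any number of times, which is exactly the "unbounded" variant.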

How to find best combination of items in python that match expected result?
I have a list of floats
b = [1.2, 3.1, 4.5, 0.3 , 6.2, 1.1, 3.1, 0.4, 9.1]
. How to find best combination of b items forvalue= 8.6
? I am looking for exact match[3.1, 6.2, 1.1, 0.4]
or the closest match (e.g. ifvalue = 8.4
closed match should be[0.3, 1.1, 9.1]
.EDIT: I found solution for print items in knapsack:
# Python3 code for Dynamic Programming
# based solution for 0-1 Knapsack problem
# Prints the items which are put in a
# knapsack of capacity W
def printknapSack(W, wt, val, n):
    K = [[0 for w in range(W + 1)] for i in range(n + 1)]
    items = []

    # Build table K[][] in bottom-up manner
    for i in range(n + 1):
        for w in range(W + 1):
            if i == 0 or w == 0:
                K[i][w] = 0
            elif wt[i - 1] <= w:
                # prev = val[i - 1]
                # idx = w - wt[i - 1]
                # print(i, w, prev, idx)
                # K[i][w] = max(prev + K[i - 1][idx], K[i - 1][w])
                K[i][w] = max(val[i - 1] + K[i - 1][w - wt[i - 1]],
                              K[i - 1][w])
            else:
                K[i][w] = K[i - 1][w]

    # stores the result of Knapsack
    res = K[n][W]
    # print(res)
    result = res

    w = W
    for i in range(n, 0, -1):
        if res <= 0:
            break
        # either the result comes from the top (K[i-1][w]) or from
        # (val[i-1] + K[i-1][w-wt[i-1]]) as in the Knapsack table.
        # If it comes from the latter one, it means the item is included.
        if res == K[i - 1][w]:
            continue
        else:
            # This item is included.
            # print(wt[i - 1])
            items.append(wt[i - 1])
            # Since this weight is included, its value is deducted
            res = res - val[i - 1]
            w = w - wt[i - 1]
    return result, items

# Test
val = [12, 31, 45, 3, 62, 11, 31, 4, 91]
wt = val
W = 84
n = len(val)
print(printknapSack(W, wt, val, n))
It produces the output (83, [4, 3, 45, 31]). But there are still open issues:
- It does not work with floats (this is doable, because multiplying by 100 will fix it; I'm not sure how it will affect performance).
- It does not work with negative values: this is a big problem, because it means this code cannot be used e.g. for matching payments to open invoices (there could be credit notes that have to be taken into account).
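For small lists, both open issues disappear if subsets are enumerated directly instead of using the DP table, since floats and negative values need no scaling or offsetting there. A brute-force sketch (exponential in the list length, so only for modest n):

```python
from itertools import combinations

def closest_subset(items, target):
    """Return the subset of items whose sum is closest to target (brute force)."""
    best = min((c for r in range(len(items) + 1)
                  for c in combinations(items, r)),
               key=lambda c: abs(sum(c) - target))
    return list(best)

print(closest_subset([1.0, 2.0, 4.0], 5.1))  # [1.0, 4.0]
```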

Binpacking / Knapsack Optimisation problem design
I have a scenario I need some help formulating, so that I can properly implement an optimisation method. I hope someone can guide me a little; it seems so simple on the surface, but I am having trouble figuring out how to encode the variables, constraints, etc. properly.
The scenario is this:
- Multiple items need to be placed into bins / knapsacks
- Each item has two factors that must be taken into account when packing
- I have several types of bin / knapsack that can be used for packing
- The supply of bins / knapsacks is infinite
- Each bin / knapsack has a capacity for each of the two item factors; item values add up cumulatively and cannot exceed either capacity of the bin / knapsack
- Each bin / knapsack has a different cost (price) to use it
- There is an upper limit to the number of items that can fit into a bin / knapsack regardless of which items are in it
Example:
A vector of items with two values each:
Items = [[7,6],[14,2],[27,23],[5,15]]
A vector of bins / knapsacks: the first value is the upper limit on the sum of the items' first values; the second value is the same but applies to the items' second values; the third value is the maximum number of items the bin / knapsack can hold; the last value is the price / cost of the bin / knapsack.
BinOptions = [[64000,1450,350,22000],[8000,450,64,8000]]
The goal is to pack all the items in the most efficient manner so as to provide the least cost (using the price of the bins / knapsacks).
I was looking at two ways the problem might be solved:
- OR-Tools with the MILP approach
- OR-Tools with the Knapsack solver
I am not necessarily stuck on OR-Tools; it is just what I have been playing with, and it seems to work nicely across different languages from the reports I have seen. It would be nice to be able to model this and then choose a language later.
The one thing that is probably not apparent is that the number of available bin varieties changes. Sometimes I will have two or three to choose from, other times many more, possibly up to a hundred. The number of incoming items to pack also changes depending on the day.
If anybody can provide some guidance on solving this I would be most appreciative.
Cheers
The Frog
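For what it's worth, with only a handful of items the intended model can be sanity-checked by brute force before committing to a MILP encoding: enumerate item-to-group assignments and price each group with the cheapest feasible bin type. A pure-Python sketch using the example data above (not an OR-Tools model; it ignores which bin type holds which group beyond its cost):

```python
from itertools import product

# example data from the question: each item has two values,
# each bin type is [cap1, cap2, max_items, cost]
Items = [[7, 6], [14, 2], [27, 23], [5, 15]]
BinOptions = [[64000, 1450, 350, 22000], [8000, 450, 64, 8000]]

def cheapest_bin(group):
    """Cost of the cheapest bin type that can hold `group`, or None."""
    s1 = sum(i[0] for i in group)
    s2 = sum(i[1] for i in group)
    fits = [cost for cap1, cap2, max_items, cost in BinOptions
            if s1 <= cap1 and s2 <= cap2 and len(group) <= max_items]
    return min(fits) if fits else None

def best_cost(items):
    """Brute force: try every assignment of items to up to len(items) groups."""
    best = None
    for labels in product(range(len(items)), repeat=len(items)):
        groups = {}
        for item, lab in zip(items, labels):
            groups.setdefault(lab, []).append(item)
        costs = [cheapest_bin(g) for g in groups.values()]
        if None not in costs:
            total = sum(costs)
            if best is None or total < best:
                best = total
    return best

print(best_cost(Items))  # one large type-2 bin holds everything -> 8000
```

This blows up combinatorially, but it gives a ground truth to validate a MILP formulation against on small instances.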