Find all points below a line on a map
In order to draw a path between two points on a map with many points (almost two thousand), I use the following function:
import numpy as np
import matplotlib.pyplot as plt

def path_between_cities(self, cities_with_coordinates, from_to):
    from matplotlib.lines import Line2D
    # coordinates of the chosen path
    x = [int(from_to[0][2]), int(from_to[1][2])]
    y = [int(from_to[0][1]), int(from_to[1][1])]
    # create the line
    line = Line2D(x, y, linestyle='-', color='k')
    # create the axis
    x_ = np.array((0, 2000))
    y_ = np.array((0, 6000))
    plt.plot(x_, y_, 'o')
    # plot every city as a small blue dot
    for item in cities_with_coordinates:
        name = item[0]
        y_coord = int(item[1])
        x_coord = int(item[2])
        plt.plot([x_coord], [y_coord], marker='o', markersize=1, color='blue')
    plt.gca().add_line(line)
    plt.axis('scaled')
    plt.show()
My goal is to extract all points (coordinates) that lie below the drawn line.
I know that this can be done using the vector cross product.
Given a large number of points, what would be the most efficient way of achieving this in the context above?
1 answer

Each cross product operation is still O(1). You can run the function below for all the points and see which of them are below the line, bringing the whole check to linear time.

def ccw(a, b, c):
    """Returns a positive value if c is above the directed line ab,
    and a negative value if it is below."""
    # a and b are the line's endpoints; c is the test point.
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y)
Unless you have some other information about the points, you will have to check each point to see whether it is below the line.
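For the scale in the question (a couple of thousand points), the check also vectorises nicely in NumPy: compute all the cross products in one shot and keep the points with a negative sign. A sketch — the function name and the (x, y) point layout are my own choices, not from the question:

```python
import numpy as np

def points_below_line(points, a, b):
    """Return the rows of `points` lying strictly below the line a->b.

    `points` is an (N, 2) array of (x, y) pairs; a and b are the line's
    endpoints. Assumes a.x < b.x, so a negative cross product means 'below'.
    """
    points = np.asarray(points, dtype=float)
    ax, ay = float(a[0]), float(a[1])
    bx, by = float(b[0]), float(b[1])
    # cross = (b - a) x (p - a), computed for every point at once
    cross = ((bx - ax) * (points[:, 1] - ay)
             - (points[:, 0] - ax) * (by - ay))
    return points[cross < 0]
```

This is the same O(n) check as above, just without the Python-level loop, so it is effectively instant for ~2000 points.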
See also questions close to this topic

Please help me with this matplotlib pyplot axis
Here is my code:
x = np.arange(0, 10000)
y = 10000.2*x
profit = .0008*x**3 - 2.2*x**2 + 1400*x
plt.plot(x, profit)
plt.grid(b=None, which='both', axis='both')
plt.axis(xmin=0, xmax=10000, option='tight')
I want the x-axis to go from 0 to 10000, but I want to see the tick values at 200, 400, 600, etc. How can I do this? Also, is it correct to use np.arange?
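One common way to get a tick every 200 units is matplotlib.ticker.MultipleLocator, and np.arange is fine for building x (note that it excludes the stop value). A sketch; the profit formula is incidental here:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

x = np.arange(0, 10001)  # arange excludes the stop value, hence 10001
profit = .0008 * x**3 - 2.2 * x**2 + 1400 * x

fig, ax = plt.subplots()
ax.plot(x, profit)
ax.set_xlim(0, 10000)
ax.xaxis.set_major_locator(MultipleLocator(200))  # a tick every 200 units
# plt.show() in an interactive session
```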

How to get the ping of different websites and store it in file for later use?
I am writing a script to find the fastest Google website near my location. The idea is to store my IP together with the corresponding Google site for that location, so that if I visit the location again the script doesn't need to search for the fastest link: it should look through the file and, if an entry exists for my IP, connect to that link automatically instead of looking for it again.
I have already done this in a shell script, but I want to try it in Python, since shell scripts only run on Linux systems. However, I cannot figure out how to read the variables back from that file. Here is my shell script:
source .nearest.sh
ip=`hostname --all-ip-addresses | cut -d' ' -f 1 | cut -d'.' -f '1 2 3' | tr \. _`
art_var="a${ip}"
if [ -z ${!art_var} ]; then
    echo "Looking for the nearest google..."
    array=(google.com google.ca google.au google.us)
    fastest_response=2147483647  # largest possible integer
    for site in ${array[*]}
    do
        this_response=`ping -c 4 "$site" | awk 'END { split($4,rt,"/") ; print rt[1] }'`
        if (( $(bc -l <<< "$this_response < $fastest_response") )) ; then
            fastest_response=$this_response
            fastest_site="$site"
        fi
    done
    echo ${art_var}="${fastest_site}" >> .nearest.sh
    connect=${fastest_site}
else
    connect=${!art_var}
fi
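A possible Python translation of the caching idea is to store the {ip: fastest_site} pairs in a JSON file instead of sourcing shell variables. Everything here (the file name, the site list, the Linux-style ping output parsing) is an assumption, not from the original script:

```python
import json
import os
import subprocess

CACHE_FILE = '.nearest.json'  # hypothetical replacement for .nearest.sh

def load_cache(path=CACHE_FILE):
    """Return the saved {ip: fastest_site} mapping, or {} if none yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}

def save_cache(cache, path=CACHE_FILE):
    with open(path, 'w') as f:
        json.dump(cache, f)

def ping_ms(site):
    """Minimum round-trip time in ms, or None if the host is unreachable.
    Parses the Linux/macOS `ping -c 4` summary line."""
    try:
        out = subprocess.check_output(['ping', '-c', '4', site], text=True)
    except (subprocess.CalledProcessError, OSError):
        return None
    # last line looks like: rtt min/avg/max/mdev = 9.1/9.8/10.2/0.4 ms
    stats = out.strip().splitlines()[-1].split('=')[-1].split('/')
    return float(stats[0])

def nearest_google(my_ip, sites=('google.com', 'google.ca', 'google.us')):
    cache = load_cache()
    if my_ip in cache:          # seen this location before: reuse the answer
        return cache[my_ip]
    timed = [(ping_ms(s), s) for s in sites]
    fastest = min(t for t in timed if t[0] is not None)[1]
    cache[my_ip] = fastest
    save_cache(cache)
    return fastest
```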

Unable to add column and index and vaccum analyze postgresql database table from python
I have a Python script that connects to a PostgreSQL database. The goal of the code is to create a new table by running an SQL command which adds columns from one table to the new table based on a spatial operation (spatial identity). The Python code runs the SQL a couple of times, and each time adds new columns to the new table from other tables. Up to this point it works fine: it creates the new table and adds the columns from the other tables. In order to speed up the process, before the next round of running the SQL on the newly created table, I try to add a primary key and index by running:
ALTER TABLE new_table ADD COLUMN id SERIAL PRIMARY KEY
and
CREATE INDEX new_table_index ON new_table USING GIST (geom)
and then, using the subprocess module,
["psql", "-h", "srv01", "-d", "mdb", "-c", "VACUUM ANALYZE", new_table]
I try to clean it up and update the stats to make it ready for the next repetition.
However, Python is not able to add the column or the index, and when running the VACUUM ANALYZE command a prompt appears:
Password for user new_table
I haven't been able to get Python to show the error message for the id column addition and the index creation. When I run these commands directly on the command line using psql, everything works fine, but I haven't had much luck doing the same thing from Python. As the process is an iterative one, I need to run the indexing and vacuuming from Python. What can I do to overcome this problem? The driver I use to connect to Postgres is pgdb.
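A likely explanation for the password prompt: psql's synopsis is psql [option...] [dbname [username]], and since -d mdb already supplies the database, the trailing new_table argument is parsed as a user name — hence "Password for user new_table". The table name needs to go inside the -c string instead. A sketch (host and database names copied from the question, the rest is illustrative):

```python
import subprocess

new_table = 'new_table'  # placeholder for the real table name

# The table must be part of the -c string; a trailing positional
# argument is read by psql as a user name (the password prompt above).
cmd = ['psql', '-h', 'srv01', '-d', 'mdb',
       '-c', 'VACUUM ANALYZE {};'.format(new_table)]
# subprocess.check_call(cmd)  # uncomment to actually run it
```

For the ALTER TABLE and CREATE INDEX statements, inspecting the driver's exceptions and making sure the transaction is committed usually reveals why they appear to do nothing.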

How to plot a specific range of values from a text file in matplotlib
I have a text file containing values such as:
0.00 10.742
1.00 17.75391
4.00 19.62891
20.00 20.7641
23.00 34.2300
50.00 50.000
65.88 22.5000
78.00 30.000
86.00 37.7900
90.00 45.00000
I wish to plot only a range of values from the text file; after a time interval, another set of values should be plotted. For example, first I wish to plot only [0 to 50]:
0.00 10.742
1.00 17.75391
4.00 19.62891
20.00 20.7641
23.00 34.2300
50.00 50.000
After a time interval (say 10 s) I wish to plot the next set of values, i.e.:
65.88 22.5000
78.00 30.000
86.00 37.7900
90.00 45.00000
I am looking to show this as a slideshow.
What I have tried is:
import matplotlib.pyplot as plt
import sys
import numpy as np
from matplotlib import style

fileName = input("Enter Input File Name: ")
f1 = open(fileName, 'r')
style.use('ggplot')
x1, y1 = np.loadtxt(fileName, unpack=True, usecols=(0, 1))
plt.plot(x1, y1, 'r')
plt.title('example1')
plt.xlabel('Time')
plt.ylabel('Values')
plt.grid(True, color='k')
plt.show()
I wish to show this as a slideshow. I will be thankful if someone helps me out there.
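One way to get the slideshow effect is to split the loaded array into the desired ranges and redraw with plt.pause between them. A sketch with the question's data inlined (the Agg backend line is only so this runs headless; drop it and raise the pause to 10 s for a real slideshow):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # headless here; remove for an interactive window
import matplotlib.pyplot as plt

data = np.array([
    [0.00, 10.742], [1.00, 17.75391], [4.00, 19.62891],
    [20.00, 20.7641], [23.00, 34.2300], [50.00, 50.000],
    [65.88, 22.5000], [78.00, 30.000], [86.00, 37.7900],
    [90.00, 45.00000],
])

# split the array into the ranges to be shown one after the other
chunks = [data[data[:, 0] <= 50], data[data[:, 0] > 50]]

for chunk in chunks:
    plt.cla()  # clear the previous "slide"
    plt.plot(chunk[:, 0], chunk[:, 1], 'r')
    plt.xlabel('Time')
    plt.ylabel('Values')
    plt.pause(0.1)  # use plt.pause(10) for a 10 s interval interactively
```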

Matplotlib, how can I darken the colour of hatch lines and the edge?
My code is
import matplotlib
matplotlib.rcParams['text.usetex'] = True
matplotlib.rcParams['text.latex.unicode'] = True
matplotlib.rcParams['axes.linewidth'] = 1.6
matplotlib.rcParams['figure.autolayout'] = True

central = pd.DataFrame({'MAACeGCN': [787, 785, 783, 784, 785],
                        'DPGbetaeGCN': [762, 751, 756, 756, 751],
                        'PGbeta': [726, 721, 719, 721, 722]})
cen_mean = np.array(central.mean().tolist())
cen_std = np.array(central.std().tolist())
N = len(cen_mean)
ind = np.arange(N)  # the x locations for the groups
width = 0.35        # the width of the bars

whole = pd.DataFrame({'A': [335, 336, 337, 337, 336],
                      'B': [335, 335, 337, 336, 337],
                      'C': [313, 313, 314, 313, 314]})
wh_mean = np.array(whole.mean().tolist())
wh_std = np.array(whole.std().tolist())

fig = plt.figure()
ax = fig.add_subplot(111)
rects1 = ax.bar(ind, cen_mean, width, color='white', hatch='/\\',
                yerr=2*cen_std, alpha=0.4, edgecolor='black')
rects2 = ax.bar(ind+width, wh_mean, width, color='white', hatch='+',
                yerr=2*wh_std, alpha=0.4, edgecolor='black')

# add some
ax.set_ylabel('performance', fontsize=18)
# ax.set_title('Scores by group and gender')
ax.set_xticks(ind + width / 2)
ax.set_xticklabels(('A', 'B', 'C'))
ax.legend((rects1[0], rects2[0]), ('W', 'C'))

fontsize = 16
_ax = plt.gca()
for tick in _ax.xaxis.get_major_ticks():
    tick.label1.set_fontsize(fontsize)
    tick.label1.set_fontweight('bold')
for tick in _ax.yaxis.get_major_ticks():
    tick.label1.set_fontsize(fontsize)
    tick.label1.set_fontweight('bold')
fig.savefig('temp.pdf', format='pdf', dpi=1000)
But the output is
How can I make the hatch lines black, given that they come out grey?
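For what it's worth, matplotlib draws hatching with the patch's edgecolor, and alpha=0.4 fades both the hatch and the edge to grey; dropping the alpha keeps them solid black, and the hatch.linewidth rcParam can thicken the lines. A minimal sketch with made-up bar heights:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # headless backend for the sketch
import matplotlib.pyplot as plt

plt.rcParams['hatch.linewidth'] = 1.5  # optional: thicker hatch lines

means = np.array([785, 755, 722])      # made-up bar heights
ind = np.arange(len(means))

fig, ax = plt.subplots()
# Hatching is drawn with the edgecolor; with no alpha both stay solid black.
bars = ax.bar(ind, means, 0.35, color='white',
              edgecolor='black', hatch='/\\')
```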

Calculating object labelling consensus area
Scenario: four users are annotating images, each with one of four labels. These are stored in a fairly complex format, either as polygons or as centre-radius circles. I'm interested in quantifying, for each class, the area of agreement between individual raters. In other words, I'm looking to get an m x n matrix where M_i,j is some metric, such as the IoU (intersection over union), between i's and j's ratings (with a 1 diagonal, obviously). There are two problems I'm facing.
One, I don't know what works best in Python for this. Shapely doesn't implement circles too well, for instance.
Two, is there a more efficient way to do this than comparing annotator-by-annotator?
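For the circle-circle pairs, one way around Shapely's polygonal buffer approximation is the closed-form lens area of two overlapping circles; polygon pairs can still go through Shapely. A sketch using only the standard library (the (centre, radius) calling convention is my assumption):

```python
import math

def circle_intersection_area(c1, r1, c2, r2):
    """Exact overlap area of two circles given as (centre, radius) --
    useful since Shapely only approximates circles with buffered polygons."""
    d = math.dist(c1, c2)
    if d >= r1 + r2:
        return 0.0                            # disjoint circles
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2     # one inside the other
    a1 = r1**2 * math.acos((d**2 + r1**2 - r2**2) / (2 * d * r1))
    a2 = r2**2 * math.acos((d**2 + r2**2 - r1**2) / (2 * d * r2))
    tri = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                          * (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - tri

def circle_iou(c1, r1, c2, r2):
    inter = circle_intersection_area(c1, r1, c2, r2)
    union = math.pi * (r1**2 + r2**2) - inter
    return inter / union
```

Mixed circle-polygon pairs would still need the buffer approximation, but the all-circle pairs get an exact answer this way.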

How to calculate whether a point lies inside a triangular prism
I'm trying to figure out whether or not the method I am currently using is correct. I am trying to determine whether a point lies inside a triangular prism like the following (image: "Geometry set up").
None of the edges of this shape are necessarily parallel. I am currently using the points to create vectors p1, p2 and p3, and then using vector cross products to calculate the surface normal of each rectangular face. Then I calculate a vector from s to the midpoint of each of the upper triangles, and take the dot product of this vector with the surface normal for each surface. One of these dot products looks like the following, in case that was confusing (image: "Vector Geometry").
If all three dot products are positive, or all three are negative, then the point lies within the prism (I do not necessarily know whether the surface normals point inwards or outwards, due to the way these objects are being tracked). I would like to know if this is correct, or if there is a better way of calculating it. Thanks!
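The "all positive or all negative" idea can be written as a generic convex-solid test: orient each face normal using the centroid of all vertices, then require the point to be on the inner side of every face. A sketch (it assumes each face's first three vertices are not collinear):

```python
import numpy as np

def inside_convex_solid(point, faces):
    """Test a point against a convex solid given as a list of faces,
    each face a sequence of >= 3 vertices. The normals' orientation
    does not need to be known in advance: each one is flipped to point
    outward using the centroid, which is the same trick as accepting
    'all positive or all negative' dot products.
    """
    verts = np.vstack([np.asarray(f, float) for f in faces])
    centroid = verts.mean(axis=0)          # guaranteed interior point
    for face in faces:
        a, b, c = (np.asarray(v, float) for v in face[:3])
        n = np.cross(b - a, c - a)         # face normal, unknown sign
        if np.dot(centroid - a, n) > 0:    # make n point outward
            n = -n
        if np.dot(point - a, n) > 1e-9:    # point is outside this face
            return False
    return True
```

For a triangular prism this means checking five faces (two triangles, three quadrilaterals) rather than only the three rectangular sides, which also catches points above or below the prism.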

How can I approximate the bow length (arc length) with this code?
Each line goes from maxX to minY and vice versa; together they trace part of a hyperbola, which can be approximated by the individual vectors marked by each intersection. How can these vectors be summed to give the bow length in general?
The code gives a wrong result. Why is this relation of x and y wrong for the curvature?
#include <stdio.h>
#include <stdlib.h>

double sqrt1(double a);
double powerOfTen(int num);
double func_relDREF();
double _sin(double x);
double _cos(double x);
double _tan(double x);
double func_pow2(double v);
double func_cot(double rad);
int _fact(int a);

int main()
{
    double a = 1.0;
    double b = 1.0;
    double phi = 1000;
    double DREF = 2*func_cot(phi/2.0);
    double A = (a*b);
    double res = 2.0*(((sqrt1(func_pow2(2.0*A)+sqrt1(func_pow2(A))))*A)/sqrt1(func_pow2(3.0*A)+(sqrt1(func_pow2(A)))));
    double bow = res * DREF;
    fprintf(stdout, "%f :: %f", res, bow);
    return 0;
}

int _fact(int a)
{
    if (a > 1)  // Base case = 1
    {
        return (a * _fact(a-1));  // fact(a-1) is the recursive call
    }
    else  // If the value of a = 1, then simply return the same
        return 1;
}

double func_cot(double rad)
{
    double res = 0.0;
    res = 1/_tan(rad);
    return res;
}

double _sin(double x)
{
    double y = x;
    double s = -1;
    int i = 0;
    for (i = 3; i <= 10; i += 2)
    {
        y += s*(powerOfTen((x,i)/_fact(i)));
        s *= -1;
    }
    return y;
}

double _cos(double x)
{
    double y = 1;
    double s = -1;
    int i = 0;
    for (i = 2; i <= 10; i += 2)
    {
        y += s*(powerOfTen((x,i)/_fact(i)));
        s *= -1;
    }
    return y;
}

double _tan(double x)
{
    return (_sin(x)/_cos(x));
}

double func_relDREF()
{
    return (3.141592653589793 / 2.0) / sqrt1(2.0);
}

double func_pow2(double v)
{
    return v*v;
}

double powerOfTen(int num)
{
    double rst = 1.0;
    int i = 0;
    if (num >= 0) {
        for (i = 0; i < num; i++) {
            rst *= 10.0;
        }
    } else {
        for (i = 0; i < (0 - num); i++) {
            rst *= 0.1;
        }
    }
    return rst;
}

double sqrt1(double a)
{
    /* find more detail of this method on wiki methods_of_computing_square_roots
       *** Babylonian method cannot get exact zero but an approximate value
       of the square root */
    double z = a;
    double rst = 0.0;
    int max = 8;  // to define maximum digit
    int i;
    double j = 1.0;
    for (i = max; i > 0; i--) {  // value must be bigger than 0
        if (z - ((2*rst) + (j*powerOfTen(i)))*(j*powerOfTen(i)) >= 0) {
            while (z - ((2*rst) + (j*powerOfTen(i)))*(j*powerOfTen(i)) >= 0) {
                j++;
                if (j >= 10) break;
            }
            j--;  // correct the extra value by subtracting one from j
            z -= ((2*rst) + (j*powerOfTen(i)))*(j*powerOfTen(i));  // find value of z
            rst += j*powerOfTen(i);  // find sum of a
            j = 1.0;
        }
    }
    for (i = 0; i >= 0 - max; i--) {
        if (z - ((2*rst) + (j*powerOfTen(i)))*(j*powerOfTen(i)) >= 0) {
            while (z - ((2*rst) + (j*powerOfTen(i)))*(j*powerOfTen(i)) >= 0) {
                j++;
            }
            j--;
            z -= ((2*rst) + (j*powerOfTen(i)))*(j*powerOfTen(i));  // find value of z
            rst += j*powerOfTen(i);  // find sum of a
            j = 1.0;
        }
    }
    // find the number on each digit
    return rst;
}
With deltaX 1 and deltaY 3, the result should be more than sqrt(10).
Another attempt is this code, but it also does not correspond to numerical reality. Please help!
#include <stdio.h>
#include <math.h>

#define LIM_N 3
#define LIM_M 3
#define LIM_E 0.0000001

main()
{
    int n = 0;
    int m = 0;
    int d = 0;
    double dim = 0.0;
    double x1 = 0.0;
    double x2 = 0.0;
    double pr = 0.0;
    double res = 0.0;

    if (LIM_M > LIM_N) {
        for (d = 1; pow(10,d) <= LIM_M; d++) { }
        dim = pow(10, d - 1);
        for (n = 2, m = 1; n <= (int) LIM_M + 1; n++, m++) {
            x1 = sqrt((sqrt(2.0) / ((double) LIM_M / (double) LIM_N) * sqrt(2.0) / ((double) LIM_M) / (double) LIM_N) + ((double)LIM_E) / 2.0*((double)LIM_E) / 2.0) + (sqrt(2.0) / (double) 2.0);
            res = dim * x1;
        }
        fprintf(stdout, "sqrt(1. %.64lf + %.64lf) = %.64f\n", x1, x2, res);
    }
    if (LIM_M < LIM_N) {
        for (d = 1; pow(10,d) <= LIM_N; d++) { }
        dim = pow(10, d - 1);
        for (n = 2, m = 1; n <= (int) LIM_N + 1; n++, m++) {
            x1 = sqrt((sqrt(2.0) / ((double) LIM_N / (double) LIM_M) * sqrt(2.0) / ((double) LIM_N) * (double) LIM_M) + ((double)LIM_E) / 2.0*((double)LIM_E) / 2.0) + (sqrt(2.0) / (double) 2.0);
            x2 = sqrt(sqrt(2.0/(double)LIM_N)*sqrt(2.0/(double)LIM_N)) + (LIM_E/(double)LIM_N)*(LIM_E/(double)LIM_N) + (sqrt(2.0)/(double)LIM_M);
            res = dim * x1;
        }
        fprintf(stdout, "sqrt(2. %.64lf + %.64lf) = %.64f\n", x1, x2, res);
    }
    if (LIM_M == LIM_N) {
        for (d = 1; pow(10,d) <= LIM_M; d++) { }
        dim = pow(10, d - 1);
        for (n = 2, m = 1; n <= (int) LIM_M + 1; n++, m++) {
            x1 = 0.5*sqrt((sqrt(2.0) / ((double) 1.0) * sqrt(2.0) / ((double) 1.0)) + ((double)LIM_E) / 2.0*((double)LIM_E) / 2.0) + (sqrt(2.0) / (double) 2.0);
            x2 = sqrt(sqrt(2.0/(double)LIM_N)*sqrt(2.0/(double)LIM_N)) + (LIM_E/(double)LIM_N)*(LIM_E/(double)LIM_N) + (sqrt(2.0)/(double)LIM_M);
            res = dim * x1;
        }
        fprintf(stdout, "sqrt(3. %.64lf + %.64lf) = %.64f\n", x1, x2, res);
    }
}
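For reference, the usual way to approximate an arc ("bow") length numerically is simply to sum the lengths of the chord vectors between successive sample points, which matches the "summing the single vectors" idea in the question. A minimal sketch (in Python, with a made-up sample curve):

```python
import math

def polyline_length(points):
    """Sum of the chord lengths between successive (x, y) samples.
    This always bounds the true arc length from below, and converges
    to it as the sampling gets finer."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# straight segment from (0, 0) to (1, 3): length is exactly sqrt(10)
chord = polyline_length([(0, 0), (1, 3)])

# a curve through the same endpoints is strictly longer than the chord
curve = [(x / 100, 3 * (x / 100) ** 2) for x in range(101)]
```

This illustrates the deltaX 1 / deltaY 3 remark above: any curve between those endpoints must come out longer than sqrt(10).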

The inverse of the cross product of an m x n matrix
I would like to recreate the following calculation with a random matrix:
I started out with the following, which gives a result:
kmin1 <- cbind(1:10, 1:10, 6:15, 1:10, 1:10, 6:15, 1:10, 1:10, 6:15)
C <- cbind(1, kmin1)                # Column of 1s
diag(C) <- 1
Ccrosprod <- crossprod(C)           # C'C
Ctranspose <- t(C)                  # C'
CCtransposeinv <- solve(Ccrosprod)  # (C'C)^-1
W <- Ctranspose %*% CCtransposeinv  # W=(C'C)^-1*C'
My assumption, however, is that C should be able to be an m x n matrix, as there is no good reason to assume that the number of factors equals the number of observations.
EDIT: Based on the comment by Hong Ooi, I changed
kmin1 <- matrix(rexp(200, rate=.1), ncol=20)
into
kmin1 <- matrix(rexp(200, rate=.1), nrow=20)
I checked Wikipedia and learned that an m x n matrix might have a left or a right inverse. To put this into practice I attempted the following:
kmin1 <- matrix(rexp(200, rate=.1), nrow=20)
C <- cbind(1, kmin1)                # Column of 1s
Ccrosprod <- crossprod(C)           # C'C
Ctranspose <- t(C)                  # C'
CCtransposeinv <- solve(Ccrosprod)  # (C'C)^-1
W <- Ctranspose %*% CCtransposeinv  # W=(C'C)^-1*C'
EDIT: Based on the comments below this question, everything works.
I would have posted this on Stack Exchange if I were sure this did not have anything to do with syntax, but as I am not experienced with matrices, I am not sure.
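As a cross-check of the algebra (sketched in NumPy rather than R): when C has more rows than columns and full column rank, W = (C'C)^-1 C' is a left inverse, so W C gives the identity but C W does not:

```python
import numpy as np

rng = np.random.default_rng(0)
# 20 observations, 10 factors, plus a column of 1s -> C is 20 x 11
kmin1 = rng.exponential(scale=10, size=(20, 10))  # like rexp(rate=.1)
C = np.column_stack([np.ones(20), kmin1])

# W = (C'C)^-1 C'  -- solve() avoids forming the explicit inverse
W = np.linalg.solve(C.T @ C, C.T)

left = W @ C  # 11 x 11, equal to the identity (left inverse)
```

With nrow < ncol the roles reverse and only a right inverse can exist, which is why the nrow=20 fix from the comments matters.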

r data.table::dcast cross product fails on large data set
I have a dcast() application whose cross product exceeds .Machine$integer.max. Is there a recommended alternative for dealing with this situation? I could break up w into smaller pieces, but I was hoping for a clean solution. This might be a duplicate of "R error when applying dcast to a large data.table object", but that question also doesn't have an answer.
thanks!
library(data.table)

# three million x one thousand
w <- data.table(x = 1:3000000, y = 1:1000)

z <- data.table::dcast(w, x ~ y, value.var = 'x')
# Error in CJ(1:3000000, 1:1000) :
#   Cross product of elements provided to CJ() would result in 3e+09 rows
#   which exceeds .Machine$integer.max == 2147483647

NumPy is faster than PyTorch for larger cross or outer products
I'm computing huge outer products between vectors of size (50500,) and found out that NumPy is (much?) faster than PyTorch while doing so. Here are the tests:
# NumPy
In [64]: a = np.arange(50500)
In [65]: b = a.copy()
In [67]: %timeit np.outer(a, b)
5.81 s ± 56.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# PyTorch
In [73]: t1 = torch.arange(50500)
In [76]: t2 = t1.clone()
In [79]: %timeit torch.ger(t1, t2)
7.73 s ± 143 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
I'd ideally like to have the computation done in PyTorch. So, how can I speed up the computation of the outer product in PyTorch for such huge vectors?
Note: I tried to move the tensors to the GPU, but I got a MemoryError because it needs around 19 GiB of space. So I eventually have to do it on the CPU.
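One thing worth checking: torch.arange yields int64 tensors by default, and 50500 x 50500 x 8 bytes is almost exactly the 19 GiB mentioned, so casting to float32 (torch.arange(50500, dtype=torch.float32)) halves the memory and typically speeds up torch.ger as well. The arithmetic, sketched in NumPy with a smaller demo size:

```python
import numpy as np

n = 50500
bytes_int64 = n * n * 8  # default dtype of arange: ~19 GiB, the MemoryError
bytes_f32 = n * n * 4    # float32 halves the footprint

# demo at a smaller size; the same idea applies to the torch tensors
a = np.arange(1000, dtype=np.float32)
outer = np.outer(a, a)   # equivalent to a[:, None] * a[None, :]
```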