SolvePnP C++ function gives different results
I am trying to understand how solvePnP works. I gave it 8 corner points of an object (their 2D-3D correspondences) and the camera intrinsics. I get the result as
```
rvec: 1.59  1.6   0.89
tvec: 18    3000  1400
```
When I reproject the points using the rvec and tvec output by solvePnP, they overlay properly on the input image. But when I increment one of my image points by a single pixel (say it was (400,300) and I change it to (401,300)), my rvec changes sign and my tvec varies drastically. Now it is
```
rvec: 1.6  1.6  0.8
tvec: 9    900  5000
```
Reprojection also fails. I am curious why such a minor change causes this. How can it be fixed?
See also questions close to this topic

Import ImageGrab.grab() directly into OpenCV
My code takes a screenshot of a box, saves it to disk, then loads it back into OpenCV. Is there any way to skip saving to and loading from disk?
```python
image = ImageGrab.grab()
image.save('image.png')
image_cv_gray = cv2.imread('image.png', 0)  # grayscale
image_cv_color = cv2.imread('image.png')    # color
```
My goal is something like this:
```python
image = ImageGrab.grab()
image_cv_gray = cv2.imread(image, 0)  # grayscale
image_cv_color = cv2.imread(image)    # color
```
If there is no way to achieve this with Pillow and OpenCV, can you recommend another module? Thanks!

import cv2 ImportError: DLL load failed: The specified module could not be found
I have spent 2 days trying to solve this problem but I can't; please share any clue you have.
I correctly installed Python 3.6.4 x64 on Windows 10 x64. After that I installed OpenCV using this command:

```shell
pip install opencv-python
```
But when I import cv2, this error appears:

```
>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: DLL load failed: The specified module could not be found.
```
I have tried these solutions (1, 2, 3, 4), but none of them worked!
Thanks for sharing.

How can I reduce flicker & display thicker lines with Canny edge detection on video with OpenCV?
I'm doing Canny edge detection on a video stream from my webcam and I've been trying to find parameters that reduce the flicker. I'm wondering if thicker lines representing the edges might do the trick.
The code I have working at the moment is
```python
# inspired by https://shahsparx.me/edgedetectionopencvpythonvideoimage/
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    # was cv2.IMREAD_GRAYSCALE, which is not a cvtColor conversion code
    gray_vid = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('Original', frame)
    edged_frame = cv2.Canny(frame, 100, 200)
    cv2.imshow('Edges', edged_frame)
    # Quit with 'Esc' key
    k = cv2.waitKey(5) & 0xFF
    if k == 27:
        break
cap.release()
cv2.destroyAllWindows()
```
but it's very flickery, i.e. many of the detected edges appear and disappear, particularly where the edges are faint, like at wrinkles, hair, background elements, etc. Are there OpenCV parameters, either for the video capture or the transformations, that can help?
Best regards,
Colm

Calculating Hausdorff distance for image comparison
I'm trying to implement the Hausdorff distance metric to calculate the similarity between my ground-truth and predicted image segmentations from a neural network, but I am struggling to find an implementation that works for images in Python.
Thanks in advance.

Map information on an HTML with its printscreen image
I wonder how to map all the information from a web page's HTML to a screenshot of it. To be more specific: you are on this Stack Overflow page, you take a screenshot, and you want to know where in the image the header is located, the answers, my question at the top, the Stack Overflow logo at the top left, etc. Thanks a lot!

Why is this Poisson image editing implementation giving wrong results?
I was trying to implement this equation in MATLAB as follows:
Image acquisition:
```matlab
clear
ori = imread('C:\Users\hpw\Desktop\11.png');
im  = imread('C:\Users\hpw\Desktop\11.png');
ori = rgb2gray(ori);
im  = rgb2gray(im);
```
Masking is done as follows:
```matlab
im = imadd(im, 1);
imshow(im)
hFH = imfreehand();
% Create a binary image ("mask") from the ROI object.
binaryImage = hFH.createMask();
xy = hFH.getPosition;
mask = binaryImage;
im3 = im;
im3(mask ~= 0) = mask(mask ~= 0);
```
Masked pixels:
```matlab
u = find(mask);
w = find(~mask);
M = size(mask, 1);
```
Finding matrix A for the equation Ax = b: for each pixel in the masked region the diagonal entry of A is 4, and if a neighbor is also in the masked region, the corresponding A entry is -1.
```matlab
v = ones(size(u));
a = zeros(size(u));
a = a';
for i = 1:size(u)
    aa = zeros(size(u));
    aa(i) = 4;
    z = u(i);
    if (mask(z-1) == 1)
        if (i > 1)
            aa(i-1) = -1;
        end
    end
    if (mask(z+1) == 1)
        if (i+1 < size(u))
            aa(i+1) = -1;
        end
    end
    if (mask(z-M) == 1)
        if (i > M)
            aa(i-M) = -1;
        end
    end
    if (mask(z+M) == 1)
        if (i+M < size(u))
            aa(i+M) = -1;
        end
    end
    aa = aa';
    aa = sparse(aa);
    a = [a; aa];
end
A = a(2:size(a,1), :);
```
The vector b of Ax = b is built as follows:
```matlab
ima = im(:);
b = zeros(size(u));
for i = 1:size(u)
    z = u(i);
    n1 = (mask(z+1)   == 0);
    n2 = (mask(z-1)   == 0);
    n3 = (mask(z+M)   == 0);
    n4 = (mask(z-M)   == 0);
    n5 = (mask(z+M+1) == 0);
    n6 = (mask(z+M-1) == 0);
    n7 = (mask(z-M+1) == 0);
    n8 = (mask(z-M-1) == 0);
    if (n1 == 1)
        b(i) = b(i) + ima(z+1);
    end
    if (n2 == 1)
        b(i) = b(i) + ima(z-1);
    end
    if (n3 == 1)
        b(i) = b(i) + ima(z+M);
    end
    if (n4 == 1)
        b(i) = b(i) + ima(z-1);
    end
    if (n5 == 1)
        b(i) = b(i) + ima(z+1+M);
    end
    if (n6 == 1)
        b(i) = b(i) + ima(z+M-1);
    end
    if (n7 == 1)
        b(i) = b(i) + ima(z-M+1);
    end
    if (n8 == 1)
        b(i) = b(i) + ima(z-M-1);
    end
end
r = double(A) \ double(b);
ori(mask(:)) = double(A) \ double(b);
imshow(ori)
```
I get something like this as output.
See the black region.
It is supposed to be like this:
See the blurred region.
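For reference, the same A x = b structure as the MATLAB loops above can be sketched in Python with a sparse solver. This is the Laplace special case only (a full Poisson editor would also add the divergence of the source image's gradient field to b), and the function name is illustrative:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def fill_region_laplace(img, mask):
    """Fill mask pixels by solving A x = b: 4 on the diagonal, -1 for
    4-neighbours inside the mask, and values of boundary neighbours
    accumulated into b.  The mask must not touch the image border."""
    coords = [tuple(p) for p in np.argwhere(mask)]
    index = {p: k for k, p in enumerate(coords)}
    n = len(coords)
    A = lil_matrix((n, n))
    b = np.zeros(n)
    for (r, c), k in index.items():
        A[k, k] = 4.0
        for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if q in index:
                A[k, index[q]] = -1.0     # in-mask neighbour
            else:
                b[k] += img[q]            # boundary value enters the RHS
    x = spsolve(A.tocsr(), b)
    out = img.astype(float).copy()
    for (r, c), k in index.items():
        out[r, c] = x[k]
    return out
```

A quick sanity check of such a solver: with a constant or linear (harmonic) image, the filled region should reproduce the original values exactly; a black region like the one in the output screenshot typically points at lost signs or wrong neighbour indexing in A or b.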

Convert JPG image into RGB array JS
I have a JPEG image in an HTML5 canvas. Using only JS, is it possible to convert it into an array of RGB values in the following format:
[[31, 67, 245], [255, 23, 132], [171, 72, 62], [3, 225, 86], [39, 112, 122], [24, 239, 52], [R, G, B]...]
I have tried reading through the JPEG spec to see if I could figure out how to decode it, but have gotten no further than producing a list of (seemingly useless) numbers that looks like:
[32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 61440, 17, 8, 1, 257, 193, 17, 0, 2, 17, 1, 3, 17, 1, 61696, 45057, 0, 2, 2, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 5, 3, 6, 1, 2, 7, 8, 1, 1, 0, 3, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 16, 0, 2, 1, 3, 2, 4, 4, 3, 7, 2, 4, 5]
I cannot make out what this represents, even after trying to read everything I can about the inner workings of JPEG. I got it by decoding the JPEG, taking the resulting Base64, and converting it to base 10. I've also tried converting it to binary, base 8, base 16 and base 256, and converting the numbers to RGB (from the format R + G*255 + B*65025), but have not achieved any output that I can understand and use for further decoding into RGB. Therefore, I'm almost certain there's something completely wrong about my approach.
If this is not possible in frontend JS (libraries/frameworks are fine), I'd still like to know if there is some way of doing it without buying fancy software or lots of terminal work. I am making an online image-upscaler neural network and need this for uploading images (the AI requires an RGB array in the above format as input). While a file uploader would be a nice touch, an easy way for an end user to simply copy-paste in the array (for me to eval) is also a possibility. I've looked all over the internet, however, and found no tool to do this (at least without downloading software). Thanks for your help.

AR experience similar to Instagram beauty filter
I'd like to create a beauty AR filter similar to the one in the Instagram app: https://youtu.be/l3jm7Dd4bUo. Please advise which SDKs or sets of algorithms are proven to give this quality of results.
I understand that implementing a similar filter involves two parts: face-landmark tracking and rendering. Rendering is not a question for me.
For landmarks tracking, I've already tried:
- ARKit 2 (tracking is not accurate, as it is based on depth, not the image)
- Dlib (tracking is not stable, and not accurate when the face is moving)
- Visage SDK (tracking is not stable, and not accurate when the face is moving)
Thanks!

Get peaks with the same height
I have the following spectrum, which I obtained from a vertical histogram of an image.
1) I would like to get the peaks with the same height, as shown in the figure with red arrows.
```python
import cv2
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

def verticalProjection(img):
    """Return a list containing the sum of the pixels in each column."""
    (h, w) = img.shape[:2]
    sumCols = []
    for j in range(w):
        col = img[0:h, j:j+1]  # y1:y2, x1:x2
        sumCols.append(np.sum(col))
    return sumCols

def slice_digits(image_name):
    img = cv2.imread(image_name, 0)
    bw = cv2.bitwise_not(img)
    env = verticalProjection(bw)
    env = np.asarray(env, dtype=np.float32)
    # filtered = moving_average(img)
    peaks, _ = signal.find_peaks(env, width=5)
    plt.scatter(peaks, env[peaks], s=50, c='r')
    plt.plot(env)
    plt.show()
```
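One way to read "peaks with the same height" is: keep only the peaks whose heights agree within a tolerance (here, relative to the tallest peak). A sketch on top of the `find_peaks` call above; the tolerance and function name are assumptions to tune:

```python
import numpy as np
from scipy import signal

def equal_height_peaks(env, rel_tol=0.05, width=5):
    """Return indices of detected peaks whose heights are within
    rel_tol of the tallest peak's height."""
    peaks, props = signal.find_peaks(env, width=width)
    if len(peaks) == 0:
        return peaks
    heights = env[peaks]
    top = heights.max()
    return peaks[np.abs(heights - top) <= rel_tol * top]
```

Alternatively, `find_peaks(env, height=(lo, hi))` filters by an absolute height band directly, if the target height is known in advance.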

How to understand Euler angles from cv2.decomposeProjectionMatrix?
I have been searching the internet for hours for documentation on how to interpret the Euler angles returned by cv2.decomposeProjectionMatrix.
My problem seems simple: I have a 2D image of an aircraft and want to derive from it how the aircraft is oriented with respect to the camera. Ultimately, I am looking for the look and depression angles (i.e. azimuth and elevation). I have 3D coordinates corresponding to the 2D features selected in my image, listed below in my code.
```python
# --- Imports ---
import os
import cv2
import numpy as np

# --- Main ---
if __name__ == "__main__":
    # Load image and resize
    THIS_DIR = os.path.abspath(os.path.dirname(__file__))
    im = cv2.imread(os.path.abspath(os.path.join(THIS_DIR, "raptor.jpg")))
    im = cv2.resize(im, (im.shape[1]//2, im.shape[0]//2))
    size = im.shape

    # 2D image points
    image_points = np.array([
        (230, 410),  # nose
        (55, 215),   # right forward wingtip
        (227, 170),  # right aft outboard horizontal
        (257, 71),   # right forward vertical tail
        (532, 96),   # left forward vertical tail
        (605, 210),  # left aft outboard horizontal
        (700, 283)   # left forward wingtip
    ], dtype="double")

    # 3D model points (estimated)
    model_points = np.array([
        (   0.0,  484.1,  18.4),  # nose
        (-758.1,  872.4,  15.9),  # right forward wingtip
        (-470.3, 1409.4,   7.9),  # right aft outboard horizontal
        (-287.3, 1040.2, 323.3),  # right forward vertical tail
        ( 287.3, 1040.2, 323.3),  # left forward vertical tail
        ( 470.3, 1409.4,   7.9),  # left aft outboard horizontal
        ( 758.1,  872.4,  15.9)   # left forward wingtip
    ], dtype="double")

    # Estimated camera internals
    focal_length = size[1]
    center = (size[1]/2, size[0]/2)
    camera_matrix = np.array(
        [[focal_length, 0, center[0]],
         [0, focal_length, center[1]],
         [0, 0, 1]], dtype="double")

    # Lens distortion assumed to be zero
    dist_coeffs = np.zeros((4, 1))

    # Solving for perspective and point
    _, rvec, tvec = cv2.solvePnP(model_points, image_points, camera_matrix,
                                 dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE)

    # Rotation matrix
    rmat = cv2.Rodrigues(rvec)[0]

    # Projection matrix
    pmat = np.hstack((rmat, tvec))
    roll, pitch, yaw = cv2.decomposeProjectionMatrix(pmat)[-1]
    print('Roll: {:.2f}\nPitch: {:.2f}\nYaw: {:.2f}'.format(
        float(roll), float(pitch), float(yaw)))

    # Visualization
    # Point of interest projected aft from the nose
    poi = np.array([(model_points[0][0],
                     model_points[0][1] + 1e6,
                     model_points[0][2])])
    poi_end, jacobian = cv2.projectPoints(poi, rvec, tvec,
                                          camera_matrix, dist_coeffs)
    p1 = (int(image_points[0][0]), int(image_points[0][1]))  # nose
    p2 = (int(poi_end[0][0][0]), int(poi_end[0][0][1]))      # poi

    # Show the 2D features
    for p in image_points:
        cv2.circle(im, (int(p[0]), int(p[1])), 3, (0, 0, 255), 1)

    # Line from nose to projected point
    cv2.line(im, p1, p2, (255, 0, 0), 2)
    cv2.imshow("Output", im)
    cv2.waitKey(0)
```
Below is my output image; as you can see, the point projected aft does not seem to follow the centerline. I'm not entirely certain my code is working as I intended, so please feel free to offer helpful input.
Thanks in advance!!
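When interpreting the returned angles, it helps to pin down the composition convention by hand. The sketch below (plain NumPy, independent of OpenCV) extracts angles assuming the common Z-Y-X composition R = Rz(roll) · Ry(yaw) · Rx(pitch). OpenCV's RQDecomp3x3, which backs decomposeProjectionMatrix, reports degrees, but its axis ordering should be verified against your build, so treat the angle naming here as an assumption:

```python
import numpy as np

def euler_from_rmat(R):
    """Recover (pitch, yaw, roll) in degrees from a rotation matrix,
    assuming R = Rz(roll) @ Ry(yaw) @ Rx(pitch).  Valid away from the
    gimbal-lock case |R[2, 0]| == 1."""
    yaw = np.degrees(np.arcsin(-R[2, 0]))
    pitch = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
    roll = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
    return pitch, yaw, roll
```

Comparing these values against the decomposeProjectionMatrix output for the same rmat tells you immediately which slot is azimuth-like and which is elevation-like for your setup.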

Calculate camera pose from image using SfM
I already have a point cloud obtained by SfM, so I have the camera poses of the images used to construct it, and the 2D-3D correspondences of the key points. Now I have a new image and want to get its camera pose without re-running the SfM reconstruction. How can I do this?
1. Extract feature points of each image, and count the number of matched key points.
2. Take the image with the most matched keypoints.
3. Obtain the 3D positions of the matched keypoints in that image.
4. Solve the PnP problem.
This is what I'm thinking now, but is it an efficient way to implement it?

OpenCV camera calibration and SolvePnP translation results
I am calibrating a sensor using a chessboard. I make around 50 runs, and after calibrating the camera I use solvePnP to teach the coordinate system, since the well-defined chessboard lets me learn the real-world coordinates.
As input for the SolvePnP I use all corner values with their corresponding real world distances.
My biggest issue is that the translation vector I calculate from SolvePnP is a bit strange. My understanding is that the translation is the actual offset between the camera and the coordinate system, which I define as the upper part of the chessboard. But I get seemingly random values: Tz is around 6000 even though the distance between the camera and the chessboard is around 1600 mm, and I am not even passing depth values to solvePnP anyway.
Any input on what could be off?
Code Sample:
```
50 x DrawChessboardCorners
Corner result:
  {X = 1170.45984, Y = 793.002}
  {X = 1127.80371, Y = 792.54425}
3D points:
  {X = 175, Y = 70, Z = 0}
  {X = 140, Y = 70, Z = 0}
```
That is 18 (6x3) results per run, for a total of 50 runs.
Afterwards I calibrate the camera:
```csharp
CalibrateCamera(_objectsList, _pointsList,
                new Size(_sensorWrapper.Width, _sensorWrapper.Height),
                cameraMatrix, distCoeffs, CalibType.Default,
                new MCvTermCriteria(30, 0.1), out _, out _);
```
Afterwards, using the cameraMatrix and distCoeffs, I call SolvePnP with the top-left, top-right, bottom-left and bottom-right corners and their real-world coordinates.
The results I get from the calibration are:
```json
{
  "CameraMatrix": [
    [5969.947, 0.0, 959.687256],
    [0.0, 6809.737, 540.3694],
    [0.0, 0.0, 1.0]
  ],
  "DistortionCoefficientsMatrix": [
    [0.141516522, 285.377747, 0.008248664, 0.0280253552, 1.5376302]
  ],
  "RotationMatrix": [
    [0.9992069, 0.0270648878, 0.0292078461],
    [0.0003847139, 0.726907134, 0.68673563],
    [0.0398178138, 0.6862022, 0.7263202]
  ],
  "TranslationMatrix": [
    [22.5370159],
    [362.535675],
    [5448.58057]
  ],
  "SensorName": "BRIO 4K Stream Edition",
  "SensorIndex": 0,
  "Error": 0.18790249992423425
}
```