Bounding Box Coordinates without Guess & Check

I am trying to take a screenshot that captures only part of my monitor. I know I can use mss, OpenCV, Pillow, or any other screenshot library that supports bounding boxes. However, I want to avoid randomly guessing the coordinates. By that I mean taking a screenshot with some bounding box coordinates set and then checking whether the result is anywhere close to the region I am actually trying to capture.

For example: my trial coordinates would be 10,10,500,500 when in reality the coordinates I need are 15,40,200,300 (these coordinates are made up).

My idea is to have a tool that lets me click and drag a bounding box around the part of the screen that I need and have the program return the result, such as 15,40,200,300. It would also be really helpful if the box could be drawn on the image as it is selected! If there is another way of achieving this goal, I am open to that as well.

Thank you.

1 answer

  • answered 2019-03-14 01:29 nathancy

    The idea is to click and drag a bounding box around a region of interest (ROI) to obtain its coordinates. To do this, we capture mouse events and record the starting and ending coordinates of the ROI. OpenCV supports this through mouse callbacks: whenever a mouse event is triggered on the window, OpenCV passes the details to our extract_coordinates callback function. The callback must accept the following arguments:

    • event: Event that took place (left/right pressed or released mouse click)
    • x: The x-coordinate of event
    • y: The y-coordinate of event
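    • flags: Flags passed by OpenCV describing the state during the event (for example, which buttons or modifier keys are held)
    • parameters: Optional user data passed through from the setMouseCallback call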
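As a rough illustration of that signature, the sketch below is a callback that only logs what it receives. The names `on_mouse` and `log` are made up for this example, and the direct call at the end merely simulates what OpenCV would do. In a real program the function would be registered with `cv2.setMouseCallback('image', on_mouse, log)` once the window exists, and the fifth argument would then be that same `log` object.

```python
def on_mouse(event, x, y, flags, param):
    """Mouse callback taking the five arguments OpenCV passes in."""
    # param is the user data object supplied at registration time (here: a list)
    param.append({"event": event, "x": x, "y": y, "flags": flags})


log = []  # hypothetical user data object

# Simulated invocation: 1 is only a stand-in for an event code such as
# cv2.EVENT_LBUTTONDOWN, so that this sketch runs without opening a window.
on_mouse(1, 15, 40, 0, log)
print(log[-1])
```

Passing state through the last argument is one option; the widget in this answer instead keeps its state on `self` and ignores that argument.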

    Pressing the left button records the starting corner and releasing it records the ending corner; if you drag from top left to bottom right, these are the top left and bottom right coordinates of the ROI. We then draw a bounding box around the ROI and print both corners to the console. A right click resets the image.
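Two practical details follow from this: a drag can run in any direction, and coordinates picked on a resized image are in the resized image's pixel space. The helpers below are a minimal sketch of how the two recorded points might be turned into something a screenshot library accepts. The function names are hypothetical and not part of OpenCV. The dictionary layout follows the `left`/`top`/`width`/`height` region that `mss` accepts in `grab()`, while the plain `(left, top, right, bottom)` tuple matches the `bbox` argument of Pillow's `ImageGrab.grab()`.

```python
def drag_to_bbox(start, end):
    """Return (left, top, right, bottom) for a drag between two (x, y) points."""
    (x1, y1), (x2, y2) = start, end
    # Sorting each axis makes the result independent of the drag direction
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def scale_bbox(bbox, shown_size, original_size):
    """Map a bbox picked on a resized image back to the original image size.

    Both sizes are (width, height) tuples.
    """
    sx = original_size[0] / shown_size[0]
    sy = original_size[1] / shown_size[1]
    left, top, right, bottom = bbox
    return (round(left * sx), round(top * sy), round(right * sx), round(bottom * sy))


def bbox_to_region(bbox):
    """Convert (left, top, right, bottom) to a left/top/width/height dict."""
    left, top, right, bottom = bbox
    return {"left": left, "top": top, "width": right - left, "height": bottom - top}


# Example: a drag from bottom right to top left, picked on a 640x556 preview
# of an image that is assumed (for illustration) to be 1280x1112 originally.
bbox = drag_to_bbox((200, 300), (15, 40))
full = scale_bbox(bbox, (640, 556), (1280, 1112))
region = bbox_to_region(full)
print(bbox, full, region)
```

If the image is shown at its raw size (no resize), the scaling step can be skipped and `bbox` used directly.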

    Bounding Box coordinates

    Extract bounding box coordinates widget:

    import cv2
    class ExtractImageWidget(object):
        def __init__(self):
            self.original_image = cv2.imread('placeholder.PNG')
            # Resize image, remove if you want raw image size
            self.original_image = cv2.resize(self.original_image, (640, 556))
            self.clone = self.original_image.copy()
            # The window must exist before a mouse callback can be attached to it
            cv2.namedWindow('image')
            cv2.setMouseCallback('image', self.extract_coordinates)
            # Bounding box reference points and boolean if we are extracting coordinates
            self.image_coordinates = []
            self.extract = False
        def extract_coordinates(self, event, x, y, flags, parameters):
            # Record starting (x,y) coordinates on left mouse button click
            if event == cv2.EVENT_LBUTTONDOWN:
                self.image_coordinates = [(x,y)]
                self.extract = True
            # Record ending (x,y) coordinates on left mouse button release
            elif event == cv2.EVENT_LBUTTONUP:
                self.image_coordinates.append((x,y))
                self.extract = False
                print('top left: {}, bottom right: {}'.format(self.image_coordinates[0], self.image_coordinates[1]))
                # Draw rectangle around ROI
                cv2.rectangle(self.clone, self.image_coordinates[0], self.image_coordinates[1], (0,255,0), 2)
                cv2.imshow("image", self.clone) 
            # Clear drawing boxes on right mouse button click
            elif event == cv2.EVENT_RBUTTONDOWN:
                self.clone = self.original_image.copy()
        def show_image(self):
            return self.clone
    if __name__ == '__main__':
        extract_image_widget = ExtractImageWidget()
        while True:
            cv2.imshow('image', extract_image_widget.show_image())
            key = cv2.waitKey(1)
            # Close program with keyboard 'q'
            if key == ord('q'):