OpenCV-Python Tutorials
Documentation
Release beta
eastWillow
1 OpenCV-Python Tutorials
1.1 Introduction to OpenCV
1.2 Gui Features in OpenCV
1.3 Core Operations
1.4 Image Processing in OpenCV
1.5 Feature Detection and Description
1.6 Video Analysis
1.7 Camera Calibration and 3D Reconstruction
1.8 Machine Learning
1.9 Computational Photography
1.10 Object Detection
1.11 OpenCV-Python Bindings
CHAPTER 1
OpenCV-Python Tutorials
• Introduction to OpenCV
Learn how to set up OpenCV-Python on your computer.
• Gui Features in OpenCV
Here you will learn how to display and save images and videos, control
mouse events and create trackbars.
• Core Operations
In this section you will learn basic operations on images like pixel editing,
geometric transformations, code optimization, some mathematical tools etc.
• Image Processing in OpenCV
In this section you will learn different image processing functions inside
OpenCV.
• Feature Detection and Description
In this section you will learn about feature detectors and descriptors.
• Video Analysis
In this section you will learn different techniques to work with videos like
object tracking etc.
• Camera Calibration and 3D Reconstruction
In this section we will learn about camera calibration, stereo imaging etc.
• Machine Learning
In this section you will learn about the machine learning algorithms provided by
OpenCV, such as kNN, SVM and K-Means clustering.
• Computational Photography
In this section you will learn different computational photography techniques
like image denoising, inpainting etc.
• Object Detection
In this section you will learn object detection techniques like face detection etc.
• OpenCV-Python Bindings
In this section, we will see how OpenCV-Python bindings are generated.
OpenCV
OpenCV was started at Intel in 1999 by Gary Bradski, and the first release came out in 2000. Vadim Pisarevsky
joined Gary Bradski to manage Intel's Russian software OpenCV team. In 2005, OpenCV was used on Stanley, the
vehicle that won the 2005 DARPA Grand Challenge. Later its active development continued under the support of Willow
Garage, with Gary Bradski and Vadim Pisarevsky leading the project. Today, OpenCV supports a large number of algorithms
related to computer vision and machine learning, and it is expanding day by day.
Currently OpenCV supports a wide variety of programming languages like C++, Python and Java, and is available on
different platforms including Windows, Linux, OS X, Android and iOS. Interfaces based on CUDA and OpenCL
are also under active development for high-speed GPU operations.
OpenCV-Python is the Python API of OpenCV. It combines the best qualities of the OpenCV C++ API and the Python
language.
OpenCV-Python
Python is a general purpose programming language started by Guido van Rossum, which became very popular in a
short time mainly because of its simplicity and code readability. It enables the programmer to express ideas in
fewer lines of code without reducing readability.
Compared to languages like C/C++, Python is slower. But an important feature of Python is that it can
be easily extended with C/C++. This feature helps us write computationally intensive code in C/C++ and create
a Python wrapper for it, so that we can use the wrapper as a Python module. This gives us two advantages: first,
our code is as fast as the original C/C++ code (since it is the actual C++ code working in the background), and second, it
is very easy to code in Python. This is how OpenCV-Python works: it is a Python wrapper around the original C++
implementation.
The support of Numpy makes the task even easier. Numpy is a highly optimized library for numerical operations.
It gives a MATLAB-style syntax. All the OpenCV array structures are converted to-and-from Numpy arrays. So
whatever operations you can do in Numpy, you can combine with OpenCV, which increases the number of weapons in
your arsenal. Besides that, several other libraries which support Numpy, like SciPy and Matplotlib, can be used with it.
So OpenCV-Python is an appropriate tool for fast prototyping of computer vision problems.
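A minimal sketch of that interoperability (the file name is just a placeholder): an image loaded with OpenCV is an ordinary Numpy array, so plain Numpy operations apply to it directly.
import cv2
import numpy as np

img = cv2.imread('example.jpg')          # returns a numpy.ndarray (or None if the path is wrong)
print type(img), img.shape, img.dtype    # e.g. <type 'numpy.ndarray'> (rows, cols, 3) uint8
inverted = 255 - img                     # ordinary Numpy arithmetic on the OpenCV image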
OpenCV-Python Tutorials
OpenCV introduces a new set of tutorials which will guide you through various functions available in OpenCV-Python.
This guide is mainly focused on OpenCV 3.x version (although most of the tutorials will work with OpenCV 2.x
also).
Prior knowledge of Python and Numpy is required before starting, because they won't be covered in this guide.
In particular, a good knowledge of Numpy is a must to write optimized code in OpenCV-Python.
This tutorial was started by Abid Rahman K. as part of the Google Summer of Code 2013 program, under the guidance
of Alexander Mordvintsev.
Since OpenCV is an open source initiative, everyone is welcome to contribute to the library, and the same applies to
this tutorial.
So if you find any mistake in this tutorial (whether a small spelling mistake or a big error in code or concepts),
feel free to correct it.
That is also a good first task for newcomers to open source projects. Just fork OpenCV
on GitHub, make the necessary corrections and send a pull request to OpenCV. The OpenCV developers will check your pull
request, give you important feedback, and once it passes the approval of the reviewer, it will be merged into OpenCV.
Then you become an open source contributor. The same goes for the other tutorials, documentation etc.
As new modules are added to OpenCV-Python, this tutorial will have to be expanded. Those who know a
particular algorithm can write up a tutorial with the basic theory of the algorithm and a code sample showing its basic
usage, and submit it to OpenCV.
Remember, together we can make this project a great success!
Contributors
Additional Resources
Goals
In this tutorial
• We will learn to set up OpenCV-Python on your Windows system.
The steps below were tested on a Windows 7 64-bit machine with Visual Studio 2010 and Visual Studio 2012. The screenshots
show VS2012.
1. The following Python packages are to be downloaded and installed to their default locations.
1.1. Python-2.7.x.
1.2. Numpy.
1.3. Matplotlib (Matplotlib is optional, but recommended since we use it a lot in our tutorials).
2. Install all packages into their default locations. Python will be installed to C:/Python27/.
3. After installation, open Python IDLE. Enter import numpy and make sure Numpy is working fine.
4. Download the latest OpenCV release from the SourceForge site and double-click to extract it.
7. Go to the opencv/build/python/2.7 folder.
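The usual next step is to copy cv2.pyd from that folder into C:/Python27/Lib/site-packages; after that, the check referred to below typically looks like this (run in Python IDLE):
>>> import cv2
>>> print cv2.__version__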
If the results are printed out without any errors, congratulations !!! You have installed OpenCV-Python successfully.
Note: In this case, we are using 32-bit binaries of the Python packages. But if you want to use OpenCV for x64, 64-bit
binaries of the Python packages have to be installed. The problem is that there are no official 64-bit binaries of Numpy, so you
have to build it on your own. For that, you have to use the same compiler that was used to build Python. When you start Python
IDLE, it shows the compiler details. You can get more information here. So your system must have the same Visual
Studio version, and you must build Numpy from source.
Note: Another way to get 64-bit Python packages is to use a ready-made Python distribution from a third party
like Anaconda or Enthought. It will be bigger in size, but will have everything you need, all in a single shell.
You can also download the 32-bit versions.
7.4. It will open a new window to select the compiler. Choose appropriate compiler (here, Visual
Studio 11) and click Finish.
9. Now click on BUILD field to expand it. First few fields configure the build method. See the below image:
10. The remaining fields specify which modules are to be built. Since GPU modules are not yet supported by OpenCV-
Python, you can completely avoid them to save time (but if you work with them, keep them enabled). See the image
below:
11. Now click on ENABLE field to expand it. Make sure ENABLE_SOLUTION_FOLDERS is unchecked (So-
lution folders are not supported by Visual Studio Express edition). See the image below:
12. Also make sure that in the PYTHON field, everything is filled. (Ignore PYTHON_DEBUG_LIBRARY). See
image below:
18. Open Python IDLE and enter import cv2. If no error, it is installed correctly.
Note: We have installed OpenCV without additional support like TBB, Eigen, Qt, documentation etc. It would be difficult to
explain all of that here. A more detailed video will be added soon, or you can just hack around.
Additional Resources
Exercises
1. If you have a Windows machine, compile OpenCV from source. Do all kinds of hacks. If you run into any
problem, visit the OpenCV forum and explain your problem.
Goals
In this tutorial
• We will learn to set up OpenCV-Python on your Fedora system. The steps below were tested for Fedora 18
(64-bit) and Fedora 19 (32-bit).
Introduction
OpenCV-Python can be installed in Fedora in two ways: 1) install from pre-built binaries available in the Fedora reposi-
tories, or 2) compile from source. In this section, we will see both.
Another important thing is the additional libraries required. OpenCV-Python requires only Numpy (in addition to
other dependencies, which we will see later). But in these tutorials, we also use Matplotlib for some easy and nice
plotting purposes (which I feel is much better than OpenCV's own plotting). Matplotlib is optional, but highly recommended.
Similarly we will also see IPython, an interactive Python terminal, which is also highly recommended.
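For the pre-built route, the installation itself is a single command run as root; the package names below are an assumption based on how OpenCV was packaged in the Fedora repositories of that era:
yum install numpy opencv*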
Open Python IDLE (or IPython) and type the following code in the Python terminal.
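The check itself is just an import and a version print, for example:
>>> import cv2
>>> print cv2.__version__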
If the results are printed out without any errors, congratulations !!! You have installed OpenCV-Python successfully.
It is quite easy. But there is a problem with this: the yum repositories may not always contain the latest version of OpenCV.
For example, at the time of writing this tutorial, the yum repository contains 2.4.5 while the latest OpenCV version is
2.4.6. With respect to the Python API, the latest version will always have much better support. Also, there may be a chance
of problems with camera support, video playback etc., depending upon the drivers and the ffmpeg and gstreamer packages
present.
So my personal preference is the next method, i.e. compiling from source. Also, at some point, if you want to
contribute to OpenCV, you will need this.
Compiling from source may seem a little complicated at first, but once you have succeeded at it, there is nothing compli-
cated about it.
First we will install some dependencies. Some are compulsory, some are optional. You can skip the optional
dependencies if you don't need them.
Compulsory Dependencies
We need CMake to configure the installation, GCC for compilation, Python-devel and Numpy for creating Python
extensions etc.
Next we need GTK support for GUI features, Camera support (libdc1394, libv4l), Media Support (ffmpeg, gstreamer)
etc.
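As root, the corresponding installs are typically along these lines (exact package names vary between Fedora releases, and ffmpeg lives in the RPM Fusion repository, so treat these as an assumption rather than a definitive list):
yum install cmake gcc gcc-c++ python-devel numpy
yum install gtk2-devel libdc1394-devel libv4l-devel ffmpeg-devel gstreamer-plugins-base-devel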
Optional Dependencies
The above dependencies are sufficient to install OpenCV on your Fedora machine. But depending upon your requirements,
you may need some extra dependencies. A list of such optional dependencies is given below. You can either leave them
out or install them, your call :)
OpenCV comes with supporting files for image formats like PNG, JPEG, JPEG 2000, TIFF, WebP etc. But they may be
a little old. If you want the latest libraries, you can install the development files for these formats.
Several OpenCV functions are parallelized with Intel's Threading Building Blocks (TBB). If you want to enable
it, you need to install TBB first. (Also, while configuring the installation with CMake, don't forget to pass -D
WITH_TBB=ON. More details below.)
OpenCV uses another library, Eigen, for optimized mathematical operations. So if you have Eigen installed on your sys-
tem, you can exploit it. (Also, while configuring the installation with CMake, don't forget to pass -D WITH_EIGEN=ON.
More details below.)
If you want to build the documentation (yes, you can create an offline version of OpenCV's complete official documentation
on your system in HTML with full search, so that you need not always access the internet when you have a question, and it
is quite fast!), you need to install Sphinx (a documentation generation tool) and pdflatex (if you want to create
a PDF version of it). (Also, while configuring the installation with CMake, don't forget to pass -D BUILD_DOCS=ON.
More details below.)
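The optional development packages can be pulled in with commands roughly like the following (again, package names are an assumption based on Fedora packaging of that time and may differ on your release):
yum install libpng-devel libjpeg-turbo-devel jasper-devel openexr-devel libtiff-devel libwebp-devel
yum install tbb-devel
yum install eigen3-devel
yum install python-sphinx texlive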
Downloading OpenCV
Next we have to download OpenCV. You can download the latest release of OpenCV from the SourceForge site, then
extract the folder.
Or you can clone the latest source from OpenCV's GitHub repository. (If you want to contribute to OpenCV, choose this; it
always keeps your OpenCV up to date.) For that, you need to install Git first.
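Installing Git and cloning the sources typically looks like this (the upstream repository now lives under the opencv organization on GitHub; at the time this guide was written it was hosted under a different account):
yum install git
git clone https://github.com/opencv/opencv.git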
It will create a folder opencv in your home directory (or the directory you specify). The cloning may take some time
depending upon your internet connection.
Now open a terminal window and navigate to the downloaded OpenCV folder. Create a new build folder and
navigate to it.
mkdir build
cd build
Now that we have installed all the required dependencies, let's install OpenCV. The installation has to be configured with
CMake. It specifies which modules are to be installed, the installation path, which additional libraries to be used, whether
documentation and examples are to be compiled etc. The command below is normally used for configuration (executed from the
build folder).
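A typical form of that command, matching the description in the next sentence, is:
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..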
It specifies that build type is “Release Mode” and installation path is /usr/local. Observe the -D before each
option and .. at the end. In short, this is the format:
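Schematically (the trailing path argument is the OpenCV source directory, which is .. when run from the build folder):
cmake -D <flag1>=<value> -D <flag2>=<value> <path-to-source>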
You can specify as many flags as you want, but each flag should be preceded by -D.
So in this tutorial, we are installing OpenCV with TBB and Eigen support. We also build the documentation, but we
exclude the performance tests and sample builds. We also disable the GPU related modules (since we use OpenCV-Python,
we don't need GPU related modules; it saves us some time).
(All the below commands can be done in a single cmake statement, but they are split here for better understanding.)
• Enable TBB and Eigen support:
cmake -D WITH_TBB=ON -D WITH_EIGEN=ON ..
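The remaining items of this list are sketched below, assuming the standard OpenCV CMake option names of that era (BUILD_DOCS, BUILD_TESTS, BUILD_PERF_TESTS, BUILD_EXAMPLES and the per-module BUILD_opencv_gpu* switches); the GPU flag list is abbreviated:
• Enable documentation and disable tests and samples:
cmake -D BUILD_DOCS=ON -D BUILD_TESTS=OFF -D BUILD_PERF_TESTS=OFF -D BUILD_EXAMPLES=OFF ..
• Disable all GPU related modules:
cmake -D BUILD_opencv_gpu=OFF -D BUILD_opencv_gpuarithm=OFF ... -D BUILD_opencv_gpustereo=OFF -D BUILD_opencv_gpuwarping=OFF ..
• Set the installation path and build type:
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..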
Each time you run a cmake command, it prints out the resulting configuration setup. In the final setup you get, make
sure that the following fields are filled in (below are some important parts of the configuration I got). These fields should
be filled appropriately on your system as well; otherwise something has gone wrong, so check whether you have correctly
performed the above steps.
-- GUI:
-- GTK+ 2.x: YES (ver 2.24.19)
-- GThread : YES (ver 2.36.3)
-- Video I/O:
-- DC1394 2.x: YES (ver 2.2.0)
-- FFMPEG: YES
-- codec: YES (ver 54.92.100)
-- format: YES (ver 54.63.104)
-- util: YES (ver 52.18.100)
-- swscale: YES (ver 2.2.100)
-- gentoo-style: YES
-- GStreamer:
-- base: YES (ver 0.10.36)
-- video: YES (ver 0.10.36)
-- app: YES (ver 0.10.36)
-- riff: YES (ver 0.10.36)
-- pbutils: YES (ver 0.10.36)
-- Python:
-- Interpreter: /usr/bin/python2 (ver 2.7.5)
-- Libraries: /lib/libpython2.7.so (ver 2.7.5)
-- numpy: /usr/lib/python2.7/site-packages/numpy/core/include (ver 1.7.1)
-- Documentation:
-- Build Documentation: YES
-- Sphinx: /usr/bin/sphinx-build (ver 1.1.3)
Many other flags and settings are available; they are left for you to explore further.
Now build the files using the make command and install them using make install. make install
should be executed as root.
make
su
make install
Installation is over. All files are installed in the /usr/local/ folder. But to use it, your Python should be able to find the
OpenCV module. You have two options for that.
1. Move the module to any folder in the Python path: the Python path can be found by entering import
sys; print sys.path in a Python terminal. It will print out many locations. Move /usr/local/lib/
python2.7/site-packages/cv2.so to any of those folders. For example:
su
mv /usr/local/lib/python2.7/site-packages/cv2.so /usr/lib/python2.7/site-packages/
But you will have to do this every time you install OpenCV.
2. Add /usr/local/lib/python2.7/site-packages to the PYTHONPATH: this has to be done only once. Just open
~/.bashrc, add the following line to it, then log out and log back in.
export PYTHONPATH=$PYTHONPATH:/usr/local/lib/python2.7/site-packages
Thus OpenCV installation is finished. Open a terminal and try import cv2.
To build the documentation, just enter following commands:
make docs
make html_docs
Additional Resources
Exercises
Learn to play videos, capture videos from a camera, and write videos to a file.
• Mouse as a Paint-Brush
Goals
• Here, you will learn how to read an image, how to display it and how to save it back
• You will learn these functions : cv2.imread(), cv2.imshow() , cv2.imwrite()
• Optionally, you will learn how to display images with Matplotlib
Using OpenCV
Read an image
Use the function cv2.imread() to read an image. The image should be in the working directory or a full path of image
should be given.
Second argument is a flag which specifies the way the image should be read.
• cv2.IMREAD_COLOR : Loads a color image. Any transparency of image will be neglected. It is the default
flag.
• cv2.IMREAD_GRAYSCALE : Loads image in grayscale mode
• cv2.IMREAD_UNCHANGED : Loads image as such including alpha channel
Note: Instead of these three flags, you can simply pass integers 1, 0 or -1 respectively.
import numpy as np
import cv2

# Load a color image in grayscale
img = cv2.imread('messi5.jpg',0)
Warning: Even if the image path is wrong, it won’t throw any error, but print img will give you None
Display an image
Use the function cv2.imshow() to display an image in a window. The window automatically fits to the image size.
First argument is the window name, which is a string. Second argument is our image. You can create as many windows
as you wish, but with different window names.
cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
A screenshot of the window will look like this (in Fedora-Gnome machine):
cv2.waitKey() is a keyboard binding function. Its argument is the time in milliseconds. The function waits the
specified milliseconds for any keyboard event. If you press any key in that time, the program continues. If 0 is passed,
it waits indefinitely for a key stroke. It can also be set to detect specific key strokes, e.g. whether the key a is pressed,
which we will discuss below.
cv2.destroyAllWindows() simply destroys all the windows we created. If you want to destroy any specific window,
use the function cv2.destroyWindow() where you pass the exact window name as the argument.
Note: There is a special case where you can create a window first and load the image into it later. In that case, you can
specify whether the window is resizable or not. It is done with the function cv2.namedWindow(). By default, the flag is
cv2.WINDOW_AUTOSIZE. But if you specify the flag cv2.WINDOW_NORMAL, you can resize the window. This is
helpful when the image is large in dimension or when you are adding trackbars to the window.
cv2.namedWindow('image', cv2.WINDOW_NORMAL)
cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Write an image
Use the function cv2.imwrite() to save an image. First argument is the file name, second argument is the image you want to save.
cv2.imwrite('messigray.png',img)
This will save the image in PNG format in the working directory.
Sum it up
The program below loads an image in grayscale, displays it, saves the image if you press 's' and exits, or simply exits without
saving if you press the ESC key.
import numpy as np
import cv2
img = cv2.imread('messi5.jpg',0)
cv2.imshow('image',img)
k = cv2.waitKey(0)
if k == 27:         # wait for ESC key to exit
    cv2.destroyAllWindows()
elif k == ord('s'): # wait for 's' key to save and exit
    cv2.imwrite('messigray.png',img)
    cv2.destroyAllWindows()
Warning: If you are using a 64-bit machine, you will have to modify k = cv2.waitKey(0) line as follows :
k = cv2.waitKey(0) & 0xFF
Using Matplotlib
Matplotlib is a plotting library for Python which gives you a wide variety of plotting methods. You will see them in
coming articles. Here, you will learn how to display an image with Matplotlib. You can zoom images, save them etc. using
Matplotlib.
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('messi5.jpg',0)
plt.imshow(img, cmap = 'gray', interpolation = 'bicubic')
plt.xticks([]), plt.yticks([]) # to hide tick values on X and Y axis
plt.show()
See also:
Plenty of plotting options are available in Matplotlib. Please refer to Matplotlib docs for more details. Some, we will
see on the way.
Warning: Color images loaded by OpenCV are in BGR mode, but Matplotlib displays them in RGB mode. So color
images will not be displayed correctly in Matplotlib if the image is read with OpenCV. Please see the exercises for
more details.
Additional Resources
Exercises
1. There is some problem when you try to load color image in OpenCV and display it in Matplotlib. Read this
discussion and understand it.
Goal
Often, we have to capture a live stream with a camera. OpenCV provides a very simple interface for this. Let's capture a
video from the camera (I am using the built-in webcam of my laptop), convert it into a grayscale video and display it.
Just a simple task to get started.
To capture a video, you need to create a VideoCapture object. Its argument can be either the device index or the name
of a video file. The device index is just a number to specify which camera. Normally one camera will be connected (as
in my case), so I simply pass 0 (or -1). You can select the second camera by passing 1, and so on. After that, you can
capture frame-by-frame. But at the end, don't forget to release the capture.
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

while(True):
    # Capture frame-by-frame
    ret, frame = cap.read()
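    # Continuation of the loop as described above: convert the frame to
    # grayscale and display it, quitting when 'q' is pressed
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()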
cap.read() returns a bool (True/False). If the frame is read correctly, it will be True. So you can check for the end of the
video by checking this return value.
Sometimes, cap may not have initialized the capture. In that case, this code shows an error. You can check whether it is
initialized or not with the method cap.isOpened(). If it returns True, OK. Otherwise open it using cap.open().
You can also access some of the features of this video using cap.get(propId) method where propId is a number from
0 to 18. Each number denotes a property of the video (if it is applicable to that video) and full details can be seen here:
Property Identifier. Some of these values can be modified using cap.set(propId, value). Value is the new value you
want.
For example, I can check the frame width and height by cap.get(3) and cap.get(4). It gives me 640x480 by
default. But I want to modify it to 320x240. Just use ret = cap.set(3,320) and ret = cap.set(4,240).
Note: If you are getting error, make sure camera is working fine using any other camera application (like Cheese in
Linux).
Playing Video from file
It is the same as capturing from a camera, just change the camera index to a video file name. Also, while displaying the frame,
use an appropriate time for cv2.waitKey(). If it is too small, the video will be very fast, and if it is too high, the video will be
slow (well, that is how you can display videos in slow motion). 25 milliseconds will be OK in normal cases.
import numpy as np
import cv2

cap = cv2.VideoCapture('vtest.avi')

while(cap.isOpened()):
    ret, frame = cap.read()
    # Convert the frame to grayscale before displaying it
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('frame',gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Note: Make sure proper versions of ffmpeg or gstreamer is installed. Sometimes, it is a headache to work with Video
Capture mostly due to wrong installation of ffmpeg/gstreamer.
Saving a Video
So we capture a video, process it frame-by-frame, and we want to save that video. For images, it is very simple: just
use cv2.imwrite(). Here, a little more work is required.
This time we create a VideoWriter object. We should specify the output file name (e.g. output.avi). Then we should
specify the FourCC code (details in the next paragraph). Then the number of frames per second (fps) and the frame size should
be passed. And the last one is the isColor flag. If it is True, the encoder expects color frames, otherwise it works with grayscale
frames.
FourCC is a 4-byte code used to specify the video codec. The list of available codes can be found at fourcc.org. It is
platform dependent. The following codecs work fine for me.
• In Fedora: DIVX, XVID, MJPG, X264, WMV1, WMV2. (XVID is preferable. MJPG results in large
video files. X264 gives very small video files.)
• In Windows: DIVX (more to be tested and added)
• In OSX: (I don't have access to OSX. Can someone fill this in?)
The FourCC code is passed as cv2.VideoWriter_fourcc('M','J','P','G') or cv2.VideoWriter_fourcc(*'MJPG') for MJPG.
The code below captures from a camera, flips every frame in the vertical direction and saves it.
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

# Define the codec and create a VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, (640,480))

while(cap.isOpened()):
    ret, frame = cap.read()
    if ret==True:
        frame = cv2.flip(frame,0)
        # write the flipped frame
        out.write(frame)
        cv2.imshow('frame',frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

# Release everything when the job is finished
cap.release()
out.release()
cv2.destroyAllWindows()
Additional Resources
Exercises
Goal
• Learn to draw different geometric shapes with OpenCV.
• You will learn these functions : cv2.line(), cv2.circle(), cv2.rectangle(), cv2.ellipse(), cv2.putText() etc.
Code
In all the above functions, you will see some common arguments as given below:
• img : The image where you want to draw the shapes
• color : Color of the shape. For BGR, pass it as a tuple, e.g. (255,0,0) for blue. For grayscale, just pass the
scalar value.
• thickness : Thickness of the line or circle etc. If -1 is passed for closed figures like circles, it will fill the shape.
Default thickness = 1.
• lineType : Type of line, whether 8-connected, anti-aliased line etc. By default, it is 8-connected. cv2.LINE_AA
gives an anti-aliased line which looks great for curves.
Drawing Line
To draw a line, you need to pass the starting and ending coordinates of the line. We will create a black image and draw a blue
line on it from the top-left to the bottom-right corner.
import numpy as np
import cv2
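A minimal version of that example follows (the 512x512 canvas size is an assumption; any size works):
# Create a black image
img = np.zeros((512,512,3), np.uint8)

# Draw a diagonal blue line with thickness of 5 px
cv2.line(img,(0,0),(511,511),(255,0,0),5)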
Drawing Rectangle
To draw a rectangle, you need the top-left corner and bottom-right corner of the rectangle. This time we will draw a green
rectangle at the top-right corner of the image.
cv2.rectangle(img,(384,0),(510,128),(0,255,0),3)
Drawing Circle
To draw a circle, you need its center coordinates and radius. We will draw a circle inside the rectangle drawn above.
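For example, a filled red circle placed inside the rectangle above (the exact center and radius are an assumption matching the tutorial's figure):
cv2.circle(img,(447,63), 63, (0,0,255), -1)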
Drawing Ellipse
To draw an ellipse, we need to pass several arguments. One argument is the center location (x,y). The next argument is
the axes lengths (major axis length, minor axis length). angle is the angle of rotation of the ellipse in the anti-clockwise
direction. startAngle and endAngle denote the start and end of the ellipse arc, measured in the clockwise direction
from the major axis, i.e. giving values 0 and 360 gives the full ellipse. For more details, check the documentation of
cv2.ellipse(). The example below draws a half ellipse at the center of the image.
cv2.ellipse(img,(256,256),(100,50),0,0,180,255,-1)
Drawing Polygon
To draw a polygon, first you need the coordinates of the vertices. Make those points into an array of shape ROWSx1x2, where
ROWS is the number of vertices, and it should be of type int32. Here we draw a small polygon with four vertices in
yellow color.
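A sketch of that, using four arbitrary vertices:
pts = np.array([[10,5],[20,30],[70,20],[50,10]], np.int32)
pts = pts.reshape((-1,1,2))
cv2.polylines(img,[pts],True,(0,255,255))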
Note: If the third argument is False, you will get polylines joining all the points, not a closed shape.
Note: cv2.polylines() can be used to draw multiple lines. Just create a list of all the lines you want to draw
and pass it to the function. All lines will be drawn individually. It is a better and faster way to draw a group of
lines than calling cv2.line() for each line.
Adding Text to Images
To put text in images, you need the text data, the position coordinates of where to put it, the font type, the font scale, and
regular things like color, thickness and lineType.
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.putText(img,'OpenCV',(10,500), font, 4,(255,255,255),2,cv2.LINE_AA)
Result
So it is time to see the final result of our drawing. As you studied in previous articles, display the image to see it.
Additional Resources
1. The angles used in the ellipse function are not our circular angles. For more details, visit this discussion.
Exercises
1. Try to create the logo of OpenCV using drawing functions available in OpenCV
Goal
Simple Demo
Here, we create a simple application which draws a circle on an image wherever we double-click on it.
First we create a mouse callback function which is executed when a mouse event takes place. A mouse event can be
anything related to the mouse, like left-button down, left-button up, left-button double-click etc. It gives us the coordinates
(x,y) of every mouse event. With this event and location, we can do whatever we like. To list all the available events,
run the following code in a Python terminal:
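One way to do that is a simple introspection of the cv2 module:
>>> events = [i for i in dir(cv2) if 'EVENT' in i]
>>> print events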
Creating a mouse callback function has a specific format which is the same everywhere. It differs only in what the function
does. So our mouse callback function does one thing: it draws a circle where we double-click. See the code below; it is
self-explanatory from the comments:
import cv2
import numpy as np
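# A hedged sketch of the callback and setup this example needs: the callback
# draws a filled blue circle (radius and color are assumptions) at every
# double-click, and is then registered on a window showing a black canvas.
def draw_circle(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDBLCLK:
        cv2.circle(img, (x, y), 100, (255, 0, 0), -1)

# Create a black image, a window, and bind the callback to the window
img = np.zeros((512, 512, 3), np.uint8)
cv2.namedWindow('image')
cv2.setMouseCallback('image', draw_circle)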
while(1):
    cv2.imshow('image',img)
    if cv2.waitKey(20) & 0xFF == 27:
        break
cv2.destroyAllWindows()
Now we go for a much better application. In this one, we draw either rectangles or circles (depending on the mode we
select) by dragging the mouse, like we do in a Paint application. So our mouse callback function has two parts: one to
draw rectangles and the other to draw circles. This specific example will be really helpful for creating and understanding
interactive applications like object tracking, image segmentation etc.
import cv2
import numpy as np

drawing = False  # true if the mouse button is pressed
mode = True      # if True, draw rectangles; press 'm' to toggle to circles
ix, iy = -1, -1

# mouse callback function
def draw_circle(event, x, y, flags, param):
    global ix, iy, drawing, mode
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing = True
        ix, iy = x, y
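    # A hedged sketch of the remaining branches the text describes: dragging
    # draws a filled green rectangle or a small red circle depending on the
    # current mode, and releasing the button stops drawing.
    elif event == cv2.EVENT_MOUSEMOVE:
        if drawing == True:
            if mode == True:
                cv2.rectangle(img, (ix, iy), (x, y), (0, 255, 0), -1)
            else:
                cv2.circle(img, (x, y), 5, (0, 0, 255), -1)
    elif event == cv2.EVENT_LBUTTONUP:
        drawing = False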
Next we have to bind this mouse callback function to OpenCV window. In the main loop, we should set a keyboard
binding for key ‘m’ to toggle between rectangle and circle.
img = np.zeros((512,512,3), np.uint8)
cv2.namedWindow('image')
cv2.setMouseCallback('image', draw_circle)

while(1):
    cv2.imshow('image',img)
    k = cv2.waitKey(1) & 0xFF
    if k == ord('m'):
        mode = not mode
    elif k == 27:
        break

cv2.destroyAllWindows()
Additional Resources
Exercises
1. In our last example, we drew a filled rectangle. Modify the code to draw an unfilled rectangle.
Goal
Code Demo
Here we will create a simple application which shows the color you specify. You have a window which shows the
color, and three trackbars to specify each of the B, G, R colors. You slide the trackbars and the window color
changes correspondingly. By default, the initial color is set to black.
For the cv2.createTrackbar() function, the first argument is the trackbar name, the second one is the window name to which it is
attached, the third argument is the default value, the fourth one is the maximum value, and the fifth one is the callback function
which is executed every time the trackbar value changes. The callback function always has a default argument, which is
the trackbar position. In our case the function does nothing, so we simply pass. (The current position of each trackbar is
read back with cv2.getTrackbarPos().)
Another important application of a trackbar is to use it as a button or switch. OpenCV, by default, doesn't have button
functionality. So you can use a trackbar to get such functionality. In our application, we have created one switch in
which the application works only if the switch is ON, otherwise the screen is always black.
import cv2
import numpy as np

def nothing(x):
    pass

# Create a black image and a window
img = np.zeros((300,512,3), np.uint8)
cv2.namedWindow('image')

# create trackbars for color change
cv2.createTrackbar('R','image',0,255,nothing)
cv2.createTrackbar('G','image',0,255,nothing)
cv2.createTrackbar('B','image',0,255,nothing)

# create switch for ON/OFF functionality
switch = '0 : OFF \n1 : ON'
cv2.createTrackbar(switch,'image',0,1,nothing)

while(1):
    cv2.imshow('image',img)
    k = cv2.waitKey(1) & 0xFF
    if k == 27:
        break

    # get current positions of the four trackbars
    r = cv2.getTrackbarPos('R','image')
    g = cv2.getTrackbarPos('G','image')
    b = cv2.getTrackbarPos('B','image')
    s = cv2.getTrackbarPos(switch,'image')

    if s == 0:
        img[:] = 0
    else:
        img[:] = [b,g,r]

cv2.destroyAllWindows()
Exercises
1. Create a Paint application with adjustable colors and brush radius using trackbars. For drawing, refer to the previous
tutorial on mouse handling.
Learn to read and edit pixel values, working with image ROI and other basic
operations.
Learn some of the mathematical tools provided by OpenCV like PCA, SVD
etc.
Goal
Learn to:
• Access pixel values and modify them
• Access image properties
• Setting Region of Image (ROI)
• Splitting and Merging images
Almost all the operations in this section are mainly related to Numpy rather than OpenCV. A good knowl-
edge of Numpy is required to write better optimized code with OpenCV.
(Examples will be shown in a Python terminal, since most of them are just single lines of code.)
You can access a pixel value by its row and column coordinates. For a BGR image, it returns an array of Blue, Green,
Red values. For a grayscale image, just the corresponding intensity is returned.
>>> import cv2
>>> import numpy as np
>>> img = cv2.imread('messi5.jpg')
>>> px = img[100,100]
>>> print px
[157 166 200]
Warning: Numpy is an optimized library for fast array calculations. So simply accessing each and every pixel
value and modifying it will be very slow, and it is discouraged.
Note: The above mentioned method is normally used for selecting a region of an array, say the first 5 rows and last 3 columns.
For individual pixel access, the Numpy array methods array.item() and array.itemset() are considered
better. They always return a scalar, though, so if you want to access all the B, G, R values, you need to call array.item()
separately for each channel.
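For example, reading and then modifying the red value of one pixel with these methods looks like this (continuing with the img loaded above):
>>> img.item(10, 10, 2)        # red value of the pixel at (10,10); depends on the image
>>> img.itemset((10, 10, 2), 100)
>>> img.item(10, 10, 2)
100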
Image properties include the number of rows, columns and channels, the type of image data, the number of pixels etc.
The shape of an image is accessed by img.shape. It returns a tuple of the number of rows, columns and channels (if the
image is color):
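For example (the numbers depend on your image; img.size and img.dtype expose the other properties mentioned above):
>>> print img.shape    # (rows, columns, channels) for a color image
>>> print img.size     # total number of values, i.e. rows * columns * channels
>>> print img.dtype    # e.g. uint8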
Note: If the image is grayscale, the returned tuple contains only the number of rows and columns. So it is a good way to
check whether a loaded image is grayscale or color.
Note: img.dtype is very important while debugging, because a large number of errors in OpenCV-Python code are
caused by an invalid datatype.
Image ROI
Sometimes you will have to work with a certain region of an image. For eye detection in images, first perform face
detection over the image until the face is found, then search within the face region for the eyes. This approach improves
accuracy (because eyes are always on faces :D) and performance (because we search in a small area).
A ROI is again obtained using Numpy indexing. Here I am selecting the ball and copying it to another region of the
image:
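The selection and copy is plain Numpy slicing; the coordinates below are an assumption matching the tutorial's football image, so adjust them for your own picture:
>>> ball = img[280:340, 330:390]
>>> img[273:333, 100:160] = ball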
The B,G,R channels of an image can be split into their individual planes when needed. Then, the individual channels
can be merged back together to form a BGR image again. This can be performed by:
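The split and merge calls look like this:
>>> b,g,r = cv2.split(img)
>>> img = cv2.merge((b,g,r))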
Or
>>> b = img[:,:,0]
Suppose you want to make all the red pixels zero; you need not split the channels first. You can
simply use Numpy indexing, which is faster.
>>> img[:,:,2] = 0
Warning: cv2.split() is a costly operation (in terms of time), so only use it if necessary. Numpy indexing
is much more efficient and should be used if possible.
If you want to create a border around an image, something like a photo frame, you can use the cv2.copyMakeBorder()
function. It also has applications in convolution operations, zero padding etc. This function takes the following
arguments:
• src - input image
• top, bottom, left, right - border width in number of pixels in corresponding directions
• borderType - Flag defining what kind of border to be added. It can be following types:
– cv2.BORDER_CONSTANT - Adds a constant colored border. The value should be given as next
argument.
– cv2.BORDER_REFLECT - Border will be mirror reflection of the border elements, like this : fed-
cba|abcdefgh|hgfedcb
– cv2.BORDER_REFLECT_101 or cv2.BORDER_DEFAULT - Same as above, but with a slight
change, like this : gfedcb|abcdefgh|gfedcba
– cv2.BORDER_REPLICATE - Last element is replicated throughout, like this:
aaaaaa|abcdefgh|hhhhhhh
– cv2.BORDER_WRAP - Can’t explain, it will look like this : cdefgh|abcdefgh|abcdefg
• value - Color of border if border type is cv2.BORDER_CONSTANT
Below is a sample code demonstrating all these border types for better understanding:
import cv2
import numpy as np
from matplotlib import pyplot as plt
BLUE = [255,0,0]
img1 = cv2.imread('opencv_logo.png')
replicate = cv2.copyMakeBorder(img1,10,10,10,10,cv2.BORDER_REPLICATE)
reflect = cv2.copyMakeBorder(img1,10,10,10,10,cv2.BORDER_REFLECT)
reflect101 = cv2.copyMakeBorder(img1,10,10,10,10,cv2.BORDER_REFLECT_101)
wrap = cv2.copyMakeBorder(img1,10,10,10,10,cv2.BORDER_WRAP)
constant= cv2.copyMakeBorder(img1,10,10,10,10,cv2.BORDER_CONSTANT,value=BLUE)
plt.subplot(231),plt.imshow(img1,'gray'),plt.title('ORIGINAL')
plt.subplot(232),plt.imshow(replicate,'gray'),plt.title('REPLICATE')
plt.subplot(233),plt.imshow(reflect,'gray'),plt.title('REFLECT')
plt.subplot(234),plt.imshow(reflect101,'gray'),plt.title('REFLECT_101')
plt.subplot(235),plt.imshow(wrap,'gray'),plt.title('WRAP')
plt.subplot(236),plt.imshow(constant,'gray'),plt.title('CONSTANT')
plt.show()
See the result below. (Image is displayed with matplotlib. So RED and BLUE planes will be interchanged):
Additional Resources
Exercises
Goal
• Learn several arithmetic operations on images like addition, subtraction, bitwise operations etc.
• You will learn these functions : cv2.add(), cv2.addWeighted() etc.
Image Addition
You can add two images with the OpenCV function cv2.add(), or simply by the Numpy operation res = img1 + img2.
Both images should be of the same depth and type, or the second image can just be a scalar value.
Note: There is a difference between OpenCV addition and Numpy addition. OpenCV addition is a saturated operation
while Numpy addition is a modulo operation.
>>> x = np.uint8([250])
>>> y = np.uint8([10])
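Evaluating both forms of addition on these arrays makes the difference clear: cv2.add() saturates at 255, while the Numpy sum wraps around modulo 256.
>>> print cv2.add(x,y)   # 250 + 10 = 260 => saturated to 255
[[255]]
>>> print x + y          # 250 + 10 = 260 % 256 = 4
[4]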
This will be more visible when you add two images. The OpenCV function will provide a better result, so it is always better
to stick to OpenCV functions.
Image Blending
This is also image addition, but different weights are given to the images so that it gives a feeling of blending or trans-
parency. Images are added as per the equation below:
g(x) = (1 - 𝛼) f0(x) + 𝛼 f1(x)
By varying 𝛼 from 0 → 1, you can perform a cool transition from one image to another.
Here I took two images to blend together. The first image is given a weight of 0.7 and the second image 0.3.
cv2.addWeighted() applies the following equation to the images (here 𝛾 is taken as zero):
dst = 𝛼 · img1 + 𝛽 · img2 + 𝛾
img1 = cv2.imread('ml.png')
img2 = cv2.imread('opencv_logo.jpg')
dst = cv2.addWeighted(img1,0.7,img2,0.3,0)
cv2.imshow('dst',dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
Bitwise Operations
This includes the bitwise AND, OR, NOT and XOR operations. They will be highly useful while extracting any part of
an image (as we will see in coming chapters), defining and working with non-rectangular ROIs etc. Below we will see
an example of how to change a particular region of an image.
I want to put the OpenCV logo above an image. If I add the two images, the colors will change. If I blend them, I get a transparent
effect. But I want it to be opaque. If it were a rectangular region, I could use a ROI as we did in the last chapter. But the OpenCV
logo is not a rectangular shape. So you can do it with bitwise operations as below:
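# Load the two images: img1 is the main image, img2 is the OpenCV logo
# (the file names are an assumption; use any large photo and the logo file)
img1 = cv2.imread('messi5.jpg')
img2 = cv2.imread('opencv_logo.png')

# I want to put the logo in the top-left corner, so create a ROI of its size
rows, cols, channels = img2.shape
roi = img1[0:rows, 0:cols]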
# Now create a mask of logo and create its inverse mask also
img2gray = cv2.cvtColor(img2,cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(img2gray, 10, 255, cv2.THRESH_BINARY)
mask_inv = cv2.bitwise_not(mask)
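# Black-out the area of the logo in the ROI, and keep only the logo region
# from the logo image (these are the img1_bg and img2_fg mentioned below)
img1_bg = cv2.bitwise_and(roi, roi, mask = mask_inv)
img2_fg = cv2.bitwise_and(img2, img2, mask = mask)

# Put the logo in the ROI and modify the main image
dst = cv2.add(img1_bg, img2_fg)
img1[0:rows, 0:cols] = dst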
cv2.imshow('res',img1)
cv2.waitKey(0)
cv2.destroyAllWindows()
See the result below. The left image shows the mask we created. The right image shows the final result. For a better under-
standing, display all the intermediate images in the above code, especially img1_bg and img2_fg.
Additional Resources
Exercises
1. Create a slide show of images in a folder with smooth transition between images using cv2.addWeighted
function
Goal
In image processing, since you are dealing with a large number of operations per second, it is mandatory that your code
not only provides the correct solution, but does so in the fastest manner. So in this chapter, you will learn
• To measure the performance of your code.
• Some tips to improve the performance of your code.
• You will see these functions : cv2.getTickCount, cv2.getTickFrequency etc.
Apart from OpenCV, Python also provides the module time, which is helpful in measuring the time of execution. Another
module, profile, helps to get a detailed report on the code, like how much time each function in the code took, how many
times the function was called etc. But if you are using IPython, all these features are integrated in a user-friendly
manner. We will see some important ones; for more details, check the links in the Additional Resources section.
cv2.getTickCount function returns the number of clock-cycles from a reference event (like the moment the machine was
switched ON) to the moment this function is called. So if you call it before and after a function's execution, you get the
number of clock-cycles used to execute that function.
cv2.getTickFrequency function returns the frequency of clock-cycles, or the number of clock-cycles per second. So
to find the time of execution in seconds, you can do the following:
e1 = cv2.getTickCount()
# your code execution
e2 = cv2.getTickCount()
time = (e2 - e1)/ cv2.getTickFrequency()
We will demonstrate with the following example, which applies median filtering with kernels of odd sizes
ranging from 5 to 49. (Don't worry about what the result will look like; that is not our goal.)
img1 = cv2.imread('messi5.jpg')
e1 = cv2.getTickCount()
for i in xrange(5,49,2):
img1 = cv2.medianBlur(img1,i)
e2 = cv2.getTickCount()
t = (e2 - e1)/cv2.getTickFrequency()
print t
Note: You can do the same with the time module. Instead of cv2.getTickCount, use the time.time() function,
then take the difference of the two times.
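A minimal sketch of the same measurement with the time module:
import time

e1 = time.time()
# your code execution
e2 = time.time()
print e2 - e1   # elapsed time in seconds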
Many of the OpenCV functions are optimized using SSE2, AVX etc. OpenCV also contains unoptimized fallback code. So if our
system supports these features, we should exploit them (almost all modern processors support them). Optimization is enabled
by default at compile time, so OpenCV runs the optimized code if it is enabled, else it runs the unoptimized code.
You can use cv2.useOptimized() to check if it is enabled/disabled and cv2.setUseOptimized() to enable/disable it.
Let’s see a simple example.
# check if optimization is enabled
In [5]: cv2.useOptimized()
Out[5]: True
# Disable it
In [7]: cv2.setUseOptimized(False)
In [8]: cv2.useOptimized()
Out[8]: False
See, the optimized median filtering is ~2x faster than the unoptimized version. If you check its source, you can see that median
filtering is SIMD optimized. So you can use this to enable optimization at the top of your code (remember it is enabled
by default).
Sometimes you may need to compare the performance of two similar operations. IPython gives you the magic command
%timeit to do this. It runs the code several times to get more accurate results. Once again, it is suitable for
measuring single lines of code.
For example, do you know which of the following addition operations is better: x = 5; y = x**2, x =
5; y = x*x, x = np.uint8([5]); y = x*x or y = np.square(x)? We will find out with %timeit in
the IPython shell.
In [10]: x = 5
In [15]: z = np.uint8([5])
You can see that x = 5; y = x*x is the fastest, and it is around 20x faster compared to Numpy. If you consider the
array creation also, it may reach up to 100x faster. Cool, right? (Numpy devs are working on this issue.)
Note: Python scalar operations are faster than Numpy scalar operations. So for operations involving only one or two
elements, a Python scalar is better than a Numpy array. Numpy has the advantage when the size of the array is a little bigger.
We will try one more example. This time, we will compare the performance of cv2.countNonZero() and
np.count_nonzero() for the same image.
Note: Normally, OpenCV functions are faster than Numpy functions. So for the same operation, OpenCV functions are
preferred. But there can be exceptions, especially when Numpy works with views instead of copies.
There are several other magic commands to measure performance, profiling, line profiling, memory measurement
etc. They are all well documented, so only links to those docs are provided here. Interested readers are recommended
to try them out.
There are several techniques and coding methods to exploit the maximum performance of Python and Numpy. Only the
relevant ones are noted here, with links given to the important sources. The main thing to note is: first try
to implement the algorithm in a simple manner. Once it is working, profile it, find the bottlenecks and optimize them.
1. Avoid using loops in Python as far as possible, especially double/triple loops etc. They are inherently slow.
2. Vectorize the algorithm/code to the maximum possible extent because Numpy and OpenCV are optimized for
vector operations.
3. Exploit the cache coherence.
4. Never make copies of an array unless it is needed. Try to use views instead; array copying is a costly operation.
Even after doing all these operations, if your code is still slow, or the use of large loops is inevitable, use additional
libraries like Cython to make it faster.
Additional Resources
Exercises
• Changing Colorspaces
• Image Thresholding
• Smoothing Images
Learn to blur the images, filter the images with custom kernels etc.
• Morphological Transformations