Image Processing and Computer Vision Libraries for Python

Demand for digital imaging and computer vision is growing across all corners of the IT industry. We use these solutions every day, rarely noticing how complex and sophisticated the underlying hardware and software systems are.

For example, nowadays it's relatively normal to have autopilot in our cars, use drones to evaluate the state of crops or have retail stores recognize goods by packaging.

These are only a handful of examples that show the reach of digital image processing. In parallel, the use of computer vision is growing rapidly. Its algorithms are used in both web and mobile projects, and it's worth noting that both fields are used intensively to prepare data for scientific work.

Many programming languages have libraries for image processing and computer vision. This includes languages that are often used on the backend, such as Java, C#, or Ruby, as well as frontend languages like JavaScript. For all of them, there are many open-source libraries that can be used in a project.

In a previous blog post, Overview of modern computer vision tools, we’ve already considered the many libraries available for computer vision in several programming languages and cloud systems. To name a few, we discussed Keras, Scikit-learn, and YOLO.

There is one programming language in particular that has penetrated almost all industries and is widely used to solve applied problems: Python. Both image processing researchers and data science teams working on computer vision projects use emerging libraries through their Python interfaces.

Let's see why Python and its libraries are so widespread and how the number of users in this ecosystem is growing.

What is image processing?

Digital image processing is the use of computer algorithms to process digital images. Compared with analog processing, it allows significantly more complex algorithms to be applied to an image, including methods that would be impossible to implement in analog form.

Some of the main tasks of digital image processing include filtering and affine transformations. Image processing, also referred to as image analysis, focuses on working with 2D images to transform one image into another.
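To make these two tasks more concrete, here is a minimal sketch that uses only NumPy. The helper names `box_blur` and `affine_points` are our own illustrations, not functions from any library, and a real project would normally rely on a dedicated library for both operations.

```python
import numpy as np

def box_blur(image: np.ndarray) -> np.ndarray:
    """Smooth a 2D grayscale image with a 3x3 mean filter (edge pixels are replicated)."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    height, width = image.shape
    total = np.zeros((height, width), dtype=np.float64)
    # Sum the nine shifted views that make up each pixel's 3x3 neighbourhood.
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + height, dx:dx + width]
    return total / 9.0

def affine_points(points: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    """Apply a 2x3 affine matrix to an (N, 2) array of (x, y) coordinates."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return homogeneous @ matrix.T

# Filtering: a single bright pixel is spread over its 3x3 neighbourhood.
image = np.zeros((5, 5))
image[2, 2] = 9.0
blurred = box_blur(image)

# Affine transformation: a 90-degree rotation about the origin
# followed by a shift of 10 along x.
rotate_and_shift = np.array([[0.0, -1.0, 10.0],
                             [1.0,  0.0,  0.0]])
corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
moved = affine_points(corners, rotate_and_shift)
```

In both cases the input is an array of pixels or coordinates and the output is another array of the same kind, which is the "one image into another" idea described above.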

What is computer vision?

Computer vision, also known as technical vision, is the theory and technology of creating machines that can detect, track, and classify objects. As a scientific discipline, it is concerned with building artificial systems that extract information from images.

Image data can come from video sequences, images from various cameras, or 3D data such as that produced by a medical scanner. Computer vision also includes event detection, tracking, pattern recognition, image restoration, and more.

In a nutshell, computer vision can classify, identify, inspect, and detect objects.

Based on the input image, a computer vision algorithm generates output data that can be used in a wide variety of applied industries, from processing medical images to analyzing road traffic.

As mentioned in this post on ResearchGate “In image processing, an image is "processed", that is, transformations are applied to an input image and an output image is returned. Computer vision uses image processing algorithms to solve some of its tasks. The main difference between these two approaches is the goals (not the methods used). For example, if the goal is to enhance an image for later use, then this may be called image processing. If the goal is to emulate human vision, like object recognition, defect detection, or automatic driving, then it may be called computer vision.”

Why use Python for image processing?

In previous blog posts, we’ve covered the many benefits of using Python. Its wide adoption confirms the growing interest in this programming language and the approach in general.

Today, almost every image processing or computer vision library exposes its main functions through some form of scripting interface.

Oftentimes, this scripting language is Python. Developers favor this particular ecosystem because of the following:

  • Python is easy to learn without sacrificing the quality of the resulting solutions.
  • Python has been around for a long time and has become mature and stable.
  • Python continues to evolve dynamically.
  • You can use powerful OOP techniques and still write simple, effective code.
  • There is a huge number of libraries that cover the whole range of programming problems.

There are some drawbacks for inexperienced users:

  • The code can be slow if you process large amounts of data directly in Python (doing this through libraries like NumPy makes everything work significantly faster).
  • It can take some time to get used to a syntax without braces marking the beginning and end of a block (Python relies on consistent indentation at the start of each line instead).
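The first drawback is easy to illustrate. The sketch below inverts an 8-bit grayscale image twice: once pixel by pixel in pure Python and once with a single NumPy expression. The function names and the random test image are our own assumptions for the example; the results are identical, but the NumPy version runs in compiled code and is typically much faster on large images.

```python
import numpy as np

def invert_pure_python(pixels):
    """Invert an 8-bit grayscale image stored as a list of rows, one pixel at a time."""
    return [[255 - value for value in row] for row in pixels]

def invert_numpy(image: np.ndarray) -> np.ndarray:
    """Invert the same image with one vectorized expression."""
    return 255 - image

# A random 64x64 test image with values in the range 0..255.
rng = np.random.default_rng(seed=0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

slow = invert_pure_python(image.tolist())
fast = invert_numpy(image)
```

To see the speed difference on your own machine, you could time both functions with the standard `timeit` module on a larger image.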

Python libraries for image processing and computer vision

OpenCV-Python

We've already reviewed the great features of OpenCV-Python in one of our blog articles. Once OpenCV became available with a Python interface, the library became more popular and more practical to use.

As mentioned in OpenCV documentation, “Compared to languages like C/C++, Python is slower. That said, Python can be easily extended with C/C++, which allows us to write computationally intensive code in C/C++ and create Python wrappers that can be used as Python modules. This gives us two advantages: first, the code is as fast as the original C/C++ code (since it is the actual C++ code working in the background), and second, it is easier to code in Python than C/C++. OpenCV-Python is a Python wrapper for the original OpenCV C++ implementation.”

OpenCV-Python represents images as NumPy arrays, which makes them easy to work with in Python code. In addition, NumPy's fast array operations are valuable for building image processing features that run much faster than pure Python.

Developers from OpenCV projects say: “OpenCV-Python makes use of Numpy, which is a highly optimized library for numerical operations with a MATLAB-style syntax. All the OpenCV array structures are converted to and from Numpy arrays. This also makes it easier to integrate with other libraries that use Numpy such as SciPy and Matplotlib.”

It is worth mentioning that OpenCV makes both image processing and the newest computer vision algorithms available from Python.

Scikit-image

Scikit-image is a collection of algorithms for image processing. It is free of charge and free of restriction. A team of volunteers provides high-quality, peer-reviewed code available for usage from Python.

Scikit-image is especially valued for its image processing and filtering capabilities. In addition, the library has a useful morphology module that can generate structuring elements and apply morphological operations to an image. Segmentation, transformation, exposure, and many other algorithms make this Python library one of the best for image processing.

Like OpenCV, scikit-image represents images as NumPy arrays. This makes the two libraries compatible, giving users the chance to combine methods from both libraries on the same image.
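As a small illustration, the sketch below applies a filter and a morphological operation to a synthetic NumPy image. It assumes the `scikit-image` package is installed; the image size, the threshold, and the disk radius are arbitrary values chosen for the example.

```python
import numpy as np
from skimage import filters, morphology

# A synthetic grayscale image: a bright square on a dark background.
image = np.zeros((64, 64), dtype=float)
image[16:48, 16:48] = 1.0

# Filtering: the Sobel filter responds only where intensity changes.
edges = filters.sobel(image)

# Morphology: build a disk-shaped structuring element and use it
# to grow the square by three pixels in every direction.
mask = (image > 0.5).astype(np.uint8)
footprint = morphology.disk(3)
dilated = morphology.dilation(mask, footprint)
```

Both `edges` and `dilated` are plain NumPy arrays, so they could be handed to OpenCV functions without any conversion.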

PIL/Pillow

The Python Imaging Library (PIL) can be used to manipulate images in a fairly easy way. PIL hasn't had any changes or development since 2009.

Pillow is a fork of PIL, maintained by Alex Clark and other contributors, that has evolved into an improved, modern version. It provides support for opening, manipulating, and saving many image formats. Nearly everything works the same as in the original PIL.

Some of the supported file types are BMP, EPS, GIF, IM, JPEG, PCX, PNG, PPM, TIFF, ICO, PSD, PDF, etc. Some formats can only be read, while others can only be written.

Pillow provides the following set of predefined image enhancement filters: BLUR, CONTOUR, DETAIL, EDGE_ENHANCE, EDGE_ENHANCE_MORE, EMBOSS, FIND_EDGES, SHARPEN, SMOOTH, and SMOOTH_MORE.

This library is widely used for image transformations in web projects because it is more lightweight and easier to use when you don’t need the functionality of OpenCV or scikit-image.
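A minimal sketch of a typical web-style workflow, assuming the `Pillow` package is installed: create an image, resize it, apply two of the predefined filters, and save the result as PNG. The image is generated in memory and written to an in-memory buffer so that the example does not depend on any files; the sizes and color are arbitrary.

```python
import io

from PIL import Image, ImageFilter

# Create a solid-color RGB image in memory instead of opening a file.
image = Image.new("RGB", (64, 64), color=(30, 144, 255))

# Resize it and apply two of the predefined filters.
thumbnail = image.resize((32, 32))
blurred = thumbnail.filter(ImageFilter.BLUR)
edges = thumbnail.convert("L").filter(ImageFilter.FIND_EDGES)

# Save as PNG to an in-memory buffer and read it back.
buffer = io.BytesIO()
blurred.save(buffer, format="PNG")
buffer.seek(0)
reloaded = Image.open(buffer)
print(reloaded.format, reloaded.size, reloaded.mode)
```

In a real project, `Image.open` would usually receive a file path or an uploaded file object, and `save` would write to disk or to a response stream.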

SimpleCV

The SimpleCV library for Python is a reliable way to write simple code for computer vision and image processing. It wraps functionality from other popular CV libraries behind its own interface. As mentioned on the developers' website: “SimpleCV is an open-source framework for building computer vision applications. With it, you get access to several high-powered computer vision libraries such as OpenCV – without having to first learn about bit depths, file formats, color spaces, buffer management, eigenvalues, or matrix versus bitmap storage.”

This library is highly recommended for prototyping. It has easy methods for basic image manipulation as well as useful feature detection, machine learning, segmentation, and tracking.

An interesting resource for learning SimpleCV quickly is this book. As the authors mention: “Learn how to build your own computer vision (CV) applications quickly and easily with SimpleCV, an open-source framework written in Python. Through examples of real-world applications, this hands-on guide introduces you to basic CV techniques for collecting, processing, and analyzing streaming digital images. You’ll then learn how to apply these methods with SimpleCV, using sample Python code. All you need to get started is a Windows, Mac, or Linux system, and a willingness to put your CV to work in a variety of ways. Programming experience is optional.”

Pgmagick

Pgmagick is a very good multipurpose image processing library for Python. It is a wrapper for GraphicsMagick, which was originally derived from ImageMagick.

It has a wide array of useful features:

  • Convert an image from one format to another (e.g. TIFF to JPEG).
  • Resize, rotate, sharpen, reduce color, or add special effects to an image.
  • Create a montage of image thumbnails.
  • Create a transparent image suitable for use on the Web.
  • Compare two images.
  • Turn a group of images into a GIF animation sequence.
  • Create a composite image by combining several separate images.
  • Draw shapes or text on an image.
  • Decorate an image with a border or frame.
  • Describe the format and characteristics of an image.

This makes Pgmagick a powerful general-purpose image tool for many backend tasks. Note that image processing is multi-threaded using OpenMP, which means throughput can scale with the number of processor cores available on the machine. This is a key feature for batch image processing when you need to process millions of files.

Conclusion

In conclusion, digital image processing tasks are much easier to tackle now than they were, say, 5-10 years ago. This is thanks to the development of various libraries and processing methods, as well as advances in computing hardware and specialized processors.

In addition, these algorithms and methods are becoming more convenient to use. This is achieved through scripting languages, and if necessary, you can write your part of the algorithm in fast C++ and connect it to the scripting language, for example, using SWIG.

This reduces the amount of code that needs to be written to call a particular method from the library. For example, you can compare the amount of code in Python and C++ for a typical image processing library.

Our developers at Svitla Systems are highly qualified and have proven their competence in a variety of projects related to image processing and computer vision.

You can contact our Svitla Systems specialists for solutions in the field of digital image processing in information systems and computer vision tasks in various application areas, ranging from image processing in medical fields to drones for video stream analysis.