Frequently Asked Questions

Computer vision is a form of artificial intelligence (AI) that trains machines to interpret and understand the visual world. Using visual data from the real world, machines can be taught to accurately identify and classify objects, and make a decision or take some action based on what they “see.”

In supervised learning, humans are in the loop. They annotate, or label, visual data that can be used to teach the machine to recognize, and sometimes track, the objects it is designed to detect. In unsupervised learning, unlabeled data is used to find patterns in the data.

Convolutional neural networks (CNNs or ConvNets) are commonly used in deep learning for computer vision, along with other algorithms. Common applications of computer vision include AgTech (or farmtech) to optimize food production and distribution, medical AI for detecting disease, device security, and autonomous vehicles.

One of the most cited references on this topic is a book, Computer Vision Algorithms and Applications, written by Richard Szeliski, based on his lectures at the University of Washington and Stanford University. While the first version is dated 2010, it provides an excellent resource for foundational knowledge about algorithms and applications of computer vision.

What is computer vision?

What are algorithms and applications of computer vision?