The world of computer vision

Tom Alabaster
8 May 2019

Examples of using computer vision

This blog post is all about computer vision. There are some really cool uses for computer vision, so let's get the creative part of your brain in gear and read on!

There is much more to computer vision, machine learning and augmented reality than we talk about here but this should give you some insight into what's possible.

Computer vision: you mean computers can see?

Yes, computers can see! However, what they can see is vastly different to what humans see. Computers only see what they’re told they can see.

If you give a computer (one set up to recognise and classify images) 10 images of a dog and tell it “this is a dog”, it will likely recognise the next image of a dog you give it as a dog.

If, however, you give a computer 10 images of a dog and tell it “this is a cat”, it will likely recognise the next image of a dog you give it as a cat. This is the machine learning aspect of computer vision. The computer only knows what you tell it.
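The idea that the computer only knows what you tell it can be sketched with a toy classifier. This is a deliberately minimal illustration, not how a production image classifier works: the “images” here are just two-number feature vectors, and the model is a simple nearest-centroid lookup.

```python
# Toy nearest-centroid classifier (illustrative only - real image
# classifiers use neural networks over raw pixels, not 2-number vectors).

def train(images, labels):
    """Average the feature vectors for each label into a centroid."""
    groups = {}
    for image, label in zip(images, labels):
        groups.setdefault(label, []).append(image)
    return {
        label: [sum(vals) / len(vals) for vals in zip(*members)]
        for label, members in groups.items()
    }

def classify(model, image):
    """Return the label whose centroid is closest to the image."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: dist(model[label], image))

# Ten dog-like images deliberately labelled "cat", plus some bananas:
model = train(
    [[0.9, 0.1]] * 10 + [[0.1, 0.9]] * 10,
    ["cat"] * 10 + ["banana"] * 10,
)
print(classify(model, [0.85, 0.15]))  # prints "cat" - it only knows what we told it
```

Swap the “cat” labels for “dog” and the same input image comes back as a dog: the data and labels, not any built-in knowledge, determine what the computer “sees”.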

Computers also don’t see a “box”. Computers see colours (as numbers), edges, gradients, corners and textures – we then tell computers that a particular combination of these features, to a degree of accuracy, is a “box”.
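To make “colours as numbers, edges and gradients” concrete, here is a toy one-dimensional example: a grayscale image is just a list of brightness values, and an edge shows up as a large difference between neighbouring values. (Real edge detectors, such as Sobel filters, apply the same idea in two dimensions.)

```python
# A grayscale row of pixels: just numbers, dark region then bright region.
row = [10, 12, 11, 200, 205, 202]

# Horizontal gradient: the difference between each pixel and its neighbour.
gradient = [abs(b - a) for a, b in zip(row, row[1:])]
print(gradient)  # [2, 1, 189, 5, 3] - the big jump is the edge

# The edge sits where the gradient is largest:
edge_index = max(range(len(gradient)), key=lambda i: gradient[i])
print(f"edge between pixels {edge_index} and {edge_index + 1}")
```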

Uses for computer vision: classification

One main use of computer vision is for image classification.

Image classification is the process of giving a computer one or more images and having it tell you, with a level of confidence, what the subject of each image is. It’s a typical task to hand over to computers, as they’re far faster at it than a human doing it manually. Only where the computer’s confidence falls below a given threshold will images need to be classified manually by a human.
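That human-in-the-loop thresholding can be sketched in a few lines. The product labels and confidence scores below are invented for illustration:

```python
# Predictions below a confidence threshold are routed to a human reviewer
# instead of being accepted automatically. (Labels/scores are made up.)

CONFIDENCE_THRESHOLD = 0.8

predictions = [
    ("baked_beans", 0.95),
    ("cereal", 0.62),   # too uncertain - a human should check this one
    ("pasta", 0.88),
]

auto_accepted = [(label, score) for label, score in predictions
                 if score >= CONFIDENCE_THRESHOLD]
needs_review = [(label, score) for label, score in predictions
                if score < CONFIDENCE_THRESHOLD]

print(auto_accepted)  # [('baked_beans', 0.95), ('pasta', 0.88)]
print(needs_review)   # [('cereal', 0.62)]
```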

Classification can be done in real time (e.g. from a camera feed) or on a batch of images. For this demo, I’m doing real-time classification on a set of food products. I wanted to choose items that are from the same class of objects (food items) but have very different appearances.

The confidence is shown in the top left of my screen. It’s based on the images I used to train the classifier, which was a small dataset of images that all look pretty much alike, so the results aren’t bad for such a limited dataset. Where the top left says “Detected Negative”, it simply means no known object was detected.

What’s cool about this example is that it’s the kind of classification that could be applied to a whole variety of classes of products. You could use this concept to recognise different medical devices, foods and drinks packaging, medications... anything that a class of people would benefit from the convenience of holding an object in front of the camera and instantly having access to information right on their device.

Uses for computer vision: object tracking

Object tracking has actually been around for a while now in the form of token tracking. Tokens are known markers (such as a branded card) that the app, game or otherwise knows about and is actively looking for. Once the token is found in the camera’s viewport, the app is able to map objects to its position on the screen, so the user sees objects where the token would be in real life.

An example of token-based tracking which we’ve used quite a lot over the past year is the set of Merge Cube apps we’ve produced.

Object and token tracking can be achieved using a token such as a Merge Cube, an image such as a brand logo (in the example below, a branded coaster) or even text in the camera’s viewport!

Uses for computer vision: world tracking

World tracking is what makes augmented reality (AR) work well.

World tracking does what it says – it tracks the world around the device. It uses the camera to recognise key points in the world (known as anchors) and the device’s sensors to detect motion and orientation, so the device can build a virtual world to map objects onto.

We can use world tracking to simulate how objects will look in certain spaces as world tracking can provide measurements of planes and edges. It could be used to show plans for a building in 3D, or even for use in games – you could have an AR game map appear on your desk or have a tower of cubes come out of your floor.
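Under the hood, “mapping objects to a virtual world” comes down to transforms: each anchor carries a position and orientation in the device’s virtual coordinate space, and placing a virtual object at an anchor means applying the anchor’s transform to the object’s local coordinates. Here is a simplified, translation-only sketch (real AR frameworks use full 4x4 matrices that also encode rotation):

```python
# Simplified sketch: place a virtual cube corner at an anchor's position.
# Real AR frameworks represent anchors as full 4x4 transforms (rotation +
# translation); this example uses a translation-only transform for clarity.

def apply_transform(matrix, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    vec = list(point) + [1.0]
    result = [sum(matrix[row][col] * vec[col] for col in range(4))
              for row in range(4)]
    return result[:3]

# An anchor detected 2 m in front of the device and 0.5 m below it:
anchor_transform = [
    [1.0, 0.0, 0.0,  0.0],
    [0.0, 1.0, 0.0, -0.5],
    [0.0, 0.0, 1.0, -2.0],
    [0.0, 0.0, 0.0,  1.0],
]

corner_local = [0.25, 0.25, 0.25]  # corner of a 0.5 m cube, relative to the anchor
corner_world = apply_transform(anchor_transform, corner_local)
print(corner_world)  # [0.25, -0.25, -1.75]
```

Every frame, the AR framework updates the anchor transforms from camera and sensor data, re-applies them to the virtual objects, and the objects appear locked in place in the real world.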


The uses for computer vision are far greater today than ever before. This is thanks to modern smartphone technology that is powerful enough to perform these operations in a fraction of the time devices from even five years ago could manage.

Not only that, but the accessibility of devices that can perform computer vision tasks is higher than ever before. Overnight, Apple rolled out an update that made millions of iOS devices capable of world tracking and on-device image classification. Android is in a similar position, with Google providing the tools for Android developers to incorporate similar capabilities on the device or in the cloud.

As a result of the incredible achievements of modern mobile technology, the onus is now on us, as creators, developers and innovators, to come up with the real-world use cases for computer vision that empower our end users and make their lives easier.

Do you have an idea that could make use of computer vision or machine learning? Get in touch so we can help bring your (real-world) vision to life!
