ARTIFICIAL INTELLIGENCE | 3D COMPUTER VISION
The difficulty with visual semantic
Computers today have trouble recognising the shapes within a scene, never mind what the objects are or how these objects exist in relation to one another. But that won't always be the case...
20.05.2019 // 3 minute read
What is happening here? It’s easy (though getting harder every day) to invent tasks that a single human can do faster than a computer. Humans, for example, are still far better at recognising the objects within a scene and guessing what just happened. To test this theory, we asked a visitor to our office what they thought had happened in the picture scribbled on our whiteboard. “The kid knocked over the vase and the dog is investigating”. Correct. But for a moment let us depict the logic that goes into arriving at the correct assumption…
We must identify the shapes to understand what the objects are - we call this semantic object classification
We must interpret the clues to arrive at the most logical conclusion given the facts at hand
For a human brain, this is an easy task - for a computer not so much. Our office visitor cleverly rejected alternate hypothesis including:
The dog knocked over the vase
The dog jumped out of the vase at the kid
The kid was being chased by the dog and tried to climb up the dresser with a rope to escape
There’s a wild dog in the house and somebody threw a vase at it
The dog was mummified in the vase, but arose when the kid touched it with a magic rope
The kid and the dog are running around trying to catch a snake. The kid finally caught it and tied a knot in it
Computers today have trouble recognising the shapes within a scene, never mind what the objects are or how these objects exist in relation to one another.
At SenSat we build digital simulations of the real world to help computers solve complex problems. Our simulated reality platform Mapp allows companies operating in physical domains, such as infrastructure construction, to make more informed decisions based on real site data. Starting at the first rung of cognition - geometric recognition and contextual understanding - our 3D visual cortex identifies and contextualises objects so that they have semantic value within a scene, allowing us to analyse and train AI based on real-world data.