Artificial intelligence showdown: Google Lens vs Bixby Vision vs Huawei HiVision
Recently, smartphone manufacturers have been slapping the AI tag on pretty much every feature they offer. From cameras to RAM management, supposedly everything uses artificial intelligence and machine learning for better performance and personalized results. Most of these implementations you can’t really interact with directly, but there’s one you can see working before your very eyes.
We’re talking about real-world object recognition – the ability of a phone to recognize what its camera is pointed at. The way this works is not as futuristic as you might think, although the results can still be impressive. Put simply, services utilizing the technology compare the image from your phone against a database of tagged images – in other words, images that have been identified and described with labels (usually by humans). The best matches are shown to you as results. Because these databases are far too big to fit on your phone, the services need an internet connection to work.
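The matching step described above can be sketched in a few lines. This is a toy illustration, not any vendor's actual pipeline: real services extract rich image embeddings with neural networks and search enormous labeled databases, while the tiny hand-made vectors and labels below are stand-ins we invented for the example.

```python
import math

# Toy "database" of labeled feature vectors. Real services extract
# rich image embeddings with neural networks; these tiny hand-made
# vectors are stand-ins for illustration only.
DATABASE = {
    "plum":      [0.9, 0.1, 0.2],
    "croissant": [0.1, 0.8, 0.3],
    "stork":     [0.2, 0.3, 0.9],
}

def cosine(a, b):
    """Cosine similarity, a common metric for embedding lookups."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def best_match(query):
    """Return the database label whose vector is most similar to the query."""
    return max(DATABASE, key=lambda label: cosine(query, DATABASE[label]))

# An embedding close to the "plum" entry should match it.
print(best_match([0.85, 0.15, 0.25]))  # plum
```

The "best matches" each app shows you are simply the top few entries under a similarity ranking like this, which is also why the apps disagree: they rank against different databases.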
The main players in this field are Google with Google Lens, Samsung with Bixby Vision, and Huawei with HiVision. The three rely on different databases: Google has developed its own, Samsung leans heavily on Pinterest for its object recognition functionality, and Huawei has partnered with Microsoft for its identification. There are some variations in what each app can do, which is why we’re focusing on the two main features: recognizing different types of objects and translating text.
Challenge 1: Food recognition
Task 1: Sneaky round fruit
Anyone can recognize a banana, but what about a fruit with a more common shape, like a plum for example? Here’s what we got:
Google Lens left, Bixby Vision middle, HiVision right
Google Lens is short and to the point. Bixby Vision, on the other hand, was all over the place with this one. While the screenshot shows it suggesting a gemstone, other suggestions we didn’t capture included a cricket ball and a radish. No matter how many times we tried, it never came to the conclusion that it was looking at a plum. When it comes to food, on a Huawei device it’s better to use the dedicated food mode, which is powered by a different company and provides more accurate results. In this case, the plum was no match for HiVision.
Task 2: A French breakfast
This one should be relatively easy since the subject (a croissant) has a distinct shape. Let’s see the results:
Google Lens left, Bixby Vision middle, HiVision right
A respectable showing from all three contenders, but Bixby Vision seemed a bit unsure. The words within the circle changed rapidly from one type of pastry to another, which made it hard to capture the moment it said “croissant” (you can spot it if you enlarge the middle screenshot). The croissant in question didn’t have a chocolate filling, but since smartphones don’t have X-ray vision (yet?), we’ll turn a blind eye to Huawei’s mistake.
Task 3: Calorie bomb
This is a more challenging task, since candy bars often have similar shapes and textures, which is why we decided to break our test subject apart and give the phones a bit more to work with. Here’s how they did:
Google Lens left, Bixby Vision middle, HiVision right
This round, it was Google Lens’ turn to slip. It seems Google hasn’t spent much time tagging pictures of desserts, so the software assumed the Snickers bar was some other type of multilayered candy. Meanwhile, Bixby Vision and HiVision were quick to point out that it’s a Snickers we’re looking at.
Challenge 2: Animals… kind of
Task 1: Guess the bird
Granted, what’s shown below is not a real bird but it appears to be close enough to fool the matching algorithms of our AI helpers. Here’s what they showed when faced with a stork garden decoration:
Google Lens left, Bixby Vision middle, HiVision right
As usual, Bixby is a bit vaguer, but if you look through the image results it suggests, you’ll find quickly enough that the bird is, in fact, a stork.
Task 2: What breed is that dog?
Here things get challenging, as dog breeds often look similar, which makes it particularly difficult for the AI to give a correct match. That’s especially true of the dog that served as our model, which belongs to a breed that’s not particularly popular globally. But first, let’s see what the phones’ best guesses were.
Google Lens left, Bixby Vision middle, HiVision right
Here, the results really surprised us. We didn’t expect any phone to get it right, but Google Lens nailed it once again. Karakachan, also known as the Bulgarian Shepherd, is the actual breed of the dog. It’s even more impressive considering the coloring is not typical for the breed, which mostly consists of dogs with black and white fur. The other two did fairly well considering the challenging task; their results were acceptable, even if technically incorrect.
Task 3: What animal is that plushie?
Time for something more abstract. We saw this goofy-looking sheep plushie and decided to test whether the software would be able to recognize what animal it is despite the weird proportions. The results were hit and miss.
Google Lens left, Bixby Vision middle, HiVision right
In its typical style, Google’s result looks like the software is bored with your constant questions and just spits out “sheep”, which is indeed correct. However, we can’t blame the other two apps for suggesting “toy”, since the plushie is obviously more a toy than a real sheep. Still, Bixby Vision had a hard time realizing there was only one object it needed to recognize and suggested similar images of pies and other whipped-cream-decorated pastries. At least it’s amusing, if not very helpful.
Challenge 3: Products
A big part of the marketing of these apps focuses on the way they can recognize different products, so you can buy them or just get more information about them while you’re on the go. So, we decided to test them with products of varying popularity.
Task 1: Mysterious white object
While for most people the AirPods case is easily recognizable, its shape can be tricky for algorithms to correctly identify. Or is it?
Google Lens left, Bixby Vision middle, HiVision right
Almost perfect results! We say almost because HiVision’s first suggestions were AirPods lookalikes/knockoffs, which is not ideal. Bixby Vision thought for a second that the case was a bar of soap but quickly got to the right product. It seems the abundance of pictures of the AirPods charging case helps with recognition quite a bit.
Task 2: Tiny dark lord
This task is both easy and hard. On one hand, Darth Vader’s helmet is one of the most recognizable objects in pop culture. On the other, there are thousands of products that use it. So, how accurate can the apps be?
Google Lens left, Bixby Vision middle, HiVision right
Well, what do you know? Three out of three! The exact keychain with light-up LED eyes was in the top results of each app. Quite impressive. Time for the final round!
Task 3: Cool shades
Now, most sunglasses have a similar shape, which would make the task too difficult, so we chose a pair with a more distinct look from a popular brand.
Google Lens left, Bixby Vision middle, HiVision right
Google Lens and HiVision share the top spot in this one, both suggesting the exact Dolce & Gabbana sunglasses that were in front of them. Samsung’s suggested pair was close enough but still not the one in question.
Challenge 4: Text translation
Real-time text translation is probably the most useful feature these apps provide. Being able to quickly check what a piece of text in a foreign language means can make your trips abroad a lot easier. Time to see how our three AI contenders will perform the task.
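Under the hood, translate-what-you-see features typically chain two stages: text recognition (OCR) on the camera frame, then machine translation of the recognized string. Here is a minimal sketch of that pipeline; both stages are stubbed with toy stand-ins (the German phrase and the one-entry phrase table are our own hypothetical examples, not any app's real service).

```python
# A minimal sketch of the translate-what-you-see pipeline. Assumption:
# apps like these chain an OCR stage with a machine translation stage;
# both are stubbed with toy stand-ins here for illustration.

def recognize_text(image_bytes):
    """Stub OCR stage: a real app runs a text-detection model here."""
    return "Hunde bitte an der Leine führen"

# Hypothetical phrase table standing in for a translation service.
PHRASE_TABLE = {
    "Hunde bitte an der Leine führen": "Please keep dogs on a leash",
}

def translate(text, target="en"):
    """Stub translation stage: look the phrase up, fall back to the original."""
    return PHRASE_TABLE.get(text, text)

def lens_style_translate(image_bytes):
    """Chain the two stages; real apps also overlay the result on screen."""
    return translate(recognize_text(image_bytes))

print(lens_style_translate(b"<camera frame>"))  # Please keep dogs on a leash
```

The overlay quality differences you'll see below come from a third step the sketch omits: rendering the translated text back over the original sign in roughly the same position and style.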
Task 1: A warning sign in German
You’re walking around a park in Germany when you see a sign that clearly says something important. You don’t know any German and you don’t want to get in trouble, so you pull out your phone and let the powers of AI translate it. Here’s what you get:
Google Lens
Bixby Vision
HiVision
All three will give you enough information about what the sign is warning you not to do, but Huawei’s translation slightly edges out the other two by including the word “lead”. The original sign says: “All dogs must be walked on a leash! Excluding guide dogs.”
Task 2: A warning sign in Japanese
Similar scenario, but this time you’re in Japan. Just checking if you have anything to be concerned about:
Google Lens
Bixby Vision
HiVision
Again, pretty clear: if you have a car, that’s not the place to park it. You never know when those firefighting activities will break out! The overlays are far from ideal but they get the point across, which is what matters in this case.
Task 3: Text in French
Time to take things to another level. Suppose you have a piece of text in an unknown language and you want to know what it’s about. Well, time for your smartphone to prove how smart it is. You scan the text, and here are the results:
Google Lens left, Bixby Vision middle, HiVision right
We don’t know what’s going on with Bixby Vision here, but if we were Google, we’d want to have a word with Samsung about putting “translated by Google” under that abomination. You can see that both Google Lens and HiVision translate the text well enough for you to understand what the story is about and soak up its wisdom. Google Lens gets extra credit for overlaying the translation better; Huawei’s looks a bit like a ransom note.
Here’s the actual text of the popular fable about the crow and the fox:
Mr. Crow, sitting in a tree,
Held a piece of cheese in his beak.
Mr. Fox, mouth watering from the scent,
Uttered almost precisely this to him:
“Hey! Good morning, Mr. Crow.
How lovely you are! You look so beautiful!
Without lying, if your songs
Are in keeping with your feathers,
You are the Phoenix of the inhabitants of these woods.”
With these words the Crow feels nothing but delight.
And to show off his beautiful voice,
He opens a wide beak and lets his prey fall.
The Fox grabs it and says: “My dear sir,
Learn that every flatterer
Lives at the expense of the one who listens to him.
This lesson is worth a piece of cheese, no doubt.”
The Crow, ashamed and embarrassed,
Swore, but a bit late, that he would never be fooled again.
Final thoughts
Time to talk about how it feels using each of the apps. Google Lens is the most intuitive one: once it recognizes an object, a dot shows up, you tap it and get more information. Sometimes, however, it would just continue scanning without picking up the object that’s right in front of you. But scanning the object from a different angle may help. Overall, it’s currently the most polished and useful app of the three we tested.
Samsung’s Bixby Vision is hit or miss – but mostly miss. It took the longest to get accurate results, which we only know because we already knew what the correct answers were. If you’re actually relying on Bixby Vision to identify something for you, luck will be a big factor. Suggestions sometimes change multiple times a second and vary wildly between all sorts of objects. It would be better if the app just chose one answer and stuck with it, even if it’s the wrong one, instead of throwing random words at you hoping to get it right eventually.
Huawei’s HiVision did quite well in our tests and can definitely be useful in certain situations. Sometimes, though, it gives a bit too much information. If you have an object on a table, you don’t need the app to tell you that there’s a table in the picture as well, or that there’s a hardwood floor in the background. Still, that’s a minor annoyance. What the developers need to work on is a more pleasing design. The transparent text boxes look very dated and give the whole app a somewhat gimmicky vibe, which is unfortunate.
The good thing about this type of software is that the longer it exists and the more people use it, the better it gets. And if we’re already seeing some pretty good results now, imagine what will be possible in a few years. We don’t want to get too deep into creepy territory, but it’s not impossible that someday you’ll be able to point your phone at a person and get their name and email as a result. Still, it’s exciting to see where the technology will take us, and a similar test in a year or two could say a lot about how fast things are moving. Stay tuned!