
Did Google fake its Google Duplex demonstration?

Source: https://www.androidauthority.com/google-duplex-calls-edited-faked-866951/

 

So yeah, it's a tin foil hat thingie, but I did feel that the demo was too perfect. One of the things that raised my skepticism was how natural the Google Assistant voice sounded (not talking about the fillers but rather the pronunciation and the continuity). But now more reputable media outlets (Axios) have done a more thorough check, and it kinda raises questions.

Quote

The publication notes that employees “almost always” identify their workplace when answering a call. In the case of both Google Duplex calls, the employee merely greets and asks the caller if they need help. No “welcome to [insert salon]” or “hi, I’m [insert name].”

To prove the point, the outlet contacted over two dozen hair salons and restaurants, including a few in Google’s home city of Mountain View. Every person identified their business at the beginning of the call.

 

Maybe they edited out the outlet and employee name to avoid unwanted attention, but they could've easily used a well-established outlet and a fake name.

Quote

The Google Duplex calls also lacked any sort of background noise you’d expect to hear from a restaurant or hair salon. Axios noted that ambient noise was heard in most of the calls they made, though there were a few exceptions.

The third concern was that in the two calls played at I/O, neither employee asked for the assistant’s contact details. When was the last time you made a restaurant appointment and didn’t get asked for your phone number or email address?

 

The second and third may not necessarily need to happen, but Axios did contact a bunch of restaurants and hair salons, and there was background noise and they did ask for more customer details.

Quote

Vanity Fair also raised the issue of recording legalities, as California requires both parties to consent for a call to be recorded. This means Google had to request consent to record the employee beforehand — and if we are hearing the entire calls Google made, as we’re lead to believe, that doesn’t happen.

What adds fuel to the fire is the fact that Google hasn't responded to any of these claims and is seemingly being very tight-lipped about all this. 

 

It's a pretty impressive demo, no doubt. But I personally felt that this was a huge jump in voice assistant advancement given what we have today. Yes, Google Assistant is really good, but I don't think it would fare well with accents, especially the one in the second call.


What I'd personally be sceptical about is the intonation of the presumed AI voice. Why the fuck would an AI assistant seem to hesitate on phrasing or things like that?

That aside, as with everything scientifically based, until we get a patent or a fully fledged paper on how it works, we can't assess its quality, and if we can't even play around with it we can't know it actually works (which it probably doesn't right now, or at least not totally, otherwise they'd have happily pushed it to the Pixel phones or to Android O powered phones).

To be fair, background noise can be filtered out, and personally I've never had any restaurant or hair salon ask me for anything other than my name. So it may just be that they tested it in a region where that's the norm. I'd guess in their Zurich lab in Europe. If chosen right, the restaurants/hair salons will know how to speak English quite decently.

As I said, I'd be more suspicious about the way they talked, which could be explained by their training set (though I have no idea how they structured it).


1 minute ago, laminutederire said:

What I'd personally be sceptical about is the intonation of the presumed AI voice. Why the fuck would an AI assistant seem to hesitate on phrasing or things like that?


They do that on purpose. Google says it's to give the AI time to process and compute information, just like why we use them, lol. Also, it's to make it sound natural.


4 minutes ago, Sampsy said:

You could have stopped there. Google will have purposefully chosen a call they made which didn't reveal contact information and which had clean audio because they wanted to use it on stage.

 

But does having clean audio matter? If anything, some background noise would have made the demo even more convincing. Or maybe Duplex can't cope when there's too much background noise, which realistically is inevitable.

 

 


11 minutes ago, Sampsy said:

Calling a couple of salons is anecdotal evidence at best and doesn't even come close to proof of anything. And it's perfectly normal for businesses to simply take a name when making reservations.

The point isn't to prove anything.  Just explaining why people have doubts.  To my knowledge, there isn't a shred of evidence that this wasn't staged either.  That doesn't mean it's fake, but the burden of proof does lie on them.


There are many audio tools, software, algorithms, etc. that could be used to take out background noise and so on.


33 minutes ago, Sampsy said:

Honestly what is more likely? The world leader in AI has... improved their AI. Or that said leader would risk a PR catastrophe by lying about the state of their technology. 

I think it's more likely that Google, a company not exactly known for their morals and ethics, simply elected to conveniently ignore a few laws in order to push their product demo through. 


Background noise can be filtered out, as people have said. And it makes sense that Google would do that. A call with a lot of background noise wouldn't be a great demo. As for contact details, neither my hair salon nor the restaurants I've called over the last year or so ever asked for them.

But I've been a regular at my hair salon for like ten years and even went to school with my hairdresser, so my personal experience is pretty moot.


Occam's Razor applies here: it's far more likely that Google just chose the best examples from the calls it conducted and didn't want ones that identified the businesses (to prevent crank calls).

 

The notion that Google went out of its way to drastically edit or fake the calls would require active evidence of manipulation, not just "well it sounded too good."


These days, there are fewer and fewer actual demos on stage. Everything is scripted in detail and faked more and more.

No one wants humiliations like the ones that most notably happened to Microsoft, with Windows 98 BSODing on them and Cortana not working in its first reveal ages ago, plus Nintendo controllers not working, Apple's failed iPhone 4 demo, and lots more.

Things look impressive and get people talking when they run perfectly on stage, and that is what they want. By the time the product is released, most people will have forgotten the details of the demo, but the good first impression sticks. If there are complaints, just slap "in the works" or "Beta" on it, and now people won't be as critical. For example, if you have a Google Home speaker, you'll quickly realize that there are many things you can't do. And instead of jumping to some web result, or trying to do the best it can, it goes "I am sorry, I can't do this........... yet!", making you believe it is coming, when probably nothing is planned for whatever you requested. Oh, and it still doesn't support conversations, a feature that was demoed as part of Google Assistant on Android. Past commands aren't kept in context, meaning: "Ok Google, lower the volume", "lower it more" won't work past the first command.

Personally, I find it kind of sad, as a live demo used to add a certain level of uncertainty about whether it would actually work or not, and added a human aspect to it. It is a stress test of sorts, on stage. It's like trailers: how many trailers have shown the best (or only) jokes in a comedy movie to make you think there are more, only for you to go see the movie and be disappointed, if you didn't read reviews and avoid it in the first place?

 

 

Anyway, all to say:

 - I am not surprised

 - Wait for the product to be released before judging.


 

4 hours ago, Sampsy said:

 

Honestly what is more likely? The world leader in AI has... improved their AI. Or that said leader would risk a PR catastrophe by lying about the state of their technology. 

A little from column A and a little from column B.


2 hours ago, GoodBytes said:

 

No one wants humiliations like it most notably occurred with Microsoft with Windows 98 BSOD on them, Cortana not working in its first reveal ages ago, Nintendo controllers not working, Apple failed iPhone 4 demo, and lots more.

And the CLOi failure was only a few months ago:

 


Is this just overhype like when Google Glass first got announced?


Yep, I thought it was fake too. It talked way too fluently and the responses were extremely human. The fact that they showed the video on a screen already had me skeptical; if they had had the phone in their hand and asked it to do it, then maybe it'd be more believable.


3 hours ago, GoodBytes said:

These days, there are fewer and fewer actual demos on stage. Everything is scripted in detail and faked more and more.

No one wants humiliations like the ones that most notably happened to Microsoft, with Windows 98 BSODing on them and Cortana not working in its first reveal ages ago, plus Nintendo controllers not working, Apple's failed iPhone 4 demo, and lots more.

Things look impressive and get people talking when they run perfectly on stage, and that is what they want. By the time the product is released, most people will have forgotten the details of the demo, but the good first impression sticks. If there are complaints, just slap "in the works" or "Beta" on it, and now people won't be as critical. For example, if you have a Google Home speaker, you'll quickly realize that there are many things you can't do. And instead of jumping to some web result, or trying to do the best it can, it goes "I am sorry, I can't do this........... yet!", making you believe it is coming, when probably nothing is planned for whatever you requested. Oh, and it still doesn't support conversations, a feature that was demoed as part of Google Assistant on Android. Past commands aren't kept in context, meaning: "Ok Google, lower the volume", "lower it more" won't work past the first command.

Personally, I find it kind of sad, as a live demo used to add a certain level of uncertainty about whether it would actually work or not, and added a human aspect to it. It is a stress test of sorts, on stage. It's like trailers: how many trailers have shown the best (or only) jokes in a comedy movie to make you think there are more, only for you to go see the movie and be disappointed, if you didn't read reviews and avoid it in the first place?

 

 

Anyway, all to say:

 - I am not surprised

 - Wait for the product to be released before judging.

Long live Bill Gates' presentation of USB hot swap, haha.


2 hours ago, JoostinOnline said:

I'd like to just point out that Google never actually says it's not scripted.

 

So basically, lying by omission? 


8 hours ago, RedRound2 said:

When was the last time you made a restaurant appointment and didn’t get asked for your phone number or email address?

I'm going to super-nitpick here. Who actually calls it a restaurant appointment? GG writer.

 

But yeah, it seemed way too natural. Why would Google program "mhms", "uhhs" and "umms" in it? Yes, it sounds more natural, but there's really little point. I'm not sure that most people would even notice the lack of them.


51 minutes ago, abazigal said:

So basically, lying by omission? 

Personally I wouldn't consider it lying, but the point I'm making is that if it turns out to be scripted, they can respond "We didn't say it wasn't". There's definitely no course of legal action.


8 minutes ago, TheSLSAMG said:

I'm going to super-nitpick here. Who actually calls it a restaurant appointment? GG writer.

 

But yeah, it seemed way too natural. Why would Google program "mhms", "uhhs" and "umms" in it? Yes, it sounds more natural, but there's really little point. I'm not sure that most people would even notice the lack of them.

They said it's to stall for time while it's processing, but other times it remains silent, so I'm guessing it only does it occasionally so as not to sound too weird. That, and to make it sound more human, but that's pointless to me because it honestly shouldn't sound human in the first place.

 

I don't think adding stuff like this is the hard part. It's processing the speech of the person on the line and formulating a proper reply. They've shown before that they've mostly nailed the natural sounding voice (although I could hear several wrong intonations in sentences in that demo). It's how to use that voice that's the hard part now. Some things seem somewhat hard coded which makes things sound less natural.


7 hours ago, Cheezdoodlez said:

Background noise can be filtered out, as people have said. And it makes sense that Google would do that. A call with a lot of background noise wouldn't be a great demo. As for contact details, neither my hair salon nor the restaurants I've called over the last year or so ever asked for them.


Actually, it would. As long as people can understand what is going on, it would be a great demo and much more convincing. This seemed too much like an ideal-world scenario.

 

7 hours ago, Commodus said:

Occam's Razor applies here: it's far more likely that Google just chose the best examples from the calls it conducted and didn't want ones that identified the businesses (to prevent crank calls).

 

They could've easily used a well-established restaurant chain and a fake name (and anyway, California law requires the consent of both parties for a call to be recorded). The point is, they could've made it a lot more realistic but they didn't, which calls the legitimacy of the demo into question. Plus, Google still hasn't responded to any of this as of now.

 

28 minutes ago, TheSLSAMG said:

I'm going to super-nitpick here. Who actually calls it a restaurant appointment? GG writer.

 

I merely quoted from the source link.

 


If I think about just the hair salon call, Google may have had to call beforehand and ask whether they could please answer the next call (from number xxxx) without giving their business name, because the call was going to be recorded and played to a big audience, if that was OK with them. It doesn't have to be fake. Not saying it isn't, but there are ways to do it.


5 minutes ago, HaveNoGlue said:

If I think about just the hair salon call, Google may have had to call beforehand and ask whether they could please answer the next call (from number xxxx) without giving their business name, because the call was going to be recorded and played to a big audience, if that was OK with them. It doesn't have to be fake. Not saying it isn't, but there are ways to do it.

Maybe, but then again it's basically staging the thing. They could've easily given a script too. The way they demoed it, we are led to believe it's totally a random call and Duplex can handle any situation.


1 minute ago, RedRound2 said:

Maybe, but then again it's basically staging the thing. They could've easily given a script too. The way they demoed it, we are led to believe it's totally a random call and Duplex can handle any situation.

But that doesn't mean they told them that the caller isn't human...
