
The Asian kid is always better - Baidu tops GLUE benchmark

williamcll


Recently, Baidu beat Microsoft and Google on GLUE (General Language Understanding Evaluation), a benchmark that measures an AI system's ability to understand human language. This is the first time a company from a non-English-speaking country has reached the top score.

Quote

Earlier this month, a Chinese tech giant quietly dethroned Microsoft and Google in an ongoing competition in AI. The company was Baidu, China’s closest equivalent to Google, and the competition was the General Language Understanding Evaluation, otherwise known as GLUE. GLUE is a widely accepted benchmark for how well an AI system understands human language. It consists of nine different tests for things like picking out the names of people and organizations in a sentence and figuring out what a pronoun like “it” refers to when there are multiple potential antecedents. A language model that scores highly on GLUE, therefore, can handle diverse reading comprehension tasks. Out of a full score of 100, the average person scores around 87 points. Baidu is now the first team to surpass 90 with its model, ERNIE.

 

The public leaderboard for GLUE is constantly changing, and another team will likely top Baidu soon. But what’s notable about Baidu’s achievement is that it illustrates how AI research benefits from a diversity of contributors. Baidu’s researchers had to develop a technique specifically for the Chinese language to build ERNIE (which stands for “Enhanced Representation through kNowledge IntEgration”). It just so happens, however, that the same technique makes it better at understanding English as well.

To appreciate ERNIE, consider the model it was inspired by: Google’s BERT. (Yes, they’re both named after the Sesame Street characters.) Before BERT (“Bidirectional Encoder Representations from Transformers”) was created in late 2018, natural-language models weren’t that great. They were good at predicting the next word in a sentence (and thus well suited for applications like Autocomplete), but they couldn’t sustain a single train of thought over even a small passage. This was because they didn’t comprehend meaning, such as what the word “it” might refer to.

But BERT changed that. Previous models learned to predict and interpret the meaning of a word by considering only the context that appeared before or after it, never both at the same time. They were, in other words, unidirectional. BERT, by contrast, considers the context before and after a word all at once, making it bidirectional. It does this using a technique known as “masking.” In a given passage of text, BERT randomly hides 15% of the words and then tries to predict them from the remaining ones. This allows it to make more accurate predictions because it has twice as many cues to work from. In the sentence “The man went to the ___ to buy milk,” for example, both the beginning and the end of the sentence give hints at the missing word: the ___ is a place you can go and a place you can buy milk.
The use of masking is one of the core innovations behind dramatic improvements in natural-language tasks and is part of the reason why models like OpenAI’s infamous GPT-2 can write extremely convincing prose without deviating from a central thesis.
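The masking objective the article describes can be sketched in a few lines. This is a toy illustration of the idea only, not BERT's actual preprocessing code; the function name is made up here, and the 15% default comes from the description above.

```python
import random

MASK = "[MASK]"

def bert_style_mask(tokens, mask_rate=0.15, rng=None):
    """Randomly hide ~15% of the tokens and return the masked
    sequence plus the position -> original-word labels the model
    would be trained to predict from the surrounding context."""
    rng = rng or random.Random()
    n = max(1, round(len(tokens) * mask_rate))  # mask at least one token
    positions = rng.sample(range(len(tokens)), n)
    masked = list(tokens)
    labels = {}
    for i in positions:
        labels[i] = masked[i]
        masked[i] = MASK
    return masked, labels

sentence = "the man went to the store to buy milk".split()
masked, labels = bert_style_mask(sentence, rng=random.Random(0))
print(" ".join(masked))
```

Because the model sees the words on both sides of every `[MASK]`, it can use bidirectional context to fill in the blank, which is the point of the technique.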

 

 

When Baidu researchers began developing their own language model, they wanted to build on the masking technique. But they realized they needed to tweak it to accommodate the Chinese language.

In English, the word serves as the semantic unit—meaning a word pulled completely out of context still contains meaning. The same cannot be said for characters in Chinese. While certain characters do have inherent meaning, like fire (火, huŏ), water (水, shuĭ), or wood (木, mù), most do not until they are strung together with others. The character 灵 (líng), for example, can either mean clever (机灵, jīlíng) or soul (灵魂, línghún), depending on its match. And the characters in a proper noun like Boston (波士顿, bōshìdùn) or the US (美国, měiguó) do not mean the same thing once split apart.

So the researchers trained ERNIE on a new version of masking that hides strings of characters rather than single ones. They also trained it to distinguish between meaningful and random strings so it could mask the right character combinations accordingly. As a result, ERNIE has a greater grasp of how words encode information in Chinese and is much more accurate at predicting the missing pieces. This proves useful for applications like translation and information retrieval from a text document.

The researchers very quickly discovered that this approach actually works better for English, too. Though not as often as in Chinese, English similarly has strings of words that express a meaning different from the sum of their parts. Proper nouns like “Harry Potter” and expressions like “chip off the old block” cannot be meaningfully parsed by separating them into individual words.

 

So for the sentence:

Harry Potter is a series of fantasy novels written by J. K. Rowling.

BERT might mask it the following way:

[mask] Potter is a series [mask] fantasy novels [mask] by J. [mask] Rowling.

But ERNIE would instead mask it like this:

Harry Potter is [mask] [mask] [mask] fantasy novels by [mask] [mask] [mask].
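The difference between the two masking strategies can be illustrated with a toy sketch that hides known multi-token units whole. The phrase list and function name here are invented for illustration; ERNIE identifies such spans during pretraining rather than reading them from a fixed lexicon.

```python
MASK = "[MASK]"

# Hypothetical phrase lexicon. In the real model these spans are
# learned from data; a hand-written list is enough to show the idea.
PHRASES = [["Harry", "Potter"], ["J.", "K.", "Rowling"]]

def ernie_style_mask(tokens, phrases=PHRASES):
    """Mask whole multi-token units (entities/phrases) as single
    blocks, instead of masking tokens independently as BERT does."""
    masked = list(tokens)
    i = 0
    while i < len(masked):
        for phrase in phrases:
            if masked[i:i + len(phrase)] == phrase:
                masked[i:i + len(phrase)] = [MASK] * len(phrase)
                i += len(phrase) - 1  # skip past the masked span
                break
        i += 1
    return masked

tokens = "Harry Potter is a series of fantasy novels written by J. K. Rowling .".split()
print(" ".join(ernie_style_mask(tokens)))
# → [MASK] [MASK] is a series of fantasy novels written by [MASK] [MASK] [MASK] .
```

Predicting “Harry Potter” with both tokens hidden forces the model to use the meaning of the rest of the sentence, rather than guessing “Potter” from the single cue “Harry”.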

 

ERNIE thus learns more robust predictions based on meaning rather than statistical word usage patterns.

The latest version of ERNIE uses several other training techniques as well. It considers the ordering of sentences and the distances between them, for example, to understand the logical progression of a paragraph. Most important, however, it uses a method called continuous training that allows it to train on new data and new tasks without forgetting those it learned before. This allows it to get better and better at performing a broad range of tasks over time with minimal human interference.
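One common way to train on new tasks without forgetting old ones is to keep replaying batches from every previously learned task whenever a new one is introduced. The schedule below is a schematic of that general idea, not ERNIE's actual training pipeline; the function names and round counts are invented for illustration.

```python
def continual_train(train_step, tasks_in_order, rounds_per_task=3):
    """Toy continual multi-task schedule: each time a new task is
    added, keep interleaving steps from all earlier tasks so the
    model is never trained on the newest task alone (which is what
    causes catastrophic forgetting)."""
    seen = []
    schedule = []
    for task in tasks_in_order:
        seen.append(task)
        for _ in range(rounds_per_task):
            for t in seen:          # replay every task learned so far
                train_step(t)
                schedule.append(t)
    return schedule

# With two tasks, task "A" keeps being revisited while "B" is learned:
log = continual_train(lambda t: None, ["A", "B"])
print(log)
# → ['A', 'A', 'A', 'A', 'B', 'A', 'B', 'A', 'B']
```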

 

 

Baidu actively uses ERNIE to give users more applicable search results, remove duplicate stories in its news feed, and improve its AI assistant Xiao Du’s ability to accurately respond to requests. It has also described ERNIE’s latest architecture in a paper that will be presented at the Association for the Advancement of Artificial Intelligence conference next year. In the same way that their team built on Google’s work with BERT, the researchers hope others will also benefit from their work with ERNIE. “When we first started this work, we were thinking specifically about certain characteristics of the Chinese language,” says Hao Tian, the chief architect of Baidu Research. “But we quickly discovered that it was applicable beyond that.”

Source: https://www.technologyreview.com/s/614996/ai-baidu-ernie-google-bert-natural-language-glue/

https://gluebenchmark.com/leaderboard/

https://github.com/PaddlePaddle/ERNIE

Thoughts: While interesting, it will still take a lot more effort to convince people to use alternatives to Google Translate. I recently ran into its limits on vacation in Japan, where it struggled to phrase Japanese grammar properly in English. I do suggest everyone give Microsoft Translator a try though; the UI honestly feels better than Google's design.

Specs: Motherboard: Asus X470-PLUS TUF gaming (Yes I know it's poor but I wasn't informed) RAM: Corsair VENGEANCE® LPX DDR4 3200Mhz CL16-18-18-36 2x8GB

            CPU: Ryzen 9 5900X          Case: Antec P8     PSU: Corsair RM850x                        Cooler: Antec K240 with two Noctura Industrial PPC 3000 PWM

            Drives: Samsung 970 EVO plus 250GB, Micron 1100 2TB, Seagate ST4000DM000/1F2168 GPU: EVGA RTX 2080 ti Black edition


Ernie also drove the fastest milk cart in the west.   What a champ...

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


Dang, I wonder how well/quickly this could be implemented into video games. In a game like Bethesda's epics, allowing characters to go off on tangents, as well as understand the player, and important NPCs can then guide the conversation back on topic "naturally". Would make for a whole 'nother level of immersion. Or having characters have full-on conversations you could eavesdrop on, instead of millions of pre-recorded dialogue options that repeat after a few times.

Spoiler

CPU: Intel i7 6850K

GPU: nVidia GTX 1080Ti (ZoTaC AMP! Extreme)

Motherboard: Gigabyte X99-UltraGaming

RAM: 16GB (2x 8GB) 3000Mhz EVGA SuperSC DDR4

Case: RaidMax Delta I

PSU: ThermalTake DPS-G 750W 80+ Gold

Monitor: Samsung 32" UJ590 UHD

Keyboard: Corsair K70

Mouse: Corsair Scimitar

Audio: Logitech Z200 (desktop); Roland RH-300 (headphones)

 


30 minutes ago, The1Dickens said:

Dang, I wonder how well/quickly this could be implemented into video games. In a game like Bethesda's epics, allowing characters to go off on tangents, as well as understand the player, and important NPCs can then guide the conversation back on topic "naturally". Would make for a whole 'nother other level of immersion. Or having characters have full on conversations you could eavesdrop on, instead of millions of pre-recorded dialogue options that repeat after a few times.

The first problem is that you assume Bethesda will ever code well.



11 hours ago, The1Dickens said:

Dang, I wonder how well/quickly this could be implemented into video games. In a game like Bethesda's epics, allowing characters to go off on tangents, as well as understand the player, and important NPCs can then guide the conversation back on topic "naturally". Would make for a whole 'nother other level of immersion. Or having characters have full on conversations you could eavesdrop on, instead of millions of pre-recorded dialogue options that repeat after a few times.

I can see it now, when talking to an NPC,  you say: "I wish we had a car to get to the house quicker",  NPC responds:  "how much would you pay for that DLC"?



Well, it's still getting a beating, along with questions about where the other 9.9% went.


One more example that a headline often has very little to do with the actual content, and not infrequently actually contradicts it.

Not a pro, not even very good.  I’m just old and have time currently.  Assuming I know a lot about computers can be a mistake.

 

Life is like a bowl of chocolates: there are all these little crinkly paper cups everywhere.


I'll defeat this AI easily


It being China though, probably cheated to get that score.  Is their algo an IM window feeding questions to 30 asian nerds?

Workstation:  13700k @ 5.5Ghz || Gigabyte Z790 Ultra || MSI Gaming Trio 4090 Shunt || TeamGroup DDR5-7800 @ 7000 || Corsair AX1500i@240V || whole-house loop.

LANRig/GuestGamingBox: 9900nonK || Gigabyte Z390 Master || ASUS TUF 3090 650W shunt || Corsair SF600 || CPU+GPU watercooled 280 rad pull only || whole-house loop.

Server Router (Untangle): 13600k @ Stock || ASRock Z690 ITX || All 10Gbe || 2x8GB 3200 || PicoPSU 150W 24pin + AX1200i on CPU|| whole-house loop

Server Compute/Storage: 10850K @ 5.1Ghz || Gigabyte Z490 Ultra || EVGA FTW3 3090 1000W || LSI 9280i-24 port || 4TB Samsung 860 Evo, 5x10TB Seagate Enterprise Raid 6, 4x8TB Seagate Archive Backup ||  whole-house loop.

Laptop: HP Elitebook 840 G8 (Intel 1185G7) + 3080Ti Thunderbolt Dock, Razer Blade Stealth 13" 2017 (Intel 8550U)


36 minutes ago, AnonymousGuy said:

It being China though, probably cheated to get that score.  Is their algo an IM window feeding questions to 30 asian nerds?

Sad, yeah, not sure why but I instantly assumed they cheated too lol


Not long ago, Baidu was one of the most nefarious bundleware companies around. You couldn't remove their software from a computer as a common user; you had to delete registry keys to stop it from reinstalling itself.

Project Diesel 5.0: Motherboard: ASRock Fatal1ty X370 Professional Gaming /// CPU: Ryzen 5 3600X  /// CPU Cooler: Scythe Ninja 5 /// GPU: Zotac AMP Extreme RTX 2070 /// RAM: 2x 16gb G.Skill Ripjaws V @3200mhz /// Chassis: Lian Li Lancool One Digital (black) /// PSU: Super Flower Leadex III 750w /// Storage: Inland Premium 1TB NVME + Toshiba X300 4TB

 

Peripherals: Mice: Cooler Master MM720 /// Keyboard: Corsair K70 MK2 SE (Cherry Silver), Blitzwolf BW-KB1 (Gateron Reds) /// Monitor: Acer XZ320Q 32' (VA, 1080p @240hz) /// AMP: Topping PA3 (Onkyo Integra A-817XD undergoing restoration) /// DAC: Weiliang SU5 /// Speakers: AAT BSF-100 /// Mike: Alctron CS35U /// Headphones: Blon B8, ISK MDH-9000

 

Living room: TV: Samsung QLED Q7FN 55' 4k /// Amplifier: Denon AVR-X2400H /// Speakers: DALI Zensor 7 /// Consoles: Sony PS4 Pro 1TB, Sony PS3 500gb /// LD/CD/DVD: Pioneer DVL-909 /// Power Supplies: Upsai ACF-2100T + GR Savage CDR2200EX

