https://blogs.microsoft.com/ai/machine-translation-news-test-set-human-parity/
Researchers in Microsoft’s Asia and U.S. labs said that their system achieved human parity on a commonly used test set of news stories, called newstest2017, which was developed by a group of industry and academic partners and released at a research conference called WMT17 last fall. To ensure the results were both accurate and on par with what people would have done, the team hired external bilingual human evaluators, who compared Microsoft’s results to two independently produced human reference translations.
Xuedong Huang, a technical fellow in charge of Microsoft’s speech, natural language and machine translation efforts, called it a major milestone in one of the most challenging natural language processing tasks.
“Hitting human parity in a machine translation task is a dream that all of us have had,” Huang said. “We just didn’t realize we’d be able to hit it so soon.”