Tag: Machine Translation

Here’s What Happened at the World’s Biggest Machine Translation Conference

Machine translation is not a small part of the EMNLP mega-conference: it has its own conference stream, the Conference on Machine Translation (WMT). WMT started out in 2006 as a series of workshops held alongside EMNLP and became a full-blown conference in its own right in 2016. EMNLP organizer ACL was in fact originally named the Association for Machine Translation and Computational Linguistics (AMTCL); founded in 1962, it dropped “Machine Translation” from its name six years later, in 1968.

 

Read full article.

memoQ 8.6 is Here!

memoQ 8.6 is our third release in 2018, and we are very excited about the new functionality it brings. The highlight of 8.6 is that it paves the way to a more CMS-friendly translation environment, but like previous versions it also includes enhancements in many areas, including integration, terminology, productivity features, file filters, and user experience. Learn more about the most recent version and see how it will help you be even more productive.

 

Read full list of features.

More than 1 trillion words a day

The world is reaching a new milestone in MT – more than 1 trillion words a day. Machines translate in a single day more than all professional translators on the planet combined can do in a year.
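That comparison is easy to sanity-check with rough numbers. In the sketch below, the translator head-count, daily output, and working days are illustrative assumptions, not figures from the article:

```python
# Back-of-envelope check of the "1 trillion words a day" comparison.
# All inputs except the trillion-word figure are illustrative assumptions.
MACHINE_WORDS_PER_DAY = 1_000_000_000_000  # 1 trillion words/day (the article's figure)

translators = 600_000              # assumed number of professional translators worldwide
words_per_translator_day = 2_500   # assumed human daily output
working_days_per_year = 250        # assumed working days per year

human_words_per_year = translators * words_per_translator_day * working_days_per_year
print(f"Human output per year: {human_words_per_year:,} words")

# With these assumptions, one day of MT output (1e12 words) exceeds
# a full year of combined human output (~3.75e11 words).
print(MACHINE_WORDS_PER_DAY > human_words_per_year)
```

Even if the assumed numbers are off by a factor of two in either direction, the conclusion holds.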

 

Read full article.

Translators and Technology: Friends or Foes?

It is a fact that technology of all kinds is creeping into the translation industry at every level. As a result, some participants in this magical process of transforming a text to fit a different linguistic, cultural, and sociological community can feel quite uneasy, or even anxious. Will machine translation (MT) reach parity with human translation (HT)? Will there still be a need for translators?

 

Read the full article here.

How Augmented Translation Will Redefine the Value of Translators

Norbert Oroszi, CEO of translation software company memoQ, joined the speaker lineup at a sold-out SlatorCon San Francisco 2018 to reflect on the role of humans and machines in shaping the future of translation technology.

To lay the groundwork, Oroszi began by drawing a comparison between the role of technology in the automotive industry and localization. More than 100 years ago when technology hit the car industry to enable mass production of vehicles, many were fearful that machines would replace humans, but technology did not take jobs away from workers in the car industry. Instead, automation augmented human capabilities, redefined the value of workers, and facilitated what became an automotive revolution.

Read the full article here.

SYSTRAN presents its latest translation engines: huge quality & speed improvement!

The latest version of our AI-powered Translation Software designed for Businesses

SYSTRAN Pure Neural® Server is our new generation of enterprise translation software based on Artificial Intelligence and Neural networks. It provides outstanding professional quality with the highest standards in data safety.

Our R&D team, which works continuously to provide corporate users with state-of-the-art translation technology tailored for business, has just released a new generation of Neural MT engines. SYSTRAN’s new engines are developed with OpenNMT-tf, our AI framework built on the latest TensorFlow features, and backed by a proprietary new training process: Infinite Training.

These innovations bring two major impacts on businesses:

  • Better Translation Quality & Fluidity: the new engines use Self-Attentional Transformer (SAT) neural networks, which improve contextual translation for better quality and fluency.
  • Better Performance: translation speed (char./sec.) is improved by 10 to 30 times on CPU hardware compared to previous-generation engines.

For more info, please visit: https://bit.ly/2QgAMMq

What machine translation teaches us about the challenges ahead for AI

João Graça, co-founder and CTO of Unbabel, on what machine translation can teach us about the challenges still lying ahead for artificial intelligence.

Can you understand this sentence? Now try understanding the long and convoluted and unexpectedly – maybe never-ending, or maybe ending-sooner-than-you-think, but let’s hope it ends soon – nature of this alternative sentence.

The complexities of language can be an inconvenience to a reader. But even for today’s smartest machine learning algorithms, more translation challenges remain than advances in other fields would have you believe.

These challenges are a good demonstration of how much complexity machines must still master before they catch up with human performance.

You say tomato

When it comes to translation, there are two categories of content. On one hand, you have “commodity” translation. Perhaps you want to point your phone at a menu and get a rough idea of what it says. Or you want to impress a colleague with a phrase from their local language.

Here, phrases are short, the content is often formal and errors aren’t life or death.

But on the other hand, you have interactions where context is key – understanding the intent of the writer or speaker, and the expectations of the reader or listener. Take any example where a business speaks to its customers – you better hope you are speaking their language respectfully when they have a complaint or problem.

It’s not enough to solve the problem at a superficial level, and achieving comparable, “human-quality” communication still has an enormous amount of research ahead of it. This need for perfection is why most research is focused on this second area.

In the examples below, I discuss the challenges still ahead for the translation industry, and touch on what they mean for how we use machine learning tech more broadly.

Challenge 1: Long-distance lookups

Many of the biggest challenges are structural.

A good example is long-distance lookups. If you are translating a sentence word by word and the word order stays the same, you are just solving “what is the correct equivalent of this word?”

But once you have to reorder the sentence, the problem space that must be explored becomes exponentially larger. And in verb-final languages like Japanese, the verb comes at the end of the sentence, potentially producing the longest reordering distances possible.

The system has to weigh several competing reorderings at once. This is why these languages are so hard: you have to cater to very different grammatical patterns, very different vocabularies, and even differences in how words are segmented into characters.
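A toy example makes the word-order problem concrete. The miniature English-to-Japanese gloss dictionary below is invented for illustration:

```python
# Toy illustration of why word-by-word translation breaks down when
# word order differs. The mini English-to-Japanese gloss is invented.
gloss = {"the": "", "cat": "neko", "ate": "tabeta", "fish": "sakana"}

english = "the cat ate fish".split()

# Naive word-by-word substitution keeps English SVO order:
word_by_word = [gloss[w] for w in english if gloss[w]]
print(word_by_word)  # ['neko', 'tabeta', 'sakana'] -- verb stuck in the middle

# Japanese is verb-final (SOV), so a correct ordering is roughly "cat fish ate":
reference = ["neko", "sakana", "tabeta"]
print(word_by_word == reference)  # False: the vocabulary is right, the order is wrong
```

Every word was looked up correctly, yet the output is still wrong, which is exactly why the search over reorderings is what makes these language pairs hard.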

Here, you can see how expanding problem spaces create difficulties in an area the human brain handles with ease.

Challenge 2: Taxonomy

The second major area of complexity involves different formats of data.

For example, conversational language has a completely different structure, and calls for different models, than formal documents. In areas like customer service translation, this makes a big difference. Nobody likes to feel like the representative of a company is being overly officious when handling their problem.

Therefore, any model that is able to learn from a volume of real human queries will have an advantage — and doubly so if it can draw them from a particular industry sector. Meanwhile, other models might rely on news stories or generic online text, and output completely different results.

As with other machine learning challenges, the ability to learn from the most valuable and representative data can give a big advantage – or, without it, risk limiting taxonomical flexibility.

This brings us to context.

Challenge 3: Context

Most translation models still translate sentence by sentence, so they don’t take the context into account.

If they are translating a pronoun, they have no clue which pronoun to choose. They will randomly produce sentences that are formal or informal. They don’t guarantee consistency of terminology – for instance, translating a legal term the same way throughout. There’s no way you can guarantee the whole document is correct.
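The terminology-consistency problem, at least, can be checked after the fact. Below is a minimal sketch of such a check; the glossary and sentence pairs are invented examples:

```python
# Minimal sketch of a terminology-consistency check over sentence pairs.
# The glossary and the English/Spanish sentence pairs are invented examples.
glossary = {"force majeure": "fuerza mayor"}  # approved source/target term pair

pairs = [
    ("The force majeure clause applies.", "Se aplica la cláusula de fuerza mayor."),
    ("Force majeure excuses performance.", "El caso fortuito exime del cumplimiento."),
]

def inconsistent(pairs, glossary):
    """Flag pair indices where a source term appears but its approved translation doesn't."""
    flagged = []
    for i, (src, tgt) in enumerate(pairs):
        for term, approved in glossary.items():
            if term in src.lower() and approved not in tgt.lower():
                flagged.append(i)
    return flagged

print(inconsistent(pairs, glossary))  # [1]: the second pair drifted to "caso fortuito"
```

A sentence-by-sentence system has no mechanism to prevent this drift; a document-aware system (or a reviewer with a check like this) does.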

The other problem is that the content is not always in the same language. Sometimes it’s one sentence in Chinese, one sentence in English. The sentences are much shorter, so you probably have to look much further back for context. This reaches its extreme in “chat” interactions.

And the context problem is different than if you were translating an email. For example, if you are doing a legal document and the document is ten pages long, you would need to use the entire document for an accurate contextual translation.

This is next to impossible with current models – you have to find some way to summarise it. Otherwise, consistency is nearly impossible.

On the other hand, if you are translating for something like SEO, what you are actually translating is keywords that don’t form a sentence, just keywords by themselves. This means you turn to more dictionary-like translation, using other words or the associated image to disambiguate.

People think “Oh, we are in the age of unlimited data” but actually we are still enormously lacking in many ways.

Yes, we have a lot of data but often not enough relevant data.

Looking to the future

There will be many translation engines but what makes them different is their models.

The model is going to look at the data, predict patterns, and assign them to different customers, and from there it will decide which voice, language, tone, and so on to choose.

Current public translation tools aren’t aware of this yet. They don’t even have knowledge of the document the translation came from, let alone the speaker or their translation preferences.

This will bring the next level of sophistication to this area. Machine learning, exercised against a use-specific corpus of language, will give fast and accurate translations, while being able to forward them to humans to finalise and learn from further.

Languages might still drive machines crazy – but with careful human thinking, we can teach them to persevere.

Reference: https://bit.ly/2PYYHAB

The Augmented Translator

The idea that robots are taking over human jobs is by no means a new one. Over the last century, the automation of tasks has done everything from making a farmer’s job easier with tractors to replacing the need for cashiers with self-serve kiosks. More recently, as machines are getting smarter, discussion has shifted to the topic of robots taking over more skilled positions, namely that of a translator.

A simple search on the question-and-answer site Quora reveals dozens of inquiries on this very issue, while a recent survey shows that AI experts predict robots will take over the task of translating languages by 2024. Everyone wants to know if they’ll be replaced by a machine and, more importantly, when that will happen.

“I’m not worried about it happening in my lifetime,” translator Lizajoy Morales told me when I asked if she was afraid of losing her job to a machine. The same sentiment is echoed by most of Lilt’s users. Of course, this demographic is already using artificial intelligence to its advantage and tends to see the benefits over the drawbacks.

Many translators, however, are quick to argue that certain types of content are impossible for a machine to translate accurately – such as literature, which relies on a human’s understanding of nuance to capture the author’s intention, or fields like law and medicine, which rely on the accuracy of a human translator.

But even in these highly-specialized fields, machines can find their place in the translation workflow. Not as a replacement, but rather as an assistant. As translators, we can use machines to our advantage, to work better and faster.

But I’m not talking about post-editing machine translation. In a recent article, my colleague Greg Rosner compares post-editing to the job of a janitor — just cleaning up a mess. True machine assistance augments the translator’s existing abilities and knowledge, giving them the freedom to do what they do best — translate — while keeping interference to a minimum.

So how do machines help translators exactly? With an interactive, adaptive machine translation, such as that found in Lilt, the system learns in real-time from human feedback and/or existing translation memory data. This means that as a translator is working, the machine is getting to know their content, style and preferences and thus adapting to this unique translator/content combination. This adaptation allows the system to progressively provide better suggestions to human translators, and higher quality for fully automatic translation. In basic terms, it’s making translators faster and better.
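The translation-memory side of that loop can be sketched with a simple fuzzy-match store. This is a toy approximation built on Python's standard library, not a description of Lilt's actual adaptive engine:

```python
# Sketch of the translation-memory side of adaptive MT: store confirmed
# human translations and retrieve the closest stored match for a new segment.
# This is a toy approximation, not how Lilt's adaptive engine actually works.
from difflib import SequenceMatcher

memory = {}  # source sentence -> confirmed human translation

def confirm(src, tgt):
    """Called whenever the translator confirms a segment."""
    memory[src] = tgt

def fuzzy_match(src, threshold=0.7):
    """Return the stored translation whose source is most similar, if similar enough."""
    best, best_score = None, 0.0
    for known_src, tgt in memory.items():
        score = SequenceMatcher(None, src, known_src).ratio()
        if score > best_score:
            best, best_score = tgt, score
    return best if best_score >= threshold else None

confirm("Click the Save button.", "Haga clic en el botón Guardar.")
print(fuzzy_match("Click the Save button now."))  # reuses the near-identical match
print(fuzzy_match("The weather is nice."))        # None: nothing similar is stored
```

A real adaptive system goes much further, updating the MT model itself from each confirmed segment rather than only retrieving similar ones, which is what lets suggestions improve as the translator works.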

Morales also pointed out another little-known benefit from machine translation suggestions: an increase in creativity. “This is an unexpected and much-appreciated benefit. I do all kinds of translations, from tourism, wine, gastronomy, history, social sciences, financial, legal, technical, marketing, gray literature, even poetry on occasion. And Lilt gives me fantastic and creative suggestions. They don’t always work, of course, but every so often the suggestion is absolutely better than anything I could have come up with on my own without spending precious minutes searching through the thesaurus…once again, saving me time and effort.”

Many are also finding that with increased productivity comes increased free time. Ever wish there were more hours in the day? If you’re a translator, machine assistance may be the solution.

David Creuze, a freelance translator, told us how he spends his extra time, “I have two young children, and to be able to compress my work time from 6 or 7 hours (a normal day before their birth) to 4 hours a day, without sacrificing quality, is awesome.”

With these types of benefits at our fingertips, we should stop worrying about machines taking the jobs of translators and focus on using the machine to our advantage, to work better and ultimately focus on what we do best: being human.

 

Reference: https://bit.ly/2MgDaAj

Is This The Beginning of UNMT?

Research at Facebook just made it easier to translate between languages without many translation examples. For example, from Urdu to English.

Neural Machine Translation

Neural Machine Translation (NMT) is the field concerned with using AI to translate between languages, such as English and French. In 2015, researchers at the Montreal Institute for Learning Algorithms developed new AI techniques [1] which allowed machine-generated translations to finally work. Almost overnight, systems like Google Translate became orders of magnitude better.

While that leap was significant, it still required having sentence pairs in both languages, for example, “I like to eat” (English) and “me gusta comer” (Spanish). For translations between languages like Urdu and English without many of these pairs, translation systems failed miserably. Since then, researchers have been building systems that can translate without sentence pairings, i.e. Unsupervised Neural Machine Translation (UNMT).

In the past year, researchers at Facebook, NYU, the University of the Basque Country, and Sorbonne Universités made dramatic advances which are finally enabling systems to translate without knowing that “house” means “casa” in Spanish.

Just a few days ago, Facebook AI Research (FAIR), published a paper [2] showing a dramatic improvement which allowed translations from languages like Urdu to English. “To give some idea of the level of advancement, an improvement of 1 BLEU point (a common metric for judging the accuracy of MT) is considered a remarkable achievement in this field; our methods showed an improvement of more than 10 BLEU points.”
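For readers unfamiliar with the metric, BLEU scores a candidate translation against a reference by n-gram overlap. The sketch below is a simplified, smoothed sentence-level version; real evaluations use corpus-level implementations such as sacreBLEU:

```python
# Simplified sentence-level BLEU: modified n-gram precision (with +1
# smoothing) combined with a brevity penalty. Illustrative only; real
# evaluations use corpus-level BLEU as implemented in e.g. sacreBLEU.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        # +1 smoothing so a single missing n-gram doesn't zero the whole score
        log_precisions.append(math.log((overlap + 1) / (total + 1)))
    bp = min(1.0, math.exp(1 - len(ref) / len(cand)))  # brevity penalty
    return 100 * bp * math.exp(sum(log_precisions) / max_n)

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # perfect match: 100.0
print(bleu("the cat is on the mat", "the cat sat on the mat"))   # one wrong word costs a lot
```

One wrong word out of six drops the score by tens of points, which is why a 10-point BLEU improvement on a hard language pair is such a striking result.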

Check out more info at Forbes.

Let us know what you think about this new leap!

Here’s Why Neural Machine Translation is a Huge Leap Forward

Though machine translation has been around for decades, most of what you’ll read about it is its perceived proximity to the mythical “Babel Fish” – an instantaneous personal translation device – ready to replace each and every human translator. The part that gets left out is machine translation’s relationship with human translators. For a long time, this relationship was no more complex than post-editing badly translated text, a process most translators find to be a tiresome chore. With the advent of neural machine translation, however, machine translation is no longer just something that creates more tedious work for translators. It is now a partner to them, making them faster and their output more accurate.

So What’s the Big Deal?

Before we jump into the brave new translating world of tomorrow, let’s put the technology in context. Prior to neural machine translation, there have been two main paradigms in the history of the field. The first was rules-based machine translation (RBMT) and the second, dominant until very recently, was phrase-based statistical machine translation (SMT).

When building rules-based machine translation systems, linguists and computer scientists joined forces to write thousands of rules for translating text from one language to another. This was good enough for monolingual reviewers to get the general idea of important documents in an otherwise unmanageable body of content in a language they couldn’t read. But for the purposes of actually creating good translations, this approach has obvious flaws: it is time-consuming and, naturally, results in low-quality translations.

Phrase-based SMT, on the other hand, looks at a large body of bilingual text and creates a statistical model of probable translations. The trouble with SMT is its reliance on separate subsystems. For instance, it is unable to associate synonyms or derivatives of a single word, requiring a supplemental system responsible for morphology. It also requires a language model to ensure fluency, but this is limited to a given word’s immediate surroundings. SMT is therefore prone to grammatical errors, and relatively inflexible when it encounters phrases that differ from those in its training data.
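The statistical core of phrase-based SMT can be illustrated in a few lines: translation probabilities are just relative frequencies over phrase pairs extracted from word-aligned bilingual text. The tiny "corpus" of phrase pairs below is an invented example:

```python
# Minimal sketch of how phrase-based SMT estimates translation probabilities:
# relative frequencies over extracted phrase pairs. The tiny set of phrase
# pairs below is an invented example, not real extraction output.
from collections import Counter, defaultdict

# (source phrase, target phrase) pairs as they might be extracted
# from a word-aligned English-Spanish corpus
phrase_pairs = [
    ("house", "casa"), ("house", "casa"), ("house", "hogar"),
    ("white house", "casa blanca"),
]

counts = Counter(phrase_pairs)
source_totals = defaultdict(int)
for (src, _tgt), c in counts.items():
    source_totals[src] += c

def phrase_prob(src, tgt):
    """Maximum-likelihood estimate p(tgt | src) = count(src, tgt) / count(src)."""
    return counts[(src, tgt)] / source_totals[src]

print(phrase_prob("house", "casa"))   # 2/3: the more probable translation
print(phrase_prob("house", "hogar"))  # 1/3
```

Note that "casa" and "hogar" are treated as unrelated strings: nothing in the counts tells the model they are near-synonyms, which is exactly the weakness described above.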

Finally, here we are at the advent of neural machine translation. Virtually all NMT systems use what is known as an “attentional encoder-decoder” architecture. The system has two main neural networks: one, the encoder, receives a sentence and transforms it into a series of coordinates, or “vectors”. The decoder network then transforms those vectors back into text in another language, with an attention mechanism sitting in between, helping the decoder focus on the important parts of the encoder output.
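The attention step in that architecture can be sketched with plain dot-product attention; the vectors below are tiny hand-made stand-ins for real learned representations:

```python
# Toy dot-product attention: the decoder scores each encoder output vector
# against its current state, softmaxes the scores into weights, and takes
# a weighted average (the "context vector"). Vectors are hand-made examples.
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(decoder_state, encoder_outputs):
    """Return attention weights over encoder outputs and the context vector."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_outputs]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_outputs))
               for i in range(dim)]
    return weights, context

# Three encoder outputs; the decoder state points mostly at the second one
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decoder_state = [0.1, 2.0]

weights, context = attend(decoder_state, encoder_outputs)
print([round(w, 2) for w in weights])  # highest weight lands on the second vector
```

In a real NMT system these scores come from learned layers and the context vector feeds the decoder's next-word prediction, but the weighting mechanics are the same.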

The effect of this encoding is that an NMT system learns the similarity between words and phrases, grouping them together in space, whereas an SMT system just sees a bunch of unrelated words that are more or less likely to be present in a translation.

Interestingly, this architecture is what makes Google’s “zero-shot translation” possible. A well-trained multilingual NMT can decode the same encoded vector into different languages it knows, regardless of whether that particular source/target language combination was used in training.

As the decoder makes its way through the translation, it predicts words based on the entire sentence up to that point, which means it produces whole coherent sentences, unlike SMT. Unfortunately, this also means that any flaws appearing early in the sentence tend to snowball, dragging down the quality of the result. Some NMT models also struggle with words they don’t know, which tend to be rare words or proper nouns.

Despite its flaws, NMT represents a huge improvement in MT quality, and the flaws it does have happen to present opportunities.

Translators and Machine Translation: Together at Last

While improvements to MT typically mean gains in its usual applications (i.e. post-editing, automatic translation), the real winner with NMT is translators. This is particularly true when a translator is able to use it in real time as they translate, as opposed to post-editing MT output. When the translator actively works with an NMT engine to create a translation, the two learn from each other: the engine offers up translations the human may not have considered, and the human serves as a moderator and, in so doing, a teacher of the engine.

For example, during the translation process, when the translator corrects the beginning of a sentence, it improves the system’s chances of getting the rest of the translation right. Often all it takes is a nudge at the beginning of a sentence to fix the rest, and the snowball of mistakes unravels.

Meanwhile, NMT’s characteristic improvements in grammar and coherence mean that when it reaches a correct translation, the translator spends less time fixing grammar, beating raw MT output and skipping post-editing altogether. When they have the opportunity to work together, translators and their NMT engines quite literally finish each other’s sentences. Besides speeding up the process – and here I’m speaking as a translator – it’s honestly a rewarding experience.

Where Do We Go Now?

Predicting the future is always a risky business, but provided the quality and accessibility of NMT continues to improve, it will gradually come to be an indispensable part of a translator’s toolbox, just as CAT tools and translation memory already have.

A lot of current research has to do with getting better data, and with building systems that need less data. Both of these areas will continue to improve MT quality and accelerate its usefulness to translators. Hopefully this usefulness will also reach more languages, especially ones with less data available for training. Once that happens, translators in those languages could get through more and more text, gradually improving the availability of quality text both for the public and for further MT training, in turn allowing those translators, having already built the groundwork, to move on to bigger challenges.

When done right, NMT has the potential to not just improve translators’ jobs, but to move the entire translation industry closer to its goal of being humanity’s Babel Fish. Not found in an app, or in an earbud, but in networks of people.

 

Reference: https://bit.ly/2CewZNs