Tag: Translation Memory

How can word counts differ within the same tool on different machines? (2)

Have you ever run a word count with the same document on two different machines and received different word counts?

Well, here is what can have an impact on the word count statistics:

  • The use of a TM on one machine and no TM on the other can produce different word counts. A project with no TM will use the default counting settings, which might have been adjusted in the TM you actually use: for example, the setting that determines whether hyphenated words count as one word or two.

Read the rest from here

Translation Memory and Survival

Written by: Eman Elbehiry

Homer and Langley Collyer, two brothers, were killed by "hoarding": they stored things in their house until it buried them. That case led psychologists to file "hoarding" under the "psychological disease" umbrella. But when it comes to translation, sorry, psychology, you're wrong this time. Why? Because in our field "hoarding" is a big sign of perfectionism and maturity.

How many times have you looked at an outstanding translation of yours and wished you could use it again? How many times have you come across a sentence you would bet your life you have translated before, but you can't remember in which file or which project? How many times have you translated similar texts and wished for something that could get the job done in half the time?
While you were stuck merely wishing, hoarding and storing that translation found its way to you through translation memories (TMs).

While the CAT tool divides the whole text into segments, the translation memory is where all your translation is stored in units, exactly as it was saved: sentences, paragraphs, headlines, or titles. In other words, it stores each segment together with its language pair, so you can come back to it in time of need. According to SDL's description of the translation memory, "When a translator's jobs regularly contain the same kinds of phrases and sentences, a translation memory will drastically increase the speed of translation." This makes it an essential component of any CAT tool par excellence.

Later, when you summon this translation memory to re-use the "stored" translation, it starts suggesting translations to you. You can add to and enhance this translation, modify it, or replace it with a better one. Being so smart, the translation memory keeps updating itself along the way: it holds whatever you added and improved. If a new segment is identical to a stored one, the tool calls this an exact match, scored at 100%. We see this crystal clear in texts that contain a lot of repeated patterns. When a new segment is only similar to a saved one, typically anywhere from 99% down to about 50% similarity, the result is called a "fuzzy match".

As mentioned above, the translation memory is the best fit for texts that include repetitions or similarities, which makes it most suitable for technical and legal translation, with their specialized, recurring expressions and vocabulary. Moreover, if a team is working on one project, each translator will very likely have his own distinctive expressions and vocabulary in mind; but if you all work with one translation memory, your documents will be more coherent and cohesive, and you will always be on the same track.
As a result, translation memory saves time and effort, which reduces the cost of long-term projects. So, by now, we can agree that it helps deliver the best possible quality, and its uses are practically unlimited.

Genius, huh? Wondering how it works? Here is a hint:
The mechanism is called "edit distance". Its role is to identify how dissimilar two entities (e.g., words, segments, units) are. So, in the case of a fuzzy match, it measures approximately how close two patterns are and, accordingly, makes suggestions to the translators, who have the power to accept or modify them so that the memory keeps improving.
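
For the curious, here is a minimal Python sketch of that idea: a classic Levenshtein edit distance turned into a rough match percentage. The formula and the example strings are ours for illustration only; real CAT tools usually compute similarity over whole words and apply their own weighting.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))          # distances for the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                 # delete ca
                current[j - 1] + 1,              # insert cb
                previous[j - 1] + (ca != cb),    # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Turn the distance into a rough match percentage, the way TM tools report it."""
    longest = max(len(a), len(b)) or 1
    return 100.0 * (1 - edit_distance(a, b) / longest)


print(similarity("Click the Save button.", "Click the Cancel button."))
# prints a fuzzy-match percentage well below 100
```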

The translation memory allows you to use it hundreds of times, to include it in whatever project you like, and to keep updating it. But the question here is: "What if I have never used one before? Will my past treasure go in vain?" The answer is NO!
You can actually "align" the segments of your past work to make a translation memory out of it; we will explain alignment in another article. The memory will also be your own home-made dictionary for searching your previous work for any term or sentence. In addition, you will be able to share your translation memory and use the ones shared with you, so that you have a solid, time-saving base. The reviewer can have a share in it too, so you have everything updated and flawless; this is usually seen in the online tools. And yes, there really is a movement of exchanging translation memories. "Sharing is caring", right?!

And here emerges another question: what if I am using a specific tool while my peers or my reviewers are using another? How can I share my TM with them? Or how do they share theirs?
Now, there is a file format for exactly this: TMX (Translation Memory eXchange). It allows you to import and export your translation memory among many different tools. Never easier!
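
To give an idea of what travels inside such a file: TMX is plain XML, so any tool can read it. Below is a minimal, hedged sketch that builds a one-unit TMX document with Python's standard library; the header attributes and the EN/ES segment pair are invented for illustration, and real exports carry far more metadata.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"   # serialized with the xml: prefix

# Build a minimal TMX document holding a single EN/ES translation unit.
tmx = ET.Element("tmx", {"version": "1.4"})
ET.SubElement(tmx, "header", {
    "srclang": "en", "adminlang": "en", "segtype": "sentence",
    "datatype": "plaintext", "o-tmf": "demo",
    "creationtool": "demo", "creationtoolversion": "1.0",   # placeholder metadata
})
body = ET.SubElement(tmx, "body")
tu = ET.SubElement(body, "tu")
for lang, text in [("en", "Click the Save button."),
                   ("es", "Haga clic en el botón Guardar.")]:
    tuv = ET.SubElement(tu, "tuv", {f"{{{XML_NS}}}lang": lang})
    ET.SubElement(tuv, "seg").text = text

print(ET.tostring(tmx, encoding="unicode"))       # XML that any CAT tool could import
```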

According to Martín-Mor (2011), "the use of TM systems does have an effect on the quality of the translated texts, especially on novices, but experienced translators are able to avoid it." And here we say to all translators, experienced or beginners: "hoarding" is not killing, and there is nothing better than hoarding and storing years of hard work embracing experience. This storing and hoarding is exactly a case of "what doesn't kill you makes you stronger".

Share with us a story in which you used a TM and it was super beneficial!

Across Systems Presents New Major Release of its Across Language Server

Karlsbad. Across Systems GmbH has released version 7 of its translation management software Across Language Server. Under the motto “Speed up your translation processes”, the main benefits of the new major release include optimized translation processes and seamless connection of third-party machine translation systems.

Check the full release from this link.

Why should you CAT tool your translation?

Written by: Eman Elbehiry

For as long as early humans existed, they searched for ways to survive, to make life easier, to save time and energy, and to move life to another level. So they worked the tools available to them, dry grass, leaves, and bark, into their first flame of fire. Since then, humankind has had an incredible power to make nature submit to it. The early human taught us a lesson beyond this marvelous discovery: the skill of searching for whatever makes us win our battle with time. For the translator, that fire is the CAT tool.

When the translator is fed up with going back and forth between what he translated before and what he messed up in one file out of many, wishing he could revise them all once again, CAT tools are there to give him a helping hand. When he dreams of something that can store his translation and his dictionaries, along with grammar and spell checkers, in one place, CAT tools are his wish-granting genie. The term "CAT tool" stands for "Computer-Aided Translation" tool. As the name signifies, the computer helps and supports us in the translation process by managing, organizing, quality-checking, and storing our translation. Having all these features does not mean that it translates on its own; on the contrary, as a translator, you do the work.

A CAT tool has some basic components. First, the translation memory, abbreviated as TM: this memory stores our translation in units so it can be retrieved in time of need. Second, the dictionaries, for looking up words and checking spelling. Third, the term base: a glossary of terms whose meaning needs a longer explanation, a long cluster of words, or an expression; it can also hold a thorough clarification of an abbreviation, which is highly important in specialized fields such as medical and legal translation. Fourth, segmentation: a segment is not necessarily a complete sentence; it could be a long sentence, a long statement, or a complete paragraph, and the division depends on the punctuation of the language. The tool divides the file into segments, and each segment has its own organization, layout, and format, which we call "tags". Simply put, the tool helps you copy the same format from source to target without the hassle of both translating and re-formatting the text.
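
As a rough illustration of the segmentation step (and only that), here is a toy Python sketch that splits text on end punctuation. Real CAT tools rely on much richer, language-specific segmentation rules, often defined in SRX files, so treat this as a simplification.

```python
import re

# Very rough segmentation: split after ., ! or ? followed by whitespace.
# Real CAT tools use language-specific rules (often SRX files) and handle
# abbreviations, ellipses, and ordinal numbers far more carefully.
SEGMENT_BOUNDARY = re.compile(r"(?<=[.!?])\s+")

def segment(text: str) -> list[str]:
    return [s.strip() for s in SEGMENT_BOUNDARY.split(text) if s.strip()]

print(segment("Open the file. Click Save! Did it work?"))
# ['Open the file.', 'Click Save!', 'Did it work?']
```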

CAT tools come in three types: online, offline, and the kind that combines both, the "hybrid". First, the online tools, like Smart CAT: they help manage the workflow through a shared platform for the team, with shared, updatable translation memories. They also save time because the translator, reviewer, and proofreader can work in parallel, and they allow the manager to follow the progress of the team. Second, the offline tools: SDL Trados is one of the leading tools in the market. The third type is like Memsource: it works both online and offline, and it also updates the translation memories and term bases.

As a result of these magnificent features, we can say that CAT tools are a great addition to the industry. Big projects that are full of repetitions are done with the highest performance possible, and the tool even facilitates future tasks by saving what you have already translated. Furthermore, CAT tools in action grant you the best quality, with standardized terminology if you are working in a team. In addition, they can analyze your files word by word so that you are paid fairly.

Bottom line: CAT tools are a great piece of technology that grants the best possible quality and organization, preserving time and energy and keeping all members of the team on the same track of terminology and synonyms by using the same glossary and expressions.

Here we give you some of the reviews about CAT tools. Caner K., a validated reviewer and verified current user of Trados, answers the question "What business problems are you solving with the product? What benefits have you realized?" like this:

"Most technical translations have repeating phrases and Trados makes it easier to translate these. So you save time by skipping translating the same words and sentences. It also makes collaborating on a long translation easier with a fellow translator. You can constantly share your translation memory with a colleague and make translating even more easier. The target text is usually more cohesive when two translators work on one text and the same translation memory."

Ekaterina B., also a validated reviewer and verified current user of Trados, says:

"What do you like best?
Trados is an essential tool for this business. It increases productivity and is a door opener when doing business with big clients.
The new UpLift capability is wonderful. Fragment matches have saved me so much time!"

About Matecat, one of the online CAT tools, Jorge Herran, a Spanish translator, says:

“It is an outstanding CAT tool, I have worked with SDL, Fluency, OmegaT and other CAT tools and in most cases this one allows me to work faster using a much better quality automatic translation as a base for my work, I still have to learn more about it, but so far, even if it lacks of many features, looks like a very promising CAT tool.”

             

What about you? Will you consider working with a CAT tool? Share with us your opinion!

memoQ 8.6 is Here!

memoQ 8.6 is our third release in 2018, and we are very excited about the new functionality it brings. The highlight of 8.6 is definitely the aim to pave the way to a more CMS-friendly translation environment, but like previous versions, it includes enhancements in many areas, including integration, terminology, productivity features, file filters, and user experience. Learn more about the most recent version and see how it will help you be even more productive.

 

Read full list of features.

Files, Files Everywhere: The Subtle Power of Translation Alignment

Here’s the basic scenario: you have the translated versions of your documents, but the translation wasn’t performed in a CAT tool. Now these documents need to be updated or changed across the languages, you want to retain the existing elements, style, and terminology, and in the meantime you have integrated CAT technology into your processes, so you need to build a translation memory. The solution is a neat piece of language engineering called translation alignment.

Translation alignment is a native feature of most productivity tools for computer-assisted translation, but its application in real life is limited to very specific situations, so even language professionals rarely have an opportunity to use it. However, these situations do happen once in a while, and when they do, alignment usually comes as a trusty solution for process optimization. We will take a look at two actual cases to show you exactly what it does.

Example No. 1: A simple case

Project outline:

Three Word documents previously translated to one language, totaling 6000 unweighted words. Two new documents totaling around 2500 words that feature certain elements of the existing files and need to follow the existing style and terminology.

Project execution:

Since the translated documents were properly formatted and there were no layout issues, the alignment process was completed almost instantly. The software was able to segment the source files and we matched the translated segments, with some minor tweaking of the segmentation. We then built a translation memory from those matched segments and added the new files to the project.
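
For readers who like to peek under the hood, here is a deliberately naive Python sketch of this kind of clean, one-to-one alignment: pair segments by position and flag pairs whose length ratio looks suspicious for manual tweaking. The threshold and the sample segments are invented; real aligners score many candidate pairings rather than trusting segment order.

```python
def align(source_segments, target_segments, max_ratio=2.5):
    """Naively pair segments by position and flag length-ratio outliers for review."""
    pairs, review = [], []
    for src, tgt in zip(source_segments, target_segments):
        ratio = max(len(src), len(tgt)) / max(1, min(len(src), len(tgt)))
        (pairs if ratio <= max_ratio else review).append((src, tgt))
    return pairs, review

ok, suspicious = align(
    ["The settings are saved.", "Restart the application."],
    ["Die Einstellungen werden gespeichert.", "Starten Sie die Anwendung neu."],
)
print(len(ok), "pairs ready for the TM,", len(suspicious), "flagged for manual review")
```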

The result:

Thanks to the created translation assets, the final wordcount of the new content was around 1500 and our linguists were able to produce translation in accordance with the previously established style and terminology. The assets were preserved for use on future projects.

Example No.2: An extreme case of multilingual alignment

Project outline:

In one of our projects we had to develop translation assets in four language pairs, totaling roughly 30k words per language. The source materials were expanded with new content totaling about 20k words unweighted and the language assets had to be developed both to retain the existing style and terminology solution and to help the client switch to a new CAT platform.

Project execution:

Unfortunately, there was no workaround for ploughing through dozens of files, but once we organized the materials we could proceed to the alignment phase. Since these files were localized and some parts were even transcreated to match the target cultures, which also included changes in layout and differences in content, we knew that alignment was not going to be fully automated.

This is why native linguists in these languages performed the translation alignment and communicated with the client and the content producer during this phase. While this slowed the process a bit, it ultimately yielded the best results possible.

We then exported the created translation memory in the cross-platform TMX format that allowed use in different CAT tools, and the alignment phase was finished.

The result:

With the TM applied, the weighted volume of new content was around 7k words. Our linguists localized the new materials in accordance with the existing conventions in the new CAT platform and the translation assets were saved for future use.

Wrap up

In both cases, translation alignment enabled us to reduce the volume of the new content for translation and localization and ensure stylistic and lexical consistency with the previously translated materials. It also provided an additional, real-time quality control and helped our linguists produce a better translation in less time.

Translation alignment is not an everyday operation, but it is good to know that when it is called to deliver the goods, this is exactly what it does.

Reference: https://bit.ly/2p5aYr0

Machine Translation From the Cold War to Deep Learning

In the beginning

The story begins in 1933. Soviet scientist Peter Troyanskii presented “the machine for the selection and printing of words when translating from one language to another” to the Academy of Sciences of the USSR. The invention was super simple — it had cards in four different languages, a typewriter, and an old-school film camera.

The operator took the first word from the text, found a corresponding card, took a photo, and typed its morphological characteristics (noun, plural, genitive) on the typewriter. The typewriter’s keys encoded one of the features. The tape and the camera’s film were used simultaneously, making a set of frames with words and their morphology.

Despite all this, as often happened in the USSR, the invention was considered “useless”. Troyanskii died of Stenocardia after trying to finish his invention for 20 years. No one in the world knew about the machine until two Soviet scientists found his patents in 1956.

It was at the beginning of the Cold War. On January 7th 1954, at IBM headquarters in New York, the Georgetown–IBM experiment started. The IBM 701 computer automatically translated 60 Russian sentences into English for the first time in history.

However, the triumphant headlines hid one little detail. No one mentioned that the translated examples were carefully selected and tested to exclude any ambiguity. For everyday use, that system was no better than a pocket phrasebook. Nevertheless, this launched a sort of arms race: Canada, Germany, France, and especially Japan all joined the race for machine translation.

The race for machine translation

The vain struggles to improve machine translation lasted for forty years. In 1966, the US ALPAC committee, in its famous report, called machine translation expensive, inaccurate, and unpromising. They instead recommended focusing on dictionary development, which eliminated US researchers from the race for almost a decade.

Even so, it was only those scientists' attempts, research, and developments that created the basis of modern Natural Language Processing. All of today's search engines, spam filters, and personal assistants appeared thanks to a bunch of countries spying on each other.

Rule-based machine translation (RBMT)

The first ideas surrounding rule-based machine translation appeared in the 70s. The scientists peered over the interpreters’ work, trying to compel the tremendously sluggish computers to repeat those actions. These systems consisted of:

  • Bilingual dictionary (RU -> EN)
  • A set of linguistic rules for each language (For example, nouns ending in certain suffixes such as -heit, -keit, -ung are feminine)

That’s it. If needed, systems could be supplemented with hacks, such as lists of names, spelling correctors, and transliterators.

PROMT and Systran are the most famous examples of RBMT systems. Just take a look at AliExpress to feel the soft breath of this golden age.

But even they had some nuances and subspecies.

Direct Machine Translation

This is the most straightforward type of machine translation. It divides the text into words, translates them, slightly corrects the morphology, and harmonizes syntax to make the whole thing sound right, more or less. When the sun goes down, trained linguists write the rules for each word.

The output returns some kind of translation. Usually, it’s quite crappy. It seems that the linguists wasted their time for nothing.
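
A toy sketch, not any real system, of what word-for-word translation boils down to; the three-entry dictionary is invented for the example, and the clumsy output is exactly the point.

```python
# A toy direct (word-for-word) translator: look each word up in a bilingual
# dictionary and keep it unchanged when there is no entry. No syntax, no real
# morphology, only whatever the dictionary happens to contain.
RU_EN = {"я": "I", "вижу": "see", "дом": "house"}   # invented three-entry dictionary

def direct_translate(sentence: str) -> str:
    return " ".join(RU_EN.get(word, word) for word in sentence.lower().split())

print(direct_translate("Я вижу дом"))   # "I see house" -- the missing article is our problem now
```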

Modern systems do not use this approach at all, and modern linguists are grateful.

Transfer-based Machine Translation

In contrast to direct translation, we prepare first by determining the grammatical structure of the sentence, as we were taught at school. Then we manipulate whole constructions, not words, afterwards. This helps to get quite decent conversion of the word order in translation. In theory.

In practice, it still resulted in verbatim translation and exhausted linguists. On the one hand, it brought simplified general grammar rules. But on the other, it became more complicated because of the increased number of word constructions in comparison with single words.

Interlingual Machine Translation

In this method, the source text is transformed into an intermediate representation that is unified for all the world’s languages (interlingua). It’s the same interlingua Descartes dreamed of: a meta-language which follows universal rules and turns translation into a simple “back and forth” task. Next, the interlingua would be converted to any target language, and here was the singularity!

Because of the conversion, interlingua is often confused with transfer-based systems. The difference is that the linguistic rules are specific to every single language and the interlingua, not to language pairs. This means we can add a third language to an interlingua system and translate between all three, which we can’t do in transfer-based systems.

It looks perfect, but in real life it’s not. It was extremely hard to create such universal interlingua — a lot of scientists have worked on it their whole lives. They’ve not succeeded, but thanks to them we now have morphological, syntactic, and even semantic levels of representation. But the only Meaning-text theory costs a fortune!

The idea of intermediate language will be back. Let’s wait awhile.

As you can see, all RBMT systems are dumb and terrifying, and that’s the reason they are rarely used except for specific cases (like weather report translation, and so on). Among the advantages of RBMT, the ones often mentioned are its morphological accuracy (it doesn’t confuse words), reproducibility of results (all translators get the same result), and the ability to tune it to a subject area (to teach it terms specific to economists or programmers, for example).

Even if anyone were to succeed in creating an ideal RBMT, and linguists enhanced it with all the spelling rules, there would always be some exceptions: all the irregular verbs in English, separable prefixes in German, suffixes in Russian, and situations when people just say it differently. Any attempt to take into account all the nuances would waste millions of man hours.

And don’t forget about homonyms. The same word can have a different meaning in a different context, which leads to a variety of translations. How many meanings can you catch here: I saw a man on a hill with a telescope?

Languages did not develop based on a fixed set of rules, a fact which linguists love. They were much more influenced by the history of invasions over the past three hundred years. How could you explain that to a machine?

Forty years of the Cold War didn’t help in finding any distinct solution. RBMT was dead.

Example-based Machine Translation (EBMT)

Japan was especially interested in fighting for machine translation. There was no Cold War, but there were reasons: very few people in the country knew English. It promised to be quite an issue at the upcoming globalization party. So the Japanese were extremely motivated to find a working method of machine translation.

Rule-based English-Japanese translation is extremely complicated. The language structure is completely different, and almost all words have to be rearranged and new ones added. In 1984, Makoto Nagao from Kyoto University came up with the idea of using ready-made phrases instead of repeated translation.

Let’s imagine that we have to translate a simple sentence — “I’m going to the cinema.” And let’s say we’ve already translated another similar sentence — “I’m going to the theater” — and we can find the word “cinema” in the dictionary.

All we need is to figure out the difference between the two sentences, translate the missing word, and then not screw it up. The more examples we have, the better the translation.
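
Here is that idea compressed into a few lines of Python, reusing the cinema/theater example above; the stored Spanish translation and the tiny dictionary are assumptions made purely for illustration.

```python
# Example-based translation in miniature: reuse a stored example pair and
# patch the single word that differs, using a bilingual dictionary.
# The stored Spanish translation and the dictionary are invented for illustration.
example_src = "I'm going to the theater"
example_tgt = "Voy al teatro"
dictionary = {"cinema": "cine", "theater": "teatro"}

def ebmt(new_src: str) -> str:
    translation = example_tgt
    for old, new in zip(example_src.split(), new_src.split()):
        if old != new:                            # the one word that changed
            translation = translation.replace(dictionary[old], dictionary[new])
    return translation

print(ebmt("I'm going to the cinema"))            # "Voy al cine"
```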

I build phrases in unfamiliar languages exactly the same way!

EBMT showed the light of day to scientists from all over the world: it turns out, you can just feed the machine with existing translations and not spend years forming rules and exceptions. Not a revolution yet, but clearly the first step towards it. The revolutionary invention of statistical translation would happen in just five years.

Statistical Machine Translation (SMT)

In early 1990, at the IBM Research Center, a machine translation system was first shown which knew nothing about rules and linguistics as a whole. It analyzed similar texts in two languages and tried to understand the patterns.

The idea was simple yet beautiful. An identical sentence in two languages was split into words, which were matched afterwards. This operation was repeated about 500 million times to count, for example, how many times the word “Das Haus” was translated as “house” vs “building” vs “construction”, and so on.

If most of the time the source word was translated as “house”, the machine used this. Note that we did not set any rules nor use any dictionaries — all conclusions were done by machine, guided by stats and the logic that “if people translate that way, so will I.” And so statistical translation was born.
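
A minimal sketch of the counting idea on a three-sentence toy corpus (real systems use millions of pairs); note that naive co-occurrence counting alone cannot yet separate "house" from "the", which is exactly the problem the alignment algorithm described below solves.

```python
from collections import Counter, defaultdict

# Toy parallel corpus (German -> English). In reality: millions of sentence pairs.
corpus = [
    ("das haus ist gross", "the house is big"),
    ("das haus ist alt", "the house is old"),
    ("das gebäude ist alt", "the building is old"),
]

# Count how often each source word co-occurs with each target word.
cooccurrence = defaultdict(Counter)
for src, tgt in corpus:
    for s in src.split():
        cooccurrence[s].update(tgt.split())

# Raw co-occurrence alone cannot tell whether "haus" means "the", "house" or "is":
print(cooccurrence["haus"].most_common(3))
```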

The method was much more efficient and accurate than all the previous ones. And no linguists were needed. The more texts we used, the better translation we got.

There was still one question left: how would the machine correlate the word “Das Haus,” and the word “building” — and how would we know these were the right translations?

The answer was that we wouldn’t know. At the start, the machine assumed that the word “Das Haus” equally correlated with any word from the translated sentence. Next, when “Das Haus” appeared in other sentences, the number of correlations with the “house” would increase. That’s the “word alignment algorithm,” a typical task for university-level machine learning.

The machine needed millions and millions of sentences in two languages to collect the relevant statistics for each word. How did we get them? Well, we decided to take the abstracts of the European Parliament and the United Nations Security Council meetings; they were available in the languages of all member countries and are now available for download as the UN Corpora and the Europarl Corpora.

Word-based SMT

In the beginning, the first statistical translation systems worked by splitting the sentence into words, since this approach was straightforward and logical. IBM’s first statistical translation model was called Model one. Quite elegant, right? Guess what they called the second one?

Model 1: “the bag of words”

Model one used a classical approach — to split into words and count stats. The word order wasn’t taken into account. The only trick was translating one word into multiple words. For example, “Der Staubsauger” could turn into “Vacuum Cleaner,” but that didn’t mean it would turn out vice versa.

Here’re some simple implementations in Python: shawa/IBM-Model-1.
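
In the same spirit, here is a compressed, illustrative sketch of Model 1's expectation-maximization loop on a toy corpus, with uniform initialization and a handful of iterations; it shows the shape of the algorithm, nothing more.

```python
from collections import defaultdict

# Toy parallel corpus (German -> English), already tokenized.
corpus = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]

english_vocab = {e for _, es in corpus for e in es}
# t[e][f] approximates P(English word e | foreign word f), initialized uniformly.
t = defaultdict(lambda: defaultdict(lambda: 1.0 / len(english_vocab)))

for _ in range(10):                               # a few EM iterations suffice here
    count = defaultdict(lambda: defaultdict(float))
    total = defaultdict(float)
    for fs, es in corpus:                         # E-step: spread each e over all f
        for e in es:
            norm = sum(t[e][f] for f in fs)
            for f in fs:
                delta = t[e][f] / norm
                count[e][f] += delta
                total[f] += delta
    for e in count:                               # M-step: renormalize the counts
        for f in count[e]:
            t[e][f] = count[e][f] / total[f]

for f in ("haus", "buch", "das"):
    best = max(english_vocab, key=lambda e: t[e][f])
    print(f, "->", best, round(t[best][f], 2))    # haus -> house, buch -> book, das -> the
```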

Model 2: considering the word order in sentences

The lack of knowledge about languages’ word order became a problem for Model 1, and it’s very important in some cases.

Model 2 dealt with that: it memorized the usual place the word takes at the output sentence and shuffled the words for the more natural sound at the intermediate step. Things got better, but they were still kind of crappy.

Model 3: extra fertility

New words appeared in the translation quite often, such as articles in German or using “do” when negating in English. “Ich will keine Persimonen” → “I do not want Persimmons.” To deal with it, two more steps were added to Model 3.

  • The NULL token insertion, if the machine considers the necessity of a new word
  • Choosing the right grammatical particle or word for each token-word alignment

Model 4: word alignment

Model 2 considered the word alignment, but knew nothing about the reordering. For example, adjectives would often switch places with the noun, and no matter how good the order was memorized, it wouldn’t make the output better. Therefore, Model 4 took into account the so-called “relative order” — the model learned if two words always switched places.

Model 5: bugfixes

Nothing new here. Model 5 got some more parameters for the learning and fixed the issue with conflicting word positions.

Despite their revolutionary nature, word-based systems still failed to deal with cases, gender, and homonymy. Every single word was translated in a single-true way, according to the machine. Such systems are not used anymore, as they’ve been replaced by the more advanced phrase-based methods.

Phrase-based SMT

This method is based on all the word-based translation principles: statistics, reordering, and lexical hacks. However, for learning, it split the text not only into words but also into phrases. These were n-grams, to be precise: contiguous sequences of n words in a row.

Thus, the machine learned to translate steady combinations of words, which noticeably improved accuracy.
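
For readers meeting the term for the first time, extracting n-grams is a one-liner; the sentence below is just an example.

```python
def ngrams(tokens, n):
    """All contiguous sequences of n tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

words = "the old house is big".split()
print(ngrams(words, 2))
# [('the', 'old'), ('old', 'house'), ('house', 'is'), ('is', 'big')]
```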

The trick was that the phrases were not always simple syntactic constructions, and the quality of the translation dropped significantly if anyone aware of linguistics and sentence structure interfered. Frederick Jelinek, a pioneer of computational linguistics, once joked about it: “Every time I fire a linguist, the performance of the speech recognizer goes up.”

Besides improving accuracy, the phrase-based translation provided more options in choosing the bilingual texts for learning. For the word-based translation, the exact match of the sources was critical, which excluded any literary or free translation. The phrase-based translation had no problem learning from them. To improve the translation, researchers even started to parse the news websites in different languages for that purpose.

Starting in 2006, everyone began to use this approach. Google Translate, Yandex, Bing, and other high-profile online translators worked as phrase-based right up until 2016. Each of you can probably recall the moments when Google either translated the sentence flawlessly or resulted in complete nonsense, right? The nonsense came from phrase-based features.

The good old rule-based approach consistently provided a predictable though terrible result. The statistical methods were surprising and puzzling. Google Translate turns “three hundred” into “300” without any hesitation. That’s called a statistical anomaly.

Phrase-based translation became so popular that when you hear “statistical machine translation”, it is what is actually meant. Up until 2016, all studies lauded phrase-based translation as the state of the art. Back then, no one even thought that Google was already stoking its fires, getting ready to change our whole image of machine translation.

Syntax-based SMT

This method should also be mentioned, briefly. Many years before the emergence of neural networks, syntax-based translation was considered “the future of translation,” but the idea did not take off.

The proponents of syntax-based translation believed it was possible to merge it with the rule-based method. It’s necessary to do quite a precise syntax analysis of the sentence — to determine the subject, the predicate, and other parts of the sentence, and then to build a sentence tree. Using it, the machine learns to convert syntactic units between languages and translates the rest by words or phrases. That would have solved the word alignment issue once and for all.

The problem is that syntactic parsing works terribly, despite the fact that we have considered it solved for a while (since we have ready-made libraries for many languages). I tried to use syntactic trees for tasks a bit more complicated than parsing the subject and the predicate. And every single time I gave up and used another method.

Let me know in the comments if you succeed using it at least once.

Neural Machine Translation (NMT)

A quite amusing paper on using neural networks in machine translation was published in 2014. The Internet didn’t notice it at all, except Google — they took out their shovels and started to dig. Two years later, in November 2016, Google made a game-changing announcement.

The idea was close to transferring the style between photos. Remember apps like Prisma, which enhanced pictures in some famous artist’s style? There was no magic. The neural network was taught to recognize the artist’s paintings. Next, the last layers containing the network’s decision were removed. The resulting stylized picture was just the intermediate image that network got. That’s the network’s fantasy, and we consider it beautiful.

If we can transfer the style to a photo, what if we try to impose another language on a source text? The text would be that precise “artist’s style,” and we would try to transfer it while keeping the essence of the image (in other words, the essence of the text).

Imagine I’m trying to describe my dog — average size, sharp nose, short tail, always barks. If I gave you this set of the dog’s features, and if the description was precise, you could draw it, even though you have never seen it.

Now, imagine the source text is that set of specific features. Basically, it means that you encode it and let another neural network decode it back into text, but in another language. The decoder only knows its own language. It has no idea about the features’ origin, but it can express them in, for example, Spanish. Continuing the analogy, it doesn’t matter how you draw the dog: with crayons, watercolor, or your finger. You paint it as you can.

Once again: one neural network can only encode the sentence into a specific set of features, and another one can only decode them back into text. Both have no idea about each other, and each of them knows only its own language. Recall something? Interlingua is back. Ta-da.
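
For the technically curious, here is a minimal sketch of that encoder/decoder split, assuming PyTorch is available; the vocabulary sizes, hidden size, and random token ids are placeholders, and real NMT systems add attention, subword units, and mountains of data on top of this.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Reads the source sentence and compresses it into a feature vector."""
    def __init__(self, vocab_size, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, src_ids):                   # src_ids: (batch, src_len)
        _, state = self.rnn(self.embed(src_ids))
        return state                              # the "set of features"

class Decoder(nn.Module):
    """Knows only the target language; unfolds the features into words."""
    def __init__(self, vocab_size, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tgt_ids, state):            # teacher forcing, training style
        output, _ = self.rnn(self.embed(tgt_ids), state)
        return self.out(output)                   # (batch, tgt_len, target vocab)

# One forward pass with made-up vocabulary sizes and random token ids.
encoder, decoder = Encoder(vocab_size=8000), Decoder(vocab_size=10000)
src = torch.randint(0, 8000, (1, 7))              # one source sentence, 7 tokens
tgt = torch.randint(0, 10000, (1, 9))             # its target-side tokens
logits = decoder(tgt, encoder(src))
print(logits.shape)                               # torch.Size([1, 9, 10000])
```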

The question is, how do we find those features? It’s obvious when we’re talking about the dog, but how to deal with the text? Thirty years ago scientists already tried to create the universal language code, and it ended in a total failure.

Nevertheless, we have deep learning now. And that’s its essential task! The primary distinction between deep learning and classic neural networks lies precisely in the ability to search for those specific features, without any idea of their nature. If the neural network is big enough, and there are a couple of thousand video cards at hand, it’s possible to find those features in text as well.

Theoretically, we can pass the features obtained from the neural networks to the linguists, so that they can open brave new horizons for themselves.

The question is, what type of neural network should be used for encoding and decoding? Convolutional Neural Networks (CNN) fit perfectly for pictures since they operate with independent blocks of pixels.

But there are no independent blocks in the text — every word depends on its surroundings. Text, speech, and music are always consistent. So recurrent neural networks (RNN) would be the best choice to handle them, since they remember the previous result — the prior word, in our case.

Now RNNs are used everywhere — Siri’s speech recognition (it’s parsing the sequence of sounds, where the next depends on the previous), keyboard’s tips (memorize the prior, guess the next), music generation, and even chatbots.

In two years, neural networks surpassed everything that had appeared in the past 20 years of translation. Neural translation contains 50% fewer word order mistakes, 17% fewer lexical mistakes, and 19% fewer grammar mistakes. The neural networks even learned to harmonize gender and case in different languages. And no one taught them to do so.

The most noticeable improvements occurred in fields where direct translation was never used. Statistical machine translation methods always worked using English as the key source. Thus, if you translated from Russian to German, the machine first translated the text to English and then from English to German, which led to a double loss.

Neural translation doesn’t need that; only a decoder is required so it can work. That was the first time that direct translation between languages with no common dictionary became possible.

The conclusion and the future

Everyone’s still excited about the idea of “Babel fish” — instant speech translation. Google has made steps towards it with its Pixel Buds, but in fact, it’s still not what we were dreaming of. The instant speech translation is different from the usual translation. You need to know when to start translating and when to shut up and listen. I haven’t seen suitable approaches to solve this yet. Unless, maybe, Skype…

And here’s one more empty area: all the learning is limited to sets of parallel text blocks. The deepest neural networks still learn from parallel texts. We can’t teach a neural network without providing it with a source. People, instead, can complement their lexicon by reading books or articles, even without translating them into their native language.

If people can do it, the neural network can do it too, in theory. I found only one prototype attempting to incite the network, which knows one language, to read the texts in another language in order to gain experience. I’d try it myself, but I’m silly. Ok, that’s it.

Reference: https://bit.ly/2HCmT6v