Month: November 2019

MT Isn’t Good For Languages, But..

Reading Time: 3 minutes

In a sci-fi movie we all know, a machine is invented to fulfil a certain demand and a great need, taking over all the routine and dangerous work. But there is also the ending we all know by heart: the machine evolves, revolts against its maker, and works toward its maker’s destruction. This shows us the double-edged weapon that the machine is. In this article we will look at both edges: the pros and cons of machine translation.

The machine has always been a great tool for saving time and money, as long as we put in the effort to improve it until it produces better quality. Luckily, we have achieved this: machine translation saves a great deal of time on repetitive content, such as financial texts full of numbers or technical texts full of stored terminology. Machine translation is also accurate on such content, where no complex structure or ambiguity surrounds the text. It also translates fast, which means less time; and in a world where time means money, that makes it the best tool to rely on.

Such applications have also evolved to translate images and signs on the spot. Imagine being lost in a country whose language you don’t speak: no one to understand you, no one to ask!

If a programmer or a developer wants their application, software, or even website to be used around the whole world, what should they do? Hire translators from all over the world, pay them, and go broke before they even take off?

Machine translation is the answer here as well; it will be of great help to this developer, because such repetitive content can be translated at good quality.

On the other hand, the climax of the movie arrives, and with it the question: is machine translation scheming to destroy the translation profession and bring about the death of the language?

Has it proven high quality in literary and creative translation? No. Why not?

This is due to several factors:

  • Perplexity

Language is a living, evolving entity. One school of linguistics measured the elevation of a language by its capacity for complexity: the more complex the structure and the diction, the more elevated the language.

Unfortunately, machine translation lacks the ability to handle or understand such complexity. This, in turn, leads us to the length ratio.

  • Length Ratio

After much training, testing, and evaluation of machine translation, it appears that the output tends toward chopped, cut-off, simple, and normalized sentences shaped by the source; a rough way to measure this is sketched below. This also relates to language complexity; consequently, it works against the language’s elevation and richness. And speaking of richness, we come to the next factor.
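
As a rough illustration only (not from the article), here is a minimal Python sketch of the length ratio measure, assuming simple whitespace tokenisation; the example sentences are invented:

def length_ratio(source_sentences, translated_sentences):
    # Average ratio of translated length to source length, in tokens.
    ratios = []
    for src, tgt in zip(source_sentences, translated_sentences):
        src_tokens = src.split()
        tgt_tokens = tgt.split()
        if src_tokens:
            ratios.append(len(tgt_tokens) / len(src_tokens))
    return sum(ratios) / len(ratios) if ratios else 0.0

# A ratio well below 1.0 suggests the MT output is noticeably
# shorter ("chopped") than its source.
print(length_ratio(["the quick brown fox jumps over the lazy dog"],
                   ["the fox jumps"]))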

  • Lexical Density

According to the Macmillan Dictionary, it is “the proportion of content words to function words in a text. The higher the proportion of content words, the greater the lexical density.”

Machine translation is effective on texts that do not contain many content words, and it sometimes gets lost the more content words there are. The same applies to the diverse, significant words in a translated text, which add to the richness and uniqueness of the language. How does machine translation handle this?

The machine works with the most frequent words, to keep the text consistent; this threatens the less frequent words with death and oblivion.
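
As a rough illustration only (not from the article), here is a minimal Python sketch of lexical density, assuming a small, made-up list of English function words and counting content words as a share of all words:

FUNCTION_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "to", "in", "on",
    "at", "is", "are", "was", "were", "it", "this", "that", "with",
}

def lexical_density(text):
    # Share of content (non-function) words among all words.
    tokens = [t.lower().strip(".,;:!?") for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)

print(lexical_density("The engine translates repetitive financial reports quickly."))  # dense
print(lexical_density("It is on the table and it is to the left of it."))              # sparse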

However, don’t be disappointed, dear linguist: there is the MT Summit, held globally every two years and regionally every year, where such issues are discussed and solutions are sought through research papers, tests, and evaluations. Furthermore, we have you to post-edit these translations, creatively. Machine translation will never replace you; it will simply push you to upgrade and immerse yourself further in the linguistics of your language.

Here we reach a conclusion: machine translation is not scheming towards the linguists’ death, unless they allow it, for sure. Nevertheless, it might affect the language.

You do what you got to do.

Six Challenges for Neural Machine Translation

Reading Time: 1 minute
Abstract

We explore six challenges for neural machine translation: domain mismatch, amount of training data, rare words, long sentences, word alignment, and beam search. We show both deficiencies and improvements over the quality of phrase-based statistical machine translation.

See the paper here

Quantisation of Neural Machine Translation models

Reading Time: 1 minute

When large amounts of training data are available, the quality of Neural MT engines increases with the size of the model. However, larger models imply decoding with more parameters, which makes the engine slower at test time. Improving the trade-off between model compactness and translation quality is an active research topic. One of the ways to achieve more compact models is via quantisation, that is, by requiring each parameter value to occupy a fixed number of bits, thus limiting the computational cost. In this post we take a look at a paper which achieves Transformer Neural MT models that are 4 times more compact via quantisation into 8-bit values, with no loss in translation quality according to BLEU score.
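
As a rough illustration of the underlying idea (not the paper’s actual method), here is a minimal NumPy sketch of linear 8-bit quantisation of a weight matrix; storing uint8 values instead of float32 gives roughly the 4x size reduction mentioned above:

import numpy as np

def quantise_8bit(weights):
    # Map float32 weights onto 256 integer levels (uint8).
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
    q = np.round((weights - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

def dequantise_8bit(q, scale, w_min):
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale + w_min

weights = np.random.randn(4, 4).astype(np.float32)
q, scale, w_min = quantise_8bit(weights)
approx = dequantise_8bit(q, scale, w_min)
print("max reconstruction error:", float(np.abs(weights - approx).max()))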

Read more here

Issue #55 – Word Alignment from Neural Machine Translation

Reading Time: 1 minute

Word alignments were the cornerstone of all previous approaches to statistical MT: you take your parallel corpus, align the words, and build from there. In Neural MT, however, word alignment is no longer needed as an input to the system. That said, research is coming back around to the idea that it remains useful in real-world practical scenarios, for tasks such as replacing tags in MT output.
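
As a rough illustration only (the alignment pairs and tag position below are invented), here is a minimal Python sketch of using word alignments to carry a formatting tag from a source token over to the aligned target tokens:

def transfer_tag(alignments, tagged_source_positions):
    # Return the target positions whose aligned source token carried a tag.
    # alignments: iterable of (source_index, target_index) pairs.
    return {tgt for src, tgt in alignments if src in tagged_source_positions}

source = ["click", "the", "Save", "button"]
target = ["klicken", "Sie", "auf", "Speichern"]
alignments = [(0, 0), (2, 3), (3, 3)]   # "Save" aligns to "Speichern"
print(transfer_tag(alignments, {2}))    # {3}: the tag moves onto "Speichern"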

Read more here

Evaluating machine translation in a low-resource language combination

Reading Time: 1 minute

Aim

Main aim:
• Determining which type of MT system (RBMT, PBMT, or NMT) is perceived as more adequate in the context of a minoritized language such as Galician in an MT+PE workflow.

Specific aims:
• BLEU automatic evaluation (a minimal scoring sketch follows this list).
• Human evaluation (quality perception survey conducted among experienced professional post-editors).
• Error analysis framework (MQM).

Evaluating machine translation in a low-resource language combination: Spanish-Galician
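
As a rough illustration of the automatic side of such an evaluation (the sentences below are invented placeholders, not data from the study), here is a minimal Python sketch using the sacrebleu library:

import sacrebleu  # pip install sacrebleu

hypotheses = [
    "O can corre polo parque.",
    "Gústame moito este libro.",
]
references = [[
    "O can corre polo parque.",
    "Este libro gústame moito.",
]]

# corpus_bleu takes the system outputs and a list of reference
# streams (here, a single reference per hypothesis).
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")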

Read the full paper here

Hungarian translators’ perceptions of neural machine translation in the European Commission

Reading Time: 1 minute

In the framework of its investigative mandate, the Office collects information of investigative interest, including personal data, from various sources – public authorities, private entities and natural persons – and exchanges it with Union institutions, bodies, offices and agencies, with competent authorities of Member States and third countries, as well as with international organisations before, during and after the investigation or coordination activities.
(Commission Decision (EU) 2018/1962 )

Learn more about their paper here

Improving CAT Tools in the Translation Workflow: New Approaches and Evaluation – Mihaela Vela

Reading Time: 1 minute

CAT Tools
• Important part of modern translation workflow
– Trados Studio
– MemoQ
– DejaVu
– XTM
– MateCAT
– Etc.
• Increase translator’s productivity
• Improve consistency in translation
• Reduce costs

Read more from this paper here