Tag: CAT Tools

A Beginner’s Guide to Machine Translation

What is Machine Translation?

Machine translation (MT) is automated translation by computer software. MT can be used to translate entire texts without any human input, or can be used alongside human translators. The concept of MT started gaining traction in the early 50s, and has come a long way since. Many used to consider MT an inadequate alternative to human translators, but as the technology has advanced, more and more companies are turning to MT to aid human translators and optimize the localization process.

How Does Machine Translation Work?

Well, that depends on the type of machine translation engine. There are several different kinds of MT software which work in different ways. We will introduce Rule-based, Statistical, and Neural.

Rule-based machine translation (RBMT) is the forefather of MT software. It is based on sets of grammatical and syntactical rules and phraseology of a language. RBMT links the structure of the source segment to the target segment, producing a result based on analysis of the rules of the source and target languages. The rules are developed by linguists and users can add terminology to override the MT and improve the translation quality.

Statistical MT (SMT) started in the age of big data and uses large amounts of existing translated texts, together with statistical models and algorithms, to generate translations. This system relies heavily on available multilingual corpora, and an average of two million words is needed to train the engine for a specific domain – which can be time and resource intensive. When using domain-specific data, SMT can produce good-quality translations, especially in the technical, medical, and financial fields.

Neural MT (NMT) is a newer approach built on deep neural networks. There are a variety of network architectures used in NMT, but typically the network can be divided into two components: an encoder, which reads the input sentence and generates a representation suitable for translation, and a decoder, which generates the actual translation. Words and even whole sentences are represented as vectors of real numbers in NMT. Compared to the previous generation of MT, NMT generates output which tends to be more fluent and grammatically accurate. Overall, NMT is a major step forward in MT quality. However, NMT may lag slightly behind previous approaches when it comes to translating rare words and terminology, and long and/or complex sentences are still an issue even for state-of-the-art NMT systems.
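
To make the encoder/decoder split concrete, here is a deliberately simplified sketch in Python. The hand-written word vectors and the tiny “memory” of German sentences are hypothetical stand-ins for this example only; a real NMT engine learns its representations from millions of sentence pairs and generates the translation word by word rather than picking from a stored list.

    import numpy as np

    # Toy word vectors (a real NMT engine learns these; they are invented here).
    embeddings = {
        "the": np.array([0.1, 0.3, 0.0]),
        "cat": np.array([0.7, 0.2, 0.5]),
        "sleeps": np.array([0.2, 0.9, 0.4]),
        "dog": np.array([0.6, 0.1, 0.6]),
        "barks": np.array([0.3, 0.8, 0.1]),
    }

    # Hypothetical target-side "memory": vector representations of known German sentences.
    targets = {
        "Die Katze schläft": np.array([0.33, 0.47, 0.30]),
        "Der Hund bellt": np.array([0.33, 0.40, 0.23]),
    }

    def encode(sentence):
        """Toy encoder: average the word vectors into one sentence representation."""
        return np.mean([embeddings[w] for w in sentence.lower().split()], axis=0)

    def decode(representation):
        """Toy decoder: pick the stored target sentence whose vector is closest.
        A real decoder generates the translation word by word instead."""
        return min(targets, key=lambda s: np.linalg.norm(targets[s] - representation))

    vector = encode("The cat sleeps")
    print(vector)          # the sentence as a vector of real numbers
    print(decode(vector))  # -> "Die Katze schläft" in this toy setup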

The Pros and Cons of Machine Translation

So now you have a brief understanding of MT – but what does it mean for your translation workflow? How does it benefit you?

  • MT is incredibly fast and can translate thousands of words per minute.
  • It can translate into multiple languages at once which drastically reduces the amount of manpower needed.
  • Implementing MT into your localization process can do the heavy lifting for translators and free up their valuable time, allowing them to focus on the more intricate aspects of translation.
  • MT technology is developing rapidly, and is constantly advancing towards producing higher quality translations and reducing the need for post-editing.

There are many advantages to using MT, but we can’t ignore the disadvantages. MT does not always produce perfect translations. Unlike human translators, computers can’t understand context and culture, so MT can’t be used to translate anything and everything. Sometimes MT alone is suitable, sometimes a combination of MT and human translation is best, and sometimes MT is not suitable at all. MT is not a one-size-fits-all translation solution.

When Should You Use Machine Translation?

When translating creative or literary content, MT is not a suitable choice. This can also be the case when translating culturally specific texts. A good rule of thumb: the more complex your content is, the less suitable it is for MT.

For large volumes of content, especially if it has a short turnaround time, MT is very effective. If accuracy is not vital, MT can produce suitable translations at a fraction of the cost. Customer reviews, news monitoring, internal documents, and product descriptions are all good candidates.

Using a combination of MT along with a human translator post-editor opens the doors to a wider variety of suitable content.

Which MT Engine Should You Use?

Not all MT engines are created equal, but there is no dedicated MT engine for each specific kind of content, either. Publicly available MT engines are designed to translate most types of content; with custom MT engines, however, the training data can be tailored to a specific domain or content type.

Ultimately, choosing an MT engine is a process. You need to decide what kind of content you wish to translate, review security and privacy policies, run tests on text samples, choose post-editors, and weigh several other considerations. The key is to do your research before making a decision. And if you are using a translation management system (TMS), be sure it is able to support your chosen MT engine.

Using Machine Translation and a Translation Management System

You can use MT on its own, but to get the maximum benefits we suggest integrating it with a TMS. With these technologies integrated, you will be able to leverage additional tools such as translation memories, term bases, and project management features to help streamline and optimize your localization strategy. You will have greater control over your translations, and be able to analyze the effectiveness of your MT engine.

Reference: http://bit.ly/2P85d7P

TRANSLATION TECH GOES MEDIA

Four out of the five fastest-growing language services companies in 2018 are media localization specialists. The media business has seen a boom over the last two years, and traditional translation companies are taking notice. Media localization won’t stay an uncontested insular niche for long. In fact, conventional LSPs and technology providers are moving into this sector and expanding their technical capabilities this year.

HERE ARE A FEW EXAMPLES, WITH MORE TO FOLLOW…

Omniscien launched an automated subtitling tool

Omniscien, previously Asia Online, is best known for its trainable machine translation software, but now they are going into a new area – video subtitling. Omniscien has just started selling Media Studio, which was built based on product requirements from iflix, a Malaysian competitor to Netflix.

Under the hood, Media Studio has machine learning components: audio transcription, dialog extraction, and neural MT engines pre-trained for subtitles in more than 40 language combinations. The technology can create a draft subtitle file in the target language directly from raw video. It can even adjust timings and split long sentences into multiple subtitles where necessary. And it’s learning all the time.

For the human part of the work, Media Studio includes a web-based subtitle editor and a management system, both with a significant range of features right from the start. Translators can edit time codes in a drag-and-drop fashion, skip parts of the video without speech, customize keyboard shortcuts, and more. Project managers can assign jobs, automatically send job notifications, and track productivity and MT leverage.

The video is hosted remotely and streamed to linguists instead of complete films and episodes being sent out. This adds a security layer for the intellectual property. No one in the biz wants the next episode of Game of Thrones to end up on thepiratebay.org faster than it would on a streaming service. Linguists in low-bandwidth countries can download videos in low quality and with a watermark.

On the downside, this new tool does not integrate out of the box with the CAT and business management systems LSPs already use, and it doesn’t have translation memory support or anything else that would make it fit as one of the blades in the Swiss army knife of LSP technology.

According to Omniscien’s CEO Dion Wiggins, iflix has processed hundreds of thousands of video hours through the system since its inception in late 2016. By now, three more large OTT providers have started with Media Studio. Content distribution companies are the main target for the tool, but it will be available for LSPs as well once the pricing is finalized.

GlobalLink deployed subtitle and home dubbing software

At a user conference in Amsterdam this June, TransPerfect unveiled a new media localization platform called Media.Next. The platform has three components:

The subtitle editor is a CAT-tool with an embedded video player. Translators using this platform can watch videos and transcribe them with timings, possibly with integrated speech recognition to automatically create the first pass. As they translate using translation memory and termbase, they are able to see the subtitles appear on the screen.

Home dubbing is all about the setup on the voice-actor side. TransPerfect sends actors mics and soundproofing so that recording can happen at home rather than at a local audio studio.

A media asset management platform stores videos at a single location and proxies them to the translator applications instead of sending complete files over the Internet, similar to Omniscien’s approach.

The official launch of TransPerfect’s Media.NEXT is scheduled for mid-August.

Proprietary tech launched earlier this year

TransPerfect’s tech is proprietary, meant to create a competitive advantage. Media localization companies such as Zoo Digital and Lylo have taken a similar approach: they have launched cloud subtitling and dubbing platforms, but keep the technology out of the hands of other LSPs.

The idea of “dubbing in the cloud” is that it gives the client visibility into the actual stage of the process, and flexibility with early-stage review and collaboration with the vendor. The same idea permeates Deluxe Media’s platform Deluxe One unveiled in April this year. It’s a customer portal that provides clients with access to multiple services and APIs.

Deluxe One user interface

memoQ and Wordbee add video preview for subtitling

At the same time, subtitling capabilities are beginning to make their way into tools that are available to hundreds of LSPs around the world.

Popular translation editor memoQ added a video player with preview in its July release. The editor now opens the video file at the point being translated and displays the translated text so that translators can check it live. It can also show the number of words per minute, characters per second, or characters per line.

A similar preview appears in Wordbee. The embedded video player can open videos from a URL or play clips uploaded directly to the editor. The initial release includes a character limitation feature to keep subtitles concise, and anchoring: clicking on a segment rewinds the video to that text.

This step takes memoQ and Wordbee deeper into media and differentiates them from other TMSs.

So far, few TMSs have had video previews; one of them is Smartcat. Subtitling functionality in Smartcat was developed in 2013 for a special project, the crowdsourced localization of the e-learning platform Coursera. Today, users need to enable subtitling functionality on request. The feature set includes a video player, timecode parsing, and anchoring. Subtitling user numbers in Smartcat are rising, according to product manager Pavel Doronin.

Back to memoQ and Wordbee: their development teams will probably need to expand the list of subtitling features over time, starting with timecode editing. Moreover, memoQ and Wordbee support the .SRT extension, whereas Omniscien’s tool also supports TTML, a more advanced format that allows manipulating subtitle colors, on-screen position and formatting. TTML might become more important for video-on-demand and streaming work; it is, for instance, the format that Netflix uses.

Future “luxury” features could include character tracking with descriptions explaining their voice and preferred vocabulary, support for the speech-to-text technology, audio recording, etc.

Subtitling commoditization looms

Subtitling is not new to the translation industry, and almost every mature CAT/TMS supports .srt SubRip text files. However, linguists have to run a third-party video player in a separate window to see their work. They also have to reload and rewind every time to see changes in the subtitles.
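
The .srt format itself is plain text, which is one reason it is so widely supported. As a rough illustration (not tied to any of the tools discussed here), the following Python sketch parses a single SRT cue and computes the characters-per-second reading speed that subtitle editors typically display:

    from datetime import timedelta

    # One cue from a hypothetical .srt file, line by line.
    SRT_LINES = [
        "1",                                  # cue number
        "00:00:01,000 --> 00:00:04,500",      # start and end timecodes
        "No one in the biz wants the next",   # subtitle text, line 1
        "episode to leak early.",             # subtitle text, line 2
    ]

    def parse_timecode(tc):
        """Convert an SRT timecode such as 00:00:01,000 into a timedelta."""
        hours, minutes, rest = tc.split(":")
        seconds, millis = rest.split(",")
        return timedelta(hours=int(hours), minutes=int(minutes),
                         seconds=int(seconds), milliseconds=int(millis))

    def reading_speed(lines):
        """Return characters per second for a single SRT cue."""
        start, end = (parse_timecode(t.strip()) for t in lines[1].split("-->"))
        text = " ".join(lines[2:])
        return len(text) / (end - start).total_seconds()

    print(round(reading_speed(SRT_LINES), 1), "characters per second")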

That’s why, in professional scenarios, subtitlers often use Amara, Subtitle Workshop, Oona captions manager, CaptionHub or similar specialized tools. These tools came from outside the language industry and don’t support translation memories, term bases, or embedded MT.

Previous attempts to create tools that combine the best of both worlds didn’t quite meet with commercial success. In the years following their launch, user numbers for dotsub.com, hakromedia SoundCloud, and videolocalize.com stayed limited. So far, most language industry professionals have viewed media localization as a niche service rather than a growth area, and as a result they haven’t invested in specialized departments and software. But with video content increasing in share, and with media companies posting record revenues, this might eventually change.

However, by the time it does change, translation tools may have reached “good enough” capability. Fast-forward one or two years, and most LSPs might be able to subtitle without extra investment or training. It will become even easier to enter subtitling and compete, leading to price pressure. Subtitling may turn into an even more crowded and low-margin space before you can say “commoditization”.

Dubbing: Home studio vs studio M&A strategy

Dubbing, on the other hand, is a different kind of deal.

So far, the dubbing market has been dominated by larger companies such as Deluxe and SDI Media that provide voice talent in physical sound studios located in multiple countries. Perhaps one of the best examples of this would be Disney’s Let It Go which has been translated into at least 74 languages.

Infrastructure for such projects is costly to build and maintain. Brick-and-mortar studios have high bills and need a constant flow of work to stay profitable. Projects might be hard to find for second-tier language outlets. To have a French studio overloaded and a Croatian studio suffering losses year after year is a realistic scenario for a network company.

The virtual/home studio approach being used by newer players in this field such as TransPerfect, Zoo Digital and Lylo Media, is more scalable and provides acceptable quality for most clients. But will it be enough for high-profile content owners that award the largest contracts?

If the home studio approach produces sustainable growth, commercial software vendors will jump in and replicate the technology, lowering the barriers to entry for dubbing. However, if it fails over 2018-2019, M&A will instead become the go-to-market strategy in the media localization space. Watch out for a smaller-studio acquisition frenzy!

Reference: http://bit.ly/2LVhf6C

Translating in-house? Here’s what you need to know

How does your law firm get translations done? If you do some or all of it in-house, you may run your own translation team; rely on the language skills of lawyers, knowledge managers and other colleagues; or use a combination of the two.

Given the ever-present need to work as efficiently as possible in order to meet delivery deadlines, we’ve seen a number of ways in which law firms try to speed up their in-house translation process. Here are a few of the things they’ve learned along the way:

Be wary of manually reusing content

Translators may look to reuse content from previous translations by copying and pasting chunks of text. It’s an exceedingly common practice — but one that’s fraught with danger. Just as with drafting precedents, starting with existing documents risks missing subtle differences between cases — and horror stories abound of the consequences of these errors (including houses being sold to the wrong person). The end result: most of the time saved reusing content could well be spent fixing errors (either those caught internally or spotted by your client).

Many hands make light work?

Another option is to split a document into sections for several colleagues to work on concurrently. This may be faster and may reduce the chance of errors when compared to reusing old documents, but it raises concerns around consistency between different translators. It also means great care must be taken when the translated document is stitched back together at the end of the process.

Free tools come with a cost

It may be tempting to turn to free online translation tools like Google Translate to make the work go faster, especially for small jobs like birth certificates or even tweets. But law firms are generally against the idea — and understandably so. For starters, free tools aren’t designed to translate complex legal terminology, or to render legal concepts from one jurisdiction to another — so there’s no guarantee of quality or accuracy. Reviewing and amending the translated output could therefore take more time than doing the work from scratch. Besides that, using such tools could put confidential or valuable information at risk.

The right technology is a lifesaver

One reason many law firms struggle to get translations done quickly, accurately and consistently is that they’re doing the work using standard office productivity apps. But as these apps aren’t designed with translators’ needs in mind, they don’t include the features needed to make translation an efficient process.

That’s why you’ll find that firms who are translating in-house successfully are often using computer-assisted translation (CAT) tools.

CAT tools are designed to help translators work faster and smarter. Using technology developed specifically to support translation work, they can:

  • Raise quality and consistency, and accelerate handling of repetitive content, with translation memories and terminology databases that simplify reuse of previously approved content (a minimal lookup sketch follows this list)
  • Increase translators’ productivity with features to increase the speed of translation while safeguarding against mistakes
  • Turn lengthy documents round faster by making it easy for several people to collaborate on the same translation
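
As a rough illustration of the first point above, here is a minimal translation memory lookup sketch using only Python’s standard library. The memory contents and the 75% threshold are invented for the example; real CAT tools use far more sophisticated matching, but the principle of scoring a new sentence against previously approved translations is the same.

    from difflib import SequenceMatcher

    # A tiny, hypothetical translation memory: source segment -> approved translation.
    translation_memory = {
        "The parties agree to the terms below.":
            "Les parties acceptent les conditions ci-dessous.",
        "This agreement is governed by English law.":
            "Le présent contrat est régi par le droit anglais.",
    }

    def best_match(new_sentence, memory, threshold=0.75):
        """Return the closest previously translated segment above a fuzzy-match threshold."""
        scored = [(SequenceMatcher(None, new_sentence, src).ratio(), src, tgt)
                  for src, tgt in memory.items()]
        score, src, tgt = max(scored)
        return (score, src, tgt) if score >= threshold else None

    match = best_match("The parties agree to the terms set out below.", translation_memory)
    if match:
        score, source, suggestion = match
        print(f"{score:.0%} match, suggested translation: {suggestion}")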

Balance speed, quality and cost

At the end of it all, the translation challenge comes down to three variables: cost, quality, and speed. Just as with the outsourcing model, translating in-house brings unique challenges to maintaining the balance between these three variables.

CAT tools can help law firms tip the scales in their favor, by giving them a way to improve speed without compromising on quality.

Reference: http://bit.ly/2OgpXcg

The Translation Industry’s Top-Earning Career Paths

Adaptive’s recruiters are often asked by candidates how they can build their careers to raise their market value and earnings. Here we share our map of the paths which lead to some of the top-paying roles in the global language services industry.

First things first – you don’t have to be a salesperson to make big bucks.

Often when Adaptive is approached by candidates looking to up their earnings in the language services industry, there’s an expectation that only the high-flying BDMs and C-suite management are making top money.

After all, BDMs are on commission plans and signing big customer deals can be very lucrative. And it’s true – top BDMs and sales managers can be making as much as anyone on this list.

But salespeople are not the only ones with strong pay packages in the language industry.

In fact, we’ve left sales out of our list below to offer alternate options to translation and localization industry professionals looking to build their careers.

So here we go – four routes to top-paying roles if cold calling isn’t your thing:

1. Business Unit Leadership

e.g. VP Life Sciences, VP Engineering

Broad-ranging VP titles usually signify a role that is a mix of client relations, operations and specific expertise in a particular area.

Professionals in these positions are in charge of ‘business units’ which operate like mini-companies within the larger organization, focused on one specific area – such as services to the Life Sciences market or engineering services.

This means the VP’s responsibility is wide, often covering a separate profit and loss account for their unit. VPs leading these areas can come from a variety of backgrounds, but have often worked their way up an internal hierarchy where their increasing experience makes them more and more valuable.

They head up hiring, account management and ensure that their company’s service offering continues to be competitive and evolve with the market.

Career Entry Point: Project Manager, Account Manager, BDM

Key Skill: ability to combine rounded business skills with deep subject-matter expertise

Average Salary Range: $100,000 – $160,000 + bonus

2. Internal Technology Management

e.g. CTO, VP of Technology

At the highest level, technology managers need to be more than just experts in localization workflows, and lead areas such as networking, security, compliance, training, technology change management, data recovery and more.

Their focus is on the role technology plays in helping the company reach strategic goals and impacting overall P&L.

Localization career paths typically go from specialist to generalist with candidates building a base in CAT tools, internal and client workflows and then rounding out generalist IT competencies to continue progressing.

Career Entry Point: CAT Tools Specialist, Localization Technology Manager, Localization Engineer

Average Salary Range: $120,000 – $180,000 + bonus

Key skill: ability to visualize and implement technology changes which make high-value improvements to the global organization

3. Operations & General Management

e.g. VP Operations, General Manager

A great goal for Project Managers!

Many of the industry’s top-paid professionals in operations (production) leadership started ‘in the trenches’ as PMs. Growth in this career channel comes from deep first-hand knowledge of internal workflows, aptitude for working directly with key customers and versatile operational skills – organization, planning, financial management and personnel leadership.

As operations candidates move up the career ladder, they broaden their generalist business skills and combine them with their expert knowledge of localization processes to eventually step up and take overall responsibility.

Career Entry Point: Project Manager, QA Manager

Salary range: $120,000 – $150,000 + bonus

Key skill: ability to design and maintain efficient teams and workflows to deliver reliably and profitably for customers

4. Client Solutions Development

e.g. VP Client Solutions, Global Solutions Manager

A specialist team within most LSPs, solutions professionals focus on bridging the gap between sales, production and IT.

Their focus is building creative solutions for prospective and existing customers, which involves customizing, integrating and potentially selecting new tools to bring together clients’ existing technology systems and those used by the LSP.

Many client solutions experts get their start in engineering and are well versed in CAT tools, but also work to develop strong client relationship skills throughout their careers. Often professionals in this space work on the client side for at least a few years, building inside knowledge from the buyer perspective.

At the top of the tree, global managers for solutions teams build some of the most advanced workflows in commercial localization.

Career Entry Point: Localization Engineer, CAT Tools Specialist, Project Manager, Account Manager

Salary range: $130,000 – $150,000 + bonus

Key skill: ability to think creatively to create unique technology-based workflow solutions

* * *

Adaptive Globalization recruits within the translation, localization and language technology sectors from entry-level to VP+.

We love to chat with translation industry professionals about how they can make the right career moves to achieve their goals. Drop me a line at ray.green@adaptiveglobalization.com

You can check out Adaptive Globalization’s full list of vacancies for PMs, Account Managers, Loc Engineers, BDMs and more in our job listings here

Reference: http://bit.ly/2LAenLc

Nimdzi Language Technology Atlas

For this first version, Nimdzi has mapped over 400 different tools, and the list is growing quickly. The Atlas consists of an infographic accompanied by a curated spreadsheet with software listings for various translation and interpreting needs.

As the language industry becomes more technical and complex, there is a growing need for easy-to-understand materials explaining available tech options. The Nimdzi Language Technology Atlas provides a useful view into the relevant technologies available today.

Software users can quickly find alternatives for their current tools and evaluate market saturation in each segment at a glance. Software developers can identify competition and find opportunities in the market with underserved areas.

Reference: https://bit.ly/2ticEyT

Six takeaways from LocWorld 37 in Warsaw

Over the past weekend, Warsaw welcomed Localization World 37, which gathered over 380 language industry professionals. Here is what Nimdzi has gathered from conversations at this premier industry conference.

1. A boom in data processing services

A new market has formed around preparing data to train machine learning algorithms. Between Lionbridge, Pactera, appen, and Welocalize – the leading LSPs that have staked a claim in this sector – the revenue from these services already exceeds USD 100 million.

Pactera calls it “AI Enablement Services”, Lionbridge and Welocalize have labelled it “Global services for Machine Intelligence”, and appen prefers the title “data for machine learning enhanced by human touch.” What these companies really do is a variety of human tasks at scale:

  • Audio transcription
  • Proofreading
  • Annotation
  • Dialogue management

Humans help to train voice assistants and chatbots, image-recognition programs, and whatever else the Silicon Valley disruptors decide to unleash upon the world. One prime example came at the beginning of this year, when Lionbridge recorded thousands of children pronouncing scripted phrases for a child-voice recognition engine.

Machine learning and AI form the second biggest area for venture investment, according to dealroom.co. According to the International Data Corporation’s (IDC) forecast, spending is likely to grow almost fivefold over the next five years, from USD 12 billion in 2017 to USD 57.6 billion. Companies will need lots of accurate data to train their AI, hence there is a significant business opportunity in data cleaning. Compared to crowdsourcing platforms like Clickworker and Figure Eight, LSPs have broader human resource management competence and can compete for a large slice of the market.

2. LSP AI: Separating fact from fantasy

Artificial intelligence was high on the agenda at #Locworld 37, but apart from the advances in machine translation, nothing radically new was presented. If any LSPs use machine learning for linguist selection, ad-hoc workflow building, or predictive quality analytics, they didn’t show it.

On the other hand, everyone is chiming in on the new buzzword. In a virtual show of hands at the AI panel discussion, an overwhelming proportion of LSP representatives voted that they already use some AI in their business. That’s pure exaggeration, to put it mildly.

3. Introducing Game Global

Locworld’s Game Localization Roundtable expanded this year into a fully-fledged sister conference – the Game Global Forum. The two-day event gathered just over 100 people, including teams from King.com, Electronic Arts, Square Enix, Ubisoft, Wooga, Zenimax / Bethesda, Sony, SEGA, Bluehole and other gaming companies.

We spoke to participants on the buying side who believe the content to be very relevant, and vendors were happy with pricing – for roughly EUR 500, they were able to network with the world’s leading game localization buyers. This is much more affordable than the EUR 3,300+ price tag for the rival IQPC Game QA and Localization Conference.

Given the success of Game Global and the continued operation of the Brand2Global event, it’s fair to assume there is room for more industry-specific localization conferences.

4. TMS-buying rampage

Virtually every client company we’ve spoken to at Locworld is looking for a new translation management system. Some were looking for their first solution while others were migrating from heavy systems to more lightweight cloud-based solutions. This trend has been picked up by language technology companies which brought a record number of salespeople and unveiled new offerings.

Every buyer talked about the need for integration as well as end-to-end automation, and shared the “unless there is an integration, I won’t buy” sentiment. Both TMS providers and custom development companies such as Spartan Software are fully booked, churning out new connectors until the end of 2018.

5. Translation tech and LSPs gear up for media localization

Entrepreneurs following the news have noticed that four of the year’s five fastest organically-growing companies are in the business of media localization. Their success made ripples that reached the general language services crowd. LSP voiceover and subtitling studios are overloaded, and conventional CAT tools will roll out media localization capabilities this year. memoQ will have a subtitle editor with video preview, and a bigger set of features is planned for release by GlobalLink.

These features will make it easier for traditional LSPs to hop on the departed train of media localization. However, LSP systems won’t compare to specialized software packages that are tailored to dubbing workflow, detecting and labeling individual characters who speak in videos, tagging images with metadata, and the like.

Reference: https://bit.ly/2JZpkSM

Machine Translation From the Cold War to Deep Learning

In the beginning

The story begins in 1933. Soviet scientist Peter Troyanskii presented “the machine for the selection and printing of words when translating from one language to another” to the Academy of Sciences of the USSR. The invention was super simple — it had cards in four different languages, a typewriter, and an old-school film camera.

The operator took the first word from the text, found a corresponding card, took a photo, and typed its morphological characteristics (noun, plural, genitive) on the typewriter. The typewriter’s keys encoded one of the features. The tape and the camera’s film were used simultaneously, making a set of frames with words and their morphology.

Despite all this, as often happened in the USSR, the invention was considered “useless”. Troyanskii died of stenocardia (angina) after trying to finish his invention for 20 years. No one in the world knew about the machine until two Soviet scientists found his patents in 1956.

It was at the beginning of the Cold War. On January 7th 1954, at IBM headquarters in New York, the Georgetown–IBM experiment started. The IBM 701 computer automatically translated 60 Russian sentences into English for the first time in history.

However, the triumphant headlines hid one little detail. No one mentioned that the translated examples were carefully selected and tested to exclude any ambiguity. For everyday use, that system was no better than a pocket phrasebook. Nevertheless, an arms race of sorts was launched: Canada, Germany, France, and especially Japan all joined the race for machine translation.

The race for machine translation

The vain struggle to improve machine translation lasted for forty years. In 1966, the US ALPAC committee, in its famous report, called machine translation expensive, inaccurate, and unpromising. It recommended focusing on dictionary development instead, which took US researchers out of the race for almost a decade.

Even so, the basis for modern natural language processing was created by these scientists and their attempts, research, and developments. All of today’s search engines, spam filters, and personal assistants appeared thanks to a bunch of countries spying on each other.

Rule-based machine translation (RBMT)

The first ideas surrounding rule-based machine translation appeared in the 70s. Scientists peered over interpreters’ work, trying to compel the tremendously sluggish computers of the day to repeat those actions. These systems consisted of:

  • Bilingual dictionary (RU -> EN)
  • A set of linguistic rules for each language (for example, nouns ending in certain suffixes such as -heit, -keit, -ung are feminine in German)

That’s it. If needed, systems could be supplemented with hacks, such as lists of names, spelling correctors, and transliterators.

PROMT and Systran are the most famous examples of RBMT systems. Just take a look at AliExpress to feel the soft breath of this golden age.

But even they had some nuances and subspecies.

Direct Machine Translation

This is the most straightforward type of machine translation. It divides the text into words, translates them, slightly corrects the morphology, and harmonizes syntax to make the whole thing sound right, more or less. When the sun goes down, trained linguists write the rules for each word.
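
A toy illustration of direct, dictionary-driven translation, with a tiny hypothetical Russian-English dictionary invented for the example: each word is looked up on its own and the output is glued back together, which is exactly why the results read so poorly.

    # A hypothetical bilingual dictionary (RU -> EN), word for word.
    dictionary = {
        "я": "I",
        "видел": "saw",
        "человека": "man",
        "на": "on",
        "холме": "hill",
    }

    def direct_translate(sentence):
        """Direct MT: translate word by word, keep the source word order,
        and apply no real syntactic analysis at all."""
        return " ".join(dictionary.get(word, word) for word in sentence.split())

    print(direct_translate("я видел человека на холме"))
    # -> "I saw man on hill" (articles and agreement are simply missing)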

The output returns some kind of translation. Usually, it’s quite crappy. It seems that the linguists wasted their time for nothing.

Modern systems do not use this approach at all, and modern linguists are grateful.

Transfer-based Machine Translation

In contrast to direct translation, here we prepare first by determining the grammatical structure of the sentence, as we were taught at school. Then we manipulate whole constructions, not individual words. This helps to get quite a decent conversion of word order in translation. In theory.

In practice, it still resulted in verbatim translation and exhausted linguists. On the one hand, it brought simplified general grammar rules. On the other, it became more complicated because of the increased number of word constructions compared with single words.

Interlingual Machine Translation

In this method, the source text is transformed into an intermediate representation that is unified for all the world’s languages (an interlingua). It’s the same interlingua Descartes dreamed of: a meta-language which follows universal rules and transforms translation into a simple “back and forth” task. Next, the interlingua would be converted into any target language, and there was the singularity!

Because of the conversion, interlingual MT is often confused with transfer-based systems. The difference is that the linguistic rules are specific to each individual language and the interlingua, not to language pairs. This means we can add a third language to an interlingua system and translate between all three, something we can’t do in transfer-based systems.

It looks perfect, but in real life it isn’t. It was extremely hard to create such a universal interlingua; many scientists worked on it their whole lives. They didn’t succeed, but thanks to them we now have morphological, syntactic, and even semantic levels of representation. And the Meaning-Text Theory alone cost a fortune!

The idea of an intermediate language will be back. Let’s wait a while.

As you can see, all RBMT systems are dumb and terrifying, which is why they are rarely used except for specific cases (such as weather report translation). Among the advantages of RBMT often mentioned are its morphological accuracy (it doesn’t confuse words), reproducibility of results (all translators get the same result), and the ability to tune it to a subject area (to teach it terms specific to economists or programmers, for example).

Even if anyone were to succeed in creating an ideal RBMT, and linguists enhanced it with all the spelling rules, there would always be exceptions: all the irregular verbs in English, separable prefixes in German, suffixes in Russian, and situations where people just say things differently. Any attempt to take all the nuances into account would waste millions of man-hours.

And don’t forget about homonyms. The same word can have a different meaning in a different context, which leads to a variety of translations. How many meanings can you catch here: I saw a man on a hill with a telescope?

Languages did not develop based on a fixed set of rules — a fact which linguists love. They were much more influenced by the history of invasions over the past three hundred years. How could you explain that to a machine?

Forty years of the Cold War didn’t help in finding any distinct solution. RBMT was dead.

Example-based Machine Translation (EBMT)

Japan was especially interested in fighting for machine translation. There was no Cold War there, but there were reasons: very few people in the country knew English, which promised to be quite an issue at the upcoming globalization party. So the Japanese were extremely motivated to find a working method of machine translation.

Rule-based English-Japanese translation is extremely complicated. The language structure is completely different, and almost all words have to be rearranged and new ones added. In 1984, Makoto Nagao from Kyoto University came up with the idea of using ready-made phrases instead of repeated translation.

Let’s imagine that we have to translate a simple sentence — “I’m going to the cinema.” And let’s say we’ve already translated another similar sentence — “I’m going to the theater” — and we can find the word “cinema” in the dictionary.

All we need is to figure out the difference between the two sentences, translate the missing word, and then not screw it up. The more examples we have, the better the translation.
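
Here is a minimal sketch of that idea in Python, with a made-up example base and dictionary: reuse the translation of a nearly identical stored example and substitute only the word that differs.

    # Hypothetical example base: a source sentence with a known translation.
    example_src = "I'm going to the theater".split()
    example_tgt = "Ich gehe ins Theater".split()

    # Hypothetical bilingual dictionary for the individual words involved.
    dictionary = {"theater": "Theater", "cinema": "Kino"}

    def ebmt_translate(new_sentence):
        """Example-based MT in miniature: reuse the stored translation and
        swap in the dictionary translation of the one word that differs."""
        new_words = new_sentence.split()
        diffs = [i for i, (a, b) in enumerate(zip(example_src, new_words)) if a != b]
        if len(diffs) != 1:
            return None  # this toy only handles a single differing word
        old, new = example_src[diffs[0]].lower(), new_words[diffs[0]].lower()
        return " ".join(dictionary[new] if w == dictionary[old] else w
                        for w in example_tgt)

    print(ebmt_translate("I'm going to the cinema"))
    # -> "Ich gehe ins Kino"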

I build phrases in unfamiliar languages exactly the same way!

EBMT showed scientists from all over the world the way forward: it turns out you can just feed the machine existing translations and not spend years forming rules and exceptions. Not a revolution yet, but clearly the first step towards one. The revolutionary invention of statistical translation would happen just five years later.

Statistical Machine Translation (SMT)

In early 1990, at the IBM Research Center, a machine translation system was shown for the first time that knew nothing about rules or linguistics at all. It analyzed similar texts in two languages and tried to understand the patterns.

The idea was simple yet beautiful. An identical sentence in two languages was split into words, which were then matched up. This operation was repeated about 500 million times to count, for example, how many times “Das Haus” was translated as “house” vs. “building” vs. “construction”, and so on.

If most of the time the source word was translated as “house”, the machine used that. Note that we did not set any rules nor use any dictionaries — all conclusions were drawn by the machine, guided by statistics and the logic that “if people translate it that way, so will I.” And so statistical translation was born.

The method was much more efficient and accurate than all the previous ones. And no linguists were needed. The more texts we used, the better translation we got.

There was still one question left: how would the machine correlate the word “Das Haus” with the word “building” — and how would we know these were the right translations?

The answer was that we wouldn’t know. At the start, the machine assumed that the word “Das Haus” correlated equally with every word in the translated sentence. Next, when “Das Haus” appeared in other sentences, the number of correlations with “house” would increase. That’s the “word alignment algorithm,” a typical task for university-level machine learning.

The machine needed millions and millions of sentences in two languages to collect the relevant statistics for each word. How did we get them? Well, we decided to take the abstracts of the European Parliament and the United Nations Security Council meetings — they were available in the languages of all member countries and are now available for download as the UN Corpora and the Europarl Corpora.

Word-based SMT

In the beginning, the first statistical translation systems worked by splitting the sentence into words, since this approach was straightforward and logical. IBM’s first statistical translation model was called Model one. Quite elegant, right? Guess what they called the second one?

Model 1: “the bag of words”

Model 1 used a classical approach: split into words and count stats. Word order wasn’t taken into account. The only trick was translating one word into multiple words. For example, “Der Staubsauger” could turn into “vacuum cleaner,” but that didn’t mean the reverse was true.

Here are some simple implementations in Python: shawa/IBM-Model-1.
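
For reference, here is a compact toy sketch of the expectation-maximization loop behind Model 1 (a simplified illustration, not taken from the repository above). It shows how alignment probabilities emerge from nothing but co-occurrence counts:

    from collections import defaultdict

    # A tiny, made-up parallel corpus (German -> English).
    corpus = [
        ("das haus".split(), "the house".split()),
        ("das buch".split(), "the book".split()),
        ("ein buch".split(), "a book".split()),
    ]

    # Initialize translation probabilities t(e|f) uniformly.
    t = defaultdict(lambda: 0.25)

    # A few rounds of expectation-maximization, as in IBM Model 1.
    for _ in range(20):
        count = defaultdict(float)   # expected counts for (e, f) pairs
        total = defaultdict(float)   # expected counts for f alone
        for src, tgt in corpus:
            for e in tgt:
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    fraction = t[(e, f)] / norm
                    count[(e, f)] += fraction
                    total[f] += fraction
        for (e, f), c in count.items():
            t[(e, f)] = c / total[f]

    print(round(t[("house", "haus")], 2))  # approaches 1.0 as EM converges
    print(round(t[("the", "das")], 2))     # "das" ends up aligned with "the"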

Model 2: considering the word order in sentences

The lack of knowledge about languages’ word order became a problem for Model 1, and it’s very important in some cases.

Model 2 dealt with that: it memorized the usual place the word takes at the output sentence and shuffled the words for the more natural sound at the intermediate step. Things got better, but they were still kind of crappy.

Model 3: extra fertility

New words appeared in the translation quite often, such as articles in German or using “do” when negating in English. “Ich will keine Persimonen” → “I do not want Persimmons.” To deal with this, two more steps were added in Model 3.

  • The NULL token insertion, if the machine considers the necessity of a new word
  • Choosing the right grammatical particle or word for each token-word alignment

Model 4: word alignment

Model 2 considered word alignment, but knew nothing about reordering. For example, adjectives would often switch places with the noun, and no matter how well the order was memorized, it wouldn’t make the output better. Therefore, Model 4 took into account the so-called “relative order”: the model learned whether two words always switched places.

Model 5: bugfixes

Nothing new here. Model 5 got some more parameters for the learning and fixed the issue with conflicting word positions.

Despite their revolutionary nature, word-based systems still failed to deal with cases, gender, and homonymy. Every single word was translated in one single “correct” way, according to the machine. Such systems are not used anymore, as they’ve been replaced by the more advanced phrase-based methods.

Phrase-based SMT

This method is based on all the word-based translation principles: statistics, reordering, and lexical hacks. However, for learning, it split the text not only into words but also into phrases. These were, to be precise, n-grams: contiguous sequences of n words in a row.

Thus, the machine learned to translate steady combinations of words, which noticeably improved accuracy.
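
A quick illustration of what splitting into n-grams means in practice, using a made-up sentence:

    def ngrams(words, n):
        """Return all contiguous sequences of n words."""
        return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

    sentence = "machine translation is far from solved".split()
    for n in (1, 2, 3):
        print(n, ngrams(sentence, n))
    # Phrase-based SMT collects statistics over these word sequences
    # rather than over isolated words.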

The trick was that the phrases were not always simple syntactic constructions, and the quality of the translation dropped significantly if anyone aware of linguistics and sentence structure interfered. Frederick Jelinek, a pioneer of computational linguistics, once joked about it: “Every time I fire a linguist, the performance of the speech recognizer goes up.”

Besides improving accuracy, the phrase-based translation provided more options in choosing the bilingual texts for learning. For the word-based translation, the exact match of the sources was critical, which excluded any literary or free translation. The phrase-based translation had no problem learning from them. To improve the translation, researchers even started to parse the news websites in different languages for that purpose.

Starting in 2006, everyone began to use this approach. Google Translate, Yandex, Bing, and other high-profile online translators worked as phrase-based right up until 2016. Each of you can probably recall the moments when Google either translated the sentence flawlessly or resulted in complete nonsense, right? The nonsense came from phrase-based features.

The good old rule-based approach consistently provided a predictable though terrible result. The statistical methods were surprising and puzzling. Google Translate turns “three hundred” into “300” without any hesitation. That’s called a statistical anomaly.

Phrase-based translation became so popular that when you hear “statistical machine translation,” it is usually what is meant. Up until 2016, all studies lauded phrase-based translation as the state of the art. Back then, no one even suspected that Google was already stoking its fires, getting ready to change our whole image of machine translation.

Syntax-based SMT

This method should also be mentioned, briefly. Many years before the emergence of neural networks, syntax-based translation was considered “the future of translation,” but the idea did not take off.

The proponents of syntax-based translation believed it was possible to merge it with the rule-based method. It’s necessary to do quite a precise syntax analysis of the sentence — to determine the subject, the predicate, and other parts of the sentence, and then to build a sentence tree. Using it, the machine learns to convert syntactic units between languages and translates the rest by words or phrases. That would have solved the word alignment issue once and for all.

The problem is that syntactic parsing works terribly, despite the fact that we considered it solved a while ago (as we have ready-made libraries for many languages). I tried to use syntactic trees for tasks a bit more complicated than parsing out the subject and the predicate. And every single time I gave up and used another method.

Let me know in the comments if you succeed using it at least once.

Neural Machine Translation (NMT)

A quite amusing paper on using neural networks in machine translation was published in 2014. The Internet didn’t notice it at all, except Google — they took out their shovels and started to dig. Two years later, in November 2016, Google made a game-changing announcement.

The idea was close to transferring style between photos. Remember apps like Prisma, which enhanced pictures in some famous artist’s style? There was no magic. The neural network was taught to recognize the artist’s paintings. Next, the last layers containing the network’s decision were removed. The resulting stylized picture was just the intermediate image that the network produced. That’s the network’s fantasy, and we consider it beautiful.

If we can transfer a style to a photo, what if we try to impose another language on a source text? The text would be that precise “artist’s style,” and we would try to transfer it while keeping the essence of the image (in other words, the essence of the text).

Imagine I’m trying to describe my dog — average size, sharp nose, short tail, always barks. If I gave you this set of the dog’s features, and if the description was precise, you could draw it, even though you have never seen it.

Now, imagine the source text is the set of specific features. Basically, it means that you encode it, and let the other neural network decode it back into text, but in another language. The decoder only knows its own language. It has no idea about the features’ origin, but it can express them in, for example, Spanish. Continuing the analogy, it doesn’t matter how you draw the dog — with crayons, watercolor or your finger. You paint it as you can.

Once again — one neural network can only encode the sentence into a specific set of features, and another one can only decode them back into text. Neither knows anything about the other, and each of them knows only its own language. Recall something? Interlingua is back. Ta-da.

The question is, how do we find those features? It’s obvious when we’re talking about the dog, but how do we deal with text? Thirty years ago, scientists already tried to create a universal language code, and it ended in total failure.

Nevertheless, we have deep learning now. And that’s its essential task! The primary distinction between deep learning and classic neural networks lies precisely in the ability to search for those specific features, without any idea of their nature. If the neural network is big enough, and there are a couple of thousand video cards at hand, it’s possible to find those features in text as well.

Theoretically, we can pass the features gotten from the neural networks to the linguists, so that they can open brave new horizons for themselves.

The question is, what type of neural network should be used for encoding and decoding? Convolutional Neural Networks (CNN) fit perfectly for pictures since they operate with independent blocks of pixels.

But there are no independent blocks in text — every word depends on its surroundings. Text, speech, and music are always sequential. So recurrent neural networks (RNN) would be the best choice to handle them, since they remember the previous result — the prior word, in our case.
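
As a minimal sketch of the encoder/decoder idea described above, here is a stripped-down recurrent sequence-to-sequence model in PyTorch (assuming the torch library is installed; the vocabulary sizes and dimensions are arbitrary, and a real system adds attention, beam search, and training on millions of sentence pairs):

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        """Reads the source sentence and compresses it into hidden-state 'features'."""
        def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

        def forward(self, src_ids):
            _, hidden = self.rnn(self.embed(src_ids))
            return hidden  # the representation handed to the decoder

    class Decoder(nn.Module):
        """Generates target-language words step by step from the encoder's features."""
        def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
            self.out = nn.Linear(hid_dim, vocab_size)

        def forward(self, tgt_ids, hidden):
            output, hidden = self.rnn(self.embed(tgt_ids), hidden)
            return self.out(output), hidden  # scores over the target vocabulary

    # Toy usage with made-up token ids (batch of 1, source length 5, target length 4).
    encoder, decoder = Encoder(vocab_size=1000), Decoder(vocab_size=1200)
    src = torch.randint(0, 1000, (1, 5))
    tgt = torch.randint(0, 1200, (1, 4))
    features = encoder(src)
    logits, _ = decoder(tgt, features)
    print(logits.shape)  # torch.Size([1, 4, 1200])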

Now RNNs are used everywhere — Siri’s speech recognition (it’s parsing the sequence of sounds, where the next depends on the previous), keyboard’s tips (memorize the prior, guess the next), music generation, and even chatbots.

In two years, neural networks surpassed everything that had appeared in the past 20 years of translation. Neural translation contains 50% fewer word order mistakes, 17% fewer lexical mistakes, and 19% fewer grammar mistakes. The neural networks even learned to harmonize gender and case in different languages. And no one taught them to do so.

The most noticeable improvements occurred in fields where direct translation was never used. Statistical machine translation methods always worked using English as the key intermediate language. Thus, if you translated from Russian to German, the machine first translated the text into English and then from English into German, which leads to a double loss.

Neural translation doesn’t need that — only a decoder for the target language is required for it to work. That was the first time that direct translation between languages with no common dictionary became possible.

The conclusion and the future

Everyone’s still excited about the idea of a “Babel fish” — instant speech translation. Google has made steps towards it with its Pixel Buds, but in fact it’s still not what we were dreaming of. Instant speech translation is different from regular translation: you need to know when to start translating and when to shut up and listen. I haven’t seen suitable approaches to solving this yet. Unless, maybe, Skype…

And here’s one more empty area: all the learning is limited to sets of parallel texts. The deepest neural networks still learn from parallel texts. We can’t teach a neural network without providing it with a source. People, by contrast, can complement their lexicon by reading books or articles, even without translating them into their native language.

If people can do it, the neural network can do it too, in theory. I found only one prototype that attempts to get a network which knows one language to read texts in another language in order to gain experience. I’d try it myself, but I’m silly. Ok, that’s it.

Reference: https://bit.ly/2HCmT6v