History of Computer-Aided Translation

Proposals for the translator’s workstation can be traced back over more than 20 years. Their full integration and acceptance had to await technical developments of the 1990s, but their desirability for the effective utilization of machine aids and translation tools was recognized long ago. The title of workstation has been applied to a number of translation aids, but here we are concerned only with the type of workstation intended for direct use by professional translators knowing both source and target languages, and retaining full control over the production of their translations. Workstations and other computer-based translation tools are traditionally referred to as systems for “machine aided human translation” (MAHT), in order to distinguish them from MT systems with some kind of human assistance either before or after processing (pre- and post-editing), known often as “human aided machine translation” (HAMT).

The 1966 ALPAC report encouraged support for basic computational linguistics and the development of computer-based aids for translators.

Computer-based terminological resources were received with increasing favor by translators from the late 1960s. Particularly in large governmental and industrial organizations, there was an increasingly pressing need for fast access to up-to-date glossaries and dictionaries in science, technology, economics and the social sciences in general. The difficulties were clear: rapidly changing terminology in many scientific and technical disciplines, the emergence of new concepts, new techniques and new products, the often insufficient standardization of terminology, and the multiplicity of information sources of variable quality and reliability. It was recognized from the outset that online dictionaries for translators could not be the kinds of dictionaries developed in MT systems. Translators do not need the kind of detailed information about grammatical functions, syntactic categories, semantic features, inflected forms, etc. which is to be found in MT lexica, and which is indeed essential for automatic analysis. Nor do translators need to consult dictionaries for items of general vocabulary, which are nonetheless equally essential components of an MT system dealing with full sentences.

In the 1970s, terminology data banks were being built to provide information on demand about individual words or phrases as the basis for the production of glossaries for specific texts, and for the production of published up-to-date specialized dictionaries for general use. Many of the databanks were multilingual, nearly all provided direct online access and most included definitions. In the case of other termbanks, the emphasis was on the provision of terms in actual context.

[…] The databases were intended not just for translators but also for lexicographers and other documentation workers, with facilities for compiling dictionaries and term glossaries, for producing text-related glossaries for machine-aided translation, for direct online access to multilingual terminology databanks, and for accessing already translated texts by means of indexes. The archive of translations, recorded on magnetic tapes, could also be the source of re-usable translation segments. However, the whole complex of interlinked linguistic databases was constrained by the computer technology then available.

The use of a translation archive was elaborated by Peter Arthern (1979) in a proposal for what has now, since the late 1980s, become known as a translation memory. The suggestion was made in a discussion of the potential use of computer-based terminology systems in the European Commission. After stressing the importance of developing multilingual text processing tools and of providing access to terminological databanks, Arthern went on to comment that many EC texts were highly repetitive, frequently quoting whole passages from existing EC documents and that translators were wasting much time re-translating texts which had already been translated. He proposed the storage of all source and translated texts, the ability to quickly retrieve any parts of any texts, and their immediate insertion into new documents as required. He referred to his concept as “translation by text-retrieval”, and envisioned an early model translator’s workstation which could still accommodate a full MT system. The concept would not come to fruition for another decade or more.
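Arthern's "translation by text-retrieval" can be illustrated with a minimal sketch: store previously translated segments alongside their sources, and when a new segment arrives, retrieve the stored translation of the most similar source segment. This is a modern reconstruction of the idea, not Arthern's design; the archive contents, the `retrieve` function and the use of `difflib` similarity scoring are all illustrative assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical translation archive: source segments mapped to their stored
# translations (the sentences here are invented for illustration).
archive = {
    "The Council adopted the following decision.":
        "Le Conseil a adopté la décision suivante.",
    "This regulation shall enter into force immediately.":
        "Le présent règlement entre en vigueur immédiatement.",
}

def retrieve(segment, threshold=0.7):
    """Return (source, translation, score) for the archived source segment
    most similar to the new segment, or None if nothing scores above the
    threshold."""
    best = None
    for source, target in archive.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None

# An exact repetition, of the kind Arthern observed in EC documents,
# is retrieved with a perfect similarity score and could be inserted
# directly into the new document.
match = retrieve("The Council adopted the following decision.")
```

A repeated segment comes back with score 1.0, while an unrelated sentence falls below the threshold and returns nothing; commercial translation memories refine this basic lookup with fuzzy-match percentages shown to the translator.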

One of the most decisive moments in the development of the future translator’s workstation is now considered to be the (initially limited) circulation of a memorandum in 1980 by Martin Kay. It combined a critique of the then-prevailing approach to MT, namely the aim of producing systems which could essentially replace human translators or at best relegate them to post-editing and dictionary-updating roles, with an argument for the development of translation tools which would actually be used by translators. Since this was before the widespread availability of personal computers, the context was a network of terminals linked to a mainframe computer. Kay’s basic idea was that existing text-processing tools could be augmented incrementally with translation facilities. The basic need was a good multilingual text editor and a terminal with a split screen. To this would be added a facility for automatically looking up any word or phrase in a dictionary, and the ability to refer to the translator’s previous decisions to ensure consistency in translation. Finally, the system would provide automatic translation of text segments, which the translator could either leave entirely to the machine and post-edit afterwards, or carry out interactively, with the computer asking the translator to resolve ambiguities.

Alan Melby, in 1981, put forward the use of a bilingual concordance as a valuable tool for translators. It enabled translators to identify text segments with potential translation equivalents in relevant contexts. As an example, he showed an English text segmented into phrases and its corresponding French version, segmented likewise. The computer program would then create a concordance based on selected words or word pairs, displaying words in context. The concordance could be used not only as an aid to study and analyze translations, but also for quickly determining whether or not a given term was translated consistently in technical texts, to assist translators in lexical selection, and in the development of an MT system for some narrow sublanguage. Melby seems to have been the first to suggest the concordance as a translation tool. In his experiment, texts were input manually and correspondences between texts (later called “alignments”) were also made by human judgement. Only the concordancing program was automated, but Melby was clearly looking forward to the availability of electronically produced texts and of automatic alignment. At the same time, he was making specific proposals for a translator’s workstation, quite independently of Kay’s proposals in 1980. Like Kay, Melby wanted the translator to be in control, to make his/her own decisions about when to translate fully and when to post-edit, and he wanted to assist translation from scratch by providing integrated computer aids.
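The core of Melby's bilingual concordance can be sketched in a few lines: given manually aligned segment pairs, collect every pair whose source segment contains a chosen term, so the translator can see how that term was rendered in context and check its consistency. The aligned sentences and the `concordance` function below are invented for illustration; Melby's actual experiment worked on hand-aligned English-French phrases.

```python
# Hypothetical aligned segment pairs; as in Melby's experiment, the
# alignments are assumed to have been made by human judgement.
aligned = [
    ("The valve controls the flow.", "La vanne commande le débit."),
    ("Close the valve before maintenance.", "Fermez la vanne avant l'entretien."),
    ("The pump increases the flow.", "La pompe augmente le débit."),
]

def concordance(term):
    """List every aligned pair whose source segment contains the term,
    so a translator can inspect its translations in context."""
    return [(src, tgt) for src, tgt in aligned
            if term.lower() in src.lower()]

# Every occurrence of "valve" is shown with its French counterpart,
# making it easy to see whether the term was translated consistently.
for src, tgt in concordance("valve"):
    print(f"{src}  ||  {tgt}")
```

Scanning the hits for "valve" shows "vanne" used in both contexts, which is exactly the consistency check Melby had in mind; a term with no hits simply yields an empty listing.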

The aim was the “smooth integration of human and machine translations” (Melby 1982), bringing together various ideas for supporting translators in an environment offering three levels of assistance. At the first level, certain translation aids can be used without the source text having to be in machine-readable form. The translator could start by just typing in the translation. This first level would be a text processor with integrated terminology aids and access to a bilingual terminology data bank, both in the form of a personal file of terms and in facilities for accessing remote termbanks (through telecommunications networks). In addition, there might be access at this level to a database of original and translated texts. At the second level, the source text would be in machine-readable form. It would add a concordancing facility to find all occurrences of an unusual word or phrase in the text being translated, facilities to look up terms automatically in a local term file and display possible translations, and a means of automatically inserting selected terms into the text. The third level would integrate the translator’s workstation with a full-blown MT system. Melby suggested that the ideal system would be one which evaluates the quality of its own output (from “probable human quality” to “deficient”), which the translator could choose to incorporate unchanged, to revise or to ignore.

Both Melby and Kay stressed the importance of allowing translators to use aids in ways they personally found most efficient. The difference between them was that whereas Melby proposed discrete levels of machine assistance, Kay proposed incremental augmentation of the translator’s computer-based facilities: translators could increase their use of computer aids as and when they felt confident and satisfied with the results. For both of them, full automation would play a part only if an MT system made for greater and more cost-effective productivity. These proposals of Kay and Melby were made when text-processing systems still consisted essentially of a range of terminals connected to a mainframe computer and to separate printers for producing publishable final documents. It was natural to envisage networked systems rather than individual workstations. For example, Melby assumed that the future scenario was a “distributed system in which each translator has a microcomputer tied into a loose network to share resources such as large dictionaries” (1982). The situation changed decisively with the appearance of the first personal computers in the mid-1980s, bringing word processing and printing facilities within the range of individual professional translators.

Source: Origins of the Translator’s Workstation (John Hutchins), http://www.hutchinsweb.me.uk/MTJ-1998.pdf
