Abstract
This article aims at giving an overview of how CAT Tools work and is specifically intended for those translators who are familiarizing themselves with these tools for the first time. A CAT Tool is software whose functioning is based on the creation of a Translation Memory, which facilitates and speeds up the translator's work. After having briefly explained how the tools work, the advantages and disadvantages of their use are analyzed. Finally, suggestions are given for those who would like to know more about these tools and start using them.
1. A little bit of history
Although Petr Trojanski invented "a machine for selecting and typing words when translating one language into another or several others simultaneously" in the thirties (this definition can be found in the patent granted to him in 1933), the history of CAT Tools actually begins in the Cold War years, when the information collected by the intelligence services had to be translated without delay. For this reason, considerable funds were allocated, for the first time, to translation technology. The first attempts at machine translation were made in specialized research centers and financed mainly by the USA and the USSR. The term Machine Translation (MT) was coined in 1947 by Warren Weaver who, in his famous memoranda, defended the feasibility of developing an automatic translation program. Systran (an acronym for System Translation) was established in those years and is still used by the European Commission. The American researcher Toma is the inventor of this system, which was used by the U.S. Air Force for gisting reports and documents written in Russian. Despite the initial enthusiasm and the belief that translators could be replaced by machines in the near future, the results did not meet expectations, and the funds soon stopped flowing. It was at this time, between the late 60s and early 70s, that a novel approach was suggested: the machine to be invented should not translate automatically, but rather facilitate the work of the human translator.
The first attempts consisted of terminology databases; the idea of translation memories, i.e., of a mechanism that forms the basis of today's computer-aided translation software, began to gain acceptance in the late 70s. TSS (Translation Support System), the first CAT tool developed by the U.S. company Alpnet, debuted in the mid-80s. However, the acceptance of this system was limited because of its high cost, which made it affordable only to large companies doing massive amounts of translation. IBM was one of the first purchasers of this system.
In the second half of the 80s, the Dutch company INK developed a system named "TextTools," inspired by TSS; Trados, a company established in 1984 in Stuttgart, became its official dealer in Germany.
The 90s witnessed the expansion of the CAT market, as vendors began to offer the software to small businesses and freelance translators, although both the prices and the system requirements were initially still too high for many of them. The introduction of the Internet and the possibility for translators to exchange data worldwide required adaptation and the introduction of generally acceptable standards. Translation memories represented such a standard, and their adoption was soon followed by an exponential growth of the market for computer-aided translation software.
2. Basic Principle
2.1 Preamble
In order to provide a general overview of these tools, we will now describe in detail three software packages. First we will dedicate our attention to the most popular tool, Trados. Then we will examine Wordfast, which is number two despite being easier to use and less expensive. We will then describe DéjàVu.
2.2 Introduction
CAT Tools contain no program based on machine translation and no ready-made bilingual dictionary. The "dictionary" is created by the translator with each translation and revision. Looking for a term with a CAT tool means searching through the previously created translation memory. The result of the search is not the equivalent term in the other language, but the text string in which the term occurs. The process is certainly slower and more complicated compared to machine translation, but the result is more interesting in the long run.
2.3 Trados
To make the principle of operation of the computer-aided translation software as easily understandable as possible, we shall describe Trados, the most popular CAT tool today, as an example.
The first thing to do when Trados is called upon for translating a text is to open the Translator's Workbench, even before opening the source text. This "workbench" will remain open during the entire translation process. As the next step, the text to be translated is segmented. This is done automatically and is based on the punctuation marks present, the comma usually being an exception. However, the translator can make the segments longer or shorter, depending on the type of text and the translator's personal preferences. The advantage of working with shorter segments is that shorter segments are more likely to recur in the text being translated or in future texts.
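The segmentation step can be illustrated with a minimal sketch. It is not Trados's actual algorithm: the rule used here (a segment ends at a period, exclamation mark, question mark, semicolon, or colon followed by whitespace, while commas never end a segment) and the names SEGMENT_END and segment are assumptions made for illustration only.

```python
import re
from typing import List

# Assumed rule: a segment ends after . ! ? ; or : when followed by whitespace.
# Commas are deliberately not treated as segment boundaries.
SEGMENT_END = re.compile(r"(?<=[.!?;:])\s+")


def segment(text: str) -> List[str]:
    """Split a text into translation segments at the assumed boundary marks."""
    return [part.strip() for part in SEGMENT_END.split(text) if part.strip()]


sample = "Open the valve. Wait ten seconds, then close it. Is the pump running?"
for number, seg in enumerate(segment(sample), start=1):
    print(number, seg)
```

A rule this simple would also split after abbreviations such as "e.g." when they are followed by a space, which is one reason a real tool lets the translator lengthen or shorten segments by hand.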
The translator now has two windows on the screen:
The blue window on the top contains the first segment of the source text; the translation is to be inserted into the second, yellow bottom window. The segments are delimited by tags, i.e., unchangeable codes, which show the beginning and the end of the unit to be translated and are part of Trados's language.
In this context, it should be pointed out that the first advantage of using Trados is that the translator immediately sees which parts of the text are to be translated, which makes it easy to concentrate on the essential. Whoever has attempted to translate a text in electronic format knows how difficult it is, since each time the already translated source text must be deleted. As an alternative, one must work with two texts side-by-side, or open the original text on the screen again and then return to the already translated text. This process is not only annoying, but also often results in careless mistakes and increases the risk of skipping a sentence or a word. The situation becomes even worse if the text to be translated is received as hard copy. If the translator does not have the possibility of converting the text into an electronic format by using optical character recognition (OCR, another tool to make the translator's life easier), he or she must constantly shift the focus between the paper and the screen. Finding the last translated sentence on the paper and adapting to two or more different type sizes and typefaces adds to the difficulty and the aggravation. With Trados, the translator does not have to look for the sentence to be translated or the point where the translated text is to be inserted, since they are both located in the same area of the screen, are marked with different colors, and have the same type size and typeface.
Once the translation has been inserted in the bottom window, the program saves both source and translated segments in the TM (translation memory). Thanks to this process, Trados can automatically suggest, in the bottom window, the saved translation every time this segment occurs in this or a future translation. The translator is then free to accept it unchanged or to modify it.
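The store-and-suggest cycle described above can be pictured with a toy translation memory. This is only a sketch under simple assumptions: the class name TranslationMemory and its methods are hypothetical, the memory lives in a plain dictionary, and only identical segments are retrieved.

```python
from typing import Dict, Optional


class TranslationMemory:
    """Toy translation memory: maps a source segment to its saved translation."""

    def __init__(self) -> None:
        self._units: Dict[str, str] = {}

    def store(self, source: str, target: str) -> None:
        """Save a confirmed source/target pair, replacing any earlier version."""
        self._units[source] = target

    def suggest(self, source: str) -> Optional[str]:
        """Return the saved translation of an identical segment, or None."""
        return self._units.get(source)


tm = TranslationMemory()
tm.store("Close the valve.", "Chiudere la valvola.")

print(tm.suggest("Close the valve."))  # identical segment: the saved translation
print(tm.suggest("Open the valve."))   # unseen segment: None
```

The translator remains free to overwrite a suggestion; calling store again with the same source segment simply replaces the saved translation.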
In some cases, the program can also suggest translations of similar, but not 100% identical segments, depending on the Minimum Match Value set by the translator before starting work. If a 100% match is set, Trados will suggest translations only for identical segments, which has the advantage that the translator can be sure, without checking, that the suggested translation is correct. In this case, however, the translator gets no suggestion from the CAT tool for segments that differ from each other, even if only by a single word. Therefore, most translators set a Minimum Match Value of 60% so that they can also consider similar segments (fuzzy matches).
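The effect of the Minimum Match Value can be sketched as follows. The similarity measure is an assumption: the example uses the ratio computed by SequenceMatcher from Python's standard difflib module, which is not the scoring formula used by Trados, and the function name best_match is hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


def best_match(
    segment: str, memory: Dict[str, str], minimum_match: float = 0.60
) -> Optional[Tuple[float, str, str]]:
    """Return (score, stored source, stored translation) for the most similar
    stored segment whose score reaches minimum_match, or None if there is none."""
    best: Optional[Tuple[float, str, str]] = None
    for source, target in memory.items():
        # Character-based similarity between 0.0 and 1.0 (assumed measure).
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= minimum_match and (best is None or score > best[0]):
            best = (score, source, target)
    return best


memory = {"Close the valve.": "Chiudere la valvola."}

# With a 100% threshold, a segment that differs by one word gets no suggestion.
print(best_match("Close the red valve.", memory, minimum_match=1.0))
# With a 60% threshold, the similar segment is offered as a fuzzy match.
print(best_match("Close the red valve.", memory, minimum_match=0.60))
```

A fuzzy match is only a starting point: the stored translation still has to be edited so that it covers the word that differs.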
If, during the translation process, a term comes up which the translator knows is stored in the memory, it is marked and a Concordance command is entered, which immediately retrieves all segments in which this term occurs.
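A concordance search of this kind can be sketched in a few lines. The function name concordance and the representation of the memory as a list of source/target pairs are assumptions made for illustration; the search shown is a plain case-insensitive substring test.

```python
from typing import List, Tuple


def concordance(term: str, memory: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return every stored (source, target) pair whose source contains the term."""
    needle = term.lower()
    return [(source, target) for source, target in memory if needle in source.lower()]


memory = [
    ("Close the valve.", "Chiudere la valvola."),
    ("The valve is switched by the controller.", "La valvola viene azionata dal regolatore."),
    ("Start the pump.", "Avviare la pompa."),
]

for source, target in concordance("valve", memory):
    print(source, "=>", target)
```

As the article notes, the result is not a dictionary equivalent but the full segments in which the term occurs, so the translator sees the term in context.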
At the end of the process, the translator has a text that looks like this:
{0>At 98°C the electronic engine control (EEC) switches solenoid Y35, which in turn switches the controller of the variable-delivery pump to full load.<}0{>A 98°C il sistema di regolazione elettronica del motore (REM) aziona la valvola magnetica Y35, che a sua volta aziona il regolatore della pompa a portata variabile a pieno carico.<0}
Segments of the source text, which are separated from the translated text by tags, can still be seen. Translation companies often require that the translator deliver not only the TM, but also this "bilingual" file to allow their editor to proofread the entire translation again. In most cases, however, only the file with the translated text is to be delivered. The translator continues to work on the bilingual text, using Workbench again to delete all tags and segments of the source text with the help of the Clean-up function.
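What a clean-up pass does to such a bilingual file can be sketched with a regular expression. This is not the Workbench Clean-up function itself: the pattern is an assumption modelled on the sample unit shown above (an opening {0>, a middle delimiter containing a number, and a closing <0}), and the middle delimiter is matched loosely. The names UNIT and clean_up are hypothetical.

```python
import re

# Assumed shape of one bilingual unit: {0>source<}N{>target<0}
# The middle delimiter is matched loosely ("<", then "{" or "}", a number, "{>").
UNIT = re.compile(
    r"\{0>(?P<source>.*?)<[{}](?P<match>\d+)\{>(?P<target>.*?)<0\}",
    re.DOTALL,
)


def clean_up(bilingual: str) -> str:
    """Replace every bilingual unit with its target text only."""
    return UNIT.sub(lambda unit: unit.group("target"), bilingual)


bilingual = (
    "{0>Close the valve.<}0{>Chiudere la valvola.<0} "
    "{0>Start the pump.<}100{>Avviare la pompa.<0}"
)
print(clean_up(bilingual))
```

The same pattern could be used with finditer to extract the source/target pairs instead of discarding the source, which is essentially the information an editor needs when proofreading the bilingual file.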
It should be pointed out that Trados, like many other CAT tools, allows the translator working with this tool for the first time to create a highly usable TM with the help of the WinAlign function using already translated material and without employing computer-aided translation tools. When the translator resorts to this function, Trados segments both source text and translation and suggests a translated segment for each segment of the source text. All the translator has to do is check and confirm or possibly change the connection between the segments. However, this "adjustment" process may be very time-consuming and is only possible if both the source and target texts are available in electronic format.
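The idea behind such an alignment can be sketched in a deliberately naive form. The article does not describe how WinAlign pairs segments, so the approach here is an assumption: both texts are split into sentences, the sentences are paired in order, and any pair with a missing side is flagged for the translator to check. The names split_sentences and propose_alignment are hypothetical.

```python
import re
from itertools import zip_longest
from typing import List, Tuple


def split_sentences(text: str) -> List[str]:
    """Split a text after . ! or ? followed by whitespace (assumed rule)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def propose_alignment(source_text: str, target_text: str) -> List[Tuple[str, str, bool]]:
    """Pair source and target sentences in order.

    Each result is (source, target, complete); complete is False when one side
    has no counterpart and the translator must correct the connection by hand.
    """
    pairs = zip_longest(
        split_sentences(source_text), split_sentences(target_text), fillvalue=""
    )
    return [(s, t, bool(s) and bool(t)) for s, t in pairs]


source = "Close the valve. Start the pump. Wait."
target = "Chiudere la valvola. Avviare la pompa."
for src, tgt, complete in propose_alignment(source, target):
    print("OK   " if complete else "CHECK", src, "|", tgt)
```

Pairing by position alone goes wrong as soon as the translator merged or split a sentence, which is why the proposed connections have to be checked and confirmed one by one.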
A new version of Trados has recently been launched: SDL Trados Studio 2009, which, however, does not replace the previous version, but rather supplements it. The basic principle of the new version is no different from the one described above, although it does, of course, feature enhancements, e.g., in the range of supported file formats. The most recent functions include AutoSuggest, which shows translations of words and expressions stored in the memory while the text is being typed, and AutoPropagation, which inserts all repetitions automatically and in real time; in addition, PDF files can now be translated directly. Unlike the previous version, Studio 2009 does not use Word's interface, but it includes a license for Trados 2007, so that those who wish to translate Word documents in Word may continue to do so.
2.4 Wordfast
Wordfast has recently undergone a rather revolutionary facelift. In January 2009, Wordfast Pro 6.0 was launched, which is being sold as an alternative to Wordfast Classic. The basic principle of both versions, as well as of Trados, is the combination of two technologies: segmentation and translation memory. Let us briefly describe Wordfast Classic and Pro.
Unlike other CAT tools, Wordfast Classic is not actually a program, but a suite of Microsoft Word macros. This is why it is especially well-suited for those just starting out in the world of CAT tools. Since it is based on Word, it lets the beginner move on well-known terrain without having to deal with a new program and a new window.
Wordfast Classic is certainly an easy-to-use tool, although this ease of use is offset by reduced performance and flexibility. One example is the Glossaries, which, while they will not satisfy the terminologist's requirements, are adequate for the translator's needs. Also, compared with Trados and other CAT tools, Wordfast Classic handles fewer formats, which makes it easier to use, but does not contribute to the popularity of the product.
In conclusion, although Wordfast Classic has a limited performance range compared to other computer-aided tools, it is an excellent aid for those who wish to enter the world of CAT tools, especially for Word users. And it is clearly a better value compared to other programs.
The greatest innovation of Wordfast Pro is that, unlike the previous versions, it is a standalone program not based on Word. Its graphical design is also new: the interface is clear, simple, and intuitive. In addition, the visual appearance of the page and the keyboard shortcuts can be customized. The best innovations include faster Analyze and Clean-up functions, the retrieval of repeating segments, and an expanded range of compatible file formats. For a full evaluation, a few more months will be necessary, but in the meantime a free demo version of Wordfast Pro can be downloaded from the website www.wordfast.com.
2.5 Atril DéjàVu
DéjàVu has been around since 1993 and is being offered as a relatively easy-to-use program compared with other CAT tools. One of the most interesting aspects of the Atril software is its capability to import hundreds of files in a single operation without changing the file structures. Multiple files can be displayed in a single window or in different windows. Certain functions, such as search, replace, filter, and viewing lines containing only selected words and expressions, can be applied to all segment pairs (source text and translation) appearing in a window. Thus, all sentences in which a certain expression is repeated can be reviewed again, e.g., to check whether a word was always translated in the same way.
Another interesting function of DéjàVu is EBMT (Example-Based Machine Translation), which few other CAT tools feature. It is a function that is different from the fuzzy match of Trados or Wordfast, since it does not only retrieve from the memory the segments that are similar to the segment to be translated, and which may differ from it by a single word or by a comma instead of a period, but is also capable of "correcting" the differences. Interestingly, the proposed version is often the correct one. The following example should illustrate the above principle.
The following sentence is to be translated from English into Italian:
The first chapter systematically discusses the traditional foreign trade theories
DéjàVu searches the memory and suggests a similar, although not identical sentence:
Il primo capitolo tratta in maniera sistematica i tradizionali modelli del commercio internazionale (The first chapter systematically discusses the traditional foreign trade models)
In addition, the terminology database contains the following expressions:
foreign trade models = modelli del commercio internazionale
foreign trade theories = teorie del commercio internazionale
DéjàVu recognizes that the two sentences differ by the strings "foreign trade models" and "foreign trade theories," and since the terminology database contains both, it automatically replaces the translation of "foreign trade models" with that of "foreign trade theories," which results in the following sentence:
Il primo capitolo tratta in maniera sistematica le tradizionali teorie del commercio internazionale
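The substitution step of this example can be sketched as follows. This is not DéjàVu's implementation: the function name repair_match is hypothetical, and the logic is a plain string replacement driven by the two terminology entries given above.

```python
from typing import Dict


def repair_match(
    new_source: str, match_source: str, match_target: str, termbase: Dict[str, str]
) -> str:
    """Swap the translation of a term found only in the stored source segment
    for the translation of a term found only in the new source segment."""
    for old_term, old_translation in termbase.items():
        # The old term must occur in the stored segment but not in the new one.
        if old_term not in match_source or old_term in new_source:
            continue
        for new_term, new_translation in termbase.items():
            if (
                new_term in new_source
                and new_term not in match_source
                and old_translation in match_target
            ):
                return match_target.replace(old_translation, new_translation)
    return match_target  # nothing to repair: return the fuzzy match unchanged


termbase = {
    "foreign trade models": "modelli del commercio internazionale",
    "foreign trade theories": "teorie del commercio internazionale",
}
new_source = "The first chapter systematically discusses the traditional foreign trade theories"
match_source = "The first chapter systematically discusses the traditional foreign trade models"
match_target = (
    "Il primo capitolo tratta in maniera sistematica "
    "i tradizionali modelli del commercio internazionale"
)
print(repair_match(new_source, match_source, match_target, termbase))
```

Note that this naive replacement leaves the article "i" in front of "tradizionali teorie", whereas the sentence proposed in the example above reads "le tradizionali teorie". Adjusting the surrounding words for gender and number requires linguistic knowledge that a plain string substitution does not have.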
While certain functions, such as recognizing inflected forms, are handled better by other programs, in conclusion it can be stated that DéjàVu represents an excellent alternative to Trados for many reasons, not least thanks to qualities such as stability and customizability.
3. Advantages and disadvantages
The most important advantages of CAT tools are based on the above explanations and do not apply only to specific texts: When using these tools, it is immediately clear which parts of the text must be translated; the unchanging portions are transferred accurately and directly; the time savings due to repeating expressions is huge; and expressions are translated consistently. However, there are many other functions that offer the translator a number of other advantages.
Most CAT tools let the translator work with formats other than Word (Excel, PowerPoint, Visual Studio, Java, HTML, XML, etc.) without modifying the format. In files created in a tagged language or which have special page breaks, the CAT tool leaves the layout unchanged, which allows the translator to focus exclusively on the translation.
Another common feature of many CAT tools is the creation of easily and rapidly retrievable terminology cards. In the past, most translators had to work with terminology cards and glossaries that were often the product of hard work, especially in the case of specialized and complicated texts. CAT tools allow proper terminology databases to be created, which contain not only the expression in the source and target languages, but also the context, examples, and images; the categories provided for describing the expression can be freely customized. Of course, an expression contained in the database can be retrieved during the translation in order to display the appropriate word in the target language immediately.
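A terminology card of the kind described above can be pictured as a small record. The sketch below is an assumption about how such an entry might be modelled, not the data format of any particular tool; the class and field names (TermEntry, TermBase, custom_fields, and so on) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TermEntry:
    """One terminology card: the term pair plus optional descriptive data."""
    source: str
    target: str
    context: str = ""
    examples: List[str] = field(default_factory=list)
    image_path: Optional[str] = None                              # optional illustration
    custom_fields: Dict[str, str] = field(default_factory=dict)   # user-defined categories


class TermBase:
    """Toy terminology database keyed by the lower-cased source term."""

    def __init__(self) -> None:
        self._entries: Dict[str, TermEntry] = {}

    def add(self, entry: TermEntry) -> None:
        self._entries[entry.source.lower()] = entry

    def find_in(self, segment: str) -> List[TermEntry]:
        """Return every entry whose source term occurs in the segment."""
        text = segment.lower()
        return [entry for key, entry in self._entries.items() if key in text]


termbase = TermBase()
termbase.add(
    TermEntry(
        source="variable-delivery pump",
        target="pompa a portata variabile",
        context="hydraulics",
        custom_fields={"client": "example client"},
    )
)

for entry in termbase.find_in("Check the variable-delivery pump before starting."):
    print(entry.source, "=>", entry.target)
```

Keeping the user-defined categories in a free-form dictionary mirrors the point made above that the descriptive fields can be customized by the translator.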
Of course, CAT tools, like most computer programs, have their shortcomings and sometimes do not function as they should for unknown reasons. While considerable progress has been made regarding trustworthiness, these tools are not yet absolutely reliable, especially in the case of complicated applications. It must be stated that difficulties often result from the user's insufficient familiarity with the software, and their resolution is often trivial. Learning these programs requires time, especially if the translator wishes to make use of all available resources. Thanks to the abundance of information available on the Web, such as user forums, which often provide solutions to problems, the tutorials provided with the software, and professional associations, which offer courses and specific workshops, it has never been so easy for translators to familiarize themselves with the different CAT tools.
4. Conclusions
We have seen that the principle of computer-aided translation software (CAT tools) is based on the segmentation of the text and the creation of a translation memory. While the underlying principles of most tools are very similar, each tool is best suited for a different application and meets different requirements. For example, Wordfast Classic is best suited for those working exclusively with texts in Word format. In contrast, those who work with varied and specialized texts will prefer Trados. DéjàVu is the ideal solution for those working with very large projects consisting of many files in different formats (e.g., for translating Web pages).
Therefore, we recommend that you find out which software package best meets your needs and test it before buying (almost all vendors offer a free demo version for download from their web site after you register). Do not get discouraged after the first difficulties. When you are ready to spend time, money, and patience to familiarize yourself with the world of CAT tools and to use all available resources, you'll wonder how you were able to live without them.