Going Rogue: Translating Professionally with a Free CAT Tool

While typing the title of this post, the word “professionally” shortly followed by the word “free” made me feel a bit awkward. In most developed nations, where capitalism is a reigning economic reality, such a reference can be considered blasphemy. But since I was going rogue here, I was willing to explore this possibility and ask the million-word question: Can a professional freelance translator do the job using only free tools?

The Objective

Obviously, this is quite a broad question. To research the topic and present the facts in a reasonable way, I needed to define some criteria that would narrow the scope of the inquiry. Being the perfectionist I am, I set forth some very demanding prerequisites that would either prove or bust this whole free proposition. After some serious thinking, these are the bare minimum requirements that I believe a free CAT Tool must meet:

  1. It must be free (i.e., $0).
  2. It must be able to run on all three major Operating Systems (Microsoft Windows, Apple macOS, GNU/Linux).
  3. It must accept directly, or with minimum tinkering, bilingual files from major proprietary CAT Tools.
  4. The translation process must be as seamless as using a proprietary tool, or close enough.
  5. The resulting translated bilingual files must be identical to the ones a proprietary tool would produce and adhere to the client’s delivery specifications.

At first sight, the above points seem quite challenging but anything less would hamper a professional translator’s workflow. And when dealing with actual, paid translation jobs, the last thing you want is for your tools to get in the way, or worse, not allow you to deliver in the way the client expects you to.

For this case, I conducted the experiment in a controlled environment: I acted as both the client and the translator. This allowed me to assess the situation better and push the free CAT Tool scenario to its limits. Thanks to my translation project management experience and earlier freelance translation career, I was in a position to cover both sides effectively, allowing a clear, transparent approach to the process.

The Free CAT Tool

The first step in this endeavor was to locate the right free CAT Tool that would comply with all 5 requirements set in the previous section. After some quick research on Google, I shortlisted the following 3 free tools:

By applying an in-depth analysis to each of the above tools’ capabilities, I eliminated the first two and nominated Heartsome Translation Studio as the suitable free CAT Tool for our experiment. It should be stated here that the eliminated tools are quite capable software (and have the backing of the Open Source community), but for the niche purposes of my strictly defined test they didn’t cover all 5 of the set requirements.

On the other hand, Heartsome Translation Studio proved a perfect fit, covering all requirements and offering a familiar user interface and feature set that matched those provided by major proprietary CAT Tools like SDL Trados Studio, memoQ and Wordfast. That’s no coincidence, since this CAT Tool was once proprietary too! Developed by a Hong Kong-based company (now defunct) named Heartsome Technologies Ltd., this application was pitched against the big guys in an attempt to win a part of the translation tools market. Despite its features and multi-platform capabilities it struggled along, trailing behind the competition. This led the company into financial turmoil that forced them to close shop. However, instead of letting their software sink into oblivion, they took a rather bold decision and open sourced the code, allowing it to be copied, modified and used freely by everyone. This offering to the open source community in general, and to translators in particular, is quite valuable, since it provides everyone with commercial-grade software without the hefty price tag.

Before we proceed to the next section, which deals with the actual process of translating with the chosen free CAT Tool, I’d like to mention 2 more free tools that were needed to complete this experiment:

The first one, Trados Studio TM and TB Converter, is a free multi-platform Java application that converts SDL Trados Studio translation memories (.sdltm) and terminology databases (.sdltb) into formats readable by all current CAT Tools (.tmx and .tbx files). It’s a very handy utility for anyone that doesn’t have access to an installation of SDL Trados Studio, and was needed during my tests when dealing directly with SDL Trados Studio project packages (.sdlppx). This tool is developed and provided for free by the nice folks at Closed Tags.

LibreOffice doesn’t need any introduction, as I’m quite sure you’re familiar with it in one way or another. It’s a powerful free office suite and the best alternative to Microsoft Office. The main reason I needed this software was for handling the older (but still widely used) .doc, .xls and .ppt versions of MS Office files. Heartsome Translation Studio can’t deal with these formats directly (though it handles the newer .docx, .xlsx and .pptx files just fine), so it relies on LibreOffice’s file filters as a backend to convert them into content it can read.

The Process

With our client established, let’s proceed and see how we should handle this case using our free CAT Tool software, Heartsome Translation Studio. We’ll be following the industry’s standard methodology which consists of:

  1. Source file preparation
  2. Analysis and pre-processing
  3. Translation
  4. Proofreading and spell-checking
  5. Verification checks
  6. Delivery to the client

The client sends in (or hands off) the project in the form of an SDL Trados Studio Project package (.sdlppx). The language pair is English (US) into German (DE). The instructions call for the delivery (or handback) of an SDL Trados Studio Return package (.sdlrpx) after completion. Note that our free CAT Tool cannot handle Project packages directly, nor create Return packages. So, and this is very important, we must inform the client upfront that we won’t be able to deliver a Return package, but can deliver bilingual SDLXLIFFs and TMX and TBX exports, which they can use to update their project. In nearly all cases, the client won’t have a problem with this, but it’s crucial to inform them accordingly from the beginning.

So, let’s start the process!

Source File Preparation

Since our chosen free CAT Tool doesn’t support Trados Studio Project packages directly (although it does support the Trados Studio bilingual SDLXLIFF files), we’ll use a little trick to extract the files we need, namely the:

  • SDLXLIFFs (bilingual files)
  • SDLTM (Translation Memory)
  • SDLTB (Terminology Database)

Here are the steps to follow:

  • First we’ll create a folder structure that will help us deal with the phases that will follow. In the folder you use for your translation work (if you don’t have one, pick a suitable location on your system and create one, e.g., with a name like Projects), create the following folders/subfolders:
\---20170628-Project-FooBar_ENU-DEU
    +---1-From_Client
    +---2-Work
    +---3-To_Client
  • In the From_Client folder place the .sdlppx file that was provided by the client.
  • Create a copy of the .sdlppx file and paste it into the Work folder. Rename the file extension from .sdlppx to .zip so your system can recognize it as an archive. Afterwards, extract the .zip file (using your preferred archive extractor tool) in the same folder.
  • You should have a new folder with the name of the .zip file. Enter it, and you should see a structure similar to the following:
|   Sample Project EN-DE SDL_Trados-2017618-21h1m9s.sdlproj
|
+---de-DE
|       SamplePhotoPrinter.doc.sdlxliff
|       SamplePresentation.pptx.sdlxliff
|       SecondSample.docx.sdlxliff
|       TryPerfectMatch.doc.sdlxliff
|
+---en-US
|       SamplePhotoPrinter.doc.sdlxliff
|       SamplePresentation.pptx.sdlxliff
|       SecondSample.docx.sdlxliff
|       TryPerfectMatch.doc.sdlxliff
|
+---Reports
|       Analyze Files en-US_de-DE.xml
|
+---Termbases
|       Printer.sdltb
|
+---Tm
        English-German.sdltm

We have 5 folders (de-DE, en-US, Reports, Termbases, Tm) and one project file (.sdlproj). Each folder contains the files we’ll be using, either directly (SDLXLIFFs) or by converting them (SDLTM and SDLTB). Note that we won’t need the .sdlproj file nor the contents of the Reports folder (the Reports folder contains the word count analysis of the project in XML format, readable only from within Trados Studio or by using my free tool Trados Studio XML Analysis Viewer).

Since our target language is German (DE), we’ll be using the SDLXLIFFs in folder de-DE. The existence of this folder denotes that the files have been pre-processed by the client and, thus, contain pre-translated segments (i.e., 100% Matches and/or Perfect Matches) along with potential locked segments. If this folder was missing, then it would’ve meant that there was no pre-processing, and we would’ve used the SDLXLIFFs in folder en-US instead.
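If you'd rather script the extraction than rename and unzip by hand, here's a minimal Python sketch of the steps above. It relies only on what we've already seen: the .sdlppx file is a ZIP archive, and the target-language folder (when present) holds the pre-processed SDLXLIFFs. The function names and the example paths in the comments are my own placeholders, not part of any tool.

```python
import shutil
import zipfile
from pathlib import Path


def extract_package(package: Path, work_dir: Path) -> Path:
    """Copy a .sdlppx package into work_dir and unpack it as a ZIP archive."""
    work_dir.mkdir(parents=True, exist_ok=True)
    copy = work_dir / package.name
    shutil.copy2(package, copy)  # the original stays untouched in 1-From_Client
    target = work_dir / package.stem
    # zipfile looks at the file's contents, so renaming to .zip isn't needed here.
    with zipfile.ZipFile(copy) as archive:
        archive.extractall(target)
    return target


def pick_bilingual_files(extracted: Path, source: str, target: str) -> list[Path]:
    """Prefer the target-language folder (pre-processed files), else the source one."""
    for folder in (target, source):
        folder_path = extracted / folder
        if folder_path.is_dir():
            files = sorted(folder_path.glob("*.sdlxliff"))
            if files:
                return files
    return []


# Example (placeholder paths):
# project = Path("Projects/20170628-Project-FooBar_ENU-DEU")
# extracted = extract_package(project / "1-From_Client" / "Sample.sdlppx",
#                             project / "2-Work")
# for f in pick_bilingual_files(extracted, "en-US", "de-DE"):
#     print(f)
```

The fallback order mirrors the rule above: use de-DE if the client pre-processed the files, otherwise fall back to en-US.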

With the bilingual files (SDLXLIFFs) available, we’ll move on and convert the Translation Memory (SDLTM) and Terminology Database (SDLTB) to the corresponding formats that are readable by Heartsome Translation Studio. For this process we’ll use the free Trados Studio TM and TB Converter for converting the SDLTM and SDLTB files, along with Heartsome Translation Studio’s internal TBX Maker tool for completing the SDLTB file conversion.

Here are the required steps for the SDLTM conversion:

  • Run the Trados Studio Resource Converter and select the first button, “Convert SDLTM”.
  • Point the “Open” dialog to the folder in which we extracted the .sdlppx file and, specifically, to the Tm folder.
  • Select the .sdltm file and click on the “Open” button. Once the process completes, a message will appear stating the number of translation units that were converted. Additionally, a .tmx file will be created in the Tm folder.

To convert the SDLTB, follow these steps:

  • Run the Trados Studio Resource Converter and select the second button, “Convert SDLTB”.
  • Point the “Open” dialog to the folder in which we extracted the .sdlppx file and, specifically, to the Termbases folder.
  • Select the .sdltb file and click on the “Open” button. In the dialog that appears select the default option, “Comma-separated CSV”, and press “OK”. Once the process completes, a message will appear stating that the termbase was successfully converted. Additionally, a .csv file will be created in the Termbases folder.
  • Launch Heartsome Translation Studio, and from the menu select Tools -> TBX Maker. In the TBX Maker window, select the menu option File -> Open CSV file…
  • In the dialog box that appears, click on “Browse” and select the .csv file we created above (in the Termbases folder). Next, change the “Main Language” to “English (United States) en-US”. Without touching anything else, click “OK”.
  • The window will now be populated with the content of the .csv file. Don’t freak out if you see a lot of columns with strange titles and additional languages. This would be normal since a Termbase can contain extra languages, depending on the number of target languages supported by the project. In our case, we’re only interested in the columns that have to do with the source (English) and target (German) languages.
  • To simplify the process we want to end up with 4 columns, 2 per language which will hold the term and its definition (or description) per language. So, we will scan the columns and identify the ones with the English and German terms, and then find the columns with their descriptions. We need to keep a note of their column numbers for the next steps.
  • From the menu, select Tasks -> Delete Columns…
  • In the “Delete” dialog, select all the other column numbers except the ones we noted further above (which are the 4 columns we need to keep). Click on “Delete specified rows”. You should end up with only the columns we need: Term + Description (per language).
  • From the menu, select Tasks -> Column Property…. This is where we’ll assign the type of each column, so the tool can properly finish the conversion. You should see 4 columns with dropdown boxes, and rows which correspond to the columns we kept in the previous step. In the 1st column (“Type of Column”), assign the value “Term” in all dropdown boxes. In the 2nd column (“Type”), select the value “term” (if the row belongs to a term column), or select the value “descrip” (if the row belongs to a description column). Ignore the 3rd column (“Attribute”). In the 4th column (“Term Language”) select the correct language per term and description. In our case, “en-US English (United States)” and “de-DE German (Germany)”. Click “OK”.
  • From the menu, select File -> Convert to TBX File…. In the dialog box that appears make sure the path in which the .tbx file will be saved to is the Termbases folder. Click “Convert”. Once the process completes, a message will appear stating that the CSV file was successfully converted to a TBX file. Additionally, a .tbx file will be created in the Termbases folder.
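Scanning a wide CSV for the right columns can be tedious, so here's a small Python sketch that lists the header with column numbers and previews what's left after keeping only the four we want. It's purely a helper for the "note the column numbers" step and doesn't replace TBX Maker. The header names in the sample are invented for illustration (your export will have different ones), and numbering columns from 1 is my assumption.

```python
import csv
import io


def numbered_header(csv_text: str) -> list[tuple[int, str]]:
    """Return (column number, title) pairs for the first row, counting from 1."""
    rows = csv.reader(io.StringIO(csv_text))
    return list(enumerate(next(rows), start=1))


def keep_columns(csv_text: str, keep: list[int]) -> list[list[str]]:
    """Return every row reduced to the given 1-based column numbers."""
    rows = csv.reader(io.StringIO(csv_text))
    return [
        [row[i - 1] if i - 1 < len(row) else "" for i in keep]
        for row in rows
    ]


# Hypothetical export: these column titles are made up for the example.
sample = (
    "Entry_ID,English_Term,English_Definition,German_Term,German_Definition,French_Term\n"
    "1,printer,Output device,Drucker,Ausgabegerät,imprimante\n"
)

for number, title in numbered_header(sample):
    print(number, title)

# Keep Term + Description for English and German only.
print(keep_columns(sample, [2, 3, 4, 5]))
```

To run it on a real export, read the .csv file from the Termbases folder into a string first and pass that in place of `sample`.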

If you’re still with me, we now have files ready to be added to a Heartsome Translation Studio project, allowing us to proceed to the next phase.

Analysis and pre-processing

In Heartsome Translation Studio’s main menu select File -> New -> Project…. The New Project Wizard will appear which will guide us in setting up our project. Follow the below steps to complete the wizard:

  • Project Information: Enter the project’s name and click “Next”
  • Language Pairs: Select “English (United States) en-US” as the source language, and “German (Germany) de-DE” as the target language. Click “Next”.
  • Translation Memory: Here we need to declare our working TM, but first we need to create one and then import the .tmx we converted in the previous phase.
  1. Click on the “Create” button and add a name in the “Basic Information” section of the dialog (make sure the “Type” is “File-based TM”).
  2. In the “Location” section, we need to declare the path of the new TM, so click on “Browse” and select the Tm folder we have from the previous phase. Click “Next”.
  3. Here we can import the .tmx file we created further above. Click “Browse” and locate the .tmx file, then click on the “Open” button. Now click “Finish”, to return to the main wizard (a .hstm file has been created in the Tm folder). Click “Next”.
  • Termbase: We’ll follow a process similar to the previous step in order to import our .tbx file into a Termbase for our project.
  1. Click on the “Create” button and add a name in the “Basic Information” section of the dialog (make sure the “Type” is “File-based Termbase”).
  2. In the “Location” section, we need to declare the path of the new Termbase, so click on “Browse” and select the Termbases folder we have from the previous phase. Click “Next”.
  3. Here we can import the .tbx file we created further above. Click “Browse” and locate the .tbx file, then click on the “Open” button. Now click “Finish”, to return to the main wizard (a .hstb file has been created in the Termbases folder). Click “Next”.
  • Add Source Files: This is the step in which we need to add the SDLXLIFFs to our project. Make sure the checkbox “Convert Source Files to XLIFFs” is ticked, and then click on the “Add” button. Locate the SDLXLIFFs in the de-DE folder, select them all (press Ctrl + A) and choose “Open”. Click on “Create”. A new dialog will appear in which you simply need to click on the “Finish” button.

In the “Project” panel, located on the left side of Heartsome Translation Studio’s user interface, you should see the project we just created. Click on the arrow on the left of the name to expand its components. The following folders and subfolders are present:

Here’s a quick rundown of what each folder/subfolder is for:

  • Intermediate: Holds key information of the project’s internals. More specifically:
  1. Report: Contains any analysis reports that have been applied to the project.
  2. SKL: Contains the so-called “skeleton” files which hold information regarding the conversion of the files from/into source/target files.
  • Source: Includes the source files we added with the New Project Wizard.
  • Target: Will hold the fully translated source files.
  • XLIFF: Contains the working files of our project, which are basically the source files converted to XLIFFs.

We’re now ready to analyze the files and compare our results with the analysis report provided by the client. Keep in mind that the numbers won’t match perfectly, since the two CAT Tools count words and matches differently. But the figures should be quite close; if they are, we can safely assume our conversion process was successful and continue with the translation of the project.

Right-click on the XLIFF folder (in the “Project” panel) and select “Analyze Files…”. This will bring up the Analyze Files wizard; click all 3 checkboxes related to the locking of segments (context matches, exact matches, internal repetitions) so we won’t have to deal with these during translation. Click “OK” and you’ll end up with a report similar to this one:

I’ve placed a red rectangle around the total numbers of the report. You should compare these with the totals from the client’s analysis report and check that they’re reasonably close. Here’s the report from the client (from SDL Trados Studio):

Again, I’ve placed a red rectangle around the total words. The most important numbers to make sure are close enough are the “New” and “Total(s)” ones. In our case, we’re very close so we’re good to go!
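"Close enough" is a judgment call, but you can make it explicit. The Python sketch below compares the two reports' figures by relative difference. The 5% tolerance and the word counts are assumptions of mine for illustration; they are not the numbers from the reports above, and you should pick a threshold you're comfortable with.

```python
def relative_gap(client: int, ours: int) -> float:
    """Difference between the two counts as a fraction of the client's count."""
    if client == 0:
        return 0.0 if ours == 0 else float("inf")
    return abs(ours - client) / client


def compare_reports(
    client: dict[str, int], ours: dict[str, int], tolerance: float = 0.05
) -> dict[str, bool]:
    """For each category in the client's report, say whether ours is within tolerance."""
    return {
        key: relative_gap(count, ours.get(key, 0)) <= tolerance
        for key, count in client.items()
    }


# Made-up figures, typed in by hand from the two analysis reports.
client_report = {"New": 1200, "Total": 3400}
our_report = {"New": 1230, "Total": 3385}

print(compare_reports(client_report, our_report))
```

If either category comes back False, I'd recheck the TM import before starting to translate.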

Translation

We’re not going to cover the actual translation process in this post. You can check the Heartsome Translation Studio manual that comes with the application, or press F1 to read the Help file. These are the best sources for learning how to work with this CAT Tool. You’ll find out that the functionality is quite similar to other major CAT Tools, so the process won’t be that difficult to handle.

But, I can offer the following tips:

  • Leave each translated segment with a “Draft” status. This will help you during the next phase that follows.
  • Add comments to segments that need extra attention or research. You can do this by right-clicking on a segment and selecting “Add Comment…”.
  • Always pay attention to the “Termbase” panel (on the bottom right of the screen), and consult/follow its recommendations.

Proofreading and spell-checking

Here, too, we won’t be getting into details regarding this process. Simply put, you should proofread/review each segment that has a “Draft” status, and check your comments (if available). Once a segment has been properly checked, change its status to “Confirmed” by right-clicking on it and selecting “Confirm translation” (or press Ctrl + Enter).

You should always spell-check your translation, so make sure you have the right language support for this task by checking Heartsome Translation Studio’s spell-checking options. In the menu, click Tools -> Options… and in the left-side panel select the option “Spell Checking”. You can consult the manual or help file for more information.

Verification checks

This is a very important step which many translators skip, shooting themselves in the foot! The verification checks make sure that the translated files are technically correct, practically guaranteeing the safe conversion back into target files. Among other things, these checks catch issues such as:

  • Untranslated segments
  • Inconsistent numbers or date formats
  • Missing or wrongly placed tags
  • Extra leading or trailing spaces
  • Term inconsistency etc.

In Heartsome Translation Studio you can run these checks by right-clicking on the “XLIFF” folder in the “Project” panel, and selecting “QA…”. The QA wizard will appear; click on “OK”. After it completes the checks, it will output the results in the “QA Results” panel which is located on the right-side panel in the user interface. You can then go through the results, one-by-one, and double-click on any entry to get to the affected segment.

Note that the results sometimes contain false positives, which are not actual errors. You can safely ignore these after quickly checking a few cases.
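To give a feel for what such checks do under the hood, here's a simplified Python sketch covering four of the issue types listed above. It is not how Heartsome Translation Studio implements its QA, just an illustration: it works on plain strings with HTML-like inline tags, whereas real SDLXLIFF segments carry their tags as XML elements.

```python
import re

TAG = re.compile(r"</?[A-Za-z][^<>]*>")
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def digits_only(text: str) -> list[str]:
    """Numbers outside tags, reduced to digits so 1,500.50 and 1.500,50 compare equal."""
    plain = TAG.sub("", text)
    return sorted(re.sub(r"\D", "", n) for n in NUMBER.findall(plain))


def check_segment(source: str, target: str) -> list[str]:
    """Return a list of issue labels for one source/target pair."""
    if not target.strip():
        return ["untranslated"]
    issues = []
    if sorted(TAG.findall(source)) != sorted(TAG.findall(target)):
        issues.append("tag mismatch")
    if target != target.strip() and source == source.strip():
        issues.append("extra leading/trailing space")
    if digits_only(source) != digits_only(target):
        issues.append("number mismatch")
    return issues


print(check_segment("Print <b>2</b> copies.", "<b>2</b> Kopien drucken."))
print(check_segment("Load 3 sheets.", "Legen Sie 5 Blatt ein. "))
```

Comparing digits only keeps localized number formats from being flagged, but it also shows why false positives happen: a target that legitimately spells a number out as a word would be reported as a mismatch.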

Delivery to the client

If you remember, the client’s delivery requirement stated the return of an SDL Trados Studio Return Package (.sdlrpx). Since Heartsome Translation Studio doesn’t support any SDL Trados Studio Project packages (.sdlppx), you won’t be surprised to learn that neither does it support creating Return packages; you need the first to create the latter.

So, the best alternative solution we have is to prepare a set of files the client can use to update their project with our translations. This set will contain the following files:

  • Translated SDLXLIFFs
  • A Translation Memory export in TMX format
  • A Terminology Database export in TBX format

I always like to keep things neat and tidy, so we’ll create a suitable folder structure to hold the above files. Using File Explorer, go in our project’s folder then locate and enter the 3-To_Client folder. In here, create the following 3 folders: Bilingual_Files, Termbase_Export, and TM_Export. You should end up with a folder structure like the following:

\---3-To_Client
    +---Bilingual_Files
    +---Termbase_Export
    +---TM_Export

In Heartsome Translation Studio, go to the Project panel, expand the Target folder and right-click on the de-DE subfolder. Select “Copy” from the contextual menu. Switch to File Explorer, go to our project’s folder and specifically in 3-To_Client -> Bilingual_Files. Paste here the target files we’ve copied above (i.e., press Ctrl + V).

To create our Termbase export, in Heartsome Translation Studio select the menu option Databases -> Export TBX…. Then click on “Add” and select “File-based Termbase…”. In the Open dialog, locate the .hstb Termbase we created further above in the Analysis and pre-processing phase (you should find it in the Termbases folder), and click “Open”. Click on the “Browse” button which is located on the right of the “Save to” field and navigate to the project’s folder path 3-To_Client -> Termbase_Export. Enter a suitable name and click on the “Save” button. Afterwards, click on “Export”. A .tbx file with the name you entered will be created.

For the TM export, we’ll follow a very similar process to the Termbase export. In Heartsome Translation Studio, select the menu option Databases -> Export TMX…. Then click on “Add” and select “File-based TM…”. In the Open dialog, locate the .hstm TM we created further above in the Analysis and pre-processing phase (you should find it in the Tm folder), and click “Open”. Click on the dropdown box next to the field “Source Language” and select the option “en-US”. Click on the “Browse” button which is located on the right of the “Save to” field and navigate to the project’s folder path 3-To_Client -> TM_Export. Enter a suitable name and click on the “Save” button. Afterwards, click on “Export”. A .tmx file with the name you entered will be created.

For the final touch, we need to zip all this before sending to the client. In File Explorer move up to the top level of our project’s folder, right-click on 3-To_Client and select “Send to” -> “Compressed (zipped) folder”. A .zip file will appear in the project folder, and this is the file you should send to the client.
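The File Explorer steps above are Windows-specific. Since one of our requirements was running on all three major operating systems, here's a Python sketch that creates the delivery folders and zips them on any platform. The function names are my own, and the project path in the comment is a placeholder.

```python
import shutil
from pathlib import Path

DELIVERY_SUBFOLDERS = ("Bilingual_Files", "Termbase_Export", "TM_Export")


def prepare_delivery_folders(project_dir: Path) -> Path:
    """Create 3-To_Client and its three subfolders (no error if they exist)."""
    delivery = project_dir / "3-To_Client"
    for name in DELIVERY_SUBFOLDERS:
        (delivery / name).mkdir(parents=True, exist_ok=True)
    return delivery


def zip_delivery(project_dir: Path, folder_name: str = "3-To_Client") -> Path:
    """Create <project_dir>/<folder_name>.zip with the folder itself at its top level."""
    archive = shutil.make_archive(
        base_name=str(project_dir / folder_name),
        format="zip",
        root_dir=project_dir,
        base_dir=folder_name,
    )
    return Path(archive)


# Example (placeholder path):
# project = Path("Projects/20170628-Project-FooBar_ENU-DEU")
# prepare_delivery_folders(project)
# ... copy the SDLXLIFFs and save the TMX/TBX exports into the subfolders ...
# print(zip_delivery(project))
```

Run the folder step before exporting from Heartsome Translation Studio, and the zip step once all three subfolders hold their files.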

The Verdict

So, can a professional freelance translator do the job using only free tools? Before we answer that, let’s see whether the process we used covered all 5 objectives we had put forth in the beginning of this post.

Objective | Covered?
It must be free | Yes
It must be able to run on all three major Operating Systems | Yes
It must accept directly, or with minimum tinkering, bilingual files from major proprietary CAT Tools | Yes
The translation process must be as seamless as using a proprietary tool, or close enough | Yes
The resulting translated bilingual files must be identical to the ones a proprietary tool would produce, and adhere to the client’s delivery specifications | Yes*

That last objective has an asterisk for a good reason: although the bilingual files delivered to the client are identical to the original ones (in terms of format, type and structure), the client’s delivery specifications couldn’t be met completely (we couldn’t deliver a Trados Studio Return Package). But, keep in mind that the delivered files can be used by the client successfully, and with minimal fuss on their side.

Returning to our pressing question, I’d say yes, a professional freelance translator could do the job using only free tools. Granted, the procedures we outlined in this post are not for the faint of heart. However, if you’re a freelancer then you’re a bit of a tinkerer as well—it goes with the territory.
