Translating/hacking OmegaT to use two source texts: tmx and matches
Thread poster: Thijs Vissia
Thijs Vissia
Thijs Vissia
Netherlands
Feb 27, 2020

Another question: I have a text to translate for which I want to refer to both an English and a German source document to make a Dutch translation. The German is the original and the English is a translation of it, and since my German isn't excellent, I want to use both. If possible, I would like to see both inside OmegaT as I work.

I was wondering if there are any people who've worked out a not-too-elaborate way to do this?

What I've thought to do is enter the English and the German into a .tmx, and put this into the tm folder of a new translation project, with the German text as the source document.

The German - English tmx file doesn't show up in the Fuzzy matches panel, probably because I have now selected a different target language.

So if I change the target language to English, the German-English translation now shows up in Fuzzy Matches. But it also inserts them automatically into the target field, which I do not want to happen.

This shouldn't happen: according to the manual, even exact matches from a translation memory in the tm folder are not inserted automatically:

"Exact matches
These matches are automatically copied into the segments they are identical to without being user selected. This only happens if the match belongs to the project's translation memory (i.e. file project_save.tmx located in the /omegat/ directory). Any exact match found in other translation memory files located in /tm/ will not be automatically inserted, they must be deliberately inserted." (https://sourceforge.net/p/omegat/svn/261/tree/trunk/docs/en/Matches.html)

Is there another setting that is overruling this supposed behaviour?

And the other question: has anyone else found a good way to hack OmegaT into this kind of dual-source translation?

Any help appreciated!


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 20:37
Member (2006)
English to Afrikaans
+ ...
@Thijs Feb 28, 2020

Thijs Vissia wrote:
So if I change the target language to English, the German-English translation now shows up in Fuzzy Matches. But it also inserts them automatically into the target field, which I do not want to happen.


Option 1: In OmegaT, go Options > Preferences > Editor > deselect "Insert the best fuzzy match".

Option 2: Add a marker e.g. # in front of all source texts in the DE-EN TM, so that they're never 100% matches. In your plain text editor, when editing the TMX file, change "<seg>" to "<seg> # ".

Option 3: Check out these two additional TM options:
https://omegat.sourceforge.io/manual-standard/en/project.folder.html#d0e4596
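A minimal sketch of Option 2 in Python, for anyone who would rather not do the search-and-replace by hand. It is only an illustration: the function name `prefix_source_segments` is made up, and it assumes the TMX marks languages with an `xml:lang` (or older `lang`) attribute on each `<tuv>`. Unlike a blind replace of "<seg>", it puts the marker only in front of the source-language segments.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def prefix_source_segments(tmx_text, source_lang, marker="# "):
    """Return the TMX text with `marker` in front of every source segment.

    `source_lang` is compared against the primary language subtag only,
    so "de" also matches "de-DE" (an assumption of this sketch).
    Note: ElementTree drops the XML declaration and DOCTYPE on output.
    """
    root = ET.fromstring(tmx_text)
    for tuv in root.iter("tuv"):
        lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
        primary = lang.lower().replace("_", "-").split("-")[0]
        if primary == source_lang.lower():
            seg = tuv.find("seg")
            if seg is not None:
                # seg.text is the text before any inline tag, so the
                # marker always lands at the very start of the segment.
                seg.text = marker + (seg.text or "")
    return ET.tostring(root, encoding="unicode")
```

To use it, read the DE-EN TMX into a string, call `prefix_source_segments(text, "de")`, and write the result to a new file in the tm folder. Keep a backup of the original TMX.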


[Edited at 2020-02-28 09:04 GMT]


 
Thijs Vissia
Thijs Vissia
Netherlands
TOPIC STARTER
thanks Feb 28, 2020

Hi Samuel, thanks very much.

 
tcordonniery
tcordonniery
France
Local time: 20:37
For source in 2 languages Mar 2, 2020

Samuel Murray wrote:
Option 1: In OmegaT, go Options > Preferences > Editor > deselect "Insert the best fuzzy match".
Option 2: Add a marker e.g. # in front of all source texts in the DE-EN TM, so that they're never 100% matches. In your plain text editor, when editing the TMX file, change "<seg>" to "<seg> # ".
Option 3: Check out these two additional TM options
https://omegat.sourceforge.io/manual-standard/en/project.folder.html#d0e4596


Option 4: copy the TMX file containing EN and DE into the directory tm/tmx2source. If you imported the DE original into the source/ directory, then EN will also appear in the editor in a distinct color (only for the current segment). If you imported EN into the source/ directory, then DE will appear in a distinct color.
The only inconvenience is that it works only with 100% matches, that is, only if the source is exactly, tag for tag, identical to what OmegaT extracts from the source file.

[Edit] Sorry, I forgot an important detail: the file must be renamed en.tmx, de.tmx, en-gb.tmx or de-de.tmx, depending on the alternative language you want to display. The name must match the language code as present in the file itself (i.e. don't rename it de.tmx if the file contains de-DE).
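To find out which language codes a TMX actually contains before renaming it, something like the following Python sketch could help. It is only an illustration: `tmx_languages` is a made-up helper name, and it assumes the codes sit in an `xml:lang` (or older `lang`) attribute on each `<tuv>` element.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_languages(tmx_text):
    """Return the sorted list of language codes used in a TMX string."""
    root = ET.fromstring(tmx_text)
    codes = set()
    for tuv in root.iter("tuv"):
        code = tuv.get(XML_LANG) or tuv.get("lang")
        if code:
            codes.add(code)
    return sorted(codes)
```

If the list shows de-DE and en-GB, for example, the file name should be based on en-GB rather than plain en.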

[Edited at 2020-03-03 06:38 GMT]


 
Thijs Vissia
Thijs Vissia
Netherlands
TOPIC STARTER
Looks fantastic Mar 5, 2020

tcordonniery wrote:

Samuel Murray wrote:
Option 1: In OmegaT, go Options > Preferences > Editor > deselect "Insert the best fuzzy match".
Option 2: Add a marker e.g. # in front of all source texts in the DE-EN TM, so that they're never 100% matches. In your plain text editor, when editing the TMX file, change "<seg>" to "<seg> # ".
Option 3: Check out these two additional TM options
https://omegat.sourceforge.io/manual-standard/en/project.folder.html#d0e4596


Option 4: copy the TMX file containing EN and DE into the directory tm/tmx2source. If you imported the DE original into the source/ directory, then EN will also appear in the editor in a distinct color (only for the current segment). If you imported EN into the source/ directory, then DE will appear in a distinct color.
The only inconvenience is that it works only with 100% matches, that is, only if the source is exactly, tag for tag, identical to what OmegaT extracts from the source file.

[Edit] Sorry, I forgot an important detail: the file must be renamed en.tmx, de.tmx, en-gb.tmx or de-de.tmx, depending on the alternative language you want to display. The name must match the language code as present in the file itself (i.e. don't rename it de.tmx if the file contains de-DE).

[Edited at 2020-03-03 06:38 GMT]

Hi tcordonniery,

Many thanks for knowing and pointing this out. This seems excellent, a hidden feature to do exactly what I was looking for. However, I can't seem to get it to work.

I've got a file with the German source text (source/German_source.txt) in the source folder and the "reference translation" as a .tmx in English in the subfolder called tmx2source of the tm folder (so tm/tmx2source/en-GB.tmx). The project itself now has the language pair of the source document (de) and the target document (nl-NL). (Note this is different than the pair of German-UK English that I used in the project to create the en-GB.tmx file. Next time I'll probably use the Align Files feature to create this .tmx file.)

The contents of the English part of the "tmx2source/en-GB.tmx" file do not show up, just the segments from the source document. Nothing shows in the Fuzzy Matches panel either.

Is this a feature recently added? FWIW I'm running OmegaT 5.1.0.

Is there anything else that might be the cause?

Edit: I also found this old thread by Samuel Murray - https://www.proz.com/forum/omegat_support/238939-translating_from_a_secondary_source_language_in_omegat.html?print=1

I think my test setup above is identical to the one Samuel describes?


[Edited at 2020-03-05 14:21 GMT]


 
Thijs Vissia
Thijs Vissia
Netherlands
TOPIC STARTER
hmm... Mar 21, 2020

Any word on this? I'd really like to get this to work, but my attempts so far haven't been successful.

Many thanks.


 
Ruben Angel
Ruben Angel
South Korea
Local time: 03:37
English to Spanish
+ ...
Solution Apr 27, 2020

Thijs Vissia wrote:

Any word on this? I'd really like to get this to work, but my attempts so far haven't been successful.

Many thanks.


I got it to work in OmegaT 5.2.0 by naming the reference tm "EN_US.tmx" instead of "en-US.tmx"


 
Thijs Vissia
Thijs Vissia
Netherlands
TOPIC STARTER
thanks Apr 27, 2020

Ruben Angel wrote:

I got it to work in OmegaT 5.2.0 by naming the reference tm "EN_US.tmx" instead of "en-US.tmx"

Thanks very much, will give that a try.


 
Thijs Vissia
Thijs Vissia
Netherlands
TOPIC STARTER
how to Apr 27, 2020

Ruben Angel wrote:

I got it to work in OmegaT 5.2.0 by naming the reference tm "EN_US.tmx" instead of "en-US.tmx"


Yes, this is working. But it's not the hyphen vs. underscore that makes the difference; it's the capital letters. Many thanks.


So here's a brief how-to for using a third-language reference translation within OmegaT:

- You want to use a third language text (in language B) as reference during the translation of a source text in language A to language C, and be able to see it in OmegaT's regular editing window.

- Make a project #1 to produce a .tmx file containing both the source text (in language A) and the reference translation (in language B). This is a bit tedious: you need to align/copy-paste both texts sentence by sentence. (Another route is using OmegaT to create a segmented version of each text, save each as an individual .tmx, and use OmegaT's Aligner to align and join the two. This saves some copy-pasting but has its own tedium.)

- When done and after saving, take the project_save.tmx file from the #1 project folder and put it in a newly created folder called "tmx2source". Rename the .tmx file itself according to the language (B) of the reference translation, USING ALL CAPS: EN.tmx, EN_GB.tmx, IT.tmx, NL-NL.tmx, etc., depending on what language B is.
NOTE: I noticed that the language's regional variety doesn't matter: it'll work no matter what random characters you use in the second part, and it doesn't matter whether you use a hyphen or an underscore. The first two characters, however, need to refer to a valid language and be in uppercase letters.

- The tmx2source folder then needs to be moved to your second project, where you will make the actual translation, i.e. from source language A to target language C.

- Create project #2. Copy the tmx2source folder into this project, and put it into the tm folder that sits in the project's root.

- After including the source-to-reference .tmx in project #2, re-open or reload that project. OmegaT should then display the reference segment under the source segment, allowing you to see both while translating the segment.
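The copy-and-rename part of the steps above can be sketched in Python. This is only an illustration under stated assumptions: `install_reference_tmx` is a made-up helper name, the paths are whatever your own projects use, and the upper-casing of the file name reflects what worked in the OmegaT 5.1.0/5.2.0 tests in this thread, not documented behaviour.

```python
import shutil
from pathlib import Path


def install_reference_tmx(tmx_file, project_dir, lang_code):
    """Copy a TMX into <project_dir>/tm/tmx2source/<LANG_CODE>.tmx.

    tmx_file    -- e.g. the project_save.tmx produced by project #1
    project_dir -- root folder of project #2 (language A to language C)
    lang_code   -- code of the reference language B, e.g. "en" or "en_gb"
    """
    target_dir = Path(project_dir) / "tm" / "tmx2source"
    target_dir.mkdir(parents=True, exist_ok=True)
    # Upper-case the name, as found necessary in this thread.
    target = target_dir / (lang_code.upper() + ".tmx")
    shutil.copyfile(tmx_file, target)
    return target
```

After running it, re-open or reload project #2 in OmegaT so the new file in tm/tmx2source is picked up.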


 
Milan Condak
Milan Condak  Identity Verified
Local time: 20:37
English to Czech
I updated my presentation on eTranslation Apr 28, 2020

Thijs Vissia wrote:
The first two characters, however, need to refer to a valid language and be in uppercase letters.


http://www.condak.cz/nove/2020-04/26/cs/08.html

http://www.condak.cz/nove/2020-04/26/cs/00.html

eTranslation - První kroky / First Steps

24 jazykových párů v 24 TMX jedním rázem

01 První zkušenosti českého uživatele
02 24 jazykových párů v 24 TMX jedním rázem
03 Automatizovaný překlad v OmegaT
04 JANOS, THE STORY OF A DOCTOR
05 János, Ein Arzt Erzählt Sein Leben
06 Přidání TMX do projektu
07 Portuguese language magazine
08 The State of the Linguist Supply Chain (tmx2source)

Machine translation:

24 language pairs in 24 TMX files in one go

01 First experiences of a Czech user
02 24 language pairs in 24 TMX files in one go
03 Automated translation in OmegaT
04 JANOS, THE STORY OF A DOCTOR
05 János, Ein Arzt Erzählt Sein Leben
06 Adding a TMX to the project
07 Portuguese language magazine
08 The State of the Linguist Supply Chain (tmx2source)

Milan


 
Milan Condak
Milan Condak  Identity Verified
Local time: 20:37
English to Czech
OmegaT with tmx2source Jun 8, 2020

Again eTranslation as support for Interlingua in OmegaT with tmx2source

http://www.condak.cz/nove/2020-05/02/cs/00.html

eTranslation, Interlingua and tmx2source

eTranslation - PPTX to TMX
--
A PDF is a better source file for creating TMXs than a PPTX file.

Milan


 

