Strategies for migrating Trados TMs
Thread poster: Adam Łobatiuk
Adam Łobatiuk
Poland
Member (2009)
English to Polish
Jan 4, 2012

I know that you can import Trados TMs in TXT and TMX formats. However, Trados is really messy when it comes to tags in TM segments. Some turn out ok in DVX 2, e.g.

Click {1}Start{2} and select {3}All programs{4}...

but sometimes I get:

Click {1}< b >{2}Start{3}< /b >{4} and select {5}< b >{6}All programs{7}< /b >{8}...

This seems to be a Trados problem, because such tags are marked with different code in the TM export file. I've tried various options, and the best one so far has been to import TMX to and export from SDLX. The file ends up with doubled tags, but without the < b > code, and DVX 2 handles that well. I was wondering if anyone has a better (quicker) procedure. I know you can delete code in Olifant, but I don't want to get rid of all tags.
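For anyone who would rather script the cleanup than round-trip through SDLX, here is a minimal Python sketch. It assumes the segments are available as plain text with DVX-style `{n}` codes (for example from an export), and the function name `collapse_literal_tags` is hypothetical. It only illustrates the idea of folding each "code, literal tag text, code" run into a single code; it is not something DVX or Trados provides.

```python
import re

# A {n} code, then one or more literal tags such as "< b >" or "</cf>",
# then the next {n} code.
WRAPPED_TAG = re.compile(r"\{\d+\}(?:\s*<\s*/?\s*\w+[^<>{}]*>)+\s*\{\d+\}")
PLACEHOLDER = re.compile(r"\{\d+\}")


def collapse_literal_tags(segment: str) -> str:
    """Fold each {n}<tag>{n+1} run into one code, then renumber all codes."""
    collapsed = WRAPPED_TAG.sub("{0}", segment)
    counter = 0

    def renumber(_match: re.Match) -> str:
        nonlocal counter
        counter += 1
        return "{%d}" % counter

    return PLACEHOLDER.sub(renumber, collapsed)


messy = ("Click {1}< b >{2}Start{3}< /b >{4} and select "
         "{5}< b >{6}All programs{7}< /b >{8}...")
print(collapse_literal_tags(messy))
# Click {1}Start{2} and select {3}All programs{4}...
```

Source and target would both have to go through the same function so that the code numbers still line up afterwards.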


 
Grzegorz Gryc
French to Polish
tw4winInternal style problem... Jan 5, 2012

Adam Łobatiuk wrote:

I know that you can import Trados TMs in TXT and TMX formats. However, Trados is really messy when it comes to tags in TM segments. Some turn out ok in DVX 2, e.g.

Click {1}Start{2} and select {3}All programs{4}...

but sometimes I get:

Click {1}< b >{2}Start{3}< /b >{4} and select {5}< b >{6}All programs{7}< /b >{8}...

This seems to be a Trados problem, because such tags are marked with different code in the TM export file. I've tried various options, and the best one so far has been to import TMX to and export from SDLX. The file ends up with doubled tags, but without the < b > code, and DVX 2 handles that well. I was wondering if anyone has a better (quicker) procedure. I know you can delete code in Olifant, but I don't want to get rid of all tags.


Well done, interesting

Generally, probably the best results for legacy Trados to DVX conversion can be obtained using the Trados 2.x-6.x txt intermediary format.

Your problem is probably due to some incompetent Trados enabled bastard who didn't respect the tw4winInternal style.
It's a very frequent error in Trados bilingual tagged RTF/DOC files.
It looks correct in Word but the red text is only formatted as red and not marked as tw4winInternal as it should be.
I.e. the target is formatted incorrectly and DVX is perfectly right here; it assumes that nobody botched the styles in between.
You can use a double set of TMs, i.e. one with tags and a second one without tags (processed with Olifant); this may improve the matching results.
You can also merge 'em, then delete the duplicates.
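The "double set, then merge and deduplicate" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the TM entries are taken to be plain (source, target) pairs with DVX-style `{n}` codes, the helper names `strip_placeholders` and `merge_tms` are hypothetical, and the sample sentences are made up. In practice the tag stripping would be done in Olifant as described above.

```python
import re
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (source segment, target segment)

PLACEHOLDER = re.compile(r"\{\d+\}")


def strip_placeholders(text: str) -> str:
    """Drop {n} codes and tidy up any double spaces they leave behind."""
    return re.sub(r"\s{2,}", " ", PLACEHOLDER.sub("", text)).strip()


def merge_tms(tagged: Iterable[Pair], untagged: Iterable[Pair]) -> List[Pair]:
    """Concatenate both TMs and keep the first copy of each exact pair."""
    seen = set()
    merged = []
    for pair in list(tagged) + list(untagged):
        if pair not in seen:
            seen.add(pair)
            merged.append(pair)
    return merged


# Made-up sample entries, for illustration only.
tagged_tm = [
    ("Click {1}Start{2}.", "Kliknij {1}Start{2}."),
    ("Close the window.", "Zamknij okno."),
]
untagged_tm = [(strip_placeholders(s), strip_placeholders(t)) for s, t in tagged_tm]
merged_tm = merge_tms(tagged_tm, untagged_tm)
```

Here `merged_tm` ends up with three entries: the entry without codes appears in both sets and is kept only once.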

As another workaround, I would try to process the Trados .txt in Word using RegEx (convert RTF red marks to tw4winInternal).
Many years ago I purchased a tool called Quintillian from TerminologyMatters (out of business now), which would probably make this easier. I don't remember exactly; it was so long ago that I don't even remember where I put the serial...
I should have a 30 day trial somewhere, just ask me.
AFAIR it works in Word 2003 or earlier.

Cheers
GG

[Edited at 2012-01-05 02:55 GMT]


 
Adam Łobatiuk
Poland
Member (2009)
English to Polish
TOPIC STARTER
@GG Jan 5, 2012

Thank you very much, Grzegorz. This sounds plausible, although, on the other hand, the content where the redundant tags appear sounds like it comes from Help files, which, in this particular case, are HTML and done in Tag Editor. Still, what you wrote about does happen—I just hoped it was more limited than it looks now.

 
Grzegorz Gryc
French to Polish
RTF workflow for HTML help... Jan 5, 2012

Adam Łobatiuk wrote:

Thank you very much, Grzegorz. This sounds plausible, although, on the other hand, the content where the redundant tags appear sounds like it comes from Help files, which, in this particular case, are HTML and done in Tag Editor. Still, what you wrote about does happen—I just hoped it was more limited than it looks now.


No, I think it's a tagged RTF.
I translated several help projects like that, gazillions of small RTF files in hundreds of folders (the global project view in DVX saved me a lot of time...).
The HTML codes were marked as tw4winInternal or tw4winExternal.
The tw4winInternal style is interpreted correctly by DVX, i.e. the HTML tag content is not shown, while text that is merely formatted red is interpreted as a DVX tag (red formatting start), text (HTML tag content) and another DVX tag (red formatting end), exactly as in your example.
The tags from the TTX workflow are imported correctly as tags.

Can you paste here a segment from the Trados txt?

BTW, one of the functions of Quintillian was to create a tagged bilingual Word file which was easy to edit, unlike the Trados legacy txt TM.
The problem was the TM size, Word was unable to convert large files.

Cheers
GG


 
Adam Łobatiuk
Poland
Member (2009)
English to Polish
TOPIC STARTER
@GG Jan 5, 2012

Thanks a lot again, Grzegorz. This requires some investigation, which I don't have time for right now, but I will come back to this topic in a few days. As far as I remember, I have always translated help content in those projects in Tag Editor, but those strings could not be mine. I'll let you know.

 
Adam Łobatiuk
Poland
Member (2009)
English to Polish
TOPIC STARTER
My experiment Jan 9, 2012

I came across the same problem again while translating for my direct client - the TM is mine, and I know how to handle tags in Trados. So, I decided to run an experiment.

I created a doc file with a sample sentence containing a bolded and a red word as well as a link. I translated it in Tag Editor, and then again in Word. I exported the TM as TMX, and imported it into a new DVX TM. It turns out, the redundant codes are caused by TTX files (possibly not all, but those converted from Doc files).

The sentence coming from the TTX file looks like this in DVX (I replaced the < > brackets with [ ] so they display on the forum):

This is a {1}[cf bold="on"]{2}sentence{3}[/cf]{4} with {5}[cf fontcolour="0xff"]{6}formatting{7}[/cf]{8} and a {9}[field][csf style="Hiperłącze" fontcolour="0xff0000" underlinestyle="single"]{10}link{11}[/csf][/field]{12}.

The sentence coming from the Doc file looks like this in DVX:

This is a {1}sentence{2} with {3}formatting{4} and a {5}.


 
Grzegorz Gryc
French to Polish
Mixed TTX-RTF workflows for Trados... Word versions... etc. Jan 9, 2012

Adam Łobatiuk wrote:

I came across the same problem again while translating for my direct client - the TM is mine, and I know how to handle tags in Trados. So, I decided to make an experiment.

I created a doc file with a sample sentence containing a bolded and a red word as well as a link.

I translated it in Tag Editor, and then again in Word.

Did you retranslate the file in Word manually, or did you only pretranslate it without taking a look at it?
It makes a difference.

Trados has two incompatible tagging modes i.e. RTF and TTX which give different matches etc.
If you simply pretranslated the RTF/DOC file "as is" (with no manual intervention), the TTX content will be inserted incorrectly "as is" (with the tag content formatted as tw4winInternal) if you use Word 2010 (Word 2007 probably behaves the same way).
In Word 2003 and earlier, these Workbench suggestions are processed more sensibly in the pure RTF mode, i.e. the tag content (and its formatting) is stripped.
It's a flagrant Trados problem.

DVX will handle these files well for the pretranslated "pure" RTF workflow (Trados with Word 2003); you only need to insert the missing codes (just as you would have to reapply the formatting in the Trados-Word combo).

For the "mixed" RTF-XML workflow (Trados with Word 2010) DVX will insert codes twice in the target (well, it's a quirk...) but you'll be able to easily leverage the imported Trados TM from TTX files (DVX is far better than Trados here...) or delete 'em if no match is found.

I exported the TM as TMX, and imported it into a new DVX TM. It turns out, the redundant codes are caused by TTX files (possibly not all, but those converted from Doc files).

The sentence coming from the TTX file looks like this in DVX (I replaced the < > brackets with [ ] so they display on the forum):

This is a {1}[cf bold="on"]{2}sentence{3}[/cf]{4} with {5}[cf fontcolour="0xff"]{6}formatting{7}[/cf]{8} and a {9}[field][csf style="Hiperłącze" fontcolour="0xff0000" underlinestyle="single"]{10}link{11}[/csf][/field]{12}.


Are you sure it's really a TTX?
It seems it's a doc file and you are using the wrong filter.
I.e. it seems you are using the Word one while the "Trados Workbench" one is needed.
If you use the Word filter, all these codes are indeed treated as simple text; DVX doesn't care about the Trados styles handling and imports a lot of crap.

What's your DVX2 build?

The sentence coming from the Doc file looks like this in DVX:

This is a {1}sentence{2} with {3}formatting{4} and a {5}.


Yep.
BTW, you see here one of the DVX quirks related to the Trados bilingual Word workflow.
I.e. the field (placeable) is replaced by a code which represents it.
In most cases it's OK but it's a pain in the neck for more complex fields.
The only sound Trados-DVX cat hopping scenario for files with multiple complex fields is the TTX workflow.

Cheers
GG

[Edited at 2012-01-09 23:58 GMT]


 
Adam Łobatiuk
Poland
Member (2009)
English to Polish
TOPIC STARTER
More details... Jan 10, 2012

Hi Grzegorz (and everyone),

My setup is DVX2 8.0.519
SDL Trados 2007 Suite Freelance
Windows 7
Office 2010 (previously 2003, which might matter with regard to older TM entries, but in my latest experiment I only used Office 2010).

1. I created a doc file and typed a sentence that used bold and colour formatting and a hyperlink. There were no tw4wininternal or external styles or any tags. Just a regular, but formatted sentence.
2. I opened the doc file in Tag Editor (by dragging and dropping) and translated it there using an empty TM. I didn't clean the file up, because I just wanted the segment in the TM.
3. I opened the doc file again, this time in Word. I opened the segment with the same TM. The match was quite low and with no formatting, so I copied the source and overwrote it manually. I closed the segment and updated the TM.
4. I exported the Trados TM (with segments from both files) as TMX 1.4 and imported it into an empty DVX TM file. I opened the TM in DVX2 and noticed the difference between the segments. There is also an obvious difference already in the TMX file, so I don't think it's DVX2's fault.
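The difference in the TMX file mentioned in step 4 can be made visible with a short Python sketch that lists which TMX inline elements each segment uses. The element names (`bpt`, `ept`, `ph`, `it`, `ut` and so on) come from the TMX 1.4 format itself; which of them Trados writes for TTX-based versus Word-based entries is an assumption here, and the sample below is made up for illustration rather than copied from a real export. The function name `inline_profile` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree spells the xml:lang attribute with the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def inline_profile(tmx_text: str):
    """Return one (language, plain text, inline-element counts) row per <tuv>."""
    rows = []
    root = ET.fromstring(tmx_text)
    for tuv in root.iter("tuv"):
        seg = tuv.find("seg")
        if seg is None:
            continue
        # Text that sits directly in <seg>, outside any inline element.
        plain = (seg.text or "") + "".join(child.tail or "" for child in seg)
        counts = Counter(el.tag for el in seg.iter() if el is not seg)
        rows.append((tuv.get(XML_LANG), plain, dict(counts)))
    return rows


# Made-up sample: the <ut> versus <bpt>/<ept> split is an assumption.
SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="EN-US"><seg>This is a <ut>&lt;cf bold="on"&gt;</ut>sentence<ut>&lt;/cf&gt;</ut>.</seg></tuv></tu>
<tu><tuv xml:lang="EN-US"><seg>This is a <bpt i="1">&lt;b&gt;</bpt>sentence<ept i="1">&lt;/b&gt;</ept>.</seg></tuv></tu>
</body></tmx>"""

for row in inline_profile(SAMPLE):
    print(row)
```

For a real export, `ET.parse(path).getroot()` would take the place of `ET.fromstring`. Segments that share the same plain text but show different element counts are the ones worth comparing by hand.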

I did not translate or open the file in DVX2, so the choice of filters is irrelevant. The DVX2 TM window clearly shows post-DOC-TTX tags as plain text between gray DVX codes like {1}, which I already saw while translating my two different projects.

The TM workflow I described in my initial post (via SDLX) seems to get rid of plain text tags. I end up with redundant {1}{2} tags in DVX2, but they seem to be handled well by DVX2. I had just hoped there was a quicker way than 4 export/import operations.

By the way, yes, I noticed that the hyperlink text was unavailable for translation in the second segment imported from the TMX file.

I could probably post some screenshots if necessary, but the experiment seems simple enough to reproduce. It would be interesting to see the results other people get.


 
Yngve Roennike
Swedish to English
What is the meaning of these tag abbreviations? Jun 15, 2012

Was searching the Help files in SDL Studio to get a simple answer to a simple question, I would think, but it becomes very nebulous quickly, because there is no direct link to an explanation.

I haven't the foggiest clue, at least not at the time of writing this, what br/, cf/, csf/ etc. stand for.
Would there be a listing somewhere that would show their expansion? I might have known in my earlier days what cf means, but not any more.
It would be nice to know and increase your control over them.
I got here by googling.

Thanks very much in advance.

[Edited at 2012-06-15 14:31 GMT]


 

