JRC-Acquis, how to convert to tmx
Thread poster: Magdalena Kowalska
Magdalena Kowalska
United Kingdom
Polish to English
Dec 13, 2015

Hi,

I've downloaded the already aligned files from https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis. Now I need to convert those XML files to anything memoQ can process, such as CSV, if not directly to TMX.

How do I go about it? I've tried every online XML-to-CSV converter I could find, but the files are too big for them to handle.

Has anyone succeeded in using the JRC texts with their CAT tool?
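
When a file is too large for online converters, a local streaming parser is one option, because it reads the XML element by element instead of loading it whole. Below is a minimal Python sketch of that idea. It assumes each aligned pair is a `link` element with `s1` and `s2` children; those tag names are an assumption about the downloaded files and should be checked against the actual XML. The function name `aligned_xml_to_csv` is made up for illustration.

```python
import csv
import xml.etree.ElementTree as ET


def _text(parent, tag):
    """Return the whitespace-normalised text of a child element, or ''."""
    child = parent.find(tag)
    if child is None:
        return ""
    return " ".join("".join(child.itertext()).split())


def aligned_xml_to_csv(xml_path, csv_path,
                       pair_tag="link", src_tag="s1", tgt_tag="s2"):
    """Stream aligned pairs into a two-column CSV and return the row count.

    The default tag names are assumptions, not a documented format.
    """
    rows = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        # iterparse hands over each element once its closing tag has been
        # read, so the file is processed incrementally.
        for _event, elem in ET.iterparse(xml_path, events=("end",)):
            if elem.tag != pair_tag:
                continue
            src = _text(elem, src_tag)
            tgt = _text(elem, tgt_tag)
            # Skip pairs where either side is missing or empty.
            if src and tgt:
                writer.writerow([src, tgt])
                rows += 1
            # Drop the text of the processed pair to keep memory use low.
            elem.clear()
    return rows
```

Calling `aligned_xml_to_csv("jrc-en-pl.xml", "jrc-en-pl.csv")` (file names are placeholders) would write one source/target pair per row, without a header line.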


 
Michael Beijer
United Kingdom
Member (2009)
Dutch to English
two tips Dec 13, 2015

Magdalena Kowalska wrote:

Hi,

I've downloaded the already aligned files from https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis. Now I need to convert those XML files to anything memoQ can process, such as CSV, if not directly to TMX.

How do I go about it? I've tried every online XML-to-CSV converter I could find, but the files are too big for them to handle.

Has anyone succeeded in using the JRC texts with their CAT tool?


I suggest getting Andras Farkas’s collection. For a small fee, he will supply you with the ultimate EU collection of TMXs, or in any other format you might want: http://www.farkastranslations.com/eu_translation_memories.php

The best place to start if you want to get the DGT/JRC stuff directly from the EU is here:

https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory

[Edited at 2015-12-13 17:36 GMT]

[Edited at 2015-12-13 17:37 GMT]


 
Emma Goldsmith
Spain
Member (2004)
Spanish to English
And a third tip Dec 13, 2015

Dominique Pivard posted a useful video on the DGT TM here:

https://www.youtube.com/watch?v=GNj07W2ZqhQ


 
Blaž Košir
Belgium
English to Slovenian
Try here Dec 13, 2015

Try here: http://www.ttmem.com/terminology/download-translation-memory/european-commission-translation-memory/

 
Magdalena Kowalska
United Kingdom
Polish to English
TOPIC STARTER
Thanks Dec 16, 2015

I actually did that a few years ago: downloading, aligning with that tool, etc. I just wasn't sure whether it was still the same TM. It is worth adding the 2015 additions though, which I'm doing right now.

 
Milan Condak
English to Czech
Extract and split Dec 16, 2015

Magdalena Kowalska wrote:

How do I go about it? I've tried all online xml-csv converters I could find, but the files are too big for them to work.


Ready-made TMX files are available in the DGT multilingual Translation Memory.

Since November 2007 the European Commission's Directorate-General for Translation has made its multilingual Translation Memory (DGT-TM) publicly available:

https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory

How to produce bilingual extractions

The multilingual extraction has English as the source language. Users can extract any language pair as follows, using the extraction tool TMXtract:
For the Windows Operating System:
Download the TMXtract.jar file;

After extraction I use Heartsome TMX Editor for merging and splitting of TMXs.

http://www.condak.cz/nove/2015-12/08/cs/04.html
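
For readers who prefer a script to TMXtract, the same kind of bilingual extraction can be sketched in Python. This is not how TMXtract works internally; it is only an illustration that streams a multilingual TMX and keeps the translation units containing both requested languages. The function name `extract_pair` is hypothetical, and the language codes passed in must match the `xml:lang` values actually used in your files (codes such as `EN-GB` or `PL-01` are an assumption here, so check a file first).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_pair(tmx_in, tmx_out, src_lang, tgt_lang):
    """Copy TUs that contain both languages into a bilingual TMX.

    Returns the number of translation units written.
    """
    src_key, tgt_key = src_lang.lower(), tgt_lang.lower()
    count = 0
    with open(tmx_out, "w", encoding="utf-8") as out:
        out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        out.write('<tmx version="1.4">\n')
        out.write('<header creationtool="extract_pair" '
                  'creationtoolversion="0.1" segtype="sentence" '
                  'o-tmf="tmx" adminlang="en" '
                  f'srclang={quoteattr(src_lang)} datatype="plaintext"/>\n')
        out.write("<body>\n")
        # Process one <tu> at a time so large files are read incrementally.
        for _event, elem in ET.iterparse(tmx_in, events=("end",)):
            if elem.tag != "tu":
                continue
            segs = {}
            for tuv in elem.findall("tuv"):
                lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
                seg = tuv.find("seg")
                if seg is not None:
                    # itertext() keeps the text but drops inline tag markup.
                    segs[lang] = "".join(seg.itertext()).strip()
            if segs.get(src_key) and segs.get(tgt_key):
                out.write(
                    f"<tu><tuv xml:lang={quoteattr(src_lang)}>"
                    f"<seg>{escape(segs[src_key])}</seg></tuv>"
                    f"<tuv xml:lang={quoteattr(tgt_lang)}>"
                    f"<seg>{escape(segs[tgt_key])}</seg></tuv></tu>\n"
                )
                count += 1
            elem.clear()
        out.write("</body>\n</tmx>\n")
    return count
```

A call such as `extract_pair("multilingual.tmx", "en-pl.tmx", "EN-GB", "PL-01")` (placeholder file names) would produce a bilingual file that can then be merged or split in a TMX editor as described above.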

Another solution: use a CAT tool with a TM server. Felix-cat is now open source and includes a server.

Milan


 
CafeTran Training (X)
Netherlands
DGT-Translation Memory: different generations Jun 7, 2017

Different generations of the DGT-Translation Memory can be downloaded (2007, 2011, etc.). Do these generations overlap, i.e. contain identical TUs?

 

