What is the significance or utility of "Export unknown segments" feature
Thread poster: Rajan Chopra
Rajan Chopra
India
Local time: 03:18
Member (2008)
English to Hindi
Aug 14, 2020

Hello experts,

I have Trados Studio 2015. I would be grateful if someone could explain to me the purpose, significance or utility of the "Export unknown segments" feature.

I have a file that contains approximately:

13000 new words
7000 fuzzy matches (75-94%)
2400 fuzzy matches (95-100%)
2300 words (pre-translated)
21800 internal repetitions

When I use this feature, even with maximum match values as different as 71% and 96%, it exports almost the same number of words (about 22400), and I can see that it exports nearly the same segments. Does it export only 100% identical segments? If so, what is the use of entering a match value?

I am looking for software or an application that extracts the segments on the basis of which the other segments can be translated. For example, please look at these segments:

1. This is a good pen but it is very expensive and rare.
2. This is a good pen but it is very expensive.
3. This is a good pen but it is very cheap.
4. This is a very good pen but it is expensive.
5. This is a good pen but it is not expensive.
6. This is a good pen and it is very expensive but not rare.

I am looking for software that keeps segments like 1 and removes the others from the file, because the other, similar segments can be translated once we have a TM entry for segments like 1.

Is this possible at all? Does such software exist?

Thanks and regards,

Chopra




[Edited at 2020-08-14 19:14 GMT]


 
Samuel Murray
Netherlands
Local time: 22:48
Member (2006)
English to Afrikaans
@Chopra Aug 14, 2020

chopra_2002 wrote:
I am looking for software that keeps segments like 1 and removes the others from the file, because the other, similar segments can be translated once we have a TM entry for segments like 1.


You have asked similar questions for a number of years. I agree that such a tool or feature would be very useful, but I know of no such tool.

As an experiment, I can tell you that doing the following gets me interesting (useful) results in OmegaT: create a source=target TM containing the same segments as the source file, then run through all the segments in OmegaT, and grab the *lowest* fuzzy match. OmegaT shows at most 5 fuzzy matches, and if I use your sample sentences and always grab the lowest fuzzy match, I can reduce your six sentences to two sentences:

This is a very good pen but it is expensive. (4x)
This is a good pen but it is very cheap. (2x)

The problem is, of course, that there is no script (yet) in OmegaT that can extract the lowest fuzzy matches, so any such extraction would have to be done using an AutoIt script that moves to the next sentence and clicks inside the fuzzy match pane and copies the matches and extracts the lowest match.

With the rain-in-spain list from this old thread I can reduce the 7 segments down to just 1.

However, I don't see this going faster than 10-15 segments per minute, depending on the speed of the Windows clipboard.
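
The lowest-fuzzy-match experiment described above can also be approximated outside OmegaT. The following is only a minimal sketch under stated assumptions: it uses Python's standard `difflib.SequenceMatcher` on word tokens as a stand-in similarity score, which is not OmegaT's own fuzzy matching, so the counts it produces will not necessarily match the ones quoted above. The names `lowest_fuzzy_matches`, `max_matches` and `min_score` are hypothetical; `max_matches=5` mirrors the five matches mentioned above, and `min_score` is an assumed cutoff.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def tokens(segment):
    """Lowercased word tokens, ignoring punctuation."""
    return re.findall(r"\w+", segment.lower())


def similarity(a, b):
    """Word-level similarity in [0, 1]; a stand-in, not OmegaT's score."""
    return SequenceMatcher(None, tokens(a), tokens(b)).ratio()


def lowest_fuzzy_matches(segments, max_matches=5, min_score=0.5):
    """For each segment, list its best matches in a source=target 'TM'
    built from the same segments, then vote for the lowest-scoring one
    among the top `max_matches` that still reaches `min_score`."""
    votes = Counter()
    for seg in segments:
        scored = sorted(
            ((similarity(seg, other), other) for other in segments),
            key=lambda pair: pair[0],
            reverse=True,
        )
        shown = [pair for pair in scored[:max_matches] if pair[0] >= min_score]
        if shown:  # the segment matches itself at 1.0, so this is normally true
            votes[shown[-1][1]] += 1
    return votes


SAMPLE = [
    "This is a good pen but it is very expensive and rare.",
    "This is a good pen but it is very expensive.",
    "This is a good pen but it is very cheap.",
    "This is a very good pen but it is expensive.",
    "This is a good pen but it is not expensive.",
    "This is a good pen and it is very expensive but not rare.",
]

for segment, count in lowest_fuzzy_matches(SAMPLE).most_common():
    print(f"{segment} ({count}x)")
```

The segments with the most votes are the candidates to keep. Because every segment is compared with every other one, the running time grows quadratically with the number of segments, which may matter for a file of this size.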


[Edited at 2020-08-15 08:48 GMT]


 
Rajan Chopra
India
Local time: 03:18
Member (2008)
English to Hindi
TOPIC STARTER
Thanks so much for your detailed and informative reply Aug 16, 2020

Samuel Murray wrote:
You have asked similar questions for a number of years. I agree that such a tool or feature would be very useful, but I know of no such tool.



Hi Samuel,

I am indebted to you for your detailed reply. I am not a very technical man. Trados is a very sophisticated and top CAT tool. I think that Trados should consider introducing this feature in its forthcoming versions.

Maybe some other knowledgeable person can suggest something in this respect. I look forward to hearing from others.

Thanks and regards,

Chopra


 
Samuel Murray
Netherlands
Local time: 22:48
Member (2006)
English to Afrikaans
@Chopra Aug 16, 2020

chopra_2002 wrote:
I am not a very technical man. Trados is a very sophisticated and top CAT tool. I think that Trados should consider introducing this feature in its forthcoming versions.


I agree, and I think other CAT tools should do this too.

Over the past weekend I experimented with some AutoIt scripts to flag similar sentences in a list of sentences, and my conclusion is that it's really not difficult to program this feature. The difficulty is in finding a good fuzzy matching system, but all CAT tools already have fuzzy matching systems, so the only thing preventing them from implementing this is willpower.

(I, on the other hand, am forced to use primitive fuzzy matching methods during my experiments, so my results show great promise without actually being promising.)
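
To illustrate how little code the flagging step itself needs, here is a minimal sketch in Python rather than AutoIt. It is not the script referred to in this thread: the similarity score is a primitive word-level ratio from the standard `difflib` module, and the names `trim_internal_fuzzies` and `threshold` are hypothetical. A real CAT tool's fuzzy matcher would score segments differently.

```python
import re
from difflib import SequenceMatcher


def similarity(a, b):
    """Primitive word-level similarity in [0, 1] (punctuation ignored)."""
    words_a = re.findall(r"\w+", a.lower())
    words_b = re.findall(r"\w+", b.lower())
    return SequenceMatcher(None, words_a, words_b).ratio()


def trim_internal_fuzzies(segments, threshold=0.7):
    """Greedy pass: keep a segment only if no already-kept segment is
    at least `threshold` similar; otherwise flag it with its match."""
    kept = []
    flagged = {}  # flagged segment -> kept segment it resembles
    for seg in segments:
        match = next((k for k in kept if similarity(seg, k) >= threshold), None)
        if match is None:
            kept.append(seg)
        else:
            flagged[seg] = match
    return kept, flagged


SAMPLE = [
    "This is a good pen but it is very expensive and rare.",
    "This is a good pen but it is very expensive.",
    "This is a good pen but it is very cheap.",
    "This is a very good pen but it is expensive.",
    "This is a good pen but it is not expensive.",
    "This is a good pen and it is very expensive but not rare.",
]

kept, flagged = trim_internal_fuzzies(SAMPLE, threshold=0.7)
print("Keep:", kept)
print("Flagged as internal fuzzies:", len(flagged))
```

With the six sample sentences from this thread and a threshold of 0.7, this sketch keeps only sentence 1 and flags the other five. The result depends on segment order and on the threshold, so both would need tuning against a real fuzzy matcher before the trimmed file could be trusted.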

See also:
https://www.proz.com/forum/cat_tools_technical_help/345716-trim_internal_fuzzies_autoit_script.html


[Edited at 2020-08-16 11:10 GMT]

