Which is the best software for extraction of term candidates from Word documents and PDFs?
Thread poster: Fredrik Pettersson
I have a license for TshwaneLex but haven't used it yet, and I'm not sure whether I can use it for term extraction from documents. SDL MultiTerm Extract seems to work best with German only. What I'm looking for is easy-to-use term extraction software that can extract terms from Word documents and PDF files based on criteria I set (such as the number of occurrences in the text). Or maybe Toolbox from SIL could work: http://www.linguistics.ucsb.edu/faculty/infield/courses/resources/Lex_H2.pdf

Other alternatives would be Terminotix or SynchroTerm. I also found a posting here at ProZ, but it seemed to be more about terminology management: http://www.proz.com/forum/cat_tools_technical_help/83657-best_terminology_management_termbase_software.html

Michael Beijer | United Kingdom | Member (2009) | Dutch to English | SynchroTerm + TerMine | Sep 2, 2015
In my experience, the two best are: (1) the free, online TerMine (http://www.nactem.ac.uk/software/termine/), and (2) the paid, desktop SynchroTerm. SynchroTerm is probably the best term extractor currently on the market, bar none. It also has all kinds of cool tricks, such as the ability to indicate which file (from a group of files) a term derives from, include a snippet of context, etc. It's what I use for large, corporate term extraction and glossary creation jobs.

Michael

PS: TAAS also has a good term extractor: https://term.tilde.com/projects
[Edited at 2015-09-02 19:39 GMT]

Fredrik Pettersson | Hong Kong | Member (2009) | English to Swedish | TOPIC STARTER | Can SynchroTerm extract only source terms from monolingual documents? | Sep 2, 2015
Thanks, SynchroTerm seems to be the best alternative. I looked at the fact sheet for SynchroTerm and watched a video, but I can't see whether it's possible to extract only source terms and leave the target term translations empty, so that I can fill in the translations before and during the translation phase.

Michael Beijer | United Kingdom | Member (2009) | Dutch to English
Fredrik Pettersson wrote:
Thanks, SynchroTerm seems to be the best alternative. I looked in the fact sheet for SynchroTerm now and watched a video, but I can't see if it's possible to extract only source terms and leave the target term translations empty. So I can fill in the translations before and during the translation phase.

Not entirely sure what you mean. Assuming you have a monolingual document and want to extract terms, you open the doc(s) in SynchroTerm and extract terms. You can then save this list of terms in various formats (Excel, tabbed txt, html, etc.) and add translations later if desired. I made a quick screencast to show how it works (no sound due to sleeping baby in next room): https://www.youtube.com/watch?v=9zmHGhZvyb4

PS: SynchroTerm can perform both monolingual extraction and bilingual extraction.
[Edited at 2015-09-02 23:42 GMT]
Luca Tutino | Italy | Member (2002) | English to Italian
Michael Beijer wrote:
I made a quick screencast to show how it works (no sound due to sleeping baby in next room):

Great stuff (the first minute of the track too) - and thanks!

Michael Beijer | United Kingdom | Member (2009) | Dutch to English
Luca Tutino wrote:
Michael Beijer wrote:
I made a quick screencast to show how it works (no sound due to sleeping baby in next room):
Great stuff (the first minute of the track too) - and thanks!

Ha ha, thanks Luca! The soundtrack's actually a very old song of mine ("Casio-Beat-2-(extended-version)-Michael-Beijer.mp3"), made on a little plastic Casio keyboard.
Fredrik Pettersson wrote:
[is it] possible to extract only source terms and leave the target term translations empty. So I can fill in the translations before and during the translation phase.

Most CAT tools can do that, but for more advanced purposes, I use AntConc. It can do a lot of things, including extracting terms (while excluding stopwords or words already in your termbase) and phrases (n-grams). AntConc is free and cross-platform.

Cheers,
Hans
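For anyone who wants to experiment before buying a tool, here is a minimal Python sketch of the kind of frequency-based candidate extraction discussed in this thread: count n-grams, skip stopwords and terms already in a termbase, and keep only candidates above a minimum number of occurrences. It is not how SynchroTerm, TerMine or AntConc work internally. The function name `extract_term_candidates`, its parameters and the tiny stopword list are all illustrative assumptions, and the text is assumed to have already been extracted from the Word or PDF files into a plain string.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; use a fuller,
# language-specific list in practice).
STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "is", "for", "on", "with"}


def extract_term_candidates(text, max_n=3, min_count=2,
                            stopwords=STOPWORDS, known_terms=()):
    """Return (candidate, count) pairs, most frequent first.

    Candidates are word n-grams (1..max_n words) that do not start or end
    with a stopword, are not already in known_terms, and occur at least
    min_count times. N-grams are counted across sentence boundaries,
    which is a simplification.
    """
    # Lowercase and keep runs of letters (hyphenated words stay together).
    tokens = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())
    known = {term.lower() for term in known_terms}

    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            # Skip n-grams that begin or end with a stopword.
            if gram[0] in stopwords or gram[-1] in stopwords:
                continue
            candidate = " ".join(gram)
            # Skip entries that are already in the termbase.
            if candidate in known:
                continue
            counts[candidate] += 1

    kept = [(cand, cnt) for cand, cnt in counts.items() if cnt >= min_count]
    # Sort by descending count, then alphabetically for stable output.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = ("The term base stores terms. A term base is useful. "
              "Term extraction fills the term base.")
    for candidate, count in extract_term_candidates(sample):
        print(f"{count}\t{candidate}")
```

The resulting list has source terms only, so it can be saved as tab-separated text with an empty target column and the translations filled in later, as described above.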