Split multi-sentence TMX segments into single sentence segments Thread poster: Samuel Murray
| Samuel Murray Netherlands Local time: 00:33 Member (2006) English to Afrikaans + ...
Hello everyone, I have a TM from a client that was segmented by paragraph, so many segments contain more than one sentence. I would like to split these segments up so that, if the source and target fields have the same number of sentences, they are split into separate segments of one sentence each. Does anyone know of a tool that can do this? I use Windows 7. Unfortunately my CAT tool does not do this automatically, as some others do. Thanks, Samuel | | | Michael Beijer United Kingdom Local time: 23:33 Member (2009) Dutch to English + ...
You could try importing your TMX into a CAT tool that can edit TMXs (e.g. memoQ or CafeTran) and seeing what happens (the paragraphs might get segmented into individual sentences), or fiddling with the segmentation rules. Michael | | | Minh Nguyen Vietnam Local time: 05:33 English to Vietnamese | Re-segment manually | Jun 27, 2016 |
Hi, Re-segmenting something that has paragraph segmentation should be done manually. You may have to split sentences or reorder translations. You can use Stingray, http://www.maxprograms.com/products/stingray.html, for adjusting segmentation in a TMX file. Regards, Rodolfo | |
Samuel Murray Netherlands Local time: 00:33 Member (2006) English to Afrikaans + ... TOPIC STARTER Or: extract only multi-sentence segments | Jun 27, 2016 |
Thanks everyone for your ideas. Something that would help me a great deal is a tool that can extract all segments with more than one sentence in them. Yes, I can convert the TM to a two-column format, convert that into two separate files, and then use an aligner to recreate the TM (which I would have to check manually), but my TM is huge and not all segments are multi-sentence segments. | | | Michael Beijer United Kingdom Local time: 23:33 Member (2009) Dutch to English + ... | esperantisto Local time: 01:33 Member (2006) English to Russian + ... SITE LOCALIZER
May I remind you that OmegaT converts paragraph-segmented TMX files to sentence-segmented on-the-fly?
[Edited at 2016-06-27 15:01 GMT] | | | Samuel Murray Netherlands Local time: 00:33 Member (2006) English to Afrikaans + ... TOPIC STARTER Yes, some CAT tools do this on the fly | Jun 27, 2016 |
esperantisto wrote: May I remind you that OmegaT converts paragraph-segmented TMX files to sentence-segmented on-the-fly? Yes, some CAT tools do this on the fly, but I need the TMX file itself to become subsegmented, because I'm not using a CAT tool that can do that. | |
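For readers who would rather script this than use a CAT tool, here is a minimal Python sketch of the idea from the first post: split a unit only when every language variant has the same number of sentences. It is an illustration under stated assumptions, not a tested tool. The sentence rule (., ! or ? followed by whitespace) is deliberately naive and will be fooled by abbreviations, units with inline tags are left untouched, and the function names `split_sentences`, `split_unit` and `resegment_tmx` are made up for this example. Note also that the standard-library parser does not write the DOCTYPE line back out.

```python
import copy
import re
import xml.etree.ElementTree as ET

# Naive boundary (an assumption): ., ! or ? followed by whitespace.
BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentences using the naive BOUNDARY rule."""
    return [part for part in BOUNDARY.split(text.strip()) if part]


def split_unit(tu):
    """Return one <tu> per sentence, or [tu] unchanged if a split is not safe."""
    segs = [tuv.find("seg") for tuv in tu.findall("tuv")]
    # Leave the unit alone if a <seg> is missing or contains inline tags.
    if len(segs) < 2 or any(s is None or len(s) > 0 for s in segs):
        return [tu]
    parts = [split_sentences(s.text or "") for s in segs]
    count = len(parts[0])
    # Split only when all variants have the same sentence count (2 or more).
    if count < 2 or any(len(p) != count for p in parts):
        return [tu]
    units = []
    for i in range(count):
        clone = copy.deepcopy(tu)  # keeps attributes, <prop> and <note> children
        for tuv, sentences in zip(clone.findall("tuv"), parts):
            tuv.find("seg").text = sentences[i]
        units.append(clone)
    return units


def resegment_tmx(src_path, dst_path):
    """Write a copy of the TMX at src_path with splittable units split."""
    tree = ET.parse(src_path)
    body = tree.getroot().find("body")
    old_children = list(body)
    for child in old_children:
        body.remove(child)
    for child in old_children:
        if child.tag == "tu":
            body.extend(split_unit(child))
        else:
            body.append(child)
    tree.write(dst_path, encoding="utf-8", xml_declaration=True)
```

It could be called as `resegment_tmx("client.tmx", "client_sentences.tmx")` (both file names are placeholders). Units that are left unsplit because the counts differ would still need manual checking, and matching counts do not guarantee that the sentences are in the same order.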
CafeTran Training (X) Netherlands Local time: 00:33 Filter via regular expression | Jun 27, 2016 |
Samuel Murray wrote: Something that would help me a great deal is a tool that can extract all segments with more than one sentence in them. You can use a regular expression to filter on segments that contain multiple sentences (CafeTran can handle TMX files like projects). On the filtered segments you can then Split and Merge left and right (source and target). It is of course also possible to write these filtered segments to a new file (and delete them from the current one), and to duplicate or triplicate them. Then, via Find and Replace and regular expressions, you can remove all second (third etc.) sentences from every segment (left and right). Repeat this in another copy of the file for all first and fourth (fifth etc.) sentences. The same technique can be used to remove the first sentence of a paragraph. Of course this doesn't take abbreviations ("abbrev.") into account; you'll have to enhance the regular expression for that, if that's possible at all.
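The exact expression used in CafeTran is not shown in this thread, so the following is only a rough Python sketch of the same filtering idea, with one possible way of handling abbreviations. The boundary rule (sentence-final punctuation, whitespace, then an uppercase letter or digit), the `ABBREVIATIONS` list and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical abbreviation list; extend it for your own languages.
ABBREVIATIONS = ("abbrev.", "e.g.", "i.e.", "etc.", "Mr.", "Dr.")

# Assumed boundary: ., ! or ? then whitespace, then an uppercase letter or digit.
BOUNDARY = re.compile(r"[.!?]\s+(?=[A-Z0-9])")


def sentence_breaks(text):
    """Return the offsets just past each punctuation mark that ends a sentence."""
    breaks = []
    for match in BOUNDARY.finditer(text):
        end = match.start() + 1  # offset just past the punctuation mark
        if text[:end].endswith(ABBREVIATIONS):
            continue  # looks like an abbreviation, not a sentence end
        breaks.append(end)
    return breaks


def is_multi_sentence(text):
    """True if the text appears to contain more than one sentence."""
    return bool(sentence_breaks(text))


def nth_sentence(text, n):
    """Return sentence n (0-based), or None if there is no such sentence."""
    cuts = [0] + sentence_breaks(text) + [len(text)]
    if n < 0 or n >= len(cuts) - 1:
        return None
    return text[cuts[n]:cuts[n + 1]].strip()
```

`is_multi_sentence` corresponds to the filter step, and `nth_sentence` to keeping one sentence per copy of the file. The abbreviation check cuts both ways: a sentence that really does end in a listed abbreviation will not be detected as a boundary, so the result still needs a manual check.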
[Edited at 2016-06-27 17:11 GMT]