Which is the best software you can buy to convert PDF into Excel?
Thread poster: Rita Utt
Rita Utt
Rita Utt  Identity Verified
France
Local time: 20:33
English to German
+ ...
Jan 10, 2019

I need this for translating annual reports, so I'm not interested in free online tools.
Is there an efficient tool to extract all those tables into Excel (without having to check all the numbers)?
I do have a very old ABBYY FineReader, but it's no good for tables.
Any better ideas, or a newer version you could recommend?


 
Bernd Albrecht
Bernd Albrecht
Switzerland
English to German
I don't know if it's really "the best PDF converting software" money can buy Jan 10, 2019

Have you tried this one?

https://acrobat.adobe.com/us/en/acrobat/export-pdf-online-pricing.html


 
Dan Lucas
Dan Lucas  Identity Verified
United Kingdom
Local time: 19:33
Member (2014)
Japanese to English
Do you mean dead PDFs? Jan 10, 2019

Rita Utt wrote:
I do have a very old ABBYY FineReader, but it's no good for tables.

Rita, do you mean PDF files that are not machine-readable? In that case it's down to OCR, and I have found v14 of FineReader reasonably good at that. However, I have found most annual reports are machine-readable these days.

Regards,
Dan


 
Rita Utt
Rita Utt  Identity Verified
France
Local time: 20:33
English to German
+ ...
TOPIC STARTER
Machine-readable? Jan 11, 2019

Hello,
thank you for your answer,
but sorry, I'm not very good with PDF technology.
Does "machine-readable" mean PDF files I could write in if I had the appropriate software (and which software would that be)?


 
Rita Utt
Rita Utt  Identity Verified
France
Local time: 20:33
English to German
+ ...
TOPIC STARTER
Hello Bernd Jan 11, 2019

I would like to use an online tool, but I'm worried about confidentiality.
Or do you think I shouldn't be?


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 20:33
Member (2006)
English to Afrikaans
+ ...
@Rita: no, none Jan 11, 2019

Rita Utt wrote:
Is there an efficient tool to extract all those tables into Excel (without having to check all the numbers)?


No, you're always going to have to check the numbers.

Even if the PDF is editable, i.e. you can copy the text and paste it into a text editor, you can't predict in which sequence the PDF creator added the numbers. Programs that output PDF do not necessarily write the table cells in the same sequence as in Excel (often there doesn't seem to be any logic to it). In addition, as you may know, copy/pasting from a table sometimes converts the tabs into spaces, but sometimes converts them into nothing, so the numbers end up all squished together. So even if you can copy/paste the numbers, you'd still have to check them.
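
To illustrate the ordering problem, here is a minimal Python sketch. It assumes the text fragments have already been extracted from the PDF together with their page coordinates. The `Fragment` tuple, the sample values and the tolerance `y_tol` are hypothetical and do not reflect the output of any particular tool. The sketch groups fragments into rows by vertical position and sorts each row from left to right, regardless of the order in which the fragments were stored.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    x: float   # horizontal position on the page
    y: float   # vertical position (assumed here: larger means further down)
    text: str


def rebuild_rows(fragments, y_tol=2.0):
    """Group fragments into rows by y position, then order each row by x."""
    rows = []
    for frag in sorted(fragments, key=lambda f: f.y):
        # Same row if the fragment is within y_tol of the row's first fragment.
        if rows and abs(frag.y - rows[-1][0].y) <= y_tol:
            rows[-1].append(frag)
        else:
            rows.append([frag])
    return [[f.text for f in sorted(row, key=lambda f: f.x)] for row in rows]


# Fragments in the arbitrary order a PDF might store them (invented sample data).
stored = [
    Fragment(300, 120.4, "1,250"),
    Fragment(50, 100.0, "Revenue"),
    Fragment(300, 100.3, "4,800"),
    Fragment(50, 120.0, "Costs"),
    Fragment(200, 100.1, "4,500"),
    Fragment(200, 119.8, "1,100"),
]

table = rebuild_rows(stored)
for row in table:
    print("\t".join(row))
```

Even then the result still needs proofreading: the tolerance is a guess, and merged or multi-line cells would defeat this simple grouping.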

There was a time when I translated annual and financial reports, and for that I always used Acrobat 5.0.5 to copy the text. Acrobat 5.0.5 has a block copy feature called "column select", which was replaced in modern versions of Acrobat with something called "graphics select", which can't copy blocks of text. However, most modern PDF files are "PDF 1.6" files, and Acrobat 5.0.5 can only open PDF 1.4. The only way I know of to convert a PDF 1.6 file to a PDF 1.4 file is by using an online converter. Also, not all columns in PDF files like to be selected, so sometimes you just can't win.

Added: I just discovered that my current PDF program (PDF-XChange) can copy columns if you hold down the Alt key while dragging with the mouse.

[Edited at 2019-01-11 09:41 GMT]


 
Heinrich Pesch
Heinrich Pesch  Identity Verified
Finland
Local time: 21:33
Member (2003)
Finnish to German
+ ...
Error prone process Jan 11, 2019

I would ask for the original file format, Excel or whatever, and not try to convert from PDF. It might take ages to get the stuff formatted. Another possibility is for the customer to do the conversion themselves: you translate the result and the customer takes over the formatting. If the result is not good enough, it's not your problem.

 
Dan Lucas
Dan Lucas  Identity Verified
United Kingdom
Local time: 19:33
Member (2014)
Japanese to English
Can be read by CAT or other tools Jan 11, 2019

Rita Utt wrote:
Does "machine-readable" mean PDF files I could write in if I had the appropriate software (and which software would that be)?

For many, if not most, recently generated PDF files, the text they contain can be read by software such as SDL Studio, memoQ, etc. So you can often import the file, translate the text, and then create a Word file that is similar to the original. If you use a CAT tool, try it with an annual report created within the past 12 months.

Also, if you have a client who is inquiring about this, why not ask them whether they have a recommended workflow?

Regards,
Dan


 
John Fossey
John Fossey  Identity Verified
Canada
Local time: 15:33
Member (2008)
French to English
+ ...
Get the original file Jan 11, 2019

I have several clients that come to me each year for their annual report. For these I have asked for the original file, since no document was originally created as a PDF file. Some were created in Word, others in InDesign. The tables are often embedded Excel files. All these formats can be translated directly in a CAT tool, eliminating the need for conversion.

[Edited at 2019-01-11 14:21 GMT]


 
Drew MacFadyen
Drew MacFadyen  Identity Verified
United States
Local time: 15:33
Spanish to English
+ ...
ProZ.com Plus members receive 10 free credits at TransPDF a month Jan 11, 2019

ProZ.com Plus members have access to 10 free credits per month at https://www.iceni.com/transpdf.htm. Simply log in using your ProZ.com profile. Non-members can purchase credits at a discount. That said, I have no information about TransPDF's ability to maintain the formatting of data in Excel tables.

Regards,

Drew MacFadyen


 
Rita Utt
Rita Utt  Identity Verified
France
Local time: 20:33
English to German
+ ...
TOPIC STARTER
You are right Jan 11, 2019

I'll ask for the original files (before conversion to PDF); that's much safer.
Have a nice year, all of you, and thank you very much for your help.
Rita


 
Luca Gentili
Luca Gentili  Identity Verified
Belgium
Local time: 20:33
English to Italian
+ ...
Just to share knowledge Mar 21, 2019

Non-machine-readable PDFs (i.e. scanned images saved in a PDF file) can be converted into text by Microsoft OneNote. The procedure is simple: copy and paste (or insert) the image or PDF file into a OneNote page, right-click it, and select the option "Copy Text from Picture".


HIH


Luca


[edited] I have not tried it on Excel-like tables yet.

[Edited at 2019-03-21 12:43 GMT]


 
Elif Baykara Narbay
Elif Baykara Narbay  Identity Verified
Türkiye
Local time: 22:33
German to Turkish
+ ...
That was helpful! Mar 22, 2019

Luca Gentili wrote:

Non-machine-readable PDFs (i.e. scanned images saved in a PDF file) can be converted into text by Microsoft OneNote. The procedure is simple: copy and paste (or insert) the image or PDF file into a OneNote page, right-click it, and select the option "Copy Text from Picture".


HIH


Luca


[edited] I have not tried it on Excel-like tables yet.

[Edited at 2019-03-21 12:43 GMT]


I just tried this on a nasty Word file with tables carelessly pasted in as pictures (screenshots from a non-editable PDF).

I have already rejected that job and I would do so in the future as well. But I am impressed with this workaround.

The "text" thus obtained still needs a lot of editing but it may come in handy, who knows.

Elif


 
Rolf Keller
Rolf Keller
Germany
Local time: 20:33
English to German
Converters aren't perfect Mar 25, 2019

Samuel Murray wrote:

Programs that outputs PDF do not write the table cells in necessarily the same sequence as in Excel

The other way round isn't problem-free either. While writing a from-Excel converter I've just discovered that Excel's cells sometimes aren't in the expected sequence. If the x-th cell in a row is empty, Excel shifts the (x+1)-th cell backwards to position x, so that the current row may include fewer cells than other rows. Of course one can overcome this, but only IF one knows about it.
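
One possible way to overcome this is sketched below in Python. It assumes that each stored cell comes with its Excel-style reference (such as "C2"), as is typically the case in .xlsx files, where empty cells may simply be left out. The function names and the sample row are hypothetical. The idea is to place every value at the column named in its reference instead of trusting its position in the stored row.

```python
import re


def column_index(cell_ref):
    """Convert the column letters of a reference like 'C2' to a 0-based index."""
    letters = re.match(r"[A-Z]+", cell_ref).group()
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def pad_row(cells, width):
    """Place (reference, value) pairs at their true column; fill gaps with ''."""
    row = [""] * width
    for ref, value in cells:
        row[column_index(ref)] = value
    return row


# A row as stored when cell B2 is empty: the entry for B2 is simply missing.
stored_row = [("A2", "Costs"), ("C2", "1,250")]
padded = pad_row(stored_row, width=3)
print(padded)
```

With the gap restored, every row has the same number of cells again, and the columns line up across rows.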


 

