
How to: convert PDF to Excel using ScraperWiki

Data is sometimes published in PDF files, but it is much easier to analyse and manipulate in Excel.

There are many online converters, but ScraperWiki is fast and easy: it takes fewer than five steps to get an Excel file ready to be scrutinised.

Step 1. Get the data

As an example, I am going to use a PDF file from the United Nations on the number of military and police contributions. It can be downloaded here.


Step 2. Upload the file to ScraperWiki – PDFTables

Go here (PDFTables) and select the document on your computer that you want to convert.

Step 3. Preview the document and download it as Excel

Once it is converted, the preview will show that the text and numbers from the PDF have been placed into rows and columns.

The converted file can be downloaded as an Excel file (single sheet or multiple sheets), CSV or XML.
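
If you need to convert many files, the upload and the choice of format can also be scripted. The sketch below is not part of the walkthrough above. It assumes that the PDFTables web API accepts a POST request to https://pdftables.com/api with `key` and `format` query parameters and the PDF in a form field named `f`, and that the third-party `requests` package is installed. The function names, the format names and the API key are assumptions for illustration, so check the current PDFTables documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint and format names; verify them against the PDFTables docs.
API_ENDPOINT = "https://pdftables.com/api"
FORMATS = ("xlsx-single", "xlsx-multiple", "csv", "xml")


def build_request_url(api_key, output_format="xlsx-single"):
    """Return the request URL for a conversion in the given output format."""
    if output_format not in FORMATS:
        raise ValueError(f"unsupported format: {output_format!r}")
    query = urlencode({"key": api_key, "format": output_format})
    return f"{API_ENDPOINT}?{query}"


def convert_pdf(pdf_path, out_path, api_key, output_format="xlsx-single"):
    """Upload a PDF and save the converted file (needs network access)."""
    # Third-party package, imported here so build_request_url works without it.
    import requests

    with open(pdf_path, "rb") as pdf:
        response = requests.post(
            build_request_url(api_key, output_format),
            files={"f": pdf},  # assumed form field name
            timeout=120,
        )
    response.raise_for_status()
    with open(out_path, "wb") as out:
        out.write(response.content)
```

With a valid key, a call such as `convert_pdf("report.pdf", "report.xlsx", "YOUR_API_KEY")` would do the same job as the upload and download buttons; the file names here are placeholders.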

Step 4. Open the spreadsheet in Excel

Now the data is clean and ready to be explored.
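
Before trusting a converted table, it can be worth a quick check. Here is a minimal sketch, assuming you chose the CSV download in Step 3: it reads the rows with Python's standard `csv` module and lists any row whose number of cells differs from the most common one, which may point to cells that were split or merged during conversion. The function name and the sample rows are invented for illustration and are not UN figures.

```python
import csv
import io
from collections import Counter


def find_irregular_rows(csv_text):
    """Return (expected_width, [(row_number, row), ...]) for odd-width rows."""
    # Keep the original row numbers, but ignore completely empty rows.
    rows = [
        (number, row)
        for number, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1)
        if any(cell.strip() for cell in row)
    ]
    if not rows:
        return 0, []
    # The most common cell count is treated as the expected table width.
    expected = Counter(len(row) for _, row in rows).most_common(1)[0][0]
    irregular = [(number, row) for number, row in rows if len(row) != expected]
    return expected, irregular


# Invented sample: the third row is missing a cell.
sample = "Country,Police,Troops\nA,10,200\nB,5\nC,7,90\n"
width, odd_rows = find_irregular_rows(sample)
print(width, odd_rows)
```

For a real export, read the file first, for example with `open("export.csv", newline="", encoding="utf-8").read()`, and pass the text to the function; the file name is a placeholder.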

Do you have any other examples? Let me know in the comments.
