


An Overview to Extract Specific Data from PDF to Excel Using VBA (Step-by-Step Analysis)

So, without further delay, let’s go to our main discussion today. Here we’ve got a PDF file called standardnormaltable.pdf that contains a table of the normal distribution. And we’ve opened a worksheet called Sheet1 in an Excel workbook, where we’ll copy the data from the PDF file.

Now I’ll show how you can copy data from the PDF file to the Excel worksheet through a step-by-step analysis.

⧪ Step 1: Declaring the Necessary Inputs

First of all, you have to declare the necessary inputs. These include the worksheet name, the range of the cells, the location of the application through which the PDF file will be opened (Adobe Reader in this example), and the location of the PDF file.

    Set MyWorksheet = ActiveWorkbook.Worksheets("Sheet1")
    Application_Path = "C:\Program Files\Adobe\Acrobat DC\Acrobat\Acrobat.exe"
    PDF_Path = "E:\ExcelDemy\standardnormaltable"

⧪ Step 2: Opening the PDF File (by Using the VBA Shell Command)

Next, we have to call the VBA Shell function to open the PDF file.

Get the method

    import tabula
    import pandas as pd

    # read_pdf() returns a list of DataFrames, so pick the first table before saving
    df = tabula.read_pdf('file_path/file.pdf', pages = 'all')
    df[0].to_excel('file_path/file.xlsx')

We load the library in our text editor:

    import tabula

Then, we read the PDF with the read_pdf() function of the tabula library. This function automatically detects the tables in a PDF and converts them into DataFrames, which is ideal for converting them to Excel files afterwards.

PDF containing several tables

    df = tabula.read_pdf('file_path/file.pdf', pages = 'all')

Here, the variable df is in fact a list of DataFrames. The first element corresponds to the first table, the second to the second table, etc. To save these tables separately, you have to use a for loop that saves each table in its own Excel file.

    for i in range(len(df)):
        ...
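Since read_pdf() returns a list of DataFrames, saving each table separately comes down to looping over that list and writing one Excel file per element. The sketch below is one possible way to do it, not the article's exact code: the helper names table_output_path and save_tables_separately, and the `<stem>_<index>.xlsx` naming scheme, are assumptions made for illustration. It also assumes tabula-py (with a Java runtime) and an Excel writer such as openpyxl are installed.

```python
from pathlib import Path


def table_output_path(out_dir, stem, index):
    """Build the Excel path for table number `index` (0-based).

    The "<stem>_<index>.xlsx" naming scheme is an assumption of this sketch.
    """
    return Path(out_dir) / f"{stem}_{index}.xlsx"


def save_tables_separately(pdf_path, out_dir):
    """Read every table in `pdf_path` and write each one to its own Excel file."""
    # Imported lazily so the path helper above works without tabula installed.
    import tabula  # provided by the tabula-py package; needs a Java runtime

    # With pages="all", read_pdf returns a list with one DataFrame per detected table.
    tables = tabula.read_pdf(pdf_path, pages="all")

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_path).stem

    written = []
    for i, table in enumerate(tables):
        target = table_output_path(out_dir, stem, i)
        # index=False leaves the DataFrame's row index out of the spreadsheet.
        table.to_excel(target, index=False)
        written.append(target)
    return written
```

Calling save_tables_separately('file_path/file.pdf', 'file_path') would then write file_0.xlsx, file_1.xlsx, and so on, one per detected table. Using enumerate instead of range(len(df)) is only a stylistic choice; both visit the tables in the same order.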
