The scope of this study is limited to the implementation of a conversion tool (P2X), developed to automatically convert large batches of PDF tabular data (PDF tables) to spreadsheet format (MS Excel). We begin by introducing the PDF specification standards on table structure, followed by an example scenario of the problem and a description of the P2X architecture. Specific details of the algorithms and applications used during the conversion from PDF to plain text format (PTF) follow. We then give a brief overview of the reformatting process and a formalization of the table tags that we identified using regular expressions. Lastly, the User Interface section describes the GUI, its images, and its functionality.
Code. import java.io.*; import java.awt.*; import javax.swing.*; import java.lang.String; import java.lang.Object; import java.util. ...
Program Author: LaToyia DeVonne Penny
Date of Completion: June 17, 2008
Program Description: This program is an ... their functions, and the CDTMOD1 Visual Basic application to parse the files based on specific characters in the HTML source code of each document.
|Title||:||Design & Implementation of a PDF to Excel Conversion Tool (P2X).|
|Publisher||:||ProQuest - 2008|