Text Processing Methods, Continued (PDF to HTML Conversion)

I am continuing my evaluation of some text processing tools that I began in an earlier open notebook post on the same topic. I also had the idea of opening my PDF documents in Word, then re-saving them as HTML. That workflow might standardize the formatting to something less …

Text Processing Methods for Data Extraction (PDF to HTML conversion)

Tried for Mac: VeryPDF “PDF to Any Converter”. Did not like it. PDF to HTML was not good. PDF to Excel was OK, but one complaint is that the documents are placed into a new folder. Might be useful: http://sourceforge.net/projects/pdftohtml/. From the first freeware/trialware software I tried, I’m definitely dog-earing …
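
For reference, here is a rough sketch of how the pdftohtml command-line tool mentioned above could be scripted from Python. This is only an illustration under assumptions: it assumes a `pdftohtml` binary is installed and on the PATH, that it accepts the `-noframes` and `-i` options (the poppler-utils build does), and the function names and paths are hypothetical ones I made up for the example.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftohtml_command(pdf_path, out_dir):
    """Build the argument list for one pdftohtml run (nothing is executed here)."""
    pdf = Path(pdf_path)
    # Output base name without extension; pdftohtml adds ".html" itself.
    out_base = Path(out_dir) / pdf.stem
    # -noframes: write one HTML file instead of a frameset
    # -i: ignore images, since only the text matters for data extraction
    return ["pdftohtml", "-noframes", "-i", str(pdf), str(out_base)]


def convert(pdf_path, out_dir):
    """Run pdftohtml on a single PDF (assumes the binary is on the PATH)."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_pdftohtml_command(pdf_path, out_dir), check=True)
    # Expected output location, assuming -noframes produces <base>.html
    return Path(out_dir) / (Path(pdf_path).stem + ".html")
```

Something like `convert("docs/report.pdf", "html_out")` would then be the whole conversion step for one file, and a loop over a folder of PDFs would cover a batch. Keeping the command construction separate from the call makes it easy to check the arguments without having the tool installed.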