8 Apr
Does anyone know where I can get a script (either Perl or PHP) that will parse .doc or .rtf files? I’ve tried using RTF::Parser and RTF::TEXT::Converter, but they just return a bunch of blank lines. Win32::OLE would work nicely, but it relies on the underlying win32 architecture to run, i.e. it needs to run on a Windows box.
Basically a client has a site that we maintain, and the content that changes regularly is provided in a standard Word document, which would make it very simple to automate the updates if I could just figure out how to parse the document.
4 Responses for "Word or RTF parser"
Why not just have them save it as a text file instead? This is easy enough to do from inside Word for most users. And I assume you know how to parse a plain text file…
Yeah, I tried that. Unfortunately, saving as plain text removes all the formatting, including the data delimiters (table cells, etc.), which makes it impossible to separate the data from the labels.
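Since the table cells are what matter here, one option might be to have the client save as RTF rather than plain text and read the cell and row markers directly. RTF ends each table cell with `\cell` and each row with `\row`, so the delimiters survive. Below is a rough sketch of that idea, written in Python rather than the Perl or PHP asked for, just to show the approach. The function name `rtf_table_rows` and the sample document are made up for illustration, and the sketch makes simplifying assumptions: it skips only a few header groups, ignores Unicode escapes and nested tables, drops any text outside tables, and does not handle binary .doc files at all.

```python
import re

# One token per match: control word (optional numeric parameter and one
# trailing space), hex-escaped byte, control symbol, brace, or plain text.
TOKEN = re.compile(
    r"\\([a-z]+)(-?\d+)? ?"    # control word, e.g. \cell or \cellx3000
    r"|\\'([0-9a-fA-F]{2})"    # hex-escaped byte, e.g. \'e9
    r"|\\([^a-z'])"            # control symbol, e.g. \{ or \\ or \*
    r"|([{}])"                 # group delimiters
    r"|([^\\{}]+)"             # plain text
)

# Header groups whose contents are not document text (not an exhaustive list).
SKIP = {"fonttbl", "colortbl", "stylesheet", "info"}


def rtf_table_rows(rtf):
    """Return table contents as a list of rows, each a list of cell strings."""
    rows, row, cell = [], [], []
    stack = []        # saved "ignoring" flag for each open group
    ignoring = False  # True while inside a skipped group
    for match in TOKEN.finditer(rtf):
        word, _param, hexbyte, symbol, brace, text = match.groups()
        if brace == "{":
            stack.append(ignoring)
        elif brace == "}":
            ignoring = stack.pop() if stack else False
        elif symbol == "*" or word in SKIP:
            ignoring = True   # skip the rest of this group
        elif ignoring:
            continue
        elif word == "trowd":
            cell = []         # new row definition: drop text seen outside the table
        elif word == "cell":
            row.append("".join(cell).strip())
            cell = []
        elif word == "row":
            rows.append(row)
            row = []
        elif hexbyte:
            # Assumes the Windows-1252 code page, which \ansi documents commonly use.
            cell.append(bytes.fromhex(hexbyte).decode("cp1252", errors="replace"))
        elif symbol in ("\\", "{", "}"):
            cell.append(symbol)  # escaped literal character
        elif text:
            cell.append(text.replace("\r", "").replace("\n", ""))
    return rows


# Hypothetical two-row table with a label column and a data column.
sample = (
    r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}"
    r"Intro text\par "
    r"\trowd\cellx3000\cellx6000 "
    r"\intbl Name\cell Alice\cell\row "
    r"\trowd\cellx3000\cellx6000 "
    r"\intbl Price\cell 12.50\cell\row }"
)

for label, value in rtf_table_rows(sample):
    print(label, "=", value)
```

If the real documents keep labels in the first column and data in the second, each row comes back as a label/value pair, which is what the automated update would need. Files saved by Word carry far more control words than this sample, so it would be worth testing against an actual file from the client before relying on it.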