[personal profile] gdt

PC-Write is a 1980s word processor, one of the core programs of the "shareware" era. The file format is undocumented but readily reverse engineered.

Resources for writing LibreOffice filters

Writing a filter for LibreOffice looks complex, but it turns out that writing a filter directly against LibreOffice isn't encouraged. Rather, you write the filter against the librevenge library, which LibreOffice then uses. Other packages can also use librevenge, including some command-line tools that convert to text, HTML, XHTML, ePub, etc.

Source code: sourceforge.net/projects/libwpd.

Presentations: Document Liberation Project: Trying to Achieve Freedom from Vendor Lock, Writing Import Filters for LibreOffice: Diminishing the Number of Reinvented Wheels by Fridrich Strba.

Blogs: Writing import libraries with librevenge, part I: Getting started.

librevenge installation on Fedora 20

$ sudo yum install boost-devel zlib-devel doxygen automake cppunit-devel cppunit-doc
$ git clone git://git.code.sf.net/p/libwpd/librevenge libwpd-librevenge
$ cd libwpd-librevenge
$ sh autogen.sh
$ ./configure
$ make
$ sudo make install

Doxygen documentation is installed into file:///usr/local/share/doc/librevenge/html/index.html.

There is a program, project-generator, which generates a template converter.

$ git clone git://git.code.sf.net/p/libwpd/project-generator libwpd-project-generator
$ cd libwpd-project-generator
$ ./project-generator -p libpcwrite -a 'Glen Turner' -e 'spamtrap@gdt.id.au' -d 'Import library for PC-Write documents'
$ mv libpcwrite ..

Sample code

Let's make the command-line WordPerfect filter compile so we can see how things are meant to work.

Firstly, the library which does the heavy lifting:

$ git clone git://git.code.sf.net/p/libwpd/code libwpd-code
$ cd libwpd-code
$ sh autogen.sh
$ PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./configure
$ make
$ sudo make install

At first glance that code seems to have one file per structural element of the document. We'll have a closer look at that later.

Secondly, a helper library to export to ODF:

$ git clone git://git.code.sf.net/p/libwpd/libodfgen libwpd-libodfgen
$ cd libwpd-libodfgen
$ sh autogen.sh
$ PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./configure
$ make
$ sudo make install

Thirdly, a library for WordPerfect and other Corel images:

$ git clone git://git.code.sf.net/p/libwpg/code libwpg-code
$ cd libwpg-code
$ sh autogen.sh
$ PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./configure
$ make
$ sudo make install

Finally, the command-line wrapper:

$ git clone git://git.code.sf.net/p/libwpd/writerperfect libwpd-writerperfect
$ cd libwpd-writerperfect
$ sh autogen.sh
$ PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./configure
$ make
$ sudo make install

Sample code to example code

$ ./project-generator -p libpcwrite -a 'Glen Turner' -e 'libpcwrite@gdt.id.au' -d 'Import library for PC-Write documents'
$ mv libpcwrite ..
$ cd ../libpcwrite
$ bash autogen.sh
libtoolize: putting auxiliary files in `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
configure.ac:27: installing './ar-lib'
configure.ac:34: installing './config.guess'
configure.ac:34: installing './config.sub'
configure.ac:20: installing './install-sh'
configure.ac:20: installing './missing'
src/conv/html/Makefile.am: installing './depcomp'
parallel-tests: installing './test-driver'
$ PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /usr/bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports nested variables... (cached) yes
checking for style of include used by make... GNU
checking for g++... g++
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking dependency style of g++... gcc3
checking for ar... ar
checking the archiver (ar) interface... ar
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... gcc3
checking how to run the C preprocessor... gcc -E
checking whether we are using the GNU C++ compiler... (cached) yes
checking whether g++ accepts -g... (cached) yes
checking dependency style of g++... (cached) gcc3
checking whether ln -s works... yes
checking whether make sets $(MAKE)... (cached) yes
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking how to print strings... printf
checking for a sed that does not truncate output... /usr/bin/sed
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for fgrep... /usr/bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to convert x86_64-unknown-linux-gnu file names to x86_64-unknown-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for mt... no
checking if : is a manifest tool... no
checking for ANSI C header files... no
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... no
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... no
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /usr/bin/ld -m elf_x86_64
checking if the linker (/usr/bin/ld -m elf_x86_64) is GNU ld... yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking for g++ option to produce PIC... -fPIC -DPIC
checking if g++ PIC flag -fPIC -DPIC works... yes
checking if g++ static flag -static works... no
checking if g++ supports -c -o file.o... yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... (cached) GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.20... yes
checking for REVENGE... yes
checking for boost/scoped_ptr.hpp... yes
checking for boost/shared_ptr.hpp... yes
checking for native Win32... no
checking for Win32 platform in general... no
checking for CPPUNIT... yes
checking for doxygen... /usr/bin/doxygen
checking for REVENGE_GENERATORS... yes
checking for REVENGE_STREAM... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating inc/Makefile
config.status: creating inc/libpcwrite/Makefile
config.status: creating src/Makefile
config.status: creating src/conv/Makefile
config.status: creating src/lib/Makefile
config.status: creating src/lib/libpcwrite.rc
config.status: creating src/conv/html/Makefile
config.status: creating src/conv/html/pcwrite2html.rc
config.status: creating src/conv/raw/Makefile
config.status: creating src/conv/raw/pcwrite2raw.rc
config.status: creating src/conv/text/Makefile
config.status: creating src/conv/text/pcwrite2text.rc
config.status: creating src/test/Makefile
config.status: creating build/Makefile
config.status: creating build/win32/Makefile
config.status: creating docs/Makefile
config.status: creating docs/doxygen/Makefile
config.status: creating libpcwrite-0.1.pc
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands
configure:
==============================================================================
Build configuration:
    debug:           no
    docs:            yes
    tests:           yes
    tools:           yes
    werror:          yes
==============================================================================
$ make
make  all-recursive
make[1]: Entering directory `/home/gdt/libpcwrite/libpcwrite'
Making all in build
make[2]: Entering directory `/home/gdt/libpcwrite/libpcwrite/build'
Making all in win32
make[3]: Entering directory `/home/gdt/libpcwrite/libpcwrite/build/win32'
make[3]: Nothing to be done for `all'.
make[3]: Leaving directory `/home/gdt/libpcwrite/libpcwrite/build/win32'
make[3]: Entering directory `/home/gdt/libpcwrite/libpcwrite/build'
make[3]: Nothing to be done for `all-am'.
make[3]: Leaving directory `/home/gdt/libpcwrite/libpcwrite/build'
make[2]: Leaving directory `/home/gdt/libpcwrite/libpcwrite/build'
Making all in inc
make[2]: Entering directory `/home/gdt/libpcwrite/libpcwrite/inc'
Making all in libpcwrite
make[3]: Entering directory `/home/gdt/libpcwrite/libpcwrite/inc/libpcwrite'
make[3]: Nothing to be done for `all'.
make[3]: Leaving directory `/home/gdt/libpcwrite/libpcwrite/inc/libpcwrite'
make[3]: Entering directory `/home/gdt/libpcwrite/libpcwrite/inc'
make[3]: Nothing to be done for `all-am'.
make[3]: Leaving directory `/home/gdt/libpcwrite/libpcwrite/inc'
make[2]: Leaving directory `/home/gdt/libpcwrite/libpcwrite/inc'
Making all in src
make[2]: Entering directory `/home/gdt/libpcwrite/libpcwrite/src'
Making all in lib
make[3]: Entering directory `/home/gdt/libpcwrite/libpcwrite/src/lib'
  CXX      PCWRITEDocument.lo
PCWRITEDocument.cpp:33:30: error: unused parameter 'input' [-Werror=unused-parameter]
 PCWRITEAPI PCWRITEDocument::Confidence PCWRITEDocument::isSupported(librevenge::RVNGInputStream *const input, Type *const type) try
                              ^
PCWRITEDocument.cpp:59:26: error: unused parameter 'document' [-Werror=unused-parameter]
 PCWRITEAPI PCWRITEDocument::Result PCWRITEDocument::parse(librevenge::RVNGInputStream *const input, librevenge::RVNGTextInterface *const document, const PCWRITEDocument::Type type, const char *const) try
                          ^
cc1plus: all warnings being treated as errors
make[3]: *** [PCWRITEDocument.lo] Error 1
make[3]: Leaving directory `/home/gdt/libpcwrite/libpcwrite/src/lib'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/gdt/libpcwrite/libpcwrite/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/gdt/libpcwrite/libpcwrite'
make: *** [all] Error 2

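The build stops because the generated stubs don't yet use their parameters and the default configuration treats warnings as errors (werror: yes). The errors do show where the work goes: the filter is implemented in libpcwrite/src/lib/PCWRITEDocument.cpp, in PCWRITEDocument::parse().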
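
One way to get an empty stub past -Werror=unused-parameter while the real body is still to be written is to leave the parameter unnamed, or to cast it to void. The sketch below shows both. StubStream, StubConfidence and the two function names are made-up stand-ins so the example compiles on its own; they are not the librevenge or generated libpcwrite declarations.

```cpp
// Hypothetical stand-ins, NOT the real librevenge / libpcwrite types.
struct StubStream {};
enum class StubConfidence { None, Excellent };

// Option 1: leave the parameter unnamed until the body needs it.
// The compiler has no name to complain about.
StubConfidence isSupportedUnnamed(StubStream *const /* input */)
{
    return StubConfidence::None;
}

// Option 2: keep the name, but mark it as deliberately unused.
// Handy when the name documents what will be read later.
StubConfidence isSupportedVoidCast(StubStream *const input)
{
    (void) input; // silences -Wunused-parameter
    return StubConfidence::None;
}
```

Either form is only a placeholder: once isSupported() and parse() actually read from their arguments the warning goes away by itself.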

Determining the file type, PCWRITEDocument::isSupported()

librevenge::RVNGInputStream *const input is the file to check.

Type *const type does something undocumented.

Returns a CONFIDENCE_* value.
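
As a rough idea of what a detection routine could do, here is a sketch which works on a buffer already read into memory rather than on an RVNGInputStream. It looks for the three-byte start of a dot command (\x0b, a full stop, then a command letter, as in the grammar in this post). The heuristic itself is my assumption: a file which opens with a dot command rates highly, one with a dot command at the start of a later line rates weakly. The Guess enum and the function names are hypothetical stand-ins, not the generated CONFIDENCE_* values.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the generated CONFIDENCE_* values.
enum class Guess { None, Weak, Excellent };

// True if the bytes at pos are: 0x0b, '.', command letter.
bool hasDotPrefix(const std::string &buf, std::size_t pos)
{
    if (buf.size() < 3 || pos > buf.size() - 3)
        return false;
    const char c = buf[pos + 2];
    const bool letter = (c >= 'A' && c <= 'Z') || c == '.' || c == '+' || c == '-';
    return buf[pos] == '\x0b' && buf[pos + 1] == '.' && letter;
}

// Assumed heuristic: Excellent if the buffer opens with a dot command,
// Weak if one starts right after a later CR LF, None otherwise.
Guess guessPcWrite(const std::string &buf)
{
    if (hasDotPrefix(buf, 0))
        return Guess::Excellent;
    std::size_t pos = 0;
    while ((pos = buf.find("\r\n", pos)) != std::string::npos) {
        pos += 2; // step over CR LF to the start of the next line
        if (hasDotPrefix(buf, pos))
            return Guess::Weak;
    }
    return Guess::None;
}
```

A plain text file with no \x0b bytes comes back as Guess::None, which is the case that matters most: it lets other filters have a go at the file.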

PC-Write dot commands look like this:

dot-command ::= alt-g . command-letter printable-string new-line
alt-g ::= \x0b
command-letter ::= A | … | Z | . | + | -
printable-string ::= printable-character* | null
printable-character ::= \x20 | … | \x7e
new-line ::= cr lf
cr ::= \x0d
lf ::= \x0a
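
The grammar above translates fairly directly into a small recogniser. This is a sketch on an in-memory buffer using only the standard library; the DotCommand struct and the function names are illustrative and are not part of librevenge or the generated libpcwrite code.

```cpp
#include <cstddef>
#include <string>

// One parsed dot command. Names are illustrative only.
struct DotCommand {
    char letter;       // command-letter
    std::string text;  // printable-string, possibly empty
};

// command-letter ::= A | ... | Z | . | + | -
bool isCommandLetter(char c)
{
    return (c >= 'A' && c <= 'Z') || c == '.' || c == '+' || c == '-';
}

// printable-character ::= \x20 | ... | \x7e
bool isPrintable(char c)
{
    return c >= '\x20' && c <= '\x7e';
}

// Tries to parse one dot command starting at buf[pos].
// On success: fills out, moves pos past the CR LF, returns true.
// On failure: leaves pos and out untouched, returns false.
bool parseDotCommand(const std::string &buf, std::size_t &pos, DotCommand &out)
{
    std::size_t i = pos;
    if (buf.size() < 3 || i > buf.size() - 3)
        return false;
    // alt-g . command-letter
    if (buf[i] != '\x0b' || buf[i + 1] != '.' || !isCommandLetter(buf[i + 2]))
        return false;
    const char letter = buf[i + 2];
    i += 3;
    // printable-string
    const std::size_t start = i;
    while (i < buf.size() && isPrintable(buf[i]))
        ++i;
    // new-line ::= cr lf
    if (i + 1 >= buf.size() || buf[i] != '\r' || buf[i + 1] != '\n')
        return false;
    out.letter = letter;
    out.text = buf.substr(start, i - start);
    pos = i + 2;
    return true;
}
```

A caller would try parseDotCommand() at the start of each line and fall back to treating the line as ordinary text when it returns false. What each command letter means is a separate question which the grammar doesn't answer.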
