As with PHP, OOXML (and docx in particular) is not my favorite format, but when I use it, I prefer to track its history with my SCM of choice, Git. What makes Git perfect for tracking documents is not only that setting up a repository takes one command and a few milliseconds, but also its ability to run an external program that transforms artifacts (files) to text before displaying differences, which results in meaningful diffs.
The process of setting up an environment like this is described best in Chapter 7.2 of Pro Git. The solution I found best for converting docx files to plain text was docx2txt, especially since it's available as a Debian package in the official repositories, so it takes only an apt-get install docx2txt to have it installed on a Debian/Ubuntu box.
The only problem was that Git executes the text conversion program with the name of the input file given as the first and only argument, and docx2txt (in contrast with catdoc or antiword, which write to standard output) saves the text content of foo.docx in foo.txt. Because of this, I needed to create a wrapper in the form of the following small shell script.
#!/bin/sh
# Feed the docx on stdin so that docx2txt writes the text to stdout
docx2txt <"$1"
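If the wrapper should fail loudly on a missing or unreadable file, a slightly more defensive variant could look like the sketch below. The function name docx_to_text and the DOCX2TXT override variable are my own illustrative assumptions, not part of Git or docx2txt; the conversion itself is the same stdin redirection as in the script above.

```shell
#!/bin/sh
# Hypothetical, more defensive variant of the wrapper.
# DOCX2TXT is an assumed override variable (not a Git or docx2txt feature);
# it defaults to the docx2txt command used in this post.
docx_to_text() {
    # Git passes exactly one argument: the path of the file to convert
    if [ "$#" -ne 1 ] || [ ! -r "$1" ]; then
        echo "usage: docx_to_text FILE.docx" >&2
        return 1
    fi
    # Quote the path so file names containing spaces survive
    "${DOCX2TXT:-docx2txt}" <"$1"
}

# When Git runs this as a script, convert the file named in the first argument
[ "$#" -eq 0 ] || docx_to_text "$@"
```

Saved as the wrapper script, this behaves like the two-line version for normal input, but prints a usage message and returns a non-zero status when the argument is missing or the file cannot be read.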
That being done, the only thing left to do is to configure Git to use this wrapper for docx files by issuing the following commands in the root of the repository.
$ git config diff.docx.textconv /path/to/wrapper.sh
$ echo "*.docx diff=docx" >>.git/info/attributes
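To check that the configuration took effect, the same steps can be replayed in a throwaway repository and then queried with git check-attr and git config --get. This is only a sketch: the temporary directory, the stand-in wrapper.sh, and the file name report.docx are placeholders I made up for illustration.

```shell
#!/bin/sh
# Sketch: wire up the textconv driver in a throwaway repository and check
# that Git picks it up. Paths and file names here are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Stand-in for the wrapper script from this post; it has to be executable
# because Git runs it as a command.
printf '#!/bin/sh\ndocx2txt <"$1"\n' >wrapper.sh
chmod +x wrapper.sh

git config diff.docx.textconv "$repo/wrapper.sh"
mkdir -p .git/info
echo "*.docx diff=docx" >>.git/info/attributes

# Ask Git which diff driver applies to a (hypothetical) report.docx
git check-attr diff -- report.docx
# Show the command Git will run for that driver
git config --get diff.docx.textconv
```

The check-attr line should report the docx driver for report.docx, and the last command should print the path of the wrapper. Neither check needs docx2txt itself; it only comes into play once git diff runs on a real docx file.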