I've always found hyperref one of the best features of LaTeX. Although it
supports Unicode, certain accented characters (in my case, ő and ű) were
mangled in PDF metadata fields such as author and title. I mostly ignored
the issue and reworded the contents, until I ran into a situation where
changing the data was not an option. To illustrate the issue, I saved the
following example as wrong.tex and compiled it with the pdflatex wrong.tex
command.
\documentclass{report}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[unicode, pdftitle={Árvíztűrő tükörfúrógép}]{hyperref}
\begin{document}
foobar
\end{document}
The result could be checked with pdfinfo, and it was far from what I expected.
$ pdfinfo wrong.pdf | grep Title
Title: Árvízt¶r® tükörfúrógép
I searched the web and was disappointed at first, having found only unsolved
forum threads, including one written by a fellow Hungarian. Finally, I opened
the TeX section of the Stack Exchange network and started typing a title for
my question. Based on this, the site offered a number of possibly related
posts, and I browsed through them out of curiosity. As it turned out, the
solution lay in a post about Polish characters in pdftitle, and in retrospect
it seems obvious, like any other great idea. As Schweinebacke writes,
“The optional argument of \usepackage is read by the LaTeX kernel, so hyperref
cannot change scanning of the argument”. The problem can be eliminated simply
by moving the title setup into a separate \hypersetup command. And behold, the
pilcrow and the registered sign are gone, as seen in the following example.
$ diff wrong.tex right.tex
5c5,6
< \usepackage[unicode, pdftitle={Árvíztűrő tükörfúrógép}]{hyperref}
---
> \usepackage[unicode]{hyperref}
> \hypersetup{pdftitle={Árvíztűrő tükörfúrógép}}
$ pdfinfo right.pdf | grep Title
Title: Árvíztűrő tükörfúrógép
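For reference, here is a sketch of what the complete right.tex might look like once the diff is applied. The introduction mentions that the author field was affected as well, so the sketch also sets pdfauthor through the same \hypersetup call. The author name is a hypothetical placeholder that was not part of the original example, and I have only verified the title case shown above.

```latex
\documentclass{report}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
% Keep only plain-ASCII options in the \usepackage argument,
% because that argument is read by the LaTeX kernel.
\usepackage[unicode]{hyperref}
% Set metadata containing accented characters afterwards via \hypersetup.
\hypersetup{
  pdftitle={Árvíztűrő tükörfúrógép},
  pdfauthor={Szerző Győző}% hypothetical author name, for illustration only
}
\begin{document}
foobar
\end{document}
```

After compiling with pdflatex right.tex, both fields can be inspected by running pdfinfo right.pdf and looking at the Title and Author lines.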