[tex4ht] [bug #611] Random SIGSEGV of tex4ht due to invalid memory accesses
Oliver Freyermuth
puszcza-hackers at gnu.org.ua
Thu Oct 5 21:31:55 CEST 2023
URL:
<http://puszcza.gnu.org.ua/bugs/?611>
Summary: Random SIGSEGV of tex4ht due to invalid memory accesses
Project: tex4ht
Submitted by: olifre
Submitted on: Thu Oct 5 19:31:55 2023
Category: None
Priority: 5 - Normal
Severity: 7 - Important
Status: None
Privacy: Public
Assigned to: None
Originator Email:
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Details:
I've been haunted by "Illegal storage address" errors for large documents with
many fonts for quite a while (they come and go with tex4ht releases and even
change depending on the shell environment), and unrelated changes to the
document (e.g. reducing the number of fonts) often make them disappear.
I believe I have finally traced the underlying issue to an invalid memory
access in tex4ht. It is reproducible with an MWE (which itself does not crash)
and visible with valgrind / gdb.
Reproducer:
1) Create foo.tex with content:
-----
\documentclass{scrbook}
\begin{document}
\section{foo}
\subsection{bar}
Foo
\end{document}
------
2) Run:
make4ht --utf8 --output-dir html foo
3) Re-run the tex4ht step with valgrind:
valgrind tex4ht -cmozhtf -utf8 foo.dvi
This reveals:
-----
==4487== Conditional jump or move depends on uninitialised value(s)
==4487== at 0x10EE14: main (tex4ht.c:8099)
==4487==
==4487== (action on error) vgdb me ...
==4487== Continuing ...
==4487== Conditional jump or move depends on uninitialised value(s)
==4487== at 0x10EE16: main (tex4ht.c:8108)
==4487==
==4487== (action on error) vgdb me ...
==4487== Continuing ...
==4487== Invalid read of size 4
==4487== at 0x10E794: main (tex4ht.c:8741)
==4487== Address 0x8beb1f8 is 8 bytes before a block of size 2 alloc'd
==4487== at 0x48407C4: malloc (vg_replace_malloc.c:431)
==4487== by 0x11851B: malloc_chk (tex4ht.c:1481)
==4487== by 0x10EC30: main (tex4ht.c:7104)
==4487==
-----
The invalid read is the most worrisome. It originates from the source lines:
8740 if( span_on && !in_span_ch && !ignore_chs && !in_accenting
8741 && (default_font != font_tbl[cur_fnt].num) ){
It is caused by the part:
(default_font != font_tbl[cur_fnt].num)
being evaluated while the index cur_fnt is negative:
(gdb) p default_font
$12 = -1
(gdb) p cur_fnt
$13 = -1
With cur_fnt == -1, the expression reads memory located before the start of
font_tbl. If the document has many (many!) fonts, the memory dynamically
allocated and subsequently freed by opendir/closedir while looking for the htf
files ends up right before the font_tbl array, and depending on page
boundaries, this read with a negative index may result in a SIGSEGV.
Since I don't understand the full logic of the code, I'm not fit to propose a
(good) fix.
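Purely as an illustration of the kind of guard that would avoid the
out-of-bounds read, and explicitly not a tested patch against tex4ht.c, the
comparison could be wrapped in a bounds check. The struct, the helper name and
the table-size parameter below are hypothetical stand-ins, and treating "no
current font" as "no difference from the default" is an assumption about the
intended behaviour:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a font table entry; only the field that is
   read in the failing comparison is modelled here. */
struct font_entry_sketch {
    int num;
};

/* Returns 1 if the font selected by cur_fnt differs from default_font,
   0 otherwise. A negative or too-large cur_fnt never indexes the table. */
static int font_differs_from_default(const struct font_entry_sketch *font_tbl,
                                     int font_tbl_size,
                                     int cur_fnt,
                                     int default_font)
{
    assert(font_tbl_size >= 0);
    if (font_tbl == NULL || cur_fnt < 0 || cur_fnt >= font_tbl_size) {
        /* Assumption: with no valid current font there is nothing to
           compare, so report "no difference". */
        return 0;
    }
    return default_font != font_tbl[cur_fnt].num;
}
```

With cur_fnt == -1 and default_font == -1, as in the gdb session above, this
helper returns 0 without touching font_tbl at all.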
This might be affecting other users, too: there are reports of
"Illegal storage address" on TeX Stack Exchange which in some cases were
"fixed" by unrelated document changes.
Nota bene:
The two "Conditional jump or move depends on uninitialised value(s)" are from
the lines:
if( value == htf_4hf[mid].ch ){
and
} else if( value < htf_4hf[mid].ch ){
since htf_4hf seems to be used (in some cases) before being initialized. This
does not lead to a crash, though, since it's not an invalid read.
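As a rough sketch of how such a lookup can be kept away from uninitialised
entries, a binary search can be limited to the number of entries that have
actually been filled in. I have not checked how tex4ht tracks this, so the
struct, the function name and the n_filled parameter are all hypothetical;
only the two comparisons mirror the lines quoted above:

```c
/* Hypothetical stand-in for an htf_4hf entry; only the ch field is modelled. */
struct htf_4hf_sketch {
    int ch;
};

/* Binary search for value among the first n_filled entries of tbl, which
   must be sorted by ch. Returns the index of the match, or -1 if there is
   none. Entries at index >= n_filled are never read, so they may remain
   uninitialised. */
static int find_ch(const struct htf_4hf_sketch *tbl, int n_filled, int value)
{
    int lo = 0;
    int hi = n_filled - 1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (value == tbl[mid].ch) {
            return mid;
        } else if (value < tbl[mid].ch) {
            hi = mid - 1;
        } else {
            lo = mid + 1;
        }
    }
    return -1;
}
```

With n_filled == 0 the loop body never runs, so a table that has not been
filled yet is never read.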
_______________________________________________________
Reply to this item at:
<http://puszcza.gnu.org.ua/bugs/?611>
_______________________________________________
Message sent via/by Puszcza
http://puszcza.gnu.org.ua/