Converting TeX to HTML: was Looking for a serious TeX hacker

Deyan Ginev deyan.ginev at gmail.com
Sun Jun 18 17:09:40 CEST 2023


Hi Bill,

You've waded into some off-topic waters, but I'll try to briefly clarify...

On Sat, Jun 17, 2023 at 8:45 PM William F Hammond <hmwlfsr at yahoo.com> wrote:

> Hi Deyan,
>
> You write:
>
> > ...
> > LaTeXML is a substantial software application written in
> > Perl, ...
> > . . .
> > Lastly, most of latexml's functionality does not require a
> > locally installed texlive or MikTeX (similarly to pandoc,
> > plastex, tralics, hevea, ...) and can be installed and used
> > standalone.
>
> Have you (or Bruce) thought about a second LaTeXML output
> stream for tagged PDF?  I expect that eventually arXiv and
> others will want that.
>

Based on what I have been hearing from the LaTeX team recently, this should
be solvable via the native PDF workflow(s) for LaTeX.
So we shouldn't need to divert development effort from the structured
markup tools to also emit PDF. I'm sure there will be more news in due time.

That said, if someone external wrote a post-processor for latexml that
emits PDF as you described, and did it well, I don't think we have any
a priori reason to keep such a contribution out of the main project.
But I wouldn't encourage spending effort there, when one can be fruitfully
working on improving support for our HTML conversions.

Lastly, there is also the web-native perspective. We get a lot of great
features from browser vendors nowadays.
I recently noticed a mention that Chrome's "Print - Save as PDF"
feature now emits Tagged PDF from HTML pages:
https://pdfa.org/chrome-adds-support-for-tagged-pdf/

I haven't really tried or studied those capabilities in detail, but it's
certainly easier to imagine developing some small upgrades to our HTML
dialect optimized for that print feature (and a print-oriented CSS theme),
than to develop a PDF emitter from scratch.
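
To make that route concrete, here is a rough sketch of what such a pipeline
might look like: latexmlc produces HTML5 (optionally with a print-oriented
stylesheet), and a headless Chromium-family browser prints that page to PDF.
This is an untested illustration, not something the project ships. The
browser binary name "chromium", the file names, and the print.css stylesheet
are all assumptions, and whether the resulting PDF is actually tagged depends
on the browser version and its settings.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def latexml_command(tex: Path, html: Path, css: Optional[Path] = None) -> List[str]:
    """Build a latexmlc call that converts a TeX source to HTML5.

    The optional css argument names a hypothetical print-oriented stylesheet.
    """
    cmd = ["latexmlc", f"--dest={html}", "--format=html5"]
    if css is not None:
        cmd.append(f"--css={css}")
    cmd.append(str(tex))
    return cmd


def chrome_pdf_command(html: Path, pdf: Path, chrome: str = "chromium") -> List[str]:
    """Build a headless browser call that prints an HTML file to PDF.

    The binary name is an assumption; it differs between platforms.
    """
    return [chrome, "--headless", f"--print-to-pdf={pdf}", html.resolve().as_uri()]


def convert(tex: Path, css: Optional[Path] = None, chrome: str = "chromium") -> Path:
    """Run both steps in order and return the path of the PDF."""
    html = tex.with_suffix(".html")
    pdf = tex.with_suffix(".pdf")
    steps = (
        latexml_command(tex, html, css),
        chrome_pdf_command(html, pdf, chrome),
    )
    for cmd in steps:
        # Fail early with a clear message if a tool is not installed.
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)
    return pdf
```

Calling convert(Path("paper.tex"), css=Path("print.css")) would then leave
paper.html and paper.pdf next to the source, provided both tools are
installed. Checking the tagging of the result would still need a separate
PDF inspection tool.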

Greetings,
Deyan


> Another question is whether anyone in the LaTeX Project has
> thought about writing LaTeX for well-tagged PDF beginning
> with LaTeXML's XML.
>
> Just asking.  :-)
>
>                               -- Bill
>
>
> Email: hmwlfsr at yahoo.com
>        gellmu at gmail.com
> https://www.facebook.com/william.f.hammond
> http://www.albany.edu/~hammond/
>
>
> 𝑻𝒉𝒆 𝒕𝒊𝒎𝒆 𝒕𝒐 𝒔𝒂𝒗𝒆 𝒂 𝒅𝒆𝒎𝒐𝒄𝒓𝒂𝒄𝒚 𝒊𝒔 𝒃𝒆𝒇𝒐𝒓𝒆 𝒊𝒕
> 𝒊𝒔 𝒍𝒐𝒔𝒕.
>    -- 𝐊𝐞𝐧 𝐁𝐮𝐫𝐧𝐬
>


More information about the texhax mailing list.