TEXMFHOME on Windows (for users with long names, diacritics or spaces in their names)

Reinhard Kotucha reinhard.kotucha at gmx.de
Sat May 25 04:25:06 CEST 2024


On 2024-05-24 at 22:48:34 +0200, Denis Bitouzé wrote:

 > Couldn't Powershell be useful here?

Hi Denis,
this is a good point.  But I must admit that I don't have access to
a Windows machine, hence I can't try anything myself.

You have to distinguish between "Windows PowerShell" (powershell.exe),
which is part of any Windows installation, and "PowerShell" (pwsh.exe).
AFAIK the latter must be downloaded and installed explicitly.

The difference is that powershell.exe uses UTF-16LE by default (this is
how filenames are encoded internally on Windows) and that pwsh.exe
uses UTF-8 by default.

   https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_character_encoding?view=powershell-7.4

If the batch file invoking the Perl script is replaced with a pwsh
script, I suppose that things will behave as on Unix.  The drawback is
that users have to download and install external software.

When using "Windows PowerShell" (powershell.exe), it should be possible
to switch from UTF-16LE to UTF-8 within the script, but in this case
UTF-8 output is preceded by a Byte Order Mark (BOM), which can
cause trouble.
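
To illustrate the kind of trouble a leading BOM can cause, here is a
minimal Python sketch.  It assumes a consumer that decodes its input as
plain UTF-8; the directory name is made up for illustration and nothing
here is specific to the installer.

```python
import codecs

# Hypothetical directory name with diacritics and a space.
path = "C:/Users/Zoë Müller/texmf"

# What a BOM-writing producer would emit: EF BB BF, then the UTF-8 text.
data = codecs.BOM_UTF8 + path.encode("utf-8")

# A plain UTF-8 decode keeps the BOM as an invisible U+FEFF at the front,
# so the decoded string no longer equals the intended path.
naive = data.decode("utf-8")

# The "utf-8-sig" codec strips a leading BOM if one is present.
clean = data.decode("utf-8-sig")

print(repr(naive[0]), naive == path, clean == path)
```

A reader that doesn't strip the BOM ends up with a name that looks
right when printed but doesn't match any existing directory.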

On the other hand, if it's possible to tell cmd.exe to use UTF-8 and
the installer works as expected with non-ASCII characters in
file/directory names, I assume that the UTF-8 BOM doesn't hurt, because
it's certainly present in that case too.

The BOM is necessary for UTF-16 and UTF-32 encodings because their
code units are stored as 16-bit or 32-bit numbers, so the byte order
matters.  Characters in UTF-8 are encoded as sequences of single bytes
in a fixed order and thus don't need a BOM.
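
The byte-order point can be checked with a short Python sketch.  It
only uses the standard codecs and the sample character is chosen
arbitrarily.

```python
import codecs

ch = "é"  # U+00E9

# UTF-16 stores each code unit as a 16-bit number, so two byte orders exist.
le = ch.encode("utf-16-le")   # bytes E9 00
be = ch.encode("utf-16-be")   # bytes 00 E9

# UTF-8 defines one fixed byte sequence; there is no order to choose.
u8 = ch.encode("utf-8")       # bytes C3 A9

# Without a BOM, little-endian bytes read as big-endian give U+E900
# instead of U+00E9.
misread = le.decode("utf-16-be")

# With a BOM, the generic "utf-16" decoder detects the byte order itself.
with_bom_le = (codecs.BOM_UTF16_LE + le).decode("utf-16")
with_bom_be = (codecs.BOM_UTF16_BE + be).decode("utf-16")
```

Both BOM-prefixed variants decode to the original character, while the
BOM-less bytes are silently misread when the wrong order is assumed.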

On 2024-05-24 at 21:57:04 +0200, Siep Kroonenberg via tex-live wrote:

 > And if a script sets the codepage to utf-8, then this setting will
 > NOT be inherited by child processes.

I don't think that this is the case with PowerShell.  There is a
variable called $OutputEncoding.  The name wouldn't make sense if the
specified encoding is only used internally and not by child processes.

If the sole problem is that forcing cmd.exe to use UTF-8 requires
admin permissions, I believe that it's worthwhile to keep an eye on
PowerShell, which uses Unicode by default.

Regards,
  Reinhard

-- 
------------------------------------------------------------------
Reinhard Kotucha                            Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover                    mailto:reinhard.kotucha at gmx.de
------------------------------------------------------------------


