Perl Unicode Cookbook: Make File I/O Default to UTF-8
℞ 17: Make file I/O default to utf8
If you’ve ever had the misfortune of seeing the warning “Wide character in print”, you may have realized that some part of your program forgot to set a Unicode-capable encoding on a filehandle. Remember that the rule of Unicode handling in Perl is “always encode and decode at the edges of your program”.
You can easily decode STDIN, STDOUT, and STDERR as UTF-8 by default, or decode them per local settings by default, or you can use binmode to set the encoding on a specific filehandle.
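As a minimal sketch of the binmode approach, the following sets the encoding on one filehandle only. The helper name encode_via_binmode is hypothetical, and an in-memory filehandle stands in for a real file so that the encoded bytes are easy to inspect.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text through a handle whose encoding
# was set with binmode, and return the bytes that were written.
sub encode_via_binmode {
    my ($text) = @_;

    my $buffer = '';
    open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";

    # Set the encoding on this one handle; other handles are unaffected.
    binmode $fh, ':encoding(UTF-8)' or die "Cannot set encoding: $!";

    print {$fh} $text;
    close $fh or die "Cannot close in-memory handle: $!";

    return $buffer;    # UTF-8 encoded bytes
}

# U+263A is a wide character (code point above 255).
my $smiley_bytes = encode_via_binmode("\x{263A}");
printf "%d byte(s) written\n", length($smiley_bytes);
```

The same binmode call works on STDOUT or on any handle returned by open.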
Alternatively, you can set the default encoding on all filehandles throughout the entire program, or on a lexical basis. As documented in perldoc perlrun, the -C flag and the PERL_UNICODE environment variable are available. Use the D option to make all filehandles default to UTF-8 encoding. That is, files opened without an encoding argument will be in UTF-8:
$ perl -CD ...
# or
$ export PERL_UNICODE=D
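To confirm from inside a program that the D option took effect, one possibility is to inspect the ${^UNICODE} variable, which reflects the -C settings. The sketch below assumes the bit values listed in perlrun (i is 8, o is 16, and D is their sum); the helper name io_defaults_to_utf8 is hypothetical.

```perl
use strict;
use warnings;

# Bit values of the -C letters, as listed in perlrun.
use constant {
    UNICODE_IN_DEFAULT  => 8,     # "i": UTF-8 is the default layer for input streams
    UNICODE_OUT_DEFAULT => 16,    # "o": UTF-8 is the default layer for output streams
};

# Hypothetical helper: true when both halves of "D" (i + o = 24) are set.
sub io_defaults_to_utf8 {
    my ($flags) = @_;
    my $wanted = UNICODE_IN_DEFAULT | UNICODE_OUT_DEFAULT;
    return ( ( $flags & $wanted ) == $wanted ) ? 1 : 0;
}

# ${^UNICODE} holds the settings this perl was started with.
my $message = io_defaults_to_utf8( ${^UNICODE} )
    ? "File I/O defaults to UTF-8\n"
    : "File I/O does not default to UTF-8\n";
print $message;
```

Running the script with and without -CD shows the difference.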
The open pragma configures the default encoding of all filehandle operations in its lexical scope:
use open qw(:encoding(UTF-8));
Note that the open pragma is currently incompatible with the autodie pragma.
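The lexical scope of the open pragma can be illustrated with a small round trip through a temporary file. This is a sketch under stated assumptions: the helper name round_trip_utf8 is hypothetical, File::Temp supplies the scratch file, and errors are checked explicitly rather than with autodie.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $text under the open pragma, read it back the
# same way, then re-read the file as raw bytes from outside the pragma's scope.
sub round_trip_utf8 {
    my ($text) = @_;

    my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
    close $tmp_fh or die "Cannot close $path: $!";

    my $decoded;
    {
        # Only open() calls inside this block get the UTF-8 default.
        use open qw(:encoding(UTF-8));

        open my $out, '>', $path or die "Cannot write $path: $!";
        print {$out} $text;    # encoded to UTF-8 on output
        close $out or die "Cannot close $path: $!";

        open my $in, '<', $path or die "Cannot read $path: $!";
        $decoded = do { local $/; <$in> };    # decoded from UTF-8 on input
        close $in or die "Cannot close $path: $!";
    }

    # Outside the block the pragma no longer applies; :raw asks for the bytes.
    open my $raw, '<:raw', $path or die "Cannot read $path: $!";
    my $bytes = do { local $/; <$raw> };
    close $raw or die "Cannot close $path: $!";

    return ( $decoded, $bytes );
}

my ( $decoded, $bytes ) = round_trip_utf8("\x{263A}");
printf "%d character(s), %d byte(s)\n", length($decoded), length($bytes);
```

The single character U+263A should come back as one character through the decoding handle and as three bytes through the raw one.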