[virt-tools-list] [PATCH virt-viewer] Don't set LC_ALL=C during build as that breaks python apps
Daniel P. Berrange
berrange at redhat.com
Fri Jul 28 08:53:20 UTC 2017
On Thu, Jul 27, 2017 at 04:00:53PM +0100, Richard W.M. Jones wrote:
> On Tue, Jul 25, 2017 at 01:48:41PM +0100, Daniel P. Berrange wrote:
> > Setting LC_ALL=C breaks Python apps doing I/O on UTF-8 source
> > files. In particular, this broke glib-mkenums:
> >
> > Traceback (most recent call last):
> >   File "/usr/bin/glib-mkenums", line 669, in <module>
> >     process_file(fname)
> >   File "/usr/bin/glib-mkenums", line 406, in process_file
> >     line = curfile.readline()
> >   File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
> >     return codecs.ascii_decode(input, self.errors)[0]
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 849: ordinal not in range(128)
> >
> > Signed-off-by: Daniel P. Berrange <berrange at redhat.com>
> > ---
> >
> > Pushed to fix rawhide build
> >
> > maint.mk | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/maint.mk b/maint.mk
> > index 79104d0..2e70cae 100644
> > --- a/maint.mk
> > +++ b/maint.mk
> > @@ -117,8 +117,8 @@ news-check-lines-spec ?= 1,10
> > news-check-regexp ?= '^\*.* $(VERSION_REGEXP) \($(today)\)'
> >
> > # Prevent programs like 'sort' from considering distinct strings to be equal.
> > -# Doing it here saves us from having to set LC_ALL elsewhere in this file.
> > -export LC_ALL = C
> > +# Doing it here saves us from having to set LC_COLLATE elsewhere in this file.
> > +export LC_COLLATE = C
>
> I don't know what the answer is, but two observations:
>
> (1) We had the same problem in libguestfs and this was our fix:
>
> https://github.com/libguestfs/libguestfs/commit/f861c138550a0c99247a6955aa2c594f380867f4

Hmm, unsetting LC_ALL means the output of the script is potentially affected
by the differing sort order of the user's locale, which is why I kept setting
LC_COLLATE.
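
To illustrate the ordering concern, here is a minimal Python sketch. It is
not part of the patch; the function names c_order and locale_order are made
up for illustration. It assumes C-locale collation amounts to byte-wise
comparison, and that a locale such as en_US.UTF-8 may or may not be
installed on the machine running it.

```python
import locale


def c_order(words):
    # Compare the UTF-8 bytes of each word, which mirrors C-locale collation.
    return sorted(words, key=lambda w: w.encode("utf-8"))


def locale_order(words, name="en_US.UTF-8"):
    # Collate using the named locale; return None if it is not installed.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return None
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        # Restore whatever collation setting was active before.
        locale.setlocale(locale.LC_COLLATE, saved)


words = ["b", "A", "a", "B"]
print(c_order(words))       # ['A', 'B', 'a', 'b']
print(locale_order(words))  # depends on the installed locale data
```

The first result is the same everywhere, while the second can vary between
systems, which is the reason for pinning the collation order in maint.mk.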

It seems that a better approach would be to use C.UTF-8, and then fall back
to en_US.UTF-8 on systems which lack it, since en_US is still pretty close
to C in its semantics, while supporting UTF-8 everywhere. E.g. change maint.mk
to be:

  export LC_ALL = $(shell LC_ALL=C.utf-8 locale -ck charmap 2>/dev/null | \
                    grep -i UTF-8 1>/dev/null 2>&1 && \
                    echo "C.UTF-8" || echo "en_US.UTF-8")

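
For anyone wanting to see why a UTF-8 locale matters here, the glib-mkenums
failure can be approximated outside the build with the sketch below. It is
only an illustration: the helper read_first_line and the sample file content
are invented, and the encoding is passed explicitly to stand in for what
Python 3.6 derives from the locale when LC_ALL=C is set.

```python
import os
import tempfile


def read_first_line(path, encoding):
    # Open in text mode with an explicit encoding, standing in for the
    # locale-derived default that Python would otherwise use.
    with open(path, encoding=encoding) as f:
        return f.readline()


# A source line with a non-ASCII character; U+00E9 is 0xc3 0xa9 in UTF-8.
with tempfile.NamedTemporaryFile("wb", suffix=".h", delete=False) as tmp:
    tmp.write("/* caf\u00e9 */\n".encode("utf-8"))
    path = tmp.name

try:
    try:
        read_first_line(path, "ascii")
        ascii_result = "decoded"
    except UnicodeDecodeError as exc:
        # Same class of error as in the glib-mkenums traceback.
        ascii_result = "failed: " + exc.reason
    utf8_result = read_first_line(path, "utf-8")
finally:
    os.remove(path)

print(ascii_result)
print(ascii(utf8_result))  # escaped so printing works on an ASCII-only stdout
```

With the ASCII codec the read fails on byte 0xc3, while the UTF-8 codec
reads the same line without complaint.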
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|