[virt-tools-list] [PATCH virt-viewer] Don't set LC_ALL=C during build as that breaks python apps

Daniel P. Berrange berrange at redhat.com
Fri Jul 28 08:53:20 UTC 2017


On Thu, Jul 27, 2017 at 04:00:53PM +0100, Richard W.M. Jones wrote:
> On Tue, Jul 25, 2017 at 01:48:41PM +0100, Daniel P. Berrange wrote:
> > Setting LC_ALL=C breaks python apps doing I/O on UTF-8 source
> > files. In particular this broke glib-mkenums
> > 
> >     Traceback (most recent call last):
> >       File "/usr/bin/glib-mkenums", line 669, in <module>
> >         process_file(fname)
> >       File "/usr/bin/glib-mkenums", line 406, in process_file
> >         line = curfile.readline()
> >       File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
> >         return codecs.ascii_decode(input, self.errors)[0]
> >     UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 849: ordinal not in range(128)
> > 
> > Signed-off-by: Daniel P. Berrange <berrange at redhat.com>
> > ---
> > 
> > Pushed to fix rawhide build
> > 
> >  maint.mk | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/maint.mk b/maint.mk
> > index 79104d0..2e70cae 100644
> > --- a/maint.mk
> > +++ b/maint.mk
> > @@ -117,8 +117,8 @@ news-check-lines-spec ?= 1,10
> >  news-check-regexp ?= '^\*.* $(VERSION_REGEXP) \($(today)\)'
> >  
> >  # Prevent programs like 'sort' from considering distinct strings to be equal.
> > -# Doing it here saves us from having to set LC_ALL elsewhere in this file.
> > -export LC_ALL = C
> > +# Doing it here saves us from having to set LC_COLLATE elsewhere in this file.
> > +export LC_COLLATE = C
> 
> I don't know what the answer is, but two observations:
> 
> (1) We had the same problem in libguestfs and this was our fix:
> 
> https://github.com/libguestfs/libguestfs/commit/f861c138550a0c99247a6955aa2c594f380867f4

Hmm, unsetting LC_ALL means the output of the script can be affected by
the sort ordering of the user's locale, which is why I kept setting
LC_COLLATE.
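
To illustrate the sort-ordering concern, here is a minimal Python sketch. It
is not part of the patch, and the helper name sorted_in_locale is made up for
this example. It sorts the same words under an explicitly chosen LC_COLLATE,
which is roughly what 'sort' does when that variable is exported. The
en_US.UTF-8 step assumes that locale is installed and is skipped otherwise.

```python
import locale


def sorted_in_locale(words, name):
    """Sort words using the collation rules of the locale `name`.

    Hypothetical helper for illustration only.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # always restore


words = ["b", "A", "a", "B"]

# The "C" locale collates by code point, so uppercase sorts before lowercase.
print(sorted_in_locale(words, "C"))

# A user locale may interleave cases; the exact order depends on the platform.
try:
    print(sorted_in_locale(words, "en_US.UTF-8"))
except locale.Error:
    print("en_US.UTF-8 is not installed here; skipping")
```

Pinning only LC_COLLATE keeps this ordering stable while leaving the
character encoding (LC_CTYPE) under the control of the user's environment.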


It seems that a better approach would be to use C.UTF-8, and then fall back
to en_US.UTF-8 on systems which lack it, since en_US is still pretty close
to C in its semantics, while supporting UTF-8 everywhere. E.g. change maint.mk
to be

 export LC_ALL = $(shell LC_ALL=C.UTF-8 locale -ck charmap 2>/dev/null | \
                         grep -qi UTF-8 && \
                         echo "C.UTF-8" || echo "en_US.UTF-8")
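
The same "try C.UTF-8, else en_US.UTF-8" selection can be sketched in Python
for scripts that need to choose a locale at runtime. This is only an
illustration under stated assumptions: the function names locale_available
and pick_utf8_locale are hypothetical, and the probe asks setlocale() whether
it accepts the name rather than parsing 'locale -ck charmap' as the make
snippet does.

```python
import locale


def locale_available(name):
    """Return True if setlocale() accepts `name` on this system."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # always restore


def pick_utf8_locale(candidates=("C.UTF-8", "en_US.UTF-8"),
                     is_available=locale_available):
    """Return the first usable candidate locale, or None if none is usable."""
    for name in candidates:
        if is_available(name):
            return name
    return None


if __name__ == "__main__":
    print(pick_utf8_locale())
```

Unlike the make snippet, which falls back to en_US.UTF-8 without checking
it, this sketch returns None when neither candidate is usable, so the
caller has to decide what to do in that case.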
 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



