Accents (non ASCII 7bit chars?) are not handled correctly with vi (git commit, ...) #298

Closed
Skywalker13 opened this issue Aug 21, 2015 · 18 comments

Comments

@Skywalker13

Example (release 10ca1f7):

$ mkdir test

$ cd test

$ git init .
Initialized empty Git repository in S:/test/.git/

$ vi README

# Write some accents like ééééé with vi, then there is a problem
# because a space appears between each character:
  1 é é é é é
# :wq

$ cat README
ééééé
# OK, the content is right even if on the screen (in vi) it was wrong

$ git add README
warning: LF will be replaced by CRLF in README.
The file will have its original line endings in your working directory.

$ git commit
# Same problem with `git commit`, `git commit --amend` of course...
@dscho
Member

dscho commented Aug 21, 2015

I cannot reproduce this behavior here. This is what I did:

  1. From my installation of Git for Windows 2.5.0 64-bit (installed via the installer, not portable Git; the only non-default option is that I checked the experimental "enable filesystem cache" option), I launched Git Bash from the Start menu.
  2. Directly after the terminal opened, I ran "vi".
  3. Then I copied the accented characters from this ticket (because I have a US keyboard) and pasted them.

This is the result:

[screenshot: vi-works-with-accents]

@Skywalker13
Author

With copy/paste, I get blanks added only at the end:
[screenshot]

Otherwise I use the same version, without the filesystem cache, but that seems totally unrelated.

I'm on Windows 8.1 English and I use a fr_CH keyboard.

@nalla

nalla commented Aug 21, 2015

Does echo $LANG output C.UTF-8?

@kostix

kostix commented Aug 21, 2015

What does vim output if you execute :set enc?, :set tenc? and :set fenc? in it? (These commands show Vim's current settings for the three encodings it is aware of.)

@Skywalker13
Author

@nalla nothing...

DevBox2@DevBox2 MINGW64 /s
$ echo $LANG


DevBox2@DevBox2 MINGW64 /s
$

@Skywalker13
Author

@kostix

:set enc
encoding=latin1
:set tenc
termencoding=
:set fenc
fileencoding=

@kostix

kostix commented Aug 21, 2015

@Skywalker13, please copy and paste the output of running locale in the Git Bash window.

@Skywalker13
Author

DevBox2@DevBox2 MINGW64 /s
$ locale
LANG=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_ALL=

@kostix

kostix commented Aug 21, 2015

I'd say vim picks up the wrong encoding from the environment.

I have RC5 installed on Win XP SP3 32-bit with a Cyrillic locale, and I observe the same behaviour as @Skywalker13. I can't type European accented characters directly, but my Cyrillic characters show up the same way: each is followed by a single extra space.

Manually running :set enc=utf-8 in vim fixes the issue right away.
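
To illustrate why a latin1 encoding setting can misrender UTF-8 input, it helps to look at the raw bytes. The following is a minimal sketch, assuming a POSIX shell with od and wc available; the function names are made up for this example. In UTF-8, é is two bytes (0xC3 0xA9), so an editor that assumes a single-byte encoding counts it as two characters. That is consistent with the extra column seen on screen, though it is not a confirmed diagnosis of what vim does internally.

```shell
# 'é' written as explicit UTF-8 octal escapes, so the result does not
# depend on the encoding of this snippet itself.
e_acute() { printf '\303\251'; }

# Number of bytes in the UTF-8 form of 'é'. A single-byte encoding such
# as latin1 would treat each byte as a separate character.
byte_count() { e_acute | wc -c | tr -d ' '; }

# The same bytes as a hex string, with od's spacing and newline removed.
hex_bytes() { e_acute | od -An -tx1 | tr -d ' \n'; }

echo "bytes: $(byte_count)"
echo "hex:   $(hex_bytes)"
```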

@nalla

nalla commented Aug 21, 2015

I'd say LANG somehow becomes unset, or is not set properly. How do you start your shell?

@kostix

kostix commented Aug 21, 2015

By the way, for me locale outputs

$ locale
LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

but it's because I've explicitly set LANG=C in the environment to make Git speak English to me.

@kostix

kostix commented Aug 21, 2015

Running export LANG=C.UTF-8 in Git Bash and re-running vim fixes its encoding setting immediately, so I'm with @nalla on this.

Though I'd honestly expect whatever layer is involved here to assume UTF-8 if the user has LANG or LC_MESSAGES set to a string that lacks the encoding part.
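
The fallback described above could be sketched as a small shell helper. This is only an illustration: with_utf8_codeset is a hypothetical name, not something Git for Windows or MSYS2 ships, and it does not check that the resulting locale actually exists on the system.

```shell
# Hypothetical helper: print a locale name with a UTF-8 codeset filled in
# when the given value has no codeset part of its own.
with_utf8_codeset() {
  case "$1" in
    "")  printf '%s\n' 'C.UTF-8' ;;                      # unset or empty
    *.*) printf '%s\n' "$1" ;;                           # codeset already given
    *@*) printf '%s\n' "${1%%@*}.UTF-8@${1#*@}" ;;       # keep the @modifier last
    *)   printf '%s\n' "$1.UTF-8" ;;                     # bare name such as C or fr_CH
  esac
}

# Example: normalise whatever LANG currently holds for this session only.
LANG=$(with_utf8_codeset "${LANG:-}")
export LANG
echo "LANG=$LANG"
```

With LANG=C set explicitly, as in the comment above, this would yield C.UTF-8; an explicit codeset such as en_US.ISO-8859-1 is left untouched.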

@nalla

nalla commented Aug 21, 2015

C:\Program Files\Git\etc\profile.d seems to be missing lang.sh.. @dscho what do you think?

@nalla

nalla commented Aug 21, 2015

I think a possible permanent workaround is adding

Locale=C
Charset=UTF-8

to your ~/.minttyrc

or setting LANG=C.UTF-8 explicitly in your ~/.bash_profile.
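
For the ~/.bash_profile variant, one way to apply the workaround without adding the line twice is sketched below. add_lang_export is a hypothetical helper name, and the sketch assumes grep supports the -q, -x and -F options (as GNU and POSIX grep do).

```shell
# Hypothetical helper: append the export line to a profile file unless an
# identical line is already present, so repeated runs do not duplicate it.
add_lang_export() {
  profile=$1
  line='export LANG=C.UTF-8'
  touch "$profile"   # make sure the file exists before grepping it
  grep -qxF "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}

# Usage (this writes to your real profile, so it is left commented out):
# add_lang_export "$HOME/.bash_profile"
```

The change takes effect in newly started login shells; an already open Git Bash window would still need export LANG=C.UTF-8 run by hand.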

@kostix

kostix commented Aug 21, 2015

@nalla, please note that it's possible to opt out of using mintty for Git Bash during installation.

For instance, I'm not using mintty and on my system bash.exe is hosted by cmd.exe.

@nalla

nalla commented Aug 21, 2015

Sure! Updated the comment.

@dscho
Member

dscho commented Aug 21, 2015

> I've explicitly set LANG=C in the environment to make Git speak English to me.

@kostix Please note that I stripped the locales out of the installer before releasing 2.5.0. The reason is a combination of installer size and your comments ;-)

@dscho
Member

dscho commented Aug 21, 2015

> C:\Program Files\Git\etc\profile.d seems to be missing lang.sh.. @dscho what do you think?

@nalla That makes a ton of sense, I think.


4 participants