Some special characters are changed with asian characters #2185

Open
tonivj5 opened this issue Dec 13, 2015 · 24 comments
Labels
💊 bug Something isn't working 🙇‍♂️ help wanted Need your help

Comments

@tonivj5
Contributor

tonivj5 commented Dec 13, 2015

For example: ñ, ó, ú, é, í, !

I have uploaded some code that shows these errors. Maybe it needs the tag?

The code uploaded: Code

My Gogs version is 0.8.0.1212, the latest one.

Edit:
It's so strange... Same code, same repo and different characters... (Left in line 6 and right in line 10)
Error code

Saludos! 👍

@unknwon
Member

unknwon commented Dec 14, 2015

<meta charset="utf-8"/>

It does not help... 😓

@unknwon
Member

unknwon commented Dec 14, 2015

What is the real encoding of this file?

@unknwon unknwon added the status: needs feedback Tell me more about it label Dec 14, 2015
@unknwon unknwon added this to the 0.9.0 milestone Dec 14, 2015
@tonivj5
Contributor Author

tonivj5 commented Dec 14, 2015

Ok, sorry 😅

Info of this file: text/plain; charset=iso-8859-1

I created it with Eclipse, if that helps...

@unknwon
Member

unknwon commented Dec 14, 2015

After debugging... the encoding Go detects is Big5. I really don't know how to solve this kind of problem, and I don't know how GitHub handles it.

@unknwon
Member

unknwon commented Dec 14, 2015

Maybe https://github.com/gogits/chardet is lacking detection for iso-8859-1.

@unknwon unknwon added 💊 bug Something isn't working 🙇‍♂️ help wanted Need your help and removed status: needs feedback Tell me more about it labels Dec 14, 2015
@unknwon unknwon removed this from the 0.9.0 milestone Dec 14, 2015
@unknwon
Member

unknwon commented Dec 14, 2015

@unknwon
Member

unknwon commented Dec 14, 2015

Good news... golang.org/x/net/html/charset reports the encoding as windows-1252. That is not the same as iso-8859-1, but the result displays correctly.

image

image
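
A possible explanation for "not the same but displays correctly", sketched under assumptions: windows-1252 and ISO-8859-1 differ only in the 0x80-0x9F range, and accented Spanish letters live above 0xA0. The decoder names and the partial lookup table below are illustrative only. They are not taken from Gogs or from golang.org/x/net.

```go
package main

import "fmt"

// cp1252High holds a few (not all) windows-1252 mappings for 0x80-0x9F,
// the only range where it differs from ISO-8859-1.
var cp1252High = map[byte]rune{
	0x80: '€', 0x85: '…', 0x91: '‘', 0x92: '’',
	0x93: '“', 0x94: '”', 0x96: '–', 0x97: '—',
}

// decodeLatin1 maps every byte to the code point of the same value.
func decodeLatin1(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

// decodeCP1252 behaves like decodeLatin1 except for the bytes in cp1252High.
func decodeCP1252(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		if r, ok := cp1252High[c]; ok {
			runes[i] = r
		} else {
			runes[i] = rune(c)
		}
	}
	return string(runes)
}

func main() {
	raw := []byte{0xF1, 0xF3, 0xFA, 0xE9, 0xED} // "ñóúéí" in either charset

	fmt.Println(decodeLatin1(raw) == decodeCP1252(raw))                   // true
	fmt.Println(decodeLatin1([]byte{0x80}) == decodeCP1252([]byte{0x80})) // false
}
```

For a file that only uses letters like ñ or é, either label gives the same text.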

@tonivj5
Contributor Author

tonivj5 commented Dec 14, 2015

Wow, so good! But it's so strange... I thought it would be more difficult to find the problem 👍

@unknwon
Member

unknwon commented Dec 14, 2015

Not sure why I didn't know this Go package existed, but I tested it with some files and it seems to work well. Changing the code now.

@unknwon
Member

unknwon commented Dec 14, 2015

Haha... turns out Gogs is already using this package... not sure why it wasn't being used for detection...

@tonivj5
Contributor Author

tonivj5 commented Dec 14, 2015

hahaha, the mystery of code 👻

@unknwon
Member

unknwon commented Dec 14, 2015

@unknwon unknwon added status: needs feedback Tell me more about it and removed 🙇‍♂️ help wanted Need your help labels Dec 14, 2015
@unknwon unknwon added this to the 0.9.0 milestone Dec 14, 2015
@tonivj5
Contributor Author

tonivj5 commented Dec 14, 2015

It's correct!! 👏 👍

When could you update the binary?

Thanks so much!

@unknwon
Member

unknwon commented Dec 14, 2015

@xxxTonixxx For official binaries you have to wait until the next release, planned around 2015-12-18. A 3rd-party build is available at https://gobuild.io/gogits/gogs/master (wait about 10-15 minutes). It is very hard to use... but it does the job sometimes...

@tonivj5
Contributor Author

tonivj5 commented Dec 14, 2015

Ok, good info, thanks so much! 😄

@unknwon unknwon removed the status: needs feedback Tell me more about it label Dec 14, 2015
@unknwon unknwon closed this as completed Dec 14, 2015
@hho2002

hho2002 commented Dec 28, 2015

With GBK-encoded files the diff is garbled; the encoding is detected as windows-1252.
Cannot display GBK content in diff page #711

@unknwon
Member

unknwon commented Dec 28, 2015

Can you reproduce it on the demo site?

@hho2002

hho2002 commented Dec 28, 2015

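On try.gogs.io the diff shows a blank page, and viewing the file content shows garbled text.
testgbk

On my local Gogs 0.8.10.1217 install the diff is displayed, but garbled. After adding a debug statement, it shows the encoding is detected as windows-1252: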
2015/12/28 22:01:36 [I] Diff Encoding: windows-1252

@unknwon
Member

unknwon commented Dec 28, 2015

Thanks, looks like Go's sub-repo is broken. I'll roll back the changes.

@unknwon unknwon reopened this Dec 29, 2015
@unknwon
Member

unknwon commented Jan 1, 2016

@hho2002 things should be good now.

@xxxTonixxx we need to wait longer on this issue; the Go sub-repo doesn't seem really usable to me.

@unknwon unknwon removed this from the 0.9.0 milestone Jan 1, 2016
@unknwon unknwon added the 🙇‍♂️ help wanted Need your help label Jan 1, 2016
@xiegeo

xiegeo commented Jan 14, 2016

I ran into the same problem, so here are more test cases:
The gb files are Simplified Chinese; the utf files show what's expected.

The test cases are relevant to my use case. I have small CSV files just like the first case, with very few Chinese characters. But, as the second test case shows, there might not be enough information to tell which encoding it is.

I propose a way to define which encoding should be used when a file is not UTF-8. I think Gogs is normally used in small enough groups that a global setting in app.ini should be sufficient.
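
A rough sketch of what I mean, with made-up names: `chooseCharset` and its `fallback` parameter are hypothetical, not existing Gogs functions or settings. The idea is to skip guessing entirely: valid UTF-8 is treated as UTF-8, and anything else uses the charset the administrator configured.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// chooseCharset is a hypothetical helper. It returns "UTF-8" when content is
// valid UTF-8, and otherwise the fallback charset name, which is assumed to
// come from a global configuration value.
func chooseCharset(content []byte, fallback string) string {
	if utf8.Valid(content) {
		return "UTF-8"
	}
	return fallback
}

func main() {
	// A plain ASCII CSV is valid UTF-8, so no fallback is needed.
	fmt.Println(chooseCharset([]byte("id,name\n1,abc\n"), "GBK")) // UTF-8

	// "中文" in GBK is not valid UTF-8, so the configured charset wins.
	fmt.Println(chooseCharset([]byte{0xD6, 0xD0, 0xCE, 0xC4}, "GBK")) // GBK
}
```

This trades flexibility for predictability: a mixed-encoding repository would still render some files wrong, but small files with very few non-ASCII characters would no longer depend on a statistical guess.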

@unknwon
Member

unknwon commented Jan 15, 2016

@xiegeo

xiegeo commented Jan 15, 2016

@unknwon thanks, that was exactly what I was looking for.

@IssueHuntBot

@0maxxam0 has funded $5.00 to this issue. See it on IssueHunt
