Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs.watch doesn't support Chinese characters #2088

Closed
daysv opened this issue Jul 1, 2015 · 21 comments
Closed

fs.watch doesn't support Chinese characters #2088

daysv opened this issue Jul 1, 2015 · 21 comments
Labels
confirmed-bug Issues with confirmed bugs. fs Issues and PRs related to the fs subsystem / file system.

Comments

@daysv
Copy link

daysv commented Jul 1, 2015

rename

info

io.js, v2.3.1
Windows 7, 64bit

@trevnorris
Copy link
Contributor

I just tried this on Linux 3.19 x64 and it ran fine. This may be a windows specific issue.

Can you please post the test script to make sure we're attempting the same thing?

@daysv
Copy link
Author

daysv commented Jul 1, 2015

var root = path.join(__dirname,'./doc');
fs.watch(root,function(event,filename){
    console.log(event,filename);
});

Create a new folder and try to rename it to 新建文夹件

@trevnorris
Copy link
Contributor

Can you output filename.length and also new Buffer(filename)?

@daysv
Copy link
Author

daysv commented Jul 1, 2015

I can't output them. @trevnorris

@trevnorris
Copy link
Contributor

The command console.log(new Buffer(filename)) doesn't print anything? Make sure it's the only statement in the callback.

@daysv
Copy link
Author

daysv commented Jul 1, 2015

sorry.It works and I got them.

@trevnorris
Copy link
Contributor

Can you post the output?

@daysv
Copy link
Author

daysv commented Jul 1, 2015

length: 15
buffer: <Buffer c3 a6 c2 96 c2 b0 c3 a5 c2 bb c2 ba c3 a6 c2 96 c2 87 c3 a4 c2 bb c2 b6 c3 a5 c2 a4 c2 b9>

@trevnorris
Copy link
Contributor

Thanks. And yeah. It's borked on Linux too. Didn't test the right thing.

@trevnorris trevnorris added fs Issues and PRs related to the fs subsystem / file system. confirmed-bug Issues with confirmed bugs. labels Jul 1, 2015
@bnoordhuis
Copy link
Member

That's because for better or worse, src/fs_event_wrap.cc encodes the filename as a one-byte (i.e. ISO-8859-1) string. That was changed in 2013 in f674b09, before that the filename was encoded as UTF-8.

On Windows, UTF-8 is the right encoding because libuv decodes the filename to UTF-8 before passing it on to io.js. On Unices, it's complicated; filenames don't have an encoding, they're just byte strings, except on OS X, where they are stored as NFD-normalized UTF-8.

@trevnorris
Copy link
Contributor

@daysv Have a work around for you. Try the following:

fs.watch(PATH, function(event, filename) {
  console.log(Buffer(filename, 'binary').toString());
});

That will reinterpret the one-byte encoded string as UTF8.

@daysv
Copy link
Author

daysv commented Jul 1, 2015

@trevnorris Thanks! ^_^

@trevnorris
Copy link
Contributor

@bnoordhuis Thanks for the commit reference and reasoning. Think this, along with the work around, should be documented?

EDIT: Or possibly this work around could be implemented internally to make the difference transparent.

@bnoordhuis
Copy link
Member

I think we should fix it but it's part of a broader theme where file path handling is sub-optimal. We probably need some platform-specific code for encoding them to JS strings.

@UncleBill
Copy link

I reported same issue at nodejs/node-v0.x-archive#25504.

@seishun
Copy link
Contributor

seishun commented Oct 15, 2015

@bnoordhuis

On Unices, it's complicated; filenames don't have an encoding, they're just byte strings

I don't get it. Doesn't Node.js assume UTF-8 encoding everywhere else?

@UncleBill
Copy link

Hi, @seishun , is your patch included in the last release(v5.1.1)? This issue hasn't solved yet.(Tested on windows 10 64 bit)

@seishun
Copy link
Contributor

seishun commented Dec 5, 2015

@UncleBill the fix was postponed to v6.0.0.

@UncleBill
Copy link

@seishun Oh, got it. Thanks!

@jasnell
Copy link
Member

jasnell commented Mar 9, 2016

@trevnorris @bnoordhuis ... thinking about this one further... one possible solution would be to add an encoding option on the fs.watch options. If specified, the filename would be automatically passed as the result of Buffer(filename, 'binary').toString(options.encoding)

jasnell added a commit to jasnell/node that referenced this issue Mar 25, 2016
This makes several changes:

1. Allow path/filename to be passed in as a Buffer on fs methods
2. Add `options.encoding` to fs.readdir, fs.readdirSync, fs.readlink,
   fs.readlinkSync and fs.watch.
3. Documentation updates

For 1... it's now possible to do:

```js
fs.open(Buffer('/fs/foo/bar'), 'w+', (err, fd) => { });
```

For 2...
```js
fs.readdir('/fs/foo/bar', {encoding:'hex'}, (err,list) => { });

fs.readdir('/fs/foo/bar', {encoding:'buffer'}, (err, list) => { });
```

encoding can also be passed as a string

```js
fs.readdir('/fs/foo/bar', 'hex', (err,list) => { });
```

The default encoding is set to UTF8 so this addresses the
discrepency that existed previously between fs.readdir and
fs.watch handling filenames differently.

Fixes: nodejs#2088
Refs: nodejs#3519
Alternate: nodejs#3401
@jorangreef
Copy link
Contributor

Thanks @jasnell!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. fs Issues and PRs related to the fs subsystem / file system.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants