cygpath -w 'a"b'

Brien Oberstein
cygpath -w 'a"b' doesn't seem to translate the double quotes into a
Windows-accessible file name.

Should it? And if not, what is the proper way to translate from Cygwin
filenames with special mapped characters (e.g. " and :)?


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Re: cygpath -w 'a"b'

Warren Young-2
On Jul 14, 2016, at 8:36 AM, Brien Oberstein <[hidden email]> wrote:
>
> cygpath -w 'a"b' doesn't seem to translate the double quotes into a
> Windows-accessible file name.

Double quotes are illegal on NTFS:

  https://msdn.microsoft.com/library/windows/desktop/aa365247.aspx

> what is the proper way to translate from cygwin
> filenames with special mapped characters (eg " and : )?

If you look at such a file name in Explorer, Cygwin (?) seems to be mapping double-quotes to U+F022, which is currently not defined within Unicode:

  http://www.fileformat.info/info/unicode/char/f022/

That’s fine as far as Cygwin goes, but it still isn’t going to make native Windows programs believe that double-quotes are legal in file names.

While playing with all of this, I stumbled across an actual cygpath bug:

  $ mkdir 'the "foo" directory'
  $ cygpath -w 'the "foo" directory/' | od -c
  0000000   t   h   e       "   f   o   o 357 200 242       d   i   r   e
  0000020   c   t   o   r   y   \  \n
  0000027

That is, it translates the second double-quote only.

Attempting something more like what Brien talks about also fails:

  $ mkdir 'a"b'
  $ explorer 'a"b'                        # opens my Documents folder!
  $ explorer "$(cygpath -w 'a"b')"        # ditto
  $ explorer $(echo -e "a\xEF\x80\xA2b")  # opens expected folder

Strange stuff.
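
For reference, the octal bytes 357 200 242 in the od output and the \xEF\x80\xA2 passed to echo -e are the same three bytes: the UTF-8 encoding of U+F022. A minimal Python sketch to check this (not part of Cygwin; the variable names are purely illustrative):

```python
# Octal byte values copied from the `od -c` output in this message.
od_octal = ["357", "200", "242"]
raw = bytes(int(o, 8) for o in od_octal)

# The same bytes written in hex, as in the `echo -e` command.
assert raw == b"\xEF\x80\xA2"

# Decoding as UTF-8 yields a single code point.
ch = raw.decode("utf-8")
print(f"U+{ord(ch):04X}")  # prints U+F022
```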

Re: cygpath -w 'a"b'

Warren Young-2
On Jul 14, 2016, at 9:24 AM, Warren Young <[hidden email]> wrote:
>
> If you look at such a file name in Explorer, Cygwin (?) seems to be mapping double-quotes to U+F022, which is currently not defined within Unicode:
>
>  http://www.fileformat.info/info/unicode/char/f022/

I think this may be a typo in whatever code is doing the translation, because U+FF02 is a typographically distinct variation of the double-quote character:

  http://www.fileformat.info/info/unicode/char/ff02/index.htm

That would give the visual appearance of double-quotes in file names in Explorer without violating the restriction on 0x22 characters in NTFS.

In fact, it might be a thinko rather than a typo: 0x22 -> 0xF022.  It looks like someone thought they could just add 0xF000 to the character value, when the correct value is 0xFF02.
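
The two code points can be compared with Python's standard unicodedata module. This is only a quick check, not anything Cygwin itself does. It shows that U+F022 falls in the Private Use Area (general category Co), so it has no assigned character by design, while U+FF02 is an assigned character:

```python
import unicodedata

ascii_quote = '"'                          # U+0022
shifted = chr(ord(ascii_quote) + 0xF000)   # 0x22 + 0xF000
fullwidth = "\uFF02"

# General category "Co" means "Other, private use".
print(f"U+{ord(shifted):04X}", unicodedata.category(shifted))

# U+FF02 has an assigned name in the Unicode character database.
print(f"U+{ord(fullwidth):04X}", unicodedata.name(fullwidth))
```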

…And this is why we need man7/{ascii,latin-1,unicode}.7 files in Cygwin. :)

(In the SHTDI spirit, I’m looking into some sort of automated way to generate such files from the miscfiles originals.)

Re: cygpath -w 'a"b'

Corinna Vinschen-2
On Jul 14 11:12, Warren Young wrote:

> On Jul 14, 2016, at 9:24 AM, Warren Young <[hidden email]> wrote:
> >
> > If you look at such a file name in Explorer, Cygwin (?) seems to be mapping double-quotes to U+F022, which is currently not defined within Unicode:
> >
> >  http://www.fileformat.info/info/unicode/char/f022/
>
> I think this may be a typo in whatever code is doing the translation,
> because U+FF02 is a typographically distinct variation of the
> double-quote character:
>
>   http://www.fileformat.info/info/unicode/char/ff02/index.htm
>
> That would give the visual appearance of double-quotes in file names
> in Explorer without violating the restriction on 0x22 characters in
> NTFS.
>
> In fact, it might be a thinko rather than a typo: 0x22 -> 0xF022.  It
> looks like someone thought they could just add 0xF000 to the character
> value, when the correct value os 0xFF02.
Nope.  The idea(*) was *not* to provide a typographically similar
character.  The idea was to allow characters that are disallowed in the
Windows namespace but allowed in the POSIX namespace to be expressed by
transposing them into the private use area when creating the filename,
and converting them back to the untransposed ASCII char when reading
the filename from disk.  You can't perform this action by converting
the character to another *valid* character.

Btw., there's a section in the Cygwin User's Guide on special characters:

https://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-specialchars
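
The round trip can be sketched in a few lines of Python. This is an illustration only, not Cygwin's actual implementation; the set of characters and the 0xF000 offset are assumptions based on the User's Guide section linked above, and the function names are hypothetical. It ignores the other cases that section covers, such as trailing dots and spaces.

```python
# Characters that are legal in POSIX filenames but not in Windows ones
# (assumed set; the backslash is left out because it is a path separator).
FORBIDDEN = set('"*:<>?|')
PUA_OFFSET = 0xF000  # assumed offset into the Unicode Private Use Area


def to_windows_name(posix_name: str) -> str:
    """Transpose forbidden ASCII characters into the Private Use Area."""
    return "".join(
        chr(ord(c) + PUA_OFFSET) if c in FORBIDDEN else c
        for c in posix_name
    )


def _untranspose(c: str) -> str:
    """Map one transposed character back to ASCII; leave others alone."""
    code = ord(c) - PUA_OFFSET
    if 0 <= code <= 0xFF and chr(code) in FORBIDDEN:
        return chr(code)
    return c


def to_posix_name(windows_name: str) -> str:
    """Reverse to_windows_name() when reading a name back from disk."""
    return "".join(_untranspose(c) for c in windows_name)


name = 'a"b'
on_disk = to_windows_name(name)
print([f"U+{ord(c):04X}" for c in on_disk])  # ['U+0061', 'U+F022', 'U+0062']
print(to_posix_name(on_disk) == name)        # True
```

If the target were an assigned character such as U+FF02 instead, a file whose name genuinely contained U+FF02 could not be told apart from one that was transposed, so the reverse mapping would be ambiguous.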


Corinna


(*) Original idea by the Interix folks, picked up by Cygwin for
    compatibility.


RE: cygpath -w 'a"b'

Nellis, Kenneth-3
It seems that the length of the string passed to cygpath -w
affects whether double quotes are translated or not:

$ cygpath -w '"12345"' | od -cAn
   "   1   2   3   4   5   "  \n
$ cygpath -w '"123456"' | od -cAn
   "   1   2   3   4   5   6 357 200 242  \n
$ cygpath --version
cygpath (cygwin) 2.5.2
Path Conversion Utility
Copyright (C) 1998 - 2016 Cygwin Authors
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ uname -srv
CYGWIN_NT-6.1 2.5.2(0.297/5/3) 2016-06-23 14:29
$

--Ken Nellis

Re: cygpath -w 'a"b'

Corinna Vinschen-2
On Jul 14 18:22, Nellis, Kenneth wrote:
> It seems that the length of the string passed to cygpath -w
> affects whether double quotes are translated or not:
>
> $ cygpath -w '"12345"' | od -cAn
>    "   1   2   3   4   5   "  \n
> $ cygpath -w '"123456"' | od -cAn
>    "   1   2   3   4   5   6 357 200 242  \n

Thanks a lot for this additional info!  This hint gave me an idea of
what goes wrong.  It only affects *relative* paths.  I applied a patch
and uploaded new developer snapshots to https://cygwin.com/snapshots/


Thanks again,
Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
