How to execute spawns with UNICODE arguments?

16 Replies - 595 Views - Last Post: 19 December 2013 - 06:06 AM

#1 Sciuriware  Icon User is offline

  • New D.I.C Head

Reputation: 1
  • View blog
  • Posts: 21
  • Joined: 17-August 13

How to execute spawns with UNICODE arguments?

Posted 15 December 2013 - 12:44 AM

Hi there,
I've got some 500000 files on my computer running XP; I manage my files mostly with JAVA applications.
Some of those files have names outside the ASCII set, like a' or << (chevron).
Those files are saved web pages from the internet or songs from CDs.
Until now I had no problems, because copying those files or setting them READ-ONLY in JAVA
doesn't seem to be a problem, as the File class can handle strange file names.

Recently I started moving to APPLE and, as OSX is in fact UNIX++, many features are not
implemented in JAVA, like owner, group or full file mode (rwxrwxrwx).
So I let JAVA spawn the standard commands chown, chgrp and chmod.

But .... these fail on UNICODE file names; obviously 'exec()' can only use ASCII arguments.
The same problem occurs when you want to record file names in a text file:
the String class is UNICODE capable, but storing text in a file leads to ASCII (single byte characters).

Did I overlook something or is there a work-around,
or should I write a program to rename all my files to plain ASCII?
;JOOP!
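
On the text-file point in the post above: a Java String only turns into bytes when it is written, and the charset used at that moment decides what survives. The following is a minimal sketch, assuming Java 7 or later; the class name NameListDemo and the sample names are made up for illustration. It writes a list of names with an explicit UTF-8 charset instead of the platform default and reads them back.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class NameListDemo {
    /** Writes one name per line, always encoded as UTF-8. */
    static void save(Path listFile, List<String> names) throws IOException {
        Files.write(listFile, names, StandardCharsets.UTF_8);
    }

    /** Reads the names back, decoding the bytes as UTF-8. */
    static List<String> load(Path listFile) throws IOException {
        return Files.readAllLines(listFile, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // The list file itself gets a plain ASCII name
        Path listFile = Files.createTempFile("names", ".txt");
        // Sample names written with Unicode escapes so the source file stays ASCII
        List<String> names = Arrays.asList("\u00ABo\u00BB.txt", "\u1EB7\u1EAC.txt");
        save(listFile, names);
        // Report whether the round trip preserved every name
        System.out.println(load(listFile).equals(names));
        Files.delete(listFile);
    }
}
```

A FileWriter created without a charset falls back to the platform default encoding, which is one way names can end up mangled in a text file.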

Replies To: How to execute spawns with UNICODE arguments?

#2 g00se  Icon User is online

  • D.I.C Lover
  • member icon

Reputation: 2675
  • View blog
  • Posts: 11,298
  • Joined: 20-September 08

Re: How to execute spawns with UNICODE arguments?

Posted 15 December 2013 - 06:03 AM

Quote

and, as OSX is in fact UNIX++
Hmm, that's a bit debatable ;) Actually I shouldn't really say much, as my experience with OSX is limited. When I have tried, at the command line, to use the system as some variant of a BSD box, I've been disappointed. On more than one occasion, what was specified in the man page was simply untrue, as it wasn't implemented.

Anyway, after that digression, let's try to get to the actual question ;) Your best scenario would be to be able to see Unicode at the command line as well. What, for starters, is the value of the system property file.encoding?

System.out.println(System.getProperty("file.encoding"));
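
A slightly fuller diagnostic may be useful here, since file.encoding is only one of several settings involved. This is a sketch, not an official checklist: the class name EncodingReport is made up, sun.jnu.encoding is an undocumented, implementation-specific property (commonly reported to be the one the JVM uses for file names and process arguments, which is an assumption here), and LANG and LC_ALL are the usual locale environment variables on Unix-like systems.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingReport {
    /** Collects settings that may influence how the JVM encodes text and file names. */
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<String, String>();
        report.put("file.encoding", System.getProperty("file.encoding"));
        // Undocumented property; it may be null on some JVMs
        report.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        report.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        // Locale variables as seen by this JVM; null when they are not set
        report.put("LANG", System.getenv("LANG"));
        report.put("LC_ALL", System.getenv("LC_ALL"));
        return report;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : collect().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Running it both from a terminal and from the IDE shows whether the two environments hand the JVM different locale settings.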

#3 Sciuriware  Icon User is offline

Re: How to execute spawns with UNICODE arguments?

Posted 15 December 2013 - 08:07 AM

Hi, CEHJ,
it says: Cp1252

In the meantime I have searched the internet again
and I get the impression that this is a common(-ly accepted) problem.
Maybe I should just remap all my file names with characters >127.
I found that (e.g.) even zip 'destroys' such names by applying its own 'remapping'.
Funny: of course you'll store your remapping scheme in a file; how? (LOL)

Before giving in, I'm curious if there is a simple solution.
Thanks anyway for replying.
;JOOP!

#4 g00se  Icon User is online

Re: How to execute spawns with UNICODE arguments?

Posted 15 December 2013 - 03:11 PM

Quote

Hi, CEHJ,
it says: Cp1252

Well, that's odd. That encoding is a Windows one.

#5 cfoley  Icon User is offline

  • Cabbage
  • member icon

Reputation: 1948
  • View blog
  • Posts: 4,048
  • Joined: 11-December 07

Re: How to execute spawns with UNICODE arguments?

Posted 15 December 2013 - 03:48 PM

What happens when you type ls in the terminal? Does it preserve the Unicode characters, or does it give you a clue on how to change the encoding?

#6 g00se  Icon User is online

Re: How to execute spawns with UNICODE arguments?

Posted 15 December 2013 - 04:58 PM

Quote

But .... these fail on UNICODE file names; obviously 'exec()' can only use ASCII arguments.


You needn't worry. See the following (on Linux rather than OSX):
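import net.proteanit.io.IOUtils; // not part of the JDK: a third-party helper class

import java.io.*;

public class WeirdFilenames {
    public static void main(String[] args) throws Exception {
        // Use the first argument as the file name, or a default with non-ASCII chevrons
        String s = (args.length < 1) ? "«o».txt" : args[0];
        File f = new File(s);
        f.createNewFile();

        // Each argument is its own array element, so no shell parsing is involved
        Process p = Runtime.getRuntime().exec(new String[] { "chmod", "600", s });
        // Relay the child's output streams (presumably what this helper does)
        IOUtils.outputProcessStreams(p);
    }
}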



...

goose@vaio:/tmp$ strace -f java WeirdFilenames «X» 2>&1 | tee w.log


...
[pid  4705] close(10)                   = 0
[pid  4705] close(11)                   = 0
[pid  4705] getdents64(4, /* 0 entries */, 32768) = 0
[pid  4705] close(4)                    = 0
[pid  4705] fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
[pid  4705] execve("/usr/local/bin/chmod", ["chmod", "600", "\302\253X\302\273"], [/* 54 vars */]) = -1 ENOENT (No such file or directory)
[pid  4705] execve("/usr/bin/chmod", ["chmod", "600", "\302\253X\302\273"], [/* 54 vars */]) = -1 ENOENT (No such file or directory)
[pid  4705] execve("/bin/chmod", ["chmod", "600", "\302\253X\302\273"], [/* 54 vars */]) = 0
[pid  4697] <... read resumed> "", 4)   = 0
[pid  4697] close(12 <unfinished ...>
...

goose@vaio:/tmp$ ls -l *.txt
-rw------- 1 goose goose 0 Dec 15 23:38 «o».txt
goose@vaio:/tmp$ 
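
The octal escapes in the execve lines above are the UTF-8 bytes of the chevrons, which is what shows that the argument reached chmod intact on that Linux box. The sketch below reproduces the notation; the class name StraceStyle is made up, and the formatting is simplified compared with what strace really does (control characters, quotes and backslashes are not handled the way strace handles them).

```java
import java.nio.charset.StandardCharsets;

public class StraceStyle {
    /** Renders the UTF-8 bytes of s, showing non-printable-ASCII bytes as octal escapes. */
    static String straceStyle(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v >= 0x20 && v < 0x7F) {
                sb.append((char) v); // printable ASCII is shown as-is
            } else {
                sb.append('\\').append(Integer.toOctalString(v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00AB and U+00BB are the two chevrons used in the post
        System.out.println(straceStyle("\u00ABX\u00BB")); // prints \302\253X\302\273
    }
}
```

Each chevron takes two bytes in UTF-8 (0xC2 0xAB and 0xC2 0xBB), hence the two escapes per character.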



This post has been edited by g00se: 15 December 2013 - 05:10 PM
Reason for edit:: Extra observation

#7 Sciuriware  Icon User is offline

Re: How to execute spawns with UNICODE arguments?

Posted 16 December 2013 - 12:12 AM

g00se, on 15 December 2013 - 11:11 PM, said:

Quote

Hi, CEHJ,
it says: Cp1252

Well, that's odd. That encoding is a Windows one

Indeed, mea culpa, that was on MSWindows XP.
On Mavericks (run from Eclipse) it says: US-ASCII
;JOOP!

#8 Sciuriware  Icon User is offline

Re: How to execute spawns with UNICODE arguments?

Posted 16 December 2013 - 12:23 AM

cfoley, on 15 December 2013 - 11:48 PM, said:

What happens when you type ls in the terminal? Does it preserve the Unicode characters, or does it give you a clue on how to change the encoding?

Give me some time to experiment with this.
Obviously some characters I tested with (e.g. <<) are still ASCII.
;JOOP!

#9 cfoley  Icon User is offline

Re: How to execute spawns with UNICODE arguments?

Posted 16 December 2013 - 12:47 AM

I did a bit of a search. Apparently there are settings you can change to use a different encoding for the terminal. But there are mixed reports of success and I would worry about portability to other OSX computers.

#10 g00se  Icon User is online

Re: How to execute spawns with UNICODE arguments?

Posted 16 December 2013 - 06:57 AM

Quote

On Mavericks (run from Eclipse) it says: US-ASCII

OK, so in the terminal, you're (normally) restricted to that character set, which unfortunately is even more restricted than Cp1252. Your 'chevrons', for instance, don't even appear in US-ASCII. This does not mean, of course, that Unicode file names cannot be used from Java if you can get them into your code. Unicode escape sequences are your friend.
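
To illustrate the escape-sequence point: the Java compiler translates a backslash-u escape before it does anything else, so the source file can stay pure ASCII while the String still holds the real characters. A small sketch follows; the class name EscapeDemo and the helper describe are made up for illustration.

```java
public class EscapeDemo {
    // The chevron file name from earlier in the thread, written with escapes only
    static final String NAME = "\u00ABo\u00BB.txt";

    /** Lists the code points of s in U+XXXX notation, separated by spaces. */
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(NAME));
        // prints U+00AB U+006F U+00BB U+002E U+0074 U+0078 U+0074
    }
}
```

This only guarantees that the String is right inside the JVM; how it is encoded when handed to the operating system is a separate matter.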

This post has been edited by g00se: 16 December 2013 - 09:56 AM
Reason for edit:: Clarification

#11 Sciuriware  Icon User is offline

Re: How to execute spawns with UNICODE arguments?

Posted 16 December 2013 - 10:36 AM

If I summarise all the pros and cons so far, I fear that the convenience of using UNICODE
is outweighed by the problems already mentioned.
I believe that you can get several utilities and programs to display UNICODE
and handle it well, but at what price and with what restrictions?
I asked myself: "what was the reason NOT to rename those few files to ASCII?".
And to be honest, the reasons (2) are:
1) my wife is Vietnamese and some of her files had Vietnamese names,
2) whenever I find interesting info on the net, I (blindly) download the web page for later.
Well, Vietnamese is still readable (OK, not by you) when all the accents are removed.
And those web pages with weird names don't lose their value with a normalised file name.

When I started my career, it was all CAPITALS (6-bit); with UNIX came the lower-case letters,
and now you can type Hindi and Arabic. Nice, but I can live with those 8-bit characters.
Thanks to both of you so far, but I'll come back to you later.
;JOOP!
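
For the accent-stripping idea in the post above, one common approach (not necessarily what the poster ended up writing) is to decompose the name with java.text.Normalizer, drop the combining marks, and replace whatever is still non-ASCII. A minimal sketch, where the class name AsciiNames and the underscore replacement are arbitrary choices:

```java
import java.text.Normalizer;

public class AsciiNames {
    /** Returns an ASCII-only approximation of the given file name. */
    static String toAscii(String name) {
        // Split accented letters into a base letter plus combining marks
        String decomposed = Normalizer.normalize(name, Normalizer.Form.NFD);
        // Remove the combining marks, leaving the base letters
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // The Vietnamese d-with-stroke has no decomposition, so map it by hand
        stripped = stripped.replace('\u0111', 'd').replace('\u0110', 'D');
        // Anything still outside ASCII (chevrons, for example) becomes an underscore
        return stripped.replaceAll("[^\\x00-\\x7F]", "_");
    }

    public static void main(String[] args) {
        System.out.println(toAscii("Ti\u1EBFng Vi\u1EC7t.txt")); // prints Tieng Viet.txt
        System.out.println(toAscii("\u00ABo\u00BB.txt"));         // prints _o_.txt
    }
}
```

Two different original names can fold to the same ASCII name, so a real renaming tool would need to check for collisions before moving anything.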

#12 g00se  Icon User is online

Re: How to execute spawns with UNICODE arguments?

Posted 16 December 2013 - 11:37 AM

Vietnamese is really only a problem if you don't have good Unicode locale support.

String s = (args.length < 1) ? "\u1eb7\u1eac\u1ec2\u1ef5\u0303.txt" : args[0];

can be used in the Java app I posted earlier. What I see in my terminal is shown below.

Posted Image

#13 Sciuriware  Icon User is offline

Re: How to execute spawns with UNICODE arguments?

Posted 17 December 2013 - 09:38 AM

g00se, on 16 December 2013 - 12:58 AM, said:

Quote

Process p = Runtime.getRuntime().exec(new String[] { "chmod", "600", s });

Hello again,
I tried several program snippets like above, but on OSX/JAVA 7u45 it just does not work.
I have to write some smart code that will rename UNICODE names in several places to ASCII.
Pity, because it displays well in the GUI.
Thanks for the help.
;JOOP!
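
One alternative that avoids exec() altogether: since Java 7, the java.nio.file package can set permissions, owner and group directly. The sketch below assumes a POSIX file system (elsewhere these calls throw UnsupportedOperationException or the attribute view is null); the class name PosixModeDemo is made up. Whether this helps with the failing names on OS X is not established in this thread, because the path still passes through the JVM's own file-name encoding.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFileAttributeView;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.nio.file.attribute.UserPrincipalLookupService;
import java.util.Set;

public class PosixModeDemo {
    /** Rough equivalent of chmod: mode is a nine-character string such as "rw-------". */
    static Set<PosixFilePermission> setMode(Path file, String mode) throws IOException {
        Files.setPosixFilePermissions(file, PosixFilePermissions.fromString(mode));
        return Files.getPosixFilePermissions(file);
    }

    /** Rough equivalent of chown plus chgrp; usually needs sufficient privileges. */
    static void setOwnerAndGroup(Path file, String user, String group) throws IOException {
        UserPrincipalLookupService lookup = file.getFileSystem().getUserPrincipalLookupService();
        Files.setOwner(file, lookup.lookupPrincipalByName(user));
        // The view is null when the file system does not support POSIX attributes
        PosixFileAttributeView view = Files.getFileAttributeView(file, PosixFileAttributeView.class);
        view.setGroup(lookup.lookupPrincipalByGroupName(group));
    }

    public static void main(String[] args) throws IOException {
        // Use the path given on the command line, or a scratch file if none is given
        boolean scratch = args.length == 0;
        Path file = scratch ? Files.createTempFile("mode", ".tmp") : Paths.get(args[0]);
        System.out.println(PosixFilePermissions.toString(setMode(file, "rw-------")));
        if (scratch) {
            Files.delete(file);
        }
    }
}
```

The mode string "rw-------" corresponds to the 600 used with chmod earlier in the thread.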

#14 g00se  Icon User is online

Re: How to execute spawns with UNICODE arguments?

Posted 17 December 2013 - 10:32 AM

Quote

I tried several program snippets like above, but on OSX/JAVA 7u45 it just does not work.
I have to write some smart code that will rename UNICODE names in several places to ASCII.

Please post it and I'll have a look.

#15 Sciuriware  Icon User is offline

Re: How to execute spawns with UNICODE arguments?

Posted 17 December 2013 - 12:43 PM

g00se, on 17 December 2013 - 06:32 PM, said:

Please post it and I'll have a look.

I guess you mean the 'smart' code? No, thanks.
And the sample you gave did not work.
No worries.
I'll be on this forum with more important problems soon.
;JOOP!

P.S.: EE now sends (or seems to send) me a daily "Neglected Question Alert".
Is that real? Can I safely try to stop it?