UTF-8 Support
Does anyone know how to enable IntelliJ to support UTF-8 files? I have a
.properties file that is UTF-8 but it can't handle the file properly. Same
goes for a .java file.
--Grant Gochnauer
http://www.gochnauer.org
Also, I have UTF-8 selected as the file encoding in the General Preferences
in Settings but it doesn't make any difference.
"Grant Gochnauer" <ggochnauer@braunconsult.com> wrote in message
news:cbs8fn$bh2$1@is.intellij.net...
.properties files are always ISO 8859-1. You need to use \u notation to put characters from other charsets in them. Use native2ascii to convert to \u notation.
From the javadocs for java.util.Properties:
The load and store methods load and store properties in a simple line-oriented format specified below. This format uses the ISO 8859-1 character encoding. Characters that cannot be directly represented in this encoding can be written using Unicode escapes; only a single 'u' character is allowed in an escape sequence. The native2ascii tool can be used to convert property files to and from other character encodings.
That said, it would be nice to be able to edit .properties files and put any character in them, and have IDEA automatically run native2ascii on them when they are saved.
I've been thinking about writing a plugin to do this, but it would be nice to have it as a standard feature.
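For anyone who wants to see what the \u conversion amounts to, here is a minimal sketch in Java. It is not the native2ascii source and not IDEA code; the class and method names are made up for illustration, and it only covers the forward direction (non-ASCII characters to escapes).

```java
// Hypothetical helper, not part of the JDK or IDEA.
final class UnicodeEscaper {

    /** Replaces every character outside 7-bit ASCII with a backslash-u escape. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);                                   // plain ASCII stays as-is
            } else {
                out.append(String.format("\\u%04x", (int) c));   // four hex digits per UTF-16 unit
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below contains u-umlaut and sharp s.
        System.out.println(escapeNonAscii("greeting=Gr\u00fc\u00dfe"));
    }
}
```

The escaped text is pure ASCII, so java.util.Properties can load it regardless of how the editor saved the original file.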
Can you get a little more specific? What exactly is the problem?
I am doing the very same thing and for the most part it works pretty well. That said I do have some minor problems: If I do not use a BOM (byte order marker at the start of the file) Idea sometimes fails to recognize the file as Unicode and displays only garbage. If I include the BOM, Idea sometimes displays a square symbol at the start of the file. Sometimes it displays the file just fine, but if I then insert some text at the very start of the file, Idea inserts the text before the BOM which in effect breaks the file.
And
Not really true. In fact the java.util.Properties class can read *.properties files only when they have ISO8859-1 encoding.
But it is perfectly valid to store your *.properties files in UTF-8 and either transform them to ASCII as needed by escaping all non-ASCII characters, or use a custom replacement class rather than java.util.Properties to handle the files (which is what I did).
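As a rough illustration of the second option: later Java versions (6 and up) added Properties.load(Reader), so a full replacement class may not be needed there. The sketch below assumes Java 8 or newer; the class name Utf8PropertiesLoader is invented for this example and is not the custom class mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper: reads a .properties stream as UTF-8 instead of ISO 8859-1.
final class Utf8PropertiesLoader {

    static Properties load(InputStream in) {
        Properties props = new Properties();
        // The Reader does the charset decoding, so Properties never sees raw bytes.
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        byte[] utf8 = "greeting=Gr\u00fc\u00dfe\n".getBytes(StandardCharsets.UTF_8);
        Properties props = load(new ByteArrayInputStream(utf8));
        System.out.println(props.getProperty("greeting"));
    }
}
```

Files loaded this way are no longer readable by code that uses the plain load(InputStream) method, so this only helps when you control every place the file is read.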
I'm using UTF-8 files within IDEA without any problem. I have the "File Encoding" option set to UTF-8, and this is all I have done.
The only caveat is that now I can't edit my property files within IDEA (not without changing the encoding back to ISO 8859-1 first, anyway), as these files MUST be encoded using ISO 8859-1. I've been using Attesoro for my property file editing needs.
This is the same problem I am having with that little square showing up at
the beginning of the UTF-8 files.
--Grant
"Stephen Kelvin" <mail@gremlin.info> wrote in message
news:26958672.1088602940727.JavaMail.itn@is.intellij.net...
>
That said I do have some minor problems: If I do not use a BOM (byte order
marker at the start of the file) Idea sometimes fails to recognize the file
as Unicode and displays only garbage. If I include the BOM, Idea sometimes
displays a square symbol at the start of the file. Sometimes it displays the
file just fine, but if I then insert some text at the very start of the
file, Idea inserts the text before the BOM which in effect breaks the
file.
>
I filed a tracker entry for that. Feel free to vote for it ;)
http://intellij.net/tracker/idea/viewSCR?publicId=35543
Hello Grant,
GG> This is the same problem I am having with that little square showing
GG> up at the beginning of the UTF-8 files.
That's because Java unfortunately doesn't handle the BOM: its UTF-8 decoder passes the marker through as an ordinary character instead of dropping it.
That's why you see this little square.
Ideally, when opening a file with a BOM, IDEA should ignore this marker.
--
Guillaume Laforge
http://glaforge.free.fr/weblog
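To illustrate what ignoring the marker could look like on the Java side, here is a minimal sketch. It assumes the file content is already in a byte array and that a BOM, if present, decodes to U+FEFF as the first character; the class name BomSkipper is hypothetical and this is not how IDEA itself does it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes UTF-8 bytes and drops a leading byte order mark.
final class BomSkipper {

    private static final char BOM = '\uFEFF';

    static String decodeUtf8(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_8);
        // The decoder keeps the BOM as the first character, so strip it by hand.
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a', '=', '1'};
        System.out.println(decodeUtf8(withBom));
    }
}
```

An editor that strips the marker on load would also have to remember to write it back on save, otherwise the file silently changes.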
Ant has a task to do native2ascii, so if you have an Ant build file for your distribution, you don't have to worry about this problem.
Keep your source properties in UTF-8, or whatever, and when you go to deploy, have Ant do its thing.
<native2ascii encoding="UTF-8" src="${src.dir}" dest="${build.classes.dir}"
includes="**/*.properties"/>
Now if your Ant build has properties itself (in UTF-8, for instance), and you use those to write out properties files to change your runtime deployment around, then watch out. :)