UTF-8 encoding problems on OS X
Hi,
I'm trying to work in a fully UTF-8 environment.
But I'm still facing a problem that I really don't understand.
Here's an example test method, which fails because of encoding:
@Test(groups = {"jpa", "common"})
public void testGetInstanceByProperty() {
    ExpenseType et = commonDaoJpa.getInstanceByProperty(ExpenseType.class, "expenseTypeCode", 1);
    assertEquals("Pédagogie Obligatoire", et.getExpenseTypeName());
}
The problem comes from the é character.
The result of the test gives:
java.lang.AssertionError:
Expected :Pédagogie Obligatoire
Actual   :Pédagogie Obligatoire
The file - the whole project, actually - is UTF-8 encoded, and the test runner has the -Dfile.encoding=UTF-8 option.
It seems to me that everything is UTF-8. But the hardcoded "Pédagogie Obligatoire" is somehow still in MacRoman.
If I replace the expected value during debug and just input it, it works.
What is actually reading the UTF-8 encoded class with MacRoman encoding?
Thanks for helping!
Hi.
You can try the following. Go to Settings / File encodings. There set Project encoding to UTF-8 (if your files are actually UTF-8). By default that encoding is "System default" which is MacRoman on Mac. Then rebuild (important) and re-run. Some similar problems can be avoided this way.
Alexander.
Alexander, the project encoding is of course UTF-8.
Files are interpreted as UTF-8 by IDEA; there doesn't seem to be any problem on that side.
But still, the problem is there.
I found out what the problem was:
it's at compilation time: javac needs to be told to use UTF-8 with the -encoding option.
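For anyone hitting the same thing, here is a rough sketch of what seems to happen when javac reads a UTF-8 source file with the platform default charset instead. It is only an illustration, not what javac does internally: the class and method names (SourceEncodingDemo, misread, wrongCharset) are made up for the example, and because the x-MacRoman charset may be missing from some Java runtimes, the sketch falls back to ISO-8859-1 as a stand-in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SourceEncodingDemo {

    // Charset standing in for the "wrong" platform default.
    // x-MacRoman is an extended charset and may be absent from some runtimes,
    // so fall back to ISO-8859-1 (an assumption made for this sketch only).
    static Charset wrongCharset() {
        return Charset.isSupported("x-MacRoman")
                ? Charset.forName("x-MacRoman")
                : StandardCharsets.ISO_8859_1;
    }

    // Encode the text as UTF-8 (what the editor saved to disk), then decode
    // those bytes with another charset (what a compiler run without
    // -encoding UTF-8 would effectively do with the string literal).
    static String misread(String text, Charset assumed) {
        byte[] onDisk = text.getBytes(StandardCharsets.UTF_8);
        return new String(onDisk, assumed);
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute, written as an escape so that this file itself
        // compiles identically whatever -encoding is in effect.
        String literal = "P\u00e9dagogie Obligatoire";
        String seenByCompiler = misread(literal, wrongCharset());

        System.out.println("chars as written: " + literal.length());
        System.out.println("chars as misread: " + seenByCompiler.length());
        System.out.println("equal: " + literal.equals(seenByCompiler));
    }
}
```

The é takes two bytes in UTF-8, and a single-byte charset turns each byte into its own character, so the literal that ends up in the .class file is one character longer than the value coming from the database and assertEquals fails. Writing the literal with \u escapes, as in the sketch, is another way to make the source independent of the -encoding setting.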
I originally had some problems producing an application that would be fully UTF-8.
The main mistake I made was that I didn't tell the compiler - javac - to use UTF-8 to read the .java source files. If you don't, any hardcoded string in your classes might not be rendered or used correctly.
But even now that I run javac with UTF-8 encoding, set the JVM option file.encoding=UTF-8 for my Tomcat application, and have IDEA open the project in UTF-8 only, no exceptions, I still get messages like these with some test code:
log.info(System.getProperty("file.encoding") + ": é à ê");
produces: UTF-8: é à ê
which indicates that somewhere another encoding has been used.
I'd be really happy if someone had a clue about where this can possibly happen!
.nodje
That may just be related to the process output, which is an OS-level thing rather than your program itself: I'm not sure at all that the standard console is able to display UTF-8 ...
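One way to check this from the Java side is to look at the bytes that actually leave the program. The sketch below is only an illustration under that assumption: the class and method names (ConsoleEncodingDemo, printStream, bytesWritten) are made up, and it says nothing about how a particular logging appender or terminal is configured.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class ConsoleEncodingDemo {

    // Wrap a stream in a PrintStream that encodes characters with the named
    // charset instead of the platform default.
    static PrintStream printStream(OutputStream target, String charsetName) {
        try {
            return new PrintStream(target, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Return the raw bytes a console would receive for the given text.
    static byte[] bytesWritten(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = printStream(sink, charsetName);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // e-acute, a-grave, e-circumflex separated by spaces, as in the log line
        String text = "\u00e9 \u00e0 \u00ea";

        System.out.println("UTF-8 bytes:      " + bytesWritten(text, "UTF-8").length);
        System.out.println("ISO-8859-1 bytes: " + bytesWritten(text, "ISO-8859-1").length);

        // Force UTF-8 on the way out, whatever the default charset is.
        PrintStream utf8Out = printStream(System.out, "UTF-8");
        utf8Out.println(text);
    }
}
```

If the byte counts are what you expect for UTF-8 and the last line still shows up garbled, the program is probably fine and it is the terminal (or whatever displays the log) that decodes the bytes with a different charset.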