IntelliJ editors silently precompose diacritics

Answered

When I paste text containing decomposed diacritics into PyCharm, each base letter plus combining mark is silently converted to a single precomposed character. This also seems to be the case in PhpStorm.

As a test case, take the string hebräisch. The middle 'a' is made up of two code points - latin small letter a (U+0061) and combining diaeresis (U+0308).

If I copy this into PyCharm, then back out, it is converted to hebräisch, where the middle 'a' is latin small letter a with diaeresis (U+00E4).
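For anyone who wants to check which form a string is in, here is a minimal Python sketch using the standard `unicodedata` module. The variable names and the `code_points` helper are mine, purely for illustration, and I write the strings with escapes so that nothing can renormalise them on the way. That the editor's conversion matches NFC normalisation is my assumption based on the observed output, not something I have confirmed.

```python
import unicodedata

# Written with escapes so the two forms cannot be confused or altered on paste.
decomposed = "hebra\u0308isch"   # 'a' followed by U+0308 COMBINING DIAERESIS
precomposed = "hebr\u00e4isch"   # single code point U+00E4

def code_points(s):
    """Return a readable 'U+XXXX NAME' entry for every code point in s."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in s]

# The middle of the word: two code points in one string, one in the other.
print(code_points(decomposed)[4:6])
print(code_points(precomposed)[4:5])

# NFC composes the pair into U+00E4; NFD splits it back apart.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == decomposed)
```

Running `code_points` on text copied back out of the editor shows whether the combining mark survived the round trip.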

This is unambiguously wrong behaviour - the original characters should be preserved, and this introduces the worrying possibility that merely copying strings into the editor can break or change the behaviour of code.
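To illustrate how this can change the behaviour of code, here is a small Python sketch. The `lookup` dictionary is a hypothetical example, not taken from a real project; the point is only that the two forms look identical on screen but are different strings to the interpreter.

```python
# Same visible text, two different code point sequences.
decomposed = "hebra\u0308isch"   # a + combining diaeresis
precomposed = "hebr\u00e4isch"   # precomposed a with diaeresis

# Plain string comparison works on code points, so these are not equal.
same_text = decomposed == precomposed

# Lengths and encoded bytes differ as well.
lengths = (len(decomposed), len(precomposed))
utf8_lengths = (len(decomposed.encode("utf-8")), len(precomposed.encode("utf-8")))

# A key stored in one form is not found when looked up in the other form.
lookup = {decomposed: "found"}
result = lookup.get(precomposed, "missing")

print(same_text, lengths, utf8_lengths, result)
```

So a string literal or test fixture that was deliberately written in decomposed form stops matching its data once the editor has rewritten it.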

Is there an option to disable this behaviour? In either case, this should not be the default.

1 comment

Thank you for the report! This is reproducible on my side. Ticket: https://youtrack.jetbrains.com/issue/PY-24374
