Hi,
I have been trying to solve a problem with UTF-8 encoded SQL scripts run via the Maven plugin. Data is being inserted into the database as if the scripts were encoded in "windows-1252", even though the file.encoding system property is set to "UTF-8".
I tried various ways to work around this (Maven settings, plugins, etc.), but to no avail. Stepping through the code, I see that even though the file.encoding system property is UTF-8, Charset.defaultCharset() still returns "windows-1252".
Further testing (via Groovy scripts) seems to confirm that file.encoding has no effect on the default charset, contrary to the information I found online:
    $ groovy -Dfile.encoding=test -e "println System.getProperty(\"file.encoding\")"
    test
    $ groovy -Dfile.encoding=UTF-8 -e "println java.nio.charset.Charset.defaultCharset().name()"
    windows-1252
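
The same comparison can be sketched in plain Java. This is only an illustration: the class name CharsetProbe and its method are made up for this example, and it relies on the assumption that the JVM fixes the default charset once and does not re-read file.encoding when the property is changed later.

```java
import java.nio.charset.Charset;

// Hypothetical probe class, not part of Liquibase.
class CharsetProbe {

    /**
     * Sets file.encoding at runtime and reports whether
     * Charset.defaultCharset() stayed the same afterwards.
     */
    static boolean defaultCharsetIgnoresRuntimeChange(String newEncoding) {
        Charset before = Charset.defaultCharset();
        String saved = System.getProperty("file.encoding");
        System.setProperty("file.encoding", newEncoding);
        try {
            return before.equals(Charset.defaultCharset());
        } finally {
            // Restore the original property value so the probe has no side effects.
            if (saved != null) {
                System.setProperty("file.encoding", saved);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset().name());
        System.out.println("unchanged after runtime set: "
                + defaultCharsetIgnoresRuntimeChange("UTF-8"));
    }
}
```

Running it with and without -Dfile.encoding=UTF-8 on the java command line shows whether the two values agree in a given environment.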
My "solution" at this point appears to be to fork the code and change the default charset in UtfBomAwareReader to use the file.encoding property (diff is below), but I also wanted to ask whether this is a problem for others, and whether this change would make sense going into the main codebase.
Best regards,
Gary
diff --git a/liquibase-core/src/main/java/liquibase/resource/UtfBomAwareReader.java b/liquibase-core/src/main/java/liquibase/resource/UtfBomAwareReader.java
index af65682..7297f71 100644
--- a/liquibase-core/src/main/java/liquibase/resource/UtfBomAwareReader.java
+++ b/liquibase-core/src/main/java/liquibase/resource/UtfBomAwareReader.java
@@ -28,7 +28,7 @@ public class UtfBomAwareReader extends Reader {
     public UtfBomAwareReader(InputStream in) {
         pis = new PushbackInputStream(in, 4);
-        this.defaultCharsetName = Charset.defaultCharset().name();
+        this.defaultCharsetName = System.getProperty("file.encoding");
     }
     public UtfBomAwareReader(InputStream in, String defaultCharsetName) {
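
One caveat with reading the property directly: System.getProperty("file.encoding") can hold a value that is not a usable charset (the "test" value from the Groovy experiment is one example). A more defensive variant might look roughly like the sketch below. The class EncodingFallback and the method resolveCharsetName are hypothetical names for illustration and are not part of Liquibase.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of Liquibase.
class EncodingFallback {

    /**
     * Returns the canonical name of the requested charset if the JVM
     * supports it, otherwise the name of the JVM default charset.
     */
    static String resolveCharsetName(String requested) {
        if (requested != null && !requested.trim().isEmpty()) {
            String name = requested.trim();
            try {
                if (Charset.isSupported(name)) {
                    return Charset.forName(name).name();
                }
            } catch (IllegalArgumentException e) {
                // Syntactically illegal charset name: fall through to the default.
            }
        }
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(resolveCharsetName(System.getProperty("file.encoding")));
    }
}
```

In the constructor from the diff, the assignment could then pass System.getProperty("file.encoding") through such a helper, so an unset or unsupported value falls back to the previous behaviour.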