I have an XML file that is (supposedly) encoded in Latin-1 (ISO-8859-1). I'm trying to read it, manipulate it, and then write it back out in the same encoding. My problem is that the data I read from the file can't be manipulated as expected (string checks such as contains() never match), and when I write the output to the new file, I get null characters where all the line breaks should be. If anyone has any ideas, they would be much appreciated!
Example:
File file = new File(filepath);
BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(file), "iso-8859-1"));
ArrayList output = new ArrayList();
String line = null;
while ((line = in.readLine()) != null) {
    // here's where I get the first problem
    if (line.contains("<?xml")) {
        System.out.println("XML Header, not copying"); // this never gets printed, was just a debug test
    }
    else {
        output.add(line);
        System.out.println(line);
    }
}
DoStuff(output); // NYI because each ArrayList node has unexpected values because the read isn't working properly
OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream(new File(filepath)), "iso-8859-1");
for (int i = 0; i < output.size(); i++) {
    out.write((String) output.get(i));
}
out.close(); // flush and close so the written data actually reaches the file
Here are the first two lines of output I get in the console from line 14 (the System.out.println(line) call):
ÿþ< ? x m l v e r s i o n = " 1 . 0 " e n c o d i n g = " i s o - 8 8 5 9 - 1 " ? > < ? x m l - s t y l e s h e e t h r e f = " R e n d e r i n g / l o g . x s l " t y p e = " t e x t / x s l " ? >
It doesn't *really* matter what charset I output the file in. The XML renderer that runs the file seems to be able to handle any charset (I made a backup of the input file and manually changed its encoding via Notepad++ to test), but I just can't figure out why I'm getting bad data from the input.
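One observation on the console output above: the leading ÿþ is how the bytes 0xFF 0xFE look when decoded as Latin-1, and those two bytes are the UTF-16 little-endian byte order mark. The gap after every character would then be the zero high byte of each UTF-16 code unit. So the file is probably UTF-16 on disk, whatever its XML declaration says. Below is a minimal sketch under that assumption; the class and method names (BomAwareCopy, detectCharset, readWithoutXmlDeclaration, writeLatin1) are made up for illustration and are not from the code above.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, only meant to illustrate the idea.
class BomAwareCopy {

    // Look at the first two bytes: FF FE or FE FF means UTF-16.
    // Anything else falls back to Latin-1 (an assumption that matches the post).
    static Charset detectCharset(Path path) throws IOException {
        int b0;
        int b1;
        try (InputStream in = Files.newInputStream(path)) {
            b0 = in.read();
            b1 = in.read();
        }
        boolean littleEndianBom = (b0 == 0xFF && b1 == 0xFE);
        boolean bigEndianBom = (b0 == 0xFE && b1 == 0xFF);
        // Java's "UTF-16" decoder reads the BOM itself to pick the byte order
        // and does not pass the BOM on as a character.
        return (littleEndianBom || bigEndianBom)
                ? StandardCharsets.UTF_16
                : StandardCharsets.ISO_8859_1;
    }

    // Read all lines, skipping any line that contains "<?xml" (same check as the post).
    static List<String> readWithoutXmlDeclaration(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(path, detectCharset(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.contains("<?xml")) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    // Write the lines as Latin-1. readLine() strips the line terminator,
    // so it has to be added back explicitly.
    static void writeLatin1(Path path, List<String> lines) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(path, StandardCharsets.ISO_8859_1)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: BomAwareCopy <input> <output>");
            return;
        }
        List<String> lines = readWithoutXmlDeclaration(Paths.get(args[0]));
        writeLatin1(Paths.get(args[1]), lines);
    }
}
```

With the right decoder, contains("<?xml") can match again, because the string no longer has a NUL between every pair of characters. Two caveats: the Latin-1 writer will throw an IOException if the text contains characters that Latin-1 cannot represent, and the contains("<?xml") check also matches <?xml-stylesheet ...?> lines, which may or may not be what is wanted.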
This post has been edited by Laggy: 10 May 2012 - 11:48 AM