From: François R on 4 Nov 2009 04:08

I redirect System.out to a JTextArea with the following class:

    private class TextAreaOutputStream extends OutputStream {
        JTextArea textArea;

        TextAreaOutputStream(JTextArea textArea) {
            this.textArea = textArea;
        }

        public void flush() {
            textArea.repaint();
        }

        public void write(int b) {
            //try {
            textArea.append(new String(new byte[] {(byte)b}));
            // } catch (UnsupportedEncodingException e){e.printStackTrace();}
        }
    }

and I use the class with

    JTextArea msg = new JTextArea();
    System.setOut(new PrintStream(new TextAreaOutputStream(msg), true));

This works well except when I have a character like Č (latin capital letter C with caron, '\u010C') in a string, which is displayed as ? in the text area, whereas msg.append(string); would be ok.

How could I correct the code above so that such a letter is displayed properly?

Thanks
François
From: Mayeul on 4 Nov 2009 07:25

François R wrote:
> This works well except when I have a character like Č (latin capital
> letter C with caron, '\u010C') in a string, which is displayed as ? in
> the text area whereas
> msg.append(string); would be ok.
> How could I correct the code above to have such a letter well
> formed ?

You have a character encoding problem.

Both the constructors PrintStream(OutputStream,boolean) and String(byte[]) assume you're using your platform's default character encoding to translate chars to bytes and vice versa.

I expect your platform's default character encoding does _not_ handle characters such as U+010C, hence their being replaced with question marks.

The fix is to specify a character encoding to use, a Unicode one, for instance utf-8.

You can do that by constructing your PrintStream this way:

    new PrintStream(new TextAreaOutputStream(msg), true, "utf-8")

and by implementing your TextAreaOutputStream differently: it should store the bytes in a buffer and wait until the OutputStream is flushed, at which point the buffer is probably aligned after a character's final byte, then transform the bytes received into a String and update the text area with it.

This could be done by writing the bytes you receive to a ByteArrayOutputStream and, whenever it is flushed, fetching the byte[] and building a String from it like this:

    new String(bytes, "utf-8")

Note: one may think that using utf-16 instead of utf-8 would guarantee that every character is 2 bytes and thus make the solution easier to implement. However, *really* special characters (those above U+FFFF) still take 4 bytes instead of 2 with utf-16. ucs-4 may work better if well supported, I'm not sure.

--
Mayeul
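To illustrate why the bytes need to be buffered before decoding, here is a minimal sketch. It assumes Java 7 or later for StandardCharsets, and the class and method names (Utf8DecodeDemo, decodeByteByByte, decodeWhole) are made up for this example. In UTF-8, U+010C is encoded as two bytes, so decoding each byte on its own, as the original write(int) does, cannot produce the character.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original posts.
class Utf8DecodeDemo {

    // Decodes every byte separately, mimicking new String(new byte[] {b}).
    static String decodeByteByByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(new String(new byte[] {b}, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Decodes the whole buffer at once, as suggested above.
    static String decodeWhole(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = "\u010C".getBytes(StandardCharsets.UTF_8);
        System.out.println(bytes.length);                               // 2
        System.out.println(decodeWhole(bytes).equals("\u010C"));        // true
        System.out.println(decodeByteByByte(bytes).equals("\u010C"));   // false
    }
}
```

The per-byte version yields replacement characters instead of Č, which is why the stream has to collect bytes and decode them together.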
From: Roedy Green on 5 Nov 2009 05:53

On Wed, 4 Nov 2009 01:08:55 -0800 (PST), François R <rappazf(a)gmail.com> wrote, quoted or indirectly quoted someone who said :
>
>This works well except when I have a character like Č (latin capital
>letter C with caron, '\u010C') in a string, which is displayed as ? in
>the text area whereas
>msg.append(string); would be ok.
>How could I

The way I would do it is to direct the output to a file using UTF-8 encoding, or at least an encoding that supports the letters you need. Then view it in some sort of viewer/editor that understands encodings.

See http://mindprod.com/applet/fileio.html for the code to set up a PrintWriter to a file.

--
Roedy Green Canadian Mind Products
http://mindprod.com
An example (complete and annotated) is worth 1000 lines of BNF.
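A minimal sketch of that file-based approach follows. It is not the code from the linked page; the class and method names (Utf8FileLog, writeLog, readLog, roundTrip) are hypothetical, and it assumes Java 7 or later for try-with-resources, StandardCharsets and Files.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper class, not part of the original posts.
class Utf8FileLog {

    // Writes text to the file through a PrintWriter that encodes as UTF-8.
    static void writeLog(File file, String text) throws IOException {
        try (PrintWriter out = new PrintWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            out.print(text);
        }
    }

    // Reads the file back, decoding its bytes as UTF-8.
    static String readLog(File file) throws IOException {
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    }

    // Writes text to a temporary file and returns what is read back.
    static String roundTrip(String text) {
        try {
            File file = File.createTempFile("utf8log", ".txt");
            file.deleteOnExit();
            writeLog(file, text);
            return readLog(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("\u010C").equals("\u010C")); // true
    }
}
```

Any viewer or editor that is told the file is UTF-8 should then show the character correctly.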
From: François R on 5 Nov 2009 10:28
On Nov 4, 1:25 pm, Mayeul <mayeul.marg...(a)free.fr> wrote:
> The fix is to specify a character encoding to use, a unicode one, for
> instance utf-8.
> [...]

Thanks a lot for the suggestion!

I tried this:

    try {
        System.setOut(new PrintStream(new TextAreaOutputStream(msg), true, "utf-8"));
    } catch ....
and

    private class TextAreaOutputStream extends OutputStream {
        JTextArea textArea;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        TextAreaOutputStream(JTextArea textArea) {
            this.textArea = textArea;
        }

        public void flush() {
            //textArea.repaint();
            try {
                // Decode everything buffered since the last flush as utf-8.
                textArea.append(buffer.toString("utf-8"));
                buffer.reset();
            } catch (UnsupportedEncodingException e){e.printStackTrace();}
        }

        public void write(int b) {
            // Only collect the byte; decoding happens in flush().
            buffer.write(b);
            //try {
            //textArea.append(new String(new byte[] {(byte)b}));
            // } catch (UnsupportedEncodingException e){e.printStackTrace();}
        }
    }

And it seems to work well, with names like CÞek or ÄÞek properly displayed.

François
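For reference, the same buffer-then-decode idea can be sketched without a Swing dependency, which makes it easy to check outside a GUI. This is only an illustration under stated assumptions: DecodingOutputStream, its Consumer<String> sink and the capture helper are hypothetical names, and the code assumes Java 8 or later. It also overrides write(byte[], int, int) so that bulk writes go straight into the buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical variant of TextAreaOutputStream that hands decoded text to any sink.
class DecodingOutputStream extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final Consumer<String> sink;

    DecodingOutputStream(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public synchronized void write(int b) {
        buffer.write(b); // collect only; decoding happens in flush()
    }

    @Override
    public synchronized void write(byte[] b, int off, int len) {
        buffer.write(b, off, len);
    }

    @Override
    public synchronized void flush() {
        if (buffer.size() == 0) {
            return; // nothing new since the last flush
        }
        String text = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        buffer.reset();
        sink.accept(text);
    }

    // Prints text through a UTF-8 PrintStream and returns what the sink received.
    static String capture(String text) {
        StringBuilder received = new StringBuilder();
        try {
            PrintStream ps = new PrintStream(
                    new DecodingOutputStream(s -> received.append(s)), true, "UTF-8");
            ps.print(text);
            ps.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
        return received.toString();
    }

    public static void main(String[] args) {
        System.out.println(capture("\u010C").equals("\u010C")); // true
    }
}
```

With a JTextArea, the sink could be something like s -> textArea.append(s); if output may be printed from threads other than the event dispatch thread, wrapping that call in SwingUtilities.invokeLater is probably the safer choice.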