From: moonhkt on
Hi All
I want to output the characters in the string one by one.
Right now, codePointAt just prints the code point values.


On Jan 28, 12:12 am, RedGrittyBrick <RedGrittyBr...(a)spamweary.invalid>
wrote:
> moonhkt wrote:
> > On Jan 27, 8:17 pm, Lothar Kimmeringer <news200...(a)kimmeringer.de>
> > wrote:
> >> moonhkt wrote:
> >>> The code below does not work.
> >> [...]
>
> >>>     char[] ch = new char[];
> >> Because it doesn't compile.
>
> >> What exactly doesn't work? Do you get wrong output, or do you
> >> get an exception (which you ignore in the source you provided)? A
> >> bit more information would really help to be able to answer
> >> more than "something will be wrong in your code".
>
> >> Regards, Lothar
> >> --
> >> Lothar Kimmeringer                E-Mail: spamf...(a)kimmeringer.de
> >>                PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)
>
> >> Always remember: The answer is forty-two, there can only be wrong
> >>                  questions!
>
> > Thanks. I found the example below, but I cannot get the UTF-8 char code.
>
> What do you mean by "UTF-8 char code"? Strictly speaking there is no
> such thing. You might mean "Unicode code-point" or "sequence of octets
> in UTF8-encoding"
>
> > class CodePointAtstring
> > {
> >   public static void main(String[] args)
> >   {
> >     // Declaration of String
> >     String a = "\u00fc" + "\u34d7" + "Welcome to Rose india";
> >     // Displays the actual String declared above
> >     System.out.println("GIVEN STRING IS=" + a);
> >     // Returns the character (Unicode code point) at the specified index.
> >     System.out.println("Unicode code point at position 0 IN THE STRING IS="
> >         + a.codePointAt(0));
> >     System.out.println("Unicode code point at position 1 IN THE STRING IS="
> >         + a.codePointAt(1));
> >     System.out.println("Unicode code point at position 2 IN THE STRING IS="
> >         + a.codePointAt(2));
> >     System.out.println("Unicode code point at position 3 IN THE STRING IS="
> >         + a.codePointAt(3));
> >     System.out.println("Unicode code point at position 6 IN THE STRING IS="
> >         + a.codePointAt(6));
> >   }
> > }
>
> > Output
> > java CodePointAtstring
> > GIVEN STRING IS=³?Welcome to Rose india
> > Unicode code point at position 0 IN THE STRING IS=252
> > Unicode code point at position 1 IN THE STRING IS=13527
> > Unicode code point at position 2 IN THE STRING IS=87
> > Unicode code point at position 3 IN THE STRING IS=101
> > Unicode code point at position 6 IN THE STRING IS=111
>
> That seems completely reasonable to me because 252 = 0x00fc and 13527 =
> 0x34d7.
>
> Nothing in your program has anything to do with UTF-8 encoding.
>
> --
> RGB

From: Lew on
Please, do not top-post.

moonhkt wrote:
> I want to output the characters in the string one by one.
> Right now, codePointAt just prints the code point values.

'codePointAt()' doesn't print anything. How are you actually printing it?

'codePointAt()' returns an int, not a character.
<http://java.sun.com/javase/6/docs/api/java/lang/String.html#codePointAt(int)>

Most methods that output an int show the int value, not the equivalent
character. If you want to display an int as a character, you have to use a
method that will do that. I don't know offhand of a method in the standard
API that does that, but perusal of the Javadocs might reveal one, otherwise
you'll have to code one yourself or find a third-party library that already
has such.
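
For what it's worth, the standard API does seem to cover this: `Character.toChars(int)`, `StringBuilder.appendCodePoint(int)` and the `%c` format conversion all accept an int code point. A minimal sketch follows; the class name `CodePointToText` and the helper `toText` are made up for illustration and are not from anyone's code in this thread. Whether the characters show up correctly still depends on the console's encoding.

```java
class CodePointToText {
    // Converts a single Unicode code point to a String.
    // Character.toChars returns one char for a BMP code point and
    // a surrogate pair (two chars) for a supplementary one.
    static String toText(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String a = "\u00fc\u34d7Welcome";
        int cp = a.codePointAt(0);              // an int, here 252

        System.out.println(cp);                 // prints the number
        System.out.println(toText(cp));         // prints the character
        System.out.printf("%c%n", cp);          // %c also accepts an int code point

        StringBuilder sb = new StringBuilder();
        sb.appendCodePoint(a.codePointAt(1));   // appends U+34D7 as a character
        System.out.println(sb);
    }
}
```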

--
Lew
From: Roedy Green on
On Wed, 27 Jan 2010 16:12:18 +0000, RedGrittyBrick
<RedGrittyBrick(a)spamweary.invalid> wrote, quoted or indirectly quoted
someone who said :

>
>What do you mean by "UTF-8 char code"? Strictly speaking there is no
>such thing. You might mean "Unicode code-point" or "sequence of octets
>in UTF8-encoding"

The point of an encoding is that it hides the details of how 16-bit chars
are inserted into an 8-bit stream. All you are interested in is the 16-bit
Java char value, or perhaps the Java code point value if the string also
contains supplementary characters, which do not fit in 16 bits.
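
The three views mentioned here (16-bit chars, code points, and the octets an encoding produces) can be put side by side. This is only a sketch, assuming Java 7 or later for `StandardCharsets`; the class name `EncodingViews` and the helper `utf8Hex` are hypothetical names chosen for the example.

```java
import java.nio.charset.StandardCharsets;

class EncodingViews {
    // Space-separated hex dump of the UTF-8 octets for a string.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00FC, U+34D7, then U+1D11E written as a surrogate pair.
        String s = "\u00fc\u34d7\uD834\uDD1E";
        System.out.println("char count:       " + s.length());
        System.out.println("code point count: " + s.codePointCount(0, s.length()));
        System.out.println("UTF-8 octets:     " + utf8Hex(s));
    }
}
```

The string holds four chars but three code points, and the octets only come into existence when the string is encoded.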
--
Roedy Green Canadian Mind Products
http://mindprod.com
Computers are useless. They can only give you answers.
~ Pablo Picasso (born: 1881-10-25 died: 1973-04-08 at age: 91)
From: moonhkt on
Yes, this is what I want.
But my output is not the same as yours. Yours is correct.

Run in JCreator version 4.5
--------------------Configuration: <Default>--------------------
GIVEN STRING IS=羹?elcome to Rose India ??.
Length of string is 27
CodePoints in string is 26
Character[0] is 羹
Character[1] is ??
Character[2] is W
Character[3] is e
Character[4] is l
Character[5] is c
Character[6] is o
Character[7] is m
Character[8] is e
Character[9] is
Character[10] is t
Character[11] is o
Character[12] is
Character[13] is R
Character[14] is o
Character[15] is s
Character[16] is e
Character[17] is
Character[18] is I
Character[19] is n
Character[20] is d
Character[21] is i
Character[22] is a
Character[23] is
Character[24] is ?
Character[25] is ?
Character[26] is .

Process completed.
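
One possible explanation, which nothing in this thread confirms, is that the console decodes the program's UTF-8 output with a different charset. A rough way to check is to print the JVM's default charset and compare the octets a character produces under each candidate. The class name `ConsoleCharsetCheck` and its helper methods are hypothetical.

```java
import java.nio.charset.Charset;

class ConsoleCharsetCheck {
    // Name of the charset the JVM uses by default when encoding text.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    // Space-separated hex dump of the octets a string encodes to
    // under the given charset.
    static String hex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + defaultCharsetName());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
        // If the console decodes these octets with another charset,
        // the character appears garbled.
        System.out.println("U+00FC in default charset: "
            + hex("\u00fc", Charset.defaultCharset()));
        System.out.println("U+00FC in UTF-8:           "
            + hex("\u00fc", Charset.forName("UTF-8")));
    }
}
```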



On Jan 28, 6:38 pm, RedGrittyBrick <RedGrittyBr...(a)spamweary.invalid>
wrote:
> moonhkt wrote:
> > RedGrittyBrick wrote:
> >> moonhkt wrote:
> >>> Lothar Kimmeringer wrote:
> >>>> moonhkt wrote:
>
> >>>>> The code below does not work.
>
> >>>> [...]
> >>>> Because it doesn't compile. What exactly doesn't work? Do you
> >>>> get wrong output, or do you get an exception (which you ignore in
> >>>> the source you provided)? A bit more information would really help
> >>>> to be able to answer more than "something will be wrong in your
> >>>> code". Regards,
>
> >>> Thanks. I found the example below, but I cannot get the UTF-8 char
> >>> code.
>
> >> What do you mean by "UTF-8 char code"? Strictly speaking there is
> >> no such thing. You might mean "Unicode code-point" or "sequence of
> >> octets in UTF8-encoding"
>
> >> [...]
>
> >> Nothing in your program has anything to do with UTF-8 encoding.
>
> > Hi All, I want to output the characters in the string one by one.
> > Right now, codePointAt just prints the code point values.
>
> Why not use String's length() and charAt() methods?
>
> I assume you can disregard characters outside Unicode's Basic
> Multilingual Plane (BMP) - if not, I think you'll have to check for
> surrogate pairs. Characters outside the BMP are too big for a char.
>
> -------------------------------------8<-----------------------------------
> import java.io.PrintStream;
> import java.io.UnsupportedEncodingException;
>
> public class UnicodeChars {
>    public static void main(String[] args)
>        throws UnsupportedEncodingException {
>
>      // I want console output in UTF-8
>      PrintStream sysout = new PrintStream(System.out, true, "UTF-8");
>
>      // \u00fc is LATIN SMALL LETTER U WITH DIAERESIS;
>      // \u34d7 is a character in CJK Unified Ideographs Extension A.
>      // \uD834\uDD1E is the surrogate pair for character U+1D11E.
>      // U+1D11E is MUSICAL SYMBOL G CLEF;
>      String a = "\u00fc\u34d7Welcome to Rose India \uD834\uDD1E.";
>
>      int n = a.length();
>      sysout.println("GIVEN STRING IS=" + a);
>      sysout.printf("Length of string is %d%n", n);
>      sysout.printf("CodePoints in string is %d%n",
>          a.codePointCount(0, n));
>      // charAt() walks 16-bit chars, so a surrogate pair shows up as two entries
>      for (int i = 0; i < n; i++) {
>        sysout.printf("Character[%d] is %s%n", i, a.charAt(i));
>      }
>    }
> }
>
> -------------------------------------8<-----------------------------------
> GIVEN STRING IS=ü㓗Welcome to Rose India 𝄞.
> Length of string is 27
> CodePoints in string is 26
> Character[0] is ü
> Character[1] is 㓗
> Character[2] is W
> Character[3] is e
> Character[4] is l
> Character[5] is c
> Character[6] is o
> Character[7] is m
> Character[8] is e
> Character[9] is
> Character[10] is t
> Character[11] is o
> Character[12] is
> Character[13] is R
> Character[14] is o
> Character[15] is s
> Character[16] is e
> Character[17] is
> Character[18] is I
> Character[19] is n
> Character[20] is d
> Character[21] is i
> Character[22] is a
> Character[23] is
> Character[24] is ?
> Character[25] is ?
> Character[26] is .
>
> --
> RGB
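
Entries 24 and 25 in the quoted output are the two halves of the surrogate pair, printed separately. If characters outside the BMP matter, one option is to step through the string by code point instead of by char. A minimal sketch, using the same test string; the class name `CodePointWalk` and the helper `characters` are invented for this example, and correct display still depends on the console encoding.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointWalk {
    // Splits a string into one String per code point, keeping each
    // surrogate pair together instead of yielding two lone halves.
    static List<String> characters(String s) {
        List<String> out = new ArrayList<String>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out.add(new String(Character.toChars(cp)));
            i += Character.charCount(cp);   // 1 for BMP, 2 for a surrogate pair
        }
        return out;
    }

    public static void main(String[] args) {
        String a = "\u00fc\u34d7Welcome to Rose India \uD834\uDD1E.";
        List<String> chars = characters(a);
        for (int i = 0; i < chars.size(); i++) {
            System.out.printf("Character[%d] is %s%n", i, chars.get(i));
        }
    }
}
```

With this loop the list has 26 entries, matching `codePointCount`, rather than the 27 that `length()` reports.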

From: Lew on
On Jan 28, 12:57 pm, RedGrittyBrick <RedGrittyBr...(a)spamweary.invalid>
wrote:
> PLEASE DON'T TOP-POST, PLEASE PUT YOUR REPLY AT THE BOTTOM, BELOW ANY
> QUOTED TEXT. THANKS!
>

Actually, it's better to post inline, with comments interspersed with
quoted material.

--
Lew