From: Jerry Boetje on 2 Mar 2010 10:27

The spec for STRING-CAPITALIZE is defined to break into words where: "a ``word'' is defined to be a consecutive subsequence consisting of alphanumeric characters". This gives interesting results such as "don't" => "Don'T". Any 4th-grader would know that the right capitalization is "Don't". In CLforJava, we use the Unicode definitions for breaking, and we get "Don't".

Any thoughts about changing this weirdness? Please, no "but, but it's the specification" comments. I get the spec. This gets more into a transition from the 1980's definition of characters and strings and into the Unicode world. I'd rather talk about the world of today and what we can do about it.
From: Tamas K Papp on 2 Mar 2010 10:33

On Tue, 02 Mar 2010 07:27:16 -0800, Jerry Boetje wrote:

> The spec for STRING-CAPITALIZE is defined to break into words where: "a
> ``word'' is defined to be a consecutive subsequence consisting of
> alphanumeric characters". This gives interesting results such as "don't"
> => "Don'T". Any 4th-grader would know that the right capitalization is
> "Don't". In CLforJava, we use the Unicode definitions for breaking, and
> we get "Don't". Any thoughts about changing this weirdness? Please, no
> "but, but it's the specification" comments. I get the spec. This gets
> more into a transition from the 1980's definition of characters and
> strings and into the Unicode world. I'd rather talk about the world of
> today and what we can do about it.

The obvious solution seems to be writing and using your own function to capitalize strings (which would be the usual approach to cases where the standard is clear, but you don't like it).

Tamas
From: Zach Beane on 2 Mar 2010 10:33

Jerry Boetje <jerryboetje(a)mac.com> writes:

> The spec for STRING-CAPITALIZE is defined to break into words where:
> "a ``word'' is defined to be a consecutive subsequence consisting of
> alphanumeric characters". This gives interesting results such as
> "don't" => "Don'T". Any 4th-grader would know that the right
> capitalization is "Don't". In CLforJava, we use the Unicode
> definitions for breaking, and we get "Don't". Any thoughts about
> changing this weirdness? Please, no "but, but it's the specification"
> comments. I get the spec. This gets more into a transition from the
> 1980's definition of characters and strings and into the Unicode
> world. I'd rather talk about the world of today and what we can do
> about it.

Follow the spec for STRING-CAPITALIZE and provide your own function that does what you prefer, instead of what's mandated by the standard.

Zach
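
[Editor's sketch] The "write your own function" suggestion made in the two replies above could look roughly like the following. This is only an illustration, not code from anyone in the thread and not how CLforJava does it: the name CAPITALIZE-WORDS is hypothetical, and the rule that an apostrophe followed by a letter stays inside the word is an assumption standing in for real Unicode word breaking. Everything else follows the standard's definition of a word as a run of alphanumeric characters.

```lisp
;; Hypothetical helper: not part of the standard and not CLforJava's code.
;; Assumption: an apostrophe that follows a word character and is itself
;; followed by a letter is treated as part of the word. All other characters
;; follow STRING-CAPITALIZE's rule (a word is a run of alphanumerics).
(defun capitalize-words (string)
  "Return a fresh copy of STRING with each word capitalized, keeping
apostrophes inside words so that \"don't\" becomes \"Don't\"."
  (let ((result (copy-seq (string string)))
        (in-word nil))
    (dotimes (i (length result) result)
      (let ((ch (char result i)))
        (cond ((alphanumericp ch)
               ;; First character of a word is upcased, the rest downcased.
               (setf (char result i)
                     (if in-word (char-downcase ch) (char-upcase ch)))
               (setf in-word t))
              ;; Apostrophe inside a word: leave IN-WORD set, change nothing.
              ((and in-word
                    (char= ch #\')
                    (< (1+ i) (length result))
                    (alpha-char-p (char result (1+ i)))))
              ;; Any other character ends the current word.
              (t (setf in-word nil)))))))

;; The standard function, for comparison:
;; (string-capitalize "don't")  ; => "Don'T"
;; (capitalize-words "don't")   ; => "Don't"
```

Keeping this under a separate name leaves STRING-CAPITALIZE conforming, so existing code that relies on the standard behavior is unaffected. The apostrophe rule is English-centric; other languages would need different word-breaking rules.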
From: Pascal Costanza on 2 Mar 2010 11:05

On 02/03/2010 16:27, Jerry Boetje wrote:

> The spec for STRING-CAPITALIZE is defined to break into words where:
> "a ``word'' is defined to be a consecutive subsequence consisting of
> alphanumeric characters". This gives interesting results such as
> "don't" => "Don'T". Any 4th-grader would know that the right
> capitalization is "Don't". In CLforJava, we use the Unicode
> definitions for breaking, and we get "Don't". Any thoughts about
> changing this weirdness? Please, no "but, but it's the specification"
> comments. I get the spec. This gets more into a transition from the
> 1980's definition of characters and strings and into the Unicode
> world. I'd rather talk about the world of today and what we can do
> about it.

Even in the world of today, not everybody speaks only English.

Pascal

--
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
From: Tim Bradshaw on 2 Mar 2010 11:18

On 2010-03-02 15:27:16 +0000, Jerry Boetje said:

> Any thoughts about changing this weirdness?

If you're happy to search the entire corpus of Lisp code for things that changing this might break, make and test modifications to be sure that things will not break, and cover any possible problems if things broke despite your fixes, then yes, I'd be happy to see it changed.