From: Anthony Jones on

"Jed" <jedatu(a)newsgroups.nospam> wrote in message
news:C6F5325A-BC98-4467-B611-12B96BC745D1(a)microsoft.com...
> Actually, this is the CDOSYS code I tried.
>
> msg.BodyPart.Charset = "utf-8"
> msg.HTMLBody = Message
> msg.HTMLBodyPart.Charset = "utf-8"
> msg.Send
>
> I accidentally copied the CDONTS code in the last post.
>

Try this in a VBScript file:-

Option Explicit

Const cdoSendUsingMethod = "http://schemas.microsoft.com/cdo/configuration/sendusing"
Const cdoFlushBuffersOnWrite = "http://schemas.microsoft.com/cdo/configuration/flushbuffersonwrite"
Const cdoSMTPServerPickupDirectory = "http://schemas.microsoft.com/cdo/configuration/smtpserverpickupdirectory"
Const cdoSendUsingPickup = 1

Dim oMsg : Set oMsg = CreateObject("CDO.Message")

Set oMsg.Configuration = CreateObject("CDO.Configuration")

With oMsg.Configuration.Fields
.Item(cdoSendUsingMethod) = cdoSendUsingPickup
.Item(cdoFlushBuffersOnWrite) = True
.Item(cdoSMTPServerPickupDirectory) = "G:\temp\pickup" '*** change this
.Update
End With

oMsg.BodyPart.charset = "UTF-8"

oMsg.From = "Dude@somewhere.com"
oMsg.To = "Bloke@elsewhere.com"
oMsg.Subject = "Testing"
oMsg.HTMLBody = "<html><body>£</body></html>"

oMsg.Send

MsgBox "Done"


Change the pickup folder to a temp folder on your machine.

When executed, open the resulting eml file in Outlook Express (double click
it). Does the £ appear correctly without other strange characters?

Open the eml file in Notepad; you should see something like:-

X-Receiver: Bloke@elsewhere.com
X-Sender: Dude@somewhere.com
From: <Dude@somewhere.com>
To: <Bloke@elsewhere.com>
Subject: Testing
Date: Sun, 12 Nov 2006 19:46:27 -0000
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_NextPart_000_0001_01C70693.3DE9F350"
Content-Class: urn:content-classes:message

This is a multi-part message in MIME format.

------=_NextPart_000_0001_01C70693.3DE9F350
Content-Type: text/plain;
charset="UTF-8"
Content-Transfer-Encoding: base64

wqPigqzFkg0K

------=_NextPart_000_0001_01C70693.3DE9F350
Content-Type: text/html;
charset="UTF-8"
Content-Transfer-Encoding: 8bit

<html><body>£</body></html>
------=_NextPart_000_0001_01C70693.3DE9F350--

I deleted some headers for clarity. However, you can see that specifying
UTF-8 on the main message body part before writing anything to the message
has caused the UTF-8 encoding to cascade to the alternative parts.
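
One way to confirm this from the sample above is to decode the base64
payload of the text/plain part and check that it is valid UTF-8. The sketch
below assumes Python 3 is to hand (any base64 decoder would do) and uses only
standard-library calls. Note that the payload in this sample decodes to
£€Œ plus a CRLF, which is more than the single £ visible in the HTML part as
shown here.

```python
import base64

# Base64 payload copied from the text/plain part of the sample eml file.
payload = "wqPigqzFkg0K"

raw = base64.b64decode(payload)

# decode() raises UnicodeDecodeError if the bytes are not valid UTF-8.
text = raw.decode("utf-8")

# Show the raw octets and the decoded characters side by side.
print(" ".join(f"{b:02x}" for b in raw))
print(repr(text))
```

If the part had been written in an ANSI code page instead, the pound sign
would be the single octet a3 rather than the pair c2 a3.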

What happens if you change the code so that the configuration sends using port
25 to your SMTP server and you specify your real email address as the
receiver? Does the email look OK when it arrives in Outlook/Thunderbird?




From: Jed on
Thanks for the input Anthony,

I wrote out the email as you indicated and indeed the headers are UTF-8 but
the text is wrong:

This:
msg.BodyPart.Charset = "UTF-8"
msg.TextBody = Message
msg.TextBodyPart.Charset = "UTF-8"
msg.HTMLBody = Message
msg.HTMLBodyPart.Charset = "UTF-8"

Yields this:

------=_NextPart_000_0001_01C70806.B7C0CB80
Content-Type: text/plain;
charset="UTF-8"
Content-Transfer-Encoding: 8bit

------=_NextPart_000_0001_01C70806.B7C0CB80
Content-Type: text/html;
charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

When I just set the BodyPart.Charset = "UTF-8" and set the HTMLBody =
Message then I get the following text in the text version of the email.

=C3=83=C2=BA

Using Notepad++
Start in ANSI
Convert the chars above from Hex to Text
(Plugins > TextFX Convert > Hex to Text)
Then switch to UTF-8
You get the chars in the email

Then cut the characters
Switch to ANSI mode
Paste the characters
Switch to UTF-8
And you get the character that is supposed to be there.

I don't get it. Any ideas what I am doing wrong?


From: Anthony Jones on

"Jed" <jedatu(a)newsgroups.nospam> wrote in message
news:CE49C076-8A72-4182-9754-EC2894E57823(a)microsoft.com...
> Thanks for the input Anthony,
>
> I wrote out the email as you indicated and indeed the headers are UTF-8 but
> the text is wrong:
>

Before we go any further, did you paste my code verbatim into a VBS file? (Cos
what you posted below isn't what I posted.)
Did you then open it in Outlook Express, and did it look right?


> This:
> msg.BodyPart.Charset = "UTF-8"

Don't do this:-

> msg.TextBody = Message
> msg.TextBodyPart.Charset = "UTF-8"

> msg.HTMLBody = Message

Don't do this either:-

> msg.HTMLBodyPart.Charset = "UTF-8"
>
> Yields this:
>
> ------=_NextPart_000_0001_01C70806.B7C0CB80
> Content-Type: text/plain;
> charset="UTF-8"
> Content-Transfer-Encoding: 8bit
>
> ------=_NextPart_000_0001_01C70806.B7C0CB80
> Content-Type: text/html;
> charset="UTF-8"
> Content-Transfer-Encoding: quoted-printable
>
> When I just set the BodyPart.Charset = "UTF-8" and set the HTMLBody =
> Message then I get the following text in the text version of the email.
>
> =C3=83=C2=BA
>
> Using Notepad++
> Start in ANSI
> Convert the chars above from Hex to Text
> (Plugins > TextFX Convert > Hex to Text)
> Then switch to UTF-8
> You get the chars in the email
>
> Then cut the characters
> Switch to ANSI mode
> Paste the characters
> Switch to UTF-8
> And you get the character that is supposed to be there.
>
> I don't get it. Any ideas what I am doing wrong?
>

It would help if I knew what character this is supposed to be. Is it ú?
What ANSI codepage are you using, and what are the char codes for these
characters in that code page?
Are you certain the character isn't already corrupted?
The fact that 4 octets have appeared in the output suggests to me that the
character is going through the UTF-8 encoding twice.
Is this in ASP?
Are you posting from a UTF-8 encoded HTML form?
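
To illustrate the double-encoding suspicion, here is a minimal Python 3
sketch. It assumes the character is ú (U+00FA) and that the ANSI code page
involved is Windows-1252; both are guesses at this point in the thread.
as_quoted_printable is a hypothetical helper that writes every byte as =XX,
which matches quoted-printable only for non-ASCII bytes like these.

```python
# Assumption: the intended character is U+00FA and "ANSI" means cp1252.
intended = "\u00fa"

# Correct path: one character becomes two UTF-8 octets.
once = intended.encode("utf-8")

# Faulty path: the two octets are misread as two cp1252 characters,
# and those two characters are then encoded to UTF-8 a second time.
misread = once.decode("cp1252")
twice = misread.encode("utf-8")


def as_quoted_printable(data: bytes) -> str:
    """Hypothetical helper: render every byte as =XX (illustration only)."""
    return "".join(f"={b:02X}" for b in data)


print(as_quoted_printable(once))   # single encoding: two octets
print(as_quoted_printable(twice))  # double encoding: four octets

# Undoing the damage: read as UTF-8, write back as cp1252, read as UTF-8.
repaired = twice.decode("utf-8").encode("cp1252").decode("utf-8")
print(repaired == intended)
```

The last step mirrors the Notepad++ round trip described above: reading the
four octets as UTF-8, saving them as ANSI, and reading the result as UTF-8
again recovers the intended character.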



From: Jed on
Hi Anthony,

I have a good feeling that you will be able to help me get to the bottom of
this.

Let me answer your questions.

"Anthony Jones" wrote:

> Before we go any further did you paste my code verbatim into a VBS? (Cos
> what you posted below isn't what I posted)

Yes. I tried it exactly as you recommended then I tried some other things.

> Did you then open it in outlook express and did it look right?

Yes. I opened the eml in outlook and it did not look right.

> It would help if I knew what character this is supposed to be? ú ?

Yes. You are correct about the character code. I would have pasted it in
my message but I was not confident that it would come out right in the post.

> What ANSI codepage are you using and what are the char codes for these
> characters in that code page?

I don't know what ANSI code page Notepad++ uses. I am guessing the default
for my localization settings in windows.

> Are you certain the character isn't already corrupted?

I don't know, but when I write the results out to the web page using
Response.Write(Message) I get the correct characters.

Response.Clear
'I have heard you need the following, but it seems to
' render fine in the browser without it
'Response.CodePage = 65001
Response.CharSet = "utf-8"
Response.Write Message
Response.End

> The fact that 4 octets have appeared in the output suggests to me that the
> character is going through the UTF-8 encoding twice?

This is possible, I guess. I don't know.

> Is this in ASP?

Yes. This is a classic asp page handling the request using the standard asp
ISAPI dll in IIS 6.

> Are you posting from a UTF-8 encoded HTML form?

I believe so. I put the following in the HTML of the form page:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Does that make any sense?

From: Anthony Jones on

"Jed" <jedatu(a)newsgroups.nospam> wrote in message
news:5BD4B6B1-005F-4457-8931-5618D49AF743(a)microsoft.com...
> Hi Anthony,
>
> I have a good feeling that you will be able to help me get to the bottom of
> this.
>
> Let me answer your questions.
>
> "Anthony Jones" wrote:
>
> > Before we go any further did you paste my code verbatim into a VBS? (Cos
> > what you posted below isn't what I posted)
>
> Yes. I tried it exactly as you recommended then I tried some other things.
>
> > Did you then open it in outlook express and did it look right?
>
> Yes. I opened the eml in outlook and it did not look right.

Was that after you 'tried some things' or before?

It didn't contain simply a British pound sign (£)?
How did the contents of the eml file created by my original code differ from
the contents I posted along with the code?


>
> > It would help if I knew what character this is supposed to be? ú ?
>
> Yes. You are correct about the character code. I would have pasted it in
> my message but I was not confident that it would come out right in the post.
>
> > What ANSI codepage are you using and what are the char codes for these
> > characters in that code page?
>
> I don't know what ANSI code page Notepad++ uses. I am guessing the default
> for my localization settings in windows.
>

Yes it uses the localization settings.


> > Are you certain the character isn't already corrupted?
>
> I don't know, but when I write the results out to the web page using
> Response.Write(Message) I get the correct characters.
>
> Response.Clear
> 'I have heard you need the following, but it seems to
> ' render fine in the browser without it
> 'Response.CodePage = 65001
> Response.CharSet = "utf-8"
> Response.Write Message
> Response.End

Your problem, I believe, hinges on a couple of little-understood facts.
Response.CodePage affects the way posted characters received in the
Request are converted to Unicode. In other words, if the response code page
is set to a standard ANSI character set, then any characters received in a
form post will be assumed to be in that same ANSI character set.

Here's another fact. A browser will encode characters into a Form post
according to the charset for the page. Hence a content-type specifying a
charset of UTF-8 will cause characters in the form fields to be encoded to
UTF-8 when posted.

Combining these facts, we can see that if a UTF-8 page posts characters to an
ASP target which reads the form fields whilst Response.CodePage is set
to an ANSI codepage, each byte of a multibyte UTF-8 character will be
treated as an individual character.

The code above hides this problem because Response.Write assumes it is
sending ANSI, while Response.CharSet tells the browser it is getting UTF-8,
which reverses the problem.
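
The two facts, and the way the Response.Write test masks the problem, can be
modelled outside ASP. The following Python 3 sketch is only an analogy:
cp1252 stands in for whatever ANSI code page the server uses, quote stands in
for the browser encoding the form field, and unquote stands in for ASP
decoding it.

```python
from urllib.parse import quote, unquote

# Assumption: the user typed U+00FA into a form on a UTF-8 page.
typed = "\u00fa"

# The browser percent-encodes the UTF-8 octets of the field value.
posted = quote(typed)

# Server reads the field while its code page is ANSI (cp1252 assumed):
# each octet becomes its own character.
read_as_ansi = unquote(posted, encoding="cp1252")

# Server reads the field with a UTF-8 code page (65001 in ASP terms).
read_as_utf8 = unquote(posted, encoding="utf-8")

# Writing the mis-read string back out as ANSI restores the original
# octets, so a browser told to expect UTF-8 shows the right character.
written = read_as_ansi.encode("cp1252")
shown_in_browser = written.decode("utf-8")

print(posted)
print(len(read_as_ansi), len(read_as_utf8))
print(shown_in_browser == typed)
```

So the page looks right in the browser even though the string held by the
script is already two characters long, and anything that encodes that string
to UTF-8 properly ends up with four octets.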


>
> > The fact that 4 octets have appeared in the output suggests to me that the
> > character is going through the UTF-8 encoding twice?
>
> This is possible, I guess. I don't know.
>
> > Is this in ASP?
>
> Yes. This is a classic asp page handling the request using the standard asp
> ISAPI dll in IIS 6.
>
> > Are you posting from a UTF-8 encoded HTML form?
>
> I believe so. I put the following in the HTML of the form page:
>
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
>
Yeah, don't do that. Use the CharSet and ContentType properties of the
Response object.

> Does that make any sense?
>

Yes. When receiving a Form post from a UTF-8 page, make sure your
Response.CodePage is set to 65001 before you attempt to read any form
fields.

Anthony.