From: Pegasus [MVP] on

"Odd" <audrey.nsh(a)gmail.com> wrote in message
news:385cd965-be36-415e-9d0e-124d1b260c92(a)r9g2000yqa.googlegroups.com...

Ok, Ijust sent you to your e-mail address......let me know.....

===========

Your file is a comma-delimited text file. Some of your fields contain an
excessive number of trailing spaces that you should get rid of. Some contain
tabs ($09) and extended characters $95 that you may want to suppress. Some
contain quoted strings with embedded CRLFs. Here is a way to replace them
with spaces. It's not a particularly elegant way - perhaps someone can think
of a better method.
Const ForReading = 1
Const ForWriting = 2
Const sQuote = """"

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objInput = objFSO.OpenTextFile("d:\csv.csv", ForReading)

objFSO.CreateTextFile "d:\Project_" & DatePart("yyyy", Date) _
& Right("0" & DatePart("m",Date),2) & Right("0" & DatePart("d", Date),2) &
".csv"
Set objOutput = objFSO.OpenTextFile("d:\Project_" _
& DatePart("yyyy", Date) & Right("0" & DatePart("m", Date), 2) _
& Right("0" & DatePart("d", Date),2) & ".csv", ForWriting)

sData = Replace(objInput.ReadAll, VbCrLf, Chr(0))
sOut = ""
bQuoted = False

For i = 1 To Len(sData)
If Mid(sData, i, 1) = sQuote Then bQuoted = Not bQuoted
If Mid(sData, i, 1) = Chr(0) And bQuoted Then
sOut = sOut & " "
Else
sOut = sOut & Mid(sData, i, 1)
End If
Next

objOutput.Write Replace(sOut, Chr(0), VbCrLf)
objOutput.Close


From: Pegasus [MVP] on

"ekkehard.horner" <ekkehard.horner(a)arcor.de> wrote in message
news:4aa6b9de$0$32674$9b4e6d93(a)newsspool2.arcor-online.net...
> Pegasus [MVP] schrieb:
> [...]
>> The dump you posted does not really seem to be a "CSV" - it uses
>> semicolons as delimiters, not commas. In an attempt to put an end to the
>> guessing game,
>
> That is the German locale form of a .CSV; cross language processing may
> be cumbersome some times, but as long as you stay with your locale,
>
> PROCESSING .CSVs WITH EMBEDDED LINE-ENDINGS CAUSE NO PROBLEMS
> AT ALL AS LONG AS YOU USE THE PROPER (MICROSOFT) TOOLS

Why all this shouting? Achtung, listen now!


From: Pegasus [MVP] on

"Todd Vargo" <tlvargo(a)sbcglobal.netz> wrote in message
news:%232o6zoMMKHA.4064(a)TK2MSFTNGP06.phx.gbl...
>
> "Odd" <audrey.nsh(a)gmail.com> wrote in message
> news:8a2b8b75-480a-4315-88d4-849b5ceb3c96(a)r39g2000yqm.googlegroups.com...
>> Hi everyone,
>>
>> I have a .csv file where certain fields have carriage returns/new line
>> feeds. Because of such, I cannot properly import my file. Hence, I
>> wrote the following script and while it removes the carriage returns/
>> new line feeds through the file, it ALSO removes the END OF LINE
>> carriage return/line feed. So now, I have one file with one record.
>>
>> Basically,I need to know how to remove the carriage returns/line feeds
>> in the file EXCEPT for the ones that is END OF LINE. Can somebody help
>> me please?
>>
>> Dim strSearchString, objFSO, objFile
>> Const ForReading = 1
>> Const ForWriting = 2
>>
>>
>> Set objFSO = CreateObject("Scripting.FileSystemObject")
>> Set objFile = objFSO.OpenTextFile("d:\Test\Project.csv", ForReading)
>> strSearchString = objFile.ReadAll
>> objFile.Close
>>
>>
>> objFSO.CreateTextFile "d:\Test\Project_" & DatePart("yyyy",Date) &
>> Right("0" & DatePart("m",Date),2) & Right("0" & DatePart("d", Date),2)
>> & ".csv"
>> Set objFile = objFSO.OpenTextFile("d:\Test\Project_" & DatePart
>> ("yyyy",Date) & Right("0" & DatePart("m",Date),2) & Right("0" &
>> DatePart("d", Date),2) & ".csv", ForWriting)
>> strSearchString = Replace(strSearchString, VbCrLf, "")
>> strSearchString = Replace(strSearchString, Vbtab, "")
>>
>> objFile.Write strSearchString
>> objFile.Close
>
> From your other posts it sounds like the .csv file was created by Excel
> and
> you see blocks (in Notepad or other editor) for the embedded line feeds.
> Removing the embedded line feeds is simple but then you have the lines
> running into each other. I suggest replacing the the line feeds with
> spaces
> so you wont have lines crashing together. To do this, 1) Replace all
> Chr(10)
> with a space. 2) Replace all Chr(13) + Chr(32) with Chr(13) + Chr(10).
>
> Dim strSearchString, objFSO, objFile
> Const ForReading = 1
> Const ForWriting = 2
>
> Set objFSO = CreateObject("Scripting.FileSystemObject")
> Set objFile = objFSO.OpenTextFile("d:\Test\Project.csv", ForReading)
> strSearchString = objFile.ReadAll
> objFile.Close
>
> Set objFile = objFSO.OpenTextFile("d:\Test\Project_" & DatePart
> ("yyyy",Date) & Right("0" & DatePart("m",Date),2) & Right("0" &
> DatePart("d", Date),2) & ".csv", ForWriting)
> strSearchString = Replace(strSearchString, VbLf, " ")
> strSearchString = Replace(strSearchString, VbCr & " ", VbCrLf)
> objFile.Write strSearchString
> objFile.Close
>
> MsgBox "done."
>
> --
> Todd Vargo
> (Post questions to group only. Remove "z" to email personal messages)

This is what I thought too but after examining the OP's sample file I
realised that it would not work. The multi-line fields contain embedded
CRLFs, not just LFs. In my other reply I used the double-quote initiator to
recognise a quoted field so that I could replace the subsequent CRLFs with a
space, right up to the closing double quote.


From: Richard Mueller [MVP] on

"ekkehard.horner" <ekkehard.horner(a)arcor.de> wrote in message
news:4aa6bdaf$0$32670$9b4e6d93(a)newsspool2.arcor-online.net...
> Odd schrieb:
> [...]
>> On Sep 8, 12:38 pm, "Richard Mueller [MVP]" <rlmueller-
>> nos...(a)ameritech.nospam.net> wrote:
> [...]
>>> Do Until objInput.AtEndofStream
>>> strLine = Trim(objInput.ReadLine)
>>> If (strLine <> "") Then
>>> strLine = Replace(strLine, vbCrLf, "")
>>> strLine = Replaced(strLine, vbCr, "")
>>> strLine = Replaced(strLine, vbLf, "")
>>> strLine = Replaced(strLine, vbFormFeed, "")
>>> strLine = Replace(strLine, vbtab,"")
>>> objOutput.WriteLine strLine
>>> End If
>>> Loop
> [...]
>>
>> What Richard is saying is correct.....the carriage returns and line
>> feeds is embedded in the field values. Richard: I tried your code..it
>> removed the tab,but did not remove the carriage return nor the line
>> feed.
>
> That is to be expected, because ReadLine will read data from the file
> upto the bytes it considers line endings (vbLf, vbCrLf; I'm fairly
> sure that vbCr isn't a line ending, but don't believe, do your own tests),
> discard the line ending, and return the 'clean' string/line.
>
> [Richard, you use ADO frequently - please help me to convince Odd
> and Pegasus that giving the text driver a try is NOT stupid]

The one hex dump I've seen in this thread had a normal 0D 0A sequence,
equivalent to vbCrLf, but I didn't see enough to know what is going on. I
have used code similar to below to read csv files:
=============
Option Explicit

Dim adoCSVConnection, adoCSVRecordSet, strPathToTextfile
Dim strCSVFile, k

' Specify path to CSV file.
strPathToTextFile = "c:\Scripts\"

' Specify CSV file name.
strCSVFile = "Example.csv"

' Open connection to the CSV file.
Set adoCSVConnection = CreateObject("ADODB.Connection")
Set adoCSVRecordSet = CreateObject("ADODB.Recordset")

' Open CSV file with no header line.
adoCSVConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & strPathtoTextFile & ";" & _
"Extended Properties=""text;HDR=NO;FMT=Delimited"""

adoCSVRecordset.Open "SELECT * FROM " & strCSVFile, adoCSVConnection

' Read the CSV file.
Do Until adoCSVRecordset.EOF
' Display Values in all fields.
For k = 0 To adoCSVRecordset.Fields.Count -1
Wscript.Echo adoCSVRecordset.Fields.Item(k).Value
Next
adoCSVRecordset.MoveNext
Loop

' Clean up.
adoCSVRecordset.Close
adoCSVConnection.Close
=========
Of course, if the file has a header line, you can reference the fields by
name. For example (in part):
===========
' Open CSV file with header line.
adoCSVConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & strPathtoTextFile & ";" & _
"Extended Properties=""text;HDR=YES;FMT=Delimited"""

adoCSVRecordset.Open "SELECT * FROM " & strCSVFile, adoCSVConnection

' Read the CSV file.
Do Until adoCSVRecordset.EOF
' Display Values in fields by field name (from header line).
Wscript.Echo "First Name: " & adoCSVRecordset.Fields("First").Value
Wscript.Echo "Middle Name: " & adoCSVRecordset.Fields("Middle").Value
Wscript.Echo "Last Name: " & adoCSVRecordset.Fields("Last").Value
adoCSVRecordset.MoveNext
Loop
========
It would be intersting to see how code similar to above handles this file.
Still, if any of the fields has embedded characters that the OP wants to
remove, you will need to add steps to replace the characters with blanks (or
something else).

--
Richard Mueller
MVP Directory Services
Hilltop Lab - http://www.rlmueller.net
--


From: Todd Vargo on
Pegasus [MVP] wrote:
> Todd Vargo wrote:
....
> > strSearchString = Replace(strSearchString, VbLf, " ")
> > strSearchString = Replace(strSearchString, VbCr & " ", VbCrLf)
> > objFile.Write strSearchString
> > objFile.Close
> >
> > MsgBox "done."
>
> This is what I thought too but after examining the OP's sample file I
> realised that it would not work. The multi-line fields contain embedded
> CRLFs, not just LFs. In my other reply I used the double-quote initiator
to
> recognise a quoted field so that I could replace the subsequent CRLFs with
a
> space, right up to the closing double quote.

Unfortunately, I did not have a sample file from OP to work with. I created
my own .csv using Excel which embeds the LF without the CR. One of OP's
posts described the "end of line" character as a block, which is what we see
when LF is missing a preceding CR. I guess OP will just have to try it and
report back.

--
Todd Vargo
(Post questions to group only. Remove "z" to email personal messages)