From: Pegasus [MVP] on

"Odd" <audrey.nsh(a)gmail.com> wrote in message
news:588cae2e-5711-4423-872a-41f691f1f846(a)g23g2000yqh.googlegroups.com...
On Sep 8, 12:21 pm, Odd <audrey....(a)gmail.com> wrote:
> On Sep 8, 12:09 pm, "Pegasus [MVP]" <n...(a)microsoft.com> wrote:
>
>
>
>
>
> > "Odd" <audrey....(a)gmail.com> wrote in message
>
> >news:7c4672bc-7245-4f05-883c-dabe25f0227e(a)y21g2000yqn.googlegroups.com...
> > On Sep 8, 11:41 am, "Pegasus [MVP]" <n...(a)microsoft.com> wrote:
>
> > > "Odd" <audrey....(a)gmail.com> wrote in message
>
> > >news:8a2b8b75-480a-4315-88d4-849b5ceb3c96(a)r39g2000yqm.googlegroups.com...
>
> > > > Hi everyone,
>
> > > > I have a .csv file where certain fields have carriage returns/new
> > > > line
> > > > feeds. Because of such, I cannot properly import my file. Hence, I
> > > > wrote the following script and while it removes the carriage
> > > > returns/
> > > > new line feeds through the file, it ALSO removes the END OF LINE
> > > > carriage return/line feed. So now, I have one file with one record.
>
> > > > Basically,I need to know how to remove the carriage returns/line
> > > > feeds
> > > > in the file EXCEPT for the ones that is END OF LINE. Can somebody
> > > > help
> > > > me please?
>
> > > > Dim strSearchString, objFSO, objFile
> > > > Const ForReading = 1
> > > > Const ForWriting = 2
>
> > > > Set objFSO = CreateObject("Scripting.FileSystemObject")
> > > > Set objFile = objFSO.OpenTextFile("d:\Test\Project.csv", ForReading)
> > > > strSearchString = objFile.ReadAll
> > > > objFile.Close
>
> > > > objFSO.CreateTextFile "d:\Test\Project_" & DatePart("yyyy",Date) &
> > > > Right("0" & DatePart("m",Date),2) & Right("0" & DatePart("d",
> > > > Date),2)
> > > > & ".csv"
> > > > Set objFile = objFSO.OpenTextFile("d:\Test\Project_" & DatePart
> > > > ("yyyy",Date) & Right("0" & DatePart("m",Date),2) & Right("0" &
> > > > DatePart("d", Date),2) & ".csv", ForWriting)
> > > > strSearchString = Replace(strSearchString, VbCrLf, "")
> > > > strSearchString = Replace(strSearchString, Vbtab, "")
>
> > > > objFile.Write strSearchString
> > > > objFile.Close
>
> > > How do you distinguish a CRLF at the end of a field from a CRLF at the
> > > end
> > > of a line?- Hide quoted text -
>
> > > - Show quoted text -
>
> > Hmm...well, through trial and error. Basically, when I open the .csv
> > file in Excel, the field (on one record) that has a carriage return is
> > represented as a square and the line feed is represented as a new line
> > (in Excel, it is similar to ALT-ENTER). Then, again, when I ran my
> > script, I know that there is a CRLF at the end of the line because I
> > can see that the second line moved up to the first........sorry for
> > the explanation, but truly, that's how I figured out.
>
> > ==========
>
> > If you want your script to be robust then you must stop guessing and use
> > a
> > proper tool to examine your file structure. XVI32.exe is such a tool.
> > You
> > can download it fromhttp://www.chmaas.handshake.de. It lets you examine
> > any
> > file, binary or ASCII so that you can tell exactly where one record ends
> > and
> > the next one starts.- Hide quoted text -
>
> > - Show quoted text -
>
> Pegasus,
>
> The file is in tabular format but comma-delimited. Excel opens
> the .csv file properly. I have a starting column (Project_ID) and an
> ending column (Project_Currency_ID). The problem that I have is that I
> have a column in between the first column and the last column that has
> entries that have carriage returns and line feeds. I need to get rid
> of those...- Hide quoted text -
>
> - Show quoted text -

An example would be:

Project_ID, Project_Name, Major Assumptions, Project_Currency_ID
3, ProjectABC, SomeAssumptionsaaa, 2
SomeAssumptionsbbb
4, ProjectDEF, SomeAssumptionsccc, 2
5, ProjectGHI, SomeAssumptionsddd,2


So, if you see Project_ID = 3, it is the carriagereturn/line feed in
the Major Assumptions columns that I want to get rid of.

==============

This is getting a little confusing. A CSV is by definition a text file. It
has a CRLF at the end of each line, that's all. I don't know what tool you
used to display the above example and I do not understand it. If you're
reluctant to look at the file with a binary editor, how about posting the
first few records in your reply? You can do it like so:
1. Copy the "CSV" file to c:\CSV.CSV. Note: you *must* select an 8.3 file
name!
2. Type these commands at the Command Prompt:
debug c:\csv.csv > c:\csv.deb{Enter)
d 100 500 {Enter}
q {Enter}
(You will have to type the second and the third command blind. There is
no screen feedback!)
notepad c:\csv.deb{Enter}
3. Paste what you see into your reply.


From: ekkehard.horner on
Odd schrieb:
> Hi everyone,
>
> I have a .csv file where certain fields have carriage returns/new line
> feeds. Because of such, I cannot properly import my file.

That you can "open the .csv file in Excel" indicates that your file
is a 'good' .csv file. Then it can be processed by the proper tools/drivers
without risky/complicated custom manipulation of special characters.


Proof of concept script:

Dim oFS : Set oFS = CreateObject( "Scripting.FileSystemObject" )
Dim sDDir : sDDir = oFS.GetAbsolutePathName( ".\adotext" )
Dim sTFName : sTFName = "EmbeddedLineEndings.txt"
Dim sFSpec : sFSpec = oFS.BuildPath( sDDir, sTFName )
Dim oADOCN : Set oADOCN = CreateObject( "ADODB.Connection" )
Dim sCS
If False Then
sCS = Join( Array( _
"Provider=MSDASQL" _
, "Driver={Microsoft Text Driver (*.txt; *.csv)}" _
, "DBQ=" & sDDir ), ";" )
Else
sCS = Join( Array( _
"Provider=Microsoft.Jet.OLEDB.4.0" _
, "Data Source=" & sDDir _
, "Extended Properties=""" & Join( Array( _
"text" _
, "HDR=Yes" _
, "FMT=Delimited;""" ), ";" ) ), ";" )
End If
Dim sSQL

WScript.Echo "Working with", sFSpec

WScript.Echo "Using CS", sCS
oADOCN.Open sCS

If Not oFS.FileExists( sFSpec ) Then
WScript.Echo "Creating", sFSpec
sSQL = Join( Array( _
"CREATE TABLE [" & sTFName & "]" _
, "( Id INTEGER" _
, " , Name VARCHAR( 20 )" _
, " , Remark VARCHAR( 200 )" _
, " , Tail INTEGER" _
, ")" ), " " )
oADOCN.Execute sSQL
sSQL = "INSERT INTO [" & sTFName & "] VALUES ( 1, 'Alpha', '" & Join( Array( "rem1",
"rem2" ), vbCrLf ) & "', 4711 )"
oADOCN.Execute sSQL
WScript.Echo "Adding", sSQL
sSQL = "INSERT INTO [" & sTFName & "] VALUES ( 2, 'Beta' , '" & Join( Array( "rem3",
"rem4" ), vbCrLf ) & "', 4712 )"
oADOCN.Execute sSQL
WScript.Echo "Adding", sSQL
End If

WScript.Echo "--------------" & vbCrLf & oFS.OpenTextFile( sFSpec ).ReadAll() & vbCrLf
& "--------------"

sSQL = "SELECT * FROM [" & sTFName & "]"
WScript.Echo sSQL
Dim oRS : Set oRS = oADOCN.Execute( sSQL )
Do Until oRS.EOF
WScript.Echo "--------------"
Dim oFld
For Each oFld In oRS.Fields
WScript.Echo "---", oFld.Name & ":"
WScript.Echo oFld.Value
Next
oRS.MoveNext
Loop
oRS.Close
oADOCN.Close

output:

...
--------------
"Id";"Name";"Remark";"Tail"
1;"Alpha";"rem1
rem2";4711
2;"Beta";"rem3
rem4";4712

--------------
SELECT * FROM [EmbeddedLineEndings.txt]
--------------
--- Id:
1
--- Name:
Alpha
--- Remark:
rem1
rem2
--- Tail:
4711
--------------
--- Id:
2
--- Name:
Beta
--- Remark:
rem3
rem4
--- Tail:
4712

To work with your input file, you may have to provide a schema.ini
file in the directory the file resides in. My schema.ini contains

[EmbeddedLineEndings.txt]
ColNameHeader=True
CharacterSet=1252
Format=Delimited(;)
Col1=Id Integer
Col2=Name Char Width 20
Col3=Remark Char Width 200
Col4=Tail Integer

(this definition was generated automagically by the create table
statement; you may have to think about headers, encodings and
the field delimiter)
From: ekkehard.horner on
Pegasus [MVP] schrieb:
[...]
> This is getting a little confusing. A CSV is by definition a text file. It
> has a CRLF at the end of each line, that's all.
[...]
A .CSV file may have embedded in line endings in (properly delimited)
text fields. All Microsoft applications and tools can process such a
file without any problem.
From: Pegasus [MVP] on

"ekkehard.horner" <ekkehard.horner(a)arcor.de> wrote in message
news:4aa69868$0$31873$9b4e6d93(a)newsspool3.arcor-online.net...
> Pegasus [MVP] schrieb:
> [...]
>> This is getting a little confusing. A CSV is by definition a text file.
>> It has a CRLF at the end of each line, that's all.
> [...]
> A .CSV file may have embedded in line endings in (properly delimited)
> text fields. All Microsoft applications and tools can process such a
> file without any problem.

And what might this "in line ending" be? The OP seems a little reluctant
post it.


From: ekkehard.horner on
Pegasus [MVP] schrieb:
> "ekkehard.horner" <ekkehard.horner(a)arcor.de> wrote in message
> news:4aa69868$0$31873$9b4e6d93(a)newsspool3.arcor-online.net...
>> Pegasus [MVP] schrieb:
>> [...]
>>> This is getting a little confusing. A CSV is by definition a text file.
>>> It has a CRLF at the end of each line, that's all.
>> [...]
>> A .CSV file may have embedded in line endings in (properly delimited)
>> text fields. All Microsoft applications and tools can process such a
>> file without any problem.
>
> And what might this "in line ending" be? The OP seems a little reluctant
> post it.
>
>
It may well be vbCrLf as a dump of my demo file shows:

00000000h: 22 49 64 22 3B 22 4E 61 6D 65 22 3B 22 52 65 6D ; "Id";"Name";"Rem
00000010h: 61 72 6B 22 3B 22 54 61 69 6C 22 0D 0A 31 3B 22 ; ark";"Tail"..1;"
00000020h: 41 6C 70 68 61 22 3B 22 72 65 6D 31 0D 0A 72 65 ; Alpha";"rem1..re
00000030h: 6D 32 22 3B 34 37 31 31 0D 0A 32 3B 22 42 65 74 ; m2";4711..2;"Bet
00000040h: 61 22 3B 22 72 65 6D 33 0D 0A 72 65 6D 34 22 3B ; a";"rem3..rem4";
00000050h: 34 37 31 32 0D 0A ; 4712..

Just to be sure, I changed

sSQL = "INSERT INTO [" & sTFName & "] VALUES ( 1, 'Alpha', '" & Join( Array( "rem1",
"rem2" ), vbCrLf ) & "', 4711 )"

to

sSQL = "INSERT INTO [" & sTFName & "] VALUES ( 1, 'Alpha', '" & Join( Array( "rem1",
"rem2" ), vbLf ) & "', 4711 )"

in my demo script - no problems at all.