From: Pegasus [MVP] on 8 Sep 2009 13:35 "Odd" <audrey.nsh(a)gmail.com> wrote in message news:588cae2e-5711-4423-872a-41f691f1f846(a)g23g2000yqh.googlegroups.com... On Sep 8, 12:21 pm, Odd <audrey....(a)gmail.com> wrote: > On Sep 8, 12:09 pm, "Pegasus [MVP]" <n...(a)microsoft.com> wrote: > > > > > > > "Odd" <audrey....(a)gmail.com> wrote in message > > >news:7c4672bc-7245-4f05-883c-dabe25f0227e(a)y21g2000yqn.googlegroups.com... > > On Sep 8, 11:41 am, "Pegasus [MVP]" <n...(a)microsoft.com> wrote: > > > > "Odd" <audrey....(a)gmail.com> wrote in message > > > >news:8a2b8b75-480a-4315-88d4-849b5ceb3c96(a)r39g2000yqm.googlegroups.com... > > > > > Hi everyone, > > > > > I have a .csv file where certain fields have carriage returns/new > > > > line > > > > feeds. Because of such, I cannot properly import my file. Hence, I > > > > wrote the following script and while it removes the carriage > > > > returns/ > > > > new line feeds through the file, it ALSO removes the END OF LINE > > > > carriage return/line feed. So now, I have one file with one record. > > > > > Basically,I need to know how to remove the carriage returns/line > > > > feeds > > > > in the file EXCEPT for the ones that is END OF LINE. Can somebody > > > > help > > > > me please? > > > > > Dim strSearchString, objFSO, objFile > > > > Const ForReading = 1 > > > > Const ForWriting = 2 > > > > > Set objFSO = CreateObject("Scripting.FileSystemObject") > > > > Set objFile = objFSO.OpenTextFile("d:\Test\Project.csv", ForReading) > > > > strSearchString = objFile.ReadAll > > > > objFile.Close > > > > > objFSO.CreateTextFile "d:\Test\Project_" & DatePart("yyyy",Date) & > > > > Right("0" & DatePart("m",Date),2) & Right("0" & DatePart("d", > > > > Date),2) > > > > & ".csv" > > > > Set objFile = objFSO.OpenTextFile("d:\Test\Project_" & DatePart > > > > ("yyyy",Date) & Right("0" & DatePart("m",Date),2) & Right("0" & > > > > DatePart("d", Date),2) & ".csv", ForWriting) > > > > strSearchString = Replace(strSearchString, VbCrLf, "") > > > > strSearchString = Replace(strSearchString, Vbtab, "") > > > > > objFile.Write strSearchString > > > > objFile.Close > > > > How do you distinguish a CRLF at the end of a field from a CRLF at the > > > end > > > of a line?- Hide quoted text - > > > > - Show quoted text - > > > Hmm...well, through trial and error. Basically, when I open the .csv > > file in Excel, the field (on one record) that has a carriage return is > > represented as a square and the line feed is represented as a new line > > (in Excel, it is similar to ALT-ENTER). Then, again, when I ran my > > script, I know that there is a CRLF at the end of the line because I > > can see that the second line moved up to the first........sorry for > > the explanation, but truly, that's how I figured out. > > > ========== > > > If you want your script to be robust then you must stop guessing and use > > a > > proper tool to examine your file structure. XVI32.exe is such a tool. > > You > > can download it fromhttp://www.chmaas.handshake.de. It lets you examine > > any > > file, binary or ASCII so that you can tell exactly where one record ends > > and > > the next one starts.- Hide quoted text - > > > - Show quoted text - > > Pegasus, > > The file is in tabular format but comma-delimited. Excel opens > the .csv file properly. I have a starting column (Project_ID) and an > ending column (Project_Currency_ID). The problem that I have is that I > have a column in between the first column and the last column that has > entries that have carriage returns and line feeds. I need to get rid > of those...- Hide quoted text - > > - Show quoted text - An example would be: Project_ID, Project_Name, Major Assumptions, Project_Currency_ID 3, ProjectABC, SomeAssumptionsaaa, 2 SomeAssumptionsbbb 4, ProjectDEF, SomeAssumptionsccc, 2 5, ProjectGHI, SomeAssumptionsddd,2 So, if you see Project_ID = 3, it is the carriagereturn/line feed in the Major Assumptions columns that I want to get rid of. ============== This is getting a little confusing. A CSV is by definition a text file. It has a CRLF at the end of each line, that's all. I don't know what tool you used to display the above example and I do not understand it. If you're reluctant to look at the file with a binary editor, how about posting the first few records in your reply? You can do it like so: 1. Copy the "CSV" file to c:\CSV.CSV. Note: you *must* select an 8.3 file name! 2. Type these commands at the Command Prompt: debug c:\csv.csv > c:\csv.deb{Enter) d 100 500 {Enter} q {Enter} (You will have to type the second and the third command blind. There is no screen feedback!) notepad c:\csv.deb{Enter} 3. Paste what you see into your reply.
From: ekkehard.horner on 8 Sep 2009 13:45 Odd schrieb: > Hi everyone, > > I have a .csv file where certain fields have carriage returns/new line > feeds. Because of such, I cannot properly import my file. That you can "open the .csv file in Excel" indicates that your file is a 'good' .csv file. Then it can be processed by the proper tools/drivers without risky/complicated custom manipulation of special characters. Proof of concept script: Dim oFS : Set oFS = CreateObject( "Scripting.FileSystemObject" ) Dim sDDir : sDDir = oFS.GetAbsolutePathName( ".\adotext" ) Dim sTFName : sTFName = "EmbeddedLineEndings.txt" Dim sFSpec : sFSpec = oFS.BuildPath( sDDir, sTFName ) Dim oADOCN : Set oADOCN = CreateObject( "ADODB.Connection" ) Dim sCS If False Then sCS = Join( Array( _ "Provider=MSDASQL" _ , "Driver={Microsoft Text Driver (*.txt; *.csv)}" _ , "DBQ=" & sDDir ), ";" ) Else sCS = Join( Array( _ "Provider=Microsoft.Jet.OLEDB.4.0" _ , "Data Source=" & sDDir _ , "Extended Properties=""" & Join( Array( _ "text" _ , "HDR=Yes" _ , "FMT=Delimited;""" ), ";" ) ), ";" ) End If Dim sSQL WScript.Echo "Working with", sFSpec WScript.Echo "Using CS", sCS oADOCN.Open sCS If Not oFS.FileExists( sFSpec ) Then WScript.Echo "Creating", sFSpec sSQL = Join( Array( _ "CREATE TABLE [" & sTFName & "]" _ , "( Id INTEGER" _ , " , Name VARCHAR( 20 )" _ , " , Remark VARCHAR( 200 )" _ , " , Tail INTEGER" _ , ")" ), " " ) oADOCN.Execute sSQL sSQL = "INSERT INTO [" & sTFName & "] VALUES ( 1, 'Alpha', '" & Join( Array( "rem1", "rem2" ), vbCrLf ) & "', 4711 )" oADOCN.Execute sSQL WScript.Echo "Adding", sSQL sSQL = "INSERT INTO [" & sTFName & "] VALUES ( 2, 'Beta' , '" & Join( Array( "rem3", "rem4" ), vbCrLf ) & "', 4712 )" oADOCN.Execute sSQL WScript.Echo "Adding", sSQL End If WScript.Echo "--------------" & vbCrLf & oFS.OpenTextFile( sFSpec ).ReadAll() & vbCrLf & "--------------" sSQL = "SELECT * FROM [" & sTFName & "]" WScript.Echo sSQL Dim oRS : Set oRS = oADOCN.Execute( sSQL ) Do Until oRS.EOF WScript.Echo "--------------" Dim oFld For Each oFld In oRS.Fields WScript.Echo "---", oFld.Name & ":" WScript.Echo oFld.Value Next oRS.MoveNext Loop oRS.Close oADOCN.Close output: ... -------------- "Id";"Name";"Remark";"Tail" 1;"Alpha";"rem1 rem2";4711 2;"Beta";"rem3 rem4";4712 -------------- SELECT * FROM [EmbeddedLineEndings.txt] -------------- --- Id: 1 --- Name: Alpha --- Remark: rem1 rem2 --- Tail: 4711 -------------- --- Id: 2 --- Name: Beta --- Remark: rem3 rem4 --- Tail: 4712 To work with your input file, you may have to provide a schema.ini file in the directory the file resides in. My schema.ini contains [EmbeddedLineEndings.txt] ColNameHeader=True CharacterSet=1252 Format=Delimited(;) Col1=Id Integer Col2=Name Char Width 20 Col3=Remark Char Width 200 Col4=Tail Integer (this definition was generated automagically by the create table statement; you may have to think about headers, encodings and the field delimiter)
From: ekkehard.horner on 8 Sep 2009 13:49 Pegasus [MVP] schrieb: [...] > This is getting a little confusing. A CSV is by definition a text file. It > has a CRLF at the end of each line, that's all. [...] A .CSV file may have embedded in line endings in (properly delimited) text fields. All Microsoft applications and tools can process such a file without any problem.
From: Pegasus [MVP] on 8 Sep 2009 13:52 "ekkehard.horner" <ekkehard.horner(a)arcor.de> wrote in message news:4aa69868$0$31873$9b4e6d93(a)newsspool3.arcor-online.net... > Pegasus [MVP] schrieb: > [...] >> This is getting a little confusing. A CSV is by definition a text file. >> It has a CRLF at the end of each line, that's all. > [...] > A .CSV file may have embedded in line endings in (properly delimited) > text fields. All Microsoft applications and tools can process such a > file without any problem. And what might this "in line ending" be? The OP seems a little reluctant post it.
From: ekkehard.horner on 8 Sep 2009 14:11
Pegasus [MVP] schrieb: > "ekkehard.horner" <ekkehard.horner(a)arcor.de> wrote in message > news:4aa69868$0$31873$9b4e6d93(a)newsspool3.arcor-online.net... >> Pegasus [MVP] schrieb: >> [...] >>> This is getting a little confusing. A CSV is by definition a text file. >>> It has a CRLF at the end of each line, that's all. >> [...] >> A .CSV file may have embedded in line endings in (properly delimited) >> text fields. All Microsoft applications and tools can process such a >> file without any problem. > > And what might this "in line ending" be? The OP seems a little reluctant > post it. > > It may well be vbCrLf as a dump of my demo file shows: 00000000h: 22 49 64 22 3B 22 4E 61 6D 65 22 3B 22 52 65 6D ; "Id";"Name";"Rem 00000010h: 61 72 6B 22 3B 22 54 61 69 6C 22 0D 0A 31 3B 22 ; ark";"Tail"..1;" 00000020h: 41 6C 70 68 61 22 3B 22 72 65 6D 31 0D 0A 72 65 ; Alpha";"rem1..re 00000030h: 6D 32 22 3B 34 37 31 31 0D 0A 32 3B 22 42 65 74 ; m2";4711..2;"Bet 00000040h: 61 22 3B 22 72 65 6D 33 0D 0A 72 65 6D 34 22 3B ; a";"rem3..rem4"; 00000050h: 34 37 31 32 0D 0A ; 4712.. Just to be sure, I changed sSQL = "INSERT INTO [" & sTFName & "] VALUES ( 1, 'Alpha', '" & Join( Array( "rem1", "rem2" ), vbCrLf ) & "', 4711 )" to sSQL = "INSERT INTO [" & sTFName & "] VALUES ( 1, 'Alpha', '" & Join( Array( "rem1", "rem2" ), vbLf ) & "', 4711 )" in my demo script - no problems at all. |