From: ekkehard.horner on 8 Sep 2009 18:00 Todd Vargo schrieb: > "Odd" <audrey.nsh(a)gmail.com> wrote in message > news:8a2b8b75-480a-4315-88d4-849b5ceb3c96(a)r39g2000yqm.googlegroups.com... >> Hi everyone, >> >> I have a .csv file where certain fields have carriage returns/new line >> feeds. Because of such, I cannot properly import my file. Hence, I >> wrote the following script and while it removes the carriage returns/ >> new line feeds through the file, it ALSO removes the END OF LINE >> carriage return/line feed. So now, I have one file with one record. >> >> Basically,I need to know how to remove the carriage returns/line feeds >> in the file EXCEPT for the ones that is END OF LINE. Can somebody help >> me please? >> >> Dim strSearchString, objFSO, objFile >> Const ForReading = 1 >> Const ForWriting = 2 >> >> >> Set objFSO = CreateObject("Scripting.FileSystemObject") >> Set objFile = objFSO.OpenTextFile("d:\Test\Project.csv", ForReading) >> strSearchString = objFile.ReadAll >> objFile.Close >> >> >> objFSO.CreateTextFile "d:\Test\Project_" & DatePart("yyyy",Date) & >> Right("0" & DatePart("m",Date),2) & Right("0" & DatePart("d", Date),2) >> & ".csv" >> Set objFile = objFSO.OpenTextFile("d:\Test\Project_" & DatePart >> ("yyyy",Date) & Right("0" & DatePart("m",Date),2) & Right("0" & >> DatePart("d", Date),2) & ".csv", ForWriting) >> strSearchString = Replace(strSearchString, VbCrLf, "") >> strSearchString = Replace(strSearchString, Vbtab, "") >> >> objFile.Write strSearchString >> objFile.Close > > From your other posts it sounds like the .csv file was created by Excel and > you see blocks (in Notepad or other editor) for the embedded line feeds. > Removing the embedded line feeds is simple but then you have the lines > running into each other. I suggest replacing the the line feeds with spaces > so you wont have lines crashing together. To do this, 1) Replace all Chr(10) > with a space. 2) Replace all Chr(13) + Chr(32) with Chr(13) + Chr(10). > > Dim strSearchString, objFSO, objFile > Const ForReading = 1 > Const ForWriting = 2 > > Set objFSO = CreateObject("Scripting.FileSystemObject") > Set objFile = objFSO.OpenTextFile("d:\Test\Project.csv", ForReading) > strSearchString = objFile.ReadAll > objFile.Close > > Set objFile = objFSO.OpenTextFile("d:\Test\Project_" & DatePart > ("yyyy",Date) & Right("0" & DatePart("m",Date),2) & Right("0" & > DatePart("d", Date),2) & ".csv", ForWriting) > strSearchString = Replace(strSearchString, VbLf, " ") > strSearchString = Replace(strSearchString, VbCr & " ", VbCrLf) > objFile.Write strSearchString > objFile.Close > > MsgBox "done." > This won't work, if Pegasus' > I created my own Excel file and made one of the cells a two-liner, by > pressing Alt+Enter after the first line. When I saved the file as a CSV, > this generated a $0a in this particular field, which is not really > surprising. Each record was terminated by a $0d$0a, which is also as > expected. is correct, because the embedded line endings are *not* vbLf (Chr(10)), but vbCr (Chr(13)) To prove: Dim oFS : Set oFS = CreateObject( "Scripting.FileSystemObject" ) Dim sFSpec : sFSpec = ".\adotext\EmbeddedLineEndings.txt" Dim sFAll : sFAll = oFS.OpenTextFile( sFSpec ).ReadAll() Dim sT : sT = sFAll Dim sE : sE = sFAll WScript.Echo dumpHex( sFAll ) WScript.Echo "------------" sT = Replace( sT, VbLf, " ") sT = Replace( sT, VbCr & " ", VbCrLf) WScript.Echo dumpHex( sT ) WScript.Echo "------------" WScript.Echo "No change whatsoever:", CStr( sFAll = sT ) WScript.Echo "------------" sE = Replace( sE, vbCrLf, vbLf ) sE = Replace( sE, vbCr, "<InnerLineBreak>" ) sE = Replace( sE, vbLf, vbCrLf ) WScript.Echo dumpHex( sE ) WScript.Echo "------------" WScript.Echo sE Function dumpHex( sT ) Dim nLen : nLen = Len( sT ) ReDim aTmp( nLen - 1 ) Dim nPos For nPos = 1 To nLen aTmp( nPos - 1 ) = Right( "0" & Hex( AscB( Mid( sT, nPos, 1 ) ) ), 2 ) Next dumpHex = Join( aTmp, " " ) End Function output: === replaceLE: replace line endings =========================================== 22 49 64 22 3B 22 4E 61 6D 65 22 3B 22 52 65 6D 61 72 6B 22 3B 22 54 61 69 6C 22 0D 0A 31 3B 22 41 6C 70 68 61 22 3B 22 72 65 6D 31 0D 72 65 6D 32 22 3B 34 37 31 31 0D 0A 32 3B 22 42 65 74 61 22 3B 22 72 65 6D 33 0D 72 65 6D 34 22 3B 34 37 31 32 0D 0A ------------ 22 49 64 22 3B 22 4E 61 6D 65 22 3B 22 52 65 6D 61 72 6B 22 3B 22 54 61 69 6C 22 0D 0A 31 3B 22 41 6C 70 68 61 22 3B 22 72 65 6D 31 0D 72 65 6D 32 22 3B 34 37 31 31 0D 0A 32 3B 22 42 65 74 61 22 3B 22 72 65 6D 33 0D 72 65 6D 34 22 3B 34 37 31 32 0D 0A ------------ No change whatsoever: Wahr ------------ 22 49 64 22 3B 22 4E 61 6D 65 22 3B 22 52 65 6D 61 72 6B 22 3B 22 54 61 69 6C 22 0D 0A 31 3B 22 41 6C 70 68 61 22 3B 22 72 65 6D 31 3C 49 6E 6E 65 72 4C 69 6E 65 42 72 65 61 6B 3E 72 65 6D 32 22 3B 34 37 31 31 0D 0A 32 3B 22 42 65 74 61 22 3B 22 72 65 6D 33 3C 49 6E 6E 65 72 4C 69 6E 65 42 72 65 61 6B 3E 72 65 6D 34 22 3B 34 37 31 32 0D 0A ------------ "Id";"Name";"Remark";"Tail" 1;"Alpha";"rem1<InnerLineBreak>rem2";4711 2;"Beta";"rem3<InnerLineBreak>rem4";4712 === replaceLE: 0 done (00:00:00) ==============================================
From: Pegasus [MVP] on 8 Sep 2009 18:03 "Todd Vargo" <tlvargo(a)sbcglobal.netz> wrote in message news:eKfQJ6MMKHA.4028(a)TK2MSFTNGP05.phx.gbl... > Pegasus [MVP] wrote: >> Todd Vargo wrote: > ... >> > strSearchString = Replace(strSearchString, VbLf, " ") >> > strSearchString = Replace(strSearchString, VbCr & " ", VbCrLf) >> > objFile.Write strSearchString >> > objFile.Close >> > >> > MsgBox "done." >> >> This is what I thought too but after examining the OP's sample file I >> realised that it would not work. The multi-line fields contain embedded >> CRLFs, not just LFs. In my other reply I used the double-quote initiator > to >> recognise a quoted field so that I could replace the subsequent CRLFs >> with > a >> space, right up to the closing double quote. > > Unfortunately, I did not have a sample file from OP to work with. I > created > my own .csv using Excel which embeds the LF without the CR. One of OP's > posts described the "end of line" character as a block, which is what we > see > when LF is missing a preceding CR. I guess OP will just have to try it and > report back. > > -- > Todd Vargo > (Post questions to group only. Remove "z" to email personal messages) > The "blocks" mentioned by the OP and seen in Excel are either $95 or CRLFs. If you're interested then I'll send you his sample file.
From: Pegasus [MVP] on 8 Sep 2009 18:05 "ekkehard.horner" <ekkehard.horner(a)arcor.de> wrote in message news:4aa6d318$0$32670$9b4e6d93(a)newsspool2.arcor-online.net... > Todd Vargo schrieb: >> "Odd" <audrey.nsh(a)gmail.com> wrote in message >> news:8a2b8b75-480a-4315-88d4-849b5ceb3c96(a)r39g2000yqm.googlegroups.com... >>> Hi everyone, >>> >>> I have a .csv file where certain fields have carriage returns/new line >>> feeds. Because of such, I cannot properly import my file. Hence, I >>> wrote the following script and while it removes the carriage returns/ >>> new line feeds through the file, it ALSO removes the END OF LINE >>> carriage return/line feed. So now, I have one file with one record. >>> >>> Basically,I need to know how to remove the carriage returns/line feeds >>> in the file EXCEPT for the ones that is END OF LINE. Can somebody help >>> me please? >>> >>> Dim strSearchString, objFSO, objFile >>> Const ForReading = 1 >>> Const ForWriting = 2 >>> >>> >>> Set objFSO = CreateObject("Scripting.FileSystemObject") >>> Set objFile = objFSO.OpenTextFile("d:\Test\Project.csv", ForReading) >>> strSearchString = objFile.ReadAll >>> objFile.Close >>> >>> >>> objFSO.CreateTextFile "d:\Test\Project_" & DatePart("yyyy",Date) & >>> Right("0" & DatePart("m",Date),2) & Right("0" & DatePart("d", Date),2) >>> & ".csv" >>> Set objFile = objFSO.OpenTextFile("d:\Test\Project_" & DatePart >>> ("yyyy",Date) & Right("0" & DatePart("m",Date),2) & Right("0" & >>> DatePart("d", Date),2) & ".csv", ForWriting) >>> strSearchString = Replace(strSearchString, VbCrLf, "") >>> strSearchString = Replace(strSearchString, Vbtab, "") >>> >>> objFile.Write strSearchString >>> objFile.Close >> >> From your other posts it sounds like the .csv file was created by Excel >> and >> you see blocks (in Notepad or other editor) for the embedded line feeds. >> Removing the embedded line feeds is simple but then you have the lines >> running into each other. I suggest replacing the the line feeds with >> spaces >> so you wont have lines crashing together. To do this, 1) Replace all >> Chr(10) >> with a space. 2) Replace all Chr(13) + Chr(32) with Chr(13) + Chr(10). >> >> Dim strSearchString, objFSO, objFile >> Const ForReading = 1 >> Const ForWriting = 2 >> >> Set objFSO = CreateObject("Scripting.FileSystemObject") >> Set objFile = objFSO.OpenTextFile("d:\Test\Project.csv", ForReading) >> strSearchString = objFile.ReadAll >> objFile.Close >> >> Set objFile = objFSO.OpenTextFile("d:\Test\Project_" & DatePart >> ("yyyy",Date) & Right("0" & DatePart("m",Date),2) & Right("0" & >> DatePart("d", Date),2) & ".csv", ForWriting) >> strSearchString = Replace(strSearchString, VbLf, " ") >> strSearchString = Replace(strSearchString, VbCr & " ", VbCrLf) >> objFile.Write strSearchString >> objFile.Close >> >> MsgBox "done." >> > This won't work, if Pegasus' > > > I created my own Excel file and made one of the cells a two-liner, by > > pressing Alt+Enter after the first line. When I saved the file as a CSV, > > this generated a $0a in this particular field, which is not really > > surprising. Each record was terminated by a $0d$0a, which is also as > > expected. > > is correct, because the embedded line endings are *not* vbLf (Chr(10)), > but vbCr (Chr(13)) Unfortunately they are CRLFs. If you're interested then I'll send you the OP's sample file.
From: ekkehard.horner on 8 Sep 2009 18:10 Pegasus [MVP] schrieb: > "ekkehard.horner" <ekkehard.horner(a)arcor.de> wrote in message > news:4aa6b9de$0$32674$9b4e6d93(a)newsspool2.arcor-online.net... >> Pegasus [MVP] schrieb: >> [...] >>> The dump you posted does not really seem to be a "CSV" - it uses >>> semicolons as delimiters, not commas. In an attempt to put an end to the >>> guessing game, >> That is the German locale form of a .CSV; cross language processing may >> be cumbersome some times, but as long as you stay with your locale, >> >> PROCESSING .CSVs WITH EMBEDDED LINE-ENDINGS CAUSE NO PROBLEMS >> AT ALL AS LONG AS YOU USE THE PROPER (MICROSOFT) TOOLS > > Why all this shouting? Achtung, listen now! > > Because all these "use FSO & string replacements" trials & errors did cost the OP half a working day
From: ekkehard.horner on 8 Sep 2009 18:16
Richard Mueller [MVP] schrieb: > "ekkehard.horner" <ekkehard.horner(a)arcor.de> wrote in message [...] >> [Richard, you use ADO frequently - please help me to convince Odd >> and Pegasus that giving the text driver a try is NOT stupid] > > The one hex dump I've seen in this thread had a normal 0D 0A sequence, > equivalent to vbCrLf, but I didn't see enough to know what is going on. I > have used code similar to below to read csv files: [...] Thank you - so it seems there are at least two of us. > It would be intersting to see how code similar to above handles this file. > Still, if any of the fields has embedded characters that the OP wants to > remove, you will need to add steps to replace the characters with blanks (or > something else). [...] If Excel wrote the file, then the text driver(s) will parse it. I agree with you, that extra specs regarding 'unwanted characters' will cause extra work. |