From: Ole-Hjalmar Kristensen on 1 Sep 2009 08:11 >>>>> "GB" == Georg Bauhaus <rm.dash-bauhaus(a)futureapps.de> writes: I think the problem is not the explicit copy which happens when we hit the end of the buffer (easily verified, just increase the buffer size 10 times, the copying will happen 1/10 of the time, but execution time stays the same), but the implicit copy which seems to happen in the statement return In_Buf(Start..I-1); I do not know if this copy is unavoidable, or if the compiler could have optimized it away. The idea of using a storage pool is an interesting one, since it could allow you to avoid copying when you want to return a string from a function. But how would you overlay the string on top of the read buffer? An Ada string is not a C string, you would need to put the descriptor somewhere as well. GB> Ole-Hjalmar Kristensen schrieb: >> astra06:~/ada/div-ada> uname -a >> SunOS astra06 5.10 Generic_137137-09 sun4u sparc SUNW,Sun-Fire-V440 >> astra06:~/ada/div-ada> ls -l x.txt >> -rw------- 1 ok145024 clustra 60000000 Sep 1 12:03 x.txt >> astra06:~/ada/div-ada> time ./get_line_test < x.txt >> 0.49u 0.09s 0:00.59 98.3% >> GB> Comparing the two programs, they are on a par, almost GB> exactly. GB> However, yours is IMHO less messy. It also has GB> return Copy & Get_Line; GB> which, I guess, solves the problem of long lines, even GB> at the expense of copying. GB> Can we reduce the copying overhead? GB> The copying would be have O(f(Item'Length)) with GB> f(n) ~= s*(s+1)/2 where s = n/BUFSIZ, so it is in O(n**2). GB> We could, I think, use a storage pool: GB> - whenever copying is needed, allocate another buffer GB> right next to the current buffer. Use the newly allocated GB> buffer for the recursive call. GB> - on reaching of line/file, reinterpret the sequence of GB> buffers in the storage pool as a String. Finally, Free. GB> Can this be done? GB> (This won't make a Shootout program any smaller, but GB> seems good for the general case...) -- C++: The power, elegance and simplicity of a hand grenade.
From: Georg Bauhaus on 1 Sep 2009 10:36 Ole-Hjalmar Kristensen schrieb: >>>>>> "GB" == Georg Bauhaus <rm.dash-bauhaus(a)futureapps.de> writes: > > I think the problem is not the explicit copy which happens when we hit > the end of the buffer (easily verified, just increase the buffer size > 10 times, the copying will happen 1/10 of the time, but execution time > stays the same), but the implicit copy which seems to happen in the statement > return In_Buf(Start..I-1); I do not know if this copy is unavoidable, > or if the compiler could have optimized it away. WRT return In_Buf(Start..I-1), there seems to be a way to have the bounds shift towards 1 .. Result'Length without copying: with Ada.Text_IO; procedure Bnds is function S return String is begin return String'(10 .. 20 => '*'); end S; X : constant String := S; begin Ada.Text_IO.Put_Line(Positive'Image(X'First)); declare subtype XString is String(1 .. X'Length); Z : XString; pragma Import (Ada, Z); for Z'Address use X'Address; begin Ada.Text_IO.Put_Line(Positive'Image(Z'First)); end; end Bnds; Is this "phantom sliding" accidental or can it be relied upon? > The idea of using a storage pool is an interesting one, since it could > allow you to avoid copying when you want to return a string from a > function. But how would you overlay the string on top of the read > buffer? An Ada string is not a C string, you would need to put the > descriptor somewhere as well. I'll reread "Accessing Memory as a String", http://adapower.com/adapower1/lang/accessmem.html Maybe this leads somewhere regarding this question.
From: Georg Bauhaus on 2 Sep 2009 04:44 Ole-Hjalmar Kristensen wrote: >>>>>> "GB" == Georg Bauhaus <rm.dash-bauhaus(a)futureapps.de> writes: > > I think the problem is not the explicit copy which happens when we hit > the end of the buffer (easily verified, just increase the buffer size > 10 times, the copying will happen 1/10 of the time, but execution time > stays the same), but the implicit copy which seems to happen in the statement > return In_Buf(Start..I-1); I do not know if this copy is unavoidable, > or if the compiler could have optimized it away. It looks like copying In_Buf(Start..I-1) be circumvented. I have added a solution to Line_IO. read() performance vs. Stream_IO.Read performance seems to depend on the OS. On a Mac here, read() is faster. On Ubuntu, Stream_IO.Read is faster. There is an issue with GNATs and End_File (GNATs other than I guess the one you have). When comparing N <= 0, ``operator for type "System.Crtl.size_t" is not directly visible''. To work around this, I wrote function End_File return Boolean is type Size_T is mod 2**Standard'Address_Size; N0 : constant Size_T := Size_T(N); Zero : constant Size_T := 0; begin return N0 <= Zero; end End_File; Finally, I think Get_Line_Test can be made more portable among Unix compilers by changing GNAT specific Interfaces.C_Streams into Interfaces.C; Could be slightly less convenient to write, though?
From: Ole-Hjalmar Kristensen on 2 Sep 2009 07:07 >>>>> "GB" == Georg Bauhaus <rm.tsoh.plus-bug.bauhaus(a)maps.futureapps.de> writes: GB> Ole-Hjalmar Kristensen wrote: >>>>>>> "GB" == Georg Bauhaus <rm.dash-bauhaus(a)futureapps.de> writes: >> >> I think the problem is not the explicit copy which happens when we hit >> the end of the buffer (easily verified, just increase the buffer size >> 10 times, the copying will happen 1/10 of the time, but execution time >> stays the same), but the implicit copy which seems to happen in the statement >> return In_Buf(Start..I-1); I do not know if this copy is unavoidable, >> or if the compiler could have optimized it away. GB> It looks like copying In_Buf(Start..I-1) be circumvented. GB> I have added a solution to Line_IO. GB> read() performance vs. Stream_IO.Read performance seems GB> to depend on the OS. On a Mac here, read() is faster. GB> On Ubuntu, Stream_IO.Read is faster. Strange. I cannot really imagine how Stream_IO.Read is faster since it has to do the system call to do the actual read anyway. One possibility is that In_Buf is not properly aligned, so that read() into In_Buf is slower than the equivalent read() into the C stdio buffers. The system call should be the same in any case. GB> There is an issue with GNATs and End_File (GNATs other GB> than I guess the one you have). When comparing N <= 0, GB> ``operator for type "System.Crtl.size_t" is not GB> directly visible''. To work around this, I wrote GB> function End_File return Boolean is GB> type Size_T is mod 2**Standard'Address_Size; GB> N0 : constant Size_T := Size_T(N); GB> Zero : constant Size_T := 0; GB> begin GB> return N0 <= Zero; GB> end End_File; Yes. I was using an old version of GNAT. GB> Finally, I think Get_Line_Test can be made more portable among Unix GB> compilers by changing GNAT specific Interfaces.C_Streams GB> into Interfaces.C; Could be slightly less convenient to GB> write, though? I only included Interfaces.C_Streams because it was a convenient place to get hold of the types needed to import the read() function, so if you find them in Interfaces.C, that's just fine. -- C++: The power, elegance and simplicity of a hand grenade.
From: Ole-Hjalmar Kristensen on 3 Sep 2009 04:13
>>>>> "GB" == Georg Bauhaus <rm.tsoh.plus-bug.bauhaus(a)maps.futureapps.de> writes: GB> Ole-Hjalmar Kristensen wrote: >>>>>>> "GB" == Georg Bauhaus <rm.dash-bauhaus(a)futureapps.de> writes: >> >> I think the problem is not the explicit copy which happens when we hit >> the end of the buffer (easily verified, just increase the buffer size >> 10 times, the copying will happen 1/10 of the time, but execution time >> stays the same), but the implicit copy which seems to happen in the statement >> return In_Buf(Start..I-1); I do not know if this copy is unavoidable, >> or if the compiler could have optimized it away. GB> It looks like copying In_Buf(Start..I-1) be circumvented. GB> I have added a solution to Line_IO. How did you avoid the copying? -- C++: The power, elegance and simplicity of a hand grenade. |