From: Nathan Rixham on
Anshul Agrawal wrote:
> On Tue, Mar 30, 2010 at 8:41 PM, Jan G.B. <ro0ot.w00t(a)googlemail.com> wrote:
>
>> 2010/3/30 Nathan Rixham <nrixham(a)gmail.com>:
>>> Jan G.B. wrote:
>>>> 2010/3/29 Nathan Rixham <nrixham(a)gmail.com>
>>>>
>>>>> Jan G.B. wrote:
>>>>>> 2010/3/29 Nathan Rixham <nrixham(a)gmail.com>
>>>>>>
>>>>>>> Jan G.B. wrote:
>>>>>>>> Top posting sucks, so I'll answer the post somewhere down there.
>>>>>>>> <SCNR>
>>>>>>>>
>>>>>>>> 2010/3/29 Devendra Jadhav <devendra.in(a)gmail.com>
>>>>>>>>
>>>>>>>>> Then you can do file_get_contents within PHP. or any file handling
>>>>>>>>> mechanism.
>>>>>>>>>>> On Mon, Mar 29, 2010 at 1:00 AM, ebhakt <im(a)ebhakt.com> wrote:
>>>>>>>>>>>> Hi
>>>>>>>>>>>> i am writing a web application in php
>>>>>>>>>>>> this webapp primarily focuses on file uploads and downloads
>>>>>>>>>>>> the uploaded files will be saved in a folder which is not in document root
>>>>>>>>>>>> and my query is how will i be able to provide download to such files not
>>>>>>>>>>>> located in document root via php
>>>>>>>>>>>>
>>>>>>>> Try something like that
>>>>>>>> <?php
>>>>>>>> $content = file_get_contents($filename);
>>>>>>>> $etag = md5($content);
>>>>>>>> header('Last-Modified: '.gmdate('D, d M Y H:i:s', filemtime($filename)).' GMT');
>>>>>>>> header('ETag: "'.$etag.'"'); // ETag values are quoted strings
>>>>>>>> header('Accept-Ranges: bytes');
>>>>>>>> header('Content-Length: '.strlen($content));
>>>>>>>> header('Cache-Control: '.$cache_value); // you decide
>>>>>>>> header('Content-type: '.$should_be_set);
>>>>>>>> echo $content;
>>>>>>>> exit;
>>>>>>>> ?>
>>>>>>>>
>>>>>>>> Depending on the $filesize, you should use something else than
>>>>>>>> file_get_contents() (for example fopen/fread). file_get_contents on a huge
>>>>>>>> file will exhaust your webserver's RAM.
>>>>>>> Yup, so you can map the <Directory /path/to> in web server config; then
>>>>>>> "allow from" only from localhost + yourdomain. This means you can then
>>>>>>> request it like an url and do a head request to get the etag etc then
>>>>>>> return a 304 not modified if you received a matching etag Last-Modified
>>>>>>> etc; (thus meaning you only file_get_contents when really really needed).
>>>>>>> I'd advise against saying you Accept-Ranges bytes if you don't accept
>>>>>>> byte ranges (ie you aren't going to send little bits of the file).
>>>>>>>
>>>>>>> If you need the downloads to be secure only; then you could easily
>>>>>>> negate php all together and simply expose the directory via a location
>>>>>>> so that it is web accessible and set it up to ask for "auth" using
>>>>>>> htpasswd; a custom script, ldap or whatever.
>>>>>>>
>>>>>>> And if you don't need security then why have php involved at all? simply
>>>>>>> symlink to the directory or expose it via http and be done with the
>>>>>>> problem in a minute or two.
>>>>>>>
>>>>>>> Regards!
>>>>>>>
>>>>>> In my opinion, serving user-content on a productive server is wicked sick.
>>>>>> You don't want your visitors to upload malicious files that may trigger some
>>>>>> modules such as mod_php in apache. So it makes sense to store user-uploads
>>>>>> outside of a docroot and with no symlink whatsoever.
>>>>> even the simplest of server configurations will ensure safety. just use
>>>>> .htaccess to SetHandler default-handler which treats everything as
>>>>> static content and serves it right up.
>>>>>
>>>> Yes. But the average persons posting here aren't server config gods, I
>>>> believe.
>>>> Also, you can not implement permissions on these files.
>>>> The discussion was about serving files from a place outside any docroot!
>>>> Guess there is a reason for that.
>>>>
>>>>
>>>>>> One more thing added: your RAM will be exhausted even if you open that 600mb
>>>>>> file just once.
>>>>>> Apache's memory handling is a bit weird: if *one* apache process is using
>>>>>> 200mb RAM on *one* impression because your application uses that much, then
>>>>>> that process will not release the memory while it's serving another 1000
>>>>>> requests for `clear.gif` which is maybe 850b in size.
>>>>> again everything depends on how you have your server configured; you can
>>>>> easily tell apache to kill each child after one run or a whole host of
>>>>> other configs; but ultimately if you can avoid opening up that file in
>>>>> php then do; serving statically as above is the cleanest quickest way to
>>>>> do it (other than using s3 or similar).
>>>>>
>>>>> regards!
>>>>>
>>>> Sure, you could configure your apache like that. Unless you have some
>>>> traffic on your site, because the time intensive thing for apache is to
>>>> spawn new processes. So it's just not a good idea to do that, nor to serve
>>>> big files via file_get_contents.
>>> was only addressing an issue you pointed out.. anyways.. so you propose
>>> what exactly? don't serve via apache, don't use file_get_contents
>>> instead do..?
>>>
>>> ps you do realise that virtually every "huge" file on the net is served
>>> via a web server w/o problems yeah?
>>>
>>>
>> I was recommending other file methods like fopen() combinations,
>> fpassthru() and at best readfile(). All of them do not buffer the
>> whole file in memory.
>>
>> http://php.net/readfile
>> http://php.net/fpassthru
>>
>> Regards
>>
>>
>>
> --
>> PHP General Mailing List (http://www.php.net/)
>> To unsubscribe, visit: http://www.php.net/unsub.php
>>
>>
> I wanted to see the diff between the memory usage of following three methods
> in PHP.
> 1. readfile
> 2. fopen followed by fpassthru, and
> 3. file_get_contents
>
> Using xdebug trace, all three of them gave same number. With
> memory_get_peak_usage(true) file_get_contents took double the space. (file
> being tested was mere 4mb in size)
>
> Unable to decide what is the best way to profile such methods. Can anybody
> suggest?


Do it with a huge file and watch top or suchlike; you'll note that
readfile doesn't affect memory whereas file_get_contents does;
fpassthru needs a couple of extra calls (fopen, fclose) but that's a
marginal hit.
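
A rough way to compare the three methods from inside PHP, as a sketch only:
the helper name peak_after_sending() and the script name measure.php are made
up for illustration, and the type declarations assume PHP 7 or later. The peak
figure is process-wide and only ever grows, so each method should be measured
in its own fresh process, and watching top alongside is still worthwhile.

```php
<?php
// Hypothetical helper (not from the thread): send $path with one of the three
// methods discussed above and return the peak memory PHP reports afterwards.
function peak_after_sending(string $method, string $path): int
{
    switch ($method) {
        case 'readfile':
            // Streams the file straight to the output.
            readfile($path);
            break;
        case 'fpassthru':
            $handle = fopen($path, 'rb');
            if ($handle === false) {
                throw new RuntimeException("Cannot open $path");
            }
            fpassthru($handle);
            fclose($handle);
            break;
        case 'file_get_contents':
            // Loads the whole file into a PHP string before echoing it.
            echo file_get_contents($path);
            break;
        default:
            throw new InvalidArgumentException("Unknown method: $method");
    }

    // true = memory actually allocated from the system, in bytes.
    return memory_get_peak_usage(true);
}

// Assumed command-line usage, one method per run:
//   php measure.php readfile /path/to/big.file > /dev/null
if (PHP_SAPI === 'cli' && isset($argv[1], $argv[2])) {
    $peak = peak_after_sending($argv[1], $argv[2]);
    // Report on stderr so the number is not mixed into the file output.
    fwrite(STDERR, $argv[1] . ': ' . $peak . " bytes peak\n");
}
```

This only shows what PHP itself allocates; memory used by the web server
process around PHP will not appear in this number.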

regards!
From: Anshul Agrawal on
On Wed, Mar 31, 2010 at 1:12 AM, Nathan Rixham <nrixham(a)gmail.com> wrote:

> Do it with a huge file and watch top or suchlike; you'll note that
> readfile doesn't affect memory whereas file_get_contents does;
> fpassthru needs a couple of extra calls (fopen, fclose) but that's a
> marginal hit.
>
> regards!
>

Somehow the max memory usage reported by the system for readfile and
fpassthru is double the file size, which in turn is more than the memory
limit allowed in PHP.
For file_get_contents, PHP fails with a memory-limit-exceeded error.

It seems that when the file data is handed over to Apache, the Apache buffer
is what eats up the memory and shows up in the process list. (I am using
Windows, by the way.)
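
One thing that could be tried to narrow this down, offered as a sketch and
not a confirmed fix: close PHP's own output buffers first and hand the file
over in small chunks, flushing after each one. The helper name
send_in_chunks() and the 64 KB default chunk size are assumptions for
illustration (PHP 7+ syntax). If memory still grows to the file size with
this, the buffering is happening outside PHP.

```php
<?php
// Hypothetical helper (not from the thread): stream a file in small chunks
// after closing PHP-level output buffers. Returns the number of bytes sent.
function send_in_chunks(string $path, int $chunkSize = 65536): int
{
    // Close every removable PHP output buffer so chunks are not piled up in PHP.
    while (ob_get_level() > 0 && @ob_end_flush()) {
        // keep closing nested buffers
    }

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $sent = 0;
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        echo $chunk;
        flush(); // ask PHP to pass the chunk on to the web server now
        $sent += strlen($chunk);
    }
    fclose($handle);

    return $sent;
}

// Assumed usage inside a download script, after the headers have been sent:
//   send_in_chunks($filename);
```

Only one chunk is held in a PHP variable at a time, so PHP's memory limit
should not be reached by the file size alone.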

Thanks for help,
Anshul
From: Tommy Pham on
On Wed, Mar 31, 2010 at 12:43 AM, Anshul Agrawal <drinknderive(a)gmail.com> wrote:
> Somehow the max memory usage reported by the system for readfile and
> fpassthru is double the file size, which in turn is more than the memory
> limit allowed in PHP.
> For file_get_contents, PHP fails with a memory-limit-exceeded error.
>
> It seems when the file data is handed over to Apache, apache buffer is what
> is eating up the memory space and reflected in process list. (I am using
> Windows by the way)
>
> Thanks for help,
> Anshul
>

Have you read this? http://httpd.apache.org/docs/2.1/caching.html
NOTE: Link implies version 2.1 but doc is for version 2.2.
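
On the PHP side of caching, the 304 idea quoted earlier in the thread can be
sketched without reading the file at all. This is only an illustration: the
helper names file_etag() and is_not_modified() are made up, the ETag is built
from file metadata instead of md5() of the content (an assumption that trades
exactness for not loading the file), and the syntax assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: build a quoted ETag from path, mtime and size,
// so the file content never has to be loaded into memory.
function file_etag(string $path): string
{
    return '"' . md5($path . '|' . filemtime($path) . '|' . filesize($path)) . '"';
}

// Hypothetical helper: true when the client's If-None-Match header value
// lists our ETag (weak "W/" prefixes are ignored, "*" matches anything).
function is_not_modified(string $etag, ?string $ifNoneMatch): bool
{
    if ($ifNoneMatch === null) {
        return false;
    }
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === '*' || $candidate === $etag) {
            return true;
        }
    }
    return false;
}

// Assumed usage in a download script:
//   $etag = file_etag($filename);
//   header('ETag: ' . $etag);
//   if (is_not_modified($etag, $_SERVER['HTTP_IF_NONE_MATCH'] ?? null)) {
//       http_response_code(304);
//       exit;
//   }
//   readfile($filename);
```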