From: Ashley Sheridan on 14 Mar 2010 07:45
On Sun, 2010-03-14 at 12:25 +0100, Rene Veerman wrote:

> On Sun, Mar 14, 2010 at 12:24 PM, Rene Veerman <rene7705(a)gmail.com> wrote:
> >
> > I'd love to have a copy of whatever function you use to filter out bad
> > HTML/js/flash for use cases where users are allowed to enter html.
> > I'm aware of strip_tags() "allowed tags" param, but haven't got a good list
> > for it.
> >
>
> oh, and even <img> tags can be used for cookie-stuffing on many browsers..
>

Yes, and you call strip_tags() before the data goes to the browser for
display, not before it gets inserted into the database. Essentially, you
need to keep as much original information as possible.

Thanks,
Ash
http://www.ashleysheridan.co.uk

From: Jochem Maas on 14 Mar 2010 19:56
On 3/14/10 11:45 AM, Ashley Sheridan wrote:
> On Sun, 2010-03-14 at 12:25 +0100, Rene Veerman wrote:
>
>> On Sun, Mar 14, 2010 at 12:24 PM, Rene Veerman <rene7705(a)gmail.com> wrote:
>>>
>>> I'd love to have a copy of whatever function you use to filter out bad
>>> HTML/js/flash for use cases where users are allowed to enter html.
>>> I'm aware of strip_tags() "allowed tags" param, but haven't got a good list
>>> for it.
>>>
>>
>> oh, and even <img> tags can be used for cookie-stuffing on many browsers..
>>
>
> Yes, and you call strip_tags() before the data goes to the browser for
> display, not before it gets inserted into the database. Essentially, you
> need to keep as much original information as possible.

I disagree with both of you. I'm like that :)

let's assume we're talking about data that is not allowed to contain HTML.
in such cases I would do a strip_tags() on the incoming data, then compare
the output of strip_tags() to the original input ... if they don't match then
I would log the problem and refuse to input the data at all.

using strip_tags() on a piece of data every time you output it, when you know
that it shouldn't contain any tags in the first place, is a waste of resources ...
this does assume that you can trust the data source ... which in the case of a
database that you control should be the case.

at any rate, strip_tags() doesn't belong in an 'anti-sql-injection' routine as
it has nothing to do with sql injection at all.
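
The compare-and-refuse check described above could be sketched roughly as follows. This is only an illustration, assuming PHP 7 or later; the helper name is_tag_free() is made up here and does not come from the thread. Note that such a check may also refuse legitimate text that strip_tags() happens to alter, such as some uses of a literal "<".

```php
<?php
// Hypothetical helper (not from the thread): a value is accepted only if
// strip_tags() leaves it unchanged, i.e. it contains nothing tag-like.
function is_tag_free(string $input): bool
{
    return strip_tags($input) === $input;
}

$samples = ['Plain text answer', 'Hello <b>world</b>'];

foreach ($samples as $value) {
    if (is_tag_free($value)) {
        echo "accepted\n";   // safe to pass on to the storage layer
    } else {
        echo "refused\n";    // log the problem here and do not store the value
    }
}
```

The value is never silently cleaned: it is either stored exactly as submitted or not stored at all, which is what makes the stored data trustworthy later.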

From: Colin Guthrie on 15 Mar 2010 08:48
'Twas brillig, and Jochem Maas at 14/03/10 23:56 did gyre and gimble:
> On 3/14/10 11:45 AM, Ashley Sheridan wrote:
>> Yes, and you call strip_tags() before the data goes to the browser for
>> display, not before it gets inserted into the database. Essentially, you
>> need to keep as much original information as possible.
>
> I disagree with both you. I'm like that :)
>
> let's assume we're not talking about data that is allowed to contain HTML,
> in such cases I would do a strip_tags() on the incoming data then compare
> the output of strip_tags() to the original input ... if they don't match then
> I would log the problem and refuse to input the data at all.
>
> using strip_tags() on a piece of data everytime you output it if you know
> that it shouldn't contain any in the first is a waste of resources ... this
> does assume that you can trust the data source ... which in the case of a database
> that you control should be the case.

I used to think like that too, but I've relatively recently changed my
position.

While it's not as extreme an example, I used to keep data in the
database *after* I had processed it with htmlspecialchars() (not quite
the same as strip_tags(), but the principle is the same).

The issue I had was that, over time, I've found the need to output to
other formats, e.g. spreadsheets, plain text emails, PDFs etc., in
which case this pre-encoded format is a pain and I have to call
html_entity_decode() to reverse the htmlspecialchars() I did in the
first place. This is a royal pain in the bum and it's really ugly in the
code, remembering what format the data is in so that I can process it
appropriately at the right points.

Nowadays I work rather differently and always escape at the point of
output (this does not exclude filtering at the point of input too, but I
do not keep things encoded any longer - I keep it raw).

Any halfway decently designed caching layer will prevent any major
impact from escaping at the point of output anyway.

Now you could argue that encoding at the save point and reversing the
encoding when needed is still a better approach, and I won't argue too
heavily, but for the sake of my sanity I'm much happier working the way
I do now. The view layers very clearly escape everything that
needs escaping, and no "is it or is it not already escaped" logic
leaks into this layer.

(I appreciate strip_tags() and htmlspecialchars() are not the same and my
general usage may not apply to a pure strip_tags() usage.)

> at any rate, strip_tags() doesn't belong in an 'anti-sql-injection' routine as
> it has nothing to do with sql injection at all.

Indeed, it's more about XSS and CSRF rather than SQL injection.

Col

--

Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited [http://www.tribalogic.net/]
Open Source:
  Mandriva Linux Contributor [http://www.mandriva.com/]
  PulseAudio Hacker [http://www.pulseaudio.org/]
  Trac Hacker [http://trac.edgewall.org/]
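
To illustrate the "keep it raw, escape at the point of output" approach, here is a small sketch, assuming PHP 7 or later. The helper name html_escape() and the sample value are invented for the example; they are not Colin's actual code.

```php
<?php
// Raw value, as it would sit in the database with no encoding applied.
$raw = 'Tom & "Jerry" <tom@example.com>';

// Hypothetical view helper: escape for an HTML context at output time.
function html_escape(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// HTML view: the value is escaped only as it is rendered.
$html = '<p>' . html_escape($raw) . '</p>';

// Plain-text email: the raw value is used as-is, so there is nothing
// to undo with html_entity_decode().
$text = "Contact: $raw\n";

echo $html, "\n", $text;
```

Each output format applies its own escaping (or none), so the stored value never needs to carry a note about which encoding it is already in.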

From: Ashley Sheridan on 15 Mar 2010 08:49
On Mon, 2010-03-15 at 12:48 +0000, Colin Guthrie wrote:

> 'Twas brillig, and Jochem Maas at 14/03/10 23:56 did gyre and gimble:
> > let's assume we're not talking about data that is allowed to contain HTML,
> > in such cases I would do a strip_tags() on the incoming data then compare
> > the output of strip_tags() to the original input ... if they don't match then
> > I would log the problem and refuse to input the data at all.
> >
> > using strip_tags() on a piece of data everytime you output it if you know
> > that it shouldn't contain any in the first is a waste of resources ... this
> > does assume that you can trust the data source ... which in the case of a database
> > that you control should be the case.
>
> I used to think like that too, but I've relatively recently changed my
> position.
>
> While it's not as extreme an example, I used to keep data in the
> database *after* I had processed it with htmlspecialchars() (not quite
> the same as strip_tags, but the principle is the same).
>
> The issue I had was that over time, I've found the need to output to
> other formats - e.g. spread sheets, plain text emails, PDFs etc. in
> which case this pre-encoded format is a pain and I have to call
> html_entity_decode() to reverse the htmlspecialchars() I did in the
> first place. This is a royal pain in the bum and it's really ugly in the
> code, remembering what format the data is in in order to process it
> appropriately at the right points.
>
> Nowadays I work rather differently and always escape at the point of
> output (this does not exclude filtering at the point of input too, but I
> do not keep things encoded any longer - I keep it raw).
>
> Any half way decently designed caching layer will prevent any major
> impact from escaping at the point of output anyway.
>
> Now you could argue that encoding at the save point and reversing the
> encoding when needed is still a better approach and I wont argue too
> heavily, but for the sake of my sanity I'm much happier working the way
> I do now. The view layers are very clearly escaping everything that
> needs escaping and no logic for the "is it or is it not already escaped"
> leaks into this layer.
>
> (I appreciate strip tags and htmlspecialchars are not the same and my
> general usage may not apply to a pure striptags usage).
>
> > at any rate, strip_tags() doesn't belong in an 'anti-sql-injection' routine as
> > it has nothing to do with sql injection at all.
>
> Indeed, it's more about XSS and CSRF rather than SQL injection.
>
> Col

You could run the content through strip_tags() and store both copies in the
database if you're really worried about wasted resources. That way, you
keep a copy of the original data as well as the version you're most likely
going to display in a web page.

It's like the whole argument about modifying textarea content to replace
newlines with <br/> tags. At some point, you might need that content for
another use, and when you do, you'll wish you had the original. Just
because you don't see that use in your immediate future, it doesn't mean
it won't occur.

Thanks,
Ash
http://www.ashleysheridan.co.uk
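
The textarea case might look something like this in practice: the stored text keeps its original newlines, and the <br /> tags are only produced when the text is rendered as HTML. This is a sketch assuming PHP 7 or later; the function name render_comment() and the sample text are hypothetical.

```php
<?php
// Stored exactly as the user typed it into the textarea.
$stored = "First line\nSecond line with <b>markup</b>";

// Hypothetical display helper: escape first, then turn newlines into
// <br /> tags. The stored original is never modified.
function render_comment(string $stored): string
{
    return nl2br(htmlspecialchars($stored, ENT_QUOTES, 'UTF-8'));
}

echo render_comment($stored), "\n";
```

If the same text is later needed for a plain-text email or an export, $stored can be used directly, without having to strip the <br /> tags back out.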

From: Tommy Pham on 17 Mar 2010 20:18
On Sat, Mar 13, 2010 at 11:10 AM, tedd <tedd.sperling(a)gmail.com> wrote:
> Hi gang:
>
> I just completed writing a survey that has approximately 180 questions in it
> and I need a fresh look at how to store the results so I can use them later.
>
> The survey requires the responder to identify themselves via an
> authorization script. After which, the responder is permitted to take the
> survey. Everything works as the client wants so there are no problems there.
>
> My question is how to store the results?
>
> I have the answers stored in a session variable, like:
>
> $_SESSION['answer']['e1']
> $_SESSION['answer']['e2']
> $_SESSION['answer']['e2a']
> $_SESSION['answer']['e2ai']
> $_SESSION['answer']['p1']
> $_SESSION['answer']['p1a']
> $_SESSION['answer']['p1ai']
>
> and so on. As I said, there are around 180 questions/answers.
>
> Most of the answers are integers (less than 100), some are text, and some
> will be null.
>
> Each "vote" will have a unique number (i.e., time) assigned to it as well as
> a common survey id.
>
> My first thought was to simply record the "vote" as a single record with the
> answers as a long string (maybe MEDIUMTEXT), such as:
>
> 1, 1268501271, e1, 1, e2, 16, e2a, Four score and ..., e2a1, ,
>
> Then I thought I might make the data xml, such as:
>
> <survey_id>1</survey_id><vote_id>1268501271</vote_id><e1>1</e1><e2>16</e2><e2a>Four
> score and ...</e2a><e2ai></e2ai>
>
> That way I can strip text entries for <> and have absolute control over
> question separation.
>
> Then I thought I could make each question/answer combination have its own
> record while using the vote_id to tie the "vote" together. That way I can
> use MySQL to do the heavy lifting during the analysis. While each "vote"
> creates 180 records, I like this way best.
>
> Then I thought, what would you guys do? So, what would you guys do?
>
> Keep in mind that this survey must be evaluated in terms of answers, such as
> "Of the ones who answered e1 as 1 how did they answer e2?"
>
> If there is something wrong with my preference, please let me know.
>
> Thanks,
>
> tedd

Tedd,

Sorry to be jumping in late; I'm trying to migrate from yahoo mail to gmail
since I'm experiencing more problems with yahoo mail than I'd like. Anyway,
I'd go with db storage for the results since it will give better and more
flexible analysis and reporting later. Below is how I'd do the db structure:

tbl_survey_questions:
questionId = int / uid << your call
languageId = int / uid / char << your call if you intend to I18n it ;)
question = varchar << length is your requirement
PK > questionId + languageId

tbl_participants:
userId = int / uid
userName = varchar
PK > userId

tbl_answers:
userId = int / uid
questionId = int / uid
languageId = int / uid
answer = varchar / mediumtext / or another type of text field
PK > userId + questionId + languageId

The reason why I'd structure it like this is: let's say you have question 1
with 5 (a-e) multiple choices. You aggregate your query (GROUP BY) to the db
for question 1 and see how many responses there are for each of a to e. If
your survey is I18n and your DB reflects it, you can even analyze how/why
people of a certain cultural background would choose each of those answers.
(don't flame me... I know the environment comes into growing up too :p and
that's way beyond the scope of this list)

For question 2, which could be user entry (not a multiple-choice selection),
again, you see what their opinions are for that question. You get the idea
of how the rest may go. I used to do lots of reporting with the real tool,
Crystal Reports ;)

Regards,
Tommy
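
As a rough sketch of the one-row-per-answer layout and the GROUP BY analysis, the following uses PDO with an in-memory SQLite database standing in for MySQL. That substitution, the pdo_sqlite extension being available, the string question ids taken from tedd's session keys, the sample rows, and the helper names answer_counts() and cross_counts() are all assumptions made for the example, not part of the original posts.

```php
<?php
// Assumption: pdo_sqlite is available; SQLite stands in for MySQL here.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// tbl_answers as outlined above, with string question ids such as 'e1'.
$db->exec('CREATE TABLE tbl_answers (
    userId     INTEGER NOT NULL,
    questionId VARCHAR(8) NOT NULL,
    languageId INTEGER NOT NULL,
    answer     TEXT,
    PRIMARY KEY (userId, questionId, languageId)
)');

// Made-up sample rows: userId, questionId, languageId, answer.
$rows = [
    [1, 'e1', 1, '1'], [1, 'e2', 1, '16'],
    [2, 'e1', 1, '1'], [2, 'e2', 1, '16'],
    [3, 'e1', 1, '2'], [3, 'e2', 1, '40'],
    [4, 'e1', 1, '1'], [4, 'e2', 1, '7'],
];
$insert = $db->prepare('INSERT INTO tbl_answers VALUES (?, ?, ?, ?)');
foreach ($rows as $row) {
    $insert->execute($row);
}

// How many responses did each answer to one question receive?
function answer_counts(PDO $db, string $questionId): array
{
    $stmt = $db->prepare(
        'SELECT answer, COUNT(*) FROM tbl_answers
          WHERE questionId = ? GROUP BY answer'
    );
    $stmt->execute([$questionId]);
    // FETCH_KEY_PAIR maps answer => count; counts are cast to int.
    return array_map('intval', $stmt->fetchAll(PDO::FETCH_KEY_PAIR));
}

// Of those who gave answer $a1 to question $q1, how did they answer $q2?
function cross_counts(PDO $db, string $q1, string $a1, string $q2): array
{
    $stmt = $db->prepare(
        'SELECT b.answer, COUNT(*)
           FROM tbl_answers a
           JOIN tbl_answers b
             ON b.userId = a.userId AND b.languageId = a.languageId
          WHERE a.questionId = ? AND a.answer = ? AND b.questionId = ?
          GROUP BY b.answer'
    );
    $stmt->execute([$q1, $a1, $q2]);
    return array_map('intval', $stmt->fetchAll(PDO::FETCH_KEY_PAIR));
}
```

With these sample rows, answer_counts($db, 'e1') reports three responses of 1 and one response of 2, and cross_counts($db, 'e1', '1', 'e2') covers tedd's "of the ones who answered e1 as 1, how did they answer e2?" question with a self-join on the same table.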