From: Zach Beane on 1 Mar 2010 16:21

ccc31807 <cartercc(a)gmail.com> writes:

> On Mar 1, 1:46 pm, Zach Beane <x...(a)xach.com> wrote:
>> This is a terrible pattern for email addresses.
>>
>> What would you do if you found a false negative or false positive? Would
>> you patch it up to be good enough again (until the next problem), or
>> would you try to get it right?
>
> Depends on what your need is. If you are validating an HTML form
> submission, all you may care about is three groups of letters divided
> by an '@' and a '.'.

I'm wondering about the specific regular expression you offered. It rejects a large class of valid email addresses. In what situation would you use a regular expression like that? If you used it in a situation where rejecting valid email addresses caused a problem, how would you fix it?

Zach
From: ccc31807 on 1 Mar 2010 16:53

On Mar 1, 4:21 pm, Zach Beane <x...(a)xach.com> wrote:

> I'm wondering about the specific regular expression you offered. It
> rejects a large class of valid email addresses. In what situation would
> you use a regular expression like that? If you used it in a situation
> where rejecting valid email addresses caused a problem, how would you
> fix it?

I apologize for the confusion. This is something I made up on the fly, untested, and not used in code, at least by me. I just wanted to show something 'like' I would use without finding an actual example in code.

Probably the most common task in my job is to take apart a datafile wherein some of the values are email addresses. I know the values contained in the database from past experience, so I can get away with testing for the @ symbol for an email address, and this is sufficient.

Actually, this file contains a lot of information, in a particular order, with multiple telephone numbers, email addresses, terms, and restrictions. I break up the values with REs like these:

=~ /@/ -- it's an email address
=~ /\// -- it's a term (terms contain the solidus)
=~ /\d{3}.?\d{3}.?\d{4}/ -- it's a telephone number
=~ /[A-Z]{3,4}/ -- it's a restriction (contains 3 or 4 UC chars)

If the token doesn't match any of these, it's an error, and this is ALWAYS true. This works because I know the data values; it would not work in another application.

There are multiple ways to do a job like this, for example, using index() would work for some values. I use REs because they are simple, uncomplicated, and easy.

I am also well aware that people who have not seen REs before are mystified, and to be quite honest with you, this is part of their charm. Some of the comments in this thread about the insanity of Perl have a reverse psychological effect, in that the more horrible they say Perl is, the more it makes me want to rub it in their faces.
And to be totally honest, I find this to be part of the charm of Lisp as well. I am a graduate student in Software Engineering at a large public university, and sometimes use Lisp for projects and assignments. I enjoy some of the comments from some of the professors (and I mean well published and respected names) that "Lisp programmers should be shot," or "Lisp should be made illegal."

The fact that my first exposure to a Lisp program produced in me a feeling that Lisp was absolutely opaque also made me want to learn Lisp, just as REs did. It also contributed to my interest in Perl (as "unintelligible line noise"). Maybe those things that people complain about the loudest are the most valuable.

Anyway, I have found both Perl and REs understandable and practical, and I trust that I'll have the same experience with Lisp.

CC.
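
The token classification ccc31807 describes (test each token against a short series of REs and treat anything unmatched as an error) could be sketched roughly as follows. This is a Python rendering, not the Perl the post refers to; the name `classify`, the order in which the tests are tried, and the sample tokens are assumptions made for illustration and do not come from the actual datafile.

```python
import re

# Patterns copied from the post. Perl's =~ is unanchored, so re.search is
# used here to get the same "occurs anywhere in the token" behaviour.
# The order of the tests is an assumption: it follows the order in the post.
RULES = [
    ("email", re.compile(r"@")),
    ("term", re.compile(r"/")),
    ("telephone", re.compile(r"\d{3}.?\d{3}.?\d{4}")),
    ("restriction", re.compile(r"[A-Z]{3,4}")),
]


def classify(token):
    """Return the label of the first rule whose pattern occurs in token."""
    for label, pattern in RULES:
        if pattern.search(token):
            return label
    # The post treats a token that matches nothing as an error.
    raise ValueError(f"unrecognized token: {token!r}")


if __name__ == "__main__":
    # Hypothetical sample tokens, one per category.
    for token in ["someone@example.com", "2010/SP", "555-123-4567", "HOLD"]:
        print(token, "->", classify(token))
```

Because every test is an unanchored search, the result depends on the order of the rules and on knowing the data in advance, which is the caveat the post itself makes.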
From: Tim Bradshaw on 1 Mar 2010 18:06

On 2010-03-01 19:17:35 +0000, Ron Garret said:

> The worst part is that the design of Perl makes it nearly impossible to
> figure out what a piece of code does unless you're already intimately
> familiar with the language. Consider the above three lines of code, and
> suppose you didn't know what they did. How would you find out? What
> would you look up?

Like English
From: Tim Bradshaw on 1 Mar 2010 18:15

On 2010-03-01 18:24:28 +0000, ccc31807 said:

> Personally, I use something like /[\w.-]+@[\w]+\.[\w]{2,4}/
> which matches as follows:
> - at least one alphanumeric character, dot, or dash
> - exactly one "@"
> - at least one alphanumeric character
> - exactly one "."
> - from two to four alphanumeric characters
> and fits into the category of "good enough"

People who do this sort of thing should just be prevented from programming. There's a standard (which is still essentially RFC822) for what is a valid mail address. I repeatedly run into systems which have implemented their own, deficient, parser (such as yours) and reject perfectly valid addresses. It is emphatically not "good enough".
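
To make the objection concrete, here is a small Python sketch (not Perl, and not code from the thread) that runs the quoted pattern against a few syntactically valid addresses. The sample addresses and the helper names `accepts_unanchored` and `accepts_anchored` are made up for illustration. Both an unanchored search (how a bare Perl `=~` behaves) and a whole-string match are shown, since the post does not say which was intended.

```python
import re

# The pattern quoted in the post, unchanged.
PATTERN = re.compile(r"[\w.-]+@[\w]+\.[\w]{2,4}")


def accepts_unanchored(address):
    """True if the pattern occurs anywhere in the address (like Perl's =~)."""
    return PATTERN.search(address) is not None


def accepts_anchored(address):
    """True only if the pattern matches the whole address."""
    return PATTERN.fullmatch(address) is not None


# Hypothetical but syntactically valid addresses.
SAMPLES = [
    "user@example.com",
    "user@my-domain.com",     # hyphen in the domain
    "user+tag@example.com",   # plus sign in the local part
    "user@mail.example.com",  # more than one dot after the "@"
    "user@example.museum",    # top-level domain longer than four letters
]

if __name__ == "__main__":
    for address in SAMPLES:
        print(
            f"{address:24} "
            f"unanchored={accepts_unanchored(address)} "
            f"anchored={accepts_anchored(address)}"
        )
```

The hyphenated domain is rejected either way. The plus-sign, subdomain, and long top-level-domain cases pass the unanchored search only because it matches a fragment of the address; once the pattern has to cover the whole string they are rejected too.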
From: Kaz Kylheku on 1 Mar 2010 20:09

On 2010-03-01, ccc31807 <cartercc(a)gmail.com> wrote:

> On Feb 27, 3:08 pm, Ron Garret <rNOSPA...(a)flownet.com> wrote:
>> It boggles my mind that the same people who
>> complain about the aesthetics (or lack thereof) of parens in
>> S-expressions will accept something like this:
>>
>> (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\
>> x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e
>> -\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*
>> [a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|
>> 2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e
>> -\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
>
> I'd like to offer some perspective to this, without insulting or
> casting aspersions. FIRST, [ ... ] SECOND, [ ... ] SIXTH, [ ... ]

Ronnie pushed some buttons here, tee hee.