From: avilella on 22 Apr 2010 11:00

Hi,

I am looking for a neat way of trying a match of a series of tokens to another string. E.g.:

    $tg1 = "abdcadbcdadcbacbacbadbdcadbcbdcdcbcadabadbcadbc";
    $qy1 = "abdca dadcbacb dbdcadbc cbcad dbcadbc";

Because $qy1 contains the characters in $tg1, I want the match to be true. Whereas:

    $tg1 = "abdcadbcdadcbacbacbadbdcadbcbdcdcbcadabadbcadbc";
    $qy2 = "abdca dadcbacb aaaaaaaa cbcad dbcadbc";

Now $qy2 has a middle token that is not compatible with $tg1, so the match should be false.

Any suggestions?

Cheers,

Albert.
From: J. Gleixner on 22 Apr 2010 11:41

avilella wrote:
> I am looking for a neat way of trying a match of a series of tokens to
> another string.
> [...]
> Any suggestions?

Use a regular expression, instead of spaces, in $qy1. You could use ".*" or '.'.

    perldoc perlre
    perldoc perlop

    ....
    m/PATTERN/msixogc
    /PATTERN/msixogc
        Searches a string for a pattern match, and in scalar context
    ....
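The suggestion above could be sketched roughly as follows. This is only one reading of it: the helper name tokens_match_in_order is made up for illustration, and the sketch assumes the tokens must appear in the target in the same order as in the query, without overlapping.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): true if every
# whitespace-separated token of $query occurs in $target, in order.
sub tokens_match_in_order {
    my ($target, $query) = @_;

    # split ' ' splits on runs of whitespace and skips leading whitespace
    my @tokens = split ' ', $query;

    # quotemeta keeps any regex metacharacters in a token literal;
    # the tokens are then glued together with ".*" instead of spaces
    my $pattern = join '.*', map { quotemeta($_) } @tokens;

    return $target =~ /$pattern/s ? 1 : 0;
}

my $tg1 = "abdcadbcdadcbacbacbadbdcadbcbdcdcbcadabadbcadbc";
my $qy1 = "abdca dadcbacb dbdcadbc cbcad dbcadbc";
my $qy2 = "abdca dadcbacb aaaaaaaa cbcad dbcadbc";

for my $query ($qy1, $qy2) {
    printf "%-40s %s\n", "'$query'",
        tokens_match_in_order($tg1, $query) ? "match" : "no match";
}
```

If the order of the tokens should not matter, a single joined pattern is not enough and each token would have to be tested separately.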
From: Dilbert on 22 Apr 2010 11:51

On 22 Apr, 17:00, avilella <avile...(a)gmail.com> wrote:
> I am looking for a neat way of trying a match of a series of tokens to
> another string.
> [...]
> Any suggestions?

One way to look at this problem is through "Algorithm::Diff" glasses:

    use strict;
    use warnings;
    use Algorithm::Diff qw(sdiff);

    my $tg1 = "abdcadbcdadcbacbacbadbdcadbcbdcdcbcadabadbcadbc";
    my $qy1 = "abdca dadcbacb dbdcadbc cbcad dbcadbc";
    my $qy2 = "abdca dadcbacb aaaaaaaa cbcad dbcadbc";

    print "case-a: first string : '$tg1'\n";
    print "case-a: second string : '$qy1'\n";
    print "case-a: degree of diff : ", degree_of_difference($tg1, $qy1), "\n";
    print "\n";

    print "case-b: first string : '$tg1'\n";
    print "case-b: second string : '$qy2'\n";
    print "case-b: degree of diff : ", degree_of_difference($tg1, $qy2), "\n";
    print "\n";

    sub degree_of_difference {
        my ($string_x, $string_y) = @_;

        # strip all whitespace from both (copied) strings
        s{\s}''xmsg for $string_x, $string_y;

        # the longest string always comes first:
        if (length($string_x) < length($string_y)) {
            my $temp = $string_x; $string_x = $string_y; $string_y = $temp;
        }

        # compare the two strings character by character
        my @chain_x = split m{}xms, $string_x;
        my @chain_y = split m{}xms, $string_y;

        my @sd = sdiff(\@chain_x, \@chain_y);

        # count each kind of sdiff entry ("= () =" forces list context)
        my $inserts   = () = grep {$_->[0] eq '+'} @sd;
        my $deletes   = () = grep {$_->[0] eq '-'} @sd;
        my $changes   = () = grep {$_->[0] eq 'c'} @sd;
        my $unchanged = () = grep {$_->[0] eq 'u'} @sd;

        # only inserts and changes count as a difference
        $inserts + $changes;
    }

The output is:

    case-a: first string : 'abdcadbcdadcbacbacbadbdcadbcbdcdcbcadabadbcadbc'
    case-a: second string : 'abdca dadcbacb dbdcadbc cbcad dbcadbc'
    case-a: degree of diff : 0

    case-b: first string : 'abdcadbcdadcbacbacbadbdcadbcbdcdcbcadabadbcadbc'
    case-b: second string : 'abdca dadcbacb aaaaaaaa cbcad dbcadbc'
    case-b: degree of diff : 5

One could argue that "degree-of-diff" = 0 in case-a implies that the match is true. By the same argument, "degree-of-diff" = 5 in case-b implies that the match is false.

This is only one way to look at the problem; I am sure there are many others.
From: sln on 22 Apr 2010 11:52

On Thu, 22 Apr 2010 08:00:46 -0700 (PDT), avilella <avilella(a)gmail.com> wrote:
>I am looking for a neat way of trying a match of a series of tokens to
>another string.
>[...]
>Any suggestions?

You could use index if the tokens are constant.

    use strict;
    use warnings;

    my $String = "abdcadbcdadcbacbacbadbdcadbcbdcdcbcadabadbcadbc";
    my @Toks = qw(abdca dadcbacb dbdcadbc aaaaaaaa cbcad dbcadbc);

    print "\n'$String'\n\n";

    for my $tok (@Toks) {
        # index returns -1 when the token does not occur in the string
        my $pos = index $String, $tok;
        if ($pos >= 0) {
            printf "found (%2d): %s\n", $pos, $tok;
        }
        else {
            printf "not found : %s\n", $tok;
        }
    }

-sln
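The loop above looks up each token independently, so it does not check that the tokens occur in the query's order. If order matters (an assumption, the original question does not say so explicitly), the third argument of index can carry a start position forward. A minimal sketch, with find_tokens_in_order being a made-up helper name:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the start position
# of each token if all tokens are found in order without overlapping,
# or an empty list as soon as one token cannot be found.
sub find_tokens_in_order {
    my ($string, @tokens) = @_;

    my $from = 0;    # where the next search may start
    my @positions;

    for my $tok (@tokens) {
        my $pos = index $string, $tok, $from;
        return if $pos < 0;               # token missing or out of order
        push @positions, $pos;
        $from = $pos + length($tok);      # continue after this token
    }

    return @positions;
}

my $String = "abdcadbcdadcbacbacbadbdcadbcbdcdcbcadabadbcadbc";

my @hits = find_tokens_in_order($String, qw(abdca dadcbacb dbdcadbc cbcad dbcadbc));
print @hits ? "match at: @hits\n" : "no match\n";

@hits = find_tokens_in_order($String, qw(abdca dadcbacb aaaaaaaa cbcad dbcadbc));
print @hits ? "match at: @hits\n" : "no match\n";
```

Note that an empty token list also returns an empty list, so a caller that needs to tell "no tokens" from "no match" would have to check for that case itself.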