From: Ilya Zakharevich on 13 Apr 2010 07:01

On 2010-04-13, Kyle T. Jones <KBfoMe(a)realdomain.net> wrote:
>> Solution 1:
>> ---ab
>> a1bab
>>
>
> [
>   ['+', 0, 'a'],
>   ['+', 1, '1'],
>   ['+', 2, 'b'],
> ]
>
>> Solution 2:
>> a-b--
>> a1bab
>>
>
> [
>   ['+', 1, '1'],
>   ['+', 3, 'a'],
>   ['+', 4, 'b'],
> ]
>
>> Solution 3:
>> a---b
>> a1bab
>>
>
> [
>   ['+', 1, '1'],
>   ['+', 2, 'b'],
>   ['+', 3, 'a'],
> ]
>
> Why are any of the three better?

Obviously, the metric the OP wants is: assign the "identity edit" weight eps, and any other edit weight 1+eps, with the exception that N consecutive edits of the same type get weight N+eps, not N + N*eps.

Looks reasonable (if one can convert it to a cheap algorithm to find the best match...).

Hope this helps,
Ilya
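To make the proposed metric concrete, here is a minimal sketch in core Perl that scores the three alignments from the thread. It rests on one reading of the post that should be treated as an assumption: every maximal run of same-type edits (identity runs included) pays eps once, and every non-identity edit pays 1. The name alignment_cost, the '-' gap convention, and the value chosen for eps are illustrative only and are not part of Algorithm::Diff.

```perl
use strict;
use warnings;

# Score an alignment given as two equal-length rows, where '-' marks a gap.
# Assumed metric: 1 per non-identity edit, plus $eps once per maximal run
# of same-type edits.
sub alignment_cost {
    my ($old, $new, $eps) = @_;
    die "rows differ in length\n" unless length($old) == length($new);
    my ($edits, $runs, $prev) = (0, 0, '');
    for my $i (0 .. length($old) - 1) {
        my ($o, $n) = (substr($old, $i, 1), substr($new, $i, 1));
        my $type = $o eq $n  ? '='    # identity
                 : $o eq '-' ? '+'    # insertion
                 : $n eq '-' ? '-'    # deletion
                 :             '!';   # substitution
        $edits++ if $type ne '=';
        $runs++  if $type ne $prev;   # a new run starts when the type changes
        $prev = $type;
    }
    return ($edits, $runs, $edits + $eps * $runs);
}

# The three candidate alignments of "ab" against "a1bab".
for my $old_row ('---ab', 'a-b--', 'a---b') {
    my ($edits, $runs, $cost) = alignment_cost($old_row, 'a1bab', 0.01);
    printf "%s  edits=%d runs=%d cost=%.2f\n", $old_row, $edits, $runs, $cost;
}
```

Under this reading all three alignments have 3 edits, but Solution 1 has 2 runs against 4 for Solution 2 and 3 for Solution 3, so it gets the lowest cost. If identity edits were instead charged eps each rather than once per run, Solutions 1 and 3 would tie.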
From: Ed on 13 Apr 2010 10:48
On Apr 12, 4:24 pm, Dilbert <dilbert1...(a)gmail.com> wrote:
> Theoretically there are 3 solutions with LCS = 2:
>
> Solution 1:
> ---ab
> a1bab
>
> Solution 2:
> a-b--
> a1bab
>
> Solution 3:
> a---b
> a1bab
>
> I understand that any of those 3 solutions could be returned by
> Algorithm::Diff, but I would argue that solution 1 is "better" than
> solution 2 or 3, because solution 1 changes only once between '-' and
> [ab], whereas solution 2 and 3 change more than once between '-' and
> [ab].

> my $d = Algorithm::Diff::sdiff(\@old, \@new);

> How can I teach Algorithm::Diff to choose Solution 1 (the best of the
> 3 possibilities) ?

Look at traverse_balanced as a starting point. Basically you'd need to write your own diff calculator based off the LCS, using whatever method you feel is appropriate.

Once you get to the point where you're looking for the "best" of the possible solutions, you are in new territory since you'll have to consider the solution set. I don't think there's anything in Algorithm::Diff that does that sort of thing - I believe the code simply finds the first solution that uses the given LCS.
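One possible shape for such a hand-written diff calculator is sketched below. It does not use Algorithm::Diff at all. It is a small dynamic program in core Perl that picks, among all insert/delete scripts with the fewest edits (equivalently, the longest common subsequence), one with the fewest runs of same-type operations. The function name fewest_runs_diff and the tie-breaking rule are assumptions made for illustration, and the output only loosely imitates the [op, old, new] triples that sdiff returns.

```perl
use strict;
use warnings;

# Return [[op, old_item, new_item], ...] with the fewest inserts/deletes and,
# among those scripts, the fewest runs of same-type ops ('u', '+', '-').
sub fewest_runs_diff {
    my ($old, $new) = @_;
    my ($n, $m) = (scalar @$old, scalar @$new);
    return [] if $n == 0 && $m == 0;
    my $W   = $n + $m + 2;      # one edit always outweighs any run count
    my $INF = 9**9**9;
    my @ops = ('u', '+', '-');  # unchanged, insert, delete
    my (@cost, @from);          # both indexed as [i][j]{op}

    for my $i (0 .. $n) {
        for my $j (0 .. $m) {
            for my $op (@ops) {
                $cost[$i][$j]{$op} = $INF;
                # Cell we would come from if the last operation were $op.
                my ($pi, $pj) = $op eq '+' ? ($i, $j - 1)
                              : $op eq '-' ? ($i - 1, $j)
                              :              ($i - 1, $j - 1);
                next if $pi < 0 || $pj < 0;
                next if $op eq 'u' && $old->[$pi] ne $new->[$pj];
                my $step = $op eq 'u' ? 0 : $W;
                if ($pi == 0 && $pj == 0) {   # first op opens the first run
                    $cost[$i][$j]{$op} = $step + 1;
                    next;
                }
                for my $prev (@ops) {
                    my $c = $cost[$pi][$pj]{$prev} + $step
                          + ($prev eq $op ? 0 : 1);   # +1 when a new run starts
                    if ($c < $cost[$i][$j]{$op}) {
                        $cost[$i][$j]{$op} = $c;
                        $from[$i][$j]{$op} = $prev;
                    }
                }
            }
        }
    }

    # Walk back from the cheapest final state to rebuild the script.
    my ($i, $j) = ($n, $m);
    my ($op) = sort { $cost[$n][$m]{$a} <=> $cost[$n][$m]{$b} } @ops;
    my @script;
    while ($i > 0 || $j > 0) {
        my $prev = $from[$i][$j]{$op};
        if    ($op eq '+') { $j--; unshift @script, ['+', '', $new->[$j]] }
        elsif ($op eq '-') { $i--; unshift @script, ['-', $old->[$i], ''] }
        else  { $i--; $j--; unshift @script, ['u', $old->[$i], $new->[$j]] }
        $op = $prev;
    }
    return \@script;
}

my $script = fewest_runs_diff([qw(a b)], [qw(a 1 b a b)]);
print join('', map { $_->[0] } @$script), "\n";
```

For the thread's example the operations come out as +++uu, which is Solution 1. The table costs O(n*m) time and memory, so this is a sketch for short sequences rather than a replacement for the LCS code in Algorithm::Diff.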