similar_text
CALCULATE the similarity between two strings.
Swapping the $first and $second may yield a different result.
This function calculates the similarity between two strings as described in Programming Classics: Implementing the World's Best Algorithms by Oliver (ISBN 0-131-00413-1).
Note that this implementation uses recursive calls rather than the stack in Oliver's pseudocode, which may or may not speed up the whole process.
Note also that the complexity of this algorithm is O(N**3) where N is the length of the longest string.
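The recursion described above can be sketched in plain PHP. This is an illustration only, not the engine's actual code: the function name similar_text_sketch is hypothetical, and the sketch assumes single-byte strings. It looks for the first longest common substring, counts its length, and then repeats the search on the pieces to the left and to the right of that match.
<?php
// Hypothetical userland sketch of the recursive idea; not PHP's real implementation.
function similar_text_sketch(string $first, string $second): int
{
    $lenA = strlen($first);
    $lenB = strlen($second);
    $max  = 0; // length of the longest common substring found so far
    $posA = 0; // where that substring starts in $first
    $posB = 0; // where that substring starts in $second

    for ($i = 0; $i < $lenA; $i++) {
        for ($j = 0; $j < $lenB; $j++) {
            // Count how many bytes match starting at $i and $j.
            $len = 0;
            while ($i + $len < $lenA && $j + $len < $lenB
                && $first[$i + $len] === $second[$j + $len]) {
                $len++;
            }
            // Strictly greater: on a tie, the earliest match is kept.
            if ($len > $max) {
                $max  = $len;
                $posA = $i;
                $posB = $j;
            }
        }
    }

    if ($max === 0) {
        return 0; // nothing in common, stop recursing
    }

    // Recurse on the text to the left and to the right of the match.
    return $max
        + similar_text_sketch((string) substr($first, 0, $posA), (string) substr($second, 0, $posB))
        + similar_text_sketch((string) substr($first, $posA + $max), (string) substr($second, $posB + $max));
}

echo similar_text_sketch('Window', 'Widow'); // 5
?>
Because the earliest longest match wins and both strings are then split around it, the argument order can change which match is picked first. That is one way to see why swapping $first and $second may change the result.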
<?php
int similar_text ( string $first , string $second [, float &$percent ] )
where,
$first = The first STRING
$second = The second STRING
&$percent = The percentage of similarity
( PASSED by reference )
?>
$first
The first STRING.
$second
The second STRING.
&$percent
The similarity percentage between $first and $second.
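How the percentage relates to the return value can be checked with a short script. The sketch below assumes the percentage is twice the number of matching characters, divided by the combined length of both strings, times 100. That assumption agrees with the 'Window' / 'Widow' result in the first exercise (5 * 2 / (6 + 5) * 100). The variable names are illustrative.
<?php
$first  = 'Window';
$second = 'Widow';

// $percent is filled in by reference; it does not need to exist beforehand.
$common = similar_text($first, $second, $percent);

// Assumed formula: each matching character counts once per string,
// relative to the combined length of both strings.
$expected = $common * 2 * 100 / (strlen($first) + strlen($second));

echo 'common: ' . $common . '<br>';
echo 'percent: ' . $percent . '<br>';
echo 'expected: ' . $expected . '<br>';
?>
The return value is the count of matching characters, while the percentage is only available through the third argument.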
EXERCISE
<?php
$first01 = 'Window';
$second01 = 'Widow';
// setlocale(LC_ALL, 'pt-BR', 'pt_BR');
$sim01a = similar_text($first01, $second01, $percent01);
echo '(' . $sim01a . ') The similarity between<br>' .
$first01 . '<br>and<br>'. $second01 . '<br>is<br>' .
$percent01 . '%<br><br>';
$sim01b = similar_text($second01, $first01, $percent01);
echo '(' . $sim01b . ') The similarity between<br>' .
$second01 . '<br>and<br>'. $first01 . '<br>is<br>' .
$percent01 . '%<br><br>';
?>
RESULT
(5) The similarity between
Window
and
Widow
is
90.909090909091%
(5) The similarity between
Widow
and
Window
is
90.909090909091%
EXERCISE
<?php
$first02 = 'Levenshtein';
$second02 = 'Soundex';
$sim02a = similar_text($first02, $second02, $percent02);
echo '(' . $sim02a . ') The similarity between<br>' .
$first02 . '<br>and<br>'.
$second02 . '<br>is<br>' .
$percent02 . '%<br><br>';
$sim02b = similar_text($second02, $first02, $percent02);
echo '(' . $sim02b . ') The similarity between<br>' .
$second02 . '<br>and<br>'.
$first02 . '<br>is<br>' .
$percent02 . '%<br><br>';
?>
RESULT
(1) The similarity between
Levenshtein
and
Soundex
is
11.111111111111%
(2) The similarity between
Soundex
and
Levenshtein
is
22.222222222222%
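Since the result above depends on the argument order, code that needs an order-independent score has to pick a convention. One possible sketch, using a hypothetical helper named similar_text_symmetric, runs the comparison in both directions and keeps the larger count. Keeping the larger value is a design choice made for this example, not something similar_text does by itself.
<?php
// Hypothetical helper, not part of PHP: compares in both directions
// and reports the larger count together with its percentage.
function similar_text_symmetric(string $first, string $second, &$percent = null): int
{
    $forward  = similar_text($first, $second, $percentForward);
    $backward = similar_text($second, $first, $percentBackward);

    if ($forward >= $backward) {
        $percent = $percentForward;
        return $forward;
    }

    $percent = $percentBackward;
    return $backward;
}

$simA = similar_text_symmetric('Levenshtein', 'Soundex', $percentA);
$simB = similar_text_symmetric('Soundex', 'Levenshtein', $percentB);

// Both calls now report the same count and percentage.
echo '(' . $simA . ') ' . $percentA . '%<br>';
echo '(' . $simB . ') ' . $percentB . '%<br>';
?>
The helper calls similar_text twice, so it costs roughly double the time of a single comparison.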
EXERCISE
<?php
function smlartxt($parA, $parB)
{
    $smlr = similar_text($parA, $parB, $perc);
    echo "[ $smlr ] ( $parA, $parB ) = $perc% <br><br>";
}
smlartxt("abcdefgh", "efg");
smlartxt("abcdefgh", "mno");
smlartxt("abcdefghcc", "c");
smlartxt("abcdefghabcdef", "zzzzabcdefggg");
smlartxt("abcdef", "abcdef");
?>