5 Replies - 310 Views - Last Post: 17 August 2012 - 10:43 AM

#1 LiDoNg_9_0

Sentence difference algorithm

Posted 17 August 2012 - 06:41 AM

Hi fellas,


I want to create a function that returns the difference between two strings.
e.g.
grey tiles 100 X 200 kitchen flooring
grey tiles 200 X 600 kitchen flooring

It should return "100 X 200" and "200 X 600".

I have been reading lots of blogs on this, but none of them fit what I want.
Most of them point me to the PEAR package Text_Diff, but I don't want to install a PEAR package just for that. If you guys know another approach, that would be great and much appreciated.
I have also tried an approach using array_diff, which gives the wrong result:

$s1="grey tiles 100 X 200 kitchen flooring";
$s2="grey tiles 200 X 600 kitchen flooring";
$sa1=explode(" ",$s1);
$sa2=explode(" ",$s2);
$diff1=array_diff($sa1,$sa2);
$diff2=array_diff($sa2,$sa1);
print_r($diff1);
print_r($diff2);



I don't like this, it's a shame. The point is that I should get the whole run of differing words, but array_diff compares values regardless of their position, so "X" and "200", which appear in both sentences, are dropped. Do you guys have another approach?
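
For illustration, here is a minimal sketch of what the array_diff attempt returns for the two example sentences. It only uses the variables from the post; the comments describe the behavior of PHP's built-in array_diff, which keeps the values of the first array that occur nowhere in the second and preserves the original keys.

```php
<?php
// Word lists for the two example sentences.
$sa1 = explode(" ", "grey tiles 100 X 200 kitchen flooring");
$sa2 = explode(" ", "grey tiles 200 X 600 kitchen flooring");

// array_diff() ignores positions: a word is dropped as soon as it
// occurs anywhere in the other array.
$diff1 = array_diff($sa1, $sa2);
$diff2 = array_diff($sa2, $sa1);

print_r($diff1); // only "100" (key 2); "X" and "200" also occur in $sa2
print_r($diff2); // only "600" (key 4); "200" and "X" also occur in $sa1
```

So the value-based comparison loses "X" and one of the numbers on each side, which is why a position-aware comparison is needed.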


Replies To: Sentence difference algorithm

#2 CTphpnwb

Posted 17 August 2012 - 07:44 AM

You need to iterate through the arrays manually. Something like this:
$s1="grey tiles 100 X 200 kitchen flooring";
$s2="grey tiles 200 X 600 kitchen flooring";
$sa1=explode(" ",$s1);
$sa2=explode(" ",$s2);
if(count($sa1) < count($sa2)) {
	$difference = compare_arrays($sa2,$sa1);
} else {
	$difference = compare_arrays($sa1,$sa2);
}
echo "The Difference =<br>".$difference;

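function compare_arrays(&$larger, &$smaller) {
	$n = count($smaller);
	$diff = "";
	// Compare the words at every position the two arrays share.
	for($i = 0; $i < $n; $i++) {
		if($larger[$i] != $smaller[$i]) {
			$diff .= $larger[$i]." vs ".$smaller[$i]."<br>";
		}
	}
	// Walk the extra words of the longer array.
	// Note: $smaller[$i] does not exist at these indexes (corrected in post #6).
	$n = count($larger);
	while($i < $n) {
		if($larger[$i] != $smaller[$i]) {
			$diff .= $larger[$i]." vs ".$smaller[$i]."<br>";
		}
		$i++;
	}
	return $diff;
}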


#3 LiDoNg_9_0

Posted 17 August 2012 - 07:56 AM

CTphpnwb, on 17 August 2012 - 07:44 AM, said:

You need to iterate through the arrays manually.

Wow, I've been working on this for the whole day, and within about 30 minutes of posting here somebody got it right. You're amazing, dude. Thanks, I appreciate your help. :)

#4 StefanOnRails

Posted 17 August 2012 - 08:27 AM

You can also use this simpler method, but with two constraints:
  • your arrays must have the same number of elements
  • the two sentences differ in only one run of words (everything from the first to the last differing word is returned)

function diff_in($arr1, $arr2) {
	$size   = count($arr1);
	$output = '';
	$start  = -1; // index of the first differing word
	$end    = -1; // index of the last differing word

	for ($i = 0; $i < $size; $i++) {
		if ($arr1[$i] != $arr2[$i]) {
			if ($start == -1) {
				$start = $i;
			} else {
				$end = $i;
			}
		}
	}

	if ($start == -1) {
		return 'no difference';
	} else if ($end == -1) {
		return $arr1[$start];
	} else {
		// Return every word of $arr1 from the first to the last difference.
		for ($i = $start; $i <= $end; $i++) {
			$output .= $arr1[$i].' ';
		}
		return rtrim($output);
	}
}

$s1 = "grey tiles 100 X 200 kitchen flooring";
$s2 = "grey tiles 200 X 600 kitchen flooring";
$sa1=explode(" ",$s1);
$sa2=explode(" ",$s2);

echo diff_in($sa1,$sa2);
echo '<br />';
echo diff_in($sa2,$sa1);


Output:
100 X 200
200 X 600

Otherwise, use CTphpnwb's algorithm - he did a great job.

#5 BetaWar

Posted 17 August 2012 - 09:24 AM

There is also a method known as shingling, which lets you measure how similar two documents are. With a slight modification you could get the differences out instead of just a similarity percentage. I have programmed it before in JavaScript and it only took about 10-20 lines (I don't remember exactly at the moment).
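
A rough PHP sketch of the shingling idea, under stated assumptions: the helper names shingles() and shingle_similarity() are made up for this example, shingles are runs of two consecutive words, and similarity is taken as the Jaccard ratio (shared shingles divided by all distinct shingles). It is not the JavaScript version mentioned above.

```php
<?php
// Hypothetical helper: list the distinct w-word shingles of a sentence.
function shingles($text, $w = 2) {
	$words = preg_split('/\s+/', trim($text));
	$list = array();
	for ($i = 0; $i + $w <= count($words); $i++) {
		$list[] = implode(' ', array_slice($words, $i, $w));
	}
	return array_values(array_unique($list));
}

// Hypothetical helper: Jaccard similarity of the two shingle sets.
function shingle_similarity($a, $b, $w = 2) {
	$sa = shingles($a, $w);
	$sb = shingles($b, $w);
	$common = count(array_intersect($sa, $sb));
	$total  = count(array_unique(array_merge($sa, $sb)));
	return $total == 0 ? 1.0 : $common / $total;
}

$s1 = "grey tiles 100 X 200 kitchen flooring";
$s2 = "grey tiles 200 X 600 kitchen flooring";

echo shingle_similarity($s1, $s2), "\n"; // 2 shared of 10 distinct shingles

// The "slight modification": list the shingles found in only one sentence.
print_r(array_values(array_diff(shingles($s1), shingles($s2))));
print_r(array_values(array_diff(shingles($s2), shingles($s1))));
```

Because each shingle carries its neighbouring word, the unshared shingles cluster around the changed region ("tiles 100", "100 X", "X 200", "200 kitchen" for the first sentence), which the plain array_diff on single words cannot show.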

There are other methods too. You could also do a character-based diff, which gives you the (possibly) shortest (or cheapest, depending on how you set up the algorithm) way to change one string/document into another.
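
One common way to build such a diff is a longest-common-subsequence (LCS) table. The sketch below is an assumption about what such a diff could look like, not code from this thread: lcs_diff() is a hypothetical name, and it works on any token list, so passing explode(" ", ...) gives a word diff and str_split(...) a character diff.

```php
<?php
// Hypothetical helper: LCS-based diff of two token lists.
// Returns the tokens only in $a ('removed') and only in $b ('added').
function lcs_diff($a, $b) {
	$n = count($a);
	$m = count($b);

	// $len[$i][$j] = LCS length of $a from index $i and $b from index $j.
	$len = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
	for ($i = $n - 1; $i >= 0; $i--) {
		for ($j = $m - 1; $j >= 0; $j--) {
			$len[$i][$j] = ($a[$i] === $b[$j])
				? $len[$i + 1][$j + 1] + 1
				: max($len[$i + 1][$j], $len[$i][$j + 1]);
		}
	}

	// Walk both lists from the start, collecting tokens outside the LCS.
	$removed = array();
	$added = array();
	$i = 0;
	$j = 0;
	while ($i < $n && $j < $m) {
		if ($a[$i] === $b[$j]) {
			$i++;
			$j++;
		} elseif ($len[$i + 1][$j] > $len[$i][$j + 1]) {
			$removed[] = $a[$i++];
		} else {
			$added[] = $b[$j++];
		}
	}
	while ($i < $n) { $removed[] = $a[$i++]; }
	while ($j < $m) { $added[] = $b[$j++]; }

	return array('removed' => $removed, 'added' => $added);
}

$d = lcs_diff(
	explode(" ", "grey tiles 100 X 200 kitchen flooring"),
	explode(" ", "grey tiles 200 X 600 kitchen flooring")
);
echo implode(" ", $d['removed']), "\n"; // 100 200
echo implode(" ", $d['added']), "\n";   // 200 600
```

Unlike the position-by-position loops, this also copes with inserted or deleted words. Note that it treats the unchanged "X" as common, so it reports "100 200" rather than "100 X 200"; when several common subsequences are equally long, which one is picked depends on the tie-break in the walk.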

#6 CTphpnwb

Posted 17 August 2012 - 10:43 AM

Just noticed a bug in my code. If the arrays are different sizes, the second loop shouldn't index the smaller array, since it has no elements at those positions!
function compare_arrays(&$larger, &$smaller) {
	$n = count($smaller);
	$diff = "";
	for($i = 0; $i < $n; $i++) {
		if($larger[$i] != $smaller[$i]) {
			$diff .= $larger[$i]." vs ".$smaller[$i]."<br>";
		}
	}
	$n = count($larger);
	for($j = $i; $j < $n; $j++) {
		$diff .= $larger[$j]." vs null<br>";
	}
	return $diff;
}
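
A quick way to check the corrected function is to run it on sentences of different lengths. In this sketch the second sentence is a made-up test input with the last word removed; the function body is repeated unchanged so the snippet runs on its own.

```php
<?php
// compare_arrays() as corrected above, repeated so this runs standalone.
function compare_arrays(&$larger, &$smaller) {
	$n = count($smaller);
	$diff = "";
	for($i = 0; $i < $n; $i++) {
		if($larger[$i] != $smaller[$i]) {
			$diff .= $larger[$i]." vs ".$smaller[$i]."<br>";
		}
	}
	$n = count($larger);
	for($j = $i; $j < $n; $j++) {
		$diff .= $larger[$j]." vs null<br>";
	}
	return $diff;
}

// Hypothetical inputs: the second sentence is one word shorter.
$sa1 = explode(" ", "grey tiles 100 X 200 kitchen flooring");
$sa2 = explode(" ", "grey tiles 200 X 600 kitchen");

// Always pass the longer array first.
if (count($sa1) < count($sa2)) {
	$difference = compare_arrays($sa2, $sa1);
} else {
	$difference = compare_arrays($sa1, $sa2);
}
echo $difference; // 100 vs 200<br>200 vs 600<br>flooring vs null<br>
```

One thing to keep in mind: the loose != comparison treats numeric strings as numbers, so "100" and "1e2" would count as equal. Switching to !== compares the words strictly as strings.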

This post has been edited by CTphpnwb: 17 August 2012 - 10:51 AM
