Date: Mon, 26 Jan 2009 11:02:18
! 2-minute implementation of the algorithm by Simon White described at
! http://www.devarticles.com/c/a/Development-Cycles/How-to-Strike-a-Match/1/
USING: grouping kernel math sequences sets unicode.case ;

! Uppercase both strings, split each into adjacent character pairs,
! then return 2 * (number of shared pairs) / (total pairs in both strings).
: similarity ( string string -- n )
    [ >upper 2 clump ] bi@
    [ intersect length 2 * ] 2keep
    [ length ] bi@ + / ;

! BTW, in case I haven't mentioned it before: you can use any code that I paste here.
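To see what the word computes, here is a rough step-by-step trace for "France" and "French", meant to be pasted into the listener. It is only a sketch and assumes the same 2009-era vocabularies as the paste (in particular `unicode.case` for `>upper`); the two sample strings are my own choice, not part of the original paste.

```factor
! Sketch only: assumes the same vocabularies as the paste above.
USING: grouping math prettyprint sets unicode.case ;

! Adjacent character pairs of each uppercased string.
"France" >upper 2 clump .   ! pairs FR RA AN NC CE
"French" >upper 2 clump .   ! pairs FR RE EN NC CH

! The two pair lists share FR and NC.
{ "FR" "RA" "AN" "NC" "CE" } { "FR" "RE" "EN" "NC" "CH" } intersect .

! 2 * 2 shared pairs, divided by 5 + 5 total pairs.
4 10 / .   ! 2/5
```

Because `intersect` treats the pair lists as sets, strings that contain the same pair more than once may score differently than they would if each matched pair were counted only once, so treat the result as an approximation in that case.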
Date: Mon, 26 Jan 2009 12:04:16
! A different algorithm, but here's an attempt at sentence similarity:
! each word of the first sentence is matched exactly against every word of the second.
! Nothing accounts for the word order, though, which is a shame.
USING: fry kernel math sequences splitting ;

! Fraction of the items in seq that are equal to item.
: (one-result) ( item seq -- n )
    swap [ = ] curry map
    [ [ ] filter length ] [ length ] bi / ;

! Sum of the (one-result) scores of the first sentence's words against the second.
: sentence-similarity ( sentence sentence -- n )
    [ " " split ] bi@ '[ _ (one-result) ] map sum ;

! Instead of exact matches, you could change this to use the algorithm above
! to find partial matches too, and average the scores afterwards. It still only
! works word by word, though; nothing scores the order of the words.
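The partial-match idea in the closing comment could look roughly like the sketch below. The words `best-match` and `fuzzy-sentence-similarity` are hypothetical names of mine, not part of the paste, and the choice to take the best-scoring word and then average over the first sentence is just one reading of "average later". It assumes the same 2009-era vocabularies as the paste, plus `fry` for the `'[ _ ... ]` syntax and `math.order` for `max`.

```factor
! Sketch only: best-match and fuzzy-sentence-similarity are hypothetical names.
USING: fry grouping kernel math math.order prettyprint sequences
sets splitting unicode.case ;

! The pair-based word similarity from the first paste, repeated
! here so the example is self-contained.
: similarity ( string string -- n )
    [ >upper 2 clump ] bi@
    [ intersect length 2 * ] 2keep
    [ length ] bi@ + / ;

! Highest similarity between word and any element of words.
: best-match ( word words -- n )
    [ similarity ] with map 0 [ max ] reduce ;

! Average, over the words of the first sentence, of each word's
! best partial match in the second sentence.
: fuzzy-sentence-similarity ( sentence sentence -- n )
    [ " " split ] bi@
    '[ _ best-match ] map
    [ sum ] [ length ] bi / ;

"hello world" "hello word" fuzzy-sentence-similarity .   ! 11/14
```

Here "hello" matches "hello" with a score of 1 and "world" matches "word" with 4/7, so the average is 11/14. Word order is still ignored, and I'd expect comparing two words that are both shorter than two characters to fail with a division by zero, since neither produces any pairs.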