Paste: regexp-minimization profiling
Author: | frea |
Mode: | factor |
Date: | Tue, 5 May 2009 20:44:41 |
Plain Text |
USING: kernel sequences regexp.transition-tables fry assocs accessors arrays hashtables
regexp.dfa regexp.classes lists prettyprint vectors locals sets combinators
deques math classes.tuple sorting io ;
IN: my-minimalize
TUPLE: subset working elems ;
C: <subset> subset
:: (reverse-transition-table) ( dict letter key' key -- )
key' condition?
[
key' condition-states
[
H{ { key key } } swap associate
f dict at
dict set-at
f dict [ 1 - ] change-at
] each
]
[ H{ { key key } } key key' letter dict [ drop H{ } clone ] cache [ drop H{ } clone ] cache set-at ] if ;
: reverse-transition-table ( transition-table -- reversed-transition-table )
transitions>> unzip [ unzip rot [ [ (reverse-transition-table) ] 3curry ] curry 2map ] { } 2map-as concat
H{ { f -1 } } clone dup '[ _ swap call( dict -- ) ] swapd each
f over delete-at ;
: prepare-state-partitioning ( table -- rev-trans-table partitions )
{
[ reverse-transition-table values sequence>cons ]
[ [ transitions>> keys ] [ final-states>> keys ] bi diff t swap <subset> ]
[ final-states>> keys t swap <subset> ]
} cleave
2array [ elems>> length 0 = not ] filter sequence>cons
;
: create-sets ( first-working? set1 set2 -- set1' set2' )
[ f swap <subset> ] bi@ rot
[ [ t >>working ] bi@ ]
[ 2dup [ elems>> length ] bi@ < [ swap ] when t >>working ]
if ;
: (sub-partition) ( partition set -- new-partitions )
[ [ working>> ] [ elems>> ] bi ] dip '[ _ key? ] partition
2dup [ empty? ] either?
[ dup empty? [ swap ] unless drop <subset> 1list ] [ create-sets 2list ] if ;
: sub-partition ( partitions set -- new-partitions )
+nil+ swap '[ _ (sub-partition) lappend ] foldr ;
: lfind ( list q -- elem )
over +nil+? [ 2drop f ]
[
2dup [ car>> ] [ call( obj -- t/f ) ] bi*
[ drop car>> ] [ [ cdr>> ] dip lfind ] if
] if ; recursive
: find-next-set ( partitions -- set )
[ working>> ] lfind ;
: ingoing-states ( states rev-trans-table-elt -- states )
'[ _ at assoc-union ] H{ } swap reduce ;
: (my-minimize) ( rev-trans-table partitions -- new-partitions )
[ dup ] bi@ find-next-set dup f = [ drop 2nip ]
[
f >>working elems>>
'[ _ swap ingoing-states sub-partition ] foldr
(my-minimize)
] if ; recursive
: number-partitions ( partitions -- partitions' )
[ elems>> natural-sort ] lmap [ dup first '[ _ 2array ] map ] lmap list>array concat ;
: has-conditions? ( assoc -- ? )
values [ condition? ] any? ;
: canonical-state? ( state transitions state-classes -- ? )
'[ dup _ at = ] swap '[ _ at has-conditions? ] bi or ;
: delete-duplicates ( transitions state-classes -- new-transitions )
dupd '[ drop _ _ canonical-state? ] assoc-filter ;
: combine-states ( table classes -- smaller-table )
[ transitions-at ] keep
'[ _ delete-duplicates ] change-transitions ;
: combine-state-transitions ( hash -- hash )
H{ } clone tuck '[
_ [ 2array <or-class> ] change-at
] assoc-each [ swap ] assoc-map ;
: combine-transitions ( table -- table )
[ [ combine-state-transitions ] assoc-map ] change-transitions ;
: table>state-numbers ( table -- assoc )
transitions>> keys <enum> [ swap ] H{ } assoc-map-as ;
: number-states ( table -- newtable )
dup table>state-numbers transitions-at ;
: my-minimize ( table -- new-table )
number-states
dup prepare-state-partitioning
(my-minimize) number-partitions >hashtable
combine-states
combine-transitions
expand-ors ;
Author: | slava |
Mode: | text |
Date: | Tue, 5 May 2009 21:05:59 |
Plain Text |
reverse-transition-table invokes the compiler at runtime by constructing a quotation using 'concat'
try using arrays instead of linked lists
[ foo ] lmap [ bar ] lmap == [ foo bar ] lmap
Try running your version and Dan's in the profiler, and compare call counts for words in the 'math' and 'sequences' vocabularies.
Use Control-O in the UI to use the UI profiler, it's friendlier than the 'profile' word.
New Annotation