USING: accessors arrays assocs bit-arrays combinators.short-circuit lists fry kernel make math math.order sequences sorting summary ; IN: huffman-coding TUPLE: huffman-code tree-root dictionary ; huffman-leaf C: huffman-node : combine-nodes ( a b -- c ) [ [ probability>> ] bi@ + ] 2keep ; : (huffman-sorted-insert) ( list node -- ) 2dup [ cadr probability>> ] [ probability>> ] bi* > [ over cdr cons >>cdr drop ] [ [ cdr ] dip (huffman-sorted-insert) ] if ; : huffman-sorted-insert ( list node -- list ) 2dup { [ drop nil? ] [ [ car probability>> ] [ probability>> ] bi* >= ] } 2|| [ swap cons ] [ dupd (huffman-sorted-insert) ] if ; inline : build-huffman-tree ( list -- tree ) dup cdr nil? not [ [ cdr cdr ] [ 2car ] bi combine-nodes huffman-sorted-insert build-huffman-tree ] [ car ] if ; GENERIC# (build-dictionary) 1 ( internal-tree accum -- ) M: huffman-leaf (build-dictionary) [ symbol>> ] dip >bit-array 2array , ; M: huffman-node (build-dictionary) [ [ left>> ] [ right>> ] bi ] dip dup clone [ dup t swap push ] [ dup f swap push ] bi* swapd [ (build-dictionary) ] 2bi@ ; : build-dictionary ( internal-tree -- dictionary ) [ V{ } (build-dictionary) ] { } make H{ } assoc-like ; PRIVATE> : ( assoc -- huffman-code ) [ ] { } assoc>map [ [ probability>> ] bi@ <=> ] sort seq>list build-huffman-tree dup build-dictionary huffman-code boa ; : huffman-encode ( huffman-code seq -- code ) swap dictionary>> '[ dup _ at* [ nip ] [ drop illegal-huffman-symbol ] if ] { } map-as concat ; ( huffman-code -- huffman-decoder ) tree-root>> dup huffman-decoder boa ; GENERIC: update-decoder ( decoder tree -- decoder ) M: huffman-node update-decoder >>current ; M: huffman-leaf update-decoder symbol>> , dup root>> >>current ; : decode-step ( huffman-decoder bit -- huffman-decoder ) [ dup current>> ] dip [ left>> ] [ right>> ] if update-decoder ; PRIVATE> : huffman-decode-as ( huffman-code code exemplar -- seq ) [ [ ] dip [ decode-step ] [ each ] ] dip make nip ; : huffman-decode ( huffman-code code -- array ) { } huffman-decode-as ; inline : huffman-count-and-encode ( seq -- huffman-code code ) [ get-statistics dup ] keep huffman-encode ;