The basic intuition behind Huffman’s algorithm, that frequent blocks should have short encodings and infrequent blocks should have long encodings, is also at work in English, where typical words like I, you, is, and, to, from, and so on are short, and rarely used words like velociraptor are longer.
However, words like fire!, help!, and run! are short not because they are frequent, but perhaps because time is precious in situations where they are used.
To make things theoretical, suppose we have a file composed of $m$ different words, with frequencies $f_1, \ldots, f_m$. Suppose also that for the $i$th word, the cost per bit of encoding is $c_i$. Thus, if we find a prefix-free code where the $i$th word has a codeword of length $l_i$, then the total cost of the encoding will be $\sum_{i=1}^{m} f_i \cdot c_i \cdot l_i$.
Show how to modify Huffman’s algorithm to find the prefix-free encoding of minimum total cost.
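One way to approach this: the total cost is a sum of (frequency × cost per bit) × (codeword length), so each word's frequency and per-bit cost only ever appear as a product. That suggests running ordinary Huffman on the combined weights instead of on the raw frequencies. The sketch below is a minimal illustration of that idea, not a definitive solution. The function names `weighted_huffman_lengths` and `total_cost` are hypothetical, and the choice to give a single-word file a 1-bit codeword is an assumption.

```python
import heapq
from itertools import count


def weighted_huffman_lengths(freqs, costs):
    """Codeword lengths from Huffman's algorithm run on weights f_i * c_i."""
    if len(freqs) != len(costs):
        raise ValueError("freqs and costs must have the same length")
    m = len(freqs)
    if m == 0:
        return []
    if m == 1:
        return [1]  # assumption: a lone word still gets a 1-bit codeword

    tie = count()  # unique tiebreaker so the heap never compares the lists
    # Each heap entry: (combined weight, tiebreaker, word indices in subtree).
    heap = [(f * c, next(tie), [i])
            for i, (f, c) in enumerate(zip(freqs, costs))]
    heapq.heapify(heap)

    lengths = [0] * m
    while len(heap) > 1:
        w1, _, words1 = heapq.heappop(heap)
        w2, _, words2 = heapq.heappop(heap)
        merged = words1 + words2
        # Merging two subtrees pushes every word in them one level deeper.
        for i in merged:
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return lengths


def total_cost(freqs, costs, lengths):
    """Total cost: sum over all words of f_i * c_i * l_i."""
    return sum(f * c * l for f, c, l in zip(freqs, costs, lengths))
```

For example, with frequencies `[10, 1, 1]` and per-bit costs `[1, 1, 100]`, the combined weights are 10, 1, and 100, so the rare but expensive third word ends up with the shortest codeword. Plain Huffman on the frequencies alone would give it one of the longest.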