
Question: Suppose the symbols a, b, c, d, e occur with frequencies 1/2, 1/4, 1/8, 1/16, 1/16, respectively.

(a) What is the Huffman encoding of the alphabet?

(b) If this encoding is applied to a file consisting of 1,000,000 characters with the given frequencies, what is the length of the encoded file in bits?

Short Answer


Answers

  1. The Huffman encoding of the symbols a, b, c, d, e is 0, 10, 110, 1110, 1111, respectively.

2. The length of the encoded file is 1,875,000 bits.

Step by step solution

01

Define Huffman Encoding 

Huffman encoding is represented by a full binary tree. A binary tree in which every node has zero or two children is called a full binary tree. The alphabet symbols sit at the leaves, and each codeword is generated by the path from the root to a leaf, where a left branch is represented as 0 and a right branch as 1.

02

Determine Huffman encoding of given alphabets.

(a)

The given frequencies are sorted in increasing order.

After sorting, a full binary tree representation of the given frequencies is made as follows:

Huffman encoding reads off each codeword from the root node to the corresponding leaf node. So, the codeword of a is 0 and the codeword of b is 10. Similarly, the codewords of the other symbols are found as shown in the table.

Alphabet    Codeword
a           0
b           10
c           110
d           1110
e           1111

Thus, the Huffman encodings of the given symbols are 0, 10, 110, 1110, 1111.
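The greedy construction above (repeatedly merging the two lowest-frequency subtrees) can be sketched in Python with a min-heap. The symbols and frequencies come from the problem; the function name `huffman_codes` and the tie-breaking counter are choices made here for illustration.

```python
import heapq
import itertools

def huffman_codes(freqs):
    """Build Huffman codewords for a {symbol: frequency} map."""
    counter = itertools.count()  # tie-breaker so heap tuples compare cleanly
    heap = [(f, next(counter), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # pop the two subtrees with the lowest total frequency
        f1, _, codes1 = heapq.heappop(heap)
        f2, _, codes2 = heapq.heappop(heap)
        # prefix 0 onto one subtree's codewords and 1 onto the other's
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, next(counter), merged))
    return heap[0][2]

codes = huffman_codes({"a": 1/2, "b": 1/4, "c": 1/8, "d": 1/16, "e": 1/16})
print(codes)
```

For this distribution the construction yields exactly the codewords in the table above (d and e may swap 1110/1111 depending on tie-breaking, which does not change the encoded length).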

03

Calculate the length of the encoded file in bits 

(b)

The number of bits needed to encode the file is calculated by multiplying the number of characters in the file by the entropy of the distribution. Entropy is a measure of how much randomness a distribution contains.

Entropy = Σ_{i=1}^{n} p_i log(1/p_i)

Here, n is the number of possible outcomes and p_i is the probability of the i-th outcome.

Consider the formula for the number of bits.

Number of bits, N = m × Σ_{i=1}^{n} p_i log(1/p_i)

Here, m is the number of characters in the file.

N = m × Σ_{i=1}^{5} p_i log(1/p_i)
  = 1,000,000 × (1/2 log 2 + 1/4 log 4 + 1/8 log 8 + 1/16 log 16 + 1/16 log 16)
  = 1,000,000 × (1/2 × 1 + 1/4 × 2 + 1/8 × 3 + 1/16 × 4 + 1/16 × 4)

N is further solved as:

N = 1,000,000 × (1/2 + 1/2 + 3/8 + 1/4 + 1/4)
  = 1,000,000 × (3/2 + 3/8)
  = 1,000,000 × 15/8
  = 1,875,000

Therefore, the length of the encoded file is 1,875,000 bits.
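The arithmetic in the last step can be verified with exact fractions. The codeword lengths come from part (a) and the frequencies from the problem statement; the variable names are chosen here for illustration.

```python
from fractions import Fraction

m = 1_000_000  # number of characters in the file
# codeword lengths from part (a): |a|=1, |b|=2, |c|=3, |d|=4, |e|=4
lengths = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 4}
freqs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 8),
         "d": Fraction(1, 16), "e": Fraction(1, 16)}

# expected total bits: m * sum(p_i * length_i) = m * 15/8
total_bits = sum(m * freqs[s] * lengths[s] for s in freqs)
print(total_bits)  # 1875000
```

Using `Fraction` avoids floating-point rounding, so the result is exactly m × 15/8 = 1,875,000.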
