
Write down the binary bit pattern to represent -1.5625 × 10^-1, assuming a format similar to that employed by the DEC PDP-8 (the leftmost 12 bits are the exponent stored as a two's complement number, and the rightmost 24 bits are the fraction stored as a two's complement number). No hidden 1 is used. Comment on how the range and accuracy of this 36-bit pattern compare to the single and double precision IEEE 754 standards.

Short Answer


The representation as a binary bit pattern is:

Exponent (12 bits): 111111111110

Mantissa (24 bits): 1011 0000 0000 0000 0000 0000

Step by step solution

01

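Convert -1.5625 × 10^-1 into binary

We can write -1.5625 × 10^-1 as -0.15625.

Now we convert the magnitude into binary by repeatedly multiplying by 2 and recording the integer part of each product:

0.15625 × 2 = 0.3125 → 0
0.3125 × 2 = 0.625 → 0
0.625 × 2 = 1.25 → 1
0.25 × 2 = 0.5 → 0
0.5 × 2 = 1.0 → 1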

Reading the integer parts from first to last, we write

(-0.15625)_10 = (-0.00101)_2
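The repeated multiplication by 2 can be sketched in a few lines of Python. This is only an illustration: the helper name `fraction_to_binary` and the `max_bits` cutoff are assumptions made here, and `Fraction` is used so the arithmetic stays exact.

```python
from fractions import Fraction


def fraction_to_binary(value, max_bits=24):
    """Return the binary expansion of 0 <= value < 1 as a string like '0.00101'."""
    value = Fraction(value)
    digits = []
    while value != 0 and len(digits) < max_bits:
        value *= 2
        bit = int(value)          # the integer part is the next binary digit
        digits.append(str(bit))
        value -= bit              # keep only the fractional part
    return "0." + "".join(digits)


print(fraction_to_binary("0.15625"))  # 0.00101
```

The loop stops after five digits here because the remaining fractional part becomes exactly zero.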

02

Normalize the binary representation

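To normalize, we shift the binary point two places to the right, so that the first bit after the point is 1 (no hidden 1 is used), and compensate with the exponent:

-0.00101 = -0.101 × 2^-2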
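As a rough check of the normalization, the sketch below scales a value until its magnitude lies in [0.5, 1), which is the "0.1xxx" form used when no hidden 1 is assumed. The function name `normalize` is hypothetical.

```python
from fractions import Fraction


def normalize(value):
    """Return (mantissa, exponent) with value == mantissa * 2**exponent
    and 0.5 <= |mantissa| < 1 (zero is returned unchanged)."""
    mantissa = Fraction(value)
    exponent = 0
    if mantissa == 0:
        return mantissa, exponent
    while abs(mantissa) < Fraction(1, 2):
        mantissa *= 2             # shift the binary point right
        exponent -= 1
    while abs(mantissa) >= 1:
        mantissa /= 2             # shift the binary point left
        exponent += 1
    return mantissa, exponent


mantissa, exponent = normalize("-0.15625")
print(mantissa, exponent)  # -5/8 -2
```

The mantissa -5/8 is -0.101 in binary, and the exponent is -2, matching the hand calculation.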

03

Write the exponent in 12 bits

As instructed in the question, no hidden 1 is used, so the mantissa stays as 0.101 and the exponent is -2, which is negative. Its magnitude written in 12 bits is

2 = 000000000010, so -2 = -(000000000010)

04

Find the two's complement of the exponent (-2)

To find the two's complement, we first find the one's complement of the magnitude.

The one's complement of 000000000010 is 111111111101.

To obtain the two's complement, we add 1 to the one's complement:

111111111101 + 000000000001 = 111111111110
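The invert-and-add-one procedure is equivalent to masking the signed value to the field width. A minimal sketch, assuming a hypothetical helper `twos_complement`:

```python
def twos_complement(value, width):
    """Return value as a two's complement bit string of the given width."""
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {width} bits")
    # Masking a negative Python int yields its two's complement pattern.
    return format(value & ((1 << width) - 1), f"0{width}b")


print(twos_complement(-2, 12))  # 111111111110
```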

05

Find the two's complement of the fraction part

We have to take the two's complement because the number given in the question is negative.

The magnitude 0.101 in the 24-bit field is a sign bit of 0 followed by 101 and zero padding: 0101 0000 0000 0000 0000 0000.

The one's complement of the fraction part is 101011111111111111111111.

Now we add 1 to obtain the two's complement:

101011111111111111111111 + 000000000000000000000001 = 101100000000000000000000
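The same result can be reproduced by treating the 24-bit field as a fixed-point number with one sign bit and 23 fraction bits. That interpretation and the name `encode_fraction` are assumptions of this sketch.

```python
from fractions import Fraction


def encode_fraction(mantissa, width=24):
    """Encode -1 <= mantissa < 1 as a two's complement bit string with
    one sign bit and (width - 1) fraction bits."""
    scaled = Fraction(mantissa) * (1 << (width - 1))
    if scaled.denominator != 1:
        raise ValueError("mantissa needs more fraction bits than available")
    if not -(1 << (width - 1)) <= scaled < (1 << (width - 1)):
        raise ValueError("mantissa out of range")
    return format(int(scaled) & ((1 << width) - 1), f"0{width}b")


print(encode_fraction(Fraction(-5, 8)))  # 101100000000000000000000
```

Here -5/8 is the normalized mantissa -0.101 in binary.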

06

Binary representation of the number -1.5625 × 10^-1

Exponent (12 bits): 111111111110

Mantissa (24 bits): 1011 0000 0000 0000 0000 0000
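One way to check the final pattern is to decode it again. The sketch below assumes the layout described in the question (12-bit two's complement exponent, then a 24-bit two's complement fraction with 23 fraction bits); the function name `decode` is hypothetical.

```python
from fractions import Fraction


def decode(bits):
    """Decode a 36-bit string: 12-bit exponent followed by 24-bit fraction,
    both stored in two's complement."""
    bits = bits.replace(" ", "")
    if len(bits) != 36:
        raise ValueError("expected exactly 36 bits")
    exp_raw, frac_raw = int(bits[:12], 2), int(bits[12:], 2)
    # Reinterpret the unsigned fields as signed two's complement values.
    exponent = exp_raw - (1 << 12) if exp_raw >= (1 << 11) else exp_raw
    fraction = frac_raw - (1 << 24) if frac_raw >= (1 << 23) else frac_raw
    return Fraction(fraction, 1 << 23) * Fraction(2) ** exponent


word = "111111111110" + "1011 0000 0000 0000 0000 0000"
print(float(decode(word)))  # -0.15625
```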

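Range: the 12-bit two's complement exponent spans -2048 to 2047, so this 36-bit format covers a much wider range than IEEE 754 single precision (8-bit exponent) and a wider range than double precision (11-bit exponent). Accuracy: the 24-bit two's complement fraction with no hidden 1 provides about 23 significant bits, which is comparable to (slightly fewer than) the 24 significant bits of single precision and far fewer than the 53 significant bits of double precision.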
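A rough way to compare the formats is to count exponent bits and significant bits. The sketch below assumes that the sign bit of the two's complement fraction leaves 23 magnitude bits in the 36-bit format, and that IEEE 754 single and double precision carry 24 and 53 significant bits once the hidden 1 is included. The helper `summarize` is hypothetical.

```python
import math


def summarize(exponent_bits, significand_bits):
    """Return (largest power-of-two exponent, approximate decimal digits)."""
    max_exponent = 2 ** (exponent_bits - 1) - 1
    digits = significand_bits * math.log10(2)
    return max_exponent, digits


formats = {
    "36-bit format": (12, 23),      # assumption: 23 magnitude bits
    "IEEE 754 single": (8, 24),
    "IEEE 754 double": (11, 53),
}
for name, (e_bits, s_bits) in formats.items():
    max_exp, digits = summarize(e_bits, s_bits)
    print(f"{name}: max exponent {max_exp}, about {digits:.1f} decimal digits")
```

Under these assumptions the 36-bit format has the largest exponent range of the three, while its precision is close to single precision and well below double precision.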

