
How to express -608 in normalised mantissa-exponent form?

Yoo Song Won
Solved by colonel_mortis

Does anyone know how to express -608 in normalised mantissa-exponent form? 

 

I would also like to see how it is done, as I don't know how to get the answer.

 

Thank you.


Assuming you're talking about IEEE 754 floating-point numbers (the floating-point representation used by the majority of programming languages), and you want the 32-bit version (float in C), the layout is:

  • 1 bit for the sign
  • 8 bits for the exponent
  • 23 bits for the mantissa
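
As a rough illustration of that layout, here is a minimal Python sketch that pulls the three fields out of a 32-bit float. The helper name split_float32 is made up for this example; struct is part of the Python standard library. Note that the exponent comes back as it is stored, which in IEEE 754 includes a bias of 127.

```python
import struct


def split_float32(value):
    """Split a value, stored as an IEEE 754 32-bit float, into its three fields."""
    # Pack as a big-endian 32-bit float, then reinterpret the same 4 bytes
    # as an unsigned 32-bit integer so the bits can be masked out.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                # top bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits, as stored (bias of 127 included)
    mantissa = bits & 0x7FFFFF       # low 23 bits (the implicit leading 1 is not here)
    return sign, exponent, mantissa


sign, exponent, mantissa = split_float32(1.0)
print(sign, f"{exponent:08b}", f"{mantissa:023b}")
```

The shifts and masks simply mirror the 1 + 8 + 23 split listed above.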

All three fields are stored in binary. A normalised binary mantissa always has the form 1.xxxxx..., so the leading 1 is implicit and is not stored in the number.

  • The sign bit is 1, because the number is negative.
  • The exponent is 9, because the largest power of 2 ≤ 608 is 512 = 2⁹. IEEE 754 stores the exponent with a bias of 127, so the exponent field holds 9 + 127 = 136 = 10001000₂.
  • For the mantissa, 608 = 512 + 64 + 32 = 2⁹ + 2⁶ + 2⁵ = 1001100000₂ = 1.001100000₂ × 2⁹. We drop the leading 1, because it is implied, so the mantissa field becomes 001100000, right-padded with zeros to fill all 23 bits.

The resulting number is 1 10001000 00110000000000000000000₂ (0xC4180000 in hexadecimal), representing -1.001100000₂ × 2⁹ = -1.1875₁₀ × 2⁹ = -1.1875 × 512 = -608.

(Notation: 101₂ means the binary number 101, which is 5 in decimal (5₁₀).)
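
The hand calculation above can be cross-checked with a short Python sketch. It follows the same steps (sign, power of two, dropping the leading 1) and then compares the result with the bits produced by the standard struct module. The function name float32_fields is hypothetical, and the sketch assumes a nonzero value that fits a normal 32-bit float exactly, so rounding, zero, subnormals, infinities and NaN are not handled. Keep in mind that the exponent field as stored by IEEE 754 includes a bias of 127.

```python
import math
import struct


def float32_fields(value):
    """Derive (sign, exponent, mantissa) bit strings for a 32-bit float by hand.

    Sketch only: assumes value is nonzero, finite, in the normal float32
    range, and exactly representable (no rounding is performed).
    """
    sign = 1 if value < 0 else 0
    m, e = math.frexp(abs(value))    # abs(value) == m * 2**e with 0.5 <= m < 1
    significand = m * 2              # rescale to the normalised 1.xxxx form
    exponent = e - 1                 # so abs(value) == significand * 2**exponent
    fraction = int((significand - 1) * 2**23)  # drop the implicit leading 1
    return f"{sign:b}", f"{exponent + 127:08b}", f"{fraction:023b}"


fields = float32_fields(-608)
print(" ".join(fields))

# Cross-check against the bits Python itself stores for a 32-bit float.
(bits,) = struct.unpack(">I", struct.pack(">f", -608.0))
print(f"{bits:032b}", hex(bits))
print("".join(fields) == f"{bits:032b}")
```

math.frexp is used here only to find the power of two; counting the bits of 608 by hand, as in the explanation, gives the same exponent.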



?
