Know your floats better!
In C, the primary floating point types are float
, double
, and long double
. The range and number of bits for these types can vary depending on the system and compiler, but they generally adhere to the IEEE 754 standard for floating-point arithmetic. Here's a breakdown:
float
Typically 32 bits (4 bytes).
IEEE 754 standard for single-precision floating-point numbers.
Range: Approximately ±1.5 x 10^-45 to ±3.4 x 10^38.
Example representation:
1 bit
(sign) +8 bits
(exponent) +23 bits
(fraction).
double
Typically 64 bits (8 bytes).
IEEE 754 standard for double-precision floating-point numbers.
Range: Approximately ±5.0 x 10^-324 to ±1.7 x 10^308.
Example representation:
1 bit
(sign) +11 bits
(exponent) +52 bits
(fraction).
long double
Size varies: often 80, 96, or 128 bits.
The standard and range can vary more than
float
anddouble
.May follow IEEE 754 or have a completely different format.
IEEE 754 Floating Point Representation
In IEEE 754 standard, a floating point number is represented by three parts:
Sign bit: Determines if the number is positive (0) or negative (1).
Exponent: Encodes the range of the number. It's stored in a biased format.
Fraction (or mantissa): Represents the precision of the number.
Example in C
Let's consider a float
example in C:
#include <stdio.h>
#include <stdint.h>
typedef union {
float f;
uint32_t bits;
} FloatUnion;
void printBinary(uint32_t number, int bits) {
for (int i = bits - 1; i >= 0; i--) {
printf("%d", (number >> i) & 1);
if (i == 31 || i == 23) printf(" ");
}
printf("\n");
}
int main() {
FloatUnion fu;
fu.f = -13.625; // Example float number
printf("Floating point value: %f\n", fu.f);
printf("Binary representation: ");
printBinary(fu.bits, 32);
return 0;
}
In this code:
We use a union to interpret the bits of a
float
as anuint32_t
.The
printBinary
function prints the binary representation of the number, separating the sign, exponent, and fraction parts for clarity.-13.625
in binary form is broken down into the sign bit, exponent, and fraction according to IEEE 754.
When you run this code, it'll display the binary representation of -13.625
as a 32-bit floating-point number.
As an actual encoding example, for the decimal number123.456
, when encoded into afloat
and represented according to the IEEE 754 standard, the binary representation is broken down into three parts:
Sign bit:
0
(indicating a positive number)Exponent:
10000101
Mantissa (Fraction):
11101101110100101111001
The binary representation is thus displayed as:
┌0┐┌10000101┐┌11101101110100101111001┐
Here, each box represents a different part of the IEEE 754 floating-point format:
The first box contains the sign bit.
The second box shows the exponent.
The third box is the mantissa (or fraction part) of the number.
The exponent 10000101
in the IEEE 754 representation of a floating-point number is not the actual exponent but a "biased" exponent. This biasing is a technique used in the IEEE 754 standard to handle both positive and negative exponents in a way that simplifies the design of floating-point hardware.
In the case of 32-bit float
(single precision), the bias used is 127. Here's how the biased exponent is calculated:
Find the Actual Exponent: First, determine the actual exponent of the number in binary form. For the decimal number
123.456
, it's necessary to convert it to binary and normalize it to a form where there's only one non-zero digit before the decimal point.Normalize the Number: To convert
123.456
to binary, we can focus on the integer part and fractional part separately:The integer part
123
in binary is1111011
.The fractional part
.456
is converted to binary by repeatedly multiplying by 2 and taking the integer part of the result, but we can stop after a few digits for this example.Combining these, the binary representation would start as
1111011.011...
(and so on).
Adjust to Scientific Notation: Adjust this binary number to scientific notation (binary form), which would be approximately
1.111011011... x 2^6
. Here,6
is the actual exponent since we moved the decimal point 6 places to the left.Apply the Bias: Add the bias (127 for 32-bit floats) to the actual exponent. So, the biased exponent is
6 + 127 = 133
.Convert to Binary: Finally, convert
133
to binary, which is10000101
.
Therefore, in the IEEE 754 representation of the number 123.456
, the exponent part 10000101
represents the biased exponent, and it is this biased value that is stored in the binary representation of the floating point number.
Understanding the exact binary representation requires familiarity with the IEEE 754 format, including how to calculate the biased exponent and the fraction part from the actual number. This example shows the underlying binary form but understanding the conversion from a floating-point number to this binary form and vice versa is a bit more complex and involves understanding of binary and floating-point arithmetic.
Subscribe to my newsletter
Read articles from Jyotiprakash Mishra directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Jyotiprakash Mishra
Jyotiprakash Mishra
I am Jyotiprakash, a deeply driven computer systems engineer, software developer, teacher, and philosopher. With a decade of professional experience, I have contributed to various cutting-edge software products in network security, mobile apps, and healthcare software at renowned companies like Oracle, Yahoo, and Epic. My academic journey has taken me to prestigious institutions such as the University of Wisconsin-Madison and BITS Pilani in India, where I consistently ranked among the top of my class. At my core, I am a computer enthusiast with a profound interest in understanding the intricacies of computer programming. My skills are not limited to application programming in Java; I have also delved deeply into computer hardware, learning about various architectures, low-level assembly programming, Linux kernel implementation, and writing device drivers. The contributions of Linus Torvalds, Ken Thompson, and Dennis Ritchie—who revolutionized the computer industry—inspire me. I believe that real contributions to computer science are made by mastering all levels of abstraction and understanding systems inside out. In addition to my professional pursuits, I am passionate about teaching and sharing knowledge. I have spent two years as a teaching assistant at UW Madison, where I taught complex concepts in operating systems, computer graphics, and data structures to both graduate and undergraduate students. Currently, I am an assistant professor at KIIT, Bhubaneswar, where I continue to teach computer science to undergraduate and graduate students. I am also working on writing a few free books on systems programming, as I believe in freely sharing knowledge to empower others.