IEEE-754-Standard floating point confusion

Hi there,
I am really confused. According to IEEE 754, the double datatype should have the same representation in C++ and Java.
But when I inspect the byte arrays created from a double value, e.g. 1.1d, they differ between C++ and Java.
The results are below:
Value 1.1 in C++
     bit0     bit1     bit2     bit3     bit4     bit5     bit6     bit7     intC(signed)
byte0     1     1     0     0     1     1     0     1     -51
byte1     1     1     0     0     1     1     1     0     -52
byte2     1     0     0     0     1     1     0     0     -116
byte3     0     0     1     1     1     1     1     1     63
byte4     1     1     0     0     1     1     0     0     -52
byte5     1     1     0     0     1     1     0     0     -52
byte6     1     1     0     0     1     1     0     0     -52
byte7     1     1     0     0     1     1     0     0     -52
Value 1.1 in Java
     bit0     bit1     bit2     bit3     bit4     bit5     bit6     bit7     intJava(signed)
byte0     0     0     1     1     1     1     1     1     63
byte1     1     1     1     1     0     0     0     1     -15
byte2     1     0     0     1     1     0     0     1     -103
byte3     1     0     0     1     1     0     0     1     -103
byte4     1     0     0     1     1     0     0     1     -103
byte5     1     0     0     1     1     0     0     1     -103
byte6     1     0     0     1     1     0     0     1     -103
byte7     1     0     0     1     1     0     1     0     -102
Can somebody please shed some light on this?
Does somebody know the exact specification of the double datatype in C++ and Java?
Best regards,
stonee

OK,
It seems my C program created a bad array. I finally found out that the Java and C arrays for each double are exactly reversed:
C[0] == J[7]
C[1] == J[6]
C[2] == J[5]
It's probably a big-endian vs. little-endian issue, plus nibble swapping on top of that.
I happen to be working on this very problem at this instant. I'll see what I can dig up.
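
The byte reversal described above can be reproduced on the Java side alone. The following is a minimal sketch, not the original poster's program: the class name DoubleBytes and the method toBytes are made up for illustration, and java.nio.ByteBuffer is used to write the same value in both byte orders.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper, not part of the original thread.
final class DoubleBytes {
    /** Returns the 8 bytes of an IEEE 754 double in the requested byte order. */
    static byte[] toBytes(double value, ByteOrder order) {
        return ByteBuffer.allocate(Double.BYTES).order(order).putDouble(value).array();
    }

    /** Returns the 4 bytes of an IEEE 754 float in the requested byte order. */
    static byte[] toBytes(float value, ByteOrder order) {
        return ByteBuffer.allocate(Float.BYTES).order(order).putFloat(value).array();
    }

    public static void main(String[] args) {
        // Big-endian is what Java's DataOutputStream and ByteBuffer use by default.
        System.out.println("double, big-endian:    "
                + Arrays.toString(toBytes(1.1, ByteOrder.BIG_ENDIAN)));
        // Little-endian is the in-memory order on x86.
        System.out.println("double, little-endian: "
                + Arrays.toString(toBytes(1.1, ByteOrder.LITTLE_ENDIAN)));
        System.out.println("float,  little-endian: "
                + Arrays.toString(toBytes(1.1f, ByteOrder.LITTLE_ENDIAN)));
    }
}
```

The big-endian double bytes match the Java table above (63, -15, -103, ..., -102), and the little-endian array is the same sequence reversed. As an aside, the signed values -51, -52, -116, 63 at the start of the C++ table match the little-endian bytes of the 4-byte float 1.1f, which may mean the C program wrote a float rather than a double; that is an inference from the numbers, not something the thread confirms.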

Similar Messages

IEEE 754 standards for representing floating point numbers

Hi all,
Most of us are not aware of how floating-point numbers are actually represented. The IEEE has set a standard for how floating-point numbers should be represented.
Here is a link that explains the representation:
http://en.wikipedia.org/wiki/IEEE_754
If you have any doubts, you can always reach me.
Bye
Happy learning

A noble but misguided attempt at dispelling the recurring problems the programmers have over and over again. There have been repeated posts to links about the IEEE standard, to little or no avail. The newbies who run into the problems will continue to do so, without regard to yet another post about it here.

Convert Floating Point Decimal to Hex

In my application I make some calculations using the DBL floating-point format, and I need to write these values to a file in IEEE 754 floating-point hex format. Is there any way to do this using LabVIEW?

Mike,
Good news. LabVIEW has a function that does exactly what you want. It is well hidden though...
In the Advanced/Data manipulation palette there is a function called Flatten to String. If you feed this function with your DBL precision digital value, you get the IEEE 754 hexadecimal floating-point representation (64 bit) at the data string terminal (as a text string).
I attached a simple example that shows how it works.
Hope this helps. /Mikael Garcia
Attachments:
ieee754converter.vi ‏10 KB
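
For readers outside LabVIEW, the same double-to-hex conversion can be sketched in Java. This is only an illustration of the idea, not a description of what Flatten to String does internally; the class and method names are hypothetical.

```java
// Hypothetical helper: prints the raw IEEE 754 bits of a double as hex.
final class DoubleHex {
    /** Formats the 64 raw bits of a double as 16 upper-case hex digits. */
    static String toIeeeHex(double value) {
        // doubleToLongBits exposes the sign, exponent and fraction bits as a long.
        return String.format("%016X", Double.doubleToLongBits(value));
    }

    public static void main(String[] args) {
        System.out.println(toIeeeHex(1.0));
        System.out.println(toIeeeHex(1.1));
        System.out.println(toIeeeHex(-2.0));
    }
}
```

The digits come out most significant first, so the string reads the same on any platform regardless of its byte order.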

Conversion of a float to IEEE 754 hexa (and vice versa)

Hello everyone,
I need to convert a float into a hexadecimal value to transmit it on a communication bus (I also have to decode the hex value back into a float). The hexadecimal value has to follow the IEEE 754 standard. I'm trying to do it with the basic functions of LabVIEW but I'm facing some problems. You'll find my VI attached.
If someone has already written such a function or knows an easier way to do it, I'd be very grateful.
Attachments:
IEEE 754 conv.png ‏38 KB

If your communication bus is only using singles, then avoid doubles in your code.
How are you converting from the doubles to a single? In my quick experiment, I got the right answer.
Attachments:
double to single hex.png ‏8 KB
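
The round trip asked about (single-precision float to IEEE 754 hex and back) can be sketched in Java as a cross-check for the LabVIEW result. The names FloatHex, encode and decode are assumptions made for this example.

```java
// Hypothetical helper: single-precision float <-> 8-digit IEEE 754 hex string.
final class FloatHex {
    /** Encodes a float as 8 upper-case hex digits, most significant byte first. */
    static String encode(float value) {
        return String.format("%08X", Float.floatToIntBits(value));
    }

    /** Decodes 8 hex digits back into the float with that bit pattern. */
    static float decode(String hex) {
        // parseUnsignedInt accepts patterns with the sign bit set, e.g. "BF800000".
        return Float.intBitsToFloat(Integer.parseUnsignedInt(hex, 16));
    }

    public static void main(String[] args) {
        String hex = encode(1.1f);
        System.out.println(hex + " -> " + decode(hex));
    }
}
```

Converting to single precision first and only then taking the bits matters: the hex string of a double holding 1.1 is a different, 16-digit pattern.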

On-line IEEE floating-point addition

Hi,
Could somebody please recommend the login for on-line IEEE floating-point number addition?
On-line: start adding two floating-point numbers from the most significant position.
Thank you!

Thank you for your reply!
I had a misspelled word in my question. It should have been "logic" instead of "login".
I am looking for some Java code that I can use as a starting point for developing an on-line floating-point addition unit. On-line means that the addition starts at the most significant (left-most) bit and proceeds towards the least significant bits of the binary floating-point number.
For example:
there are 64 bits in an FP number
|1 bit | 11 bits | 52 bits | + |1 bit | 11 bits | 52 bits |
I hope I make sense.
Thank you!
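
As a possible starting point, the 1 | 11 | 52 layout mentioned above can be split into its fields in Java. This sketch only extracts the fields; it does not implement on-line (most-significant-bit-first) addition, and all names in it are hypothetical.

```java
// Hypothetical helper: splits a double into sign, biased exponent and fraction.
final class DoubleFields {
    /** Sign bit: 0 for positive, 1 for negative. */
    static int sign(double v) {
        return (int) (Double.doubleToRawLongBits(v) >>> 63);
    }

    /** 11-bit exponent field, still carrying the bias of 1023. */
    static int biasedExponent(double v) {
        return (int) ((Double.doubleToRawLongBits(v) >>> 52) & 0x7FF);
    }

    /** 52-bit fraction field (the leading 1 of normal numbers is implicit). */
    static long fraction(double v) {
        return Double.doubleToRawLongBits(v) & 0xFFFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        double v = 1.1;
        System.out.printf("sign=%d exponent=%d fraction=0x%013X%n",
                sign(v), biasedExponent(v), fraction(v));
    }
}
```

An adder built on top of this would still have to align the two fractions by the exponent difference before combining them, whichever end it starts from.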

IEEE Floating point format conversion to ForteDouble

Question:
Given that I have 4 bytes of binary data which represent a number in IEEE floating point format, and I wish to convert it to a Forte DoubleData, will the following code give me the correct answer in Value?
(Assume that file is correctly set up, etc...)
Value : DoubleData = new;
FPoint : pointer to float;
F : float;
LineText : BinaryData = new;
File.ReadBinary(LineText,4);
FPoint = (pointer to float)(LineText.Value);
F = *FPoint;
Value.SetValue(F);
Thanks
To unsubscribe, email '[email protected]' with
'unsubscribe forte-users' as the body of the message.
Searchable thread archive <URL:http://pinehurst.sageit.com/listarchive/>

Mark,
you might try testing whether Forte floats are IEEE in the following way:
pFlt : pointer to float = (pointer to float) (res.value);
flt = *pFlt;
However, I believe you will have to wrap a C function to do this.
The C function takes a void * first argument and a float * second argument that receives the result:
#include <string.h>
void ConvIEEE(void *buffer, float *result)
{
    /* "return" is a C keyword, so the output parameter is named result */
    memcpy(result, buffer, sizeof(float));
}
or
void ConvIEEE(void *buffer, float *result)
{
    /* ieeefloat and IEEELibraryConvertToFloat are placeholders for
       whatever IEEE conversion library your platform provides */
    ieeefloat ie;
    memcpy(&ie, buffer, sizeof ie);
    *result = IEEELibraryConvertToFloat(ie);
}
depending upon whether C floats are IEEE or not on your platform/compiler. I think you'll have to investigate this yourself, or try the first approach and see if it works (assuming, of course, that your C compiler's float is also IEEE format).
Good luck!
Your Forte wrapper would look like
class floatWrapper inherits from framework.object
has public method ConvIEEE(input buffer : pointer,
output result : float)
end class;
With your binarydata you would write
res : binarydata = (get from somewhere)
flt : float;
fw : FloatWrapper = new;
fw.ConvIEEE(res.value,flt);
John Jamison [email protected]
Vice President and Chief Technology Officer
Sage IT Partners, Inc.
Voice: 415 392-7243 x 306
Fax: 415 391-3899
Internet Enabled Business Change
http://www.sageit.com
-----------------------------------------------------
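
The thread is about Forte, but the underlying step (reinterpret 4 raw bytes as an IEEE 754 single and widen it to a double) can be sketched in Java for comparison. The class and method names are hypothetical, and the byte order of the file is an assumption the caller has to supply.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: 4 raw IEEE 754 bytes -> float -> double.
final class FloatFromBytes {
    /**
     * Interprets the first 4 bytes of raw as an IEEE 754 single in the
     * given byte order and widens the result to a double (always exact).
     */
    static double toDouble(byte[] raw, ByteOrder order) {
        return ByteBuffer.wrap(raw, 0, Float.BYTES).order(order).getFloat();
    }

    public static void main(String[] args) {
        byte[] bigEndianOne = {0x3F, (byte) 0x80, 0x00, 0x00};
        System.out.println(toDouble(bigEndianOne, ByteOrder.BIG_ENDIAN));
    }
}
```

If the bytes were written on a machine with the opposite byte order, the pointer-cast approach in the thread would read them reversed, so the order has to be known either way.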

Floating Point Representations on SPARC (64-bit architecture)

Hi Reader,
I got hold of the "Numerical Computation Guide -2005" by Sun while looking for floating-point representations on 64-bit architectures. It gives nice illustrations of the single and double formats and of how endianness is handled with two 32-bit words. But it doesn't tell me how it works on 64-bit SPARC or 64-bit x86.
I might be wrong here, but with all integers and pointers being 64 bits long, do we still need to break floating-point numbers up and store them in lower/higher order addresses?
Or is it as simple as the double format having a consistent bit pattern across all architectures (Intel, SPARC, IBM PowerPC, AMD), with the 1 + 11 + 52 layout?
I have tried hard to find documentation that explains how a floating-point number is represented on a 64-bit architecture. Any suggestion would be very helpful.
Thanks for reading. Hope you have something useful to write back.
Regards,
Regmee

The representation of floating-point numbers is specified by IEEE standard 754. This standard contains the specifications for single-precision (32-bit) and double-precision (64-bit) floating-point numbers (there is also a quad-precision (128-bit) format). OpenSPARC T1 supports both single- and double-precision numbers, and can support quad-precision numbers through emulation (not in hardware). The fact that this is a 64-bit machine does not affect how the numbers are stored in memory.
The only thing that affects how the numbers are stored in memory is endianness. The SPARC architecture is big-endian, while x86 is little-endian. But a double-precision floating-point number in a SPARC register looks the same as a double-precision floating-point number in an x86 register.
formalGuy

Serial Communication(CAN) of Floating Point Numbers

Hi,
I have run into a situation where I need to use floating-point numbers (IEEE floating-point standard) when communicating. How can I send and receive floating-point numbers? What conversions need to be made? Is there a good resource on this?
Thanks,
Ken

Hi K.,
in automotive applications a lot of fractional values are exchanged via CAN communication, but the CAN protocol is still based on integer numbers (of variable bit size)…
"We are thinking we need to use single SGL floats ... that require a higher resolution and precision"
What resolution and "precision" do you need? SGL transports 23 bits of mantissa: you can easily pack them into an I32 value!
Let's look at an example:
I have a signal with a value range of 100…1000 and a resolution of 0.125. To send this over CAN I need 13 bits and scale the value with a gain of 0.125 and an offset of 100. Values will be sent in an integer representation:
msgdata value
0 100
500 162.5
501 162.625
1000 225
5000 725
7200 1000
The formula to translate msgdata to value is easy: value := msgdata*gain+offset.
Another example: the car I am testing at the moment sends its current speed as a 16-bit integer value with a range of 0…65532. The gain is 0.01, so speed translates to 0.00…655.32 km/h. The values 65533-65535 have special meanings to indicate errors…
So again: What is your needed resolution and data range?
Another option: send the SGL as it is: just 4 bytes. Receive those 4 bytes on your PC as I32/U32 and typecast them to SGL…
Best regards,
GerdW
CLAD, using 2009SP1 + LV2011SP1 + LV2014SP1 on WinXP+Win7+cRIO
Kudos are welcome
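
The gain/offset scheme from the first example above (gain 0.125, offset 100) can be sketched in Java. The formula value := msgdata*gain+offset is taken from the post; the class name, the encode direction and the use of rounding are assumptions added for illustration.

```java
// Hypothetical helper for the 100…1000 signal with a resolution of 0.125.
final class CanScaling {
    static final double GAIN = 0.125;
    static final double OFFSET = 100.0;

    /** Physical value -> raw integer sent on the bus (rounded to the nearest step). */
    static int encode(double value) {
        return (int) Math.round((value - OFFSET) / GAIN);
    }

    /** Raw integer received from the bus -> physical value. */
    static double decode(int msgdata) {
        return msgdata * GAIN + OFFSET;
    }

    public static void main(String[] args) {
        for (int raw : new int[] {0, 500, 501, 1000, 5000, 7200}) {
            System.out.println(raw + " " + decode(raw));
        }
    }
}
```

The largest raw value, 7200, needs 13 bits, which agrees with the bit count given in the post.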

How does Java store floating point numbers?

Hello
I'm writing a paper about floating-point numbers in which I compare an IEEE 754 compatible language (C) with Java. I read that Java can do a decimal->binary->decimal conversion and retain the same value, whereas C can't. I found several documents discussing the pros and cons of that, but I can't find any information about how it is implemented.
I hope someone can explain it to me, or post a link to a site explaining it.
Cheers
Huttu

So it is a myth.
I still ask because I observed an oddity: when I store 1.4 in C and print it with printf("%2.20f\n", a); I get 1.39999999999999991118. If I do the same in Java with System.out.printf("%2.20f\n", a); I get 1.4. If I multiply the variable by itself I get 1.95999999999999970000:
double a = 1.4;
a = a * a;
System.out.printf("%2.20f\n", a);
Does this happen because of the rounding in Java?
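
One way to look at the value that is actually stored, independent of how printf rounds in either language, is to convert the double to a BigDecimal, which expands the binary value exactly. A minimal sketch with hypothetical names:

```java
import java.math.BigDecimal;

// Hypothetical helper: shows the exact decimal expansion of a stored double.
final class ExactValue {
    /** Exact decimal expansion of the binary value held in v. */
    static String exact(double v) {
        // new BigDecimal(double) does not round; it converts the bits exactly.
        return new BigDecimal(v).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(exact(1.4));
        System.out.println(exact(1.4 * 1.4));
    }
}
```

The first line starts with 1.39999999999999991118, the same digits the C program printed, which suggests that both languages store the same double and differ only in how many digits their formatting routines produce.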

BUG: Large floating point numbers convert to the wrong integer

Hi,
When using the conversion "bullets" to convert SGL, DBL and EXT to integers there are some values which convert wrong. One example is the integer 9223370937343148030, which can be represented exactly as a SGL (and thus exactly as DBL and EXT as well). If you convert this to I64 you get 9223370937343148032 instead, even though the correct integer is within the range of an I64. There are many similar cases, all (I've noticed) within the large end of the ranges.
This has nothing to do with which integers can be represented exactly as a floating point value or not. This is a genuine conversion bug mind you.
Cheers,
Steen
CLA, CTA, CLED & LabVIEW Champion

Yes, I understand the implications involved, and there definitely is a limit to how many significant digits can be displayed in the numeric controls and constants today. I think that either this limit should be lifted or a cap should be put on the configuration page when setting the display format.
I ran into this problem as I'm developing a new toolset that lets you convert all the numeric formats into any other numeric format, just like the current "conversion bullets". My conversion bullets have outputs for overflow and exact conversion as well, since I need that functionality myself for a Math toolset (GPMath) I'm also developing. Eventually I'll maybe include underflow as well, but for now just those two outputs are available. Example:
I do of course pay close attention to the binary representation of the numbers to calculate the Exact conversion? output correctly for each conversion variation (there are hundreds of VIs in polymorphic wrappers), but I relied in some cases on the ability of the numeric indicator to show a true number when configured appropriately - that was when I discovered this bug, which I at first mistook for a conversion error in LabVIEW.
Is there a compliance issue with EXT?
While doing this work I've discovered that the EXT format is somewhat misleadingly labelled as "80-bit IEEE compliant" (it says so here), but that statement should be read with some suspicion IMO. The LabVIEW EXT is not simply IEEE 754-1985 compliant anyway, as that would imply the x87 80-bit extended format. An x87 IEEE 754 extended precision float only has a 63-bit fraction and a 1-bit integer part. That 1-bit integer part is implicit in single and double precision IEEE 754 numbers, but it is explicit in x87 extended precision numbers. LabVIEW EXT seems to have an implicit integer part and a 64-bit fraction, and is thus not straight IEEE 754 compliant. Instead I'd say that the LabVIEW EXT is an IEEE 754r extended format, but still a proprietary one that deserves a bit more detail in the available documentation. Since it's mentioned in several places in the LabVIEW documentation that the EXT is platform independent, your suspicion should already be high though. It didn't take me many minutes to verify the apparent format of the EXT in any case, so no real problem here.
Is there a genuine conversion error from EXT to U64?
The integer 18446744073709549568 can be represented exactly as EXT using this binary representation (mind you that the numeric indicators won't display the value correctly, but instead show 18446744073709549600):
EXT-exponent: 0x100000000111110b
EXT-fraction: 0x1111111111111111111111111111111111111111111111111111000000000000b
--> Decimal: 18446744073709549568
The above EXT value converts exactly to U64 using the To Unsigned Quad Integer "bullet". But then let's try to flip the highest 0 bit in the fraction part of the EXT from 0 to 1, making this value:
EXT-exponent: 0x100000000111110b
EXT-fraction: 0x1111111111111111111111111111111111111111111111111111100000000000b
--> Decimal: 18446744073709550592
The above EXT value is still within U64 range, but the To Unsigned Quad Integer "bullet" converts it to U64_max which is 18446744073709551615. Unless I've missed something this must be a genuine conversion error from EXT to U64?
/Steen
CLA, CTA, CLED & LabVIEW Champion
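
The single-precision case from the first post of this thread can be cross-checked outside LabVIEW. The sketch below uses Java's float as a stand-in for SGL (both are IEEE 754 binary32, which is an assumption about LabVIEW rather than something shown in the thread); the class and method names are hypothetical.

```java
import java.math.BigDecimal;

// Hypothetical cross-check of which integers a 32-bit float can hold near 2^63.
final class LargeFloatCheck {
    /** Exact decimal value of the float, with no display rounding. */
    static String exactDecimal(float f) {
        return new BigDecimal(f).toPlainString();
    }

    public static void main(String[] args) {
        // Parsing picks the float nearest to the decimal string.
        float f = Float.parseFloat("9223370937343148030");
        System.out.println("exact value:   " + exactDecimal(f));
        System.out.println("as long:       " + (long) f);
        // Distance to the next representable float at this magnitude.
        System.out.println("spacing (ulp): " + exactDecimal(Math.ulp(f)));
    }
}
```

Under that assumption the nearest representable value is 9223370937343148032 (2^63 - 2^40), and neighbouring floats are 2^39 apart, so the I64 result reported in the first post would be the exact value of the float and the ...030 figure a display artifact. That reading fits the follow-up post, which attributes the discrepancy to the numeric indicator.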

F suffix for floating point.

Okay, I'm a proficient c++ programmer and have been learning Java for only a few weeks now.
I have a question about the f suffix for floating-point variables, such as float f = 3.14f;
The f suffix casts this to float, right? Which is the same as float f = (float) 3.14; correct?
Why do we have to add the f suffix in the first place? Doesn't the compiler know that we want a float and not a double? (single-precision 32-bit instead of double precision 64 bit) I really do not understand the concept here or why they need the f suffix.
Can someone explain?

ThePHPGuy wrote:
"The f suffix denotes that the literal is of a floating-point type."
Yes. The d suffix does the same.
"Java has two different types of floating-point numbers."
Right.
"The type double is the default type."
Right.
"The float type can have a double and a float literal. Is this true or false?"
No. At least not in any way I understand it.
I think you're confusing two things:
"floating point number" is any number in the IEEE floating point format.
"float" is a datatype holding a 32-bit floating point number.
"double" is a datatype holding a 64-bit floating point number.
floating point number literals can be either double literals (without suffix or if the "d" suffix is used) or float literals (when the "f" suffix is used).
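
The difference between a float literal and a double literal can be seen in a few lines of Java. This is a small sketch with a made-up class name; the commented-out line is there to show what the compiler rejects.

```java
// Hypothetical demo class for float vs. double literals.
final class FloatLiterals {
    static final float WITH_SUFFIX = 3.14f;        // float literal
    static final float WITH_CAST = (float) 3.14;   // double literal, narrowed by a cast
    // float broken = 3.14;  // does not compile: a double literal needs an explicit cast

    public static void main(String[] args) {
        // For this value both routes give the same float.
        System.out.println(WITH_SUFFIX == WITH_CAST);
        // Widened back to double, the float is not the same number as the double 3.14.
        System.out.println((double) WITH_SUFFIX == 3.14);
    }
}
```

So the suffix is not a cast: it makes the literal itself a float, which is why the compiler accepts it without a narrowing conversion.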

Floating point to binary conversion

Hi
I need to convert a floating point decimal number to bits.
Eg. 0.000532 to be converted to binary(bits).
How do I do this?

"Now if I convert that decimal number to bits (in the usual method of dividing by 2), will that be the exact binary representation of the floating point decimal number?"
You have the same bit pattern in both cases. In one it's held in a double and will be interpreted as a floating point number according to the IEEE 754 representation. In the other it's held in a long and will be interpreted according to the two's complement representation. But it's the same bit pattern.
Note that Long has a toString method which allows you to convert the long to a String. The radix in your case is 2 for binary.
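
Put together, the approach described above might look like the following sketch. It uses Long.toBinaryString instead of Long.toString with radix 2 so that negative numbers show their sign bit rather than a minus sign; the class and method names are hypothetical.

```java
// Hypothetical helper: double <-> 64-character string of '0' and '1'.
final class DoubleBits {
    /** Returns the 64 IEEE 754 bits of value, most significant bit first. */
    static String toBitString(double value) {
        String bits = Long.toBinaryString(Double.doubleToLongBits(value));
        // toBinaryString drops leading zeros, so pad back to 64 characters.
        return String.format("%64s", bits).replace(' ', '0');
    }

    /** Rebuilds the double from a 64-character bit string. */
    static double fromBitString(String bits) {
        return Double.longBitsToDouble(Long.parseUnsignedLong(bits, 2));
    }

    public static void main(String[] args) {
        String bits = toBitString(0.000532);
        System.out.println(bits);
        System.out.println(fromBitString(bits));
    }
}
```

Note that this prints the stored IEEE 754 pattern (sign, exponent, fraction), which is not the same as the digit string obtained by repeatedly multiplying the decimal fraction by 2.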

Check Floating Point Number

Hello All,
I am having some trouble checking the value of a field with Key Figure type Number with 8-byte floating point. I want to read that field and populate another field with an X if the check is true. For example, if that field is equal to 5,0000000000000000E+07 then I want to mark the other field with an 'X'.
The problem is in my code: how do I read the number in the FLTP field, such as the number above? My code for the 'X' field reads as follows.
    if SOURCE_FIELDS-abc123 eq 5000000.
      RESULT = 'X'.
    endif.
Thanks everyone in advance

You don't need to worry about converting the value into standard or floating-point format; just write your code as you want and the conversion is taken care of automatically. Basically 5,0000000000000000E+07 = 50,000,000, so compare against 50000000 (your code has 5000000).
thanks.
Wond

Floating point formats: Java/C/C++, PPC and Intel platforms

Hi everyone
Where can I find out about the various bit formats used for 32 bit floating numbers in Java and C/C++ for both Mac hardware platforms?
I'm developing a Java audio application which needs to convert vast quantities of variable width integer audio samples to canonical float audio format. I've discovered that a floating point divide by the maximum integer value gives the correct answer but takes too much processor time, so I'm trying out bit-twiddling in C via JNI to carve out my own floating point bit patterns. This is very fast, however, I need to take into account the various float formats used on the different platforms so my app can be universal. Can anyone point me to the information?
Thanks in advance.
Bob

I am not sure that Rosetta floating point works the same as PPC floating point. I was using RealBasic (a PPC Basic compiler) and moved one of my compiled applications to a MacBook Pro, and floating-point comparisons that had been exact on the PPC stopped working under Rosetta. I changed the code to do an approximate comparison (i.e. abs(a - b) < tolerance) and this fixed things.
I reported the problem to the RealBasic people and thought nothing more of it until I fired up Adobe's InDesign and not being used to working with picas, changed the units of measurement to inches. The default letter paper size was suddenly 8.5000500050005 inches instead of the more usual 8.5! This was not a big problem, but it appears that all of InDesign's page math is running into some kind of rounding errors.
The floating point format is almost certainly IEEE, and I cannot imagine Rosetta doing anything other than using native hardware Intel floating point. On the other hand, there is a subtle difference in behavior.
I am posting this here as a follow up, but I am also going to post this as a proper question in the forum. If you have to delete one or the other of these duplicate posts, please zap the reply, not the question.
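
The approximate comparison mentioned above, abs(a - b) < tolerance, can be sketched in Java as follows. The class name, the method name and the tolerance value are assumptions; a suitable tolerance depends on the magnitude of the values being compared.

```java
// Hypothetical helper for tolerance-based comparison of doubles.
final class ApproxEquals {
    /** True if a and b differ by less than tolerance (an absolute bound). */
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) < tolerance;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                   // exact comparison
        System.out.println(nearlyEqual(sum, 0.3, 1e-9));  // approximate comparison
    }
}
```

An absolute tolerance like this works for values of similar, known magnitude; for values spanning many orders of magnitude a relative tolerance is usually the safer choice.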

Precision loss - conversions between exact values and floating point values

Hi!
I read this in your SQL Reference manual, but I don't quite get it.
Conversions between exact numeric values (TT_TINYINT, TT_SMALLINT, TT_INTEGER, TT_BIGINT, NUMBER) and floating-point values (BINARY_FLOAT, BINARY_DOUBLE) can be inexact because the exact numeric values use decimal precision whereas the floating-point numbers use binary precision.
Could you please give two examples: one where a TT_TINYINT is converted to a BINARY_DOUBLE and one when a TT_BIGINT is converted into a DOUBLE, both cases give examples on lost precision? This would be very helpful.
Thanks!
Sune
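
The effect the manual describes can be illustrated in Java without any database. In this sketch a Java long stands in for TT_BIGINT, a Java double for BINARY_DOUBLE and a decimal string for NUMBER; these mappings, and all names in the code, are assumptions made for the illustration and say nothing about TimesTen's actual conversion code.

```java
import java.math.BigDecimal;

// Hypothetical illustration of exact-numeric to binary floating-point conversion.
final class PrecisionLoss {
    /** True if the integer n is exactly representable as a double. */
    static boolean isExactAsDouble(long n) {
        return new BigDecimal((double) n).compareTo(BigDecimal.valueOf(n)) == 0;
    }

    /** True if the decimal number written in text is exactly representable as a double. */
    static boolean isExactAsDouble(String text) {
        return new BigDecimal(Double.parseDouble(text)).compareTo(new BigDecimal(text)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isExactAsDouble(255L));               // small integer
        System.out.println(isExactAsDouble(9007199254740993L));  // 2^53 + 1
        System.out.println(isExactAsDouble("0.1"));              // decimal fraction
    }
}
```

A double has a 53-bit significand, so any 8-bit integer converts exactly and a lossy example with a tiny integer type should not exist; 64-bit integers above 2^53 and decimal fractions such as 0.1 are where the conversion can become inexact.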

chokpa wrote:
public Example(float... values) {}
new Example(1, 1e2, 3.0, 4.754);
"It accepts it if I just use 1,2,3,4 as the values being passed in, but doesn't like it if I use actual float values."
Those are double literals, try
new Example(1f, 1e2f, 3.0f, 4.754f);
