Can a conversion from double to int be written in portable C
I need to write a function like double_to_int(double val, int *err) which would convert double val to an integer when that is possible, and otherwise report an error (NAN/INFs/OUT_OF_RANGE).
So a pseudocode implementation would look like:
if isnan(val):
    err = ERR_NAN
    return 0
if val < MIN_INT:
    err = ERR_MINUS_INF
    return MIN_INT
if ...
return (int)val
There are at least two similar questions on SO:
In this answer it's solved in a clean enough way, though it's a C++ solution - in C we do not have portable digits for signed int.
In this answer, it's explained why we cannot just check (val > INT_MAX || val < INT_MIN).
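To make the problem with the naive check concrete, here is a small sketch. It uses float rather than double only so that the effect shows up on common platforms; it assumes a 32-bit int, an IEEE-754 binary32 float and round-to-nearest conversion, none of which the C standard guarantees. The name naive_in_range is hypothetical. The same thing can happen with double wherever int has more value bits than the double significand.

```c
#include <assert.h>
#include <limits.h>

/*
 * Naive range check from the question, applied to float.
 * In "val > INT_MAX" the int is converted to float first. Under the
 * assumptions above, INT_MAX (2^31 - 1) is not representable and
 * becomes 2^31, so val == 2^31 is NOT reported as too large.
 */
static int naive_in_range(float val)
{
    assert(val == val);                 /* sketch only: NaN not handled */
    return !(val > INT_MAX || val < INT_MIN);
}
```

Under those assumptions naive_in_range(2147483648.0f) returns 1 even though casting that value to int would overflow, so the check cannot be trusted right at the boundary.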
So the only possible clean way I see is to use the floating-point environment, but it's stated to be an implementation-defined feature.
So my question: is there any way to implement the double_to_int function in a cross-platform way (based only on the C standard, without even assuming that target platforms support IEEE-754)?
Looking for an answer drawing from credible and/or official sources.
There's already a good answer from Eric. But I feel it doesn't quite yet disprove my assertion that this is not possible. Hence the bounty.
I wonder if frexp is any help.
– Steve Summit
Jun 29 at 15:30
I really feel like your question is answered by the answer you linked in your question, which makes your question a duplicate.
– Stargateur
Jun 29 at 15:33
You really should explain why the answer shown by Stargateur does not answer your question.
– Serge Ballesta
Jun 29 at 15:35
I feel that all the "close duplicates" fail over to a particular implementation at some point in their answers. I strongly believe that it's not possible to do this; my answer is little more than an invitation to peer review.
– Bathsheba
Jun 29 at 15:35
5 Answers
Since conversion of double to int truncates toward zero, all the double values that properly convert to int are in the open interval (INT_MIN−1, INT_MAX+1), and every value not inside this interval overflows when converted to int or is a NaN. We will find the double value UpperBound that is the greatest representable value less than INT_MAX+1 and the value LowerBound that is the least representable value greater than INT_MIN−1. Then the set of double values in the open interval (INT_MIN−1, INT_MAX+1) equals the set of double values in the closed interval [LowerBound, UpperBound], and we can test whether a value x is in the set by evaluating LowerBound <= x && x <= UpperBound.
The following determines UpperBound:
static double UpperBound;
double b1 = INT_MAX, b0 = nexttoward(b1, 0);
if (INT_MAX - (int) ceil(b0) < (int) (b1-b0))
    UpperBound = b0;
else if (INT_MAX - (int) ceil(b0) == (int) (b1-b0))
    UpperBound = nexttoward(ceil(nexttoward(INT_MAX, HUGE_VALL)), 0);
else
    UpperBound = b1;
Reasoning:
Finally, we consider the case where INT_MAX - (int) ceil(b0) equals b1-b0. In this case, b1 must equal INT_MAX, but there may be additional significand bits below the position value 1. For example, INT_MAX+1 may be a representable value. The reasoning here is:
LowerBound can be found from INT_MIN similarly.
The above does require that INT_MAX and INT_MIN be within the range of double. Thus, this could fail in an implementation with a large int type and a very constrained non-IEEE-754 double type with no infinities. Of course, in such a system, all conversions from double to int are in range.
I think this does go quite some way; will study it later. Hopefully the bounty will help attract more attention.
– Bathsheba
yesterday
@NominalAnimal: An interesting idea, I will think about whether converting to unsigned int gives us some leeway. However, the conversion caused by a cast from double to unsigned int is not necessarily modulo. Per C 2011 (N1570) 6.3.1.4 note 61, “The remaindering operation performed when a value of integer type is converted to unsigned type need not be performed when a value of real floating type is converted to unsigned type. Thus, the range of portable real floating values is (−1, Utype_MAX+1).”
– Eric Postpischil
10 hours ago
limits.h defines INT_MAX and INT_MIN.
– Bob Jarvis
10 hours ago
@BobJarvis: What is your point? We do not have any issue with obtaining INT_MAX or INT_MIN in int. The problem is we do not know they can be converted to double without error, but we need to find the greatest double that is less than INT_MAX+1. So we need to find some way to evade or correct for the rounding errors that may occur during conversion.
– Eric Postpischil
10 hours ago
@EricPostpischil: I rewrote my answer to use floor(max_double_to_int) == (double)INT_MAX and ceil(min_double_to_int) == (double)INT_MIN in the nextafter() loops, plus handling for the odd case when DBL_MAX <= INT_MAX or -DBL_MAX >= INT_MIN via strtod().
– Nominal Animal
5 hours ago
(This answer is in dispute, although I still think I'm correct, please therefore don't upvote unwisely.)
You cannot implement such a function in portable C.
In this respect, it's rather like malloc &c.
The moral of the story really is that mixing types is never a good idea in C; i.e. write code in such a way that type conversions are not necessary.
What is your basis for asserting such a function cannot be implemented? At worst, one could convert the exact floating-point number to a character string containing a decimal or hexadecimal numeral and then test whether the numeral in the string could be interpreted as an in-range integer.
– Eric Postpischil
Jun 29 at 15:44
strtol() should be able to detect a range error?
– Stargateur
Jun 29 at 15:47
@EricPostpischil: But do bear in mind that int and long might be the same size.
– Bathsheba
Jun 29 at 15:49
@Bathsheba: So? If int and long are the same size, and x is a long, then x <= INT_MAX always returns true. It is still a valid comparison, and similarly for INT_MIN. I do not see what problem you think there is.
– Eric Postpischil
Jun 29 at 15:50
Convert INT_MAX to a double and take the next lower and next higher representable values (via nexttoward). If the subject value is less than the former, it is safe on the positive side (repeat with INT_MIN for the negative side). If it is not less than or equal to the latter, it is out of bounds. (That also reports infinity and NaN as out of bounds.) Then we just have finicky values around the bounds to test.
– Eric Postpischil
Jun 29 at 16:09
The underlying problem is to find min_double_to_int and max_double_to_int, the smallest and largest double, respectively, that can be converted to an int.
The portable conversion function itself can be written in C11 as
int double_to_int(const double value, int *err)
{
    /* Infinities and NaN can never be converted. */
    if (!isfinite(value)) {
        if (isnan(value)) {
            if (err) *err = ERR_NAN;
            return 0;
        } else
        if (signbit(value)) {
            if (err) *err = ERR_NEG_INF;
            return INT_MIN;
        } else {
            if (err) *err = ERR_POS_INF;
            return INT_MAX;
        }
    }
    /* Finite, but outside the convertible range. */
    if (value < min_double_to_int) {
        if (err) *err = ERR_TOOSMALL;
        return INT_MIN;
    } else
    if (value > max_double_to_int) {
        if (err) *err = ERR_TOOLARGE;
        return INT_MAX;
    }
    if (err) *err = 0;
    return (int)value;
}
Before the above function is first used, we need to assign min_double_to_int and max_double_to_int.
EDITED on 2018-07-03: Rewritten approach.
We can use a simple function to find the smallest power of ten that is at least as large as INT_MAX/INT_MIN in magnitude. If those exponents are smaller than DBL_MAX_10_EXP, the range of double is greater than the range of int, and we can cast INT_MAX and INT_MIN to double.
Otherwise, we construct a string containing the decimal representation of INT_MAX/INT_MIN, and use strtod() to convert them to double. If this operation overflows, it means the range of double is smaller than the range of int, and we can use DBL_MAX/-DBL_MAX as max_double_to_int and min_double_to_int, respectively.
When we have INT_MAX as a double, we can use a loop to increment that value using nextafter(value, HUGE_VAL). The largest value that is finite, and rounded down using floor() still yields the same double value, is max_double_to_int.
Similarly, when we have INT_MIN as a double, we can use a loop to decrement that value using nextafter(value, -HUGE_VAL). The largest value in magnitude that is still finite, and rounds up (ceil()) to the same double, is min_double_to_int.
Here is an example program to illustrate this:
#include <stdlib.h>
#include <limits.h>
#include <string.h>
#include <float.h>
#include <stdio.h>
#include <errno.h>
#include <math.h>
static double max_double_to_int = -1.0;
static double min_double_to_int = +1.0;
#define ERR_OK 0
#define ERR_NEG_INF -1
#define ERR_POS_INF -2
#define ERR_NAN -3
#define ERR_NEG_OVER 1
#define ERR_POS_OVER 2
int double_to_int(const double value, int *err)
{
    if (!isfinite(value)) {
        if (isnan(value)) {
            if (err) *err = ERR_NAN;
            return 0;
        } else
        if (signbit(value)) {
            if (err) *err = ERR_NEG_INF;
            return INT_MIN;
        } else {
            if (err) *err = ERR_POS_INF;
            return INT_MAX;
        }
    }
    if (value < min_double_to_int) {
        if (err) *err = ERR_NEG_OVER;
        return INT_MIN;
    } else
    if (value > max_double_to_int) {
        if (err) *err = ERR_POS_OVER;
        return INT_MAX;
    }
    if (err) *err = ERR_OK;
    return (int)value;
}
/* Largest finite double whose floor() is still target. */
static inline double find_double_max(const double target)
{
    double next = target;
    double curr;
    do {
        curr = next;
        next = nextafter(next, HUGE_VAL);
    } while (isfinite(next) && floor(next) == target);
    return curr;
}
/* Smallest finite double whose ceil() is still target. */
static inline double find_double_min(const double target)
{
    double next = target;
    double curr;
    do {
        curr = next;
        next = nextafter(next, -HUGE_VAL);
    } while (isfinite(next) && ceil(next) == target);
    return curr;
}
/* Number of decimal digits in value. */
static inline int ceil_log10_abs(int value)
{
    int result = 1;
    while (value < -9 || value > 9) {
        result++;
        value /= 10;
    }
    return result;
}
/* Decimal representation of value in a malloc()ed string, or NULL on failure. */
static char *int_string(const int value)
{
    char *buf;
    size_t max = ceil_log10_abs(value) + 4;
    int len;
    while (1) {
        buf = malloc(max);
        if (!buf)
            return NULL;
        len = snprintf(buf, max, "%d", value);
        if (len < 1) {
            free(buf);
            return NULL;
        }
        if ((size_t)len < max)
            return buf;
        free(buf);
        max = (size_t)len + 2;
    }
}
/* Returns 0 on success, +1 on overflow, -1 on any other error. */
static int int_to_double(double *to, const int ivalue)
{
    char *ival, *iend;
    double dval;
    ival = int_string(ivalue);
    if (!ival)
        return -1;
    iend = ival;
    errno = 0;
    dval = strtod(ival, &iend);
    if (errno == ERANGE) {
        if (*iend != '\0' || dval != 0.0) {
            /* Overflow */
            free(ival);
            return +1;
        }
    } else
    if (errno != 0) {
        /* Unknown error, not overflow */
        free(ival);
        return -1;
    } else
    if (*iend != '\0') {
        /* Overflow */
        free(ival);
        return +1;
    }
    free(ival);
    /* Paranoid overflow check. */
    if (!isfinite(dval))
        return +1;
    if (to)
        *to = dval;
    return 0;
}
int init_double_to_int(void)
{
    double target;
    if (DBL_MAX_10_EXP > ceil_log10_abs(INT_MAX))
        target = INT_MAX;
    else {
        switch (int_to_double(&target, INT_MAX)) {
        case 0: break;
        case 1: target = DBL_MAX; break;
        default: return -1;
        }
    }
    max_double_to_int = find_double_max(target);
    if (DBL_MAX_10_EXP > ceil_log10_abs(INT_MIN))
        target = INT_MIN;
    else {
        switch (int_to_double(&target, INT_MIN)) {
        case 0: break;
        case 1: target = -DBL_MAX; break;
        default: return -1;
        }
    }
    min_double_to_int = find_double_min(target);
    return 0;
}
int main(void)
{
    int i, val, err;
    double temp;
    if (init_double_to_int()) {
        fprintf(stderr, "init_double_to_int() failed.\n");
        return EXIT_FAILURE;
    }
    printf("(int)max_double_to_int = %d\n", (int)max_double_to_int);
    printf("(int)min_double_to_int = %d\n", (int)min_double_to_int);
    printf("max_double_to_int = %.16f = %a\n", max_double_to_int, max_double_to_int);
    printf("min_double_to_int = %.16f = %a\n", min_double_to_int, min_double_to_int);
    /* Probe one ULP below, at, and one ULP above the upper limit. */
    temp = nextafter(max_double_to_int, 0.0);
    for (i = -1; i <= 1; i++) {
        val = double_to_int(temp, &err);
        printf("(int)(max_double_to_int %+d ULP)", i);
        switch (err) {
        case ERR_OK: printf(" -> %d\n", val); break;
        case ERR_POS_OVER: printf(" -> overflow\n"); break;
        case ERR_POS_INF: printf(" -> infinity\n"); break;
        default: printf(" -> BUG\n");
        }
        temp = nextafter(temp, HUGE_VAL);
    }
    /* Same probe around the lower limit. */
    temp = nextafter(min_double_to_int, 0.0);
    for (i = 1; i >= -1; i--) {
        val = double_to_int(temp, &err);
        printf("(int)(min_double_to_int %+d ULP)", i);
        switch (err) {
        case ERR_OK: printf(" -> %d\n", val); break;
        case ERR_NEG_OVER: printf(" -> overflow\n"); break;
        case ERR_NEG_INF: printf(" -> infinity\n"); break;
        default: printf(" -> BUG\n");
        }
        temp = nextafter(temp, -HUGE_VAL);
    }
    return EXIT_SUCCESS;
}
Perhaps this might work:
#define BYTES_TO_BITS(x) (x*8)
void numToIntnt(double num, int *output) {
    const int upperLimit = ldexp(1.0, (BYTES_TO_BITS(sizeof(int))-1))-1;
    const int lowerLimit = (-1)*ldexp(1.0, (BYTES_TO_BITS(sizeof(int))-1));
    /*
     * or a faster approach if the rounding is acceptable:
     * const int upperLimit = ~(1<<(BYTES_TO_BITS(sizeof(int))-1));
     * const int lowerLimit = (1<<(BYTES_TO_BITS(sizeof(int))-1));
     */
    if(num > upperLimit) {
        /* report invalid conversion */
    } else if (num < lowerLimit) {
        /* report invalid conversion */
    } else {
        *output = (int)num;
    }
}
How would that return a value of zero?
– Andrew Henle
yesterday
@AndrewHenle IMO the goal is to achieve a proper conversion rather than an error log, which I have tried to simplify (actually both errors report the same value, in a variable in which any value is expected). Anyway, I have edited the code in order to make it clearer.
– Jose Felipe
yesterday
The calculation of upperLimit attempts to calculate 2^(width-1) - 1, where width is the number of bits in an int. Even if some of those bits are padding bits, so they do not contribute to the available values, subtracting 1 is a problem. C does not specify what happens if the result is not exactly representable in floating-point. It might round up or down. Then you do not know whether you should use val < upperLimit or val <= upperLimit.
– Eric Postpischil
10 hours ago
Testing val > upperLimit will report false for a NaN, as will the other comparison, so this code will fall through to the *err = (int) val case, which we do not want. (Why is it called “err”? That suggests an error, but this is for returning the correct value, is it not?) These tests should be structured so that if the value is in range, then it is converted, else an error is reported. Then NaNs flow to the error path. Or NaNs could be tested for separately.
– Eric Postpischil
10 hours ago
This code assumes the minimum integer value is the negative of a power of two, but the C standard does not require that.
– Eric Postpischil
10 hours ago
It depends a lot on what you mean by “convert”, and how efficient you want it to be.
For example, you could sprintf the floating value to a string, do string-based inspection (i.e. by string-based comparison to max and min values you also sprintf’d), validation, rounding, etc and then sscanf the known-valid string for the final value.
In effect, you’d be moving toward an intermediate representation that’s (a) portable and (b) convenient. C strings are fine at portability, but not so convenient. If you can use external libraries, there are several that are convenient, but whose portability should be confirmed.
"Please read before marking as duplicate." should go on comment section
– Stargateur
Jun 29 at 15:26