Can a conversion from double to int be written in portable C





I need to write a function like double_to_int(double val, int *err) which
would convert the double val to an integer when possible, and otherwise report an error (NAN/INFs/OUT_OF_RANGE).





So a pseudocode implementation would look like:


if isnan(val):
    err = ERR_NAN
    return 0
if val < MIN_INT:
    err = ERR_MINUS_INF
    return MIN_INT
if ...
return (int)val



There are at least two similar questions on SO:
In this answer, it is solved cleanly enough, though it is a C++ solution; in C we do not have a portable digits value for signed int.
In this answer, it is explained why we cannot just check (val > INT_MAX || val < INT_MIN).
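
To make the second point concrete, here is a minimal sketch that is not taken from either linked answer. It uses long long instead of int, because on common implementations with a 32-bit int and an IEEE-754 double every int is exactly representable and the problem stays hidden. The name naive_fits_llong is made up for illustration.

```c
#include <limits.h>

/* Hypothetical helper for illustration only: the naive range test,
 * written for long long so the rounding problem shows up on
 * typical platforms. */
int naive_fits_llong(double val)
{
    /* LLONG_MAX and LLONG_MIN are converted to double for the
     * comparisons. If double has fewer significand bits than
     * long long has value bits, the conversion of LLONG_MAX is
     * inexact and rounds in an implementation-defined direction. */
    return !(val > LLONG_MAX || val < LLONG_MIN);
}
```

On an implementation where LLONG_MAX is 2^63-1 and converts to the double 2^63 (the usual round-to-nearest result), naive_fits_llong(9223372036854775808.0) returns 1 even though casting that value to long long is undefined. A NaN also passes, because both comparisons are false for it.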





So the only possible clean way I see is to use the floating-point environment, but it is stated to be an implementation-defined feature.



So my question: is there any way to implement the double_to_int function in a cross-platform way (based only on the C standard, without even assuming
that the target platforms support IEEE-754)?





Looking for an answer drawing from credible and/or official sources.



There's already a good answer from Eric. But I feel it doesn't quite yet disprove my assertion that this is not possible. Hence the bounty.





"Please read before marking as duplicate." should go on comment section
– Stargateur
Jun 29 at 15:26





I wonder if frexp is any help.
– Steve Summit
Jun 29 at 15:30







I really feel like your question is answered by the answer you linked in your question, which makes your question a duplicate.
– Stargateur
Jun 29 at 15:33





You really should explain why the answer shown by Stargateur does not answer your question.
– Serge Ballesta
Jun 29 at 15:35





I feel that all the "close duplicates" fail over to a particular implementation at some point in their answers. I strongly believe that it's not possible to do this; my answer is little more than an invitation to peer review.
– Bathsheba
Jun 29 at 15:35





5 Answers



Since conversion of double to int truncates toward zero, all the double values that properly convert to int are in the open interval (INT_MIN−1, INT_MAX+1), and every value not inside this interval overflows when converted to int or is a NaN. We will find the double value UpperBound that is the greatest representable value less than INT_MAX+1 and the value LowerBound that is the least representable value greater than INT_MIN−1. Then the set of double values in the open interval (INT_MIN−1, INT_MAX+1) equals the set of double values in the closed interval [LowerBound, UpperBound], and we can test whether a value x is in the set by evaluating LowerBound <= x && x <= UpperBound.





The following determines UpperBound:




static double UpperBound;

double b1 = INT_MAX, b0 = nexttoward(b1, 0);
if (INT_MAX - (int) ceil(b0) < (int) (b1-b0))
    UpperBound = b0;
else if (INT_MAX - (int) ceil(b0) == (int) (b1-b0))
    UpperBound = nexttoward(ceil(nexttoward(INT_MAX, HUGE_VALL)), 0);
else
    UpperBound = b1;



Reasoning:





Finally, we consider the case where INT_MAX - (int) ceil(b0) equals b1-b0. In this case, b1 must equal INT_MAX, but there may be additional significand bits below the position value 1. For example, INT_MAX+1 may be a representable value. The reasoning here is:





LowerBound can be found from INT_MIN similarly.





The above does require that INT_MAX and INT_MIN be within the range of double. Thus, this could fail in an implementation with a large int type and a very constrained non-IEEE-754 double type with no infinities. Of course, in such a system, all conversions from double to int are in range.







I think this does go quite some way; will study it later. Hopefully the bounty will help attract more attention.
– Bathsheba
yesterday





@NominalAnimal: An interesting idea, I will think about whether converting to unsigned int gives us some leeway. However, the conversion caused by a cast from double to unsigned int is not necessarily modulo. Per C 2011 (N1570) 6.3.1.4 note 61, “The remaindering operation performed when a value of integer type is converted to unsigned type need not be performed when a value of real floating type is converted to unsigned type. Thus, the range of portable real floating values is (−1, Utype_MAX+1).”
– Eric Postpischil
10 hours ago








limits.h defines INT_MAX and INT_MIN.
– Bob Jarvis
10 hours ago







@BobJarvis: What is your point? We do not have any issue with obtaining INT_MAX or INT_MIN in int. The problem is we do not know they can be converted to double without error, but we need to find the greatest double that is less than INT_MAX+1. So we need to find some way to evade or correct for the rounding errors that may occur during conversion.
– Eric Postpischil
10 hours ago







@EricPostpischil: I rewrote my answer to use floor(max_double_to_int) == (double)INT_MAX and ceil(min_double_to_int) == (double)INT_MIN in the nextafter() loops, plus handling for the odd case when DBL_MAX <= INT_MAX or -DBL_MAX >= INT_MIN via strtod().
– Nominal Animal
5 hours ago






(This answer is in dispute, although I still think I'm correct, please therefore don't upvote unwisely.)



You cannot implement such a function in portable C.



In this respect, it's rather like malloc &c.





The moral of the story really is that mixing types is never a good idea in C; i.e. write code in such a way that type conversions are not necessary.





What is your basis for asserting such a function cannot be implemented? At worst, one could convert the exact floating-point number to a character string containing a decimal or hexadecimal numeral and then test whether the numeral in the string could be interpreted as an in-range integer.
– Eric Postpischil
Jun 29 at 15:44





strtol() should be able to detect error in range ?
– Stargateur
Jun 29 at 15:47








@EricPostpischil: But do bear in mind that int and long might be the same size.
– Bathsheba
Jun 29 at 15:49







@Bathsheba: So? If int and long are the same size, and x is a long, then x <= INT_MAX always returns true. It is still a valid comparison, and similarly for INT_MIN. I do not see what problem you think there is.
– Eric Postpischil
Jun 29 at 15:50








Convert INT_MAX to a double and take the next lower and next higher representable values (via nexttoward). If the subject value is less than the former, it is safe on the positive side (repeat with INT_MIN for the negative side). If it is not less than or equal to the latter, it is out of bounds. (That also reports infinity and NaN as out of bounds.) Then we just have finicky values around the bounds to test.
– Eric Postpischil
Jun 29 at 16:09






The underlying problem is to find min_double_to_int and max_double_to_int, the smallest and largest double, respectively, that can be converted to an int.





The portable conversion function itself can be written in C11 as


int double_to_int(const double value, int *err)
{
    if (!isfinite(value)) {
        if (isnan(value)) {
            if (err) *err = ERR_NAN;
            return 0;
        } else
        if (signbit(value)) {
            if (err) *err = ERR_NEG_INF;
            return INT_MIN;
        } else {
            if (err) *err = ERR_POS_INF;
            return INT_MAX;
        }
    }

    if (value < min_double_to_int) {
        if (err) *err = ERR_TOOSMALL;
        return INT_MIN;
    } else
    if (value > max_double_to_int) {
        if (err) *err = ERR_TOOLARGE;
        return INT_MAX;
    }

    if (err) *err = 0;
    return (int)value;
}



Before the above function is first used, we need to assign min_double_to_int and max_double_to_int.





EDITED on 2018-07-03: Rewritten approach.



We can use a simple function to find the smallest power of ten that is at least as large as INT_MAX/INT_MIN in magnitude. If those are smaller than DBL_MAX_10_EXP, the range of double is greater than the range of int, and we can cast INT_MAX and INT_MIN to double.





Otherwise, we construct a string containing the decimal representation of INT_MAX/INT_MIN, and use strtod() to convert them to double. If this operation overflows, it means the range of double is smaller than the range of int, and we can use DBL_MAX/-DBL_MAX as max_double_to_int and min_double_to_int, respectively.





When we have INT_MAX as a double, we can use a loop to increment that value using nextafter(value, HUGE_VAL). The largest value that is finite, and rounded down using floor() still yields the same double value, is max_double_to_int.





Similarly, when we have INT_MIN as a double, we can use a loop to decrement that value using nextafter(value, -HUGE_VAL). The largest value in magnitude that is still finite, and rounds up (ceil()) to the same double, is min_double_to_int.





Here is an example program to illustrate this:


#include <stdlib.h>
#include <limits.h>
#include <string.h>
#include <float.h>
#include <stdio.h>
#include <errno.h>
#include <math.h>

static double max_double_to_int = -1.0;
static double min_double_to_int = +1.0;

#define ERR_OK 0
#define ERR_NEG_INF -1
#define ERR_POS_INF -2
#define ERR_NAN -3
#define ERR_NEG_OVER 1
#define ERR_POS_OVER 2

int double_to_int(const double value, int *err)
{
    if (!isfinite(value)) {
        if (isnan(value)) {
            if (err) *err = ERR_NAN;
            return 0;
        } else
        if (signbit(value)) {
            if (err) *err = ERR_NEG_INF;
            return INT_MIN;
        } else {
            if (err) *err = ERR_POS_INF;
            return INT_MAX;
        }
    }

    if (value < min_double_to_int) {
        if (err) *err = ERR_NEG_OVER;
        return INT_MIN;
    } else
    if (value > max_double_to_int) {
        if (err) *err = ERR_POS_OVER;
        return INT_MAX;
    }

    if (err) *err = ERR_OK;
    return (int)value;
}


static inline double find_double_max(const double target)
{
    double next = target;
    double curr;

    do {
        curr = next;
        next = nextafter(next, HUGE_VAL);
    } while (isfinite(next) && floor(next) == target);

    return curr;
}


static inline double find_double_min(const double target)
{
    double next = target;
    double curr;

    do {
        curr = next;
        next = nextafter(next, -HUGE_VAL);
    } while (isfinite(next) && ceil(next) == target);

    return curr;
}


static inline int ceil_log10_abs(int value)
{
    int result = 1;

    while (value < -9 || value > 9) {
        result++;
        value /= 10;
    }

    return result;
}


static char *int_string(const int value)
{
    char *buf;
    size_t max = ceil_log10_abs(value) + 4;
    int len;

    while (1) {
        buf = malloc(max);
        if (!buf)
            return NULL;

        len = snprintf(buf, max, "%d", value);
        if (len < 1) {
            free(buf);
            return NULL;
        }

        if ((size_t)len < max)
            return buf;

        free(buf);
        max = (size_t)len + 2;
    }
}

static int int_to_double(double *to, const int ivalue)
{
    char *ival, *iend;
    double dval;

    ival = int_string(ivalue);
    if (!ival)
        return -1;

    iend = ival;
    errno = 0;
    dval = strtod(ival, &iend);
    if (errno == ERANGE) {
        if (*iend != '\0' || dval != 0.0) {
            /* Overflow */
            free(ival);
            return +1;
        }
    } else
    if (errno != 0) {
        /* Unknown error, not overflow */
        free(ival);
        return -1;
    } else
    if (*iend != '\0') {
        /* Overflow */
        free(ival);
        return +1;
    }
    free(ival);

    /* Paranoid overflow check. */
    if (!isfinite(dval))
        return +1;

    if (to)
        *to = dval;

    return 0;
}

int init_double_to_int(void)
{
    double target;

    if (DBL_MAX_10_EXP > ceil_log10_abs(INT_MAX))
        target = INT_MAX;
    else {
        switch (int_to_double(&target, INT_MAX)) {
        case 0:  break;
        case 1:  target = DBL_MAX; break;
        default: return -1;
        }
    }

    max_double_to_int = find_double_max(target);

    if (DBL_MAX_10_EXP > ceil_log10_abs(INT_MIN))
        target = INT_MIN;
    else {
        switch (int_to_double(&target, INT_MIN)) {
        case 0:  break;
        case 1:  target = -DBL_MAX; break;
        default: return -1;
        }
    }

    min_double_to_int = find_double_min(target);

    return 0;
}

int main(void)
{
    int i, val, err;
    double temp;

    if (init_double_to_int()) {
        fprintf(stderr, "init_double_to_int() failed.\n");
        return EXIT_FAILURE;
    }

    printf("(int)max_double_to_int = %d\n", (int)max_double_to_int);
    printf("(int)min_double_to_int = %d\n", (int)min_double_to_int);
    printf("max_double_to_int = %.16f = %a\n", max_double_to_int, max_double_to_int);
    printf("min_double_to_int = %.16f = %a\n", min_double_to_int, min_double_to_int);

    /* Probe one ULP below, at, and one ULP above the upper limit. */
    temp = nextafter(max_double_to_int, 0.0);
    for (i = -1; i <= 1; i++) {
        val = double_to_int(temp, &err);
        printf("(int)(max_double_to_int %+d ULP)", i);
        switch (err) {
        case ERR_OK:       printf(" -> %d\n", val); break;
        case ERR_POS_OVER: printf(" -> overflow\n"); break;
        case ERR_POS_INF:  printf(" -> infinity\n"); break;
        default:           printf(" -> BUG\n");
        }
        temp = nextafter(temp, HUGE_VAL);
    }

    /* Same probe around the lower limit. */
    temp = nextafter(min_double_to_int, 0.0);
    for (i = 1; i >= -1; i--) {
        val = double_to_int(temp, &err);
        printf("(int)(min_double_to_int %+d ULP)", i);
        switch (err) {
        case ERR_OK:       printf(" -> %d\n", val); break;
        case ERR_NEG_OVER: printf(" -> overflow\n"); break;
        case ERR_NEG_INF:  printf(" -> infinity\n"); break;
        default:           printf(" -> BUG\n");
        }
        temp = nextafter(temp, -HUGE_VAL);
    }

    return EXIT_SUCCESS;
}



Perhaps this might work:


#define BYTES_TO_BITS(x) (x*8)

void numToIntnt(double num, int *output) {
    const int upperLimit = ldexp(1.0, (BYTES_TO_BITS(sizeof(int))-1))-1;
    const int lowerLimit = (-1)*ldexp(1.0, (BYTES_TO_BITS(sizeof(int))-1));

    /*
     * or a faster approach if the rounding is acceptable:
     * const int upperLimit = ~(1<<(BYTES_TO_BITS(sizeof(int))-1));
     * const int lowerLimit = (1<<(BYTES_TO_BITS(sizeof(int))-1));
     */

    if(num > upperLimit) {
        /* report invalid conversion */
    } else if (num < lowerLimit) {
        /* report invalid conversion */
    } else {
        *output = (int)num;
    }
}





How would that return a value of zero?
– Andrew Henle
yesterday





@AndrewHenle IMO the goal is to achieve a proper conversion, instead of an error log, which I have tried to simplify it (actually both errors are reporting the same value, in a variable in which any value is expected). Anyway, I have edited the code in order to make it clearer.
– Jose Felipe
yesterday






The calculation of upperLimit attempts to calculate 2^width-1, where width is the number of bits in an int. Even if some of those bits are padding bits, so they do not contribute to the available values, subtracting 1 is a problem. C does not specify what happens if the result is not exactly representable in floating-point. It might round up or down. Then you do not know whether you should use val < upperLimit or val <= upperLimit.
– Eric Postpischil
10 hours ago








Testing val > upperLimit will report false for a NaN, as will the other comparison, so this code will fall through to the *err = (int) val case, which we do not want. (Why is it called “err”? That suggests an error, but this is for returning the correct value, is it not?) These tests should be structured so that if the value is in range, then it is converted, else an error is reported. Then NaNs flow to the error path. Or NaNs could be tested for separately.
– Eric Postpischil
10 hours ago







This code assumes the minimum integer value is the negative of a power of two, but the C standard does not require that.
– Eric Postpischil
10 hours ago



It depends a lot on what you mean by “convert”, and how efficient you want it to be.



For example, you could sprintf the floating value to a string, do string-based inspection (i.e. by string-based comparison to max and min values you also sprintf’d), validation, rounding, etc and then sscanf the known-valid string for the final value.



In effect, you’d be moving toward an intermediate representation that’s (a) portable and (b) convenient. C strings are fine at portability, but not so convenient. If you can use external libraries, there are several that are convenient, but whose portability should be confirmed.





