Can a conversion from double to int be written in portable C?

I need to write a function like double_to_int(double val, int *err) that
converts double val to an integer when possible and otherwise reports an error (NAN/INFs/OUT_OF_RANGE).

So a pseudocode implementation would look like:

    if isnan(val):
        err = ERR_NAN
        return 0
    if val < MIN_INT:
        err = ERR_MINUS_INF
        return MIN_INT
    if ...
    return (int)val

There are at least two similar questions on SO:
In this answer the problem is solved in a clean enough way, though it is a C++ solution; in C we do not have a portable digits for signed int.
In this answer, it's explained why we cannot just check (val > INT_MAX || val < INT_MIN).
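
One way the naive comparison can go wrong is sketched below. It is only an illustration under stated assumptions: it uses long long instead of int (an assumption, chosen because on common platforms a 64-bit limit needs more significand bits than an IEEE-754 binary64 double has), and the helper name naive_fits_llong is hypothetical. The integer limits are converted to double before the comparison, and that conversion may round.

```c
#include <limits.h>
#include <math.h>

/* Naive range test (hypothetical helper). LLONG_MAX and LLONG_MIN are
   converted to double for the comparisons. Where long long is 64 bits and
   double is binary64, LLONG_MAX is not exactly representable and typically
   rounds up to 2^63, so the value 2^63 is not flagged even though it does
   not fit in long long. Every comparison with NaN is false, so NaN is not
   flagged either. */
static int naive_fits_llong(double val)
{
    return !(val > LLONG_MAX || val < LLONG_MIN);
}
```

On such a platform naive_fits_llong(ldexp(1.0, 63)) reports that the value fits, and a subsequent cast to long long would have undefined behavior. The same effect applies to int on any implementation where INT_MAX is not exactly representable as a double.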

So the only possible clean way I see is to use the floating-point environment, but it is stated to be an implementation-defined feature.

So my question: is there any way to implement the double_to_int function in a cross-platform way (based only on the C standard, without even assuming
that target platforms support IEEE-754)?

Looking for an answer drawing from credible and/or official sources.

There's already a good answer from Eric. But I feel it doesn't quite yet disprove my assertion that this is not possible. Hence the bounty.

"Please read before marking as duplicate." should go on comment section
– Stargateur
Jun 29 at 15:26

I wonder if frexp is any help.
– Steve Summit
Jun 29 at 15:30
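
A minimal sketch of how frexp might help, under assumptions that are not in the original comment: INT_MAX has the form 2^n - 1, double uses radix 2 (so frexp and ldexp are exact), and the helper names int_value_bits and fits_int_frexp are hypothetical. frexp returns an exponent e with |x| < 2^e, so a finite x satisfies |x| < INT_MAX + 1 exactly when e <= n.

```c
#include <limits.h>
#include <math.h>

/* Number of value bits in int, assuming INT_MAX == 2^n - 1. */
static int int_value_bits(void)
{
    int n = 0;
    for (int m = INT_MAX; m != 0; m >>= 1)
        n++;
    return n;
}

/* Returns 1 if (int)x is well defined, 0 otherwise (hypothetical helper). */
static int fits_int_frexp(double x)
{
    if (!isfinite(x))
        return 0;                   /* NaN and infinities never fit */

    int e;
    double m = frexp(x, &e);        /* x == m * 2^e, 0.5 <= |m| < 1, or m == 0 */
    int n = int_value_bits();

    if (e <= n)
        return 1;                   /* |x| < 2^n == INT_MAX + 1 */

    /* Only when INT_MIN == -INT_MAX - 1: accept x in (INT_MIN - 1, INT_MIN].
       Here x + 2^n == (m + 0.5) * 2^e, and both steps are exact in radix 2. */
    if (INT_MIN < -INT_MAX && e == n + 1 && m < 0)
        return ldexp(m + 0.5, e) > -1.0;

    return 0;
}
```

The point of the design is that no integer limit is ever converted to double, so no rounding of INT_MAX or INT_MIN can occur; the cost is the assumption about the form of INT_MAX and the radix.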

I really feel like your question is answered by the answer you linked in your question, which makes your question a duplicate.
– Stargateur
Jun 29 at 15:33

You really should explain why the answer shown by Stargateur does not answer your question.
– Serge Ballesta
Jun 29 at 15:35

I feel that all the "close duplicates" fail over to a particular implementation at some point in their answers. I strongly believe that it's not possible to do this; my answer is little more than an invitation to peer review.
– Bathsheba
Jun 29 at 15:35

5 Answers

Since conversion of double to int truncates toward zero, all the double values that properly convert to int are in the open interval (−INT_MAX−1, INT_MAX+1), and every value not inside this interval overflows when converted to int or is a NaN. We will find the double value UpperBound that is the greatest representable value less than INT_MAX+1 and the value LowerBound that is the least representable value greater than −INT_MAX−1. Then the set of double values in the open interval (−INT_MAX−1, INT_MAX+1) equals the set of double values in the closed interval [LowerBound, UpperBound], and we can test whether a value x is in the set by evaluating LowerBound <= x && x <= UpperBound.

The following determines UpperBound:

    static double UpperBound;
    double b1 = INT_MAX, b0 = nexttoward(b1, 0);
    if (INT_MAX - (int) ceil(b0) < (int) (b1-b0))
        UpperBound = b0;
    else if (INT_MAX - (int) ceil(b0) == (int) (b1-b0))
        UpperBound = nexttoward(ceil(nexttoward(INT_MAX, HUGE_VALL)), 0);
    else
        UpperBound = b1;

Reasoning:

Finally, we consider the case where INT_MAX - (int) ceil(b0) equals b1-b0. In this case, b1 must equal INT_MAX, but there may be additional significand bits below the position value 1. For example, INT_MAX+1 may be a representable value. The reasoning here is:

LowerBound can be found from INT_MIN similarly.

The above does require that INT_MAX and INT_MIN be within the range of double. Thus, this could fail in an implementation with a large int type and a very constrained non-IEEE-754 double type with no infinities. Of course, in such a system, all conversions from double to int are in range.
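
The closed-interval test described in this answer can be sketched in a simpler form if stronger assumptions are acceptable. This is not the answer's method: it assumes INT_MAX has the form 2^n - 1 and that double uses radix 2 with enough exponent range, so that INT_MAX + 1 can be formed exactly as a double without ever converting INT_MAX itself. The macro INT_MAX_PLUS_ONE and the function in_int_interval are hypothetical names.

```c
#include <limits.h>
#include <math.h>

/* INT_MAX + 1 as a double. INT_MAX / 2 + 1 is a power of two, so under the
   stated assumptions both the conversion and the doubling are exact. */
#define INT_MAX_PLUS_ONE ((INT_MAX / 2 + 1) * 2.0)

/* Returns 1 if x lies in the open interval (-INT_MAX-1, INT_MAX+1),
   tested as the closed interval [lower, upper]; returns 0 for NaN. */
static int in_int_interval(double x)
{
    /* Greatest representable double below INT_MAX + 1. */
    const double upper = nextafter(INT_MAX_PLUS_ONE, 0.0);
    /* Least representable double above -INT_MAX - 1 (mirror image). */
    const double lower = -upper;

    return lower <= x && x <= upper;
}
```

Because the test is written as an inclusion, a NaN fails both comparisons and is rejected without a separate check. Note that the symmetric interval used here follows the interval as stated above, so where INT_MIN equals −INT_MAX−1 it leaves out INT_MIN itself.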

I think this does go quite some way; will study it later. Hopefully the bounty will help attract more attention.
– Bathsheba
yesterday

@NominalAnimal: An interesting idea, I will think about whether converting to unsigned int gives us some leeway. However, the conversion caused by a cast from double to unsigned int is not necessarily modulo. Per C 2011 (N1570) 6.3.1.4 note 61, “The remaindering operation performed when a value of integer type is converted to unsigned type need not be performed when a value of real floating type is converted to unsigned type. Thus, the range of portable real floating values is (−1, Utype_MAX+1).”
– Eric Postpischil
10 hours ago

limits.h defines INT_MAX and INT_MIN.
– Bob Jarvis
10 hours ago

@BobJarvis: What is your point? We do not have any issue with obtaining INT_MAX or INT_MIN in int. The problem is we do not know they can be converted to double without error, but we need to find the greatest double that is less than INT_MAX+1. So we need to find some way to evade or correct for the rounding errors that may occur during conversion.
– Eric Postpischil
10 hours ago

@EricPostpischil: I rewrote my answer to use floor(max_double_to_int) == (double)INT_MAX and ceil(min_double_to_int) == (double)INT_MIN in the nextafter() loops, plus handling for the odd case when DBL_MAX <= INT_MAX or -DBL_MAX >= INT_MIN via strtod().
– Nominal Animal
5 hours ago

(This answer is in dispute, although I still think I'm correct, please therefore don't upvote unwisely.)

You cannot implement such a function in portable C.

In this respect, it's rather like malloc &c.

The moral of the story really is that mixing types is never a good idea in C; i.e. write code in such a way that type conversions are not necessary.

What is your basis for asserting such a function cannot be implemented? At worst, one could convert the exact floating-point number to a character string containing a decimal or hexadecimal numeral and then test whether the numeral in the string could be interpreted as an in-range integer.
– Eric Postpischil
Jun 29 at 15:44

strtol() should be able to detect a range error?
– Stargateur
Jun 29 at 15:47

@EricPostpischil: But do bear in mind that int and long might be the same size.
– Bathsheba
Jun 29 at 15:49

@Bathsheba: So? If int and long are the same size, and x is a long, then x <= INT_MAX always returns true. It is still a valid comparison, and similarly for INT_MIN. I do not see what problem you think there is.
– Eric Postpischil
Jun 29 at 15:50

Convert INT_MAX to a double and take the next lower and next higher representable values (via nexttoward). If the subject value is less than the former, it is safe on the positive side (repeat with INT_MIN for the negative side). If it is not less than or equal to the latter, it is out of bounds. (That also reports infinity and NaN as out of bounds.) Then we just have finicky values around the bounds to test.
– Eric Postpischil
Jun 29 at 16:09

The underlying problem is to find min_double_to_int and max_double_to_int, the smallest and largest double, respectively, that can be converted to an int.

The portable conversion function itself can be written in C11 as

    int double_to_int(const double value, int *err)
    {
        if (!isfinite(value)) {
            if (isnan(value)) {
                if (err) *err = ERR_NAN;
                return 0;
            } else
            if (signbit(value)) {
                if (err) *err = ERR_NEG_INF;
                return INT_MIN;
            } else {
                if (err) *err = ERR_POS_INF;
                return INT_MAX;
            }
        }

        if (value < min_double_to_int) {
            if (err) *err = ERR_TOOSMALL;
            return INT_MIN;
        } else
        if (value > max_double_to_int) {
            if (err) *err = ERR_TOOLARGE;
            return INT_MAX;
        }

        if (err) *err = 0;
        return (int)value;
    }

Before the above function is first used, we need to assign min_double_to_int and max_double_to_int.

EDITED on 2018-07-03: Rewritten approach.

We can use a simple function to find the smallest power of ten that is at least as large as INT_MAX/INT_MIN in magnitude. If those are smaller than DBL_MAX_10_EXP, the range of double is greater than the range of int, and we can cast INT_MAX and INT_MIN to double.

Otherwise, we construct a string containing the decimal representation of INT_MAX/INT_MIN, and use strtod() to convert them to double. If this operation overflows, it means the range of double is smaller than the range of int, and we can use DBL_MAX/-DBL_MAX as max_double_to_int and min_double_to_int, respectively.

When we have INT_MAX as a double, we can use a loop to increment that value using nextafter(value, HUGE_VAL). The largest value that is finite, and rounded down using floor() still yields the same double value, is max_double_to_int.

Similarly, when we have INT_MIN as a double, we can use a loop to decrement that value using nextafter(value, -HUGE_VAL). The largest value in magnitude that is still finite, and rounds up (ceil()) to the same double, is min_double_to_int.

Here is an example program to illustrate this:

    #include <stdlib.h>
    #include <limits.h>
    #include <string.h>
    #include <float.h>
    #include <stdio.h>
    #include <errno.h>
    #include <math.h>

    static double max_double_to_int = -1.0;
    static double min_double_to_int = +1.0;

    #define ERR_OK        0
    #define ERR_NEG_INF  -1
    #define ERR_POS_INF  -2
    #define ERR_NAN      -3
    #define ERR_NEG_OVER  1
    #define ERR_POS_OVER  2

    int double_to_int(const double value, int *err)
    {
        if (!isfinite(value)) {
            if (isnan(value)) {
                if (err) *err = ERR_NAN;
                return 0;
            } else
            if (signbit(value)) {
                if (err) *err = ERR_NEG_INF;
                return INT_MIN;
            } else {
                if (err) *err = ERR_POS_INF;
                return INT_MAX;
            }
        }

        if (value < min_double_to_int) {
            if (err) *err = ERR_NEG_OVER;
            return INT_MIN;
        } else
        if (value > max_double_to_int) {
            if (err) *err = ERR_POS_OVER;
            return INT_MAX;
        }

        if (err) *err = ERR_OK;
        return (int)value;
    }

    static inline double find_double_max(const double target)
    {
        double next = target;
        double curr;

        do {
            curr = next;
            next = nextafter(next, HUGE_VAL);
        } while (isfinite(next) && floor(next) == target);

        return curr;
    }

    static inline double find_double_min(const double target)
    {
        double next = target;
        double curr;

        do {
            curr = next;
            next = nextafter(next, -HUGE_VAL);
        } while (isfinite(next) && ceil(next) == target);

        return curr;
    }

    static inline int ceil_log10_abs(int value)
    {
        int result = 1;

        while (value < -9 || value > 9) {
            result++;
            value /= 10;
        }

        return result;
    }

    static char *int_string(const int value)
    {
        char *buf;
        size_t max = ceil_log10_abs(value) + 4;
        int len;

        while (1) {
            buf = malloc(max);
            if (!buf)
                return NULL;

            len = snprintf(buf, max, "%d", value);
            if (len < 1) {
                free(buf);
                return NULL;
            }

            if ((size_t)len < max)
                return buf;

            free(buf);
            max = (size_t)len + 2;
        }
    }

    static int int_to_double(double *to, const int ivalue)
    {
        char *ival, *iend;
        double dval;

        ival = int_string(ivalue);
        if (!ival)
            return -1;

        iend = ival;
        errno = 0;
        dval = strtod(ival, &iend);
        if (errno == ERANGE) {
            if (*iend != '\0' || dval != 0.0) {
                /* Overflow */
                free(ival);
                return +1;
            }
        } else
        if (errno != 0) {
            /* Unknown error, not overflow */
            free(ival);
            return -1;
        } else
        if (*iend != '\0') {
            /* Overflow */
            free(ival);
            return +1;
        }
        free(ival);

        /* Paranoid overflow check. */
        if (!isfinite(dval))
            return +1;

        if (to)
            *to = dval;

        return 0;
    }

    int init_double_to_int(void)
    {
        double target;

        if (DBL_MAX_10_EXP > ceil_log10_abs(INT_MAX))
            target = INT_MAX;
        else {
            switch (int_to_double(&target, INT_MAX)) {
            case 0:  break;
            case 1:  target = DBL_MAX; break;
            default: return -1;
            }
        }

        max_double_to_int = find_double_max(target);

        if (DBL_MAX_10_EXP > ceil_log10_abs(INT_MIN))
            target = INT_MIN;
        else {
            switch (int_to_double(&target, INT_MIN)) {
            case 0:  break;
            case 1:  target = -DBL_MAX; break;
            default: return -1;
            }
        }

        min_double_to_int = find_double_min(target);

        return 0;
    }

    int main(void)
    {
        int i, val, err;
        double temp;

        if (init_double_to_int()) {
            fprintf(stderr, "init_double_to_int() failed.\n");
            return EXIT_FAILURE;
        }

        printf("(int)max_double_to_int = %d\n", (int)max_double_to_int);
        printf("(int)min_double_to_int = %d\n", (int)min_double_to_int);
        printf("max_double_to_int = %.16f = %a\n", max_double_to_int, max_double_to_int);
        printf("min_double_to_int = %.16f = %a\n", min_double_to_int, min_double_to_int);

        temp = nextafter(max_double_to_int, 0.0);
        for (i = -1; i <= 1; i++) {
            val = double_to_int(temp, &err);
            printf("(int)(max_double_to_int %+d ULP)", i);
            switch (err) {
            case ERR_OK:       printf(" -> %d\n", val); break;
            case ERR_POS_OVER: printf(" -> overflow\n"); break;
            case ERR_POS_INF:  printf(" -> infinity\n"); break;
            default:           printf(" -> BUG\n");
            }
            temp = nextafter(temp, HUGE_VAL);
        }

        temp = nextafter(min_double_to_int, 0.0);
        for (i = 1; i >= -1; i--) {
            val = double_to_int(temp, &err);
            printf("(int)(min_double_to_int %+d ULP)", i);
            switch (err) {
            case ERR_OK:       printf(" -> %d\n", val); break;
            case ERR_NEG_OVER: printf(" -> overflow\n"); break;
            case ERR_NEG_INF:  printf(" -> infinity\n"); break;
            default:           printf(" -> BUG\n");
            }
            temp = nextafter(temp, -HUGE_VAL);
        }

        return EXIT_SUCCESS;
    }

Perhaps this might work:

    #define BYTES_TO_BITS(x) (x*8)

    void numToIntnt(double num, int *output)
    {
        const int upperLimit = ldexp(1.0, (BYTES_TO_BITS(sizeof(int))-1))-1;
        const int lowerLimit = (-1)*ldexp(1.0, (BYTES_TO_BITS(sizeof(int))-1));

        /*
         * or a faster approach if the rounding is acceptable:
         * const int upperLimit = ~(1<<(BYTES_TO_BITS(sizeof(int))-1));
         * const int lowerLimit = (1<<(BYTES_TO_BITS(sizeof(int))-1));
         */

        if(num > upperLimit) {
            /* report invalid conversion */
        } else if (num < lowerLimit) {
            /* report invalid conversion */
        } else {
            *output = (int)num;
        }
    }

How would that return a value of zero?
– Andrew Henle
yesterday

@AndrewHenle IMO the goal is to achieve a proper conversion rather than an error log, which I have tried to simplify (actually both errors report the same value, in a variable in which any value is expected). Anyway, I have edited the code to make it clearer.
– Jose Felipe
yesterday

The calculation of upperLimit attempts to calculate 2^width-1, where width is the number of bits in an int. Even if some of those bits are padding bits, so they do not contribute to the available values, subtracting 1 is a problem. C does not specify what happens if the result is not exactly representable in floating-point. It might round up or down. Then you do not know whether you should use val < upperLimit or val <= upperLimit.
– Eric Postpischil
10 hours ago

Testing val > upperLimit will report false for a NaN, as will the other comparison, so this code will fall through to the *err = (int) val case, which we do not want. (Why is it called “err”? That suggests an error, but this is for returning the correct value, is it not?) These tests should be structured so that if the value is in range, then it is converted, else an error is reported. Then NaNs flow to the error path. Or NaNs could be tested for separately.
– Eric Postpischil
10 hours ago

This code assumes the minimum integer value is the negative of a power of two, but the C standard does not require that.
– Eric Postpischil
10 hours ago
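
The point about NaN in the comments above can be illustrated with a small sketch. The two helper names and the lo/hi parameters are hypothetical; the only fact relied on is that every ordered comparison involving a NaN evaluates to false.

```c
#include <math.h>

/* Exclusion style: the value is flagged only when a comparison succeeds.
   Both comparisons are false for NaN, so NaN is not flagged and would
   fall through to the conversion. */
static int flagged_by_exclusion(double val, double lo, double hi)
{
    return val > hi || val < lo;
}

/* Inclusion style: the value is accepted only when both comparisons
   succeed. NaN fails them and flows to the error path. */
static int accepted_by_inclusion(double val, double lo, double hi)
{
    return lo <= val && val <= hi;
}
```

With the inclusion style, the conversion is performed only in the accepted branch, so no separate isnan() test is strictly needed, although an explicit one may still be wanted in order to report NaN as its own error.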

It depends a lot on what you mean by “convert”, and how efficient you want it to be.

For example, you could sprintf the floating value to a string, do string-based inspection (i.e. by string-based comparison to max and min values you also sprintf’d), validation, rounding, etc and then sscanf the known-valid string for the final value.

In effect, you’d be moving toward an intermediate representation that’s (a) portable and (b) convenient. C strings are fine at portability, but not so convenient. If you can use external libraries, there are several that are convenient, but whose portability should be confirmed.
