c - Calculate the range of double -




as part of exercise "the c programming language" trying find way calculate maximum possible float , maximum possible double on computer. technique shown below works floats (to calculate max float) not double:

// max float: float f = 1.0; float last_f; float step = 9.0; while(1) {     last_f = f;     f *= (1.0 + step);     while (f == infinity) {         step /= 2.0;         f  = last_f * (1.0 + step);     }     if (! (f > last_f) )         break; } printf("calculated float max : %e\n", last_f); printf("limits.h float max   : %e\n", flt_max); printf("diff                 : %e\n", flt_max - last_f); printf("the expected value?  : %s\n\n", (flt_max == last_f)? "yes":"no");  // max double: double d = 1.0; double last_d; double step_d = 9.0; while(1) {     last_d = d;     d *= (1.0 + step_d);     while (d == infinity) {         step_d /= 2.0;         d  = last_d * (1.0 + step_d);     }     if (! (d > last_d) )         break; } printf("calculated double max: %e\n", last_d); printf("limits.h double max  : %e\n", dbl_max); printf("diff                 : %e\n", dbl_max - last_d); printf("the expected value?  : %s\n\n", (dbl_max == last_d)? "yes":"no"); 

and results to:

calculated float max : 3.402823e+38 limits.h float max   : 3.402823e+38 diff                 : 0.000000e+00 expected value?  : yes  calculated double max: 1.797693e+308 limits.h double max  : 1.797693e+308 diff                 : 1.995840e+292 expected value?  : no 

it looks me still calculates using single precision in second case.

what missing?

op's approach works when calculations done wider precision float in first case , wider double in 2nd case.

in first case, op reports flt_eval_method == 0 float calculations done float , double done double. note float step ... 1.0 + step double calculation.


the below code forces calculation double , can replicate op's problem flt_evel_method==2 (use long double internal calculations.)

  volatile double d = 1.0;   volatile double last_d;   volatile double step_d = 9.0;   while(1) {       last_d = d;       d *= (1.0 + step_d);       while (d == infinity) {           step_d /= 2.0;           volatile double sum = 1.0 + step_d;           d  = last_d * sum;           //d  = last_d  + step_d*last_d;       }       if (! (d > last_d) ) {         break;       }   }  diff                 : 1.995840e+292 expected value?  : no 

instead op should use following not form inexact sum of 1.0 + step_d when step_d small, rather forms exact product of step_d*last_d. 2nd form results in more accurate calculation new d, providing additional bit of calculation precision in d. higher precision fp not needed employ op's approach.

          d  = last_d  + step_d*last_d;  diff                 : 0x0p+0 0.000000e+00 expected value?  : yes 




wiki

Comments

Popular posts from this blog

python - Read npy file directly from S3 StreamingBody -

kotlin - Out-projected type in generic interface prohibits the use of metod with generic parameter -

Asterisk AGI Python Script to Dialplan does not work -