CCS C Software and Maintenance Offers
FAQFAQ   FAQForum Help   FAQOfficial CCS Support   SearchSearch  RegisterRegister 

ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

CCS does not monitor this forum on a regular basis.

Please do not post bug reports on this forum. Send them to CCS Technical Support

Operations efficiency – Memory vs. CPU speed
Goto page 1, 2, 3, 4, 5  Next
 
Post new topic   Reply to topic    CCS Forum Index -> General CCS C Discussion
View previous topic :: View next topic  
Author Message
viki2000



Joined: 08 May 2013
Posts: 233

View user's profile Send private message

Operations efficiency – Memory vs. CPU speed
PostPosted: Sun Jun 04, 2017 5:04 pm     Reply with quote

I use PIC24HJ64GP202 or generally a PIC24 on 16 bit MCU.
This is my working code:

Code:
void main()
{
float x;
signed int16 y;

   while(TRUE)
   {
          for(x=0; x<2*PI; x+=PI/32768){
          y = 32767*sin(x);
          putc(make8(y,1)); //MSB
          putc(make8(y,0)); //LSB
       }
   }
}


In the loop for I have some arithmetical operations.
“PI” is a constant (3.14) known by CCS C compiler, but I do not know with how many decimals is specified by CCS.
1) Can someone tell?

Here are my questions:
2) Every time when the “for loop” is passed, does the CPU makes the multiplication “2*PI” and division “PI/32768”?
3) Is it any other better way to do it?
4) If I allocate a space in memory for a constant with fixed values as “CONST1=2*PI” as “CONST2=”PI/32768”, then is the execution of the program faster?
Code:

void main()
{
CONST1=2*PI
CONST2=”PI/32768

float x;
signed int16 y;

   while(TRUE)
   {
          for(x=0; x<CONST1; x+=CONST2){
          y = 32767*sin(x);
          putc(make8(y,1)); //MSB
          putc(make8(y,0)); //LSB
       }
   }
}


The differences between allocation a constant or making the operations in the “for loop”, I guess is next:
- Using constants needs more fixed space in ROM memory, but the CPU executes faster the operations, the “for loop”.
- Not using constants saves ROM memory, but CPU is slower, needs to execute the arithmetical operations each time when the loop is passed.

5) Is the above a good view/understanding of what happens?
6) Does these arithmetical operations from “for loop” use temporal variables? In RAM?
7) What is the best way to do it, to write the code? What can be considered more efficient?
PCM programmer



Joined: 06 Sep 2003
Posts: 21708

View user's profile Send private message

PostPosted: Sun Jun 04, 2017 5:43 pm     Reply with quote

1. It's in math.h. It's right there:
Code:
#define PI     3.1415926535897932


Last edited by PCM programmer on Sun Jun 04, 2017 5:49 pm; edited 1 time in total
viki2000



Joined: 08 May 2013
Posts: 233

View user's profile Send private message

PostPosted: Sun Jun 04, 2017 5:48 pm     Reply with quote

Thank you. I did not checked that.
What about the other questions? Any suggestions?
PCM programmer



Joined: 06 Sep 2003
Posts: 21708

View user's profile Send private message

PostPosted: Sun Jun 04, 2017 7:01 pm     Reply with quote

First make a short section of code so you can see what the pre-calculated
values are:
Code:
.................... x = PI;
00032:  MOVLW  DB
00034:  MOVWF  x+3
00036:  MOVLW  0F
00038:  MOVWF  x+2
0003A:  MOVLW  49
0003C:  MOVWF  x+1
0003E:  MOVLW  80
00040:  MOVWF  x
.................... x = 2*PI;
00042:  MOVLW  DB
00044:  MOVWF  x+3
00046:  MOVLW  0F
00048:  MOVWF  x+2
0004A:  MOVLW  49
0004C:  MOVWF  x+1
0004E:  MOVLW  81
00050:  MOVWF  x
.................... x = PI/32768;
00052:  MOVLW  DB
00054:  MOVWF  x+3
00056:  MOVLW  0F
00058:  MOVWF  x+2
0005A:  MOVLW  49
0005C:  MOVWF  x+1
0005E:  MOVLW  71
00060:  MOVWF  x


Then look in the for() loop code.
The pre-calculated 2*PI value is used in the for() loop below:
Quote:

.................... for(x=0; x<2*PI; x+=PI/32768){
00826: CLRF x+3
00828: CLRF x+2
0082A: CLRF x+1
0082C: CLRF x
0082E: MOVFF x+3,??65535+3
00832: MOVFF x+2,??65535+2
00836: MOVFF x+1,??65535+1
0083A: MOVFF x,??65535
0083E: MOVLW DB
00840: MOVWF @FLT.P1+3
00842: MOVLW 0F
00844: MOVWF @FLT.P1+2
00846: MOVLW 49
00848: MOVWF @FLT.P1+1
0084A: MOVLW 81
0084C: MOVWF @FLT.P1
0084E: MOVLB 0
00850: CALL @FLT
00854: BNC 08EC


The pre-calculated value of PI/32768 is added to x, as shown
in the code at the end of the for() loop:
Quote:


0083E: MOVLW DB
008C6: MOVWF @ADDFF.P1+3
008C8: MOVLW 0F
008CA: MOVWF @ADDFF.P1+2
008CC: MOVLW 49
008CE: MOVWF @ADDFF.P1+1
008D0: MOVLW 71
008D2: MOVWF @ADDFF.P1
008D4: CALL @ADDFF
008D8: MOVFF 03,x+3
008DC: MOVFF 02,x+2
008E0: MOVFF 01,x+1
008E4: MOVFF 00,x
008E8: MOVLB F
008EA: BRA 082E


So, yes it is using numbers that are calculated at compile-time, and
not at run-time.
Ttelmah



Joined: 11 Mar 2010
Posts: 19513

View user's profile Send private message

PostPosted: Mon Jun 05, 2017 12:46 am     Reply with quote

Several 'missing points' here:

First you worry about the accuracy of Pi, while you are doing the maths in float (not double).
Then your output values are only 15bit (ignoring the sign), so the accuracy of Pi doesn't matter at all...
Then why waste time calculating for all four quadrants?. Remember that you only need the sin values for one quadrant.

Just use a simple integer Cordic implementation:

<http://www.dcs.gla.ac.uk/~jhw/cordic/cordic-16bit.h>

It is worth understanding 'why' they use 16384, rather than 32767 as the 'unit' value for the integer maths. It makes a lot of other things much simpler, so should be considered. However the basic algorithm, can be easily ported to your 32767 value if required.
viki2000



Joined: 08 May 2013
Posts: 233

View user's profile Send private message

PostPosted: Mon Jun 05, 2017 1:58 am     Reply with quote

I was wondering about accuracy of PI only because I did not know to search its definition as constant in “math.h”, but I see now that has 17 decimals. The question came into my mind because in PCD manual page 191, there is an example of PI mentioned directly, not as constant, as 3.141596 only:
https://www.ccsinfo.com/downloads/PCDReferenceManual.pdf
Now, understanding that the code is using numbers that are calculated at compile-time and not at run-time, it means that does not matter for the MCU operations if I define these numbers as constants in the beginning. Probably is only easier to follow the code, but will be no other improvements.

I used the sine calculations for all quadrants, because it seems lots easier to sweep the range 0 to 2*PI in the for() loop, rather than use only one quadrant from 0 to PI/2 and then add additional code with other arithmetical calculations and decisions.
In what way do you think will bring any improvements to the code if I use only 1 quadrant? I do not think will be shorter, but would be then faster?

Thank you for 16 bit CORDIC algorithm. I will try to study it and then implement in a fixed point math.h library that will be included in the main project.
Ttelmah



Joined: 11 Mar 2010
Posts: 19513

View user's profile Send private message

PostPosted: Mon Jun 05, 2017 3:43 am     Reply with quote

It already is a fixed point library....

It's just they use 16384, as 1.00

There are significant reasons for this, and you should consider these for your application.
viki2000



Joined: 08 May 2013
Posts: 233

View user's profile Send private message

PostPosted: Tue Jun 06, 2017 1:04 am     Reply with quote

What is your experience with "Fixed point decimal" mentioned here?
https://www.ccsinfo.com/content.php?page=compiler-features#fixed

It says
Quote:
"Fixed point decimal gives you decimal representation, but at integer speed. This gives you a phenomenal speed boost over using float"


Does it make sense to try to include it in the trials above?
Ttelmah



Joined: 11 Mar 2010
Posts: 19513

View user's profile Send private message

PostPosted: Tue Jun 06, 2017 1:47 am     Reply with quote

Fixed point decimal, is what the %w printf format displays in CCS.
You treat an integer as if (for example), it is integer 'thousandths'. So 1000, becomes '1.000' etc..

Big advantages are for things like simple arithmetic for currency, where it is unacceptable that (for instance), 200000.04 + 200000.03 = 400000.06, which is what floating point will give. If instead you use int32, and work directly in 'integer cents', then the standard + - etc., will all work as normal, with the speed advantages of integer arithmetic, and the %w specifier will allow you to print as if they are 'floating' values.

However downsides arrive for this type of format, is when you are not working for 'human' consumption. So (for instance), if you multiply two values together, since each is 100* the actual value, the result will have to be divided by 100 after the multiplication. Since 100 is not an easy binary division, this brings a time cost (not as much as float, but significant). This is why _binary_ point decimal is far preferred for anything involving multiplication or division. Here you code the numbers so that there are a number of binary digits in front of and behind the decimal. This is why the int16 sine example I pointed you to, uses 16384 as '1'. They are using 14 binary digits after the decimal point, and one in front, plus the sign. This then means the divisions and multiplications needed to keep the number correctly scaled can be done by simple rotations - far easier for the processor to do.
viki2000



Joined: 08 May 2013
Posts: 233

View user's profile Send private message

PostPosted: Thu Jun 08, 2017 1:32 pm     Reply with quote

I find next explanations very useful to understand CORDIC:
http://bsvi.ru/uploads/CORDIC--_10EBA/cordic.pdf
And here another example of implemenation:
https://forum.sparkfun.com/viewtopic.php?t=15382
And here is the implementation of the code that you ponted to:
http://www.stm32duino.com/viewtopic.php?t=1510
viki2000



Joined: 08 May 2013
Posts: 233

View user's profile Send private message

PostPosted: Fri Jun 09, 2017 5:46 am     Reply with quote

In the given test example from here:
http://www.dcs.gla.ac.uk/~jhw/cordic/cordic-test.c

Code:
#include "cordic-32bit.h"
#include <math.h> // for testing only!

//Print out sin(x) vs fp CORDIC sin(x)
int main(int argc, char **argv)
{
    double p;
    int s,c;
    int i;   
    for(i=0;i<50;i++)
    {
        p = (i/50.0)*M_PI/2;       
        //use 32 iterations
        cordic((p*MUL), &s, &c, 32);
        //these values should be nearly equal
        printf("%f : %f\n", s/MUL, sin(p));
    }       
}


What is it M_PI ?

In CORDIC the value of the angle is specified in degrees, not radians.
Then the example above has 50 steps in the for() loop for "i" and refers to PI/2, probably to test 1 quadrant.
Then is it M_PI/2 = 90° , or just 90 as number, or what exactly?
From what comes that M in front of PI ?
temtronic



Joined: 01 Jul 2010
Posts: 9226
Location: Greensville,Ontario

View user's profile Send private message

PostPosted: Fri Jun 09, 2017 6:34 am     Reply with quote

My 'guess' is that M_PI is a variable that is 'created' in the
#include "cordic-32bit.h"
file that is at the top of the program.

OR

it could be in the math.h file....


Either way, if the program comiples it HAS to be in one or the other !

Jay

edit> had a quick F3 of math.h, didn't find M_PI has to be in cordic.....
Ttelmah



Joined: 11 Mar 2010
Posts: 19513

View user's profile Send private message

PostPosted: Fri Jun 09, 2017 7:17 am     Reply with quote

M_PI is their define for PI.

#define M_PI 3.1415926535897932384626
#define K1 0.6072529350088812561694

Honestly, unless you have some need for great accuracy, keep to the 16bit implementation. 32bit is not twice as slow as 16bit. An int32 takes more than 10* the time needed for an int16 multiply, while a division takes more than 20* the time.
For any simple thing like sinusoidal synthesis, 16bit is more than accurate enough. The odds are that any measurement based upon for instance an ADC, will only have perhaps 3 digits of actual resolution, and wasting time doing maths much beyond this accuracy is fundamentally pointless.
viki2000



Joined: 08 May 2013
Posts: 233

View user's profile Send private message

PostPosted: Fri Jun 09, 2017 9:19 am     Reply with quote

Now I see, is in the top of the code:
http://www.dcs.gla.ac.uk/~jhw/cordic/

I will keep it to 16bit.
Ttelmah



Joined: 11 Mar 2010
Posts: 19513

View user's profile Send private message

PostPosted: Fri Jun 09, 2017 2:32 pm     Reply with quote

As a further comment to this, you might be interested in the following:
Code:

//A fast float approximation to sin over the range +/-PI
#define PI2 (PI*PI)
#define INV4  (4/PI)
#define INV4SQ  (-4/(PI2))
#define P  0.225

float fast_sin(float x)
{
    float y;
    y = (INV4 * x) + (INV4SQ * x * fabs(x));
    return P * (y * fabs(y) - y) + y;   
}

void main(void)
{
  //Test the fast sin algorithm for angles from 0 to PI in steps of PI/128
  float an, res, sres,;
  for (an=0.0;an<(PI);an+=(PI/128))
  {
     //res1=fast_sin1(an);
     res=fast_sin(an);
     sres=sin(an);
     printf ("AN=%5.3f sinfast=%5.3f sin=%5.3f\n",an,res,sres);
  }
  while(TRUE)
     delay_cycles(1);
}


I have in the past published a fast approximation to atan2 in the code library.

This one is only in error on the third decimal:
Code:

AN=0.000 sinfast=0.000 sin=0.000
AN=0.024 sinfast=0.024 sin=0.024
AN=0.049 sinfast=0.048 sin=0.049
AN=0.073 sinfast=0.072 sin=0.073
AN=0.098 sinfast=0.097 sin=0.098
AN=0.122 sinfast=0.121 sin=0.122
AN=0.147 sinfast=0.145 sin=0.146
AN=0.171 sinfast=0.169 sin=0.170
AN=0.196 sinfast=0.194 sin=0.195
AN=0.220 sinfast=0.218 sin=0.219
AN=0.245 sinfast=0.241 sin=0.242
AN=0.269 sinfast=0.265 sin=0.266
AN=0.294 sinfast=0.289 sin=0.290
AN=0.319 sinfast=0.312 sin=0.313
AN=0.343 sinfast=0.336 sin=0.336
AN=0.368 sinfast=0.359 sin=0.359
AN=0.392 sinfast=0.382 sin=0.382
AN=0.417 sinfast=0.404 sin=0.405
AN=0.441 sinfast=0.427 sin=0.427
AN=0.466 sinfast=0.449 sin=0.449
AN=0.490 sinfast=0.471 sin=0.471
AN=0.515 sinfast=0.492 sin=0.492
AN=0.539 sinfast=0.514 sin=0.514
AN=0.564 sinfast=0.535 sin=0.534
AN=0.589 sinfast=0.555 sin=0.555
AN=0.613 sinfast=0.576 sin=0.575
AN=0.638 sinfast=0.596 sin=0.595
AN=0.662 sinfast=0.615 sin=0.615
AN=0.687 sinfast=0.634 sin=0.634
AN=0.711 sinfast=0.653 sin=0.653
AN=0.736 sinfast=0.672 sin=0.671
AN=0.760 sinfast=0.690 sin=0.689
AN=0.785 sinfast=0.707 sin=0.707
AN=0.809 sinfast=0.724 sin=0.724
AN=0.834 sinfast=0.741 sin=0.740
AN=0.859 sinfast=0.757 sin=0.757
AN=0.883 sinfast=0.773 sin=0.773
AN=0.908 sinfast=0.789 sin=0.788
AN=0.932 sinfast=0.803 sin=0.803
AN=0.957 sinfast=0.818 sin=0.817
AN=0.981 sinfast=0.832 sin=0.831
AN=1.006 sinfast=0.845 sin=0.844
AN=1.030 sinfast=0.858 sin=0.857
AN=1.055 sinfast=0.870 sin=0.870
AN=1.079 sinfast=0.882 sin=0.881
AN=1.104 sinfast=0.893 sin=0.893
AN=1.129 sinfast=0.904 sin=0.903
AN=1.153 sinfast=0.914 sin=0.914
AN=1.178 sinfast=0.924 sin=0.923
AN=1.202 sinfast=0.933 sin=0.932
AN=1.227 sinfast=0.941 sin=0.941
AN=1.251 sinfast=0.949 sin=0.949
AN=1.276 sinfast=0.957 sin=0.956
AN=1.300 sinfast=0.964 sin=0.963
AN=1.325 sinfast=0.970 sin=0.970
AN=1.349 sinfast=0.975 sin=0.975
AN=1.374 sinfast=0.980 sin=0.980
AN=1.398 sinfast=0.985 sin=0.985
AN=1.423 sinfast=0.989 sin=0.989
AN=1.448 sinfast=0.992 sin=0.992
AN=1.472 sinfast=0.995 sin=0.995
AN=1.497 sinfast=0.997 sin=0.997
AN=1.521 sinfast=0.998 sin=0.998
AN=1.546 sinfast=0.999 sin=0.999
AN=1.570 sinfast=1.000 sin=1.000
AN=1.595 sinfast=0.999 sin=0.999
AN=1.619 sinfast=0.998 sin=0.998
AN=1.644 sinfast=0.997 sin=0.997
AN=1.668 sinfast=0.995 sin=0.995
AN=1.693 sinfast=0.992 sin=0.992
AN=1.718 sinfast=0.989 sin=0.989
AN=1.742 sinfast=0.985 sin=0.985
AN=1.767 sinfast=0.980 sin=0.980
AN=1.791 sinfast=0.975 sin=0.975
AN=1.816 sinfast=0.970 sin=0.970
AN=1.840 sinfast=0.964 sin=0.963
AN=1.865 sinfast=0.957 sin=0.956
AN=1.889 sinfast=0.949 sin=0.949
AN=1.914 sinfast=0.941 sin=0.941
AN=1.938 sinfast=0.933 sin=0.932
AN=1.963 sinfast=0.924 sin=0.923
AN=1.988 sinfast=0.914 sin=0.914
AN=2.012 sinfast=0.904 sin=0.903
AN=2.037 sinfast=0.893 sin=0.893
AN=2.061 sinfast=0.882 sin=0.881
AN=2.086 sinfast=0.870 sin=0.870
AN=2.110 sinfast=0.858 sin=0.857
AN=2.135 sinfast=0.845 sin=0.844
AN=2.159 sinfast=0.832 sin=0.831
AN=2.184 sinfast=0.818 sin=0.817
AN=2.208 sinfast=0.803 sin=0.803
AN=2.233 sinfast=0.789 sin=0.788
AN=2.258 sinfast=0.773 sin=0.773
AN=2.282 sinfast=0.757 sin=0.757
AN=2.307 sinfast=0.741 sin=0.740
AN=2.331 sinfast=0.724 sin=0.724
AN=2.356 sinfast=0.707 sin=0.707
AN=2.380 sinfast=0.690 sin=0.689
AN=2.405 sinfast=0.672 sin=0.671
AN=2.429 sinfast=0.653 sin=0.653
AN=2.454 sinfast=0.634 sin=0.634
AN=2.478 sinfast=0.615 sin=0.615
AN=2.503 sinfast=0.596 sin=0.595
AN=2.527 sinfast=0.576 sin=0.575
AN=2.552 sinfast=0.555 sin=0.555
AN=2.577 sinfast=0.535 sin=0.535
AN=2.601 sinfast=0.514 sin=0.514
AN=2.626 sinfast=0.492 sin=0.492
AN=2.650 sinfast=0.471 sin=0.471
AN=2.675 sinfast=0.449 sin=0.449
AN=2.699 sinfast=0.427 sin=0.427
AN=2.724 sinfast=0.404 sin=0.405
AN=2.748 sinfast=0.382 sin=0.382
AN=2.773 sinfast=0.359 sin=0.359
AN=2.797 sinfast=0.336 sin=0.336
AN=2.822 sinfast=0.312 sin=0.313
AN=2.847 sinfast=0.289 sin=0.290
AN=2.871 sinfast=0.265 sin=0.266
AN=2.896 sinfast=0.241 sin=0.242
AN=2.920 sinfast=0.218 sin=0.219
AN=2.945 sinfast=0.194 sin=0.195
AN=2.969 sinfast=0.169 sin=0.170
AN=2.994 sinfast=0.145 sin=0.146
AN=3.018 sinfast=0.121 sin=0.122
AN=3.043 sinfast=0.097 sin=0.098
AN=3.067 sinfast=0.072 sin=0.073
AN=3.092 sinfast=0.048 sin=0.049
AN=3.117 sinfast=0.024 sin=0.024
AN=3.141 sinfast=0.000 sin=0.000



and executes on a PIC24 in 421 instructions, versus 3097 for the standard sin code.

It's based on fitting a second order polynomial to the sin curve, then allowing this to work for both + and - numbers (the fabs handles this), and then adding a third order polynomial correction. Because all the main terms are pre-solved, it is quite efficient.
Display posts from previous:   
Post new topic   Reply to topic    CCS Forum Index -> General CCS C Discussion All times are GMT - 6 Hours
Goto page 1, 2, 3, 4, 5  Next
Page 1 of 5

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group