Optimising code
Martin Berriman



Joined: 08 Dec 2005
Posts: 66
Location: UK

Posted: Tue Mar 04, 2008 1:47 am

I am using a PIC18F4685 but I have a very large program and am short on code space! I'm therefore trying to optimise things a bit.

I have quite a lot of int32 multiplies followed by divides to do scaling (e.g. x = y * z / q), so I thought I would save a lot of space by calling a routine each time rather than effectively repeating the same code inline. Unfortunately it seems that I use very nearly the same amount of code space either way.

Is there something better I can do with this (or suggestions for other code optimisation tricks for that matter)?

inline multiply and divide
Code:
... calib_parameters->oil_flow_rate = (oil_measured_flow * 5000) / 300;
03F74:  MOVLW  10
03F76:  MOVLB  6
03F78:  ADDWF  x7B,W
03F7A:  MOVWF  FE9
03F7C:  MOVLW  00
03F7E:  ADDWFC x7C,W
03F80:  MOVWF  FEA
03F82:  MOVFF  FEA,680
03F86:  MOVFF  FE9,67F
03F8A:  MOVFF  E7,889
03F8E:  MOVFF  E6,888
03F92:  MOVFF  E5,887
03F96:  MOVFF  E4,886
03F9A:  MOVLB  8
03F9C:  CLRF   x8D
03F9E:  CLRF   x8C
03FA0:  MOVLW  13
03FA2:  MOVWF  x8B
03FA4:  MOVLW  88
03FA6:  MOVWF  x8A
03FA8:  MOVLB  0
03FAA:  CALL   0FD4
03FAE:  MOVFF  680,FEA
03FB2:  MOVFF  67F,FE9
03FB6:  MOVFF  03,684
03FBA:  MOVFF  02,683
03FBE:  MOVFF  01,682
03FC2:  MOVFF  00,681
03FC6:  MOVFF  FEA,686
03FCA:  MOVFF  FE9,685
03FCE:  MOVFF  03,88B
03FD2:  MOVFF  02,88A
03FD6:  MOVFF  01,889
03FDA:  MOVFF  00,888
03FDE:  MOVLB  8
03FE0:  CLRF   x8F
03FE2:  CLRF   x8E
03FE4:  MOVLW  01
03FE6:  MOVWF  x8D
03FE8:  MOVLW  2C
03FEA:  MOVWF  x8C
03FEC:  MOVLB  0
03FEE:  RCALL  3E5C
03FF0:  MOVFF  686,FEA
03FF4:  MOVFF  685,FE9
03FF8:  MOVFF  00,FEF
03FFC:  MOVFF  01,FEC
04000:  MOVFF  02,FEC
04004:  MOVFF  03,FEC


alternative - call a function to perform multiply and divide
Code:
... calib_parameters->oil_flow_rate = math_imd(oil_measured_flow, 5000, 300);
03F74:  MOVLW  10
03F76:  MOVLB  6
03F78:  ADDWF  x7B,W
03F7A:  MOVWF  01
03F7C:  MOVLW  00
03F7E:  ADDWFC x7C,W
03F80:  MOVWF  03
03F82:  MOVFF  01,67D
03F86:  MOVWF  x7E
03F88:  MOVFF  E7,689
03F8C:  MOVFF  E6,688
03F90:  MOVFF  E5,687
03F94:  MOVFF  E4,686
03F98:  CLRF   x8D
03F9A:  CLRF   x8C
03F9C:  MOVLW  13
03F9E:  MOVWF  x8B
03FA0:  MOVLW  88
03FA2:  MOVWF  x8A
03FA4:  CLRF   x91
03FA6:  CLRF   x90
03FA8:  MOVLW  01
03FAA:  MOVWF  x8F
03FAC:  MOVLW  2C
03FAE:  MOVWF  x8E
03FB0:  MOVLB  0
03FB2:  CALL   1106
03FB6:  MOVFF  67E,FEA
03FBA:  MOVFF  67D,FE9
03FBE:  MOVFF  00,FEF
03FC2:  MOVFF  01,FEC
03FC6:  MOVFF  02,FEC
03FCA:  MOVFF  03,FEC


math_imd codes as:
Code:

... //---------------------------------------------------------
... s32bit math_imd(s32bit val, s32bit multiplier, s32bit divisor)
... //---------------------------------------------------------
... {
... return (val * multiplier) / divisor;
*
01106:  MOVFF  689,889
0110A:  MOVFF  688,888
0110E:  MOVFF  687,887
01112:  MOVFF  686,886
01116:  MOVFF  68D,88D
0111A:  MOVFF  68C,88C
0111E:  MOVFF  68B,88B
01122:  MOVFF  68A,88A
01126:  RCALL  0FD4
01128:  MOVFF  03,695
0112C:  MOVFF  02,694
01130:  MOVFF  01,693
01134:  MOVFF  00,692
01138:  MOVFF  03,69B
0113C:  MOVFF  02,69A
01140:  MOVFF  01,699
01144:  MOVFF  00,698
01148:  MOVFF  691,69F
0114C:  MOVFF  690,69E
01150:  MOVFF  68F,69D
01154:  MOVFF  68E,69C
01158:  RCALL  1030
.................... }
0115A:  RETLW  00
Pret



Joined: 18 Jul 2006
Posts: 92
Location: Iasi, Romania

Posted: Tue Mar 04, 2008 2:12 am

With some versions of CCS,
Code:
(*calib_parameters).oil_flow_rate
is better than
Code:
calib_parameters->oil_flow_rate


Another thing. How about:
Code:
(oil_measured_flow * 50) / 3
Or if your result requires speed more than precision, you can try
Code:
oil_measured_flow*16 + oil_measured_flow/2
which can be written as
Code:
(oil_measured_flow<<4) + (oil_measured_flow>>1)

Hope it helps...
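
To put a number on the speed-versus-precision trade-off above, here is a minimal sketch in standard C (not CCS-specific). The function names are made up for illustration, and the value is assumed to be non-negative so that the shifts are well defined. Note that the parentheses matter: without them, `a<<4 + a>>1` parses as `(a << (4 + a)) >> 1`, because `+` binds tighter than the shift operators.

```c
#include <stdint.h>

/* Exact scaling by 50/3 (the same ratio as 5000/300), truncating. */
uint32_t scale_exact(uint32_t x)
{
    return (x * 50u) / 3u;
}

/* Shift-and-add approximation: x*16 + x/2 = 16.5 * x.
   The true factor is about 16.667, so this reads roughly 1% low. */
uint32_t scale_approx(uint32_t x)
{
    return (x << 4) + (x >> 1);
}
```

For an input of 300 the exact version gives 5000 and the approximation gives 4950, so the shortcut only suits places where an error of about 1% is acceptable.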
Martin Berriman



Joined: 08 Dec 2005
Posts: 66
Location: UK

Posted: Tue Mar 04, 2008 2:50 am

Thanks for your reply Pret

Pret wrote:
With some versions of CCS,
Code:
(*calib_parameters).oil_flow_rate
is better than
Code:
calib_parameters->oil_flow_rate


I was not aware there was any difference there but just tried it and it definitely does save code space (saves 6 bytes using 4.063).
Edit: Just tried this in other places and it takes more space - strange.

Pret wrote:
Another thing. How about:
Code:
(oil_measured_flow * 50) / 3


Good point. This does save another 4 bytes. Not all of my code will have such nice numbers but it is at least something I can check through and improve where possible.

Pret wrote:
Or if your result requires speed more than precision, you can try
Code:
oil_measured_flow*16 + oil_measured_flow/2
which can be written as
Code:
(oil_measured_flow<<4) + (oil_measured_flow>>1)

Hope it helps...


Nice idea but accuracy is important. Will bear it in mind though and use where possible.

Thanks for your help.
Ttelmah
Guest







Posted: Tue Mar 04, 2008 3:18 am

The reason for the small improvement is that the compiler is already using a generic 'divide' routine in the original code.

One thought is to evaluate the sum as:

calib_parameters->oil_flow_rate = (oil_measured_flow * 4267) / 256;

This gives the same result to about four significant figures (4267/256 ≈ 16.668 against 5000/300 ≈ 16.667), yet will evaluate much faster (the compiler is smart enough to know that it can perform /256 by shifting one byte right).

Best Wishes
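
As a rough illustration of where the 4267 comes from, the sketch below (standard C, hypothetical helper names) derives the multiplier for a divide-by-256 and applies it. On the PIC the constant would simply be worked out by hand and hard-coded; the first function is only here to show the arithmetic.

```c
#include <stdint.h>

#define SCALE_SHIFT 8u   /* divide by 2^8 = 256 */

/* Multiplier that expresses num/den as (multiplier / 256), rounded to nearest. */
uint32_t scaled_multiplier(uint32_t num, uint32_t den)
{
    return ((num << SCALE_SHIFT) + den / 2u) / den;
}

/* Multiply, then divide by 256 with a shift instead of a general divide. */
uint32_t scale_by_256(uint32_t x, uint32_t mult)
{
    return (x * mult) >> SCALE_SHIFT;
}
```

With 5000 and 300 the multiplier comes out as 4267. An input of 3000 then scales to 50003 rather than the exact 50000, which is the small error being traded for speed. One caveat under these assumptions: x * 4267 must still fit in 32 bits, so x has to stay below about one million.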
Martin Berriman



Joined: 08 Dec 2005
Posts: 66
Location: UK

Posted: Tue Mar 04, 2008 3:43 am

Thanks Ttelmah,

Ttelmah wrote:
The reason for the small improvement, is that the compiler is already using a generic 'divide' routine in the original code.


Is it likely that I could improve over the generic divide by using my custom multiply and divide routine implemented in assembler since I know I always want to multiply and then divide?

Ttelmah wrote:
One thought, is to evaluate the sum as:

calib_parameters->oil_flow_rate = (oil_measured_flow * 4267) / 256;

This gives the same result to better than 4 decimals, yet will evaluate much faster (the compiler is smart enough to know that it can perform /256, by shifting one byte right).


Just tried that out. It does save space compared to the original code; however, it does not save as much as calling my math_imd routine with the 50 / 3 numbers.

Thanks for your suggestions.

Edit:
I also have a lot of sprintf calls to format data which I send to an LCD. Can I improve these?
Code:
sprintf(&buffer[0], "%cZL%c%c%lu", 0x1B, x, y, oil_measured_flow);
ckielstra



Joined: 18 Mar 2004
Posts: 3680
Location: The Netherlands

Posted: Tue Mar 04, 2008 6:30 am

The 32 bit division + multiply requires a lot of code space but this is about as good as it gets. The assembly code you show us is mostly for storing and retrieving the 32-bit parameters before the general multiply and divide routines are called.

From the small code fragments you show us it is difficult to give other optimization tips. Maybe there are other parts in your code taking a lot of space? Check the list file for this, especially printf lines can be expensive (hidden by a new subroutine call for every line).

Also consider another approach for your arithmetic. Do you really need 32-bit precision? Can you do the scaling only once, for example at start or end?
Martin Berriman



Joined: 08 Dec 2005
Posts: 66
Location: UK

Posted: Tue Mar 04, 2008 7:05 am

Thanks ckielstra,

ckielstra wrote:
The 32 bit division + multiply requires a lot of code space but this is about as good as it gets. The assembly code you show us is mostly for storing and retrieving the 32-bit parameters before the general multiply and divide routines are called.


Thought that would be the case. I was wondering whether I could improve on it by coding the multiply and divide myself, since I can leave results in specific registers. However, once I have refactored the code to use my math_imd routine, I would not save anything significant anyway.

ckielstra wrote:
From the small code fragments you show us it is difficult to give other optimization tips. Maybe there are other parts in your code taking a lot of space? Check the list file for this, especially printf lines can be expensive (hidden by a new subroutine call for every line).


Yes, the sprintf line that I show above takes 76 bytes and I have lots of these - some with more parameters and some with less. I am using an LCD where I send it data in a certain protocol over I2C - I therefore have to format what I wish to send first. If I could improve on sprintf that would help a lot. I never need to format floats so I wondered whether if I coded my own sprintf it would be better.

ckielstra wrote:
Also consider another approach for your arithmetic. Do you really need 32-bit precision? Can you do the scaling only once, for example at start or end?


I'm using long scaled integer arithmetic to avoid using floating point variables. I may be able to improve things further though.

Thanks for your comments.
ckielstra



Joined: 18 Mar 2004
Posts: 3680
Location: The Netherlands

Posted: Tue Mar 04, 2008 7:27 am

Here is an example of the difference in code size between sprintf and manual coding:

Code:
void main()
{
  int32 oil_measured_flow;
  int8 x,y;   // left unset here because only the code size is being compared
  char buffer[20];
 
  oil_measured_flow = 0x12345678;
 
  // sprintf takes 58 bytes + 5 calls to other functions
  sprintf(&buffer[0], "%cZL%c%c%lu", 0x1B, x, y, oil_measured_flow);
 
  // Manual code example below takes only 30 bytes and no function calls.
  // It leaves out the 'Z' and 'L' characters and stores the value as four
  // raw bytes, not as the decimal text that %lu produces.
  buffer[0] = 0x1B;
  buffer[1] = x;
  buffer[2] = y;
  buffer[3] = make8( oil_measured_flow, 0);  // Note: byte sequence here is not equal to the sprintf.
  buffer[4] = make8( oil_measured_flow, 1);
  buffer[5] = make8( oil_measured_flow, 2);
  buffer[6] = make8( oil_measured_flow, 3);
  buffer[7] = 0;   // string terminator
}
Martin Berriman



Joined: 08 Dec 2005
Posts: 66
Location: UK

Posted: Tue Mar 04, 2008 8:00 am

ckielstra wrote:
Here an example on the difference in code size between sprintf and manual coding:

Code:
void main()
{
snip
}


Wow!
Thank you very much for doing that - it is like a slap in the face to notice how much can be saved so easily.

I have just replaced one instance of it (including the ZL) and it saved 46 bytes. A quick check shows that I have 138 calls to sprintf so based on that I should be able to save around 6.7% code space!

Since I need similar code 138 times, do you think it is worth figuring out CCS variable parameter lists and implementing it as a function that I call 138 times or simply to code it inline as you have shown? Perhaps I would not know the answer until I tried it out. Quite a lot of times the number and type of parameters are the same so I could have one function to cover that option and code the rest inline.

Many thanks indeed.
ckielstra



Joined: 18 Mar 2004
Posts: 3680
Location: The Netherlands

Posted: Tue Mar 04, 2008 8:12 am

138 variations on the same theme sounds like an opportunity for optimization. Take note that pointer arithmetic is using a lot of code space in the PIC processor, so try to avoid passing variables as pointers.
To find the best solution will require some testing. In tweaking code there is no sure way to predict the most optimal solution.
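
One way to act on both points (a shared routine, and no pointer parameters) is sketched below in standard C. The global buffer `lcd_buffer`, its size, and the function `lcd_header` are all hypothetical names for illustration; the ESC 'Z' 'L' x y layout is taken from the sprintf format string earlier in the thread. Whether this actually comes out smaller under CCS would need checking in the list file.

```c
#include <stdint.h>

/* Hypothetical shared buffer, addressed directly instead of via a pointer. */
char lcd_buffer[24];

/* Writes the common ESC 'Z' 'L' x y prefix at the start of lcd_buffer.
   Returns the index of the next free byte so the caller can append. */
uint8_t lcd_header(uint8_t x, uint8_t y)
{
    lcd_buffer[0] = 0x1B;
    lcd_buffer[1] = 'Z';
    lcd_buffer[2] = 'L';
    lcd_buffer[3] = (char)x;
    lcd_buffer[4] = (char)y;
    return 5;
}
```

Each call site would then call `lcd_header(x, y)` and append only the part that differs, starting at the returned index.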
Martin Berriman



Joined: 08 Dec 2005
Posts: 66
Location: UK

Posted: Tue Mar 04, 2008 8:32 am

ckielstra wrote:
138 variations on the same theme sounds like an opportunity for optimization. Take note that pointer arithmetic is using a lot of code space in the PIC processor, so try to avoid passing variables as pointers.
To find the best solution will require some testing. In tweaking code there is no sure way to predict the most optimal solution.


More useful tips - I do tend to pass pointers generally so will have to be careful of that in future.
Thanks again.
Ken Johnson



Joined: 23 Mar 2006
Posts: 197
Location: Lewisburg, WV

Posted: Tue Mar 04, 2008 8:41 am

"I'm using long scaled integer arithmetic to avoid using floating point variables."

Why?

A lot of folks here disagree with me on this, but I use floats a lot - makes code much simpler and more readable (maintainable). Yes, there are instances where the speed penalty comes into play, but . . .

Look at the project requirements, rather than just saying "Don't use floats"

Ok, there's 2 cents worth, which may not be worth that much.

Ken
Martin Berriman



Joined: 08 Dec 2005
Posts: 66
Location: UK

Posted: Wed Mar 05, 2008 1:59 am

Ken Johnson wrote:
A lot of folks here disagree with me on this, but I use floats a lot - makes code much simpler and more readable (maintainable). Yes, there are instances where the speed penalty comes into play, but . . .


Hi Ken,
Adding two floats takes more code than adding two int32s.

I've realised that my earlier enthusiasm for the simplification that ckielstra suggested was a bit misguided. For example:

Code:
sprintf(&buffer[0], "%cZL%c%c%lu", 0x1B, x, y, oil_measured_flow);


oil_measured_flow needs to be sent to the LCD as a string rather than 4 bytes so I still have a problem.
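
For reference, a hand-rolled replacement for the %lu conversion might look something like the sketch below. It is plain standard C with a hypothetical function name, and it has not been measured under CCS, so it may or may not beat the library routine on code size.

```c
#include <stdint.h>

/* Writes value as decimal ASCII at dst and NUL-terminates it.
   Returns the number of digits written. dst needs room for 11 bytes. */
uint8_t u32_to_dec(char *dst, uint32_t value)
{
    char tmp[10];            /* 4294967295 has 10 digits */
    uint8_t n = 0;
    uint8_t i;

    /* Generate digits least significant first. */
    do {
        tmp[n++] = (char)('0' + (value % 10u));
        value /= 10u;
    } while (value != 0u);

    /* Copy them out in reading order. */
    for (i = 0; i < n; i++) {
        dst[i] = tmp[n - 1u - i];
    }
    dst[n] = '\0';
    return n;
}
```

The 32-bit divide and modulo by 10 are the expensive part here, so any saving would come from sharing this one routine across many call sites rather than from the routine itself being small.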

Another example:
Code:
sprintf(&buffer[0], "%cZL%c%c%02u/%02u/%02u", 0x1B, ScreenCentreX + 5, TextLine2T, cal_date.day, cal_date.month, cal_date.year);


where cal_date.day is a byte that I want to display as 02 etc
I guess I could code it as:

Code:

buffer[0] = 0x1B;
buffer[1] = 'Z';
buffer[2] = 'L';
buffer[3] = ScreenCentreX + 5;
buffer[4] = TextLine2T;
buffer[5] = '0' + (cal_date.day / 10);
buffer[6] = '0' + (cal_date.day % 10);
buffer[7] = '/';
buffer[8] = '0' + (cal_date.month / 10);
buffer[9] = '0' + (cal_date.month % 10);
buffer[10] = '/';
buffer[11] = '0' + (cal_date.year / 10);
buffer[12] = '0' + (cal_date.year % 10);
buffer[13] = 0;


but that takes more code space (144 bytes) than the sprintf (126 bytes).

Am I missing something obvious?
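
One thing that may be worth trying is to factor the repeated divide/modulo pair into a single routine, so that the arithmetic is emitted once instead of three times. The sketch below is standard C with hypothetical names (`put_2digits`, `format_date`); it does use a pointer, which ckielstra warned can be costly on the PIC, so the result would have to be checked in the list file.

```c
#include <stdint.h>

/* Writes v (assumed 0..99) as two ASCII digits at dst[0] and dst[1].
   Returns a pointer to the next free position. */
char *put_2digits(char *dst, uint8_t v)
{
    dst[0] = (char)('0' + (v / 10u));
    dst[1] = (char)('0' + (v % 10u));
    return dst + 2;
}

/* Formats dd/mm/yy into dst, NUL-terminated (9 bytes needed). */
void format_date(char *dst, uint8_t day, uint8_t month, uint8_t year)
{
    dst = put_2digits(dst, day);
    *dst++ = '/';
    dst = put_2digits(dst, month);
    *dst++ = '/';
    dst = put_2digits(dst, year);
    *dst = '\0';
}
```

Called as `format_date(&buffer[5], cal_date.day, cal_date.month, cal_date.year)`, this would fill the same positions as the inline version above.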
Ttelmah
Guest







Posted: Wed Mar 05, 2008 3:42 am

Ages ago I posted a 'demo' routine here showing a more efficient way of doing the arithmetic for this, though I cannot remember what the thread was about. Basically, when you perform a /10, the remainder is available in one of the compiler's temporary variables, and with a bit of ingenuity it is possible to retrieve it, saving a second operation to generate the '%10' value. I suspect that 'sprintf' may actually be doing something like this.

Best Wishes
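
The compiler-internal trick described above is specific to CCS and is not reproduced here. As a portable stand-in for the same idea (one division, with the remainder obtained without a second divide), the sketch below shows two standard C options; the function names are made up for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* One division; the remainder is recovered with a multiply and subtract. */
void split_mulsub(uint8_t v, uint8_t *quot, uint8_t *rem)
{
    *quot = (uint8_t)(v / 10u);
    *rem  = (uint8_t)(v - *quot * 10u);
}

/* Standard library div() returns quotient and remainder together. */
void split_div(int v, int *quot, int *rem)
{
    div_t d = div(v, 10);
    *quot = d.quot;
    *rem  = d.rem;
}
```

Which of these (if either) is cheaper than separate / and % on a given compiler is something only the generated listing can show.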
Martin Berriman



Joined: 08 Dec 2005
Posts: 66
Location: UK

Posted: Wed Mar 05, 2008 3:47 am

Ttelmah wrote:
Ages ago, I posted here, a 'demo' routine showing a more efficient way of doing the arithmetic for this. Cannot remember what the thread was about!. Basically, when you perform an /10, the remainder, is available in one of the compiler's temporary variables, and with a bit of ingenuity, it is possible to retrieve this, saving having to perform a second operation to generate the '%10' value. I'd suspect that possibly 'sprintf', may actually be doing something like this.


Thanks Ttelmah, I will search for it.
Edit: Not found it yet - I have found something that might be along similar lines though (http://www.ccsinfo.com/forum/viewtopic.php?p=53435#53435).
Edit2: Just came across another of your posts saying that switch statements with defaults take more code - removed a few of them (since I handle all options anyway) and it saves me another 262 bytes!
All times are GMT - 6 Hours