Optimised for loop

 
ouille
Guest







Optimised for loop
Posted: Fri Sep 03, 2004 8:58 am

Hello,
The CCS C for loop is not very efficient.

for (i=0;i<8;i++) // ...;
takes 63 cycles on a 16F device at opt level 9.

for (i=7;i;i--) // ...;
takes 49 cycles.

i=8;
#asm
loop:
#endasm
//...
#asm
decfsz i,f
goto loop
#endasm
uses 25 cycles.

It seems that the C versions take about twice as many cycles as optimised asm.

Why doesn't the optimiser improve a structure like this:
0726: MOVF 59,W
0727: DECF 59,F
0728: XORLW 00 // Z flag already set
0729: BTFSC 03.2 // could be replaced by decfsz
072A: GOTO 72C
072B: GOTO 726

This optimisation for small loops is quite important on slow devices.
Is there a C syntax that produces better code?
ouille
Guest







The right question is how to optimize a for loop...
Posted: Fri Sep 03, 2004 11:10 am

See my first message.
bdavis



Joined: 31 May 2004
Posts: 86
Location: Colorado Springs, CO

Posted: Fri Sep 03, 2004 12:09 pm

Have you tried a do {} while or a while {} loop?
Maybe that will work better?
ouille
Guest







for loop
Posted: Fri Sep 03, 2004 12:51 pm

It is the same problem with do-while: almost the same execution time.
ckielstra



Joined: 18 Mar 2004
Posts: 3680
Location: The Netherlands

Posted: Fri Sep 03, 2004 3:42 pm

You have a good point here.
Which compiler version are you using? The latest compiler versions improved on optimization (although mainly for the PIC18).
bdavis



Joined: 31 May 2004
Posts: 86
Location: Colorado Springs, CO

Posted: Fri Sep 03, 2004 7:51 pm

It must be your version of the compiler or the chip type...
I got this in version 3.202 of the PCHW compiler...
.................... for (i=0; i<16; i++)
0040: CLRF 06
0042: MOVF 06,W
0044: SUBLW 0F
0046: BNC 004E
.................... {
.................... #asm
.................... nop
0048: NOP
.................... #endasm
.................... }
004A: INCF 06,F
004C: BRA 0042

If you exclude the nop that I put in, it's 6 instructions and 5 cycles per additional loop - that's good. I can do nothing really fast!!

I did find that doing an xor to flip a single bit sucks - 10 cycles
Then I did an if/else to set or clear the bit - 5 instructions
Then I looked at the assembler - BTG (bit toggle) - 1 instruction
Then I read the readme file - new function to toggle a bit - 1 instruction, I think :)

It's all a learning process for me on what is optimized and what isn't. I have been fairly happy with the 18Fxxx so far though...

Good luck!
ouille
Guest







for loop
Posted: Sat Sep 04, 2004 2:13 am

Hello,

My compiler version is 3.190, and I work on 16F PICs.

The overhead in your for loop is good. The compiler seems to optimise a little.
I learned a long time ago that writing a for loop with i++ is not efficient on a microcontroller, because some instructions modify the Z flag directly, which avoids the comparison.
It's better to compare with 0.
for (i=15;i>=0;i--) ...
In this case the asm is:
...
decfsz i,f
goto loop begin
which is 3 cycles, plus one decf i,w for reading the loop index.
I find it strange that CCS doesn't optimise this kind of loop, as such loops are relatively frequent in microcontroller programs.

Bye
bdavis



Joined: 31 May 2004
Posts: 86
Location: Colorado Springs, CO

Posted: Sat Sep 04, 2004 11:20 am

Yup - the decrementing for loop is good for ARM processors too. I tried it on the CCS compiler and it didn't fully optimize it. I did try the following and it works great!

The repeated loop is 3 cycles if you exclude the nop...
It will loop 256 times since I was stupid enough to init i to zero instead of something smaller.

.................... i = 0;
0040: CLRF 06
.................... do
.................... {
.................... #asm
.................... nop
0042: NOP
.................... #endasm
.................... i--;
0044: DECF 06,F
.................... }while (i>0);
0046: MOVF 06,F
0048: BNZ 0042
PCM programmer



Joined: 06 Sep 2003
Posts: 21708

Posted: Sat Sep 04, 2004 11:54 am

Quote:
CCS C for loop is not very efficient
for (i=0;i<8;i++) // ...;
63 cycles on 16f device opt level 9.
my compiler version is 3.190, and I work on 16f pics


I installed PCM vs. 3.190 and compiled the test program shown below.
The loop code only takes 7 cycles. I tried it with and without #opt 9.
It compiles the same in each case.

How did you get 63 cycles ? Are you counting your code that's
in the body of the loop ? But that's not part of the loop control code.

Code:
#include <16F877.H>
#fuses XT, NOWDT, NOPROTECT, BROWNOUT, PUT, NOLVP
#use delay(clock = 4000000)

#define nop() #asm nop #endasm

//====================================
void main()
{
char i;

for(i=0;i<8;i++)
   {
    nop();
   }

while(1);
}

Code:
.................... for(i=0;i<8;i++)
000A 1283  BCF    03.5
000B 01A1  CLRF   21
// The loop starts here:
000C 0821  MOVF   21,W   // 1 cycle
000D 3C07  SUBLW  07     // 1 cycle
000E 1C03  BTFSS  03.0   // 2 cycles (skip normally taken)
000F 2813  GOTO   013
....................    {
....................     nop();
0010 0000  NOP
....................    }
0011 0AA1  INCF   21,F   // 1 cycle
0012 280C  GOTO   00C    // 2 cycles
ouille
Guest







for loop optimisation
Posted: Sun Sep 05, 2004 9:49 am

Hello,

63 cycles is for the entire loop (8 iterations).
Each iteration is 7 cycles, plus some loop overhead.

My first question was perhaps confusing.
What C code compiles into an efficient for loop?
5 cycles from bdavis is better, but why is it impossible to achieve 3 cycles?
Trampas



Joined: 04 Sep 2004
Posts: 89
Location: NC

Posted: Sun Sep 05, 2004 5:49 pm

Well, you have to realize the processor only recognizes zero. That is, all compares are done by checking whether a value is zero or not. Sort of...

Therefore, to get the best performance out of loops, have all loops end at zero. For example, look at this:

Code:
for(i=7; i!=0; i--)
002FB4    0E07     MOVLW 0x7
002FB6    6F44     MOVWF 0x44, BANKED
002FB8    5344     MOVF 0x44, F, BANKED
002FBA    E003     BZ 0x2fc2
{
   #asm nop #endasm
002FBC    0000     NOP
}
002FBE    0744     DECF 0x44, F, BANKED
002FC0    D7FB     BRA 0x2fb8


Looks a lot like bdavis' code...

The real reason is that most programmers use the for-loop variable to index into data. That is, you use the variable i in your for loop to access arrays or do other calculations. Some compilers do a dependency check on i within the loop and, if it is not used, generate the more efficient loop that counts down to zero. However, not all compilers are that smart.

Trampas

ouille
Guest







for loop optimisation
Posted: Mon Sep 06, 2004 12:45 pm

Hello,

Hi Trampas, your answer was exactly what I was hoping for.
I don't know why I hadn't tested with i!=0!

I've tested your code on a 16F device, but the results are not quite as good:
Code:
062E:  MOVLW  08 ; initialisation ok
062F:  MOVWF  52 ; init
0630:  MOVF   52,F ;read i ...
0631:  BTFSC  03.2 ;zero testing
0632:  GOTO   635
...
0633:  DECF   52,F ;decrement
0634:  GOTO   630


But why doesn't CCS use a decfsz?
ouille
Guest







for loop optimisation
Posted: Mon Sep 06, 2004 12:49 pm

Thanks to all, I've got it:

Code:

062E:  MOVLW  07 ;init
062F:  MOVWF  52 ;init
0630:  NOP
0631:  DECFSZ 52,F ;loop ...
0632:  GOTO   630  ; 3 cycle ... ok


and the c code is:
Code:

   i=7;
   do
   #asm
   nop
   #endasm
   while (--i!=0);

My initialisation is probably wrong (i=niter+2).

Bye.