CCS C Software and Maintenance Offers
FAQFAQ   FAQForum Help   FAQOfficial CCS Support   SearchSearch  RegisterRegister 

ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

CCS does not monitor this forum on a regular basis.

Please do not post bug reports on this forum. Send them to support@ccsinfo.com

Code Efficiency/Speed

 
Post new topic   Reply to topic    CCS Forum Index -> General CCS C Discussion
View previous topic :: View next topic  
Author Message
jecottrell



Joined: 16 Jan 2005
Posts: 559
Location: Tucson, AZ

View user's profile Send private message

Code Efficiency/Speed
PostPosted: Sat Nov 25, 2006 1:27 am     Reply with quote

3.236

I've got a multiconditional statement in my main loop:

Code:

if(x && y && z)
{
   do this
}


Is it more efficient to use the single, multi-conditional 'if' statement or use a nested if, that has the simplest or quickest condition in the outermost if?

Code:

if(x)  //this is the least likely to be true
{
   if(y && z)
   {
      do this
   }
}


I would think it has to do with the product of the compilation and whether there is a 'guarantee' that the conditions in the 'multiconditional' if statement are checked in the order they are written?

As an additional question that may help with future problems like this, can anyone recommend a good source (book?) for learning PIC assembly, so that I may be able to decipher LST files?

Thanks,

John
kender



Joined: 09 Aug 2004
Posts: 768
Location: Silicon Valley

View user's profile Send private message Send e-mail Visit poster's website Yahoo Messenger

PostPosted: Sat Nov 25, 2006 4:23 am     Reply with quote

If I'm not mistaken, the ANSI C spec requires that:
- These operators guarantee left-to-right evaluation.
- If the compiler can make an evaluation by examining only the left operand, the right operand is not evaluated.
Ttelmah
Guest







PostPosted: Sat Nov 25, 2006 9:00 am     Reply with quote

What does ANSI have to do with it?.
CCS, is not ANSI C....
The difference wll be almost immesurable for simple simple 8bit values and tests, since adding an extra test/jump, will take as long as just anding in the extra byte...

Best wishes.
jecottrell



Joined: 16 Jan 2005
Posts: 559
Location: Tucson, AZ

View user's profile Send private message

PostPosted: Sat Nov 25, 2006 10:08 am     Reply with quote

Kender,

Thanks for the info. I had wondered about the ANSI-ishness of CCS as noted in Ttelmah's response.

Ttelmah,

I think I understand your response, but I'm not positive. To further explain the situation, the 'if' statement is used in my main loop and obviously is tested every time through which adds overhead to the loop. Now, I have one particular case that will evaluate to FALSE 92% of the time. If I test that case first I will save any unnecessary branching... So, is there any way to maximize the efficiency of the multiple case 'if' statement?

I think in your previous reply you are saying that to test all the cases it is here nor there whether it is done either way. But my brain tells me that there must be some overhead in fetching and comparing extra conditions and if you can eliminate any unnecessary branches you'll save processing time? So, the sooner you can get out of the conditional branch the better?
Mark



Joined: 07 Sep 2003
Posts: 2838
Location: Atlanta, GA

View user's profile Send private message Send e-mail

PostPosted: Sat Nov 25, 2006 10:32 am     Reply with quote

The best way to know is to try them both and then look at the asm generated in the LST file.
jecottrell



Joined: 16 Jan 2005
Posts: 559
Location: Tucson, AZ

View user's profile Send private message

PostPosted: Sat Nov 25, 2006 11:35 am     Reply with quote

Mark,

I had anticipated that response in my original post, that's why I asked the question about a good reference for PIC ASM. I found an OK online tutorial, but I always like to have a hard copy to dogear pages and markup. Do you know of any such animal?

BTW, I will try it and see if I can decipher the LST file.

Thanks,

John
Ttelmah
Guest







PostPosted: Sat Nov 25, 2006 4:17 pm     Reply with quote

Ok.
The point about my answer, relates to how things work on the PIC. On many processors (most), there is a significant timing difference between the true, and false branches on a test. On the PIC, there isn't. The 'straight through' path, using a normal test, still fetches the extra instruction for the branch, and takes an extra cycle.
Now if you take three numbers, and need to logically combine them, and branch,the basic sequence would be:
load number 'A' into W.
combine with 'B'.
combine with 'C'
test and branch.

This can actually be done in five machine cycles (plus any overhead for bank switching).

If instead you 'double test', you get:
load number 'A' into W.
combine with 'B'.
test and branch.
combine with 'C'
test and branch.

The 'best' case, drops to 4 machine cycles, but for everyting that needs the second test, the time now rises to 7 cycles.
For your 90% probability, it would then be worthwhile to do the shorter test, but in many case it wouldn't (anything with less than 66% probability for the first test, it is not worthwile).
However this makes a 'key' assumption in both cases, that the registers being tested, are in the same bank as is being used by the operational registers in the rest of the particular routine. If they are not, another four machine cycles need to be added (potentially for each switch). The problem here, is that because the registers need to be restored before the branch, or seperately afterwards, this more than doubles the cycles involved. Also, the 'combine' operation, is assumed to be a single machine operation. This is not the case for more complex logic, which can result in the 'all in one' operation taking the lead.
Whether shortcutting gains anything, depends particularly then on the logic operation involved. In fact in your case, the logic evaluates as:

if ((a!=0) && (b!=0) && (c!=0))

On the PIC, loading a value from a register, sets the 'Z' flag if the value is zero, so the compiler takes advantage of this, and does not combine the values at all, but instead simply reads each one in turn, and jumps past the code to be called if this value is non zero. Also on the latter 18 chips, there is a single instruction 'short' jump for the zero condition, and this can be used to make this really efficient, provided the code inside the test is short. This is by default done, and identical code will be generated for the combined, or the separate operation.
However it also becomes vital to use the .sym file, and verify that all registers used in the core 'fast' loop, reside in the same bank (change the order they are declared if not), and even better, make sure they are in the first bank (where scratch variables etc., are stored), since otherwise it is the bank switching which will cost the most in time. Also keep the total length of the code inside the routine called if the test is true short (if there is a subroutine, make sure it is not coded as 'inline'), or the longer (slower) jump will need t be used.
PCM programmer has published notes in the pst about how to optimise register layouts, and while these my seem not to apply to the 18 chips (where there are single instructions to fetch values from any part of the memory), they are still just as important, since these larger instructions, take two machine cycles...

Best Wishes
jecottrell



Joined: 16 Jan 2005
Posts: 559
Location: Tucson, AZ

View user's profile Send private message

PostPosted: Sat Nov 25, 2006 5:09 pm     Reply with quote

Thank you very much for the in-depth explanation. I saw some of what you explained as a picked through the LST and SYM files. I even saw that one of my conditions was a function with a return value and the increased overhead that it cost.

I also saw the 'straight through' concept you explained. Now it seems rather obvious. On a processor of limited size, when testing a variety of conditions, A && !B && C, etc., that each will be done individually then branch. I DID NOT realize that similar conditions would be combined prior to testing. That is interesting, but makes complete sense.

What I've really learned is, is that I probably don't need to worry about this yet. But it is very interesting to start to understand the underlying ASM generated from the C code.

Thanks again,

John
Humberto



Joined: 08 Sep 2003
Posts: 1215
Location: Buenos Aires, La Reina del Plata

View user's profile Send private message

PostPosted: Sun Nov 26, 2006 8:31 am     Reply with quote

Nice and interesting discussion. Letīs say that the CCS compiler handle very efficiently
all regarding bit/test and the generated code is as short and efficient like in ASM.


It doesnīt matter the coding style you use, itīll give you the SAME generated code:
Code:

// 1) Coding in this way:
 
 if(x && y && z)
   { a = 1; }

// 2) Coding in this way:         
 
  if(x)
    if(y)
      if(z)
        { a = 1; }

// 3) Coding in this way:
   
   if ((x!=0) && (y!=0) && (z!=0))
      { a = 1; }



The CCS Compiler generated the SAME and SHORTEST ASM code.
Code:


  ....................         if(x && y && z)
0019:  MOVF   x,F
001A:  BTFSC  STATUS.2
001B:  GOTO   024
001C:  MOVF   y,F
001D:  BTFSC  STATUS.2
001E:  GOTO   024
001F:  MOVF   z,F
0020:  BTFSC  STATUS.2
0021:  GOTO   024
....................           {
....................            a = 1;
0022:  MOVLW  01
0023:  MOVWF  a
....................           }


...................         if(x)
0024:  MOVF   x,F
0025:  BTFSC  STATUS.2
0026:  GOTO   02F
....................           if(y)
0027:  MOVF   y,F
0028:  BTFSC  STATUS.2
0029:  GOTO   02F
....................             if(z)
002A:  MOVF   z,F
002B:  BTFSC  STATUS.2
002C:  GOTO   02F
....................               {
....................                a = 1;
002D:  MOVLW  01
002E:  MOVWF  a
....................               }
....................



....................
....................        if ((x!=0) && (y!=0) && (z!=0))
002F:  MOVF   x,F
0030:  BTFSC  STATUS.2
0031:  GOTO   03A
0032:  MOVF   y,F
0033:  BTFSC  STATUS.2
0034:  GOTO   03A
0035:  MOVF   z,F
0036:  BTFSC  STATUS.2
0037:  GOTO   03A
....................           {
....................            a = 1;
0038:  MOVLW  01
0039:  MOVWF  a
....................           }


There is no way to do it in a shortest way, even in ASM.


Humberto
jecottrell



Joined: 16 Jan 2005
Posts: 559
Location: Tucson, AZ

View user's profile Send private message

PostPosted: Sun Nov 26, 2006 8:55 am     Reply with quote

Humberto,

Thanks for the test. What part is that targeted for? I noticed in my experimentation for the 18F2525, that it uses a BRA (branch?) instruction where yours uses a GOTO. I'm wondering if the behavior that Ttelmah describes, combine and then branch is unique to 18F parts?

Even then, I still have decided that there is a slight gain to be realized depending on the test values. My example in my first post should have been more accurate and have looked more like this:

Code:

if((!kbhit) && (!function1()) && (x == 5) && (etc.....))
{
   do this
}



With a variety of conditions there is some minor optimization that can achieved manually in the code that is fed to the compiler.

Thanks again,

John
Humberto



Joined: 08 Sep 2003
Posts: 1215
Location: Buenos Aires, La Reina del Plata

View user's profile Send private message

PostPosted: Sun Nov 26, 2006 10:31 am     Reply with quote

Quote:

Thanks for the test. What part is that targeted for?


Hi John,

The posted code is valid for any mid-range PIC ucontroller using PCM Compiler V3.236
Regarding your preposition style it is a matter of looking in the resulting .lst file
generated and you will see the difference if so.


Humberto
Display posts from previous:   
Post new topic   Reply to topic    CCS Forum Index -> General CCS C Discussion All times are GMT - 6 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group