CCS does not monitor this forum on a regular basis.

Please do not post bug reports on this forum. Send them to support@ccsinfo.com

Hard to explain problem, CALL to random places

 
Laurent Chouinard



Joined: 12 Sep 2003
Posts: 43

Posted: Sat Apr 28, 2007 8:20 pm

Not only is my problem hard to diagnose, but my project is quite big, so sharing code here is problematic. I am still trying my luck, in case someone has insight, tips or ideas that could explain the behavior I am seeing.

I use compiler version 4.033 on a PIC18F6722.

So, let's assume I have a main() function, lots of includes, lots of functions, lots of ram consumed (80% worst case!), and about 45% of the rom used.

This main function does a lot of things, and then something like this:
Code:
superFunction();

if (event) {
  strcpy(someString, "Blah blah.");
}


someString is defined in the very beginning, in main.h, as: int8 someString[100];

"event" does not happen, yet, in the design i'm working on. Therefore, the above code, setting "someString" to "Blah blah." will never happen. Nevertheless, it is compiled.

Frequently, the "superFunction()" called by main() is handling buffers and comes to a switch case that has about 10 cases. This works fine at every pass.

I decided to change the wording shown to the user a bit, so in the code I'm currently working on (the "event" in main), I changed the phrase to this one:

Code:
if (event) {
  strcpy(someString, "Blah blahBAR.");
}


It is 3 bytes longer than previously. "someString" is declared as 100 bytes long, so there's enough room. Also, numerous places in the entire program quite often write 40 bytes to that array without ever causing any problem. But today, adding these THREE characters to this "strcpy" call, which is not even executed in real life because "event" doesn't happen, triggers a very precise problem.
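One way to rule out a plain buffer overrun is to let the compiler check the size. The following is only a sketch, assuming a C11 compiler: `_Static_assert` is standard C11 and may not be available in CCS C, `char` stands in for the CCS `int8` type, and `EVENT_MESSAGE` and `set_event_message` are hypothetical names, not part of the original project.

```c
#include <stddef.h>
#include <string.h>

/* Buffer as described in the post (int8 someString[100] in main.h);
   char is used here so the sketch builds with a standard C compiler. */
char someString[100];

/* The new, longer message from the post (hypothetical macro name). */
#define EVENT_MESSAGE "Blah blahBAR."

/* sizeof on a string literal includes the terminating NUL, so the build
   fails if the literal could ever overflow the buffer. */
_Static_assert(sizeof(EVENT_MESSAGE) <= sizeof(someString),
               "EVENT_MESSAGE does not fit in someString");

/* Hypothetical helper: copies the message and returns its length. */
size_t set_event_message(void)
{
    strcpy(someString, EVENT_MESSAGE);
    return strlen(someString);
}
```

The literal needs 14 bytes including the terminator, far below the 100 available, which supports the point that the copy itself is not the culprit.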

In "superFunction()", which is called at every pass of the main loop, there is a switch with 10 cases around the middle of it. Instead of going through the cases, the compiled code executes a very nice CALL. This call lands in about the middle of a completely different function! Right in the MIDDLE! So of course that function craps everything up because it's handling unpredictable data. Since it was reached by a CALL, the function ends (abruptly), and execution returns to the switch case, at the end of it.

From there on, the program resumes its merrily happy execution, but a good portion of the RAM has been tampered with by the runaway CALL.

So now, the funny part. If I remove three characters from ANY (bold required) "strcpy" call, anywhere in the program, the problem goes away. Adding three new characters to any other strcpy? You guessed it, the problem comes back.

This has proven to be quite infuriating. I have spent the entire day on this, no amount of step-tracing in my emulator, variable watching or code commenting will help me find the cause.

If any of you here in these forums have seen anything like this, I'd be glad to hear the steps you took to fix it.

Thank you.
Ttelmah
Guest







Posted: Sun Apr 29, 2007 2:32 am

First, look at the top of the list file generated. What does it show for the number of stack levels used? If this is beyond the maximum supported by the chip, then this is the 'answer', and you need to reduce the 'depth' of the tree of subroutines. This is relatively unlikely on these larger chips (31 stack levels), but would be the most likely cause on a smaller chip.

Constant strings are handled in a very 'non intuitive' way. They generate a program which returns the required byte, according to a value passed representing the 'index'. So, when the string gets longer, so does the program. Is 'superFunction' called in multiple places? If not, then the odds are that it will be generated 'inline' (this is the default behaviour). If this is the case, then the extra bytes may well be making 'main' itself grow. Now it is possible that your code size is reaching the point where some routines are having to shift above the 0xFFFF/0x10000 boundary. If so, then there can be a sudden 'jump' in code size involved, and new problems introduced, as previously working code fails to handle the extra address bit properly. If (for instance) you are handling interrupt code yourself, and not saving the extra address bit, then problems would appear at this point. You might want to try explicitly declaring superFunction as 'separate', to see if this changes what happens.
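As a rough picture of the constant-string handling described above, the string can be modelled as a small routine that returns one byte per index, with the copy loop calling it repeatedly. This is only a standard C sketch of the idea, not the code CCS actually emits; `rom_string_byte` and `copy_rom_string` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the generated "program": one return per
   character, so every extra character makes the routine longer. */
char rom_string_byte(uint8_t index)
{
    switch (index) {
    case 0:  return 'B';
    case 1:  return 'l';
    case 2:  return 'a';
    case 3:  return 'h';
    default: return '\0';   /* terminator for any index past the end */
    }
}

/* Copy the string into RAM one byte at a time, stopping at the NUL or
   when the destination is full. Returns the number of characters
   copied, not counting the terminator. */
size_t copy_rom_string(char *dst, size_t dst_size)
{
    size_t i = 0;

    if (dst_size == 0)
        return 0;

    while (i + 1 < dst_size) {
        char c = rom_string_byte((uint8_t)i);
        dst[i] = c;
        if (c == '\0')
            return i;
        i++;
    }
    dst[i] = '\0';          /* destination full: terminate what fits */
    return i;
}
```

In this model, lengthening the string adds code to `rom_string_byte`, which shifts everything placed after it in program memory. That is how a change in one string can move unrelated routines.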

Are you using external memory? If so, there is an erratum relating to the stack, which might well introduce problems. There is also one to do with certain instructions resulting in registers getting the wrong value, and you might want to check with CCS whether the fix is in place. Unfortunately, sometimes tiny changes in program size can result in this sort of fault appearing, which makes diagnosis a pig...

Best Wishes
Guest








Posted: Sun Apr 29, 2007 12:04 pm

Ttelmah,

Here is the top of the LST file.

Code:
This is with the bug in place.
CCS PCH C Compiler, Version 4.033, 32282               29-Apr-07 13:30

               Filename: main.lst

               ROM used: 53676 bytes (41%)
                         Largest free fragment is 65536
               RAM used: 3146 (82%) at main() level
                         3711 (97%) worst case
               Stack:    13 worst case (9 in main + 4 for interrupts)


Stack levels
I am still comfortable in the stack levels, so this probably is not causing any problem. Furthermore, I use the ICE 2000 in circuit emulator. If there were any stack overflow or underflow condition, execution would halt and I would receive an error message.

Boundaries
Your paragraph regarding the code boundaries is very interesting because I remember 3 years ago, the CCS compiler I was using then had a bug with code memory banking. As the program grew, it went outside of the bank boundaries and triggered the compiler bug, which did some unexpected things.

Now here's a LST freshly compiled without the bug present. I removed 3 characters from a strcpy. As a result, 4 bytes are removed from the ROM (instead of 3?)
Code:
This is with the bug absent (3 characters removed from strcpy)
CCS PCH C Compiler, Version 4.033, 32282               29-Apr-07 13:33

               Filename: main.lst

               ROM used: 53672 bytes (41%)
                         Largest free fragment is 65536
               RAM used: 3146 (82%) at main() level
                         3711 (97%) worst case
               Stack:    13 worst case (9 in main + 4 for interrupts)


Does that strike you as crossing any kind of boundaries, banking or something?

The rest
This function is quite big and only called from one place (from main), so separate or inline, for this particular function, wouldn't make a difference.

As for the external memory, I am not using it. On the other hand, I extensively use just about every pin and every function that this PIC offers. From what I gather, this is the biggest PIC there is so, as you can guess, the program does many, many things. I've had weird problems over the last few months with this PIC, my personal favorite being the use of both hardware I2C modules.

Other compilers
Yesterday, I decided to gather all the previous versions of the CCS compiler I had that supports this chip, and tried them. With the bug firmly in place:
4.014 No problem
4.025 No problem
4.031 No problem
4.033 Bug!!

So I immediately ZIPped everything I had, and wrote an email to tech support because this certainly sounded like a bug introduced with 4.033...

And then today I remembered something very interesting. Last week, I had a similar bug. Similar in the behavior, not in the cause. At some point in my program, I had this:
Code:
#define SAVE   1
#define NOSAVE 0

void SomeFunction(int1 save) {
  if (save) {
    something saving
  } else {
    not saving
  }
}


Normally, a well-behaved compiler would accept this because the constants are capitalized and the variables are not, therefore, they are entirely different. As such, the compiler did not produce any error or warning, so I never knew I had a problem...

Shortly after writing that offending code, a completely different part of my program had, at execution time, to call a function. Instead of using CALL with the address of the function expected, it used a GOTO to land right in the middle of SomeFunction(). At the end of SomeFunction, the return call was looking at the stack and... nothing! Of course, since it was a GOTO. So the emulator stops, stack underflow.

It took me a while to pinpoint the cause. I just changed my defines to:
Code:
#define CONSTANT_SAVE 1
#define CONSTANT_NO_SAVE 0

... and the problem went away.

But did it really go away? For sure? I am starting to have doubts since the problem I have today is quite similar to what I had last week.

I found out about #pragma case, so I turned that on. By doing that, I am at least preventing myself from accidentally using the same name for a constant and a variable, as the compiler is now obligated to treat them as distinct even though they are spelled the same apart from case, as it should be. I personally do not think that #pragma case should be turned off by default...
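For reference, this is roughly how the SAVE/save pair is expected to behave once names are case-sensitive. It is a sketch in standard C, where identifiers are always case-sensitive; with CCS the `#case` directive would have to appear near the top of the source, and it is shown only as a comment here so the code builds with any C compiler. `some_function` and its return values are placeholders for the real saving logic.

```c
/* With CCS, a "#case" line would go here to make the compiler
   case-sensitive; it is left as a comment because it is not a
   standard C directive. */

#define SAVE   1
#define NOSAVE 0

/* Placeholder for SomeFunction() from the post: returns 1 when the
   "saving" branch is taken and 0 otherwise. The parameter `save` is a
   different identifier from the macro SAVE once case matters. */
int some_function(int save)
{
    if (save) {
        return 1;   /* something saving */
    } else {
        return 0;   /* not saving */
    }
}
```

Using a distinct prefix for constants, as in the CONSTANT_SAVE rename, avoids the collision whether or not the compiler is case-sensitive.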

In case this can help, here are the various LST results from the other compilers that do not exhibit the problem (even though I am increasingly convinced that, in effect, the problem is never really fixed, it just moved somewhere else.)

Code:
With the bug in place, this compiler will not exhibit the problem.
CCS PCH C Compiler, Version 4.031, 32282               29-Apr-07 13:56

               Filename: main.lst

               ROM used: 53636 bytes (41%)
                         Largest free fragment is 65536
               RAM used: 3146 (82%) at main() level
                         3711 (97%) worst case
               Stack:    13 worst case (9 in main + 4 for interrupts)


Code:
Just as well, bug in place, this compiler also has no problem.
CCS PCH C Compiler, Version 4.025, 32282               29-Apr-07 13:58

               Filename: main.lst

               ROM used: 53630 bytes (41%)
                         Largest free fragment is 65536
               RAM used: 3146 (82%) at main() level
                         3711 (97%) worst case
               Stack:    13 worst case (9 in main + 4 for interrupts)



And then I had a brilliant idea! I reinstalled 4.033, but turned optimizations off. I normally use +Y=9, so I turned it off with +Y=0.
Code:
CCS PCH C Compiler, Version 4.033, 32282               29-Apr-07 14:00

               Filename: main.lst

               ROM used: 67206 bytes (51%)
                         Largest free fragment is 55832
               RAM used: 3146 (82%) at main() level
                         3711 (97%) worst case
               Stack:    13 worst case (9 in main + 4 for interrupts)


Insane, 67KB of code instead of 53KB! Good optimizations indeed... and the problem disappears. So wow. What do I do? What can I do?




..

I think I'll buy some land and grow potatoes.
Laurent Chouinard



Joined: 12 Sep 2003
Posts: 43

Posted: Sun Apr 29, 2007 12:05 pm

Weird forum, I was logged in 5 minutes ago and this posted as Guest. Oh well.
Laurent Chouinard



Joined: 12 Sep 2003
Posts: 43

Posted: Sun Apr 29, 2007 12:13 pm

Shocked

I was just reading the WIEGAND thread a few pixels below me, and someone mentioned that, compiler 4 is mostly alpha with the versions above 4.030 being "mostly usable". I was shocked! Why is this even public if it's not production quality? Damn.

So I grabbed 3.249, changed the bit array I had to a byte array, and, well, no problem yet. I think I'll stick to v3 for ... ever.
Ttelmah
Guest







Posted: Sun Apr 29, 2007 2:29 pm

At the top of the group, there is a 'sticky thread' about V4. You will see a lot of posts...
The current V4 releases, have just started to become 'useable', and might be arguable as a reasonable 'beta' (except that several of the new features don't yet work...). However your problem is still 'typical', of something odd happening inside. If I was 'privately' paying for the support in the UK, I have felt on several occasions, that I could take CCS's suppliers to court over here, for not supplying a product 'of merchantable quality'. You will see several people who will always advise, that if you are having problems, and your chip is supported by 3.249, then try with this.
The sizes you have don't suggest anything directly, I'd guess at an internal table of some sort hitting a limit...
:(

Best Wishes
Laurent Chouinard



Joined: 12 Sep 2003
Posts: 43

Posted: Mon Apr 30, 2007 6:27 am

Damn it. Had I known v4 was a product in development not meant for production, I would not have used it. But nowhere did they ever mention "oh by the way, don't use version 4."

I have wished I could replace this compiler with another one many times over the 4 years that I have been using it, but every time we reasoned that it would be more complicated to port the code of all our products to another compiler than to endure the bugs of this one.

I think that this event just convinced me otherwise, I will look into alternatives.

I mean of all the software I buy and use, I would assume that a COMPILER has to be the one that isn't buggy, right?
Ttelmah
Guest







Posted: Mon Apr 30, 2007 7:25 am

All compilers have bugs.
I have three currently 'on report', with Microsoft, for their current C# compiler....
However, most of these are 'exotic', with the core parts working well. The problem with CCS, is that they have a truly appalling approach to beta releases, and sometimes introduce problems that even the most basic QA, should find.
When they launch a major 'new' compiler, what ships, is often not even a reasonable 'alpha' release. Historically, you can reckon on perhaps six months after 'release' for the first reasonably stable version. 'Old hands', know this well. V4, has taken longer than this.
Then, they provide new 'sub releases', apparently to fix issues brought to their attention, but without checking that the 'fix' has not introduced new problems.
Historically, the download site, always carries two versions, with the new one being effectively the current 'beta', and the older one, the 'last known good'.
Several people here, have repeatedly cried for three things:
1) Some form of indication of the untested status of rapidly released versions.
2) Use of a reasonable range of test programs 'in house', to verify that basic operations do work, before any version appears on the site.
3) Better listing of what is changed/fixed in each version.
The 'silly' thing is that they seem to fail to understand that once a customer is lost, you are very unlikely to get them back, and 'mud sticks'. The latest release represented an amazing triumph of marketing over the code being ready, with the premature release undoubtedly 'tainting' quite a few people's perception of the compiler, which will not do their business any good in the long term...

Best Wishes
Jim Hearne



Joined: 22 Dec 2003
Posts: 109
Location: West Sussex, UK

Posted: Wed May 02, 2007 10:07 am

Maybe not a great deal of help, but my current project on a 18LF6722 has the following stats.

Code:
CCS PCH C Compiler, Version 4.033, 38073               01-May-07 15:24

               Filename: microvision v1.10.lst

               ROM used: 66242 bytes (51%)
                         Largest free fragment is 63874
               RAM used: 1721 (45%) at main() level
                         1843 (48%) worst case
               Stack:    13 worst case (8 in main + 5 for interrupts)


And since version 4.033 fixed the memset bug (I think I found that one!) I've had no problems.
At least 50% of the ROM is data for graphics LCD fonts, and there are quite a few uses of strcpy().

Maybe I've just been lucky with the length of the strings.


Jim
RossJ



Joined: 25 Aug 2004
Posts: 66

Posted: Mon May 07, 2007 1:51 am

Hello Laurent,

I have just spent half the day debugging a problem which is likely the same as yours...

It appears to be a bug in the lookup table code generated for a switch statement. My guess is that when a switch statement contains more than a handful of cases (around 6-8?), instead of using the xor/btfs sequence, it uses a lookup table of addresses. This code is not visible in the .lst file, so you have to use mplab or similar to view the disassembly listing. Here is a faulty example:

Code:

  44D8    CFF2     MOVFF 0xff2, 0xe
  44DC    9EF2     BCF 0xff2, 0x7, ACCESS
  44DE    24E8     ADDWF 0xfe8, W, ACCESS
  44E0    6AF7     CLRF 0xff7, ACCESS
  44E2    36F7     RLCF 0xff7, F, ACCESS
  44E4    0FFD     ADDLW 0xfd                       ***1
  44E6    6EF6     MOVWF 0xff6, ACCESS
  44E8    0E44     MOVLW 0x44
  44EA    26F7     ADDWF 0xff7, F, ACCESS           ***2
  44EC    000A     TBLRD*-
  44EE    50F5     MOVF 0xff5, W, ACCESS
  44F0    6EFA     MOVWF 0xffa, ACCESS
  44F2    0008     TBLRD*
  44F4    50F5     MOVF 0xff5, W, ACCESS
  44F6    BE0E     BTFSC 0xe, 0x7, ACCESS
  44F8    8EF2     BSF 0xff2, 0x7, ACCESS
  44FA    6EF9     MOVWF 0xff9, ACCESS
  44FC    432E     RRNCF 0x2e, F, BANKED            case 0
  44FE    432E     RRNCF 0x2e, F, BANKED            case 1
  4500    4380     RRNCF 0x80, F, BANKED            case 2
  4502    443A     RLNCF 0x3a, W, ACCESS            case 3
  4504    4490     RLNCF 0xf90, W, ACCESS           case 4
  4506    4374     RRNCF 0x74, F, BANKED            case 5
  4508    4414     RLNCF 0x14, W, ACCESS            case 6
  450A    43EE     RRNCF 0xee, F, BANKED            case 7


Essentially the routine starts with W holding the case number (from 0). This value is doubled, and added to the address of the lookup table. Then the address is read using tblrd instructions, and written to the program counter (effecting the jump).

You may notice that in this example, the table starts near the top of a 256 byte page (0x44FC). So it doesn't take a high case number to roll over to the next page. The add ***1 produces a carry result which is not included in add ***2. This results in the tblrd fetching an address from whatever is 256 bytes lower than the actual table. Which of course creates an unpredictable (but consistent) jump. I think the add ***2 should simply be an ADDWFC instruction.
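The lost carry can be reproduced on a PC with a small model of the pointer arithmetic. This is only a sketch in standard C: it models just the 8-bit adds that build the table pointer, using the constants 0xFD and 0x44 from the disassembly above, and the function names are hypothetical.

```c
#include <stdint.h>

/* Constants from the listing: low and high byte of 0x44FD, the address
   of the high byte of table entry 0 (the table starts at 0x44FC). */
#define TABLE_LO 0xFDu
#define TABLE_HI 0x44u

/* Model of the faulty sequence: the carry out of the low-byte add
   (***1) is dropped because the high-byte add (***2) is a plain ADDWF. */
uint16_t table_ptr_buggy(uint8_t case_index)
{
    unsigned doubled = 2u * case_index;               /* ADDWF WREG, W   */
    uint8_t  hi      = (uint8_t)(doubled >> 8);       /* CLRF + RLCF     */
    unsigned low_sum = (doubled & 0xFFu) + TABLE_LO;  /* ADDLW 0xFD ***1 */
    uint8_t  lo      = (uint8_t)low_sum;              /* MOVWF TBLPTRL   */

    hi = (uint8_t)(hi + TABLE_HI);                    /* ADDWF ***2      */
    return (uint16_t)((hi << 8) | lo);
}

/* Same sequence with the suggested fix: the high-byte add includes the
   carry from the low-byte add, as an ADDWFC would. */
uint16_t table_ptr_fixed(uint8_t case_index)
{
    unsigned doubled = 2u * case_index;
    uint8_t  hi      = (uint8_t)(doubled >> 8);
    unsigned low_sum = (doubled & 0xFFu) + TABLE_LO;
    uint8_t  lo      = (uint8_t)low_sum;
    unsigned carry   = low_sum >> 8;                  /* 0 or 1          */

    hi = (uint8_t)(hi + TABLE_HI + carry);            /* ADDWFC          */
    return (uint16_t)((hi << 8) | lo);
}
```

With these constants, cases 0 and 1 give the same pointer in both versions (0x44FD and 0x44FF). From case 2 onward the buggy version points 0x100 below the table, for example 0x4401 instead of 0x4501.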

My experience was that adding an unused variable declaration to main(), changed the code enough that this routine was located differently in flash. The result was that my TCP/IP stack stopped working because this fault happened to occur in the main switch statement of StackTask(), one of the most fundamental functions of the stack!!! It should also be noted that the failure only occurs on high case values, making things even more interesting...

Unfortunately I can't see a workaround for this, other than keeping switch statements small, or using a previous compiler version. You mentioned that your program worked under 4.031. This problem may be older than that since it is possible that 4.031 simply placed this jump table differently. I'll take a look at the code generation to see if 4.031 is indeed a workaround...

Perhaps someone knows if there is an option or optimisation setting which prohibits the use of lookup tables like this (although I doubt it).

I have informed CCS. BTW, I am using 4.033/PIC18F2620.

/Cheers, Ross.
RossJ



Joined: 25 Aug 2004
Posts: 66

Posted: Mon May 07, 2007 3:52 am

Hi again,

Further to my last post, it appears to be a new bug in 4.033. From the current version log...

Quote:

4.033 Switch is fixed to work with over 128 cases


I don't have 4.032, but I tested 4.031 and the switch code looks good (albeit it doesn't support case values higher than 127).

Does anyone know if previous compiler versions are available on CCS's website somewhere? I try to archive the downloads, but I don't always get around to downloading every version. It is useful to have the versions available when debugging (the compiler that is...), since it can be correlated with the current version log.

/Cheers, Ross.
RossJ



Joined: 25 Aug 2004
Posts: 66

Posted: Thu May 10, 2007 7:59 pm

This problem is fixed in 4.034.
All times are GMT - 6 Hours