Ttelmah
Joined: 11 Mar 2010 Posts: 19513
|
|
Posted: Sat Nov 14, 2015 3:33 am |
|
|
OK.
On the restart_cause, _you_ have to test for stack over/underflow. The bits for this are not part of the RCON register, which is what 'restart_cause' actually reflects. With STVREN enabled, if you add:
Code: |
#bit STKFUL=getenv("bit:STKFUL")
#bit STKUNF=getenv("bit:STKUNF")
if (STKFUL)
{
   //display or indicate somehow that you have a stack overflow
}
if (STKUNF)
{
   //display or indicate that you had a stack underflow
}
|
All the RCON bits are 'undefined' if a stack error occurs.
The 'underflow' errors can be tested for without STVREN, but the overflow error can't.
Now, are you running this in debug? There is a little problem here: the debugger steals two stack levels, so code that would actually run OK for real then gives stack overflows....
What does the listing show for stack used?
The classic thing that can cause a stack error, other than just 'running out', is a GOTO. This is one reason they are 'discouraged'. If (for instance) you jump out of a piece of code inside a function, where a return address is on the stack (sometimes inside a switch statement, for example), then the stack can be left 'out of balance'.
Also remember that if your code (for instance) uses one more stack level, then the actual fault can appear somewhere else, when this just happens to step over the edge.... |
|
|
asmallri
Joined: 12 Aug 2004 Posts: 1634 Location: Perth, Australia
|
Re: Random Resets with reason of MCLR_FROM_RUN... |
Posted: Sat Nov 14, 2015 7:03 am |
|
|
terryopie wrote: |
After making the change I started experiencing random resets. When the reset happens for a given version of code, I can duplicate it nearly every time and on different boards. |
Are the boards powered with their own power supply, or are all boards being tested with a common test bench power supply? If you are performing this testing with a common test setup, then check for problems in the test setup: insufficient power supply filtering, a faulty power supply, insufficient current, etc.
_________________
Regards, Andrew
http://www.brushelectronics.com/software
Home of Ethernet, SD card and Encrypted Serial Bootloaders for PICs!! |
|
|
terryopie
Joined: 13 Nov 2015 Posts: 13
|
|
Posted: Mon Nov 16, 2015 7:44 am |
|
|
Quote: | On the restart_cause, _you_ have to test for stack over/underflow. The bits for this are not part of the RCON register, which is what 'restart_cause' actually reflects. With STVREN enabled, if you add: |
I did enable STVREN. First thing in main, I am saving away the STKPTR register. It is giving me a value of 0x40 (underflow).
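For reference, the 0x40 reading can be decoded by hand. The sketch below is plain C rather than CCS-specific code, and it assumes the usual PIC18 STKPTR layout (bit 7 STKFUL, bit 6 STKUNF, bits 4:0 the stack pointer); check the data sheet for your part. The names stkptr_info and decode_stkptr are made up for illustration.

```c
#include <stdint.h>

/* Assumed PIC18 STKPTR layout; confirm against the data sheet for your part. */
#define STKPTR_STKFUL  0x80u  /* bit 7: stack full / overflowed */
#define STKPTR_STKUNF  0x40u  /* bit 6: stack underflow occurred */
#define STKPTR_SP_MASK 0x1Fu  /* bits 4:0: current stack pointer */

/* Hypothetical helper type, for illustration only. */
typedef struct {
    int full;        /* non-zero if STKFUL is set */
    int underflow;   /* non-zero if STKUNF is set */
    unsigned depth;  /* stack pointer value, 0..31 */
} stkptr_info;

/* Split a saved STKPTR value into its flag bits and pointer field. */
static stkptr_info decode_stkptr(uint8_t raw)
{
    stkptr_info info;
    info.full      = (raw & STKPTR_STKFUL) != 0;
    info.underflow = (raw & STKPTR_STKUNF) != 0;
    info.depth     = raw & STKPTR_SP_MASK;
    return info;
}
```

With the saved value, decode_stkptr(0x40) reports underflow set, full clear and a stack pointer of 0, which matches the reading above.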
Quote: | Now, are you running this in debug? There is a little problem here: the debugger steals two stack levels, so code that would actually run OK for real then gives stack overflows.... |
No, I am not running in debug.
Quote: | What does the listing show for stack used? |
Listing shows stack usage here:
Code: |
ROM used: 56620 bytes (86%)
Largest free fragment is 8912
RAM used: 1366 (37%) at main() level
1423 (39%) worst case
Stack: 7 worst case (6 in main + 1 for interrupts) |
|
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19513
|
|
Posted: Mon Nov 16, 2015 8:16 am |
|
|
_Underflow_. Very interesting.
Somehow you are executing a return from something that was not actually called, or popping a value from the stack.
Does the code use function pointers? The classic problem is these being overwritten, so the code jumps to an unexpected location in memory.
A GOTO, as already mentioned.
An interrupt enabled without a handler present (the effect depends on what other code is down there). |
|
|
terryopie
Joined: 13 Nov 2015 Posts: 13
|
|
Posted: Mon Nov 16, 2015 9:28 am |
|
|
Ttelmah wrote: | _Underflow_. Very interesting.
Somehow you are executing a return from something that was not actually called, or popping a value from the stack.
Does the code use function pointers? The classic problem is these being overwritten, so the code jumps to an unexpected location in memory.
A GOTO, as already mentioned.
An interrupt enabled without a handler present (the effect depends on what other code is down there). |
Not using function pointers anywhere. Only have one segment of inline assembly. No GOTO or CALL commands being used. I'll have to go back through and double check that all but the one interrupt that we are using are disabled. Unfortunately can't check that for a few days... I'll report back.
Thank you for the suggestions!! |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19513
|
|
Posted: Mon Nov 16, 2015 9:44 am |
|
|
One section of in-line assembly?
Postable?
Any write to STKPTR could cause this.
So could any instruction that accesses PCL, PCLATH, or PCLATU.
Or any POP.
The first two could be the result of a memory pointer (or array access) that is accessing an address outside the array... |
|
|
terryopie
Joined: 13 Nov 2015 Posts: 13
|
|
Posted: Mon Nov 16, 2015 10:11 am |
|
|
Here is the Assembly:
Code: | #ASM
MOVF _a_lo,W ; Set-up address to write to
MOVWF EEADR
MOVF _a_hi,W
MOVWF EEADRH
MOVF _a_lo,W ; Set-up address to write to
MOVWF EEADR
MOVF _ee_data,W ; Set-up data to write
MOVWF EEDATA
BCF EECON1,7 ; Point to Data EEPROM Memory
BSF EECON1,2 ; Enable EEPROM Write
BCF INTCON,7 ; Disable interrupts globally
MOVLW 0x55 ; The next four lines are required to allow the write
MOVWF EECON2
MOVLW 0xAA
MOVWF EECON2
BSF EECON1,1 ; Set WR bit to begin write
BSF INTCON,7 ; Enable interrupts globally
#ENDASM
|
This snippet is how we write the internal EEPROM. It's somewhat faster than using the built-in interface. |
|
|
PCM programmer
Joined: 06 Sep 2003 Posts: 21708
|
|
Posted: Mon Nov 16, 2015 10:28 am |
|
|
I suspect that you are putting a RETURN instruction in the ASM code,
instead of letting the compiler handle the return by letting the function
proceed to the closing brace. Maybe you are not doing it in the posted
routine, but you may be doing it somewhere.
This would work, but sometimes the compiler won't do a CALL. It will do
a pseudo-call with a BRA to the routine, and the compiler inserts a BRA at
the end of the routine to jump back to the caller. There is no stack
involved. In this case, the insertion of RETURN is extremely ill advised.
If you thwart the compiler by inserting your own RETURN in #asm,
you are sabotaging your own program. Absolutely marginal gains
are not worth going to assembly code. |
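The failure mode described above can be modelled in a few lines of plain C. This is only a toy model: it assumes a 31-level hardware return stack and that a RETURN executed with an empty stack sets an underflow flag and loads 0 into the PC. Overflow handling is not modelled. The names hw_stack, do_call, do_return and the two run_* helpers are invented for the illustration and have nothing to do with what the CCS compiler generates.

```c
#include <stdint.h>

#define STACK_LEVELS 31u  /* assumed depth of the PIC18 return stack */

/* Toy model of the hardware return stack (illustration only). */
typedef struct {
    uint32_t slots[STACK_LEVELS];
    unsigned depth;     /* number of return addresses currently held */
    int underflow;      /* stands in for the STKUNF flag */
} hw_stack;

/* CALL: push the return address (a push onto a full stack is ignored here). */
static void do_call(hw_stack *s, uint32_t return_addr)
{
    if (s->depth < STACK_LEVELS) {
        s->slots[s->depth++] = return_addr;
    }
}

/* RETURN: pop the return address. Popping an empty stack flags
   underflow and yields address 0. */
static uint32_t do_return(hw_stack *s)
{
    if (s->depth == 0) {
        s->underflow = 1;
        return 0;
    }
    return s->slots[--s->depth];
}

/* Routine entered with CALL and left with RETURN: the stack stays balanced. */
static hw_stack run_called_routine(void)
{
    hw_stack s = {0};
    do_call(&s, 0x4D4C);   /* arbitrary return address */
    (void)do_return(&s);
    return s;
}

/* Routine entered with BRA/GOTO (nothing pushed) that still executes
   a hand-written RETURN: the pop finds an empty stack. */
static hw_stack run_branched_routine(void)
{
    hw_stack s = {0};
    (void)do_return(&s);
    return s;
}
```

In this model run_called_routine() ends with the underflow flag clear, while run_branched_routine() ends with it set, which is the same symptom as a STKUNF reading after reset.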
|
|
terryopie
Joined: 13 Nov 2015 Posts: 13
|
|
Posted: Mon Nov 16, 2015 10:36 am |
|
|
PCM programmer wrote: | I suspect that you are putting a RETURN instruction in the ASM code,
instead of letting the compiler handle the return by letting the function
proceed to the closing brace. Maybe you are not doing it in the posted
routine, but you may be doing it somewhere.
This would work, but sometimes the compiler won't do a CALL. It will do
a pseudo-call with a BRA to the routine, and the compiler inserts a BRA at
the end of the routine to jump back to the caller. There is no stack
involved. In this case, the insertion of RETURN is extremely ill advised.
If you thwart the compiler by inserting your own RETURN in #asm,
you are sabotaging your own program. Absolutely marginal gains
are not worth going to assembly code. |
The only assembly is what is listed in my above reply... No return that I can see would be added from that. Correct me if I am wrong. |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19513
|
|
Posted: Mon Nov 16, 2015 11:16 am |
|
|
There are several instructions missing from the posted assembler. After re-enabling GIE, you should clear the WREN bit. If this is not done, later table accesses can result in writes to the memory....
Then, before initiating the write, you must clear the EEPGD and CFGS bits, and set the WREN bit. As written it could fail to write completely (if the WREN bit is not set), and could write to the program memory instead of the EEPROM.
Look at the listing in the data sheet. |
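The bit bookkeeping described above can be sketched in plain C, using an ordinary variable to stand in for EECON1. The bit positions are assumptions taken from the usual PIC18 layout (EEPGD bit 7, CFGS bit 6, WREN bit 2, WR bit 1); confirm them in the data sheet. The function names are hypothetical, and the 0x55/0xAA unlock sequence and interrupt handling are deliberately left out.

```c
#include <stdint.h>

/* Assumed PIC18 EECON1 bit masks; verify against the data sheet. */
#define EEPGD 0x80u  /* bit 7: 1 = program memory, 0 = data EEPROM */
#define CFGS  0x40u  /* bit 6: 1 = configuration space */
#define WREN  0x04u  /* bit 2: write enable */
#define WR    0x02u  /* bit 1: write in progress / start write */

/* State EECON1 should be in just before the unlock sequence:
   EEPGD and CFGS cleared (target data EEPROM), WREN set. */
static uint8_t eecon1_before_write(uint8_t eecon1)
{
    eecon1 &= (uint8_t)~(EEPGD | CFGS);
    eecon1 |= WREN;
    return eecon1;
}

/* After the write has been started and interrupts re-enabled:
   clear WREN so later accesses cannot trigger stray writes. */
static uint8_t eecon1_after_write(uint8_t eecon1)
{
    return (uint8_t)(eecon1 & (uint8_t)~WREN);
}
```

Starting from a register that was left pointing at program memory (0xC0), eecon1_before_write() gives 0x04, and eecon1_after_write() then brings it back to 0x00.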
|
|
terryopie
Joined: 13 Nov 2015 Posts: 13
|
|
Posted: Mon Nov 23, 2015 10:09 am |
|
|
I have updated the inline assembly to follow the listing in the data sheet:
Code: | #ASM
MOVF _a_lo,W ; Set-up address to write to
MOVWF EEADR
MOVF _a_hi,W
MOVWF EEADRH
MOVF _a_lo,W ; Set-up address to write to
MOVWF EEADR
MOVF _ee_data,W ; Set-up data to write
MOVWF EEDATA
BCF EECON1,7 ; Point to Data EEPROM Memory
BCF EECON1,6 ; Access EEPROM
BSF EECON1,2 ; Enable EEPROM Write
BCF INTCON,7 ; Disable interrupts globally
MOVLW 0x55 ; The next four lines are required to allow the write
MOVWF EECON2
MOVLW 0xAA
MOVWF EECON2
BSF EECON1,1 ; Set WR bit to begin write
BTFSC EECON1,1 ; Wait for write to complete GOTO $-2
BSF INTCON,7 ; Enable interrupts globally
BCF EECON1,2 ; Disable EEPROM Write
#ENDASM |
But... I am still having the issue.
I did a little more investigation and based on something I noticed, I have a question. Is there a limit to the number of functions that can be defined? I have 72 functions, including main and the interrupt service routine.
When I remove the small function that I added for this original small change, the problem goes away. No underflow. But... When I add yet another function, completely empty, unused and not called anywhere, there are no issues. I remove the unused function and the problem returns. I also removed the small function I originally added, then added the empty function and the problem exists.
This just doesn't make any sense. How can adding a function or having a certain number of functions cause problems like this? Any ideas?
Thank you! |
|
|
temtronic
Joined: 01 Jul 2010 Posts: 9226 Location: Greensville,Ontario
|
|
Posted: Mon Nov 23, 2015 10:18 am |
|
|
hmm.. wild guess...
maybe it's not the qty but the order they are coded ?
it's the classic 'out of ROM' error so maybe, just maybe the compiler has a subtle quirk ??
heck, nothing to lose, just copy the program, cut and paste functions a bit differently and see what happens.
IF it still fails, you've eliminated one possibility...
Jay |
|
|
terryopie
Joined: 13 Nov 2015 Posts: 13
|
|
Posted: Mon Nov 23, 2015 10:36 am |
|
|
I had neglected to mention it, but I did change the order of a few of the functions with no change.
But again, pulling at straws and digging through assembly in the lst file, I found this:
Code: | 04D0C: BRA 4DBC
.................... case 1: // IN INSP? Y/N
.................... case 5: // AT DOWN LIMIT Y/N
.................... case 7: // AT FLR XX Y/N
.................... case 11: // AT UP LIMIT Y/N
.................... case 12: // VHC-102? Y/N
.................... if(current_lcd_line1[15]==0x20){ // Blank
04D0E: MOVLB 3
04D10: MOVF x78,W
04D12: SUBLW 20
04D14: BTFSS FD8.2
04D16: GOTO 2E6C
.................... if(setup_var)
04D1A: MOVLB 4
04D1C: MOVF x80,F
04D1E: BZ 4D28
.................... future_lcd_line1[15] = "Y"; |
The snippet above is roughly where I suspect things are going off into the weeds. I'm concerned about the GOTO 2E6C. When I look for this address, it is in a completely different function, 5000 lines away.
I'm wondering if there is a bug in the compiler that obviously doesn't show up all the time, but occurs when there are several case statements all for the same code?
If I take the normal "failing" build and duplicate the case code for each of the 5 cases, the problem seems to go away.
Is this a known problem? Too many cases? |
|
|
newguy
Joined: 24 Jun 2004 Posts: 1907
|
|
Posted: Mon Nov 23, 2015 10:40 am |
|
|
Try this:
Isolate other cases, then have a default: case that covers your existing case 1, 5, 7, 11, 12. |
|
|
terryopie
Joined: 13 Nov 2015 Posts: 13
|
|
Posted: Mon Nov 23, 2015 10:59 am |
|
|
newguy wrote: | Try this:
Isolate other cases, then have a default: case that covers your existing case 1, 5, 7, 11, 12. |
Just tried that, and it also had issues. The GOTO (to the same address) moved to another group of cases, one that I had split out so that the group of 5 could become the default. But the GOTO was NOT in the default.
So... instead of duplicating the code for each case statement, I created a small function to do what it should be doing and called that. That saw the same issue, but I noticed this:
Code: |
04D46: BRA 4DC8
.................... case 1: // IN INSP? Y/N
.................... case 5: // AT DOWN LIMIT Y/N
.................... case 7: // AT FLR XX Y/N
.................... case 11: // AT UP LIMIT Y/N
.................... case 12: // VHC-102? Y/N
.................... setYN();
04D48: GOTO 3D7A
.................... break;
04D4C: MOVLB 4
04D4E: BRA 4DC8
.................... case 8: // Trim Floor
.................... if(current_lcd_line1[2]==0x20){ // Blank |
Notice that instead of a CALL, it's getting a GOTO. Why? Is CALL only used when there are parameters to pass, otherwise GOTO is used? |
|
|