benoitstjean
Joined: 30 Oct 2007 Posts: 566 Location: Ottawa, Ontario, Canada
|
|
Posted: Fri Dec 05, 2014 3:47 pm |
|
|
Actually, in case you haven't noticed, the address error trap code I posted earlier retrieves the value 0x00200000 (five zeros), not 0x00020000 (four zeros) as you stated. Not sure if it changes anything.
Also, according to the Microchip documentation (page 25, right-hand column, http://ww1.microchip.com/downloads/en/DeviceDoc/70175H.pdf), that address (0x200000) falls within the User Program Flash Memory. The first address of this block is 0x00000200 and the last is 0x007FFFFE, so 0x00200000 is right in there.
Just wanted to clarify the values here in case it changes anything... So should I still go with the 38 instead of 36?
Thanks again,
Benoit |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19499
|
|
Posted: Sat Dec 06, 2014 2:17 am |
|
|
Yes.
You _must_ go to 38.
You need to understand what happens.
When an interrupt occurs, the PIC automatically saves the address where the code currently 'is', and the status register onto the stack.
Then the interrupt handler saves what registers it needs.
This leaves the stack at this point with loads of values stored.
We need to retrieve the address stored on the stack.
Where this is (relative to the current stack pointer), depends on how many registers the interrupt handler stores.
CCS changed how much data they save between when the code was posted, and the current compilers. I looked at the assembler generated by the current compilers, and 'counted back' to where the address was stored, and the result for the current compiler, is 38, not 36.
I've since tested this by forcing an error interrupt, by just loading a pointer to a 16bit variable, incrementing it by one, and then retrieving a 16bit variable from this. This forces a 16bit access to an 'odd' address, which will give an address error interrupt.
With 38, it merrily works, and retrieves the correct address where the error happens.
The 0x20000 value is basically irrelevant to your problem. |
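To make the "counting back" concrete, here is a minimal host-side sketch in plain C (not CCS code). It models a stack that grows upward, a trap entry that pushes the low address word followed by a word holding status bits plus the upper address bits, and a handler prologue that saves some number of registers. The struct, the function names and the 17-word prologue used in the usage note are assumptions for illustration only; the real count has to be read from the listing of the compiler version in use.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Simulated stack RAM; 'sp' is a byte offset to the next free slot (like W15). */
typedef struct {
    uint16_t mem[STACK_WORDS];
    uint16_t sp;
} SimStack;

/* Push one 16-bit word; the stack grows towards higher addresses. */
static void push16(SimStack *s, uint16_t v)
{
    s->mem[s->sp / 2] = v;
    s->sp += 2;
}

/* Modelled hardware part of trap entry: low address word first,
   then a word with status bits on top and the upper 7 address bits below. */
static void trap_entry(SimStack *s, uint32_t pc, uint8_t sr_low)
{
    push16(s, (uint16_t)(pc & 0xFFFFu));
    push16(s, (uint16_t)(((uint32_t)sr_low << 8) | ((pc >> 16) & 0x7Fu)));
}

/* Modelled handler prologue: saves 'saved_words' registers (dummy values). */
static void prologue(SimStack *s, unsigned saved_words)
{
    for (unsigned i = 0; i < saved_words; i++) {
        push16(s, 0xAAAAu);
    }
}

/* Count back 'offset_bytes' from the stack pointer and rebuild the address,
   masking the upper word with 0x7F as the trap code in this thread does. */
static uint32_t recover_pc(const SimStack *s, uint16_t offset_bytes)
{
    uint16_t at = (uint16_t)(s->sp - offset_bytes);
    uint32_t lo = s->mem[at / 2];
    uint32_t hi = s->mem[at / 2 + 1] & 0x7Fu;
    return (hi << 16) | lo;
}
```

In this model the offset is 4 bytes for the two words pushed on trap entry plus 2 bytes per register the handler saves, so a prologue of 17 words gives 38 and one of 16 words gives 36. Using the wrong offset makes `recover_pc` read saved register contents instead of the address.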
|
|
benoitstjean
Joined: 30 Oct 2007 Posts: 566 Location: Ottawa, Ontario, Canada
|
|
Posted: Tue Dec 16, 2014 12:50 pm |
|
|
All right, so it **just** occurred again (first time in two weeks) and I changed the value from 36 to 38 (see earlier posts) and the #ADD_INT error gave me address 0x07C62.
Here's the C listing followed by that same function with its assembly breakdown:
Code: |
// Add body to DMA packet
for( i = 0; i < uiBodyLength; i ++, packetByteIndex ++ )
{
DMAPacket0[packetByteIndex] = usBody[i];
}
.................... for( i = 0; i < uiBodyLength; i ++, packetByteIndex ++ )
07C48: MOV #0,W4
07C4A: MOV W4,362A
07C4C: MOV 362A,W0
07C4E: MOV 362E,W4
07C50: CP W4,W0
07C52: BRA LEU,7C76
.................... {
.................... DMAPacket0[packetByteIndex] = usBody[i];
07C54: MOV 3628,W0
07C56: MOV #4000,W4
07C58: ADD W0,W4,W5
07C5A: MOV 362A,W0
07C5C: MOV 3616,W4
07C5E: ADD W0,W4,W0
07C60: MOV.B [W0],[W5]
07C62: MOV 362A,W0
07C64: MOV W0,[W15++]
07C66: INC W0,W0
07C68: MOV W0,362A
07C6A: MOV [--W15],W0
07C6C: MOV 3628,W0
07C6E: INC W0,W0
07C70: MOV W0,3628
07C72: GOTO 7C4C
.................... }
|
I understand my C code... but what's happening in the assembly background at 0x7C62 to generate that address interrupt error?
Thanks a million!
Benoit |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19499
|
|
Posted: Tue Dec 16, 2014 2:41 pm |
|
|
It's the move in front of this. The return address is the instruction _after_ the problem.
One or the other of the addresses is causing a problem.
I'd suspect packetByteIndex is getting set somewhere it shouldn't. Add a test line in front of the function, to verify what this is set to. |
|
|
benoitstjean
Joined: 30 Oct 2007 Posts: 566 Location: Ottawa, Ontario, Canada
|
|
Posted: Tue Dec 16, 2014 8:20 pm |
|
|
Hi Ttelmah,
Thanks for your reply.
So when you say "One or the other of the addresses is causing a problem", which addresses are you referring to? The actual addresses 07C62 and 07C60?
I don't know assembly and I don't have the slightest idea what the code is doing (hence this post).
I'm not sure how I'll address this problem, because it will be _very_ difficult to print anything for troubleshooting: the DMA packet is filled 63 times per second with 127 bytes of data, non-stop. The problem can occur twice in two minutes, or twice 20 minutes apart. Just this past weekend, I had this code running for 3 days straight. Today, it started acting up. I haven't touched this part of the code, so I'm thinking it could be some race or timing condition that occurs very seldom... hard to tell.
I'm at home now and don't have my code in front of me. But if I remember correctly, just a few lines prior to this loop, the <i> variable is set to 0 and the <packetByteIndex> variable is set to 0, then increased based on other values.
Let me investigate this further tomorrow morning. In the meantime, if you think of anything else or have other suggestions, they are more than welcome.
Thanks again,
Benoit |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19499
|
|
Posted: Wed Dec 17, 2014 1:45 am |
|
|
Neither.
The instruction causing the problem, is the byte move:
Code: |
07C60: MOV.B [W0],[W5]
|
It is what is in 'W0', and 'W5' that matter.
Now, one is 'i', and should therefore always be a reasonable value, unless 'uiBodyLength' has changed, and got set to a silly number. The other is 'packetByteIndex', which with the code shown, could be anything.
I'd suggest that the value of packetByteIndex, has got set to an impossible number, so it is attempting to move data to a location in memory that is not legal. Hence the crash. Equally it could be uiBodyLength that has the problem.
Either do a diagnostic output of the two numbers before the loop, or add a blocking test.
How high can uiBodyLength be? If it's 256 bytes (say), then test that it is below this value before the loop. Similarly, define a limit for packetByteIndex.
It looks as though one of these numbers is wrong at this point, so the code is accidentally then trying to 'walk' into an illegal memory location.
That it is intermittent suggests a possible timing issue with one of the numbers. For instance, if 'packetByteIndex' gets changed in an interrupt, what happens if the interrupt occurs during this copy? |
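The "blocking test" suggested above could look something like the following sketch. The names DMAPacket0, usBody, uiBodyLength and packetByteIndex come from the code posted earlier; the buffer size, the length limit, the add_body() wrapper and the badIndex/badLength diagnostics are assumptions made up for this example, since the thread does not state the real limits.

```c
#include <stdint.h>
#include <stdbool.h>

#define DMA_PACKET_SIZE 256u   /* assumed size of the DMA buffer */
#define MAX_BODY_LENGTH 127u   /* assumed upper limit for the body */

static uint8_t DMAPacket0[DMA_PACKET_SIZE];

/* Hypothetical diagnostics: the values that were rejected, kept for later inspection. */
static volatile uint16_t badIndex;
static volatile uint16_t badLength;

/* Copy the body into the packet only if both numbers are sane.
   Returns false, without touching memory, when the copy would leave the buffer. */
static bool add_body(const uint8_t *usBody, uint16_t uiBodyLength,
                     uint16_t *packetByteIndex)
{
    uint16_t start = *packetByteIndex;   /* take one snapshot of the index */

    if (uiBodyLength > MAX_BODY_LENGTH ||
        start > DMA_PACKET_SIZE - uiBodyLength) {
        badIndex = start;
        badLength = uiBodyLength;
        return false;
    }

    for (uint16_t i = 0; i < uiBodyLength; i++) {
        DMAPacket0[start + i] = usBody[i];
    }
    *packetByteIndex = (uint16_t)(start + uiBodyLength);
    return true;
}
```

Working from a local snapshot of the index also means the loop itself is unaffected if something else changes the shared variable part-way through the copy, although it does not fix whatever changed it.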
|
|
Remco
Joined: 12 May 2015 Posts: 14
|
|
Posted: Fri Jul 24, 2015 12:50 am |
|
|
Hi, it's me again.
I'm facing a problem again. This time it's an address error that occurs when I'm reading a software UART while a hardware UART is running on interrupt. From time to time the µC resets itself. I know which function causes the reset, but I want to know exactly where it resets, because the cause may still be somewhere else.
I know it's an address error, so I want to use the code provided in this thread. But when the error occurs and I send the address over a UART to a PC (that part works correctly), I get a strange number: one that can never be found in the list file because it is too big, like 0x2AC600. Could CCS have changed again how many registers are saved when a routine enters the interrupt handler, as Ttelmah described in this thread on Fri Dec 05, 2014?
I'm using version 5.042 of the compiler, which came out recently.
Thanks in advance,
Remco |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19499
|
|
Posted: Fri Jul 24, 2015 1:58 am |
|
|
OK. Done a quick test with 5.048.
The code works with an offset of 38.
You need:
Code: |
unsigned long trapaddr;
#INT_ADDRERR
void ADDRERR_isr(void)
{
#asm
mov w15, w0
sub #38, w0
mov [w0++], w1
mov w1, trapaddr
mov [w0], w1
and #0x7f, w1
mov w1, trapaddr+2
#endasm
}
|
trapaddr then contains the address of the error.
It's very easy to test.
If you create the following lines in a program, and run them, an address error trap will trigger:
Code: |
int16 cause;
int16 * ptr;
ptr=(int8 *)&cause+1;
cause = *ptr; //deliberate code to crash chip.... Accessing a word on an odd location.
|
It'll crash on the cause=*ptr instruction.
What it does is generate a pointer to an int16, then cast this so the compiler thinks it's to an int8, and increments it, so it is talking to a byte half way up a int16, then use it to access an int16 at this address. This can't be done with an int16 access, so 'address trap'.
Now the nice thing is if you stick this into your code, compile, and look at the listing, you can see exactly what address this instruction is 'at'. So (for instance):
Code: |
.................... ptr=(int8 *)&cause+1;
13A2: MOV #951,W4
13A4: MOV W4,960
.................... cause = *ptr; //deliberate code to crash chip.... Accessing a word on an odd location.
13A6: MOV 960,W0
13A8: MOV [W0],[W15++]
13AA: POP 950
|
Run this, with the trap code present, and then look at 'trapaddr', and it contains 0x0013AA. The address to _return_ to after the instruction that caused the failure.
Are you sure you have the code right?
What chip are you on? This is vital. There might actually be something at 2AC600.
Are you sure you are getting the bytes in the right order? Not something silly like 00C62A? How are you outputting the value to the UART? |
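As a rough host-side illustration of the byte-order question, the sketch below rebuilds the address from the two 16-bit words the trap code stores (low word first, upper word masked with 0x7F) and shows what a byte-reversed printout would look like. All function names here are hypothetical helpers, not part of the CCS library.

```c
#include <stdint.h>

/* Combine the two stored words into one address.
   Only the low 7 bits of the upper word belong to the address. */
static uint32_t make_trap_address(uint16_t low_word, uint16_t high_word)
{
    return ((uint32_t)(high_word & 0x7Fu) << 16) | low_word;
}

/* Split a 24-bit address into bytes, most significant first,
   which is the order to send if the PC side should read it directly. */
static void split_msb_first(uint32_t addr, uint8_t out[3])
{
    out[0] = (uint8_t)((addr >> 16) & 0xFFu);
    out[1] = (uint8_t)((addr >> 8) & 0xFFu);
    out[2] = (uint8_t)(addr & 0xFFu);
}

/* What the same 24-bit value looks like if the bytes go out in reverse order. */
static uint32_t reverse_bytes24(uint32_t addr)
{
    return ((addr & 0x0000FFul) << 16) |
            (addr & 0x00FF00ul) |
           ((addr & 0xFF0000ul) >> 16);
}
```

With these helpers, reversing the bytes of 0x2AC600 gives 0x00C62A, which is the kind of mix-up the question above is checking for.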
|
|
Remco
Joined: 12 May 2015 Posts: 14
|
|
Posted: Fri Jul 24, 2015 2:50 am |
|
|
Yup, you are right. I did everything as written and got exactly the address where it must go wrong. It was also exactly the same as what I had before I made the post. So thank you for your quick test; now I must find out the meaning of the value I get back every time.
I gave it another try and it gave me back 0x440AB9. It differs from time to time. According to the datasheet, this is in a reserved block of the program space memory of the dsPIC30F6014A. And if I remember correctly, all the other values would be in the same block as well. Also, the number can't be found in the list file. |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19499
|
|
Posted: Fri Jul 24, 2015 3:20 am |
|
|
I still have a sneaking suspicion that something is wrong in what you are doing.
Make an error the way I show, and see if you still get the variable reply. This would 'prove' whether the code is actually working.
It could be something silly, like having trapaddr declared as a local variable in the interrupt code, and also as a global, so the interrupt code writes to its local copy, not the global, and then the print routine prints out the global version, so the value has nothing to do with the actual error!... |
|
|
Remco
Joined: 12 May 2015 Posts: 14
|
|
Posted: Fri Jul 24, 2015 3:39 am |
|
|
I wrote the code exactly as you suggested, and your suggested test code is executed at the touch of a button. When the button is pressed, I get the value through the UART that is connected to a PC. The value I get from your code is indeed the location where it must go wrong. The trap code is still the same when I test the other code, the one where I want to find out where it goes wrong (as written in my previous post).
With this I can prove that the code works and that the value I get back is genuine; otherwise your code would fail too. This is why I find it such an odd value. |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19499
|
|
Posted: Fri Jul 24, 2015 3:55 am |
|
|
I'd suspect the fundamental fault is not actually an address error.
Suspect something like an extra 'return' being reached, which pops a data value off the stack, so results in the code going off into a 'random' address. Then the bytes that just happen to be 'seen' by the processor at this address, are some artifact of the chip, that results in an instruction being executed that gives the trap....
You really are going to have to narrow things down. Have a marker byte that is changed as the code reaches particular parts of the code. When the trap triggers see what is in the marker. Then put more 'tighter packed' markers between this one and the next.
Alternatively, add a 'tick' interrupt (every mSec say), that again records the calling address the same way as the trap code. When the trap triggers look at the address this has recorded. You then know 'where' the code was, no more than 1mSec 'before' the error.
You do have the stack size expanded? Though this should give a stack error, not an address error, it is worth ruling this out. |
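A minimal sketch of the marker-byte idea, written as plain C that runs on a PC. The marker variables, the MARK() macro and the staged loop are all hypothetical; on the target, the line in trap_handler_stub() would sit in the real trap handler, and the "faulty stage" stands in for wherever the code actually goes wrong.

```c
#include <stdint.h>

/* Breadcrumb updated as the code passes each checkpoint. */
static volatile uint8_t progressMarker;
/* Copy taken when the trap fires, to be inspected or printed later. */
static volatile uint8_t markerAtTrap;

#define MARK(n) (progressMarker = (uint8_t)(n))

/* Stand-in for the trap handler: just record how far the code had got. */
static void trap_handler_stub(void)
{
    markerAtTrap = progressMarker;
}

/* Three checkpoints; 'faulty_stage' simulates the stage where the trap fires. */
static void run_stages(int faulty_stage)
{
    for (int stage = 1; stage <= 3; stage++) {
        MARK(stage);                  /* drop the breadcrumb before the work */
        if (stage == faulty_stage) {  /* simulated fault inside this stage */
            trap_handler_stub();
            return;
        }
    }
    MARK(0);                          /* 0 means every stage completed */
}
```

Once the marker shows which stage was running, more closely spaced markers can be added inside that stage to narrow the location further.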
|
|
younder
Joined: 24 Jan 2013 Posts: 53 Location: Brazil
|
|
Posted: Wed Jun 08, 2016 5:40 am |
|
|
Ttelmah wrote: | OK. Done a quick test with 5.048.
The code works with 0x38 offset.
You need:
Code: |
unsigned long trapaddr;
#INT_ADDRERR
void ADDRERR_isr(void)
{
#asm
mov w15, w0
sub #38, w0
mov [w0++], w1
mov w1, trapaddr
mov [w0], w1
and #0x7f, w1
mov w1, trapaddr+2
#endasm
}
|
trapaddr then contains the address of the error.
It's very easy to test.
If you create the following lines in a program, and run them, an address error trap will trigger:
Code: |
int16 cause;
int16 * ptr;
ptr=(int8 *)&cause+1;
cause = *ptr; //deliberate code to crash chip.... Accessing a word on an odd location.
|
It'll crash on the cause=*ptr instruction.
What it does is generate a pointer to an int16, then cast this so the compiler thinks it's to an int8, and increments it, so it is talking to a byte half way up a int16, then use it to access an int16 at this address. This can't be done with an int16 access, so 'address trap'.
Now the nice thing is if you stick this into your code, compile, and look at the listing, you can see exactly what address this instruction is 'at'. So (for instance):
Code: |
.................... ptr=(int8 *)&cause+1;
13A2: MOV #951,W4
13A4: MOV W4,960
.................... cause = *ptr; //deliberate code to crash chip.... Accessing a word on an odd location.
13A6: MOV 960,W0
13A8: MOV [W0],[W15++]
13AA: POP 950
|
Run this, with the trap code present, and then look at 'trapaddr', and it contains 0x0013AA. The address to _return_ to after the instruction that caused the failure.
Are you sure you have the code right?.
What chip are you on?. This is vital. There might actually be something at 2AC600.
Are you sure you are getting the bytes in the right order?. Not something silly like 00C62A?. How are you outputting the value to the UART?. |
Hi Ttelmah,
I've tested your trap ISR in my code (ROM usage = 86%), but when trying to printf the error address inside the ISR handler I got the following message from the compiler:
Quote: | *** Error 71 "CB_NIVA_V11.c" Line 1118(1,2): Out of ROM, A segment or the program is too large @ADDFF64
Seg 00100-07FFE, 008C left, need 001EE
Seg 00000-00002, 0000 left, need 001EE Reserved
Seg 00004-000FE, 0000 left, need 001EE Reserved
|
I know this problem is related to my code size, but I'm not sure how to make it work without reducing the code size, considering that I want to find the trap error address in my code as it is.
I tried your code in a test code and the printf worked (ROM=6%).
Code: |
unsigned long trapaddr;
#INT_ADDRERR
void ADDRERR_isr(void)
{
#asm
mov w15, w0
sub #38, w0
mov [w0++], w1
mov w1, trapaddr
mov [w0], w1
and #0x7f, w1
mov w1, trapaddr+2
#endasm
//printf(lcd_putc,"ERR_ADDR: %LX",trapaddr); got compiler error if include this line
}
|
BTW, I'm using the statement due to my code size.
CCS V 5.059 & DSPic30F4012
Any help would be very appreciated,
Hugo _________________ Hugo Silva |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19499
|
|
Posted: Wed Jun 08, 2016 6:54 am |
|
|
Putting a printf, into the interrupt is going to change far too many things to actually be useful. Currently, the compiler can't fit it in the segment where the interrupt handlers normally sit.
Declare trapaddr without initialisation.
Have the address trap routine trigger a processor reset after loading trapaddr.
Stick code at the start of your main, to test 'restart_cause', and print the contents of trapaddr, if the cause is 'RESTART_SOFTWARE'.
256, is not large. If I'm doing a lot of things I need 384 or 512. The stack is used for variables on the PIC24/30, so gets a lot more in it.... |
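The store, reset, then report sequence described above can be sketched as a small host-side simulation in plain C. The enum, lastCause and both functions are stand-ins invented for this example: on the target, the reset would be a real processor reset, and the check at the start of main would use 'restart_cause' and 'RESTART_SOFTWARE' as named in the post.

```c
#include <stdint.h>
#include <stdbool.h>

/* Stand-ins for the restart cause values on the target. */
typedef enum { CAUSE_POWER_UP, CAUSE_SOFTWARE } ResetCause;

/* On the target this is declared without an initialiser, as the post advises,
   so that the stored value is still there after the reset. */
static uint32_t trapaddr;

/* Simulated "why did we restart" flag. */
static ResetCause lastCause = CAUSE_POWER_UP;

/* Stand-in for the trap handler: store the address, then reset.
   No printing happens here, so the handler stays small. */
static void trap_then_reset(uint32_t addr)
{
    trapaddr = addr;
    lastCause = CAUSE_SOFTWARE;   /* simulated processor reset */
}

/* Stand-in for the check at the start of main():
   reports the stored address only after a software reset. */
static bool boot_check(uint32_t *addr_out)
{
    if (lastCause != CAUSE_SOFTWARE) {
        return false;
    }
    *addr_out = trapaddr;
    return true;
}
```

The printing then happens in normal program context after the restart, where code size and stack use are not constrained by the interrupt segment.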
|
|
younder
Joined: 24 Jan 2013 Posts: 53 Location: Brazil
|
|
Posted: Wed Jun 08, 2016 5:23 pm |
|
|
Ttelmah wrote: | Putting a printf, into the interrupt is going to change far too many things to actually be useful. Currently, the compiler can't fit it in the segment where the interrupt handlers normally sit.
Declare trapaddr without initialisation.
Have the address trap routine, trigger a processor reset, after loading trapdaddr.
Stick code at the start of your main, to test 'restart_cause', and print the contents of trapaddr, if the cause is 'RESTART_SOFTWARE'.
256, is not large. If I'm doing a lot of things I need 384 or 512. The stack is used for variables on the PIC24/30, so gets a lot more in it.... |
Hi Ttelmah, Thanks for your reply
I understand that the compiler can't fit the printf in the segment where the interrupt handlers normally sit, and I already have a switch statement at the start of my main code to verify the restart cause (which, BTW, reports "RESTART_TRAP_CONFLICT", not RESTART_SOFTWARE). So I moved the printf line to the "RESTART_TRAP_CONFLICT" case. However, I got trapaddr = "0".
Below is my code:
Code: |
unsigned INT32 trapaddr;
#INT_ADDRERR
void ADDRERR_isr(void)
{
#asm
mov w15, w0
sub #38, w0
mov [w0++], w1
mov w1, trapaddr
mov [w0], w1
and #0x7f, w1
mov w1, trapaddr+2
#endasm
}
|
At start of main...
Code: |
//Verify cause of MCU restart
switch (restart_cause()) {
case RESTART_POWER_UP: printf(LCD_PUTC,"Inicializando..."); delay_ms(500); break;
case RESTART_BROWNOUT: printf(LCD_PUTC,"RESTART_BROWNOUT!"); delay_ms(1000); break;
case RESTART_WATCHDOG : printf(LCD_PUTC,"RESTART_WATCHDOG!"); delay_ms(1000); break;
case RESTART_SOFTWARE: printf(LCD_PUTC,"RESTART_SOFTWARE!"); delay_ms(1000); break;
case RESTART_MCLR : printf(LCD_PUTC,"RESTART_MCLR!"); delay_ms(1000); break;
case RESTART_ILLEGAL_OP: printf(LCD_PUTC,"RESTART_ILLEGAL_OP!"); delay_ms(1000); break;
case RESTART_TRAP_CONFLICT:
{
lcd_init();
printf(LCD_PUTC,"TRAP ERR_ADD: %LX",trapaddr);
delay_ms(8000);
}
break;
default: break;
}
|
Thanks a lot,
Hugo _________________ Hugo Silva |
|
|