nazoa
Joined: 09 Feb 2007 Posts: 56
|
Speeding up loop |
Posted: Wed Apr 23, 2008 11:25 am |
|
|
I have a simple loop to transmit data to a device which I need to speed up. The code I have is as follows:
for (i=0; i<n; i++)
{
    output_d(Data[i]);
    while (input(TXE)) //Wait till ready
        delay_cycles(1);
    output_high(WR);
    output_low(WR);
}
My problem is that the peak rate of the loop is about one byte every 2.1us. The statement that takes the longest is 'output_d(Data[i])'. Any suggestions on how I could speed this up? My target is about one byte every 1us.
I am using a PIC18F97J60 at 41.6MHz. The external device is very fast and it only takes 400ns to recover after being written to. |
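As a rough sanity check on these numbers, here is a minimal sketch of the cycle budget. It assumes the usual PIC18 relationship of one instruction cycle per four oscillator periods; the function names are made up for illustration.

```c
#include <assert.h>

/* Instruction-cycle time (Tcy) in nanoseconds.
   A PIC18 core runs one instruction cycle per four oscillator periods,
   so Tcy = 4 / Fosc. */
static double tcy_ns(double fosc_hz)
{
    assert(fosc_hz > 0.0);
    return 4.0e9 / fosc_hz;
}

/* Number of instruction cycles that fit into a time span given in ns. */
static double cycles_in(double span_ns, double fosc_hz)
{
    return span_ns / tcy_ns(fosc_hz);
}
```

At 41.6MHz this gives roughly 96ns per instruction cycle, so the current 2.1us per byte is about 22 cycles, the 1us target leaves about 10 cycles per byte, and the device's 400ns recovery time is about 4 cycles.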
|
|
PCM programmer
Joined: 06 Sep 2003 Posts: 21708
|
|
Posted: Wed Apr 23, 2008 11:33 am |
|
|
Look at your .LST file, and then make changes in your source, and watch the effect on the code size.
You could speed it up a little bit by using fast I/O mode and setting the TRIS before you enter the loop.
The major thing is indexing into that array. You'll see that in the .LST file. If you want to speed it up, you've got to get rid of that. |
|
|
RLScott
Joined: 10 Jul 2007 Posts: 465
|
Re: Speeding up loop |
Posted: Wed Apr 23, 2008 12:17 pm |
|
|
How about this:
Code: |
int *DataPtr;
DataPtr = &Data[0];
i = n;
do
{
    output_d(*DataPtr);
    DataPtr++;
    while (input(TXE)) //Wait till ready
        ;
    output_high(WR);
    output_low(WR);
}while(--i);
|
There is no sense in calling delay_cycles() as long as input(TXE) is something that you can do over and over without harm. Also, doing the DataPtr++ before checking TXE increases the chance that you will not need to go around the while(input(TXE)) loop at all. And decrementing "i" at the end and checking for zero is the fastest iteration control because the compiler optimizes this construct with a DECFSZ.
Robert Scott
Real-Time Specialties |
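The pointer-walking loop can be tried on a PC with stand-ins for the CCS built-ins. The sketch below is only that: `output_d`, `txe_busy` and `strobe_wr` are hypothetical host-side stubs, not the real CCS functions, and the counter is assumed to be 8 bits wide. It also adds a guard for n == 0, because a do/while that counts down an 8-bit counter from zero would run 256 times.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-ins for the CCS built-ins; they only record what was "sent". */
static uint8_t sent[256];
static size_t  sent_count;

static void output_d(uint8_t value)
{
    assert(sent_count < sizeof sent);
    sent[sent_count++] = value;
}

static int  txe_busy(void)  { return 0; } /* stand-in for input(TXE): device always ready */
static void strobe_wr(void) { }           /* stand-in for output_high(WR); output_low(WR); */

/* Pointer-walking version of the transmit loop, with a guard for n == 0. */
static void send_bytes(const uint8_t *data, uint8_t n)
{
    const uint8_t *p = data;
    uint8_t i = n;

    if (i == 0)
        return;            /* without this, the 8-bit countdown would wrap and run 256 times */

    do {
        output_d(*p++);    /* present the byte, then advance the pointer */
        while (txe_busy()) /* wait until the device is ready */
            ;
        strobe_wr();       /* pulse WR to latch the byte */
    } while (--i);         /* count down to zero */
}
```

Calling `send_bytes(buffer, 3)` records the three bytes of `buffer` in `sent[]` in order; calling it with a length of 0 records nothing.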
|
|
n-squared
Joined: 03 Oct 2006 Posts: 99
|
|
Posted: Wed Apr 23, 2008 4:45 pm |
|
|
I would like to suggest a little twist on RLScott's code.
By adding a single int variable, you may squeeze out a small gain in speed.
The next_byte variable is updated with the next byte to send while waiting for TXE to fall.
Code: |
int *DataPtr;
int next_byte;
next_byte = Data[0];
DataPtr = &Data[1];
i = n;
do
{
    output_d(next_byte);
    next_byte = *DataPtr++; // on the last pass this reads one element past the data; the value is never used
    while (input(TXE)) //Wait till ready
        ;
    output_high(WR);
    output_low(WR);
}
while(--i);
|
Noam Naaman
Meteorox - Access Control Systems _________________ Every solution has a problem. |
|
|
Neutone
Joined: 08 Sep 2003 Posts: 839 Location: Houston
|
Re: Speeding up loop |
Posted: Wed Apr 23, 2008 8:02 pm |
|
|
nazoa wrote: | I have a simple loop to transmit data to a device which I need to speed up. The code I have is as follows:
for (i=0; i<n; i++)
{
    output_d(Data[i]);
    while (input(TXE)) //Wait till ready
        delay_cycles(1);
    output_high(WR);
    output_low(WR);
}
My problem is that the peak rate of the loop is about one byte every 2.1us. The statement that takes the longest is 'output_d(Data[i])'. Any suggestions on how I could speed this up? My target is about one byte every 1us.
I am using a PIC18F97J60 at 41.6MHz. The external device is very fast and it only takes 400ns to recover after being written to. |
This should be a small incremental bump in speed. With the improved loop and fast I/O, the loop time should be under 1us.
Code: |
#Byte Target_Inc = 0x0FE6 // POSTINC1: reading it increments FSR1 after use
Int16 Target_Address;
#locate Target_Address = 0x0FE1 // FSR1 address
Target_Address = &Data[0]; // Assign address of array to copy
i=n;
do
{
    output_d(Target_Inc);
    while (input(TXE)); //Wait till ready
    output_high(WR);
    output_low(WR);
}while(--i);
|
|
|
|
ckielstra
Joined: 18 Mar 2004 Posts: 3680 Location: The Netherlands
|
|
Posted: Wed Apr 23, 2008 8:17 pm |
|
|
Hi Neutone, you were just ahead of me. No post on this thread for more than 3 hours, and just when I wanted to post my solution I saw that you had beaten me by a few minutes.
I came up with the same solution as Neutone.
Address calculation of the array takes up a lot of time and is performed again and again inside the loop. On a PIC18 processor this can be improved a lot by using the index registers. Set up the index register once before the loop; reading the data through the POSTINC register then automatically post-increments the address pointer.
Code: |
// Register defines for PIC18
// Place this at the global level, or in a separate include file with register definitions.
unsigned int16 FSR0;
#locate FSR0=0x0FE9
unsigned char POSTINC0;
#locate POSTINC0=0x0FEE |
Code: |
FSR0 = &Data[0]; // Set start address
i = n;
do
{
    output_d( POSTINC0 ); // output data from array and increment pointer
    while (input(TXE)) //Wait till ready
        ;
    output_high(WR);
    output_low(WR);
}
while(--i); |
I don't think it can be made faster than this. Tested in MPLAB's simulator, it proved to be about twice as fast as n-squared's version and very close to 1us (at 40MHz). |
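For readers who have not used the PIC18 index registers, the post-increment behaviour can be modelled on a PC. This is a toy model only: `IndirectModel`, `read_postinc` and `stream_out` are invented names, `ram[]` stands in for data memory, and none of it is CCS code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of one PIC18 indirect-addressing pointer.
   ram[] stands in for data memory, fsr for the FSRx register pair. */
typedef struct {
    uint8_t  ram[64];
    uint16_t fsr;
} IndirectModel;

/* Reading POSTINCx returns the byte FSRx points at, then increments FSRx. */
static uint8_t read_postinc(IndirectModel *m)
{
    assert(m->fsr < sizeof m->ram);
    return m->ram[m->fsr++];
}

/* Copy n bytes starting at address `start` to `out` using only POSTINC reads. */
static void stream_out(IndirectModel *m, uint16_t start, uint8_t n, uint8_t *out)
{
    m->fsr = start;               /* like FSR0 = &Data[0]; done once, before the loop */
    while (n--)
        *out++ = read_postinc(m); /* like output_d(POSTINC0); no index arithmetic per byte */
}
```

The point of the model is that the address is written once and every later read advances it as a side effect, which is why the per-byte array index calculation disappears from the loop.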
|
|
n-squared
Joined: 03 Oct 2006 Posts: 99
|
|
Posted: Wed Apr 23, 2008 10:16 pm |
|
|
Hi ckielstra,
You live and learn...
I really like your solution.
I guess it will safely work only in a tight loop where the compiler has no chance of "corrupting" the two registers.
Thanks.
Noam _________________ Every solution has a problem. |
|
|
nazoa
Joined: 09 Feb 2007 Posts: 56
|
|
Posted: Thu Apr 24, 2008 3:02 am |
|
|
Very good. I have tried it and it cuts down my loop timing to 1.2us per byte. This solves my problem.
Although the loop is tight, there is a small chance that it can be interrupted by a simple ISR that collects data. I have run the loop many times and I saw it crash once, but of course it is difficult to be sure what caused the crash.
Thanks to all and particularly to Neutone and ckielstra for the suggestions. |
|
|
ckielstra
Joined: 18 Mar 2004 Posts: 3680 Location: The Netherlands
|
|
Posted: Thu Apr 24, 2008 7:19 am |
|
|
n-squared wrote: | I guess it will safely work only in a tight loop where the compiler has no chance of "corrupting" the two registers. | The compiler doesn't use the index registers very often but you do have a point. Thinking about it, most array accesses are performed using the first index register so care should be taken.
An easy way to make this method more reliable is to use another index register; there are three in total (FSR0 to FSR2). I've never seen the compiler use more than two index registers at the same time (in memcpy), so using index register FSR2 should be relatively safe.
nazoa wrote: | Although the loop is tight there is a small chance that it can be interrupted by a simple ISR that collects data. | Interrupts shouldn't be a problem, as the three index registers are saved and restored by the CCS interrupt dispatcher.
Note: there is no risk of altering the POSTINCx registers, as they are not real physical registers but only another representation of the FSRx registers. |
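The reasoning about FSR0 versus FSR2 can be illustrated with another small PC-side model. It assumes, as described above, that compiler-generated array accesses go through FSR0; `Pic18Model`, `compiler_array_read` and `second_byte_with_interference` are hypothetical names, and the real compiler's register usage may differ between versions.

```c
#include <assert.h>
#include <stdint.h>

enum { RAM_SIZE = 64 };

/* Toy model: shared data memory plus the three pointer registers FSR0..FSR2. */
typedef struct {
    uint8_t  ram[RAM_SIZE];
    uint16_t fsr[3];
} Pic18Model;

/* Stand-in for a compiler-generated array access, assumed to use FSR0. */
static uint8_t compiler_array_read(Pic18Model *m, uint16_t addr)
{
    assert(addr < RAM_SIZE);
    m->fsr[0] = addr;             /* overwrites whatever FSR0 held before */
    return m->ram[m->fsr[0]];
}

/* Read through POSTINCx for pointer register `which` (0, 1 or 2). */
static uint8_t read_postinc(Pic18Model *m, int which)
{
    assert(which >= 0 && which < 3);
    assert(m->fsr[which] < RAM_SIZE);
    return m->ram[m->fsr[which]++];
}

/* Stream two bytes from `start` with an unrelated array read in between.
   Returns the second byte that was streamed. */
static uint8_t second_byte_with_interference(Pic18Model *m, int which, uint16_t start)
{
    m->fsr[which] = start;
    (void)read_postinc(m, which);     /* first byte */
    (void)compiler_array_read(m, 40); /* unrelated access somewhere else in the program */
    return read_postinc(m, which);    /* correct only if the pointer survived */
}
```

In this model, streaming through register 0 returns the wrong second byte because the unrelated read moved the pointer, while streaming through register 2 is unaffected. On the target, the FSR2 counterparts of the FSR0 and POSTINC0 defines sit at 0x0FD9 and 0x0FDE in the PIC18 register map; it is worth confirming both against the device datasheet.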
|
|
n-squared
Joined: 03 Oct 2006 Posts: 99
|
|
Posted: Thu Apr 24, 2008 9:28 am |
|
|
Hi ckielstra,
Good point about FSR2.
By the way, CCS has XINST and NOXINST fuses, but I never see it using the extended instructions. Do you know if XINST does anything?
Best regards
Noam _________________ Every solution has a problem. |
|
|
ckielstra
Joined: 18 Mar 2004 Posts: 3680 Location: The Netherlands
|
|
Posted: Thu Apr 24, 2008 9:59 am |
|
|
As far as I know, v4.071 does not support the Extended Instruction Set (XINST). Some compiler versions have/had this fuse enabled by default, and this caused erratic behaviour in several programs. It is best to explicitly set NOXINST for now. |
|
|
nazoa
Joined: 09 Feb 2007 Posts: 56
|
|
Posted: Thu Apr 24, 2008 11:10 am |
|
|
I have tried changing to FSR2 and things are looking quite stable now. I have not seen a single crash after a few hours of testing. Thanks. |
|
|
|