CCS C Software and Maintenance Offers
FAQFAQ   FAQForum Help   FAQOfficial CCS Support   SearchSearch  RegisterRegister 

ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

CCS does not monitor this forum on a regular basis.

Please do not post bug reports on this forum. Send them to CCS Technical Support

How to handle multiplication overflows?
Goto page 1, 2  Next
 
Post new topic   Reply to topic    CCS Forum Index -> General CCS C Discussion
View previous topic :: View next topic  
Author Message
future



Joined: 14 May 2004
Posts: 330

View user's profile Send private message

How to handle multiplication overflows?
PostPosted: Sun Jun 20, 2004 3:11 pm     Reply with quote

In my application I need to multiply 4 integers and divide for 1000000..

If in the middle of process there's an overflow I get wrong results.

What I need is if some goes wrong I can set a predefined value.

The multiplication is like:

var1 * var2 * var3 * var4 = ??
----- ------ ------ ------
100 100 100 100

Is there a better approach for this?
Neutone



Joined: 08 Sep 2003
Posts: 839
Location: Houston

View user's profile Send private message

PostPosted: Sun Jun 20, 2004 7:40 pm     Reply with quote

Use an Int32 to keep from overflowing. It will not overflow when multiplying 4 bytes. I assume you are talking about byte but the term integer is ambigous when refering to code for an 8 bit processor.
future



Joined: 14 May 2004
Posts: 330

View user's profile Send private message

PostPosted: Mon Jun 21, 2004 4:58 am     Reply with quote

Yes I did cast everything to int32.

The speed of main loop drops by a factor of 4 when using these 32bit math Shocked

But the problem is that if the final result is greater than 255, the result which I show in a gauge meter goes crazy.

So I need a way to check is the result is overflowed or not.

I though about testing each division to not be greater than some number, but it has to be a better way Smile
SherpaDoug



Joined: 07 Sep 2003
Posts: 1640
Location: Cape Cod Mass USA

View user's profile Send private message

PostPosted: Mon Jun 21, 2004 7:10 am     Reply with quote

An int32 can not possibly overflow from the multiplication of four int8s. So if you do your multiplications, then divide, you have to get the right result. Your result could be as large as 4295 so you have to compare to 255 before casting back to an int8.
_________________
The search for better is endless. Instead simply find very good and get the job done.
future



Joined: 14 May 2004
Posts: 330

View user's profile Send private message

PostPosted: Mon Jun 21, 2004 10:39 am     Reply with quote

Ok, so I will do the multiplication, then the division and test if the result is greater than 255, if it is, I will set the result equal 255.

The only thing I was wondering if it is possible to do something to not have to DO the whole multiplication/division to verify if it will overflow or not.

The operation takes a lot of processing time. Before, the main loop was at 17k loop/s, with 2 calculations it is at 2k and there is a lot of code to add.
Neutone



Joined: 08 Sep 2003
Posts: 839
Location: Houston

View user's profile Send private message

PostPosted: Mon Jun 21, 2004 11:51 am     Reply with quote

Try this, it should cut the time almost in half.

int8 x1,x2,x3,x4;
int16 y1,y2;
int32 z;

y1=x1;
y1*=x2;
y2=x3;
y2*=x4;
z=y1;
z*=y2;
z/=1000000;

The compiler should have a good method to perform 8x8=16 or 16x16=32 math but does not. The 8x8=16 can be done in assembly in just a few instructions.
future



Joined: 14 May 2004
Posts: 330

View user's profile Send private message

PostPosted: Mon Jun 21, 2004 4:56 pm     Reply with quote

Neutone: Thank you!

I put your code into a function...

Code:
int8 big(int8 x1,int8 x2,int8 x3,int8 x4) {
/*******************************
* result = x1 * x2 * x3 * x4             *
*             ---- ---- ---- ----               *
*             100  100  100  100           *
*******************************/
int32 z;
int16 y1,y2;
y1=x1;
y2=x3;
y1*=x2;
y2*=x4;
z=y1;
z*=y2;
z/=1000000;
if(z>255) { return(255); } else { return(z); }
}


Size shrank about 200 bytes and speed went up from 890 to 1070hz

The function is called 2 times.
SteveS



Joined: 27 Oct 2003
Posts: 126

View user's profile Send private message

PostPosted: Tue Jun 22, 2004 7:23 am     Reply with quote

If you really need to speed things up you could split the /1e6 into shift right 6 and /15625. This makes the divide routine a 32 by 16 divide, which you will need to write since I don't think it's available directly in CCS. I would think that be tighter code.

Along those lines, try to take advantage of anything that is known or fixed about your calculation. Say, if the input data is restricted in range, you might be able to streamline the calculations somehow. Or, if you can somehow scale your input data so you can divide by a power of 2 (like 2^20= 1048576).

Just a couple of ideas...

- SteveS
future



Joined: 14 May 2004
Posts: 330

View user's profile Send private message

PostPosted: Tue Jun 22, 2004 4:35 pm     Reply with quote

It is a good idea but...

//z/=1000000;
z>>=6;
z/=15625;

It adds 50bytes and it is a bit slower. Results are the same.
SteveS



Joined: 27 Oct 2003
Posts: 126

View user's profile Send private message

PostPosted: Wed Jun 23, 2004 6:40 am     Reply with quote

oops, sorry 'bout that. Did you use a 32 by 16 divide routine? If you use the 32 by 32 it would be slower and bigger.

- SteveS
Guest








PostPosted: Wed Jun 23, 2004 7:04 am     Reply with quote

I let the compiler use its default routines as I am not good with assembly.

It does not need optimizations for now, the speed is sufficient.

Just one thing that is cool, you write some routine and there is always a way to make it faster.

I will be adding more code and making things slower... Very Happy
SteveS



Joined: 27 Oct 2003
Posts: 126

View user's profile Send private message

PostPosted: Wed Jun 23, 2004 3:19 pm     Reply with quote

You can always find something to optimize the code - it just takes time. I remember spending a month or so on about 50 lines of code (a filter) on the first TI DSP (oops I'm dating myself). But each line I took out or optimized made my filter run faster and better. There it was worth it. If you decide you need to tweak your code, post again with as much info as you can - we can probably come up with some ideas.

- Steve
SteveS



Joined: 27 Oct 2003
Posts: 126

View user's profile Send private message

faster divide by 1e6
PostPosted: Thu Jun 24, 2004 1:57 pm     Reply with quote

Ok, here is a way to speed things up I think.

Basically division is slow (except for powers of 2). So, instead of dividing by 1e6, multiply by 4295/2^32 which is close to /1e6

But, since your four 8-bit numbers, when multiplied together, can fill up 32 bits, you don't have room (in a 32 bit value) to further multiply by 4295. So after getting the 4 bytes multiplied together, divide by 2^16 which means just take the upper two bytes of the result. Now you can multiply that by 4295. Again you need another divide by 2^16 (to get a total 2^32 divide), so take the upper two bytes again. I think this will preserve your accuracy and a quick test showed it to be twice as fast as /1e6.

- SteveS
future



Joined: 14 May 2004
Posts: 330

View user's profile Send private message

PostPosted: Fri Jun 25, 2004 4:48 am     Reply with quote

In your test, did you use an union to get the upper bytes?
SteveS



Joined: 27 Oct 2003
Posts: 126

View user's profile Send private message

PostPosted: Fri Jun 25, 2004 7:18 am     Reply with quote

Yes, I did it like this:

Code:

int8 m0;
int8 m1;
int8 m2;
int8 m3;

int32 quot;
int32 quot2;
int16 proda16;
int16 prodb16;

union
{
int16 w[2];
int32 d;
} val;

union
{
int16 w[2];
int32 d;
} val2;

   // set m0, m1, m2 , m3 to whatever
   proda16 = (int16)m0 * (int16)m1;
   prodb16 = (int16)m2 * (int16)m3;
   val.d = (int32)proda16 * (int32)prodb16;

#ifdef STRAIGHT_FORWARD_WAY
   quot = val.d/1000000;
   // 16 bit answer is val.w[0]
#else // faster way
   quot2 = val.w[1];
   val2.d = quot2 * 4295;
   // 16 bit answer is val.w[1]
#endif



BTW, I tested it on an 18F part which has hardware multiply, YMMV on a 16F part.

- SteveS
Display posts from previous:   
Post new topic   Reply to topic    CCS Forum Index -> General CCS C Discussion All times are GMT - 6 Hours
Goto page 1, 2  Next
Page 1 of 2

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group