How to handle multiplication overflows?

future · Joined: 14 May 2004 Posts: 330

In my application I need to multiply 4 integers and divide for 1000000..

If in the middle of process there's an overflow I get wrong results.

What I need is if some goes wrong I can set a predefined value.

The multiplication is like:

var1 * var2 * var3 * var4 = ??
----- ------ ------ ------
100 100 100 100

Is there a better approach for this?

Neutone · Joined: 08 Sep 2003 Posts: 839 Location: Houston

Use an Int32 to keep from overflowing. It will not overflow when multiplying 4 bytes. I assume you are talking about byte but the term integer is ambigous when refering to code for an 8 bit processor.

future · Joined: 14 May 2004 Posts: 330

Yes I did cast everything to int32.

The speed of main loop drops by a factor of 4 when using these 32bit math Shocked

But the problem is that if the final result is greater than 255, the result which I show in a gauge meter goes crazy.

So I need a way to check is the result is overflowed or not.

I though about testing each division to not be greater than some number, but it has to be a better way Smile

SherpaDoug · Joined: 07 Sep 2003 Posts: 1640 Location: Cape Cod Mass USA

An int32 can not possibly overflow from the multiplication of four int8s. So if you do your multiplications, then divide, you have to get the right result. Your result could be as large as 4295 so you have to compare to 255 before casting back to an int8.
_________________
The search for better is endless. Instead simply find very good and get the job done.

future · Joined: 14 May 2004 Posts: 330

Ok, so I will do the multiplication, then the division and test if the result is greater than 255, if it is, I will set the result equal 255.

The only thing I was wondering if it is possible to do something to not have to DO the whole multiplication/division to verify if it will overflow or not.

The operation takes a lot of processing time. Before, the main loop was at 17k loop/s, with 2 calculations it is at 2k and there is a lot of code to add.

Neutone · Joined: 08 Sep 2003 Posts: 839 Location: Houston

Try this, it should cut the time almost in half.

int8 x1,x2,x3,x4;
int16 y1,y2;
int32 z;

y1=x1;
y1*=x2;
y2=x3;
y2*=x4;
z=y1;
z*=y2;
z/=1000000;

The compiler should have a good method to perform 8x8=16 or 16x16=32 math but does not. The 8x8=16 can be done in assembly in just a few instructions.

future · Joined: 14 May 2004 Posts: 330

Neutone: Thank you!

I put your code into a function...

SteveS · Joined: 27 Oct 2003 Posts: 126

If you really need to speed things up you could split the /1e6 into shift right 6 and /15625. This makes the divide routine a 32 by 16 divide, which you will need to write since I don't think it's available directly in CCS. I would think that be tighter code.

Along those lines, try to take advantage of anything that is known or fixed about your calculation. Say, if the input data is restricted in range, you might be able to streamline the calculations somehow. Or, if you can somehow scale your input data so you can divide by a power of 2 (like 2^20= 1048576).

Just a couple of ideas...

- SteveS

future · Joined: 14 May 2004 Posts: 330

It is a good idea but...

//z/=1000000;
z>>=6;
z/=15625;

It adds 50bytes and it is a bit slower. Results are the same.

SteveS · Joined: 27 Oct 2003 Posts: 126

oops, sorry 'bout that. Did you use a 32 by 16 divide routine? If you use the 32 by 32 it would be slower and bigger.

- SteveS

Guest · Posted: Wed Jun 23, 2004 7:04 am

I let the compiler use its default routines as I am not good with assembly.

It does not need optimizations for now, the speed is sufficient.

Just one thing that is cool, you write some routine and there is always a way to make it faster.

I will be adding more code and making things slower... Very Happy

SteveS · Joined: 27 Oct 2003 Posts: 126

You can always find something to optimize the code - it just takes time. I remember spending a month or so on about 50 lines of code (a filter) on the first TI DSP (oops I'm dating myself). But each line I took out or optimized made my filter run faster and better. There it was worth it. If you decide you need to tweak your code, post again with as much info as you can - we can probably come up with some ideas.

- Steve

SteveS · Joined: 27 Oct 2003 Posts: 126

Ok, here is a way to speed things up I think.

Basically division is slow (except for powers of 2). So, instead of dividing by 1e6, multiply by 4295/2^32 which is close to /1e6

But, since your four 8-bit numbers, when multiplied together, can fill up 32 bits, you don't have room (in a 32 bit value) to further multiply by 4295. So after getting the 4 bytes multiplied together, divide by 2^16 which means just take the upper two bytes of the result. Now you can multiply that by 4295. Again you need another divide by 2^16 (to get a total 2^32 divide), so take the upper two bytes again. I think this will preserve your accuracy and a quick test showed it to be twice as fast as /1e6.

- SteveS

future · Joined: 14 May 2004 Posts: 330

In your test, did you use an union to get the upper bytes?

SteveS · Joined: 27 Oct 2003 Posts: 126

Yes, I did it like this: