Counting number of samples

delene · Joined: 13 Oct 2003 Posts: 32

Hi..

I am reading in 200 samples of data, and want to find which value appears most often in the list.
I haven't got a clue on how to start or do this, and am hoping that someone can take pity on me and tell me how.

Thanks
Delene

delene · Joined: 13 Oct 2003 Posts: 32

Just wanted to add, I can either count them as I read them or store them in an array and count them, whichever is easiest.

ckielstra · Joined: 18 Mar 2004 Posts: 3680 Location: The Netherlands

What is the range of values for the incoming data, i.e. what is the lowest and highest possible value?

delene · Joined: 13 Oct 2003 Posts: 32

Anywhere between -19999 and +19999.

Ttelmah · Joined: 11 Mar 2010 Posts: 19496

Counting while reading would depend on the size of the samples. If these are int8 values, then it is easy. Just have a 256 element int8 array, set all entries to zero before you start, and when each value arrives, increment the corresponding entry (so if a value of 23 arrives, increment array[23]). When finished, look for the largest value in the array; its index is the value that appeared most often.
This gets harder for int16 values, where the table would need 65536 entries, and basically impossible for float values. For floats you would need to specify a 'margin' within which values are deemed to be identical, or you will probably never have two identical values.
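
As a rough sketch of the counting-table idea for the 8-bit case, something like the following could work. It is written in plain standard C rather than tested on a particular compiler, and the function name `mode_u8` is made up for illustration. It assumes unsigned 8-bit samples and no more than 255 of them, so that an 8-bit counter cannot overflow (200 samples fits).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the most frequent value among n unsigned
   8-bit samples. Assumes n <= 255 so each 8-bit counter cannot overflow.
   If several values tie, the smallest one is returned. */
uint8_t mode_u8(const uint8_t *samples, uint16_t n)
{
    uint8_t counts[256];
    uint8_t best = 0;
    uint16_t i, v;

    /* Set all entries to zero before starting. */
    memset(counts, 0, sizeof counts);

    /* Each arriving value increments its own entry. */
    for (i = 0; i < n; i++)
        counts[samples[i]]++;

    /* Look for the largest count; its index is the mode. */
    for (v = 1; v < 256; v++)
        if (counts[v] > counts[best])
            best = (uint8_t)v;

    return best;
}
```

The same loop body can be run one sample at a time as the data arrives, so the samples themselves never need to be stored.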

Have a look at:

<http://discuss.joelonsoftware.com/default.asp?interview.11.352366.6>

Which gives a reasonably efficient algorithm, which needs a little tweaking....

Try searching for 'finding the mode of an array'. This is the mathematical 'name' for what you want.
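
One common way to find the mode, which may or may not be what the linked page describes, is to sort the stored samples and then look for the longest run of equal values. The sketch below assumes the 200 samples are already in an int16 array (the stated range of -19999 to +19999 fits) and that the standard `qsort` is available on the target. The names `cmp_int16` and `mode_sorted` are invented for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of int16 values. */
static int cmp_int16(const void *a, const void *b)
{
    int16_t x = *(const int16_t *)a;
    int16_t y = *(const int16_t *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: sorts samples in place, then returns the value
   with the longest run of equal neighbours. Requires n >= 1.
   If count_out is not NULL it receives how often that value occurred.
   On a tie, the smallest of the tied values is returned. */
int16_t mode_sorted(int16_t *samples, size_t n, size_t *count_out)
{
    size_t i, run = 1, best_run = 1;
    int16_t best;

    qsort(samples, n, sizeof samples[0], cmp_int16);

    best = samples[0];
    for (i = 1; i < n; i++) {
        if (samples[i] == samples[i - 1])
            run++;          /* still inside the same run */
        else
            run = 1;        /* a new value starts a new run */

        if (run > best_run) {
            best_run = run;
            best = samples[i];
        }
    }

    if (count_out)
        *count_out = best_run;
    return best;
}
```

This needs no memory beyond the sample array itself, at the cost of one sort of 200 elements.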

Best Wishes

delene · Joined: 13 Oct 2003 Posts: 32

Thank you Ttelmah for your help and for getting back to me so quickly.

The name of the function I am looking for is particularly helpful!

Will do as you have suggested THANK YOU

ckielstra · Joined: 18 Mar 2004 Posts: 3680 Location: The Netherlands

SherpaDoug · Joined: 07 Sep 2003 Posts: 1640 Location: Cape Cod Mass USA

The probabilities depend on what the numbers represent. If they are part numbers you could easily have 1 pipe, 2 flanges, 12 bolts, and 24 washers with the washers winning out.

I think this is a problem for a "sparse array". You have a real array of 200 entries that indexes into a virtual array of -19999 to +19999. Every time you get a new sample you see if it is already in the array. If it is there, you increment its quantity. If it is not there, you put it in and give it a quantity of 1. It is a little slow, but I don't see any other way. It may or may not be worth periodically sorting the array.
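
A minimal sketch of that sparse-array idea in standard C might look like this. The type and function names (`entry_t`, `tally_t`, `tally_add`, `tally_mode`) are invented for illustration, and the table is searched linearly without any sorting. It assumes at most 200 samples, so 200 slots always suffice and an 8-bit quantity cannot overflow.

```c
#include <stdint.h>

#define MAX_DISTINCT 200   /* at most one slot per sample */

typedef struct {
    int16_t value;   /* sample value, -19999..+19999 fits in int16 */
    uint8_t count;   /* quantity seen so far (<= 200) */
} entry_t;

typedef struct {
    entry_t entries[MAX_DISTINCT];
    uint8_t used;    /* number of slots currently in use */
} tally_t;

void tally_init(tally_t *t)
{
    t->used = 0;
}

/* Record one sample. Returns 1 on success, 0 if the table is full. */
int tally_add(tally_t *t, int16_t value)
{
    uint8_t i;

    /* See if the value is already in the array. */
    for (i = 0; i < t->used; i++) {
        if (t->entries[i].value == value) {
            if (t->entries[i].count < 255)
                t->entries[i].count++;
            return 1;
        }
    }

    /* Not there: put it in with a quantity of 1. */
    if (t->used >= MAX_DISTINCT)
        return 0;
    t->entries[t->used].value = value;
    t->entries[t->used].count = 1;
    t->used++;
    return 1;
}

/* Store the most frequent value in *value_out.
   Returns 0 if no samples were recorded, 1 otherwise.
   On a tie, the value that was seen first wins. */
int tally_mode(const tally_t *t, int16_t *value_out)
{
    uint8_t i, best = 0;

    if (t->used == 0)
        return 0;
    for (i = 1; i < t->used; i++)
        if (t->entries[i].count > t->entries[best].count)
            best = i;
    *value_out = t->entries[best].value;
    return 1;
}
```

Each slot holds three bytes of data, so the table costs roughly 600 bytes of RAM if the compiler packs it tightly, which may matter on a small part. The worst case is 200 comparisons per incoming sample.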
_________________
The search for better is endless. Instead simply find very good and get the job done.