647 views
1 1 vote
In case of 3 sensors reporting loads of values individually.. one sensor might be off. The average of the 2 trustworthy sensors is to be reported.. the third in need for recalibration is to be neglected. I'm in need of an (excel) formula looking at three columns which row-by-row detects a significant deviation compared to the others and calculate the average of the most trustworthy.
Example:
48.1 ; 45.2 ; 45.4 => 45.3, as sensor 1 is way off....
36.0 ; 37;0 ; 45.0 => 36.5, as sensor 3 is way off....
36.0 ; 36;5 ; 37.0 => 36.5 as the deviation is too small to be considered an anomaly, so all values are valid to create the average.

Working with long periods of time.. the readings might be trustworthy for a few weeks, but in defect from moment X up until now... so simply ruling out one sensor is not really an option either.. What is the best way forward?
Please help. Highly appreciated.
0% Accept Rate Accepted 0 answers out of 1 questions

1 Answer

0 0 votes
What seems to work is simple: create 4 new columns:
x=average(3 values) ; y=stdev.p(3 values) ; low threshold = x-y ; high threshold = x+y
, then repeat the numbers if these are within the boundaries and make a 'n/a' if outside of stdev.
With the values repeated (as within thresholds), the average can be calculated neglecting the extreme values.
Example:
10 ; 11 ; 20 : avg=13.67: stdev.p = 4.49; Low=9.17; high=18.16, so
10 ; 11; n/a  (as 20  > 18.16)
this gives an average of 10.5   :-)

Agree?

Related questions

0 0 votes
0 0 answers
548
548 views
Anas asked Nov 28, 2021
548 views
So say I have a column with categorical data like different styles of temperature: 'Lukewarm', 'Hot', 'Scalding', 'Cold', 'Frostbite',... etc.I know that we can use pd.ge...
1 1 vote
1 1 answer
1.6k
1.6k views
Hagar asked Jun 24, 2023
1,599 views
Hello,I have a dataset with a categorical column that contains three categories. One of the categories represents 98% of the data, while the remaining 2% are distributed ...
1 1 vote
1 1 answer
1.4k
1.4k views
Anas asked Dec 18, 2021
1,358 views
It's a car prices dataset, and so I'm assuming that the more recent the more value a car should have. The values in the 'year' column simply consist of years from 1995 to...
0 0 votes
1 1 answer
850
850 views
Anna asked Jun 8, 2021
850 views
1 1 vote
1 1 answer
986
986 views
rozhan asked Oct 28, 2018
986 views