Mean of duplicates with a condition
I have a data set that looks like this:
CANONICAL_SMILES  CMPD_CHEMBLID  PCHEMBL_VALUE
A                 CHEMBL85       7.8
A                 CHEMBL85       8.21
A                 CHEMBL85       8.16
A                 CHEMBL85       8.07
A                 CHEMBL85       8
A                 CHEMBL85       7.8
A                 CHEMBL85       7
A                 CHEMBL85       1
B                 CHEMBL61079    8.92
B                 CHEMBL61079    8.92
C                 CHEMBL91162    9.22
C                 CHEMBL316125   8.7
D                 CHEMBL293341   8.6
D                 CHEMBL293341   8.6
D                 CHEMBL293341   8.6
I want output that satisfies the following conditions:
First, group the values by CANONICAL_SMILES and CMPD_CHEMBLID (the sample above yields 5 groups; I will refer to two of them as A and B for simplicity).
Then, within each group (A and B), check whether the difference between the maximum and minimum PCHEMBL_VALUE is less than 1 (max - min is never negative, so the condition is simply max - min < 1). If it is, calculate the mean of the PCHEMBL_VALUEs in that group and report it; otherwise, do not report that group. In this case, A CHEMBL85 should not be reported because max - min = 8.21 - 1 = 7.21, but B CHEMBL61079 should be reported because max - min = 8.92 - 8.92 = 0.
Could you suggest code for this?
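
To make the intended logic concrete, here is a minimal sketch in plain Python (standard library only). The language is an assumption, since none is named above, and the function name filtered_group_means and its max_range parameter are hypothetical names chosen for illustration. It groups the rows by both key columns, keeps only groups whose max - min is below 1, and returns the mean of each kept group.

```python
from collections import defaultdict
from statistics import mean

# Sample rows from the question:
# (CANONICAL_SMILES, CMPD_CHEMBLID, PCHEMBL_VALUE)
rows = [
    ("A", "CHEMBL85", 7.8),
    ("A", "CHEMBL85", 8.21),
    ("A", "CHEMBL85", 8.16),
    ("A", "CHEMBL85", 8.07),
    ("A", "CHEMBL85", 8),
    ("A", "CHEMBL85", 7.8),
    ("A", "CHEMBL85", 7),
    ("A", "CHEMBL85", 1),
    ("B", "CHEMBL61079", 8.92),
    ("B", "CHEMBL61079", 8.92),
    ("C", "CHEMBL91162", 9.22),
    ("C", "CHEMBL316125", 8.7),
    ("D", "CHEMBL293341", 8.6),
    ("D", "CHEMBL293341", 8.6),
    ("D", "CHEMBL293341", 8.6),
]


def filtered_group_means(rows, max_range=1.0):
    """Return {(smiles, chembl_id): mean} for groups with max - min < max_range.

    Hypothetical helper: the name and the max_range parameter are
    illustrative and not taken from any library.
    """
    # Collect the values of each (CANONICAL_SMILES, CMPD_CHEMBLID) pair.
    groups = defaultdict(list)
    for smiles, chembl_id, value in rows:
        groups[(smiles, chembl_id)].append(value)

    # Keep a group only if its spread is strictly below the threshold.
    return {
        key: mean(values)
        for key, values in groups.items()
        if max(values) - min(values) < max_range
    }


result = filtered_group_means(rows)
for (smiles, chembl_id), avg in result.items():
    print(smiles, chembl_id, round(avg, 2))
```

With the sample rows, A CHEMBL85 is dropped (max - min = 7.21) and B CHEMBL61079 is reported with a mean of 8.92. A group with a single row has max - min = 0, so it is always reported under this rule.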