The median absolute magnitude (H) of the Mars-crossing asteroids is about 18.
I tried using data mining software (Weka) to find a classification model that builds a decision tree based on the orbital parameters (a, e, i) to estimate whether a Mars-crossing asteroid has H <= 18.
After many trials, I found a model that is far from perfect but that might be of some interest.
The model seems to be able to correctly identify the H magnitude class of the Mars-crossing asteroids in 67% of the cases (clearly better than the roughly 50% success rate it could achieve just by chance, since the threshold sits at the median).
The data mining program processed 12477 asteroids using the J48 algorithm (66% of the asteroids were used for training, the remainder for testing). When it finished, the following report was displayed:
In the above report, we see the overall performance of the model (67% of correctly identified instances) plus a detailed accuracy summary for each class showing the rate of True Positives, False Positives and Precision.
At the bottom, we can see the so-called "Confusion Matrix" or contingency table showing the two classes of asteroid magnitude:
- class a: bright asteroid (H <= 18.0)
- class b: dim asteroid (H > 18.0)
To understand it better, let's walk through it using class b, i.e., the class of dim asteroids, as an example:
- TP Rate: the dim asteroids were correctly predicted at a rate of 72.8% (1554 / (1554+580))
- FP Rate: 796 bright asteroids were mistakenly classified as dim asteroids, so the proportion of bright asteroids wrongly classified as dim is 37.8% (796 / (1312+796))
- Precision: any asteroid classified as dim was truly dim in about 66% of the cases (1554 / (1554+796)).
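The arithmetic above can be reproduced with a few lines of Python. This is only a minimal sketch for checking the numbers, not part of the Weka run: the four counts are the ones quoted in the bullets, and the variable and function names (tp, fn, fp, tn, class_b_metrics) are my own labels, with class b (dim) treated as the "positive" class.

```python
# Confusion-matrix counts quoted in the text, taking class b (dim) as "positive".
tp = 1554  # dim asteroids classified as dim
fn = 580   # dim asteroids classified as bright
fp = 796   # bright asteroids classified as dim
tn = 1312  # bright asteroids classified as bright


def class_b_metrics(tp, fn, fp, tn):
    """Return the per-class rates for class b plus the overall accuracy."""
    total = tp + fn + fp + tn
    return {
        "tp_rate": tp / (tp + fn),        # share of dim asteroids found
        "fp_rate": fp / (fp + tn),        # share of bright asteroids called dim
        "precision": tp / (tp + fp),      # share of "dim" predictions that are right
        "accuracy": (tp + tn) / total,    # correctly classified instances overall
        "total": total,
    }


metrics = class_b_metrics(tp, fn, fp, tn)
for name, value in metrics.items():
    if name == "total":
        print(f"{name}: {value}")
    else:
        print(f"{name}: {value:.1%}")
```

The four counts add up to 4242 asteroids, which is the roughly 34% of the 12477 objects held out for testing, and the accuracy computed this way (about 67.6%) agrees with the 67% overall figure.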
In the following section, you can see the model itself (since a technique called bagging was used, the output contains 10 decision trees that, taken together, produced the overall result):