Blog Archive

Showing posts with label absolute magnitude. Show all posts
Showing posts with label absolute magnitude. Show all posts

Thursday, September 29, 2016

Apollo asteroids - H mag vs Tisserand parameter

The Apollo asteroids' H mag is very much related to their inclination i and perihelium ( a*(1-e) ).

As the Tisserand's parameter is related to the orbital parameters (a,e,i), one may wonder whether the H mag is also related to the Tisserand distribution.

At first, I had been trying to look at the relation between H mag and Tisserand's parameter with respect to Jupiter.
The relation seems to be very weak, so I have tried to see if a different choice for the semi-major axis of the perturbing body can provide a stronger relationship.

The conclusion is that the stronger relation is found with a perturbing body semi-major axis equal to 1.01 AU (more or less ...Earth).

The analysis has been done with the package R (see R Core Team (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ).

In the following sections, taking into account the Apollo asteroids, I will describe: 
  • data preparation 
  • H mag distribution
  • H mag versus inclination
  • H mag versus perihelium
  • H mag versus Tisserand's parameter with respect to Jupiter
  • investigation about a generic Tisserand's parameter 
  • the special case of Tisserand's parameter for a perturbing body with semi-major axis = 1.01 AU

Data preparation 
Apollo asteroids - I have downloaded the orbital parameters (a,e,i,w,om,t_jup) + H mag from JPL Small-Body Database Browser

After that, I read the file apollo.csv as a dataframe in R with the below instruction:

p<-read.csv("apollo.csv")

With the head and tail instructions you can look at the first rows and at the last rows of your data frame:

> head(p)
         a         e         i         om        w     H t_jup
1 1.078078 0.8268746 22.826863  88.012367  31.3817 16.90 5.298
2 1.245393 0.3353925 13.337454 337.216252 276.8695 15.60 5.075
3 1.367258 0.4359054  9.380703 274.339634 127.0826 14.23 4.716
4 1.470241 0.5599397  6.352955  35.739118 285.8483 16.25 4.415
5 2.259355 0.6061736 18.399653 346.487607 267.9952 15.54 3.298
6 1.460856 0.6146128 22.212986   6.640199 325.6343 14.85 4.336

> tail(p)
            a         e         i       om        w      H t_jup
7855 1.313828 0.3839697 10.811349 277.5693 123.8834 22.621 4.872
7856 1.355240 0.5707721  6.501750 294.3308 251.8823 24.329 4.672
7857 2.449819 0.6928541  1.086150 133.4694 231.7488 25.417 3.113
7858 1.306445 0.2387369 14.003415 288.4028 329.4405 23.046 4.927
7859 2.693641 0.6526003 10.463092 133.2149 213.9876 24.597 3.004
7860 2.804018 0.6672408  4.724818 183.5757 234.1325 20.400 2.946
So we read a total of 7860 rows.

With the summary instruction you can have a look at the data:

>summary(p)
       a                e                 i                   om        
 Min.   : 1.000   Min.   :0.02097   Min.   :  0.02262   Min.   :  0.019 
 1st Qu.: 1.294   1st Qu.:0.36393   1st Qu.:  4.24058   1st Qu.: 80.208 
 Median : 1.643   Median :0.51276   Median :  8.44792   Median :169.915 
 Mean   : 1.724   Mean   :0.49672   Mean   : 12.52270   Mean   :170.999 
 3rd Qu.: 2.090   3rd Qu.:0.62793   3rd Qu.: 17.46998   3rd Qu.:251.462 
 Max.   :17.827   Max.   :0.96910   Max.   :154.37505   Max.   :359.937 
       w                  H             t_jup     
 Min.   :  0.0348   Min.   :12.40   Min.   :1.316 
 1st Qu.: 92.4286   1st Qu.:20.00   1st Qu.:3.444 
 Median :189.4941   Median :22.60   Median :4.088 
 Mean   :182.7822   Mean   :22.57   Mean   :4.186 
 3rd Qu.:272.6443   3rd Qu.:25.17   3rd Qu.:4.878 
 Max.   :359.9882   Max.   :33.20   Max.   :6.064 
 


H mag distribution
With the following instructions you can display a rough density plot to have an idea about the data:
> library(ggplot2)
> ggplot(p,aes(H))+geom_density()

H mag versus inclination
The curious density shape shown above has an easy explanation.
Just take a look at the inclination quartiles (in short, I call them i_quartiles):

> i_quartiles<-cut(p$i,fivenum(p$i),include.lowest=T)
> ggplot(p,aes(H,col=i_quartiles))+geom_density()

> summary(i_quartiles)
[0.0226,4.24]   (4.24,8.45]   (8.45,17.5]    (17.5,154] 

1965                1965             1965              1965

So it seems that inclination alone is a good explanation for the H mag distribution shown above. Apollo asteroids belonging to the 3rd and 4th inclination quartiles have, by definition, a greater inclination: it turns out that they are brighter than those belonging to the 1st and 2nd inclination quartiles.

H mag versus perihelium
In this case, we calculate q = a*(1-e) and after that we find the perihelium quartiles:
> p$q<-p$a*(1-p$e)    
> q_quartiles<-cut(p$q,fivenum(p$q),include.lowest=T)


> summary(q_quartiles)
[0.0707,0.694]  (0.694,0.858]  (0.858,0.954]   (0.954,1.02]
          1965           1965           1965           1965


> ggplot(p,aes(H,col=q_quartiles))+geom_density()

Apollo asteroids belonging to the 3rd and 4th perihelium quartiles have, by definition, a greater perihelium: it turns out that they are darker  than those belonging to the 1st and 2nd perihelium quartiles.


H mag versus Tisserand's parameter with respect to Jupiter 

We can run the following instructions to get a rough scatter plot.
> ggplot(p,aes(H,t_jup))+geom_point()


This scatter plot is not particularly informative, it is hard to see any meaningful patterns!

So let's use the same approach seen before: let's make a plot based on T-Jupiter quartiles:
> tjup_quartiles<-cut(p$t_jup,fivenum(p$t_jup),include.lowest=T)

> summary(tjup_quartiles)
[1.32,3.44] (3.44,4.09] (4.09,4.88] (4.88,6.06] 

 1967        1965        1964        1964


 Similarly, we can draw a boxplot. Just run these commands:
ggplot(p,aes(tjup_quartiles,H,fill=tjup_quartiles))+\
geom_boxplot()+\ 
scale_x_discrete(labels=c("Q1","Q2","Q3","Q4"))

 

The Apollo asteroids belonging to the 1st T-jupiter quartile are brighter than the others.
A part from this, the distributions of the Apollo asteroids belonging to the 2nd, 3rd and a little less the 4th T-Jupiter quartiles are not so different as far as the median value is concerned.


We can find a rough numerical measure for the (not so great) difference between the various H mag distributions: we use R to calculate the spearman correlation.

> cor(p$H,p$t_jup,method="spearman")
[1] 0.2268667

I think the Spearman correlation in this case is better than the traditional Pearson correlation because we are not really hoping to find a linear relation, we would be happy to find a monotonic one.

Investigation about a generic Tisserand's parameter
Given the above result, one can wonder: what happens if we used a different Tisserand's parameter? Would we get a stronger correlation?

In order to answer, we can define two user defined functions:

Function1
func_t<-function(ap,a,e,i){ap/a+2*(a/ap*(1-e^2))^(1/2)*cos(i/180*pi)}

This first function just implements the Tisserand formula:

 T_{P}\ ={\frac  {a_{P}}{a}}+2\cdot {\sqrt  {{\frac  {a}{a_{P}}}(1-e^{2})}}\cos i

Function2
func_t_h_spea<-function(ap)\{p$ts<-func_t(ap,p$a,p$e,p$i);return(cor(p$H,p$ts,method="spearman"))}

This second function calls the first one and after that, once the Tisserand values are calculated for every asteroid, it returns the correlation coefficient between H and the generic Tisserand's parameter (given ap as input).

Now we can vary the semi-major axis ap of the perturbing body in a range of our choice ...let's say [0<ap<30] AU and plot the graph showing the variation of the correlation factor.

>therange<-seq(0.01,30,0.01)
>plot(therange,sapply(therange,function(x) {func_t_h_spea(x)}),type='l',xlab="ap used for tisserand param",ylab="Spearman correlation H vs Tisserand's param")



With the order instruction we can search for the exact ap value that maximizes the correlation;

> tail(order(sapply(therange,function(x) {func_t_h_spea(x)})),1)
[1] 101
> therange[101]
[1] 1.01


 We get ap = 1.01 AU

The special case of Tisserand's param with ap = 1.01 AU
Given the above result, we can calculate the quartiles of the Tisserand's parameter with respect to a perturbing body with semi-major axis 1.01 AU:

> ap=1.01
>p$ts<-func_t(ap,p$a,p$e,p$i);p$tsquartile<-factor(cut(p$ts,fivenum(p$ts), include.lowest = T))

>summary(p$tsquartile)
[-1.29,2.59]  (2.59,2.79]   (2.79,2.9]      (2.9,3] 

  1965             1965              1965         1965

Then, we can see the result in the usual graphical form:

> ggplot(p,aes(H,col=tsquartile))+geom_density()
It seems that our search for a better/stronger correlation has been successful: now we can observe a sharper separation between the various H distributions.

The same result is more clear with a boxplot:
ggplot(p,aes(tsquartile,H,fill=tsquartile))+\
geom_boxplot()+\ 
scale_x_discrete(labels=c("Q1","Q2","Q3","Q4"))+ggtitle("H versus Tisserand's param for ap=1.01")



In conclusion, there are a lot of outliers in the extreme quartiles but the middle half of the H mag distributions are now much better separated than when we used Tisserand's parameter with respect to Jupiter.

Cheers,
Alessandro Odasso

Monday, October 20, 2014

Mars-crossing Asteroids - Absolute Magnitude Model

The H mag median for the Mars-crossing asteroids is about 18.

I tried to use a data mining software (Weka) to find a classification model that builds a decision tree based on orbital parameters (a,e,i) to estimate whether a mars-crossing asteroid has H<=18.

After many trials, I found a model that is far from being perfect but that might have some interest.

The model seems to be able to correctly identify the H mag level of the mars-crossing asteroids in 67% of the cases (a performance much better than the 50% probability of success that it could have just by chance).

The data mining program has processed 12477 asteroids using the J48 algorithm (66% of the asteroids used for training, the remainder for testing it). When it finished, the following report was displayed:



In the above report, we see the overall performance of the model (67% of correctly identified instances) plus a detailed accuracy summary for each class showing the rate of True Positives, False Positives  and Precision.

At the bottom, we can see the so called "Confusion Matrix" or contingency table showing the two classes of asteroids magnitude:
  • class a: bright asteroid (H <= 18.0)
  • class b: dim asteroid (H > 18.0)

In order to understand it better, let's explain it looking for example at class b, i.e., the class of dim asteroids:

  • TP Rate: we see that the dim asteroids were correctly predicted with a rate of 72.8% (1554 / (1554+580))
  • FP Rate: we see that 796 bright asteroid were mistakenly classified as dim asteroids, thus the proportion of bright asteroids not correctly classified is 37.8% (796/(1312+796))
  • Precision: any asteroid classified as dim  was truly dim in about 66% of the cases (1554 / (1554+796)).

In the following section, you can see the model itself (as a tecnique called bagging was used, the output contains 10 decision trees that taken together produced the overall result):