The Minor Planet Center maintains an updated list of Jupiter Trojans (last updated on 2020 Dec. 8).
After some cut and paste, I managed to read that list into an R data frame.
Here are the first few rows:
> head(p)
        Ln     q     Q    H      M Peri.  Node Incl.     e     a
1906 TG L4 4.443 5.976 8.27 271.74 133.3 316.5  10.3 0.147 5.209
1906 VY L5 4.490 5.939 8.24 237.01 307.9  44.3  22.1 0.139 5.215
1907 XM L4 5.142 5.382 7.28 202.51 184.2 342.8  18.2 0.023 5.262
1908 CS L4 4.558 5.771 8.71  14.90 343.5 350.7   4.5 0.117 5.164
1917 CQ L5 4.544 5.822 8.71 324.73 335.5 301.6   8.9 0.123 5.183
1919 FD L4 4.927 5.629 7.93 301.43  80.6 338.0  21.8 0.067 5.278
In the above data frame, each row is named after the asteroid's provisional designation.
The Ln column gives the Lagrangian point, either L4 or L5.
In fact, the Trojan asteroids are divided into two groups:
- the so-called "Greek group" is located around L4 (ahead of Jupiter)
- the so-called "Trojan group" is located around L5 (trailing Jupiter)
I found a nice Wikipedia picture showing their location relative to Jupiter.
Out of curiosity - probably just to rediscover well-known asteroid properties - I tried a few exercises in R.
First of all, I checked if the MPC list was loaded correctly in R, counting the number of asteroids in the above two groups:
> summary(p$Ln)
  L4   L5
5651 3421
This is correct: the same numbers are reported in the MPC list.
We can also take a look at the overall statistics.
To make them easier to read, I split them into a few parts:
> summary(p[,c(2:4)])
       q               Q               H
 Min.   :3.577   Min.   :5.069   Min.   : 7.28
 1st Qu.:4.708   1st Qu.:5.426   1st Qu.:13.22
 Median :4.850   Median :5.557   Median :13.90
 Mean   :4.825   Mean   :5.580   Mean   :13.65
 3rd Qu.:4.980   3rd Qu.:5.698   3rd Qu.:14.30
 Max.   :5.328   Max.   :6.753   Max.   :18.50
                                 NA's   :1
Note that H has one NA value: for (2010 GE) the H value is not available.
> summary(p[,c(5:7)])
       M             Peri.             Node
 Min.   :  0.0   Min.   :  0.00   Min.   :  0.00
 1st Qu.:210.1   1st Qu.: 89.78   1st Qu.: 88.95
 Median :272.3   Median :179.50   Median :169.65
 Mean   :240.8   Mean   :179.89   Mean   :173.14
 3rd Qu.:310.4   3rd Qu.:271.60   3rd Qu.:256.10
 Max.   :360.0   Max.   :360.00   Max.   :360.00
> summary(p[,c(8:10)])
     Incl.             e                 a
 Min.   : 0.10   Min.   :0.00100   Min.   :4.952
 1st Qu.: 7.10   1st Qu.:0.04400   1st Qu.:5.165
 Median :11.30   Median :0.06700   Median :5.204
 Mean   :13.69   Mean   :0.07255   Mean   :5.203
 3rd Qu.:19.20   3rd Qu.:0.09300   3rd Qu.:5.243
 Max.   :57.20   Max.   :0.29800   Max.   :5.419
Let's now compare the Greek and the Trojan groups to see if we can find a significant difference between their parameters.
I feel that for this analysis it is better to use a graphical approach.
This is done in R thanks to the wonderful "ggplot2" package by H. Wickham.
For example, we can look at the distribution using a box plot (showing the median and quartiles, with whiskers and outlying points for the extremes).
Let's start with H mag:
H Mag
There is no significant difference between the two groups, even though a couple of asteroids in the L4 group might be brighter than those in the L5 group, and the opposite might also hold ... (but here I am probably reading too much into it).
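A minimal sketch of how such a box plot can be drawn with ggplot2. The data frame below is a simulated stand-in, not the MPC data; only the column names `Ln` and `H` follow the post, and the axis labels are my own choice.

```r
library(ggplot2)

# Simulated stand-in for the real data frame p (values are NOT MPC data).
set.seed(1)
p <- data.frame(
  Ln = factor(rep(c("L4", "L5"), times = c(60, 40))),
  H  = rnorm(100, mean = 13.6, sd = 1)
)

# One box per Lagrangian point. By default the whiskers reach the most
# extreme points within 1.5 * IQR of the box; points beyond are drawn as dots.
g <- ggplot(p, aes(x = Ln, y = H)) +
  geom_boxplot() +
  labs(x = "Lagrangian point", y = "H (absolute magnitude)")

print(g)
```

Swapping `H` for any other column name (for example `q` or `Q`) would give the corresponding plot for that parameter.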
Similarly, we can do the same analysis for the orbital parameters.
Perihelion
Aphelion
No significant difference so far.
Mean Anomaly
Let's see the distribution of mean anomaly M:
In this case one might suspect a small difference (the L5 median seems lower than the L4 median); however, a t test shows that this is not the case. The p-value is greater than 0.05 (the 5% significance level), so we have no reason to reject the null hypothesis.
> t.test(M~Ln,p)
Welch Two Sample t-test
data: M by Ln
t = 1.5564, df = 7797.9, p-value = 0.1197
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.8426388 7.3362908
sample estimates:
mean in group L4 mean in group L5
242.0441 238.7973
One curious fact: the distribution of M, for both the L4 and L5 groups, has two maxima. It is not clear why, or whether this is noteworthy:
We can continue our analysis with the other parameters ...
Argument of Perihelion
Ascending Node
Again, we might have the impression that there is a slight difference in the mean value of Node.
Let's see if this is really significant:
> t.test(Node~Ln,p)
Welch Two Sample t-test
data: Node by Ln
t = 3.8699, df = 7360.8, p-value = 0.0001098
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
4.14388 12.65160
sample estimates:
mean in group L4 mean in group L5
176.3043 167.9066
Apparently yes! Looking at the p-value, we should reject the null hypothesis and accept the alternative: the true difference in means for the Ascending Node is not equal to 0, and the 95% confidence interval puts it between about 4 and 13 degrees.
However, one could object that the above t test is not valid because the distribution does not have a nice Gaussian shape:
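One way to sidestep the Gaussian assumption is a rank-based test. The sketch below uses the Wilcoxon rank-sum test from base R, which is my suggestion and not something the original analysis ran; the data frame is simulated, so the printed p-value says nothing about the real Trojans. Node is an angle, so neither this test nor the t test accounts for the wrap-around at 360 degrees.

```r
# Simulated stand-in for p (NOT MPC data): Node is uniform on [0, 360),
# which is clearly non-Gaussian.
set.seed(42)
p <- data.frame(
  Ln   = factor(rep(c("L4", "L5"), times = c(300, 200))),
  Node = runif(500, min = 0, max = 360)
)

# Wilcoxon rank-sum (Mann-Whitney) test: compares the two groups through
# ranks, so it does not assume normally distributed values.
w <- wilcox.test(Node ~ Ln, data = p)
print(w$p.value)
```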
Another curious fact (at least for me!) appears when you take the sum of Peri. and Node (this sum is the longitude of perihelion).
This is the distribution of Node + Peri.; there is a second relative maximum.
One can also note that the curves are almost symmetrical with respect to the black vertical line that shows the (Node + Peri.) value for the planet Jupiter; this is just a consequence of the asteroids being around L4 and L5:
Looking at the above graph, as Node + Peri. increases we see first a prevalence of L4, then L5, and then again L4 and L5. Another way to see this:
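The alternation can also be tabulated rather than plotted. The sketch below bins the sum and prints the share of each group per bin. It assumes the sum is left unwrapped at this point (0 to 720 degrees), as the later remark about subtracting 360 suggests; the 90-degree bin width is an arbitrary choice of mine, and the data are simulated, so no alternation will show up in this output.

```r
# Simulated stand-in for p (NOT MPC data); only the column names follow the post.
set.seed(7)
n <- 400
p <- data.frame(
  Ln    = factor(sample(c("L4", "L5"), n, replace = TRUE)),
  Node  = runif(n, 0, 360),
  Peri. = runif(n, 0, 360)
)

# Raw sum of the two angles, left unwrapped so it runs from 0 to 720.
s <- p$Node + p$Peri.

# Eight 90-degree bins: [0,90), [90,180), ..., [630,720).
bins <- cut(s, breaks = seq(0, 720, by = 90), right = FALSE)

# Share of L4 and L5 inside each bin (every non-empty row sums to 1).
shares <- prop.table(table(bins, Ln = p$Ln), margin = 1)
print(round(shares, 2))
```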
Let's now look at Inclination:
Inclination
Here the t test shows this:
> t.test(Incl.~Ln,p)
Welch Two Sample t-test
data: Incl. by Ln
t = -26.047, df = 6329.1, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-5.157120 -4.435177
sample estimates:
mean in group L4 mean in group L5
11.87947 16.67562
So the difference in Inclination seems to be really significant.
Looking at the p-value, we should reject the null hypothesis and accept the alternative: the true difference in means for the Inclination is not equal to 0, and the 95% confidence interval puts it between about 4.4 and 5.2 degrees, with L5 the more inclined group.
Again, this is not necessarily true because the distribution does not have a nice Gaussian shape ... but it is true that the two distributions are different:
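The claim that the two distributions differ can be checked without any assumption about their shape. A minimal sketch using the two-sample Kolmogorov-Smirnov test from base R follows; this test is my addition, not part of the original analysis, and the inclinations below are simulated from two exponential distributions purely for illustration.

```r
# Simulated stand-in for p (NOT MPC data): skewed, non-Gaussian inclinations.
set.seed(3)
p <- data.frame(
  Ln    = factor(rep(c("L4", "L5"), times = c(300, 200))),
  Incl. = c(rexp(300, rate = 1 / 12), rexp(200, rate = 1 / 17))
)

# Two-sample Kolmogorov-Smirnov test: compares the two empirical
# distribution functions and makes no normality assumption.
k <- ks.test(p$Incl.[p$Ln == "L4"], p$Incl.[p$Ln == "L5"])
print(k)
```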
Let's now look at semi-major axis and eccentricity:
Semi-major axis
Eccentricity
In conclusion, for semi-major axis and eccentricity there seems to be no significant difference between the Greeks and the Trojans.
However, for both groups, the eccentricity distribution seems to be a little skewed:
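The skew can also be put into a number. The sketch below computes a moment-based sample skewness per group; `skewness` is a hypothetical helper defined here (base R has no built-in one), and the eccentricities are simulated from a beta distribution, not taken from the MPC list.

```r
# Hypothetical helper: moment-based sample skewness, ignoring NA values.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Simulated stand-in for p (NOT MPC data): right-skewed values in (0, 1).
set.seed(5)
p <- data.frame(
  Ln = factor(rep(c("L4", "L5"), times = c(300, 200))),
  e  = rbeta(500, shape1 = 2, shape2 = 25)
)

# Skewness per group; a positive value indicates a longer right tail.
sk <- tapply(p$e, p$Ln, skewness)
print(sk)
```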
OneR Rule
Another "game" that we can play: let's imagine that we are given a Trojan asteroid but do not know whether it belongs to the "Greek" camp or the "Trojan" camp.
We want to see if the OneR algorithm can find a decision rule based on the orbital parameters.
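library(OneR)  # the package name is case-sensitive
data <- optbin(Ln ~ ., data = p)     # bin the numeric columns with respect to Ln
model <- OneR(data, verbose = TRUE)  # pick the single best predictor
The result is: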
> summary(model)
Call:
OneR.data.frame(x = data, verbose = TRUE)
Rules:
If Incl. = (0.0429,16.7] then Ln = L4
If Incl. = (16.7,57.3] then Ln = L5
Accuracy:
5921 of 9071 instances classified correctly (65.27%)
Contingency table:
    Incl.
Ln   (0.0429,16.7] (16.7,57.3] Sum
  L4        * 4334        1317 5651
  L5          1833      * 1587 3420
  Sum         6167        2904 9071
---
Maximum in each column: '*'
Pearson's Chi-squared test:
X-squared = 521.19, df = 1, p-value < 2.2e-16
So OneR chooses Inclination as the best single parameter for the decision, but the accuracy is only about 65%. (Only 9071 of the 9072 asteroids appear here, presumably because the row with the missing H value is dropped.)
We can give the system a little help by adding a new column np = Node + Peri. and see whether the algorithm can find a better rule (this time, we subtract 360 when the angle is greater than 360).
The answer is yes: the accuracy of the new rule increases to about 80%.
> summary(model)
Call:
OneR.data.frame(x = data, verbose = TRUE)
Rules:
If np = (-0.36,225] then Ln = L4
If np = (225,360] then Ln = L5
Accuracy:
7218 of 9071 instances classified correctly (79.57%)
Contingency table:
     np
Ln    (-0.36,225] (225,360] Sum
  L4       * 4812       839 5651
  L5         1014    * 2406 3420
  Sum        5826      3245 9071
---
Maximum in each column: '*'
Pearson's Chi-squared test:
X-squared = 2854.3, df = 1, p-value < 2.2e-16
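The np column and the rule reported above can be reproduced by hand in base R. The sketch below builds np as described (subtracting 360 when the sum exceeds 360) and applies the cut at 225 taken from the summary output. The data frame and its labels are simulated, with the labels deliberately tied to np, so the accuracy it prints is not the 79.57% obtained on the MPC list.

```r
# Simulated stand-in for p (NOT MPC data); only the column names follow the post.
set.seed(11)
n <- 300
p <- data.frame(Node = runif(n, 0, 360), Peri. = runif(n, 0, 360))

# New column: Node + Peri., minus 360 whenever the sum exceeds 360.
s <- p$Node + p$Peri.
p$np <- ifelse(s > 360, s - 360, s)

# Fake labels loosely tied to np so the rule has something to find:
# start from the rule itself, then flip about 20% of the labels at random.
p$Ln <- ifelse(p$np <= 225, "L4", "L5")
flip <- runif(n) < 0.2
p$Ln[flip] <- ifelse(p$Ln[flip] == "L4", "L5", "L4")
p$Ln <- factor(p$Ln)

# Apply the rule reported by OneR: np in (-0.36, 225] -> L4, otherwise L5.
pred <- factor(ifelse(p$np <= 225, "L4", "L5"), levels = levels(p$Ln))

# Contingency table and overall accuracy of that single rule.
print(table(Ln = p$Ln, predicted = pred))
accuracy <- mean(pred == p$Ln)
print(accuracy)
```

On the real data frame, the same np column can be added to p before rerunning the optbin and OneR calls shown earlier.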
Kind Regards,
Alessandro Odasso