Results
Pearson Correlation Calculation and Dendrogram
Because of the website's display limits and the size of the figure, figure 7 below is intended only to give a general impression of the correlations; its resolution is too low to read individual values. The detailed results, including the Pearson correlation coefficients and the dendrogram clustering, are in the attached PNG files below (hm.png and hm_cut.png), which can be zoomed in and out. The colorbar is on the right side of the figure: blue indicates a negative correlation between two variables and red a positive one. The stronger the correlation, positive or negative, the deeper the color of the corresponding block in the heatmap. Groups of highly correlated elements (correlation coefficient ≥ 0.7) identified from figure 7 are listed below:
1) Ca-Mg; 2) Ni-Co-Mn; 3) Cu-Al-Sc-Cr-Ga-(Fe); 4) Th-Cs-La-(Ti)-U-W; 5) As-Bi; 6) V-Ag-P; 7) Be-Y; 8) Hf-Zr; 9) Nb-Rb-Sn; 10) Fe-S.
For the dendrogram, the cut line in the attached file hm_cut.png yields around 9 to 11 groups, because the line almost overlaps the last 3 branches. Taking the correlation coefficients and the dendrogram together, the variables can be clustered into 10 groups.
Among the grouped elements, the Ca-Mg group has a strong negative correlation with most of the other groups in the data set. Groups 2, 3 and 4 are relatively highly correlated with each other, as are groups 5 and 6.
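The grouping by correlation threshold can be illustrated with a minimal Python sketch. It assumes that elements belong to one group when they are linked by a chain of pairwise Pearson coefficients of at least 0.7, which is only one possible reading of the threshold; this is equivalent to cutting a single-linkage dendrogram built on the distance 1 - r at 0.3, while the linkage actually used for hm_cut.png is not stated here. The function names and the concentrations are made up for illustration and are not study data.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlated_groups(data, threshold=0.7):
    """Group variables connected by pairwise correlations >= threshold."""
    parent = {name: name for name in data}

    def find(name):
        # Follow parent links to the representative of the group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(data, 2):
        if pearson(data[a], data[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in data:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical concentrations for five samples (illustration only).
toy = {
    "Ca": [9.0, 7.5, 6.0, 3.0, 1.0],
    "Mg": [4.4, 3.9, 3.1, 1.4, 0.6],
    "Ni": [5.0, 9.0, 14.0, 22.0, 30.0],
    "Co": [2.0, 4.1, 6.5, 10.0, 14.2],
}
groups = correlated_groups(toy)
print(groups)
```

In this toy data Ca and Mg fall together while Ni and Co rise together, so the sketch returns two groups, and the Ca-Ni coefficient is negative, mirroring the kind of pattern described for the Ca-Mg group.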
Attached file: hm.png (file size: 3242 kb, file type: png)
Attached file: hm_cut.png (file size: 3207 kb, file type: png)
Discriminant Analysis
Discriminant analysis was performed on the log-transformed data set, excluding the variables Ta and Ge. The result is displayed in figure 8 (for a higher-resolution figure, please download discriminant_analysis.png).
Can1 explains 67.9% of the variance within the Mackenzie Mountain data set, and Can1 and Can2 together explain 79.7%. Ca and Mg clearly belong to the same group, and Na is positively correlated with the Ca-Mg group. The Ca-Mg-Na group is negatively correlated with the other groups along the Can1 direction. Mn, Co, Ni and Al can be regarded as one group, group 4, which is highly negatively correlated with group 1. The variance of Cd, S and Sc contributes to the grouping of groups 5 and 6. Sr is the dominant index for group 8. Variance in Zn and Ba contributes to the grouping of group 10, and Ag, V and Cr contribute to group 4. Groups 4 and 10 are highly negatively correlated with the Ca-Mg-Na group.
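As a rough illustration of how percentages such as those for Can1 and Can2 can arise, the sketch below computes the share of each canonical axis from the eigenvalues of W⁻¹B, where W is the pooled within-group scatter and B the between-group scatter. This assumes the reported figures are eigenvalue shares, as is common in canonical discriminant analysis; the software and settings used for figure 8 are not stated here. The function name, the three groups and the synthetic concentrations are hypothetical.

```python
import numpy as np

def canonical_variance_share(X, labels):
    """Share of between-group separation carried by each canonical axis."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    W = np.zeros((p, p))  # pooled within-group scatter
    B = np.zeros((p, p))  # between-group scatter
    for g in np.unique(labels):
        Xg = X[labels == g]
        centred = Xg - Xg.mean(axis=0)
        W += centred.T @ centred
        d = (Xg.mean(axis=0) - grand_mean)[:, None]
        B += len(Xg) * (d @ d.T)
    # Eigenvalues of W^-1 B; tiny negative or complex noise is discarded.
    eigvals = np.linalg.eigvals(np.linalg.solve(W, B)).real
    eigvals = np.sort(np.clip(eigvals, 0.0, None))[::-1]
    return eigvals / eigvals.sum()

# Synthetic, log-normally distributed "concentrations" for three made-up groups.
rng = np.random.default_rng(0)
group_means = np.array([[0.0, 0.0, 0.0], [1.5, 0.5, 0.0], [3.0, 0.0, 0.4]])
labels = np.repeat([0, 1, 2], 30)
raw = 10.0 ** (group_means[labels] + rng.normal(scale=0.3, size=(90, 3)))

X = np.log10(raw)  # log transform, as in the analysis described above
share = canonical_variance_share(X, labels)
print(np.round(share, 3))             # share per canonical axis
print(np.round(np.cumsum(share), 3))  # cumulative share (Can1, Can1 + Can2, ...)
```

With three groups only two axes can carry separation, so the third share is effectively zero. The cumulative sum is the quantity that corresponds to a statement such as "Can1 and Can2 together".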
Attached file: discriminant_analysis.png (file size: 51 kb, file type: png)
K-means Cluster Analysis
As a geological data set, the variables should be spatially sensitive. Rotational tools and correlation analysis alone, which do not consider the distance between samples, are therefore not enough to understand the data.
To incorporate spatial information into the analysis, k-means clustering with Euclidean distance matrices was performed on the original data. Several different numbers of clusters were tried, and 10 clusters yielded the best results. Table 2 records the group number and the count of samples in each group. The k-means cluster results were then plotted over regional geological data in ArcGIS, as shown in figure 9.
The groups forming the major patterns on the map are group 1 (red), 7 (orange), 9 (yellow) and 2 (forest green). These four groups cover the majority of our 1711 samples. Group 4 (pink) is highly associated with group 9 but contains fewer samples. As the map (figure 9) shows, the patterns created by groups 1, 7, 9 and 2 are aligned with the direction of the geological groups and sequences. When zoomed in, the map provides evidence that groups 9 and 2 are strongly controlled by local geological structures (thrusts and normal faults). Group 1 lies mainly on the lower Paleozoic Mackenzie Platform. The distribution of group 7 is highly associated with the Mackenzie Mountain Supergroup, and the distribution of groups 9 and 4 is associated with the Windermere Supergroup (including rift-related successions). The distribution of group 1 in general relates to the frontal thrust of the Cordilleran Orogen.
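The clustering step can be sketched in a few lines of Python. This is a minimal version of Lloyd's algorithm with Euclidean distance, not the implementation used for figure 9, which is not specified here. Comparing the within-cluster sum of squares across several cluster counts is one common way to judge the number of clusters and is an assumption, since the criterion behind the choice of 10 is not given. The function names and the eight toy points are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: assign samples to the nearest centre, then update centres."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance from every sample to every centre.
        dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centres = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break  # assignments have stabilised
        centres = new_centres
    return labels, centres

def within_ss(X, labels):
    """Total within-cluster sum of squared deviations from the cluster means."""
    X = np.asarray(X, dtype=float)
    return sum(
        float(np.sum((X[labels == j] - X[labels == j].mean(axis=0)) ** 2))
        for j in np.unique(labels)
    )

# Two well-separated toy groups of four samples each (illustration only).
points = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)

labels, centres = kmeans(points, k=2)
counts = np.bincount(labels, minlength=2)  # samples per group, as in a count table
print(counts)

for k in (1, 2, 3):
    trial_labels, _ = kmeans(points, k)
    print(k, within_ss(points, trial_labels))
```

The group labels returned by such a routine are arbitrary integers; it is the per-sample assignment, joined back to the sample coordinates, that would be plotted on a map.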
Classification And Regression Tree Analysis
CART analysis is based on the grouping results from the k-means clusters: as explained on the methodology page, the k-means grouping is treated as the response variable for the analysis. Figure 10 displays the CART regression tree, which shows that the most important factor affecting the grouping is the Mn concentration in the sample and the second most important variable is Ba. The results from the CART analysis suggest that the first split explains 49% of the variance.
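To make the idea of a split "explaining" a share of the variance concrete, the sketch below searches one predictor for the threshold that most reduces the variance of the response, which is the criterion used by regression trees. It assumes the group number is treated as a numeric response, consistent with the tree being described as a regression tree; the actual software and settings behind figure 10 are not given here. The function name and the Mn and Ba values are hypothetical.

```python
import numpy as np

def best_split(x, y):
    """Return (threshold, fraction of the variance of y explained) for the best
    binary split on x. Assumes y is not constant."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]
    total_ss = np.sum((y - y.mean()) ** 2)
    best_threshold, best_explained = None, 0.0
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue  # no threshold can separate equal values
        left, right = y[:i], y[i:]
        split_ss = (np.sum((left - left.mean()) ** 2)
                    + np.sum((right - right.mean()) ** 2))
        explained = 1.0 - split_ss / total_ss
        if explained > best_explained:
            best_threshold = (x[i - 1] + x[i]) / 2.0
            best_explained = explained
    return best_threshold, float(best_explained)

# Made-up predictor values and group numbers for six samples (illustration only).
predictors = {
    "Mn": [1.0, 2.0, 3.0, 10.0, 11.0, 12.0],
    "Ba": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
}
group = [1, 1, 1, 5, 5, 5]

# Rank predictors by how much variance their best first split explains.
ranking = sorted(
    ((best_split(values, group)[1], name) for name, values in predictors.items()),
    reverse=True,
)
threshold, explained = best_split(predictors["Mn"], group)
print(ranking[0][1], threshold, explained)
```

A full tree repeats this search inside each resulting subset; only the first split is shown because that is the quantity quoted above.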
Discussions and Conclusion
From all the multivariate analysis results above, it is clear that variances within the data set can be used to indicate the relationships between different variables. In both the dendrogram and the discriminant analysis results, Ca and Mg fall within the same group, suggesting a strong correlation between the two elements. Meanwhile, the Ca-Mg group displays a strongly negative correlation with all other groups (targeted mineralization elements such as Fe, Zn, Ag, etc.) under both distance-based and rotation-based techniques. From this we could infer that the host rocks of the mineralization may lack Ca and Mg; in other words, carbonate rocks such as dolomite or limestone may not host mineralization in the Mackenzie Mountain area. The metal elements of interest, such as Pb-Zn, Ni-Co-Mn, Cu-Cr-Fe, Ag and W, show strong grouping patterns under multivariate analysis, indicating that the silt geochemical results remain reliable as a potential exploration tool even though the sediments of the area have undergone all types of geological transformations.
The groups produced by the rotation-based discriminant analysis are quite different from those produced by distance-based methods such as k-means. However, the grouping of key variables such as Ni, Cu, Ag, Zn, Ba and Mn turned out to be similar under both techniques, suggesting inherent patterns underlying the data set.
From the k-means cluster patterns plotted in figure 9, we can conclude that the behavior of the variables within the samples is highly associated with the geological groups and structures in the Mackenzie Mountain area. According to previous literature and local geological surveys, mineral resources of different ore types in the Mackenzie Mountain area are controlled by local geological settings. In other words, the multivariate analysis results of geochemistry data from silt samples can provide reliable hints for further exploration.
In sum, from this study, we can conclude that:
1) Geochemistry data from the silt are reliable for geological exploration purposes;
2) Correlations among variables in the silt samples have proven to be meaningful to geological exploration in the Mackenzie Mountain area;
3) Ca-Mg, Mn, Ba, Fe-S and Ni-Co-Mn are potential indicators of mineralization.