WDC for Geophysics, Beijing(中国地球物理学科中心)
 
   

Author-submitted data information


ID 704
Title Contaminated zircon dataset and Random Forest-based Element Recovery Algorithm
Creator Pengfei Lv
Subject Machine Learning, Geochemistry, Zircon, Solid Earth
Publisher Xiukuan Zhao
Description Data set S1 Data original. (separate file)
Zircons from granite with full REE, compiled from the GEOROC database (DIGIS TEAM, 2024).
Data set S2 Data processing. (separate file)
Input data for machine learning training, obtained by preprocessing Data set S1.
Data set S3 updated_data_RF. (separate file)
Data set S1 data filled by the trained Random Forest (RF) model.
Data set S4 updated_data_SVR. (separate file)
Data set S1 data filled by the trained support vector regression (SVR) model.
Data set S5 updated_data_XGB. (separate file)
Data set S1 data filled by the trained XGBoost (XGB) model.
Data set S6 Jack_Hills. (separate file)
Jack Hills zircons with full REE, compiled from Bell et al. (2016).
Data set S7 updated_JH. (separate file)
Data set S6 data filled by the trained Random Forest model.
Data set S8 Global_detrital_zircon. (separate file)
Detrital zircons with full REE, compiled from Balica et al. (2020).
Data set S9 updated_data_GDZ. (separate file)
Data set S8 data filled by the trained Random Forest model.

Code:
This project contains two main scripts:
Code_Train
Train and evaluate models (Random Forest, XGBoost, and SVR) to predict rare earth element concentrations (e.g., La, Pr, Nd, Sm) using KFold cross-validation.
Code_Predicted
Use the previously trained models to predict and correct missing or low-quality rare earth element data (e.g., in `Jack_Hills.xlsx`).
For more details, see README.md.
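
The two-step workflow described above (cross-validated training, then filling missing values with the trained model) can be illustrated with a minimal sketch. This is not the code shipped in the archive: it assumes scikit-learn and NumPy, uses synthetic numbers rather than the zircon data, and the feature layout, hyperparameters, and variable names are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (NOT the zircon data set).
# X: hypothetical predictor elements; y_true: hypothetical target element.
rng = np.random.default_rng(0)
n_samples = 200
X = rng.normal(size=(n_samples, 5))
weights = np.array([0.5, -0.2, 0.8, 0.1, 0.3])
y_true = X @ weights + rng.normal(scale=0.1, size=n_samples)

# Simulate missing or unusable measurements in about 20% of the rows.
y_obs = y_true.copy()
y_obs[rng.random(n_samples) < 0.2] = np.nan
known = ~np.isnan(y_obs)

# Step 1 (analogous to Code_Train): evaluate a Random Forest with
# 5-fold cross-validation on the rows where the target is known.
model = RandomForestRegressor(n_estimators=100, random_state=0)
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X[known], y_obs[known], cv=cv, scoring="r2")
print("R2 per fold:", np.round(scores, 3))

# Step 2 (analogous to Code_Predicted): fit on all known rows and
# fill only the missing target values with model predictions.
model.fit(X[known], y_obs[known])
y_filled = y_obs.copy()
y_filled[~known] = model.predict(X[~known])
```

In this sketch the measured values are left untouched and only the gaps are replaced. Whether the archived scripts follow the same policy, and which elements they use as predictors, is documented in README.md rather than assumed here.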

Contributor Xinyu Zou
Date 13 March, 2025
Type
Format .xlsx, .py, .pkl
URL http://www.geophys.ac.cn/ArticleData/20250313PollutedZircons.zip
DOI 10.12197/2025GA005
Source
Language eng
Relation
Coverage
Rights Institute of Geology and Geophysics, Chinese Academy of Sciences