project

Plan: We plan to use an indirect method to detect young planets in planet-forming systems. Millimeter-wavelength imaging with the Atacama Large Millimeter/submillimeter Array (ALMA) has produced many images of protoplanetary disks showing rings and distinct gaps in systems such as HL Tau. Using these data together with simulations that have already been performed, we can train our model to predict the existence of exoplanets. The paper we referred to considered only the case of ring-planet interaction, whereas a planet might not actually exist in that ring. That model is also limited to low-mass planets. We would like to improve the training data to account for larger planets as well.
Dataset and algorithm: The dataset is provided by ALMA. We aim to build a deep neural network model using a multilayer perceptron.

Examples of image data by ALMA

References:
- Auddy, S. and Lin, M.-K. 2020. A Machine Learning Model to Infer Planet Masses from Gaps Observed in Protoplanetary Disks. The Astrophysical Journal, September 2020. https://iopscience.iop.org/article/10.3847/1538-4357/aba95d/pdf
- Auddy, S., Dey, R., Lin, M. and Hall, C. 2021. DPNNet-2.0 Part I: Finding hidden planets from simulated images of protoplanetary disk gaps. https://arxiv.org/pdf/2107.09086.pdf

PROJECT MIDWAY

The deep neural network is implemented mainly with the Sequential model from tensorflow.keras. Its input layer takes six variables: gap width, aspect ratio, viscosity, dust-to-gas ratio, Stokes number and density profile. This is followed by two hidden layers of 256 and 128 nodes and a single output node, which is the planet mass. The base code is given as follows:
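The project's Keras listing is not shown here. As a framework-independent illustration of the architecture just described (6 inputs, hidden layers of 256 and 128 nodes, 1 output), here is a minimal NumPy sketch of the forward pass. The ReLU activations, the random initialisation and the input batch are assumptions made for illustration and are not taken from the project code.

```python
import numpy as np

# Layer widths from the text: six inputs, two hidden layers, one planet-mass output.
LAYER_SIZES = [6, 256, 128, 1]

def init_params(sizes, rng):
    """Random weights and zero biases for each dense layer (initialisation is an assumption)."""
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def forward(params, x):
    """ReLU on the hidden layers (assumed), linear output for regression."""
    h = x
    for i, (w, b) in enumerate(params):
        h = h @ w + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)
    return h

rng = np.random.default_rng(0)
params = init_params(LAYER_SIZES, rng)
batch = rng.normal(size=(4, 6))  # four hypothetical, already-normalised disks
pred = forward(params, batch)    # shape (4, 1): one mass estimate per disk
n_params = sum(w.size + b.size for w, b in params)
```

A Keras Sequential model with Dense layers of the same widths would have the same number of trainable parameters as `n_params`.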

The primary part of the project involves designing a convolutional neural network (CNN) for image recognition on the data obtained from ALMA. For this we are currently trying the ResNet50 architecture, whose structure we have studied, and we may use AlexNet if it gives better results. As the optimizer in Keras we are using Adam, an adaptive variant of stochastic gradient descent. We might try Adadelta if time permits.
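To make the optimizer choice concrete, here is a minimal NumPy sketch of the Adam update rule applied to a toy quadratic loss. The learning rate, step count and toy objective are assumptions chosen for illustration; in the project the update is handled by Keras.

```python
import numpy as np

def adam_minimize(grad_fn, theta0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Plain Adam: running averages of the gradient and its square, with bias correction."""
    theta = np.asarray(theta0, dtype=float).copy()
    m = np.zeros_like(theta)  # first-moment (mean) estimate
    v = np.zeros_like(theta)  # second-moment (uncentred variance) estimate
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # correct the bias from zero initialisation
        v_hat = v / (1 - beta2 ** t)
        theta -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# Toy objective (hypothetical): f(theta) = |theta - target|^2, gradient 2 * (theta - target).
target = np.array([3.0, -2.0])
theta = adam_minimize(lambda th: 2.0 * (th - target), [0.0, 0.0])
```

The per-parameter scaling by the square root of `v_hat` is what distinguishes Adam from plain stochastic gradient descent with momentum.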

Isolation Forest

The Isolation Forest algorithm has been used to attempt to classify whether or not the disks host planets. It has been implemented with the sklearn library; the functions and classes used include make_pipeline, StandardScaler, GradientBoostingRegressor and IsolationForest. Anomalies are labelled -1, and the efficiency was assessed using Dust Gap as the parameter. Initial efficiencies are low, and we plan to improve them by tuning the parameters.
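A minimal sketch of this kind of pipeline, assuming scikit-learn is installed. The two features, the synthetic data and the hyperparameters (`n_estimators`, `contamination`) are hypothetical stand-ins rather than the project's values, and GradientBoostingRegressor is left out because its role is not described above.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-in data: columns are (dust gap width, aspect ratio).
typical = rng.normal(loc=[0.5, 0.05], scale=[0.05, 0.005], size=(200, 2))
unusual = rng.normal(loc=[1.5, 0.05], scale=[0.30, 0.005], size=(10, 2))
X = np.vstack([typical, unusual])

# Scale the features, then fit the isolation forest; contamination is a guess
# at the expected fraction of anomalies.
model = make_pipeline(
    StandardScaler(),
    IsolationForest(n_estimators=200, contamination=0.05, random_state=0),
)
labels = model.fit_predict(X)  # +1 for inliers, -1 for anomalies

n_anomalies = int((labels == -1).sum())
```

Whether an anomaly corresponds to a planet-hosting disk is an interpretation placed on the output; the algorithm itself only flags points that are easy to isolate.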

Contribution From Papers

References:

Yamaguchi, M., Tsukagoshi, T., Muto, T., Nomura, H., Nakazato, T., Ikeda, S., Tamura, M. and Kawabe, R. 2021. ALMA Super-resolution Imaging of T Tau: r = 12 au Gap in the Compact Dust Disk around T Tau N. https://arxiv.org/pdf/2110.00974.pdf
Yamaguchi, M., Akiyama, K., Tsukagoshi, T., Muto, T., Kataoka, A., Tazaki, F., Ikeda, S., Fukagawa, M., Honma, M. and Kawabe, R. 2020. Super-resolution Imaging of the Protoplanetary Disk HD 142527 Using Sparse Modeling. https://iopscience.iop.org/article/10.3847/1538-4357/ab899f/pdf

For image processing, most of the data were collected from ALMA (primarily DSHARP), with a few images from other sources. Some of these images were in the form of raw data and had to be processed to obtain the final image; this processing was done with NASA's HEASARC software. Since the image data were unlabelled, we undertook a literature survey to collect data on the disks. In total we had close to 120 images of protostellar disks, of which close to 60 were labelled with the planet mass.

The images obtained from ALMA are quite blurred, so the disk substructures are not clearly visible, which can hamper the results. To address this we apply super-resolution techniques to increase the resolution and clarity of the images. We had initially planned to implement sparse modelling but later used two other algorithms, BSRGAN and SwinIR. BSRGAN is a blind super-resolution network trained on low-resolution images synthesized by a degradation model, and SwinIR performs image restoration using the Swin Transformer.

This is the schematic illustration of the BSRGAN degradation model. BSRGAN is a blind super-resolver trained on paired low-resolution/high-resolution images. In the degradation model the sequence of operations is randomly shuffled: B_iso and B_aniso denote blurring with isotropic and anisotropic Gaussian kernels, Ds is the downsampling operation and N represents different types of noise.
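The shuffled degradation sequence can be illustrated with a heavily simplified NumPy sketch. It keeps only the four operations named above, uses plain decimation for downsampling and Gaussian noise for N, and all kernel sizes and noise levels are assumptions. The actual BSRGAN degradation model is considerably richer.

```python
import numpy as np

def gaussian_kernel(size, sigma_x, sigma_y, theta=0.0):
    """Normalised 2-D Gaussian kernel; sigma_x == sigma_y gives the isotropic case."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotate the kernel axes
    yr = -x * np.sin(theta) + y * np.cos(theta)
    k = np.exp(-0.5 * ((xr / sigma_x) ** 2 + (yr / sigma_y) ** 2))
    return k / k.sum()

def blur(img, kernel):
    """Filter a 2-D image with the kernel, using reflect padding at the borders."""
    half = kernel.shape[0] // 2
    padded = np.pad(img, half, mode="reflect")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out

def degrade(hr, scale=2, rng=None):
    """Apply B_iso, B_aniso, D_s and N in a random order to synthesise a low-resolution image."""
    rng = np.random.default_rng() if rng is None else rng
    steps = {
        "B_iso": lambda im: blur(im, gaussian_kernel(7, 1.5, 1.5)),
        "B_aniso": lambda im: blur(im, gaussian_kernel(7, 3.0, 1.0, rng.uniform(0, np.pi))),
        "D_s": lambda im: im[::scale, ::scale],                   # simple decimation
        "N": lambda im: im + rng.normal(0.0, 0.01, im.shape),     # additive Gaussian noise
    }
    names = list(steps)
    order = [names[i] for i in rng.permutation(len(names))]
    lr = hr.astype(float)
    for name in order:
        lr = steps[name](lr)
    return lr, order

rng = np.random.default_rng(0)
hr = rng.random((64, 64))  # hypothetical high-resolution image
lr, order = degrade(hr, scale=2, rng=rng)
```

Pairs of `hr` and `lr` produced this way are the kind of paired data a blind super-resolver is trained on.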
These algorithms were applied to all images. These are the examples:

ResNet50 is a variant of the ResNet model with 48 convolution layers along with 1 max-pool and 1 average-pool layer. With ResNet50 we can train very deep models and still obtain good accuracy. As the number of layers in a plain deep network increases, its accuracy saturates and then degrades rapidly. This is not due to overfitting, since the training error also increases. ResNet addresses this with identity shortcut connections: a deeper model can be built from the layers of a shallower one plus added layers that only need to learn an identity mapping, so in principle it should perform no worse than the shallower model. The figure shows the architecture of ResNet50.
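The identity shortcut can be sketched in a few lines of NumPy. Dense layers stand in for the convolutions, batch normalisation is omitted, and all shapes and weights are hypothetical. The point is only that the block outputs F(x) + x, so it reduces to the identity when the learned residual F is zero.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = ReLU(F(x) + x), where F(x) = ReLU(x @ w1) @ w2 is the learned residual."""
    f = relu(x @ w1) @ w2
    return relu(f + x)  # the "+ x" term is the identity shortcut

rng = np.random.default_rng(0)
x = np.abs(rng.normal(size=(3, 8)))  # non-negative features, as after a previous ReLU

# With zero weights the residual vanishes and the block passes its input through.
zero = np.zeros((8, 8))
y_identity = residual_block(x, zero, zero)

# With small random weights the block learns a perturbation around the identity.
w1 = rng.normal(scale=0.1, size=(8, 8))
w2 = rng.normal(scale=0.1, size=(8, 8))
y = residual_block(x, w1, w2)
```

Because an extra block can always fall back to the identity in this way, stacking more blocks does not have to increase the training error.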

Another part of the project involved predicting other disk parameters from the image of the disk. For this we used equations from Lodato et al. (2019) and Kanagawa et al. (2016). We have mainly focused on predicting the viscosity of the disk, the dust gap width and the gas gap.
The equation used in Lodato et al. is an empirical relation to infer planet masses from observed dust gap widths. The equation is:

where w_d is the dust gap width, h₀ is the disk's local aspect ratio and α is the viscosity parameter.
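For orientation, the two cited gap-width relations can be written as small Python functions. This is a sketch under assumptions: the forms and constants are the ones these relations are commonly quoted with (a gap width of k Hill radii with k = 5.5 for Lodato et al. 2019, and the gap-width scaling of Kanagawa et al. 2016), and they should be checked against the papers. The function and argument names are hypothetical, and this is not necessarily the exact equation used in the project.

```python
def lodato_mass_ratio(gap_width_ratio, k=5.5):
    """Planet-to-star mass ratio from a dust gap width, assuming width = k * R_Hill.

    gap_width_ratio is the gap width divided by the orbital radius of the gap.
    With R_Hill = R * (q / 3) ** (1 / 3), inverting gives q = 3 * (width / (k * R)) ** 3.
    """
    return 3.0 * (gap_width_ratio / k) ** 3

def kanagawa_mass_ratio(gap_width_ratio, aspect_ratio, alpha):
    """Planet-to-star mass ratio from gap width, aspect ratio and viscosity.

    Assumed form: q = 2.1e-3 * (width / R)^2 * (h / 0.05)^(3/2) * (alpha / 1e-3)^(1/2).
    """
    return (
        2.1e-3
        * gap_width_ratio ** 2
        * (aspect_ratio / 0.05) ** 1.5
        * (alpha / 1e-3) ** 0.5
    )

# Reference point of the assumed Kanagawa scaling: width/R = 1, h = 0.05, alpha = 1e-3.
q_reference = kanagawa_mass_ratio(1.0, 0.05, 1e-3)

# Round trip for the Hill-radius relation: a q = 1e-3 planet opens a gap of
# 5.5 Hill radii, and inverting that width should return q.
width_for_q = 5.5 * (1e-3 / 3.0) ** (1.0 / 3.0)
q_recovered = lodato_mass_ratio(width_for_q)
```

In the second relation a wider gap, a thicker disk or a higher viscosity all imply a more massive planet.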
To model the relationship between the planet mass and the disk parameters we used polynomial regression. The idea is that once we obtain the planet mass from the CNN (ResNet50), we can find the dust gap, gas width and viscosity from the predictions of these polynomial regressions. For this we use the pre-processed dataset we had worked on before, which was created from the numeric data used earlier. The general code for these polynomial regressions is:
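The project's listing is not shown here. As a minimal sketch of the approach, the following fits a polynomial to synthetic data with NumPy. The quadratic degree, the coefficients and the noise level are hypothetical and chosen only so that the example runs. The project's dataset and library choice may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: planet mass (input) against dust gap width (target).
planet_mass = np.linspace(0.1, 2.0, 40)
true_coeffs = [0.05, 0.2, 0.1]  # made-up quadratic, highest power first
dust_gap = np.polyval(true_coeffs, planet_mass) + rng.normal(0.0, 0.005, planet_mass.size)

# Least-squares polynomial fit; the degree is a modelling choice.
coeffs = np.polyfit(planet_mass, dust_gap, deg=2)
predict_dust_gap = np.poly1d(coeffs)

# Given a planet mass from the CNN, look up the predicted dust gap.
dust_gap_at_1 = float(predict_dust_gap(1.0))
```

The same pattern would be repeated with the gas width and the viscosity as targets, giving one fitted polynomial per disk parameter.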

To present an example case we take the protostellar disk V883 Orionis. V883 Orionis is a protostar in the constellation of Orion. It is assumed to be a member of the Orion Nebula cluster at 414±7 pc. The true mass of the primary planet in the disk is 0.897 Mꙩ. The predicted values are:
Planet Mass Prediction = 1.0698 Mꙩ
Gas Gap = 0.1408
Dust Gap = 0.5715
Viscosity of disk = 0.00466
The predictions for disk parameters other than the planet mass cannot be confirmed yet, as research is ongoing.

We plan to run the CNN for planet mass estimation after converting the images to grayscale (currently being implemented). This should speed up training, and we assume that patterns will be recognised better in grayscale. Another idea is to train on labelled simulation data together with labelled image data, increasing the size of the training set used to predict the properties of the disk (probably implemented with an image-generation GAN). The model could be improved if mixed data (numerical/categorical and image data) were available for all protoplanetary disks, or if they could be simulated. We believe an MLP-CNN algorithm can predict other disk properties better than what we have presented here.
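The grayscale conversion can be sketched as a weighted sum of the colour channels. The weights here are the ITU-R BT.601 luma coefficients, which is an assumption about the conversion, and the input image is a hypothetical placeholder.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB image to (H, W) using BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3] @ weights

# Hypothetical 4x4 pure-red test image with values in [0, 1].
rgb = np.zeros((4, 4, 3))
rgb[..., 0] = 1.0
gray = to_grayscale(rgb)
```

A single-channel input cuts the first convolution layer's input size by a factor of three, although a pretrained ResNet50 normally expects three channels, so the grayscale image may need to be repeated across channels or the first layer retrained.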

References

https://iopscience.iop.org/article/10.3847/1538-4357/aba95d/pdf - A Machine Learning Model to Infer Planet Masses from Gaps Observed in Protoplanetary Disks
https://arxiv.org/pdf/2107.09086.pdf - Finding hidden planets from simulated images of protoplanetary disk gaps
https://arxiv.org/abs/2110.00974 - ALMA Super-resolution Imaging of T Tau: r = 12 au Gap in the Compact Dust Disk around T Tau N.
https://academic.oup.com/pasj/article/68/3/43/2223288 - Mass constraint for a planet in a protoplanetary disk from the gap width
https://academic.oup.com/mnras/article-abstract/486/1/453/5423333?redirectedFrom=fulltext - The newborn planet population emerging from ring-like structures in discs

Detection of Planets from Gaps in Protoplanetary Disks

Project as a part of CS460 - Machine Learning (2021)