{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# RUBIX pipeline in steps\n", "\n", "`RUBIX` is designed as a linear pipeline, where the individual functions are called and constructed as a pipeline. This allows as to execude the whole data transformation from a cosmological hydrodynamical simulation of a galaxy to an IFU cube in two lines of code. To get a better sense, what is happening during the execution of the pipeline, this notebook splits the pipeline in small steps.\n", "\n", "This notebook contains the functions that are called inside the rubix pipeline. To see, how the pipeline is execuded, we refer to the notebook `rubix_pipeline_single_function.ipynb`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1: Config\n", "\n", "The `config` contains all the information needed to run the pipeline. Those are run specfic configurations. Currently we just support Illustris as simulation, but extensions to other simulations (e.g. NIHAO) are planned.\n", "\n", "For the `config` you can choose the following options:\n", "- `pipeline`: you specify the name of the pipeline that is stored in the yaml file in rubix/config/pipeline_config.yml\n", "- `logger`: RUBIX has implemented a logger to report the user, what is happening during the pipeline execution and give warnings\n", "- `data - args - particle_type`: load only stars particle (\"particle_type\": [\"stars\"]) or only gas particle (\"particle_type\": [\"gas\"]) or both (\"particle_type\": [\"stars\",\"gas\"])\n", "- `data - args - simulation`: choose the Illustris simulation (e.g. \"simulation\": \"TNG50-1\")\n", "- `data - args - snapshot`: which time step of the simulation (99 for present day)\n", "- `data - args - save_data_path`: set the path to save the downloaded Illustris data\n", "- `data - load_galaxy_args - id`: define, which Illustris galaxy is downloaded\n", "- `data - load_galaxy_args - reuse`: if True, if in th esave_data_path directory a file for this galaxy id already exists, the downloading is skipped and the preexisting file is used\n", "- `data - subset`: only a defined number of stars/gas particles is used and stored for the pipeline. This may be helpful for quick testing\n", "- `simulation - name`: currently only IllustrisTNG is supported\n", "- `simulation - args - path`: where the data is stored and how the file will be named\n", "- `output_path`: where the hdf5 file is stored, which is then the input to the RUBIX pipeline\n", "- `telescope - name`: define the telescope instrument that is observing the simulation. Some telescopes are predefined, e.g. MUSE. If your instrument does not exist predefined, you can easily define your instrument in rubix/telescope/telescopes.yaml\n", "- `telescope - psf`: define the point spread function that is applied to the mock data\n", "- `telescope - lsf`: define the line spread function that is applied to the mock data\n", "- `telescope - noise`: define the noise that is applied to the mock data\n", "- `cosmology`: specify the cosmology you want to use, standard for RUBIX is \"PLANCK15\"\n", "- `galaxy - dist_z`: specify at which redshift the mock-galaxy is observed\n", "- `galaxy - rotation`: specify the orientation of the galaxy. You can set the types edge-on or face-on or specify the angles alpha, beta and gamma as rotations around x-, y- and z-axis\n", "- `ssp - template`: specify the simple stellar population lookup template to get the stellar spectrum for each stars particle. In RUBIX frequently \"BruzualCharlot2003\" is used." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# NBVAL_SKIP\n", "import os\n", "config = {\n", " \"pipeline\":{\"name\": \"calc_ifu\"},\n", " \n", " \"logger\": {\n", " \"log_level\": \"DEBUG\",\n", " \"log_file_path\": None,\n", " \"format\": \"%(asctime)s - %(name)s - %(levelname)s - %(message)s\",\n", " },\n", " \"data\": {\n", " \"name\": \"IllustrisAPI\",\n", " \"args\": {\n", " \"api_key\": os.environ.get(\"ILLUSTRIS_API_KEY\"),\n", " \"particle_type\": [\"stars\"],\n", " \"simulation\": \"TNG50-1\",\n", " \"snapshot\": 99,\n", " \"save_data_path\": \"data\",\n", " },\n", " \n", " \"load_galaxy_args\": {\n", " \"id\": 12,\n", " \"reuse\": True,\n", " },\n", "\n", " \"subset\": {\n", " \"use_subset\": True,\n", " \"subset_size\": 1000,\n", " },\n", " },\n", " \"simulation\": {\n", " \"name\": \"IllustrisTNG\",\n", " \"args\": {\n", " \"path\": \"data/galaxy-id-12.hdf5\",\n", " },\n", " \n", " },\n", " \"output_path\": \"output\",\n", "\n", " \"telescope\":\n", " {\"name\": \"MUSE\",\n", " \"psf\": {\"name\": \"gaussian\", \"size\": 5, \"sigma\": 0.6},\n", " \"lsf\": {\"sigma\": 0.5},\n", " \"noise\": {\"signal_to_noise\": 1,\"noise_distribution\": \"normal\"},},\n", " \n", " \"cosmology\":\n", " {\"name\": \"PLANCK15\"},\n", " \n", " \"galaxy\":\n", " {\"dist_z\": 0.1,\n", " \"rotation\": {\"type\": \"edge-on\"},\n", " },\n", " \"ssp\": {\n", " \"template\": {\n", " \"name\": \"BruzualCharlot2003\"\n", " },\n", " }, \n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: RUBIX data format\n", "\n", "First, we have to download the simulation data from the Illustris webpage and store it and transform it to the `rubixdata` format. The `rubixdata` format is a unige format for the `pipeline`. Any simulated galaxy can be transformed in the `rubixdata` format, which enables `RUBIX` to deal with all kind of cosmological hydrodynamical simulations of galaxies. For more deatails of this step, please have a look in the notebook `create_rubix_data.ipynb`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# NBVAL_SKIP\n", "from rubix.core.data import convert_to_rubix, prepare_input\n", "\n", "convert_to_rubix(config) # Convert the config to rubix format and store in output_path folder\n", "rubixdata = prepare_input(config) # Prepare the input for the pipeline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can simply access the data of the galaxy, e.g. the stellar coordinates by `rubixdata.stars.coords`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# NBVAL_SKIP\n", "import matplotlib.pyplot as plt\n", "# Make a scatter plot of the stars coordinates\n", "plt.scatter(rubixdata.stars.coords[:,0], rubixdata.stars.coords[:,1], s=1)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3: Rotation\n", "\n", "In the `config` we specify, how the galaxy should be orientated. In this example we want to orientate the galaxy `edge-on`. We plot the coordinates again and see that they are now rotated." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# NBVAL_SKIP\n", "from rubix.core.rotation import get_galaxy_rotation\n", "rotate = get_galaxy_rotation(config)\n", "\n", "rubixdata = rotate(rubixdata)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Make a scatter plot of the stars coordinates after rotation\n", "plt.scatter(rubixdata.stars.coords[:,0], rubixdata.stars.coords[:,1], s=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4: Filter particles\n", "\n", "All particles outside field of view of the telescope are filtered. This has to be done, because we later bin the particles to the IFU grid and particles outside the arperture would make strange artefacts." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# NBVAL_SKIP\n", "from rubix.core.telescope import get_filter_particles\n", "filter_particles = get_filter_particles(config)\n", "\n", "rubixdata = filter_particles(rubixdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 5: Spaxel assignment\n", "\n", "We have an telescope aperture and a spatial resolution, which results in a spatial grid for the IFU cube. We can now assign the stars particles to the different spaxels in the IFU cube, i.e. define to which spaxel the stellar light of each stars particle contribute." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# NBVAL_SKIP\n", "from rubix.core.telescope import get_spaxel_assignment\n", "bin_particles = get_spaxel_assignment(config)\n", "\n", "rubixdata = bin_particles(rubixdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 6: Reshape data\n", "\n", "At the moment we have to reshape the rubix data that we can split the data on multiple GPUs. We plan to move from pmap to shard_map. Then this step should not be necessary any more. This step has purely computational reason and no physics motivated reason." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# NBVAL_SKIP\n", "from rubix.core.data import get_reshape_data\n", "reshape_data = get_reshape_data(config)\n", "\n", "rubixdata = reshape_data(rubixdata)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 7: Spectra calculation\n", "\n", "This is the heart of the `pipeline`. Now we do the lookup for the spectrum for each stellar particle. For the simple stellar population model by `BruzualCharlot2003`, each stellar particle gets a spectrum assigned based on its age and metallicity. In the plot we can see that the spectrum differs for different stellar particles." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# NBVAL_SKIP\n", "from rubix.core.ifu import get_calculate_spectra\n", "calcultae_spectra = get_calculate_spectra(config)\n", "\n", "rubixdata = calcultae_spectra(rubixdata)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# NBVAL_SKIP\n", "import jax.numpy as jnp\n", "\n", "plt.plot(jnp.arange(len(rubixdata.stars.spectra[0][0][:])), rubixdata.stars.spectra[0][0][:])\n", "plt.plot(jnp.arange(len(rubixdata.stars.spectra[0][0][:])), rubixdata.stars.spectra[0][1][:])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 8: Scaling by mass\n", "\n", "The stellar spectra have to be scaled by the stellar mass. 
 { "cell_type": "markdown", "metadata": {}, "source": [
 "## Step 8: Scaling by mass\n",
 "\n",
 "The stellar spectra have to be scaled by the stellar mass: when the spectra are later summed within a spaxel, more massive stellar particles should contribute more to the spaxel spectrum than less massive ones."
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "from rubix.core.ifu import get_scale_spectrum_by_mass\n",
 "scale_spectrum_by_mass = get_scale_spectrum_by_mass(config)\n",
 "\n",
 "rubixdata = scale_spectrum_by_mass(rubixdata)"
 ] },
 { "cell_type": "markdown", "metadata": {}, "source": [
 "## Step 9: Doppler shifting and resampling\n",
 "\n",
 "The stellar particles are not at rest, so the emitted light is Doppler shifted with respect to the observer. Before adding up all stellar spectra in each spaxel, we Doppler shift the spectra according to their particle velocities and resample them onto the wavelength grid of the observing instrument."
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "from rubix.core.ifu import get_doppler_shift_and_resampling\n",
 "doppler_shift_and_resampling = get_doppler_shift_and_resampling(config)\n",
 "\n",
 "rubixdata = doppler_shift_and_resampling(rubixdata)"
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "from rubix.core.pipeline import RubixPipeline\n",
 "\n",
 "pipe = RubixPipeline(config)\n",
 "\n",
 "# Wavelength grid of the telescope, used as the x-axis for the resampled spectra\n",
 "wave = pipe.telescope.wave_seq\n",
 "print(wave)\n",
 "print(rubixdata.stars.spectra[0][0][:])\n",
 "\n",
 "plt.plot(wave, rubixdata.stars.spectra[0][0][:])\n",
 "plt.plot(wave, rubixdata.stars.spectra[0][1][:])"
 ] },
 { "cell_type": "markdown", "metadata": {}, "source": [
 "## Step 10: Datacube\n",
 "\n",
 "Now we can add up all stellar spectra that contribute to one spaxel and obtain the IFU datacube. The plot shows the spatial dimensions of the `datacube`, summed over the wavelength dimension."
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "from rubix.core.ifu import get_calculate_datacube\n",
 "calculate_datacube = get_calculate_datacube(config)\n",
 "\n",
 "rubixdata = calculate_datacube(rubixdata)"
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "datacube = rubixdata.stars.datacube\n",
 "img = datacube.sum(axis=2)\n",
 "plt.imshow(img, origin=\"lower\")"
 ] },
 { "cell_type": "markdown", "metadata": {}, "source": [
 "## Step 11: PSF\n",
 "\n",
 "The instrument and the Earth's atmosphere degrade the spatial resolution of the observational data and smooth it in the spatial dimensions. To take this effect into account, we convolve our datacube with a point spread function (PSF)."
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "from rubix.core.psf import get_convolve_psf\n",
 "convolve_psf = get_convolve_psf(config)\n",
 "\n",
 "rubixdata = convolve_psf(rubixdata)"
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "datacube = rubixdata.stars.datacube\n",
 "img = datacube.sum(axis=2)\n",
 "plt.imshow(img, origin=\"lower\")"
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "# Spectra of a central and a corner spaxel after the PSF convolution\n",
 "plt.plot(wave, datacube[12,12,:])\n",
 "plt.plot(wave, datacube[0,0,:])"
 ] },
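 { "cell_type": "markdown", "metadata": {}, "source": [
 "To build some intuition for what the PSF convolution does, the sketch below constructs a 5x5 Gaussian kernel with sigma = 0.6 pixels, i.e. the values set in the `telescope - psf` block of the config. This is only an illustration of the kernel shape under those assumptions; the kernel RUBIX builds internally (sampling, normalization) may differ in detail."
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "# Illustrative Gaussian PSF kernel using the config values (size=5, sigma=0.6).\n",
 "# This is a sketch, not necessarily the exact kernel used inside RUBIX.\n",
 "size, sigma = 5, 0.6\n",
 "offsets = jnp.arange(size) - (size - 1) / 2\n",
 "xx, yy = jnp.meshgrid(offsets, offsets)\n",
 "kernel = jnp.exp(-(xx**2 + yy**2) / (2 * sigma**2))\n",
 "kernel = kernel / kernel.sum()  # normalize so the total flux is preserved\n",
 "\n",
 "plt.imshow(kernel, origin=\"lower\")\n",
 "plt.colorbar(label=\"kernel weight\")"
 ] },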
 { "cell_type": "markdown", "metadata": {}, "source": [
 "## Step 12: LSF\n",
 "\n",
 "The instrument also affects the spectral resolution of the observational data and smooths it in the spectral dimension. To take this effect into account, we convolve our datacube with a line spread function (LSF)."
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "from rubix.core.lsf import get_convolve_lsf\n",
 "convolve_lsf = get_convolve_lsf(config)\n",
 "\n",
 "rubixdata = convolve_lsf(rubixdata)\n",
 "\n",
 "plt.plot(wave, rubixdata.stars.datacube[12,12,:])\n",
 "plt.plot(wave, rubixdata.stars.datacube[0,0,:])"
 ] },
 { "cell_type": "markdown", "metadata": {}, "source": [
 "## Step 13: Noise\n",
 "\n",
 "Observational data are never noise-free. We apply noise to our mock datacube to mimic real measurements."
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "from rubix.core.noise import get_apply_noise\n",
 "apply_noise = get_apply_noise(config)\n",
 "\n",
 "rubixdata = apply_noise(rubixdata)\n",
 "\n",
 "datacube = rubixdata.stars.datacube\n",
 "img = datacube.sum(axis=2)\n",
 "plt.imshow(img, origin=\"lower\")"
 ] },
 { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "# NBVAL_SKIP\n",
 "plt.plot(wave, rubixdata.stars.datacube[12,12,:])\n",
 "plt.plot(wave, rubixdata.stars.datacube[0,0,:])"
 ] },
 { "cell_type": "markdown", "metadata": {}, "source": [
 "## DONE!\n",
 "\n",
 "Congratulations, you have now created your own mock-observed IFU datacube step by step! Enjoy playing around with the RUBIX pipeline and doing amazing science with RUBIX :)"
 ] }
 ], "metadata": { "kernelspec": { "display_name": "rubix", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.14" } }, "nbformat": 4, "nbformat_minor": 2 }