{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "muslim-knight",
   "metadata": {},
   "source": [
    "# Superruns\n",
    "\n",
    "### Basic concept of a superrun:\n",
    "\n",
    "A superrun is made up of many regular runs  and helps us therefore to organize data in logic units and to load it faster. In the following notebook we will give some brief examples how superruns work and can be used to make analysts lives easier.\n",
    "\n",
    "\n",
    "Let's get started how we can define superruns. The example I demonstrate here is based on some dummy Record and Peak plugins. But it works in the same way for regular data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "mineral-indianapolis",
   "metadata": {},
   "outputs": [],
   "source": [
    "import strax\n",
    "import straxen"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "flush-smith",
   "metadata": {},
   "source": [
    "### Define context and create some dummy data:\n",
    "\n",
    "In the subsequent cells I create a dummy context and write some dummy-data. You can either read through it if you are interested or skip until **Define a superrun**. For the working examples on superruns you only need to know:\n",
    "\n",
    "* Superruns can be created with any of our regular online and offline contexts. \n",
    "* In the two cells below I define 3 runs and records for the run_ids 0, 1, 2. \n",
    "* The constituents of a superrun are called subruns which we call runs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "biblical-shame",
   "metadata": {},
   "outputs": [],
   "source": [
    "from strax.testutils import Records, Peaks, PeakClassification\n",
    "\n",
    "superrun_name = \"_superrun_test\"\n",
    "st = strax.Context(\n",
    "    storage=[\n",
    "        strax.DataDirectory(\n",
    "            \"./strax_data\", provide_run_metadata=True, readonly=False, deep_scan=True\n",
    "        )\n",
    "    ],\n",
    "    register=[Records, Peaks, PeakClassification],\n",
    "    config={\"bonus_area\": 42},\n",
    ")\n",
    "st.set_context_config({\"use_per_run_defaults\": False})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "suffering-burning",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "518184b9089146f083b20910b87d4b86",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading peaks: |          | 0.00 % [00:00<?]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "6d60889729cb4a0e8749485066d69a96",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading peaks: |          | 0.00 % [00:00<?]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "cbe097a3943f48d88f3d3b8089c269e5",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading peaks: |          | 0.00 % [00:00<?]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import datetime\n",
    "import pytz\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "import json\n",
    "from bson import json_util\n",
    "\n",
    "\n",
    "def _write_run_doc(context, run_id, time, endtime):\n",
    "    \"\"\"Function which writes a dummy run document.\"\"\"\n",
    "    run_doc = {\"name\": run_id, \"start\": time, \"end\": endtime}\n",
    "    with open(context.storage[0]._run_meta_path(str(run_id)), \"w\") as fp:\n",
    "        json.dump(run_doc, fp, sort_keys=True, indent=4, default=json_util.default)\n",
    "\n",
    "\n",
    "offset_between_subruns = 10\n",
    "\n",
    "now = datetime.datetime.now()\n",
    "now.replace(tzinfo=pytz.utc)\n",
    "subrun_ids = [str(r) for r in range(3)]\n",
    "\n",
    "for run_id in subrun_ids:\n",
    "    rr = st.get_array(run_id, \"peaks\")\n",
    "    time = np.min(rr[\"time\"])\n",
    "    endtime = np.max(strax.endtime(rr))\n",
    "\n",
    "    _write_run_doc(\n",
    "        st,\n",
    "        run_id,\n",
    "        now + datetime.timedelta(0, int(time)),\n",
    "        now + datetime.timedelta(0, int(endtime)),\n",
    "    )\n",
    "\n",
    "    st.set_config({\"secret_time_offset\": endtime + offset_between_subruns})  # untracked option\n",
    "    assert st.is_stored(run_id, \"peaks\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "departmental-ceremony",
   "metadata": {},
   "source": [
    "If we print now the lineage and hash for the three runs you will see it is equivalent to our regular data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "catholic-danish",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2-peaks-xia2iit6vb\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'peaks': ('Peaks',\n",
       "  '0.0.0',\n",
       "  {'bonus_area': 42, 'base_area': 0, 'give_wrong_dtype': False}),\n",
       " 'records': ('Records', '0.0.0', {'crash': False, 'dummy_tracked_option': 42})}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "print(st.key_for(\"2\", \"peaks\"))\n",
    "st.key_for(\"2\", \"peaks\").lineage"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "inner-morris",
   "metadata": {},
   "source": [
    "### Metadata of our subruns: \n",
    "\n",
    "To understand a bit better how our dummy data looks like we can have a look into the metadata for a single run. Each subrun is made of 10 chunks each containing 10 waveforms in 10 different channels. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "suited-shelter",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'chunk_target_size_mb': 200,\n",
       " 'chunks': [{'chunk_i': 0,\n",
       "   'end': 50,\n",
       "   'filename': 'peaks-xia2iit6vb-000000',\n",
       "   'filesize': 1323,\n",
       "   'first_endtime': 41,\n",
       "   'first_time': 40,\n",
       "   'last_endtime': 50,\n",
       "   'last_time': 49,\n",
       "   'n': 100,\n",
       "   'nbytes': 223500,\n",
       "   'run_id': '2',\n",
       "   'start': 40,\n",
       "   'subruns': None}],\n",
       " 'compressor': 'blosc',\n",
       " 'data_kind': 'peaks',\n",
       " 'data_type': 'peaks',\n",
       " 'dtype': \"[(('Start time since unix epoch [ns]', 'time'), '<i8'), (('Length of the interval in samples', 'length'), '<i4'), (('Width of one sample [ns]', 'dt'), '<i4'), (('Channel/PMT number', 'channel'), '<i2'), (('Classification of the peak(let)', 'type'), '|i1'), (('Integral across channels [PE]', 'area'), '<f4'), (('Integral per channel [PE]', 'area_per_channel'), '<f4', (100,)), (('Number of hits contributing at least one sample to the peak ', 'n_hits'), '<i4'), (('Waveform data in PE/sample (not PE/ns!)', 'data'), '<f4', (200,)), (('Waveform data in PE/sample (not PE/ns!), top array', 'data_top'), '<f4', (200,)), (('Peak widths in range of central area fraction [ns]', 'width'), '<f4', (11,)), (('Peak widths: time between nth and 5th area decile [ns]', 'area_decile_from_midpoint'), '<f4', (11,)), (('Does the channel reach ADC saturation?', 'saturated_channel'), '|i1', (100,)), (('Total number of saturated channels', 'n_saturated_channels'), '<i2'), (('Channel within tight range of mean', 'tight_coincidence'), '<i2'), (('Largest gap between hits inside peak [ns]', 'max_gap'), '<i4'), (('Maximum interior goodness of split', 'max_goodness_of_split'), '<f4'), (('Largest time difference between apexes of hits inside peak [ns]', 'max_diff'), '<i4'), (('Smallest time difference between apexes of hits inside peak [ns]', 'min_diff'), '<i4')]\",\n",
       " 'end': 50,\n",
       " 'lineage': {'peaks': ['Peaks',\n",
       "   '0.0.0',\n",
       "   {'base_area': 0, 'bonus_area': 42, 'give_wrong_dtype': False}],\n",
       "  'records': ['Records',\n",
       "   '0.0.0',\n",
       "   {'crash': False, 'dummy_tracked_option': 42}]},\n",
       " 'lineage_hash': 'xia2iit6vb',\n",
       " 'run_id': '2',\n",
       " 'start': 40,\n",
       " 'strax_version': '1.6.5',\n",
       " 'writing_ended': 1724569392.7568176,\n",
       " 'writing_started': 1724569392.7369668}"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "st.get_metadata(\"2\", \"peaks\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "charming-output",
   "metadata": {},
   "source": [
    "### Define a superrun:\n",
    "\n",
    "Defining a superrun is quite simple one has to call:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "accepted-routine",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "superrun_name:  _superrun_test \n",
      "subrun_ids:  ['0', '1', '2']\n"
     ]
    }
   ],
   "source": [
    "st.define_run(superrun_name, subrun_ids)\n",
    "print(\"superrun_name: \", superrun_name, \"\\nsubrun_ids: \", subrun_ids)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "prepared-strength",
   "metadata": {},
   "source": [
    "where the first argument is a string specifying the name of the superrun e.g. `_Kr83m_20200816`. Please note that superrun names must start with an underscore. \n",
    "\n",
    "The second argument is a list of run_ids of subruns the superrun should be made of. Please note that the definition of a superrun does not need any specification of a data_kind like peaks or event_info because it is a \"run\".\n",
    "\n",
    "By default, it is only allowed to store new runs under the usere's specified strax_data directory. In this example it is simply `./strax_data` and the run_meta data can be looked at via:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "basic-processing",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'comments': [{'comment': ''}],\n",
       " 'end': datetime.datetime(2024, 8, 25, 2, 5, 48, 884000),\n",
       " 'livetime': 30.0,\n",
       " 'mode': [''],\n",
       " 'name': '_superrun_test',\n",
       " 'source': [''],\n",
       " 'start': datetime.datetime(2024, 8, 25, 2, 4, 58, 884000),\n",
       " 'sub_run_spec': {'0': 'all', '1': 'all', '2': 'all'},\n",
       " 'tags': [{'name': ''}]}"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "st.run_metadata(\"_superrun_test\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "egyptian-alpha",
   "metadata": {},
   "source": [
    "The superrun-metadata contains a list of all subruns making up the superrun, the start and end time (in milliseconds) of the corresponding collections of runs and its naive livetime in nanoseconds without any corrections for deadtime.\n",
    "\n",
    "Please note that in the presented example the time difference between start and end time is 50 s while the live time is only about 30 s. This comes from the fact that I defined the time between two runs to be 10 s. It should be always kept in mind for superruns that livetime is not the same as the end - start of the superrun."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "linear-elite",
   "metadata": {},
   "source": [
    "The superun will appear in the run selection as any other run:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "delayed-reggae",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>number</th>\n",
       "      <th>mode</th>\n",
       "      <th>source</th>\n",
       "      <th>tags</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>0.0</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>1.0</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>2.0</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>_superrun_test</td>\n",
       "      <td>NaN</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>024399</td>\n",
       "      <td>24399.0</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name   number mode source tags\n",
       "0               0      0.0                 \n",
       "1               1      1.0                 \n",
       "2               2      2.0                 \n",
       "3  _superrun_test      NaN                 \n",
       "4          024399  24399.0                 "
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "st.select_runs()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "korean-encounter",
   "metadata": {},
   "source": [
    "### Loading data with superruns:\n",
    "\n",
    "Loading superruns can be done in two different ways. Lets try first the already implemented approach and compare the data with loading the individual runs separately:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "differential-rocket",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "d7a3a5f7d34e4c57a95e25fe2c7bbaa5",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading 3 runs:   0%|          | 0/3 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "aa104c2ff8b94ae4951c8571cceaf5e7",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading 3 runs:   0%|          | 0/3 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "0a663f8ceb064dd9a082758c9df23571",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading peaks: |          | 0.00 % [00:00<?]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "sub_runs = st.get_array(subrun_ids, \"peaks\")  # Loading all subruns individually like we are used to\n",
    "superrun = st.get_array(superrun_name, \"peaks\")  # Loading the superrun\n",
    "assert np.all(sub_runs[\"time\"] == superrun[\"time\"])  # Comparing if the data is the same"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "inclusive-weather",
   "metadata": {},
   "source": [
    "To increase the loading speed it can be allowed to skip the lineage check of the individual subruns:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "muslim-schedule",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "5c3c4b0a881a43dcb619f1a9ce5f5f47",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading 3 runs:   0%|          | 0/3 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Unknown config option _check_lineage_per_run_id; will do nothing.\n",
      "Invalid context option _check_lineage_per_run_id; will do nothing.\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "1bbf0e929fd14100b94589100c0e3d63",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading peaks: |          | 0.00 % [00:00<?]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "sub_runs = st.get_array(subrun_ids, \"peaks\")\n",
    "superrun = st.get_array(superrun_name, \"peaks\", _check_lineage_per_run_id=False)\n",
    "assert np.all(sub_runs[\"time\"] == superrun[\"time\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "approximate-reality",
   "metadata": {},
   "source": [
    "So how does this magic work? Under the hood a superrun first checks if the data of the different subruns has been created before. If not it will make the data for you. After that the data of the individual runs is loaded.\n",
    "\n",
    "The loading speed can be further increased if we rechunk and write the data of our superrun as \"new\" data to disk. This can be done easily for light weight data_types like peaks and above. Further, this allows us to combine multiple data_types if the same data_kind, like for example `event_info` and `cuts`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aboriginal-bedroom",
   "metadata": {},
   "source": [
    "### Writing a \"new\" superrun:\n",
    "\n",
    "To write a new superrun one has to set the corresponding context setting to true:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "aggressive-allah",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.set_context_config({\"write_superruns\": True})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "fifteen-history",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "st.is_stored(superrun_name, \"peaks\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "contrary-nursing",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "st.make(superrun_name, \"peaks\")\n",
    "st.is_stored(superrun_name, \"peaks\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "numerical-uganda",
   "metadata": {},
   "source": [
    "Lets see if the data is the same:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "documentary-granny",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "df6afaf88d8446eca4eb6fd50a55160a",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading 3 runs:   0%|          | 0/3 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Unknown config option _check_lineage_per_run_id; will do nothing.\n",
      "Invalid context option _check_lineage_per_run_id; will do nothing.\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "f988a903d0ef48cda7b2632808746036",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading peaks: |          | 0.00 % [00:00<?]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "sub_runs = st.get_array(subrun_ids, \"peaks\")\n",
    "superrun = st.get_array(superrun_name, \"peaks\", _check_lineage_per_run_id=False)\n",
    "assert np.all(sub_runs[\"time\"] == superrun[\"time\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "decent-marijuana",
   "metadata": {},
   "source": [
    "And the data will now shown as available in select runs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "periodic-magnet",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "389f7e5d940b4c51929295fff837da88",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Checking data availability:   0%|          | 0/1 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>name</th>\n",
       "      <th>number</th>\n",
       "      <th>mode</th>\n",
       "      <th>source</th>\n",
       "      <th>tags</th>\n",
       "      <th>peaks_available</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>0.0</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>1.0</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>2.0</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>_superrun_test</td>\n",
       "      <td>NaN</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             name  number mode source tags  peaks_available\n",
       "0               0     0.0                              True\n",
       "1               1     1.0                              True\n",
       "2               2     2.0                              True\n",
       "3  _superrun_test     NaN                              True"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "st.select_runs(available=(\"peaks\",))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "conservative-pencil",
   "metadata": {},
   "source": [
    "If a some data does not exist for a super run we can simply created it via the superrun_id. This will not only create the data of the rechunked superrun but also the data of the subrungs if not already stored:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "authentic-marijuana",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "st.is_stored(subrun_ids[0], \"peak_classification\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "continental-baptist",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "st.make(superrun_name, \"peak_classification\")\n",
    "st.is_stored(subrun_ids[0], \"peak_classification\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "portuguese-imaging",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "998fb135d8014180ba99faaece452db2",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading peak_classification: |          | 0.00 % [00:00<?]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "peaks = st.get_array(superrun_name, \"peak_classification\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "considerable-flavor",
   "metadata": {},
   "source": [
    "**Some developer information:** \n",
    "\n",
    "In case of a stored and rechunked superruns every chunk has also now some additional information about the individual subruns it is made of:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "norman-trouble",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "706bbdbc6b614d9c8b2a74025bd94974",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Loading peaks: |          | 0.00 % [00:00<?]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "({'0': {'end': 10, 'start': 0},\n",
       "  '1': {'end': 30, 'start': 20},\n",
       "  '2': {'end': 50, 'start': 40}},\n",
       " '_superrun_test')"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "for chunk in st.get_iter(superrun_name, \"peaks\"):\n",
    "    chunk\n",
    "chunk.subruns, chunk.run_id"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "piano-invitation",
   "metadata": {},
   "source": [
    "The same goes for the meta data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "expanded-inspection",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'chunk_i': 0,\n",
       "  'end': 50,\n",
       "  'filename': 'peaks-xia2iit6vb-000000',\n",
       "  'filesize': 3338,\n",
       "  'first_endtime': 1,\n",
       "  'first_time': 0,\n",
       "  'last_endtime': 50,\n",
       "  'last_time': 49,\n",
       "  'n': 300,\n",
       "  'nbytes': 670500,\n",
       "  'run_id': '_superrun_test',\n",
       "  'start': 0,\n",
       "  'subruns': {'0': {'end': 10, 'start': 0},\n",
       "   '1': {'end': 30, 'start': 20},\n",
       "   '2': {'end': 50, 'start': 40}}}]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "st.get_metadata(superrun_name, \"peaks\")[\"chunks\"]"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.19"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autoclose": false,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}