{ "cells": [ { "cell_type": "markdown", "id": "muslim-knight", "metadata": {}, "source": [ "# Superruns\n", "\n", "### Basic concept of a superrun:\n", "\n", "A superrun is made up of many regular runs. It therefore helps us organize data in logical units and load it faster. In this notebook we give some brief examples of how superruns work and how they can make analysts' lives easier.\n", "\n", "\n", "Let's start with how to define a superrun. The example I demonstrate here is based on some dummy Record and Peak plugins, but it works in the same way for regular data." ] }, { "cell_type": "code", "execution_count": 1, "id": "mineral-indianapolis", "metadata": {}, "outputs": [], "source": [ "import strax\n", "import straxen" ] }, { "cell_type": "markdown", "id": "flush-smith", "metadata": {}, "source": [ "### Define context and create some dummy data:\n", "\n", "In the subsequent cells I create a dummy context and write some dummy data. You can either read through them if you are interested or skip ahead to **Define a superrun**. For the working examples on superruns you only need to know:\n", "\n", "* Superruns can be created with any of our regular online and offline contexts. \n", "* In the two cells below I define three runs, with records, for the run_ids 0, 1 and 2. \n", "* The constituents of a superrun are called subruns; these are what we usually call runs."
] }, { "cell_type": "code", "execution_count": 2, "id": "biblical-shame", "metadata": {}, "outputs": [], "source": [ "# Dummy plugins from the strax test utilities\n", "from strax.testutils import Records, Peaks, PeakClassification\n", "\n", "superrun_name = \"_superrun_test\"\n", "# Context that stores data and run metadata in a local directory\n", "st = strax.Context(\n", "    storage=[\n", "        strax.DataDirectory(\n", "            \"./strax_data\", provide_run_metadata=True, readonly=False, deep_scan=True\n", "        )\n", "    ],\n", "    register=[Records, Peaks, PeakClassification],\n", "    config={\"bonus_area\": 42},\n", ")\n", "st.set_context_config({\"use_per_run_defaults\": False})" ] }, { "cell_type": "code", "execution_count": 3, "id": "suffering-burning", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "518184b9089146f083b20910b87d4b86", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Loading peaks: | | 0.00 % [00:00\n", "
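<table>\n",
 "  <thead>\n",
 "    <tr><th></th><th>name</th><th>number</th><th>mode</th><th>source</th><th>tags</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>0</td><td>0.0</td><td></td><td></td><td></td></tr>\n",
 "    <tr><th>1</th><td>1</td><td>1.0</td><td></td><td></td><td></td></tr>\n",
 "    <tr><th>2</th><td>2</td><td>2.0</td><td></td><td></td><td></td></tr>\n",
 "    <tr><th>3</th><td>_superrun_test</td><td>NaN</td><td></td><td></td><td></td></tr>\n",
 "    <tr><th>4</th><td>024399</td><td>24399.0</td><td></td><td></td><td></td></tr>\n",
 "  </tbody>\n",
 "</table>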
\n", "" ], "text/plain": [ " name number mode source tags\n", "0 0 0.0 \n", "1 1 1.0 \n", "2 2 2.0 \n", "3 _superrun_test NaN \n", "4 024399 24399.0 " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "st.select_runs()" ] }, { "cell_type": "markdown", "id": "korean-encounter", "metadata": {}, "source": [ "### Loading data with superruns:\n", "\n", "Loading superruns can be done in two different ways. Let's first try the already implemented approach and compare the result with the data obtained by loading the individual runs separately:" ] }, { "cell_type": "code", "execution_count": 9, "id": "differential-rocket", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d7a3a5f7d34e4c57a95e25fe2c7bbaa5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Loading 3 runs: 0%| | 0/3 [00:00\n", "
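<table>\n",
 "  <thead>\n",
 "    <tr><th></th><th>name</th><th>number</th><th>mode</th><th>source</th><th>tags</th><th>peaks_available</th></tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr><th>0</th><td>0</td><td>0.0</td><td></td><td></td><td></td><td>True</td></tr>\n",
 "    <tr><th>1</th><td>1</td><td>1.0</td><td></td><td></td><td></td><td>True</td></tr>\n",
 "    <tr><th>2</th><td>2</td><td>2.0</td><td></td><td></td><td></td><td>True</td></tr>\n",
 "    <tr><th>3</th><td>_superrun_test</td><td>NaN</td><td></td><td></td><td></td><td>True</td></tr>\n",
 "  </tbody>\n",
 "</table>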
\n", "" ], "text/plain": [ " name number mode source tags peaks_available\n", "0 0 0.0 True\n", "1 1 1.0 True\n", "2 2 2.0 True\n", "3 _superrun_test NaN True" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "st.select_runs(available=(\"peaks\",))" ] }, { "cell_type": "markdown", "id": "conservative-pencil", "metadata": {}, "source": [ "If some data does not exist for a superrun, we can simply create it via the superrun_id. This creates not only the data of the rechunked superrun but also the data of the subruns, if not already stored:" ] }, { "cell_type": "code", "execution_count": 16, "id": "authentic-marijuana", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "st.is_stored(subrun_ids[0], \"peak_classification\")" ] }, { "cell_type": "code", "execution_count": 17, "id": "continental-baptist", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "st.make(superrun_name, \"peak_classification\")\n", "st.is_stored(subrun_ids[0], \"peak_classification\")" ] }, { "cell_type": "code", "execution_count": 18, "id": "portuguese-imaging", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "998fb135d8014180ba99faaece452db2", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Loading peak_classification: | | 0.00 % [00:00