{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Failed Tasks\n", "\n", "Sometimes tasks can fail. Let's see how to deal with failed tasks in nornir.\n", "\n", "Let's start as usual with the needed boilerplate:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import logging\n", "\n", "from nornir import InitNornir\n", "from nornir.core.task import Task, Result\n", "from nornir_utils.plugins.functions import print_result\n", "\n", "# instantiate the nr object\n", "nr = InitNornir(config_file=\"config.yaml\")\n", "# let's filter it down to simplify the output\n", "cmh = nr.filter(site=\"cmh\", type=\"host\")\n", "\n", "def count(task: Task, number: int) -> Result:\n", " return Result(\n", " host=task.host,\n", " result=f\"{[n for n in range(0, number)]}\"\n", " )\n", "\n", "def say(task: Task, text: str) -> Result:\n", " if task.host.name == \"host2.cmh\":\n", " raise Exception(\"I can't say anything right now\")\n", " return Result(\n", " host=task.host,\n", " result=f\"{task.host.name} says {text}\"\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, as an example we are going to use a similar task group like the one we used in the previous tutorial:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def greet_and_count(task: Task, number: int):\n", " task.run(\n", " name=\"Greeting is the polite thing to do\",\n", " severity_level=logging.DEBUG,\n", " task=say,\n", " text=\"hi!\",\n", " )\n", " \n", " task.run(\n", " name=\"Counting beans\",\n", " task=count,\n", " number=number,\n", " )\n", " task.run(\n", " name=\"We should say bye too\",\n", " severity_level=logging.DEBUG, \n", " task=say,\n", " text=\"bye!\",\n", " )\n", "\n", " # let's inform if we counted even or odd times\n", " even_or_odds = \"even\" if number % 2 == 1 else \"odd\"\n", " return Result(\n", " host=task.host,\n", " result=f\"{task.host} counted {even_or_odds} times!\",\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remember there is a hardcoded error on `host2.cmh`, let's see what happens when we run the task:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "result = cmh.run(\n", " task=greet_and_count,\n", " number=5,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's inspect the object:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result.failed" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'host2.cmh': MultiResult: [Result: \"greet_and_count\", Result: \"Greeting is the polite thing to do\"]}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result.failed_hosts" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "nornir.core.exceptions.NornirSubTaskError()" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result['host2.cmh'].exception" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Exception(\"I can't say anything right now\")" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result['host2.cmh'][1].exception" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, the result object is aware something went wrong and you can inspect the errors if you so desire.\n", "\n", "You can also using the `print_result` function on it:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[36mgreet_and_count*****************************************************************\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[34m* host1.cmh ** changed : False *************************************************\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32mvvvv greet_and_count ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO\u001b[0m\n", "\u001b[0mhost1.cmh counted even times!\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32m---- Counting beans ** changed : False ----------------------------------------- INFO\u001b[0m\n", "\u001b[0m[0, 1, 2, 3, 4]\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32m^^^^ END greet_and_count ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[34m* host2.cmh ** changed : False *************************************************\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[31mvvvv greet_and_count ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv ERROR\u001b[0m\n", "\u001b[0mSubtask: Greeting is the polite thing to do (failed)\n", "\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[31m---- Greeting is the polite thing to do ** changed : False --------------------- ERROR\u001b[0m\n", "\u001b[0mTraceback (most recent call last):\n", " File \"/nornir/core/task.py\", line 98, in start\n", " r = self.task(self, **self.params)\n", " File \"/tmp/ipykernel_8235/676894166.py\", line 20, in say\n", " raise Exception(\"I can't say anything right now\")\n", "Exception: I can't say anything right now\n", "\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[31m^^^^ END greet_and_count ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\u001b[0m\n", "\u001b[0m" ] } ], "source": [ "print_result(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is also a method that will raise an exception if the task had an error:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ERROR!!!\u001b[0m\n", "\u001b[0m" ] } ], "source": [ "from nornir.core.exceptions import NornirExecutionError\n", "try:\n", " result.raise_on_error()\n", "except NornirExecutionError:\n", " print(\"ERROR!!!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Skipped hosts\n", "\n", "Nornir will keep track of hosts that failed and won't run future tasks on them:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "from nornir.core.task import Result\n", "\n", "def hi(task: Task) -> Result:\n", " return Result(host=task.host, result=f\"{task.host.name}: Hi, I am still here!\")\n", " \n", "result = cmh.run(task=hi)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[36mhi******************************************************************************\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[34m* host1.cmh ** changed : False *************************************************\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32mvvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO\u001b[0m\n", "\u001b[0mhost1.cmh: Hi, I am still here!\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32m^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\u001b[0m\n", "\u001b[0m" ] } ], "source": [ "print_result(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can force the execution of tasks on failed hosts by passing the argument `on_failed=True`:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[36mhi******************************************************************************\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[34m* host1.cmh ** changed : False *************************************************\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32mvvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO\u001b[0m\n", "\u001b[0mhost1.cmh: Hi, I am still here!\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32m^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[34m* host2.cmh ** changed : False *************************************************\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32mvvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO\u001b[0m\n", "\u001b[0mhost2.cmh: Hi, I am still here!\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32m^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\u001b[0m\n", "\u001b[0m" ] } ], "source": [ "result = cmh.run(task=hi, on_failed=True)\n", "print_result(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also exclude the hosts that are \"good\" if you want to with the `on_good` flag:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[36mhi******************************************************************************\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[34m* host2.cmh ** changed : False *************************************************\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32mvvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO\u001b[0m\n", "\u001b[0mhost2.cmh: Hi, I am still here!\u001b[0m\n", "\u001b[0m\u001b[1m\u001b[32m^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\u001b[0m\n", "\u001b[0m" ] } ], "source": [ "result = cmh.run(task=hi, on_failed=True, on_good=False)\n", "print_result(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To achieve this `nornir` keeps a set of failed hosts in it's shared [data](../../api/nornir/core/state.html#nornir.core.state.GlobalState) object:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'host2.cmh'}" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nr.data.failed_hosts" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you want to mark some hosts as succeeded and make them back eligible for future tasks you can do it individually per host with the function [recover_host](../../api/nornir/core/state.html#nornir.core.state.GlobalState.recover_host) or reset the list completely with [reset_failed_hosts](../../api/nornir/core/state.html#nornir.core.state.GlobalState.reset_failed_hosts):" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "set()" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nr.data.reset_failed_hosts()\n", "nr.data.failed_hosts" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Raise on error automatically\n", "\n", "Alternatively, you can configure nornir to raise the exception automatically in case of error with the `raise_on_error` configuration option:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ERROR!!!\u001b[0m\n", "\u001b[0m" ] } ], "source": [ "nr = InitNornir(config_file=\"config.yaml\", core={\"raise_on_error\": True})\n", "cmh = nr.filter(site=\"cmh\", type=\"host\")\n", "try:\n", " result = cmh.run(\n", " task=greet_and_count,\n", " number=5,\n", " )\n", "except NornirExecutionError:\n", " print(\"ERROR!!!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Workflows" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The default workflow should work for most use cases as hosts with errors are skipped and the `print_result` should give enough information to understand what's going on. For more complex workflows this framework should give you enough room to easily implement them regardless of the complexity." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 2 }