Failed Tasks
Sometimes tasks can fail. Let’s see how to deal with failed tasks in nornir.
Let’s start as usual with the needed boilerplate:
[1]:
from nornir import InitNornir
from nornir.plugins.tasks import networking, text
from nornir.plugins.functions.text import print_result
nr = InitNornir(config_file="config.yaml")
cmh = nr.filter(site="cmh", type="network_device")
As an example, we are going to use a task group similar to the one from the previous tutorial:
[2]:
def basic_configuration(task):
    # Transform inventory data to configuration via a template file
    r = task.run(task=text.template_file,
                 name="Base Configuration",
                 template="base.j2",
                 path="templates/junos")
    # Save the compiled configuration into a host variable
    task.host["config"] = r.result
    # Deploy that configuration to the device using NAPALM
    task.run(task=networking.napalm_configure,
             name="Loading Configuration on the device",
             replace=False,
             configuration=task.host["config"])
Note that the path is hardcoded to templates/junos. This should cause an error when nornir tries to apply the resulting Junos-style configuration to the EOS devices. Let’s see what happens:
[3]:
result = cmh.run(task=basic_configuration)
Let’s inspect the object:
[4]:
result.failed
[4]:
True
[5]:
result.failed_hosts
[5]:
{'spine00.cmh': MultiResult: [Result: "basic_configuration", Result: "Base Configuration", Result: "Loading Configuration on the device"],
'leaf00.cmh': MultiResult: [Result: "basic_configuration", Result: "Base Configuration", Result: "Loading Configuration on the device"]}
[6]:
result['spine00.cmh'][2].exception
As you can see, the result object is aware something went wrong and you can inspect the errors if you so desire.
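To inspect every failed host at once rather than indexing one result by hand, a small helper can walk the failed_hosts mapping. This is a minimal sketch, not part of nornir: failure_summary is a hypothetical name, and it only assumes that each entry of failed_hosts is a list-like multi-result whose items expose name, failed and exception, as the objects shown above do.

```python
def failure_summary(aggregated_result):
    """Map each failed host to the (task name, exception) pairs that failed.

    Hypothetical helper: it relies only on `failed_hosts` being a mapping of
    host name to a list-like multi-result whose items have the attributes
    `name`, `failed` and `exception`.
    """
    summary = {}
    for host, multi_result in aggregated_result.failed_hosts.items():
        # Keep only the results that are flagged as failed and carry an exception
        summary[host] = [
            (r.name, r.exception)
            for r in multi_result
            if r.failed and r.exception is not None
        ]
    return summary
```

With the objects from this tutorial it would be called as failure_summary(result), returning one list of (name, exception) pairs per failed host.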
You can also use the print_result function on it:
[7]:
print_result(result)
basic_configuration*************************************************************
* leaf00.cmh ** changed : False ************************************************
vvvv basic_configuration ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv ERROR
Subtask: Loading Configuration on the device (failed)
---- Base Configuration ** changed : False ------------------------------------- INFO
system {
host-name leaf00.cmh;
domain-name cmh.acme.local;
}
---- Loading Configuration on the device ** changed : False -------------------- ERROR
Traceback (most recent call last):
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/napalm/eos/eos.py", line 231, in _load_config
self.device.run_commands(commands)
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/pyeapi/client.py", line 730, in run_commands
response = self._connection.execute(commands, encoding, **kwargs)
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/pyeapi/eapilib.py", line 499, in execute
response = self.send(request)
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/pyeapi/eapilib.py", line 418, in send
raise CommandError(code, msg, command_error=err, output=out)
pyeapi.eapilib.CommandError: Error [1002]: CLI command 3 of 6 'system {' failed: invalid command [Invalid input (at token 1: '{')]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/dbarroso/workspace/nornir/nornir/core/task.py", line 62, in start
r = self.task(self, **self.params)
File "/Users/dbarroso/workspace/nornir/nornir/plugins/tasks/networking/napalm_configure.py", line 32, in napalm_configure
device.load_merge_candidate(filename=filename, config=configuration)
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/napalm/eos/eos.py", line 246, in load_merge_candidate
self._load_config(filename, config, False)
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/napalm/eos/eos.py", line 238, in _load_config
raise MergeConfigException(msg)
napalm.base.exceptions.MergeConfigException: Error [1002]: CLI command 3 of 6 'system {' failed: invalid command [Invalid input (at token 1: '{')]
^^^^ END basic_configuration ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* leaf01.cmh ** changed : True *************************************************
vvvv basic_configuration ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- Base Configuration ** changed : False ------------------------------------- INFO
system {
host-name leaf01.cmh;
domain-name cmh.acme.local;
}
---- Loading Configuration on the device ** changed : True --------------------- INFO
[edit system]
- host-name vsrx;
+ host-name leaf01.cmh;
+ domain-name cmh.acme.local;
^^^^ END basic_configuration ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* spine00.cmh ** changed : False ***********************************************
vvvv basic_configuration ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv ERROR
Subtask: Loading Configuration on the device (failed)
---- Base Configuration ** changed : False ------------------------------------- INFO
system {
host-name spine00.cmh;
domain-name cmh.acme.local;
}
---- Loading Configuration on the device ** changed : False -------------------- ERROR
Traceback (most recent call last):
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/napalm/eos/eos.py", line 231, in _load_config
self.device.run_commands(commands)
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/pyeapi/client.py", line 730, in run_commands
response = self._connection.execute(commands, encoding, **kwargs)
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/pyeapi/eapilib.py", line 499, in execute
response = self.send(request)
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/pyeapi/eapilib.py", line 418, in send
raise CommandError(code, msg, command_error=err, output=out)
pyeapi.eapilib.CommandError: Error [1002]: CLI command 3 of 6 'system {' failed: invalid command [Invalid input (at token 1: '{')]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/dbarroso/workspace/nornir/nornir/core/task.py", line 62, in start
r = self.task(self, **self.params)
File "/Users/dbarroso/workspace/nornir/nornir/plugins/tasks/networking/napalm_configure.py", line 32, in napalm_configure
device.load_merge_candidate(filename=filename, config=configuration)
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/napalm/eos/eos.py", line 246, in load_merge_candidate
self._load_config(filename, config, False)
File "/Users/dbarroso/.virtualenvs/nornir/lib/python3.7/site-packages/napalm/eos/eos.py", line 238, in _load_config
raise MergeConfigException(msg)
napalm.base.exceptions.MergeConfigException: Error [1002]: CLI command 3 of 6 'system {' failed: invalid command [Invalid input (at token 1: '{')]
^^^^ END basic_configuration ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* spine01.cmh ** changed : True ************************************************
vvvv basic_configuration ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- Base Configuration ** changed : False ------------------------------------- INFO
system {
host-name spine01.cmh;
domain-name cmh.acme.local;
}
---- Loading Configuration on the device ** changed : True --------------------- INFO
[edit system]
- host-name vsrx;
+ host-name spine01.cmh;
+ domain-name cmh.acme.local;
^^^^ END basic_configuration ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There is also a method that will raise an exception if the task had an error:
[8]:
from nornir.core.exceptions import NornirExecutionError
try:
    result.raise_on_error()
except NornirExecutionError:
    print("ERROR!!!")
ERROR!!!
Skipped hosts
Nornir will keep track of hosts that failed and won’t run future tasks on them:
[9]:
from nornir.core.task import Result
def hi(task):
    return Result(host=task.host, result=f"{task.host.name}: Hi, I am still here!")
result = cmh.run(task=hi)
[10]:
print_result(result)
hi******************************************************************************
* leaf01.cmh ** changed : False ************************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
leaf01.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* spine01.cmh ** changed : False ***********************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
spine01.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can force the execution of tasks on failed hosts by passing the argument on_failed=True:
[11]:
result = cmh.run(task=hi, on_failed=True)
print_result(result)
hi******************************************************************************
* leaf00.cmh ** changed : False ************************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
leaf00.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* leaf01.cmh ** changed : False ************************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
leaf01.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* spine00.cmh ** changed : False ***********************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
spine00.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* spine01.cmh ** changed : False ***********************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
spine01.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can also exclude the “good” hosts by setting the on_good flag to False:
[12]:
result = cmh.run(task=hi, on_failed=True, on_good=False)
print_result(result)
hi******************************************************************************
* leaf00.cmh ** changed : False ************************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
leaf00.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* spine00.cmh ** changed : False ***********************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
spine00.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
To achieve this, nornir keeps a set of failed hosts in its shared data object:
[13]:
nr.data.failed_hosts
[13]:
{'leaf00.cmh', 'spine00.cmh'}
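The way on_good and on_failed interact with this set can be modelled in a few lines of plain Python. This is only an illustrative sketch of the behavior shown in the outputs above, not nornir's actual implementation; hosts_to_run is a hypothetical function, and its defaults mirror the runs in this tutorial (good hosts run, failed hosts are skipped).

```python
def hosts_to_run(all_hosts, failed_hosts, on_good=True, on_failed=False):
    """Return the hosts a task would run on, keeping the original order.

    Illustrative model only: a host is selected if it is in the failed set
    and on_failed is True, or if it is not in the failed set and on_good
    is True.
    """
    selected = []
    for host in all_hosts:
        is_failed = host in failed_hosts
        if (is_failed and on_failed) or (not is_failed and on_good):
            selected.append(host)
    return selected


hosts = ["leaf00.cmh", "leaf01.cmh", "spine00.cmh", "spine01.cmh"]
failed = {"leaf00.cmh", "spine00.cmh"}

# Default: failed hosts are skipped
print(hosts_to_run(hosts, failed))
# -> ['leaf01.cmh', 'spine01.cmh']

# on_failed=True: every host runs
print(hosts_to_run(hosts, failed, on_failed=True))
# -> ['leaf00.cmh', 'leaf01.cmh', 'spine00.cmh', 'spine01.cmh']

# on_failed=True, on_good=False: only the failed hosts run
print(hosts_to_run(hosts, failed, on_failed=True, on_good=False))
# -> ['leaf00.cmh', 'spine00.cmh']
```

The three calls reproduce the host lists seen in the three print_result outputs above.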
If you want to mark some hosts as succeeded and make them eligible for future tasks again, you can do it individually per host with the function recover_host, or reset the whole set with reset_failed_hosts:
[14]:
nr.data.reset_failed_hosts()
nr.data.failed_hosts
[14]:
set()
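As one possible use of recover_host, the sketch below reruns a task on the failed hosts only and recovers the ones that succeed on the retry. retry_failed_hosts is a hypothetical helper, not a nornir function. It assumes the object passed as nr offers run(task=..., on_failed=..., on_good=...) and data.recover_host(host) as used in this tutorial, and that the returned result maps host names to multi-results with a failed attribute.

```python
def retry_failed_hosts(nr, task, **kwargs):
    """Rerun `task` on failed hosts only and recover those that now succeed.

    Hypothetical helper. Returns the sorted names of the recovered hosts.
    """
    # Run on failed hosts only: include failed hosts, exclude good ones
    result = nr.run(task=task, on_failed=True, on_good=False, **kwargs)
    recovered = []
    for host, multi_result in result.items():
        if not multi_result.failed:
            # The retry worked, so make the host eligible for future tasks
            nr.data.recover_host(host)
            recovered.append(host)
    return sorted(recovered)
```

Hosts that fail again stay in the failed set, so a later call can retry them or the caller can decide to give up.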
Raise on error automatically
Alternatively, you can configure nornir to raise the exception automatically in case of error with the raise_on_error configuration option:
[16]:
nr = InitNornir(config_file="config.yaml", core={"raise_on_error": True})
cmh = nr.filter(site="cmh", type="network_device")
try:
    cmh.run(task=basic_configuration)
except NornirExecutionError:
    print("ERROR!!!")
ERROR!!!
Workflows
The default workflow should work for most use cases, as hosts with errors are skipped and print_result should give enough information to understand what’s going on. For more complex workflows, the pieces covered here (the failed hosts set, the on_good and on_failed arguments, recover_host, reset_failed_hosts and raise_on_error) should give you enough room to implement them.