Bonobo
- Module
Contains all the tools you need to get started with the framework, including (but not limited to) generic transformations, readers, writers, and tools for writing and executing graphs and jobs.
All objects in this module are considered stable and safe to use: backward compatibility is preserved as far as possible when moving up from one version to another.
-
class
Graph
(*chain)[source]¶ Bases:
object
Represents a directed graph of nodes.
-
graphviz
-
name
= ''¶
-
topologically_sorted_indexes
¶ Iterate in topological order, based on networkx’s topological_sort() function.
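To illustrate what iterating node indexes in topological order means, here is a minimal sketch that uses the standard library's graphlib module (Python 3.9+) instead of networkx or the Graph class itself. The edge list and the helper name sorted_indexes are hypothetical and exist only for this illustration.

```python
from graphlib import TopologicalSorter


def sorted_indexes(edges):
    """Return node indexes so that every node comes after its predecessors.

    `edges` is a hypothetical list of (source, target) index pairs.
    """
    sorter = TopologicalSorter()
    for source, target in edges:
        # add(node, *predecessors): `target` depends on `source`.
        sorter.add(target, source)
    return list(sorter.static_order())


# A small chain with a fork: 0 -> 1, then 1 -> 2 and 1 -> 3.
order = sorted_indexes([(0, 1), (1, 2), (1, 3)])
```

Several orders can be valid for the same graph (here 2 and 3 may appear in either order), so callers should rely only on predecessors coming first.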
-
-
class
CsvReader
(*args, **kwargs)[source]¶ Bases:
bonobo.nodes.io.file.FileReader
,bonobo.nodes.io.csv.CsvHandler
Reads a CSV file and yields the values as dicts.
- Parameters
path (str) – Path to use within the provided filesystem.
eol (str) –
Character to use as line separator.
Default: ‘\n’
encoding (str) –
Encoding.
Default: ‘utf-8’
fs (str) –
The filesystem instance to use.
Default: ‘fs’
mode (str) –
What mode to use for open() call.
Default: ‘r’
output_fields (ensure_tuple) – Specify the field names of output lines. Mutually exclusive with “output_type”.
output_type – Specify the type of output lines. Mutually exclusive with “output_fields”.
delimiter (str) –
quotechar (str) –
escapechar (str) –
doublequote (str) –
skipinitialspace (str) –
lineterminator (str) –
quoting (int) –
headers –
fields (ensure_tuple) –
skip (int) – If set and greater than zero, the reader will skip this number of lines.
reader_factory –
Builds the CSV reader, i.e. an object we can iterate over, each iteration giving one line of fields as an iterable.
Defaults to the built-in csv.reader(…), but can be overridden to fit your special needs.
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
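The partial-configuration behaviour described above can be sketched with functools.partial alone. This is not the library's implementation: the Greeter class, its required tuple and the build helper are hypothetical names, and the choice of TypeError for missing options is an assumption.

```python
import functools


class Greeter:
    """Hypothetical class with two required options, `greeting` and `name`."""

    required = ("greeting", "name")

    def __init__(self, greeting, name):
        self.greeting = greeting
        self.name = name


def build(cls, *args, _final=False, **kwargs):
    """Return an instance if every required option is set, else a partial."""
    # Positional arguments fill the required options in declaration order.
    provided = set(cls.required[:len(args)]) | set(kwargs)
    missing = [name for name in cls.required if name not in provided]
    if not missing:
        return cls(*args, **kwargs)
    if _final:
        # Assumption: a missing option is reported with TypeError.
        raise TypeError("missing options: " + ", ".join(missing))
    # Not fully configured yet: defer creation until the rest is supplied.
    return functools.partial(build, cls, *args, **kwargs)


partial_greeter = build(Greeter, greeting="Hello")
greeter = partial_greeter(name="world")
```

Calling the partial with the remaining options yields a real instance, while passing _final=True with options still missing raises instead of returning another partial.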
-
read
(file, context, *, fs) Reads rows from the given file.
-
reader_factory
Builds the CSV reader, i.e. an object we can iterate over, each iteration giving one line of fields as an iterable.
Defaults to the built-in csv.reader(…), but can be overridden to fit your special needs.
-
skip
If set and greater than zero, the reader will skip this number of lines.
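As a rough illustration of how reader_factory and skip fit together, the sketch below reads CSV text with the standard csv module and yields dicts. It does not use CsvReader itself; the read_csv helper is hypothetical, and skipping lines before the header row is an assumption rather than documented behaviour.

```python
import csv
import io


def read_csv(file, skip=0, reader_factory=csv.reader):
    """Yield each data row of `file` as a dict keyed by the header row."""
    reader = reader_factory(file)
    # Assumption: skipped lines come before the header row.
    for _ in range(skip):
        next(reader, None)
    headers = next(reader, None)
    if headers is None:
        return  # empty input, nothing to yield
    for row in reader:
        yield dict(zip(headers, row))


sample = io.StringIO("# exported by a hypothetical tool\nid,name\n1,ada\n2,grace\n")
rows = list(read_csv(sample, skip=1))
```

Any callable that takes a file and returns an iterable of field lists can stand in for reader_factory, for example a functools.partial of csv.reader with a different delimiter.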
-
class
CsvWriter
(*args, **kwargs)[source]¶ Bases:
bonobo.nodes.io.file.FileWriter
,bonobo.nodes.io.csv.CsvHandler
- Parameters
path (str) – Path to use within the provided filesystem.
writer_factory –
Builds the CSV writer, i.e. an object we can pass a field collection to, to be written as one line in the target file.
Defaults to the built-in csv.writer(…).writerow, but can be overridden to fit your special needs.
eol (str) –
Character to use as line separator.
Default: ‘\n’
encoding (str) –
Encoding.
Default: ‘utf-8’
fs (str) –
The filesystem instance to use.
Default: ‘fs’
mode (str) –
What mode to use for open() call.
Default: ‘w+’
delimiter (str) –
quotechar (str) –
escapechar (str) –
doublequote (str) –
skipinitialspace (str) –
lineterminator (str) –
quoting (int) –
headers –
fields (ensure_tuple) –
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
-
writer_factory
Builds the CSV writer, i.e. an object we can pass a field collection to, to be written as one line in the target file.
Defaults to the built-in csv.writer(…).writerow, but can be overridden to fit your special needs.
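A writer factory in the sense described here is simply a function that takes a file and returns a "write one row" callable. The sketch below shows that shape with the standard csv module; the delimiter and lineterminator values are choices made for this example, not the library's defaults.

```python
import csv
import io


def writer_factory(file):
    """Return a callable that writes one field collection as one CSV line."""
    # lineterminator="\n" is chosen here so the output is easy to compare;
    # csv.writer itself defaults to "\r\n".
    return csv.writer(file, delimiter=",", lineterminator="\n").writerow


target = io.StringIO()
write_row = writer_factory(target)
write_row(("id", "name"))
write_row((1, "ada, countess"))
output = target.getvalue()
```

Fields that contain the delimiter are quoted automatically by csv.writer, which is one reason to build on it rather than joining strings by hand.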
-
class
FileReader
(*args, **kwargs)[source]¶ Bases:
bonobo.nodes.io.base.Reader
,bonobo.nodes.io.base.FileHandler
Component factory for file-like readers.
On its own, it can be used to read a file and yield one row per line, trimming the “eol” character at the end if present. Extending it is usually the right way to create more specific file readers (like json, csv, etc.)
- Parameters
path (str) – Path to use within the provided filesystem.
eol (str) –
Character to use as line separator.
Default: ‘\n’
encoding (str) –
Encoding.
Default: ‘utf-8’
fs (str) –
The filesystem instance to use.
Default: ‘fs’
mode (str) –
What mode to use for open() call.
Default: ‘r’
output_fields (ensure_tuple) – Specify the field names of output lines. Mutually exclusive with “output_type”.
output_type – Specify the type of output lines. Mutually exclusive with “output_fields”.
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
-
mode
¶ What mode to use for open() call.
Default: ‘r’
-
output
-
output_fields
¶ Specify the field names of output lines. Mutually exclusive with “output_type”.
-
output_type
¶ Specify the type of output lines. Mutually exclusive with “output_fields”.
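The "one row per line, trimming the eol character" behaviour can be pictured with a few lines of plain Python. This is a sketch of the idea only, not FileReader's code; read_lines is a hypothetical helper.

```python
import io


def read_lines(file, eol="\n"):
    """Yield one row per line, without the trailing `eol` if present."""
    for line in file:
        if eol and line.endswith(eol):
            line = line[:-len(eol)]
        yield line


lines = list(read_lines(io.StringIO("first\nsecond\nthird")))
```

The last line is returned unchanged when the file does not end with the separator, which matches the "if present" wording above.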
-
class
FileWriter
(*args, **kwargs)[source]¶ Bases:
bonobo.nodes.io.base.Writer
,bonobo.nodes.io.base.FileHandler
Component factory for file or file-like writers.
On its own, it can be used to write one line to a file for each row that comes into this component. Extending it is usually the right way to create more specific file writers (like json, csv, etc.)
- Parameters
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
-
mode
¶ What mode to use for open() call.
Default: ‘w+’
-
class
Filter
(*args, **kwargs)[source]¶ Bases:
bonobo.config.configurables.Configurable
Filters out hashes from the stream depending on the return value of the filter callable, when called with the current hash as parameter.
Can be used as a decorator on a filter callable.
-
filter
¶ A callable used to filter lines.
If the callable returns a true-ish value, the input will be passed unmodified to the next items.
Otherwise, it’ll be burnt.
-
- Parameters
filter –
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
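The filtering rule (truthy return value keeps the input unmodified, anything else drops it) can be sketched as a plain generator. The names apply_filter and is_adult, and the dict-shaped rows, are hypothetical; the Filter class itself is not used here.

```python
def apply_filter(predicate, rows):
    """Pass rows through unchanged when `predicate(row)` is truthy."""
    for row in rows:
        if predicate(row):
            yield row  # forwarded as-is, never modified


def is_adult(row):
    # Hypothetical filter callable over dict-like rows.
    return row.get("age", 0) >= 18


people = [{"name": "ada", "age": 36}, {"name": "tim", "age": 9}]
kept = list(apply_filter(is_adult, people))
```

Because rows are forwarded rather than copied, the objects that come out are the very objects that went in.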
-
-
class
FixedWindow
(*args, **kwargs)[source]¶ Bases:
bonobo.config.configurables.Configurable
Transformation factory to create fixed windows of inputs, as lists.
For example, if the input is successively 1, 2, 3, 4, etc. and you pass it through a FixedWindow(2), you’ll get lists of elements 2 by 2: [1, 2], [3, 4], …
- Parameters
length (int) –
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
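The windowing described for FixedWindow(2) can be reproduced with a small generator. This is a sketch of the grouping idea, not the class's implementation; in particular, emitting a shorter final window when the input runs out is an assumption made here.

```python
def fixed_window(items, length):
    """Group `items` into consecutive lists of `length` elements."""
    window = []
    for item in items:
        window.append(item)
        if len(window) == length:
            yield window
            window = []
    # Assumption: a trailing incomplete window is emitted as-is.
    if window:
        yield window


windows = list(fixed_window([1, 2, 3, 4], 2))
```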
-
buffer
-
length
-
class
JsonReader
(*args, **kwargs)[source]¶ Bases:
bonobo.nodes.io.json.JsonHandler
,bonobo.nodes.io.file.FileReader
- Parameters
path (str) – Path to use within the provided filesystem.
eol (str) –
Character to use as line separator.
Default: ‘\n’
encoding (str) –
Encoding.
Default: ‘utf-8’
fs (str) –
The filesystem instance to use.
Default: ‘fs’
mode (str) –
What mode to use for open() call.
Default: ‘r’
output_fields (ensure_tuple) – Specify the field names of output lines. Mutually exclusive with “output_type”.
output_type – Specify the type of output lines. Mutually exclusive with “output_fields”.
loader –
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
-
loader
-
class
JsonWriter
(*args, **kwargs)[source]¶ Bases:
bonobo.nodes.io.json.JsonHandler
,bonobo.nodes.io.file.FileWriter
- Parameters
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
-
write
(file, context, *args, fs) Writes a JSON row on the next line of the file pointed to by ctx.file.
- Parameters
ctx –
row –
-
envelope
-
class
LdjsonReader
(*args, **kwargs)[source]¶ Bases:
bonobo.nodes.io.json.LdjsonHandler
,bonobo.nodes.io.json.JsonReader
Read a stream of line-delimited JSON objects (one object per line).
Not to be confused with JSON-LD (where LD stands for linked data).
- Parameters
path (str) – Path to use within the provided filesystem.
eol (str) –
Character to use as line separator.
Default: ‘\n’
encoding (str) –
Encoding.
Default: ‘utf-8’
fs (str) –
The filesystem instance to use.
Default: ‘fs’
mode (str) –
What mode to use for open() call.
Default: ‘r’
output_fields (ensure_tuple) – Specify the field names of output lines. Mutually exclusive with “output_type”.
output_type – Specify the type of output lines. Mutually exclusive with “output_fields”.
loader –
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
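Line-delimited JSON means each line is a complete JSON document on its own. A minimal sketch of reading that format with the standard json module follows; read_ldjson is a hypothetical helper, and skipping blank lines is an assumption rather than something LdjsonReader is documented to do.

```python
import io
import json


def read_ldjson(file):
    """Yield one decoded JSON value per non-empty line of `file`."""
    for line in file:
        line = line.strip()
        if line:  # assumption: blank lines are ignored
            yield json.loads(line)


stream = io.StringIO('{"id": 1}\n{"id": 2, "tags": ["a", "b"]}\n')
records = list(read_ldjson(stream))
```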
-
class
LdjsonWriter
(*args, **kwargs)[source]¶ Bases:
bonobo.nodes.io.json.LdjsonHandler
,bonobo.nodes.io.json.JsonWriter
Write a stream of line-delimited JSON objects (one object per line).
Not to be confused with JSON-LD (where LD stands for linked data).
- Parameters
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
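The writing side of the same format is equally small: serialize each row on a single line and append the line separator. The write_ldjson helper below is a hypothetical stdlib sketch, not LdjsonWriter's code.

```python
import io
import json


def write_ldjson(file, rows, eol="\n"):
    """Write each row as JSON on a single line followed by `eol`."""
    for row in rows:
        # json.dumps escapes newlines inside strings, so each row
        # is guaranteed to stay on one line.
        file.write(json.dumps(row) + eol)


buffer = io.StringIO()
write_ldjson(buffer, [{"id": 1}, {"id": 2}])
written = buffer.getvalue()
```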
-
class
Limit
(*args, **kwargs)[source]¶ Bases:
bonobo.config.configurables.Configurable
Creates a Limit() node that lets only the first n rows (defined by the limit option) go through, unmodified.
-
limit
¶ Number of rows to let go through.
TODO: simplify into a closure building factory?
- Parameters
limit –
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
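The effect of a limit with a counter can be sketched as a generator that stops forwarding after n rows. This is an illustration of the behaviour only: the limit function below is hypothetical, and unlike a graph node it stops consuming its input instead of discarding the remaining rows.

```python
def limit(rows, n):
    """Yield the first `n` rows unchanged and drop the rest."""
    counter = 0
    for row in rows:
        if counter >= n:
            break
        counter += 1
        yield row


first_three = list(limit(range(10), 3))
```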
-
counter
-
limit
-
-
class
PickleReader
(*args, **kwargs)[source]¶ Bases:
bonobo.nodes.io.file.FileReader
,bonobo.nodes.io.pickle.PickleHandler
Reads a Python pickle object and yields the items in dicts.
- Parameters
path (str) – Path to use within the provided filesystem.
eol (str) –
Character to use as line separator.
Default: ‘\n’
encoding (str) –
Encoding.
Default: ‘utf-8’
fs (str) –
The filesystem instance to use.
Default: ‘fs’
output_fields (ensure_tuple) – Specify the field names of output lines. Mutually exclusive with “output_type”.
output_type – Specify the type of output lines. Mutually exclusive with “output_fields”.
fields (tuple) –
mode (str) –
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
-
read
(file, context, *, fs) Reads rows from the given file.
-
mode
-
class
PickleWriter
(*args, **kwargs)[source]¶ Bases:
bonobo.nodes.io.file.FileWriter
,bonobo.nodes.io.pickle.PickleHandler
- Parameters
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
-
mode
-
class
PrettyPrinter
(*args, **kwargs)[source]¶ Bases:
bonobo.config.configurables.Configurable
- Parameters
filter –
A filter that determines what to print.
Default is to ignore any key starting with an underscore, and None values.
max_width (int) –
If set, truncates the output values longer than this to this width.
Default: 80
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
-
context
-
filter
A filter that determines what to print.
Default is to ignore any key starting with an underscore, and None values.
-
max_width
¶ If set, truncates the output values longer than this to this width.
Default: 80
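The two options above (a filter on key/value pairs and a maximum width) can be illustrated with a small formatter. This is a sketch under assumptions: the function names, the "key: value" layout and the plain truncation without an ellipsis are choices made here, not PrettyPrinter's actual output format.

```python
def default_filter(key, value):
    """Keep a pair unless the key starts with "_" or the value is None."""
    return not str(key).startswith("_") and value is not None


def format_row(row, max_width=80, keep=default_filter):
    """Return "key: value" lines for the pairs that pass `keep`."""
    lines = []
    for key, value in row.items():
        if not keep(key, value):
            continue
        text = str(value)
        if max_width and len(text) > max_width:
            text = text[:max_width]  # plain truncation, no ellipsis added
        lines.append(f"{key}: {text}")
    return lines


printed = format_row(
    {"name": "ada", "_id": 7, "note": None, "bio": "x" * 100},
    max_width=10,
)
```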
-
class
RateLimited
(*args, **kwargs)[source]¶ Bases:
bonobo.config.configurables.Configurable
Custom instance builder. If not all options are fulfilled, will return a PartiallyConfigured instance, which is just a functools.partial object that behaves like a Configurable instance.
The special _final argument can be used to force the final instance to be created, or an error to be raised if options are missing.
- Parameters
args –
_final – bool
kwargs –
- Returns
Configurable or PartiallyConfigured
-
amount
-
bucket
-
handler
-
initial
-
period
-
run
(graph, *, plugins=None, services=None, strategy=None)[source]¶ Main entry point of bonobo. It takes a graph and creates all the necessary plumbing around to execute it.
The only necessary argument is a Graph instance, containing the logic you actually want to execute.
By default, this graph will be executed using the “threadpool” strategy: each graph node will be wrapped in a thread, and executed in a loop until there is no more input to this node.
You can provide plugin factory objects in the plugins list. This function also adds the plugins needed for interactive console execution and Jupyter notebook execution if it detects that it is running in one of those contexts.
You’ll probably want to provide a services dictionary mapping service names to service instances.
- Parameters
- Return bonobo.execution.graph.GraphExecutionContext
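The "one thread per node, looping until there is no more input" idea can be pictured with the standard threading and queue modules. The sketch below handles only a linear chain of single-argument functions and uses a sentinel object to signal the end of input; all names are hypothetical, and the library's actual strategy implementation is not shown or implied.

```python
import queue
import threading

STOP = object()  # sentinel meaning "no more input"


def node_loop(func, inbox, outbox):
    """Call `func` on every input until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # tell the next node to finish too
            return
        outbox.put(func(item))


def run_chain(source, *funcs):
    """Run each function of a linear chain in its own thread."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=node_loop, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(funcs)
    ]
    for thread in threads:
        thread.start()
    for item in source:
        queues[0].put(item)
    queues[0].put(STOP)
    for thread in threads:
        thread.join()
    # Drain the last queue up to the sentinel.
    results = []
    while True:
        item = queues[-1].get()
        if item is STOP:
            return results
        results.append(item)


results = run_chain(range(3), lambda x: x + 1, str)
```

Each node sees its inputs in arrival order because it is served by a single thread reading from a FIFO queue.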
-
create_strategy
(name=None) Creates a strategy, or returns the argument unchanged if it is already one.
- Parameters
name –
- Returns
Strategy
-
open_fs
(fs_url=None, *args, **kwargs) Wraps fs.opener.registry.Registry.open_fs, defaulting to the local current working directory and expanding ~ in the path.
- Parameters
fs_url (str) – A filesystem URL
parse_result (ParseResult) – A parsed filesystem URL.
writeable (bool) – True if the filesystem must be writeable.
create (bool) – True if the filesystem should be created if it does not exist.
cwd (str) – The current working directory (generally only relevant for OS filesystems).
default_protocol (str) – The protocol to use if one is not supplied in the FS URL (defaults to "osfs").
- Returns
fs.base.FS
object
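The two conveniences mentioned for this wrapper (falling back to the current working directory and expanding ~) can be sketched without the fs package. The resolve_fs_url helper is hypothetical and returns a plain path string; it does not open a filesystem object the way open_fs does.

```python
import os


def resolve_fs_url(fs_url=None):
    """Return the location to open: cwd by default, with "~" expanded."""
    if fs_url is None:
        return os.getcwd()
    return os.path.expanduser(str(fs_url))


default_location = resolve_fs_url()
home_data = resolve_fs_url("~/data")
```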
-
OrderFields
(fields)[source]¶ Transformation factory to reorder fields in a data stream.
- Parameters
fields –
- Returns
callable
-
SetFields
(fields)[source]¶ Transformation factory that sets the field names on first iteration, without touching the values.
- Parameters
fields –
- Returns
callable
-
UnpackItems
(*items, fields=None, defaults=None)[source]¶ >>> UnpackItems(0)
- Parameters
items –
fields –
defaults –
- Returns
callable
-
get_argument_parser
(parser=None)[source]¶ Creates an argument parser with arguments to override the system environment.
- Api
bonobo.get_argument_parser
- Parameters
parser –
- Returns
-
parse_args
(mixed=None)[source]¶ Context manager to extract and apply environment related options from the provided argparser result.
A dictionary of unknown options is yielded, so the remaining options can be used by the caller.
- Api
bonobo.patch_environ
- Parameters
mixed – ArgumentParser instance, Namespace, or dict.
- Returns
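A context manager that applies environment-related options and yields whatever is left over could look roughly like the sketch below. Everything here is an assumption made for illustration: the --env and --limit flags, the KEY=VALUE format, the applied_environ name, and the decision to restore the environment on exit.

```python
import argparse
import contextlib
import os


@contextlib.contextmanager
def applied_environ(options):
    """Apply `options["env"]` to os.environ and yield the other options."""
    options = dict(options)
    pairs = options.pop("env", None) or []
    previous = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        previous[key] = os.environ.get(key)
        os.environ[key] = value
    try:
        yield options  # options this sketch does not know about
    finally:
        # Assumption: the environment is restored on exit.
        for key, old in previous.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old


parser = argparse.ArgumentParser()
parser.add_argument("--env", "-e", action="append")  # hypothetical flag
parser.add_argument("--limit", type=int)             # hypothetical flag
namespace = parser.parse_args(["--env", "SKETCH_MODE=dry", "--limit", "5"])

with applied_environ(vars(namespace)) as remaining:
    inside = os.environ.get("SKETCH_MODE")
```

The caller receives only the options the context manager did not consume, which mirrors the "remaining options" wording above.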