BitSwan Features v1903, launched on 1 April 2019
New features v1903:
Jupyter server in BitSwan ecosystem
The Jupyter server is used to perform real-time calculations with BitSwan objects and to share code examples. Jupyter has evolved to support interactive data science workflows. For more information about Jupyter, please refer to the official documentation at: https://jupyter.org/documentation
First implementation of pipeline builder
It is now possible to create a pump with its pipelines using configuration files only, referencing the desired objects within BitSwan. Processors that should be part of the pipeline are specified as a list of class names with their configuration options. Triggered sources can reference the triggers to be used.
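For orientation, here is a minimal sketch of the same kind of pipeline assembled programmatically with the BSPump Python API; the pipeline builder produces an equivalent structure from a configuration file. Class names follow the public BSPump examples, and the file path is a placeholder.

```python
import bspump
import bspump.common
import bspump.file
import bspump.trigger


class SamplePipeline(bspump.Pipeline):
    """Reads lines from a file whenever the trigger fires and pretty-prints them."""

    def __init__(self, app, pipeline_id):
        super().__init__(app, pipeline_id)
        self.build(
            # Triggered source: the trigger decides when the file is read
            bspump.file.FileLineSource(
                app, self, config={"path": "./input.txt"}  # placeholder path
            ).on(bspump.trigger.OpportunisticTrigger(app)),
            # Sink: print each event to stdout
            bspump.common.PPrintSink(app, self),
        )


if __name__ == "__main__":
    app = bspump.BSPumpApplication()
    svc = app.get_service("bspump.PumpService")
    svc.add_pipeline(SamplePipeline(app, "SamplePipeline"))
    app.run()
```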
BlankApp for Telco
A new blank application for the BitSwan for Telco library can be used to quickly create a telco-oriented application. The BitSwan for Telco library is built on top of BSPump and includes pipelines, processors and lookups for processing telecommunication data in common formats such as XDR (GB, IuPS, S1 data). It is available on GitHub: https://github.com/LibertyAces/BitSwanTelco-BlankApp
OpenID connectivity
BitSwan can now authenticate against a specified OpenID provider, authorize the user, and receive and use the issued tokens.
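As a generic illustration of the protocol (not BitSwan-specific code), every OpenID Connect provider publishes its endpoints under a standardized well-known path; the issuer URL below is hypothetical.

```python
import json
import urllib.request

# Generic OpenID Connect discovery: each compliant provider exposes its
# authorization and token endpoints at this standardized well-known path.
issuer = "https://example-provider.org"  # hypothetical issuer URL

with urllib.request.urlopen(issuer + "/.well-known/openid-configuration") as resp:
    discovery = json.loads(resp.read().decode("utf-8"))

print(discovery["authorization_endpoint"])  # where the user authenticates
print(discovery["token_endpoint"])          # where tokens are obtained
```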
MySQL Transaction / Binary log
BitSwan is now able to connect to a MySQL database, read its transactions / binary logs, and process them as events. The binary log is mainly used for data recovery and replication. For more information, please refer to: https://dev.mysql.com/doc/refman/8.0/en/binary-log.html.
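To illustrate the underlying mechanism, the widely used python-mysql-replication library streams row changes from the binary log as events; this is a generic sketch, not necessarily the class BitSwan uses internally, and the connection settings are placeholders.

```python
from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import DeleteRowsEvent, UpdateRowsEvent, WriteRowsEvent

# Placeholder connection settings; MySQL must have binary logging enabled
# (log_bin) with binlog_format=ROW for row events to be available.
mysql_settings = {"host": "127.0.0.1", "port": 3306, "user": "repl", "passwd": "secret"}

stream = BinLogStreamReader(
    connection_settings=mysql_settings,
    server_id=100,                 # must be unique among replication clients
    only_events=[WriteRowsEvent, UpdateRowsEvent, DeleteRowsEvent],
    blocking=True,                 # keep waiting for new binlog entries
    resume_stream=True,
)

for binlogevent in stream:
    for row in binlogevent.rows:
        # Each row change becomes an event that a pipeline could process
        print(binlogevent.schema, binlogevent.table, row)
```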
Session Analyzer to hold session data for a specified time
Session Analyzer is able to store metrics or other defined data for a given amount of time, which can be specified in the configuration.
Throttles are now listed when accessing pipelines via REST call
The BSPump API pipeline endpoint now provides a list of throttles, i.e. processors or other objects that have paused the pipeline. This typically happens when a data queue (in ElasticSearchSink, for example) is full because the external system cannot keep up with the pipeline's throughput.
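A minimal sketch of querying the endpoint, assuming the BSPump web API listens on localhost:8080 and exposes a /pipelines endpoint returning JSON keyed by pipeline id; the path, port and the "Throttles" field name are assumptions that may differ per deployment and version.

```python
import json
import urllib.request

# Assumed endpoint; adjust host, port and path to your deployment.
with urllib.request.urlopen("http://localhost:8080/pipelines") as resp:
    pipelines = json.loads(resp.read().decode("utf-8"))

for pipeline_id, info in pipelines.items():
    # "Throttles" (assumed field name) lists the objects currently
    # pausing this pipeline.
    print(pipeline_id, info.get("Throttles"))
```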
TimeDriftAnalyzer sends fixed zero metrics when there is no history
When there is no history in the data, TimeDriftAnalyzer sends fixed zero metrics; the metrics are no longer computed from an empty history.
New move option in FileABCSource to move processed files to a specified directory
Processed files are now moved to a directory specified in the configuration; this directory cannot be a subfolder of the directory where the files are originally located. This feature replaces the default -processed label behavior.
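A minimal configuration sketch follows; the option names post and move_destination are assumptions about the FileABCSource configuration and may differ in your BSPump version.

```python
# Hypothetical file-source configuration that moves files after processing;
# the "post" and "move_destination" keys are assumed names.
file_source_config = {
    "path": "/data/incoming/*.log",    # directory/glob to scan for input files
    "post": "move",                    # move files once they have been processed
    "move_destination": "/data/done",  # must not be a subfolder of the scanned directory
}
```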
Kibana Object Library support in bskibana tool
The Kibana Object Library is a collection of index patterns, searches, lookups, visualizations, dashboards and other objects that can be used within a specific project to understand the data. It can be imported to / exported from ElasticSearch, and compiled to / decompiled from a file.
Specification of included/excluded index patterns when importing Kibana Object Library with bskibana tool
Prepared Kibana objects such as visualizations, dashboards, searches and index patterns may now be imported into ElasticSearch while specifying only the index patterns that are relevant for the current deployment. BSKibana then automatically includes only the objects connected to the specified index patterns.
- New scan_time metric in FileABCSource which reports the scan time of files in the specified directory
- Fixed bugs in FileABCSource
- Paging in ElasticSearchSource can be switched off
- New StringToBytesParser, BytesToStringParser, DictToJsonParser and JsonToDictParser processors (see the sketch after this list)
- Print and PrettyPrint processors and sinks now support stream specification
- Renamed parameter delete_fields to inverse in Filter
- Renamed parameter include to inclusive in Filter
- Introducing storage to TimeWindowAnalyzer
- Min/max date specification option when deleting ElasticSearch indices in bselastic tool
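As a brief illustration of the new parsers (the bspump.common module path is an assumption; verify it against your BSPump version), a pipeline can convert incoming JSON strings into dictionaries before they reach a sink:

```python
import bspump
import bspump.common


class JsonParsingPipeline(bspump.Pipeline):
    """Turns JSON strings injected into the pipeline into dicts and pretty-prints them."""

    def __init__(self, app, pipeline_id):
        super().__init__(app, pipeline_id)
        self.build(
            bspump.common.InternalSource(app, self),    # events are injected into the pipeline
            bspump.common.JsonToDictParser(app, self),  # '{"a": 1}' -> {"a": 1}
            bspump.common.PPrintSink(app, self),
        )
```

The pipeline is registered with the PumpService in the same way as in the pipeline sketch earlier in these notes.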