========= TT-series

High performance engine to store Time series data in Redis.

Redis Version required: >=5.0

TT-series is based on redis sorted sets to store the time-series data, Sorted set_ store scores with unique numbers under a single key, but it has a weakness to store records, only unique members are allowed and trying to record a time-series entry with the same value as a previous will result in only updating the score. So TT-series provide a solution to solve that problem.

TT series normally can support redis version > 3.0, and will support redis 5.0 in the future.

Tips

Max Store series length

For 32 bit Redis on a 32 bit platform redis sorted sets can support maximum 232-1 members, and for 64 bit redis on a 64 bit platform can support maximum 264-1 members. But large amount of data would cause more CPU activity, so better keep a balance with length of records is very important.
Only Support Python 3.6

Because python 3.6 changed the dictionary implementation for better performance, so in Python 3.6 dictionaries are insertion ordered. links: https://stackoverflow.com/questions/39980323/are-dictionaries-ordered-in-python-3-6
Performance Tips

With hiredis-py which it's targeted at speeding up parsing multi bulk replies from redis-server. So with a large amount of bulk data insertion or getting from redis-server, it can improve a great performance improvement.

Install

Install python package from pip release::

pip install ttseries

Documentation

Features

Quick inserts 100,000 records (1-2 sec) and get slice of 100,000 records (0.4-0.5 sec).
Support Data Serializer, Default Enable with MessagePack.
Support Data Compression.
In-order to void update previous records, Support Redis Hashes Time-Series storage format.
Support Numpy ndarray data type.
Support max length to auto to trim records.

Usage

TT-series provide three implementation to support different kinds of time-series data type.

RedisSimpleTimeSeries : Normally only base on Sorted sets to store records, previous records will impact the new inserting records which are NOT unique numbers.
RedisHashTimeSeries: Use Redis Sorted sets with Hashes to store time-series data, User don't need to consider the data repeatability with records, but sorted sets with hashes would take some extra memories to store the keys.
RedisNumpyTimeSeries: Support numpy.ndarray to store time-series records in redis sorted set.
RedisPandasTimeSeries: Support pandas.DataFrame to store time-series records in redis sorted set.

Serializer Data

TT-series use MsgPack_ to serializer data, because compare with other data serializer's solutions, MsgPack provide a better performance solution to serialize data. If user don't want to use MsgPack to serializer data, just inherit from ttseries.BaseSerializer class to implement the supported serializer class methods.

Examples

RedisSimpleTimeSeries && RedisHashTimeSeries && RedisNumpyTimeSeries && RedisPandasTimeSeries

Three series data implementation provide the same functions and methods, in the usage will provide the difference in the methods.

Prepare data records: ^^^^^^^^^^^^^^^^^^^^^

.. sourcecode:: python

from datetime import datetime
from redis import StrictRedis

now = datetime.now()
timestamp = now.timestamp()

series_data = []

for i in range(1000):
    series_data.append((timestamp+i,i))

client = StrictRedis() # redis client

Add records ^^^^^^^^^^^

.. sourcecode:: python

from ttseries import RedisSimpleTimeSeries

simple_series = RedisSimpleTimeSeries(client=client)

key = "TEST:SIMPLE"

simple_series.add_many(key, series_data)

Count records length ^^^^^^^^^^^^^^^^^^^^

Get the length of the records or need just get the length from timestamp span.

.. sourcecode:: python

# get the records length
simple_series.length(key)

# result: ...: 1000

# get the records length from start timestamp and end timestamp
simple_series.count(key, from_timestamp=timestamp, end_timestamp=timestamp+10)

# result: ...: 11

trim records ^^^^^^^^^^^^

Trim the records as the ASC.

.. sourcecode:: python

simple_series.trim(key,10) # trim 10 length of records

delete timestamp span ^^^^^^^^^^^^^^^^^^^^^

Delete timestamp provide delete key or delete records from start timestamp to end timestamp.

.. sourcecode:: python

simple_series.delete(key) # delete key with all records

simple_series.delete(key, start_timestamp=timestamp) # delete key form start timestamp

Get Slice ^^^^^^^^^

Get slice form records provide start timestamp and end timestamp with ASC or DESC ordered.

Default Order: ASC

If user want to get the timestamp great than (>) or less than (<) which not including the timestamp record. just use (timestamp which support <timestamp or >timestamp sign format like this.

.. sourcecode:: python

# get series data from start timestamp ordered as ASC.

simple_series.get_slice(key, start_timestamp=timestamp, acs=True)

# get series data from great than start timestamp order as ASC
simple_series.get_slice(key, start_timestamp="("+str(timestamp), asc=True)

# get series data from start timestamp and limit the numbers with 500
time_series.get_slice(key,start_timestamp=timestamp,limit=500)

iter ^^^^

yield item from records.

.. sourcecode:: python

for item in simple_series.iter(key):
    print(item)

RedisNumpyTimeSeries ^^^^^^^^^^^^^^^^^^^^

Numpy array support provide numpy.dtype or just arrays with data.

Use numpy.dtype to create records. must provide timestamp_column_name and dtype parameters.

.. sourcecode:: python

import numpy as np
from ttseries import RedisNumpyTimeSeries

dtype = [("timestamp","float64"),("value","i")]

array = np.array(series_data, dtype=dtype)

np_series = RedisNumpyTimeSeries(client=client, dtype=dtype, timestamp_column_name="timestamp")

Or just numpy array without dtype, but must provide timestamp_column_index parameter.

.. sourcecode:: python

array = np.array(series_data)

np_series = RedisNumpyTimeSeries(client=client,timestamp_column_index=0)

RedisPandasTimeSeries ^^^^^^^^^^^^^^^^^^^^^

Pandas TimeSeries use pandas.DataFrame to store time-series in redis. To initialize the class must provide columns and dtypes parameters.

columns parameter indicates the column names of the pandas.DataFrame.
dtypes indicates the dtype of each column in DataFrame, for example: { "value1":"int64","value2":"float32"} reference link: http://pbpython.com/pandas_dtypes.html

.. sourcecode:: python

from datetime import datetime

key = "AA:MIN"
now = datetime.now()
columns = ["value"]
date_range = pandas.date_range(now, periods=10, freq="1min")

data_frame = pandas.DataFrame([i + 1 for i in range(len(date_range))],
                            index=date_range, columns=columns)


dtypes = {"value":"int64"}
pandas_ts = RedisPandasTimeSeries(client=client, columns=columns, dtypes=dtypes)

Add ^^^

Add a time-series record to redis, series parameter indicates pandas.Series data type. and especial the series name value data type must be the pandas.DatetimeIndex.

.. sourcecode:: python

series_item = data_frame.iloc[0]
pandas_ts.add(key, series_item)

add_many ^^^^^^^^

Add large amount of pandas.DataFrame into redis, with the dataframe index data type must be the pandas.DatetimeIndex. For better insert performance, just use chunks_size to split the dataframe into fixed chunks_size rows of dataframes.

.. sourcecode:: python

pandas_ts.add_many(key, data_frame)

iter & Get ^^^^^^^^^^

retrieve records from redis sorted set, both of methods return pandas.Series.

.. sourcecode:: python

# yield all records data from redis
for item in pandas_ts.iter(key):
    print(item)
# return one record with specific timestamp
pandas_ts.get(key, 1536157765.464465)

get_slice ^^^^^^^^^

retrieve records to slice with start timestamp or end timestamp, with limit length. return pandas.DataFrame

.. sourcecode:: python

# return records from start timestamp 1536157765.464465
result_frame = pandas_ts.get_slice(key, start_timestamp=1536157765.464465)

# return records from start timestamp 1536157765.464465 to end timestamp 1536157780.464465
result2_frame = padas_ts.get_slice(key, start_timestamp=1536157765.464465, end_timestamp=1536157780.464465)

Benchmark

just run make benchmark-init, after then start make benchmark-test.

Go to the benchmark directory there exist an example of the benchmark test reports.

TODO

Support Redis 5.0
Support Redis cluster.

Author

Winton Wang

Donate

Contact

Email: [email protected]

Reference

links: https://www.infoq.com/articles/redis-time-series

.. _Sorted set: https://redis.io/commands#sorted_set .. _MsgPack: http://msgpack.org

.. |travis| image:: https://travis-ci.org/nooperpudd/ttseries.svg?branch=master :target: https://travis-ci.org/nooperpudd/ttseries

.. |appveyor| image:: https://ci.appveyor.com/api/projects/status/ntlhwaagr5dqh341/branch/master?svg=true :target: https://ci.appveyor.com/project/nooperpudd/ttseries

.. |codecov| image:: https://codecov.io/gh/nooperpudd/ttseries/branch/master/graph/badge.svg :target: https://codecov.io/gh/nooperpudd/ttseries

.. |codacy| image:: https://api.codacy.com/project/badge/Grade/154fe60c6d2b4e59b8ee18baa56ad0a9 :target: https://www.codacy.com/app/nooperpudd/ttseries?utm_source=github.com&utm_medium=referral&utm_content=nooperpudd/ttseries&utm_campaign=Badge_Grade

.. |pypi| image:: https://img.shields.io/pypi/v/ttseries.svg :target: https://pypi.python.org/pypi/ttseries

.. |status| image:: https://img.shields.io/pypi/status/ttseries.svg :target: https://pypi.python.org/pypi/ttseries

.. |pyversion| image:: https://img.shields.io/pypi/pyversions/ttseries.svg :target: https://pypi.python.org/pypi/ttseries

.. |requirements| image:: https://requires.io/github/nooperpudd/ttseries/requirements.svg?branch=master :target: https://requires.io/github/nooperpudd/ttseries/requirements/?branch=master

.. |docs| image:: https://readthedocs.org/projects/ttseries/badge/?version=latest :target: http://ttseries.readthedocs.io/en/latest/?badge=latest

.. |download| image:: https://pepy.tech/badge/ttseries :target: https://pepy.tech/project/ttseries

ttseries
ttseries copied to clipboard

Metadata

========= TT-series

Tips

Install

Documentation

Features

Usage

Serializer Data

Examples

Benchmark

TODO

Author

Donate

Contact

Reference

← Metadata

Owner

Metadata

ttseries ttseries copied to clipboard

Metadata

========= TT-series

Tips

Install

Documentation

Features

Usage

Serializer Data

Examples

Benchmark

TODO

Author

Donate

Contact

Reference

← Metadata

Owner

Metadata

ttseries
ttseries copied to clipboard