Time-Series Databases

About Time-Series Databases

Time-series databases (TSDBs) are optimized for understanding how time-stamped data changes over time. A TSDB stores data series that represent repeated measurements, such as network jitter. This time-stamped data might be collected every millisecond, which makes it a natural candidate for a time-series database.

Database designers exploit some common characteristics of time-series data to make TSDBs particularly fast, efficient, and cost-effective for certain workloads. Time-series data tends to:

  1. be written in extremely large quantities, often in real time (especially in use cases such as video observability)
  2. have many more writes than reads (for example, jitter measurements may need to be recorded every millisecond but are read far less often)
  3. have recent data read more often than older data
  4. be queried for statistical measurements by time period

Each TSDB vendor uses these characteristics to optimize architecture, compression, built-in statistical querying, and data lifecycle management (i.e. when to automatically roll up and summarize or delete old data).
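
To make "statistical measurements by time period" and "roll up and summarize" concrete, here is a minimal Python sketch of a time-bucketed rollup. It is not any vendor's implementation; the function name rollup, the bucket_ms parameter, and the sample jitter values are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def rollup(samples, bucket_ms):
    """Group (timestamp_ms, value) samples into fixed-width time buckets
    and summarize each bucket with count, min, max, and mean."""
    buckets = defaultdict(list)
    for ts_ms, value in samples:
        # Align each timestamp to the start of its bucket.
        bucket_start = (ts_ms // bucket_ms) * bucket_ms
        buckets[bucket_start].append(value)
    return {
        start: {
            "count": len(vals),
            "min": min(vals),
            "max": max(vals),
            "mean": mean(vals),
        }
        for start, vals in sorted(buckets.items())
    }


# Hypothetical jitter samples: (timestamp in ms, jitter in ms).
jitter = [(0, 2.0), (1, 4.0), (999, 3.0), (1000, 10.0), (1500, 20.0)]

# Summarize the raw samples into one-second buckets.
summary = rollup(jitter, bucket_ms=1000)
for start, stats in summary.items():
    print(start, stats)
```

A TSDB typically runs this kind of aggregation inside the database, and a retention policy might keep only the per-bucket summaries once the raw samples reach a certain age.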

One use of a TSDB in streaming video might be to correlate video player and CDN logs. Centralizing the data from players and CDNs could help identify the root causes behind QoE issues. In this case, both the player logs and the CDN logs are time series, and correlating changes in statistics across the two sets of logs at particular times can uncover where CDN delivery issues led to streaming dropouts. A TSDB makes this analysis simpler than it would be in a document or relational database.
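
The correlation described above can be sketched in a few lines of Python: count events from each log per time bucket, then compute a Pearson correlation across the aligned buckets. The event timestamps, the 60-second bucket width, and the helper names count_per_bucket and pearson are hypothetical, and a real analysis would need far more data than three buckets.

```python
from collections import Counter
from math import sqrt


def count_per_bucket(timestamps, bucket_s):
    """Count events per fixed-width time bucket, keyed by bucket start."""
    return Counter((ts // bucket_s) * bucket_s for ts in timestamps)


def pearson(xs, ys):
    """Pearson correlation of two equal-length series.
    Raises ZeroDivisionError if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical event timestamps, in seconds since the start of a session.
cdn_errors = [5, 7, 65, 66, 67, 68, 130]
player_stalls = [8, 66, 69, 70, 131]

cdn = count_per_bucket(cdn_errors, bucket_s=60)
player = count_per_bucket(player_stalls, bucket_s=60)

# Align both series on the union of buckets; a missing bucket counts as 0.
buckets = sorted(set(cdn) | set(player))
cdn_series = [cdn[b] for b in buckets]
player_series = [player[b] for b in buckets]

r = pearson(cdn_series, player_series)
print(f"correlation between CDN errors and player stalls: {r:.2f}")
```

A high correlation only suggests that the two series move together in time; confirming a root cause would still require inspecting the underlying log entries.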

Some key questions to consider when selecting a TSDB vendor:

  • How much data will you generate (now and into the future)? You want to make sure the vendor will support you well into the future without breaking the bank.
  • What is the cardinality and dimensionality of your data? Some vendors have fixed limits for cardinality and dimensionality.
  • Are you certain older data can be summarized and discarded? This could impact cluster size and storage needs.
  • Is your data certain to arrive in order of timestamp? Different vendors require different solutions for out-of-order data.
  • What query loads will you need to support across different business functions?
  • What security needs do you have for your data? Can your data be processed by a third party or outside of your own cloud?
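
Two of the questions above, cardinality and out-of-order arrival, can be estimated from a sample of your own data before talking to vendors. The Python sketch below assumes that cardinality means the number of distinct tag-value combinations (vendors may define and limit it differently) and that each point is a dict with hypothetical "ts" and "tags" fields.

```python
def series_cardinality(points):
    """Count distinct tag-value combinations; each one is treated as a series."""
    return len({tuple(sorted(p["tags"].items())) for p in points})


def out_of_order_fraction(points):
    """Fraction of points whose timestamp is older than the newest one seen so far."""
    if not points:
        return 0.0
    late = 0
    latest = float("-inf")
    for p in points:
        if p["ts"] < latest:
            late += 1
        else:
            latest = p["ts"]
    return late / len(points)


# Hypothetical sample of points, listed in arrival order.
points = [
    {"ts": 100, "tags": {"cdn": "a", "region": "us"}},
    {"ts": 101, "tags": {"cdn": "b", "region": "us"}},
    {"ts": 99, "tags": {"cdn": "a", "region": "eu"}},  # arrives late
    {"ts": 102, "tags": {"cdn": "a", "region": "us"}},
]

cardinality = series_cardinality(points)
late_fraction = out_of_order_fraction(points)
print(cardinality, late_fraction)
```

Running this over a representative sample gives rough numbers to compare against a vendor's documented limits and its handling of late-arriving data.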

Armed with answers to these questions and a solid understanding of both your data characteristics and your business needs, you can select the best TSDB vendor from the Datazoom datatecture.

Related Areas

Above

Datatecture Level 2

Storage and Caching

Below

There are no levels below this group.
