Which time series database should you use for IoT systems?

While most online tutorials nowadays describe InfluxDB as a very good time series database, we discovered that there are some serious points you should consider before selecting a time series database for your next IoT project.

Here are some findings we made, and questions we should have asked, when we researched and selected time series databases for a new IoT system design:

  • Designing a stable database engine suitable for production typically takes 10+ years of user feedback, experience from smaller-scale testing, and continuous improvement. Several of the “high-ranking” databases have not been around that long.
  • MySQL and Oracle have stable database engines by now, but they are not time series databases.
  • InfluxDB has rewritten its database engine several times over a period of a few years. This may be a hint that they are still struggling to converge on a stable database design.
  • We found several very worrying statements in bug reports related to InfluxDB, such as “please help, our production database is suddenly corrupt”, “the database has crashed, the backup is also corrupt. There must be something fundamentally wrong with storage”, “we have discontinued the function iusedthisextensivelyandthereisnowayaroundit() in the next release due to a complete rewrite of the database engine”, etc.
  • Some databases have very inefficient compression algorithms. They will compress, but reading from a compressed table carries such a big penalty that queries become too slow, forcing you to use uncompressed tables.
  • Some databases do not support writing to compressed tables (!). You have to decompress before writing, which takes a lot of time. And there is no API support for checking whether you are about to write to a compressed table, so you have to resort to parsing log messages before writing. Yes, it is true.
  • Even if a database shows up high in a ranking, it may be because it has the highest growth rate. That does not tell you much. If you look closer, you may discover that all the big players already use more conservative and well-proven time series database systems. That people reading tutorials download and install InfluxDB does not mean that Google, Facebook, Amazon etc. use it. It does not mean that it is the best time series database available.
  • Some databases claim to support continuous aggregates. However, did you benchmark and test properly that the functions used for reading from the aggregates are stable, do not require 100% CPU, and do not crash your database engine? You may be in for a surprise.
  • Some time series databases do not support regular data types and regular SQL syntax. This means you will need TWO databases: one for configuration data and one for time series.
  • Some time series databases from the cloud vendors become extremely expensive as soon as you go into production. Did you actually check the cost? If not, you may be in for a BIG surprise.
  • Even plotting time series in a web browser may be too slow. Did you check whether your time series database can decimate and deliver data to your front end in real time?
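To make the last point concrete, here is a minimal sketch of what "decimating" a series for plotting can look like. It is not tied to any particular database. The function name decimate_min_max and the bucket approach are our own illustration, assuming the points are already sorted by time. Each bucket keeps its minimum and maximum sample so that spikes remain visible in the plot.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (timestamp, value)


def decimate_min_max(points: List[Point], max_buckets: int) -> List[Point]:
    """Reduce a time-sorted series to at most 2 * max_buckets points.

    The series is split into consecutive buckets, and only the samples with
    the lowest and highest value in each bucket are kept.
    """
    if max_buckets < 1:
        raise ValueError("max_buckets must be at least 1")
    if len(points) <= 2 * max_buckets:
        return list(points)  # already small enough, nothing to do

    bucket_size = -(-len(points) // max_buckets)  # ceiling division
    reduced: List[Point] = []
    for start in range(0, len(points), bucket_size):
        bucket = points[start:start + bucket_size]
        lowest = min(bucket, key=lambda p: p[1])
        highest = max(bucket, key=lambda p: p[1])
        # The set drops the duplicate when min and max are the same sample;
        # sorting the tuples restores time order inside the bucket.
        reduced.extend(sorted({lowest, highest}))
    return reduced
```

If the database can do this kind of reduction on the server side, the browser only has to draw a few hundred points instead of millions. If it cannot, you end up doing it yourself in a layer like this one, and that has to be fast enough too.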

 

Find the position of the mast your 4G / LTE modem is connected to using the Cell ID! (4G / LTE Cell ID database with position info, Norway)

Are you wondering where the mast that your 4G / LTE modem is connected to is located? (LTE Cell ID database with locations, Norway)

Maybe this can help: I found a comprehensive database that contains approx. 147,000 lines at https://opencellid.org/. This dataset is made publicly available for download by OpenCelliD.

Here is the cell id data I downloaded for Norway:

 242.csv

Rename the file to 242.csv (it has a .txt extension) and open it, or import it, in Excel.

You can then find the lat and long position of your cell in the Excel sheet.

Then enter that lat / long (with decimals, and with N and E added after the numbers) into the search field of Google Maps.
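If you would rather skip Excel, the same lookup can be sketched in a few lines of Python. This is only an illustration: the column order in COLUMNS is an assumption about how the OpenCelliD export is laid out (check it against your own file), the SAMPLE row is made up for demonstration, and the function names find_cell and to_maps_query are ours.

```python
import csv
import io
from typing import Iterable, Optional, Tuple

# Assumed column order of the OpenCelliD CSV export (verify against your file).
COLUMNS = ["radio", "mcc", "net", "area", "cell", "unit", "lon", "lat",
           "range", "samples", "changeable", "created", "updated",
           "averageSignal"]

# Made-up example row, only here to show the expected shape of the data.
SAMPLE = "LTE,242,1,1234,5678901,0,10.7522,59.9139,1000,5,1,1500000000,1500000000,0\n"


def find_cell(lines: Iterable[str], cell_id: int,
              net: Optional[int] = None) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) of the first row matching cell_id, or None."""
    for row in csv.DictReader(lines, fieldnames=COLUMNS):
        if row["cell"] != str(cell_id):
            continue  # also skips a header row, if the file has one
        if net is not None and row["net"] != str(net):
            continue  # optionally restrict the match to one operator code
        return float(row["lat"]), float(row["lon"])
    return None


def to_maps_query(lat: float, lon: float) -> str:
    """Format a position as decimals followed by N/S and E/W."""
    ns = "N" if lat >= 0 else "S"
    ew = "E" if lon >= 0 else "W"
    return f"{abs(lat):.6f}{ns} {abs(lon):.6f}{ew}"


position = find_cell(io.StringIO(SAMPLE), 5678901)
```

To run it against the real file, replace io.StringIO(SAMPLE) with open("242.csv", newline="") and use the Cell ID reported by your modem. The string returned by to_maps_query(*position) is what you paste into the search field.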