How to import data into InfluxDB with Python and Pandas

InfluxDB is a database for time series data, and I have recently been developing an application for geodetic and geotechnical monitoring around it. My input data is usually in text form. There are several ways to get text data into InfluxDB, including annotated CSV and line protocol. I find both a bit fiddly. The Python client can also write a Pandas DataFrame directly. Unfortunately, this is barely documented, and many examples seem to use an older version of the API.

Here are the necessary imports (pip install influxdb-client pandas might be required):

import pandas as pd
import pytz
from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS

InfluxDB is timezone-aware, so make sure that your timestamps are in the right timezone:

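timezone = pytz.timezone("Europe/Amsterdam")
# filename is the path to your CSV file, which needs a "timestamp" column
data = pd.read_csv(filename, sep=",", index_col=False, parse_dates=["timestamp"])
data.set_index("timestamp", inplace=True)
# the parsed timestamps are naive; attach the timezone they were recorded in
data.index = data.index.tz_localize(timezone)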
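As a quick check of the localization step, here is a minimal, self-contained sketch that reads a small in-memory CSV and prints the first timestamp in UTC. The sample rows and the column names `sensor` and `value` are assumptions made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data; only the "timestamp" column name comes from the text above.
csv_text = """timestamp,sensor,value
2023-07-01 12:00:00,A,1.5
2023-07-01 13:00:00,A,1.7
"""

data = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])
data = data.set_index("timestamp")

# Attach the local timezone to the naive timestamps.
data.index = data.index.tz_localize("Europe/Amsterdam")

# View the same instants in UTC; Amsterdam is two hours ahead of UTC in July.
utc_index = data.index.tz_convert("UTC")
print(utc_index[0])  # 2023-07-01 10:00:00+00:00
```

Around the autumn clock change, local timestamps occur twice and tz_localize raises an error by default; its ambiguous parameter controls how such timestamps are resolved.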

It is important to know that InfluxDB distinguishes between fields (columns that hold the measured values, usually numeric) and tags (columns that hold label strings). By default, the API treats every column as a field, which will cause issues later on. So you have to state explicitly which columns should be treated as tags, using data_frame_tag_columns.
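To make the distinction concrete, the sketch below formats a single point in InfluxDB line protocol, where tags follow the measurement name and fields come after the first space. The helper `to_line` and the sample names and values are hypothetical; it skips escaping and only handles float fields, so it is an illustration rather than a replacement for the client.

```python
def to_line(measurement, tags, fields, timestamp_ns):
    """Hypothetical helper: format one point as line protocol (no escaping, float fields only)."""
    tag_part = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={float(value)}" for key, value in fields.items())
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


line = to_line(
    "name",
    tags={"tag1": "north", "tag2": "prism"},  # label strings
    fields={"height": 12.5},                  # measured value
    timestamp_ns=1688205600000000000,         # nanoseconds since the Unix epoch
)
print(line)  # name,tag1=north,tag2=prism height=12.5 1688205600000000000
```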

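# the URL must include the scheme (http:// or https://)
client = InfluxDBClient(url="http://servername:8086", token="influxdb_token", org="your_organisation", debug=False)
write_api = client.write_api(write_options=SYNCHRONOUS)
tags = ["tag1", "tag2"]  # columns to store as tags; all other columns become fields
# bucketname is the name of the target bucket
# timestamps are taken from the DataFrame index, so no timestamp column is passed
write_api.write(bucketname, "your_organisation", record=data, data_frame_measurement_name="name", data_frame_tag_columns=tags)
write_api.close()
client.close()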
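The tag list above is written by hand. For files with many columns, one possible shortcut is to derive it from the column dtypes. This is a sketch under the assumption that every non-numeric column is a label and every numeric column is a measured value; the example frame and its column names are made up.

```python
import pandas as pd

# Hypothetical example frame; the column names are assumptions for illustration.
data = pd.DataFrame(
    {
        "sensor": ["A", "B"],
        "location": ["dike", "quay"],
        "height": [12.5, 12.7],
        "tilt": [0.01, 0.02],
    },
    index=pd.to_datetime(["2023-07-01 12:00", "2023-07-01 13:00"]),
)

# Treat every non-numeric column as a tag and every numeric column as a field.
tags = data.select_dtypes(exclude="number").columns.tolist()
fields = data.select_dtypes(include="number").columns.tolist()

print(tags)    # ['sensor', 'location']
print(fields)  # ['height', 'tilt']
```

The resulting tags list can be passed as data_frame_tag_columns. Check it before writing, since a numeric identifier column (a station number, say) would end up as a field under this rule.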
