How I used Home Assistant and Python to ingest ESP32 sensor data

In my last blog post, I assembled some temperature/humidity sensors using ESP32 chips and BME280 sensors. Once those were up and running, the next step was to actually get data from them. My goal was to grab the data from those sensors and put it into my Elastic stack, the sink for all logs I care about. In this blog post, I’ll share how I did that using Home Assistant and Python.

Home Assistant Explained

What is Home Assistant? In short, it’s a home automation tool meant to make it easy to connect various devices and build actions based on triggers and rules you define. While automation is the focus, I decided to use it not with automation in mind, but for pure data collection. There were two main reasons I went with Home Assistant: first, it works very well with ESPHome, and second, it has an API, so I can easily extract data from it and ship it elsewhere.

To get Home Assistant up and running you have a ton of options. Long-term, I may switch to a VM or dedicated hardware like a NUC, but to get started quickly I went the Docker container route. This command is all you need:

docker run -d \
  --name homeassistant \
  --privileged \
  --restart=unless-stopped \
  -e TZ=MY_TIME_ZONE \
  -v /PATH_TO_YOUR_CONFIG:/config \
  -v /run/dbus:/run/dbus:ro \
  --network=host \
  ghcr.io/home-assistant/home-assistant:stable

Pro-tip: make sure the config volume points to a real path on your host so your configuration actually persists; I learned this the hard way when I restarted my container and had to essentially start from scratch. Setup is pretty trivial, so I didn’t lose too much time, but it’s still annoying to lose my work.

Once that was up and running, I visited the URL for the host it runs on, such as http://localhost:8123, and after configuring my first user I was greeted with the dashboard:

Home Assistant overview

Adding Devices

Because Home Assistant supports ESPHome, it automatically detected the devices on my network and prompted me to add them. After naming them and assigning them to an area, I had all of my sensors in Home Assistant in no time:

Home Assistant ESPHome devices

That also meant that I could see the data being collected in Home Assistant, which has its own graphs and dashboards ready to go:

Temperature graph in Home Assistant

This was great, but I still needed one more step to achieve my goal: to actually get the data into Elastic. Fortunately, as previously mentioned, Home Assistant has a well-documented API, so all I needed to do was create an API token and start making API calls.
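If you want to sanity-check the token before writing any code, a quick curl against the states endpoint will do it (the token and entity ID below are placeholders; you can look up the real entity IDs in the Home Assistant UI):

curl -H "Authorization: Bearer YOUR_LONG_LIVED_TOKEN" \
     -H "Content-Type: application/json" \
     http://localhost:8123/api/states/sensor.YOUR_SENSOR_ENTITY

The response is a small JSON document with the entity_id, the current state, its attributes (such as the unit of measurement), and a couple of timestamps; the state value is what I ultimately care about.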

Sending to Elastic

The Home Assistant API returns JSON data, so my plan was to grab the data and just dump it to a Logstash JSON listener. It was pretty much a copy/paste of the listener I use for CoreTemp data, just on a different port (I’ve included a sketch of it further down). With the listener set up, I created a very simple Python script to pull the data and send it to Logstash, and hence to Elastic. Note: there is a Home Assistant wrapper for Python that I could have used, but the API is so simple that I went with plain requests instead. In my script, I have a function to grab the data for a given entity:

import requests


def get_data(entity_id, token, logger):
    """Fetch the current state of a Home Assistant entity as a JSON string."""
    base_url = "http://localhost:8123/"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }

    url = base_url + f"api/states/{entity_id}"
    try:
        response = requests.get(url, headers=headers, timeout=10)
        # Return the raw body; error responses such as "Entity not found."
        # are passed through and handled by the caller.
        return response.text
    except requests.exceptions.HTTPError as err:
        logger.error(f"HTTP error occurred: {err}")
    except Exception as e:
        logger.error(f"An error occurred: {e}")

This uses the API token created in Home Assistant to get the latest state for an entity and return the data. In the main function, I iterate through a list of entities provided on the command line, and if I get data, I create a socket to send to the Logstash listener:

for entity in args.entities:
    logger.info(f"Getting data for entity {entity}")
    data = get_data(entity, args.token, logger)

    # Skip entities where the request itself failed
    if data is None:
        continue

    jsondata = json.loads(data)
    # Skip entities that don't exist
    if jsondata.get("message") == "Entity not found.":
        continue
    # Skip entities whose data is unavailable
    if jsondata.get("state") == "unavailable":
        continue

    # Ship the raw JSON to the Logstash TCP listener
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("192.168.X.XXX", 5201))
    s.sendall(bytes(data, encoding="utf-8"))
    s.close()

This will skip entities that don’t exist or are offline, which helps keep bad data out of Elastic. To run it, I set up a cron job that pulls the data every 5 minutes.
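As promised, here is roughly what that Logstash listener looks like: just a TCP input with a JSON codec on the port the script connects to. The output section below is a simplified sketch; your Elasticsearch hosts, credentials, and data stream naming will differ, and I’ve left the authentication settings out entirely:

input {
  tcp {
    port  => 5201
    codec => json
  }
}

output {
  elasticsearch {
    hosts               => ["https://localhost:9200"]
    data_stream         => "true"
    data_stream_dataset => "homeassistant"
  }
}

With those data stream options, events end up in a data stream named something like logs-homeassistant-default.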

On the Elastic side of things, I set up a separate data stream and index template for this data; the main thing is to make sure the state field is mapped as a float so you can do math on the values (a bare-bones example template is below). After configuring everything properly and setting up some Kibana metrics and visualizations, I had all of that data in Elastic!
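Something along these lines is all the template needs; the template name, index pattern, and priority here are illustrative and should line up with however you name your data stream:

PUT _index_template/homeassistant
{
  "index_patterns": ["logs-homeassistant-*"],
  "data_stream": {},
  "priority": 200,
  "template": {
    "mappings": {
      "properties": {
        "state": { "type": "float" }
      }
    }
  }
}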

Kibana dashboard

I have to say that, surprisingly, the coolest part is the barometric pressure graph, since it tells you very quickly when storms are coming (falling pressure) or when nice weather is ahead (pressure on the upswing):

Pressure graph

I may clean up the Python script a bit and put it online, and if I do I’ll let people know. Happy data collecting and automating!
