Project showcase
Temperature Streaming with Arduino + Big Data Tools

License: GPL3+

Platform for processing of streaming temperature data using Arduino, DHT sensor, ESP8266 module and Big Data / Hadoop ecosystem tools.

About this project


This project covers the deployment of a simple architecture for the real-time and batch processing of temperature sensor data captured with Arduino, using open source technologies from the Big Data ecosystem. The purpose of the solution is to illustrate the flow of data through the different tools, from capture to transformation and insight generation.

Within the presented architecture, the services that publish, transfer and store data are agnostic to the format in which the Arduino board sends it. This motivates building a centralized service that distributes messages from different sending devices to many clients or services capable of consuming the data.

Under this premise, the range of applications for this architecture is limited mainly by the creativity put into the devices that emit the information.


  • Arduino IDE 1.8.5
  • Hive 1.2.1
  • Kafka 0.10.0
  • Spark 1.6.2
  • Zeppelin Notebook 0.6.0
  • NiFi 1.2.0

Data flow

1.- Data generation from humidity/temperature sensor

  • The code loaded in the Arduino platform takes readings through the DHT sensor every 3 seconds, capturing:
      • Percentage of humidity in the environment.
      • Temperature in Celsius (°C)
      • Temperature in Fahrenheit (°F)
  • The Heat Index is calculated. This measure determines how people perceive the temperature according to the humidity of the environment.
  • A request is made to an external web service to determine the time of reading according to a predefined time zone.
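
The write-up does not show how the Heat Index is computed. As a rough illustration, the following Python sketch follows the regression published by the US National Weather Service (Rothfusz), with its simple estimate for mild conditions and its two humidity adjustments. Whether the DHT sensor library uses exactly this formula is an assumption, and the function names are hypothetical.

```python
import math


def celsius_to_fahrenheit(temp_c: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def heat_index_f(temp_f: float, humidity: float) -> float:
    """Heat index in °F from temperature (°F) and relative humidity (%).

    Assumption: NWS/Rothfusz regression; not taken from the project's sketch.
    """
    # Simple estimate, used as-is when the result stays below 80 °F.
    simple = 0.5 * (temp_f + 61.0 + (temp_f - 68.0) * 1.2 + humidity * 0.094)
    if simple < 80.0:
        return simple

    t, rh = temp_f, humidity
    hi = (-42.379 + 2.04901523 * t + 10.14333127 * rh
          - 0.22475541 * t * rh - 0.00683783 * t * t
          - 0.05481717 * rh * rh + 0.00122874 * t * t * rh
          + 0.00085282 * t * rh * rh - 0.00000199 * t * t * rh * rh)

    # Adjustments for very dry and very humid air.
    if rh < 13.0 and 80.0 <= t <= 112.0:
        hi -= ((13.0 - rh) / 4.0) * math.sqrt((17.0 - abs(t - 95.0)) / 17.0)
    elif rh > 85.0 and 80.0 <= t <= 87.0:
        hi += ((rh - 85.0) / 10.0) * ((87.0 - t) / 5.0)
    return hi
```

For example, `heat_index_f(celsius_to_fahrenheit(32.0), 70.0)` estimates how a 32 °C reading at 70% humidity is perceived.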

2.- Data publication to the MQTT server

  • The message or payload that will be sent to the MQTT server is built:
      • The payload is in JSON format.
      • It contains the data captured by the sensor, the calculated information, the date/time of the reading, the number of milliseconds elapsed since the Arduino platform started and a unique identifier of the transmitting device.
  • The Internet and MQTT broker connections are verified.
  • The payload is published to the MQTT broker on a specific topic under a predefined username and password.
  • The MQTT broker has a list of permissions that defines which users can publish information over existing topics.
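
To make the payload description concrete, here is a minimal Python sketch that assembles a JSON message with the kinds of fields listed above. The field names, the device identifier and the `build_payload` helper are all hypothetical; the actual names used by the sketch running on the ESP8266 are not given in this write-up.

```python
import json


def build_payload(device_id, humidity, temp_c, temp_f, heat_index_c,
                  read_at, uptime_ms):
    """Serialize one sensor reading as a compact JSON string.

    All field names below are illustrative assumptions.
    """
    payload = {
        "device_id": device_id,        # unique identifier of the transmitter
        "humidity": humidity,          # relative humidity, percent
        "temp_c": temp_c,              # temperature in Celsius
        "temp_f": temp_f,              # temperature in Fahrenheit
        "heat_index_c": heat_index_c,  # calculated heat index
        "read_at": read_at,            # date/time of the reading
        "uptime_ms": uptime_ms,        # milliseconds since the board started
    }
    # Compact separators keep the message small on a constrained device.
    return json.dumps(payload, separators=(",", ":"))


message = build_payload("esp8266-01", 55.0, 24.0, 75.2, 24.1,
                        "2018-01-01T12:00:00", 123456)
```

Because the downstream services are agnostic to the message format, fields could be added or renamed here without changing the broker or the transfer steps.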

3, 4 & 5.- Real-time data capture

  • The Apache NiFi service has an organized set of instructions that orchestrate the flow of data as it is captured:
      • NiFi subscribes to the Mosquitto topic and captures messages in real time.
      • NiFi complements the received messages (JSON strings) by defining new fields, kept outside the message, that describe technical aspects of the message and the MQTT broker.
      • NiFi inserts the messages and the new fields into the Hive data store.
      • NiFi publishes the original message to Kafka.
  • Hive and Kafka store the data:
      • Hive allows batch processing of historical data.
      • Kafka allows real-time processing of data sent by the Arduino platform.
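
The fan-out performed by NiFi can be pictured with a small Python sketch: the Hive branch receives the parsed message plus the broker-related fields, while the Kafka branch receives the original message untouched. This is only a model of the routing logic, not NiFi configuration, and the attribute names (`mqtt.topic`, `mqtt.qos`) and function names are illustrative assumptions.

```python
import json


def enrich_for_hive(message: str, mqtt_attributes: dict) -> dict:
    """Parse the JSON message and add broker metadata as extra columns."""
    row = json.loads(message)
    for name, value in mqtt_attributes.items():
        # Dots are replaced so the names are usable as column names.
        row[name.replace(".", "_")] = value
    return row


def route(message: str, mqtt_attributes: dict) -> dict:
    """Return what each destination would receive for one message."""
    return {
        "hive": enrich_for_hive(message, mqtt_attributes),  # message + metadata
        "kafka": message,                                   # original message
    }


routed = route('{"temp_c": 24.0}', {"mqtt.topic": "sensors/dht", "mqtt.qos": 0})
```

Keeping the metadata outside the message means Kafka consumers see exactly what the device sent, while the Hive table can still be filtered by topic or broker details.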

6 & 7.- Data processing

  • Zeppelin runs code blocks (Scala and SQL):
      • It is possible to query the data stored in the data warehouse.
      • It is possible to subscribe in real time to the Kafka topic to process the messages under different time windows.
  • The code is executed on Spark.
  • The data obtained in each time window is transformed and stored in Hive tables.
  • The average of the captured temperatures is calculated for every window.
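
The per-window average can be illustrated without Spark. The Python sketch below groups readings into fixed, non-overlapping (tumbling) windows and averages each one. It assumes readings arrive as `(epoch_seconds, temperature)` pairs and is not the notebook's Scala code; the window type and length used there are not stated in this write-up.

```python
from collections import defaultdict


def window_averages(readings, window_seconds):
    """Average temperature per tumbling window.

    readings: iterable of (epoch_seconds, temperature) pairs.
    Returns {window_start: mean temperature}, ordered by window start.
    """
    buckets = defaultdict(list)
    for ts, temp in readings:
        # Align each timestamp to the start of the window it falls in.
        window_start = int(ts // window_seconds) * window_seconds
        buckets[window_start].append(temp)
    return {start: sum(temps) / len(temps)
            for start, temps in sorted(buckets.items())}


# Readings every 3 seconds, grouped into 10-second windows.
averages = window_averages([(0, 20.0), (3, 22.0), (6, 24.0), (12, 30.0)], 10)
```

Here the first three readings fall in the window starting at 0 and the last one in the window starting at 10.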


  • Adafruit Unified Sensor 1.0.2
  • DHT sensor library 1.3.0
  • PubSubClient 2.6.0
  • Time 1.5.0
  • NTPClient 3.1.0


General settings:

  • Board: "ESP8266 Generic Module"
  • Flash Mode: "DIO"
  • Flash Size: "512K (64 SPIFFS)"
  • Debug port: "Disabled"
  • Debug level: "None"
  • Reset Method: "ck"
  • Crystal Frequency: "26 MHz"
  • Flash Frequency: "40 MHz"
  • CPU Frequency: "80 MHz"
  • Upload Speed: "115200"
  • Programmer: "AVRISP mkII"

Serial monitor

  • Autoscroll
  • Both NL & CR
  • 115200 baud


In this project the instructions are not loaded onto the Arduino board but onto the ESP8266 module, since it is this module that manipulates, transforms and sends the data.

To load instructions onto the WiFi module, it must enter Flash Mode at startup, which is achieved through the pin configuration shown in the Pinout diagram (Flash Mode).

It is recommended that the Arduino board has no instructions loaded while the code is being uploaded to the ESP8266 module.


In the serial monitor we can observe the process of connection, capture and publication of messages.

If we subscribe to the Mosquitto topic we can see how the messages are published by the Arduino board in real time.

NiFi publishes the captured messages to Kafka and Hive. In the latter, additional fields related to the MQTT server are recorded in the table.

Once the NiFi template is started, if we subscribe to the Kafka topic to which the messages are redirected, we can observe that they are published almost as soon as they are received at Mosquitto. In the following image we can see the reception of messages in the Mosquitto topic (left) and the Kafka topic (right).

On the other hand, if we query the Hive table periodically, we will notice that the number of records increases as NiFi captures messages.

The notebook, developed in Scala, is in JSON format, can be imported into Zeppelin and is divided into 7 paragraphs:

1.- Setup.

2.- Data capture.

3.- Calculation of temperature averages by window.

4.- Kmeans model creation and training.

5.- Data classification (window)

6.- Data classification (random data)

7.- Data inspection.
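
Paragraphs 4 to 6 train a K-means model and use it to classify data. As a rough picture of what that involves, the Python sketch below runs Lloyd's algorithm on a single numeric feature and then assigns new values to the nearest cluster center. The features, number of clusters and library used by the notebook are not stated here, so the single temperature feature, `k=2` and the function names are assumptions.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's algorithm on scalar values; returns the sorted cluster centers."""
    ordered = sorted(values)
    # Deterministic start: k values spread evenly over the sorted data.
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)]
               for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the previous center when a cluster ends up empty.
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return sorted(centers)


def classify(value, centers):
    """Index of the cluster center closest to the given value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


centers = kmeans_1d([18.0, 19.0, 20.0, 30.0, 31.0, 32.0], k=2)
```

With these sample averages the two centers settle on a "cool" and a "warm" group, and `classify` labels a new window average or a randomly generated value by proximity.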

Next improvements

This project lacks the following characteristics that could increase the value of the possible applications for these technologies:

  • Arduino board integration with different types of sensors.
  • Multiplexing of signals sent to the WiFi module.
  • Development of status and control indicators (LEDs, alerts, alarms).
  • Flash Mode activation/deactivation button.


Detailed instructions for replicating this project can be found in this GitHub repository, along with the sketches, templates, notebooks and test data used.

At the moment it is in Spanish, so until it is translated into English you can practice your skills in this language ;).


Arduino temperature streaming demo
Original project source.


Pinout diagram of the Arduino components (Flash Mode)
Pinout diagram of the Arduino components. This pin arrangement starts the ESP8266 module in Flash Mode so that code can be loaded.

The upper section of the breadboard is dedicated to the pins of the ESP8266 module, which is powered at 3.3V and uses a 10k resistor. Power to the WiFi module is controlled by the green pin connected to the breadboard in the last column of the positive row.

The lower section of the breadboard is almost entirely dedicated to the DHT temperature sensor, which operates at 5V with a 1k resistor.
Pinout diagram of the Arduino components (Boot Mode)
Pinout diagram of the Arduino components. This pin arrangement starts the ESP8266 module in Boot Mode, so that it executes the loaded code and sends data.

After the instructions have been loaded, connect the GPIO0 pin (white) of the ESP8266 module to the supply voltage through the resistor. In this way, the ESP8266 module will not enter Flash Mode the next time the Arduino platform is started, allowing it to execute the loaded code as soon as it receives power.
In addition, the blue pin of the DHT sensor carries the output signal, which the WiFi module must read through its GPIO2 pin (blue).
Pinout diagram of the DHT11 sensor
Pinout diagram of the DHT11 temperature/humidity sensor.
Pinout diagram of the ESP8266 module
Pinout diagram of the ESP8266 ESP-01 WiFi module.

