Project tutorial
Eye-to-Speech Module

Eye-to-Speech Module © GPL3+

We want to help people who are unable to communicate verbally and cannot use sign language by providing a wearable “eye-to-speech" solution.


Components and supplies

Necessary tools and machines

About this project

1. Introduction

As part of our student project Speak4Me, we want to help people who are unable to communicate verbally and who suffer from severe physical disabilities by providing a wearable “eye-to-speech” solution that allows basic communication. Using our device, users can express themselves through customizable phrases, with situational and self-defined profiles that let them communicate with their environment using 128 phrases.

Through simple eye movements in certain directions and within certain timeframes, Speak4Me outputs audible predefined phrases and enables basic communication. A simple web interface allows the creation of profiles without any limitation on word complexity or length.

Following a simple step-by-step video tutorial, Speak4Me can be built for around $100, which is well below existing solutions. All product-related information regarding needed materials and construction manuals is provided via a central homepage.

Teaser Video

Project GitHub page including step-by-step video tutorials and the no-code customization tool: https://speak4me.github.io/speak4me/

1.1 Motivation

The UN has defined 17 Sustainable Development Goals (SDGs), which all UN members adopted in 2015. The main focus of the SDGs is to create a shared blueprint for more prosperity for people and the environment by 2030 (cf. UN, 2015). Speak4Me embraces SDG 10, which aims to reduce inequalities within and among countries, especially in poorer regions. Within this SDG, our solution can bring value to people with speech-related disabilities by giving them a way to communicate. Worldwide, over 30 million people are mute and therefore unable to communicate using their voice (cf. ASHA, 2018).

Our main focus is patients with serious mental and/or physical disabilities who are not only verbally mute but also unable to communicate using gestures and other means. This includes patients with ALS, apraxia and other degenerative diseases that lead to a slow loss of control over body functions, as well as patients affected by spine damage that makes it impossible to communicate via body language and other means. Birth defects, damage to the vocal cords, accidents that damage relevant organs and many other conditions can also lead to muteness.

Physical muteness is rarely an isolated condition. Most commonly it is the result of other underlying conditions, such as deaf-muteness, which is the most common reason for people being unable to communicate verbally (cf. Destatis 2019).

Speak4Me could also help patients with temporary conditions. In Germany alone, over 250,000 people suffer strokes every year (cf. Destatis 2019). During recovery, our solution can help severely affected patients communicate with their environment, which might otherwise be impossible. Other nursing cases, such as patients suffering from locked-in syndrome, could also benefit, as only control of the eyes is required.

1.2 Objectives

Speak4Me aims to provide an affordable, customizable device that is easy to build and use and that supports handicapped people in communicating with their environment. Language synthesizers and speech computers exist, but they are very expensive, which can make them unaffordable depending on socioeconomic background, including country of residence. Our target is to deliver a solution below $100 in total cost to reach as many affected people in the world as possible. By providing blueprints and the code base of the entire solution, we want to encourage others to build upon our work and improve or adapt it.

1.3 Background

Our entire solution is based on the open Arduino platform, which is built around standardized hardware and a basic coding language. Previous projects based on the Arduino platform have shown promising results in relation to eye tracking (cf. Arduino, 2018). Using infrared sensors attached to an ordinary pair of glasses, a project team was able to visualize eye movement on LEDs arranged to resemble a human eye. This allowed them to show the direction the user was looking (right or left) in real time and also to track blinking.

We iterated on this capability and transformed the eye-tracking functionality into an eye-controlled user interface. Using this interface, we track eye movement up, down, left and right and assign predefined phrases to certain combinations of movements. Should a user, for example, look up followed by looking right, Speak4Me interprets this movement (“up” followed by “right”) and the integrated text-to-speech module synthesizes the phrase placed there in the profile.

2. Design Method

Speak4Me was created in several iterations. In its initial form, the sensors were attached directly to the glasses, which was insufficient for our needs. The distance between the eye and the sensors was too great, which led to unreliable data streams. While the system could follow movement, the directional tracking was too vague to reliably register movement patterns. To overcome this challenge, we decided to attach a 3D-printed sensor ring directly onto the glasses to reduce the distance between eyes and sensors. This change drastically increased the quality of the sensor readings and yielded much better results.

As our solution is intended to work in close proximity to the human eye, we first had to verify its feasibility with regard to long-term health effects. We had to make sure that our sensors would not be in any form harmful to the wearer, even over longer periods. From our first prototypes, we could assume a distance of 7-10 mm from the sensors to the eye. With support, we managed to estimate the optical power of our sensors, which is below the exposure limit value mentioned in the corresponding guideline.

The “worst-case” exposure limit value is (2.8×10^4)/C(α) W·m⁻²·sr⁻¹ (Guideline BGI 5006, 3.6.1). If we now look at the aperture or directivity of the optics at a total angle of about 90° (±45°; beyond that the intensity drops significantly) at 7 mm distance, we get a circle with an area of (tan(30°) × 7 mm)² × π ≈ 51.3 mm². The semiconductor junction of the QTR-1RC reflectance sensor works at 1.2 V nominal with a forward current of 20 mA, i.e. 24 mW (Fairchild Semiconductor Corporation). Assuming an efficiency of 20%, the optical power should be in the range of around 5 mW at a wavelength of 940 nm. Combining these calculations leads to a result of 5 mW / 51.3 mm² ≈ 0.097 mW/mm². This is below the exposure limit value; even with 10 mW of optical power, it would still comply. We assume no liability and advise you to familiarize yourself with the subject matter before use so that injuries can be avoided.
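For readability, the same estimate collected in one place (A, P_opt and E are shorthand we introduce here for the illuminated area, the optical power and the resulting power density):

\[
A = (\tan 30^\circ \cdot 7\,\mathrm{mm})^2 \cdot \pi \approx 51.3\,\mathrm{mm}^2,\qquad
P_{\mathrm{opt}} \approx 0.2 \cdot (1.2\,\mathrm{V} \cdot 20\,\mathrm{mA}) \approx 5\,\mathrm{mW}
\]
\[
E = \frac{P_{\mathrm{opt}}}{A} = \frac{5\,\mathrm{mW}}{51.3\,\mathrm{mm}^2} \approx 0.097\,\mathrm{mW/mm^2}
\]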

We initially used the Arduino Uno, which met all of our requirements, and built our first working prototype around this platform. Although sufficient in terms of features and our intended pricing, we later transitioned to another model. By switching to an Arduino Nano, we managed to fit the same feature set into a much smaller form factor. As Speak4Me is intended to be used indoors and outdoors, we had to pay attention to usability attributes like size and weight.

In its final form, all the needed hardware fits onto a 64×55 mm circuit board. For our solution we decided to use a custom-made circuit board optimized for our needs, but the use of standard-sized boards is also possible. To further protect the device, we modeled and 3D-printed a casing with exact measurements and the needed cutouts.

One of our first hardware decisions concerned the text-to-speech module. Our research revealed several text-to-speech solutions that cost only around $10 and fulfilled all our product requirements. Further testing showed, however, that the output quality of those cheaper modules was adequate at best. We decided on the more expensive Parallax Emic 2, which costs around $60 but provides far superior playback quality. This decision drove up the price of Speak4Me, and the module is by far the most expensive part of our product. Although expensive, we chose it because the entire feasibility of Speak4Me depends on the wearer being understood.

All relevant design files needed to build and deploy our solution, the .stl files for the 3D-printed case as well as the Gerber files for the circuit board, are listed at the bottom of the project post under Custom parts and enclosures.

3. Solution Design

This chapter describes the hardware and software aspects of our solution and how they are set up. We also provide the list of needed materials with pricing, quantities and possible purchase options.

3.1 Hardware

The architecture of our solution consists of two self-built components and a suitable power source, all connected by wire. The first component is the main device, which contains everything necessary for operation except the sensors and the power supply. The second component is a pair of glasses on which we mount infrared reflectance sensors to detect the viewing direction of the wearer. Since we are working with an operating voltage of 5 V and a maximum current of around 400 mA, any common USB power bank is sufficient for our device.

3.2 Material List

The required materials to build your own Speak4Me glasses, as well as a list of recommended tools for assembly, can be found at the top of this project post in the sections COMPONENTS AND SUPPLIES and NECESSARY TOOLS AND MACHINES.

3.3 Architecture

The hardware architecture is described below. The individual components and their connections are presented in detail to provide an overview.

3.3.1 Cable

To connect the four sensors with the main device, we needed four data lines and two lines for the sensors' power supply. The cable should be at least 1.5 meters long so that the main device can, for example, be mounted to a wheelchair and stay independent of the placement of the glasses and potential head movements. Due to its wide availability, assured quality and low cost, we used a simple USB 3.0 cable. However, we did not implement a real USB device. To make sure that users cannot damage other hardware, we complied with the pin usage of the USB standard. That way, when the device is connected to a computer, for example, the sensors will run as usual even though the computer cannot communicate with them. A USB 3.0 cable includes six lines intended for data communication; we used the two pairs of separately shielded wires to maximize the reliability of the communication. The resulting pin assignment can be seen in Table 1. The relevant wires of the dismantled USB cable were soldered directly to the sensors; all sensors share the same wires for power supply and ground.

3.3.2 Glasses

To keep the sensors at a steady position in front of the user's eyes, we used a simple pair of glasses, which can be very cheap since they only serve as a holder for the sensors. The sensors are attached using a self-designed, 3D-printed plastic mount. First, we glued the sensors onto the mount using high-quality glue pads. Afterwards, we glued the mount directly onto the right lens of the glasses using the same sort of glue pads, cut to fit the bottom side of the mount.

3.3.3 Main device

The main device needs a microprocessor that serves as the “brain” of the device, orchestrating the input, calculations and resulting output while keeping track of sequential tasks. An Arduino Nano is used for this task. To relieve the Arduino, we add a high-quality text-to-speech controller that reliably takes over the audio output, so that after sending the output text to the controller the Arduino remains available for further input, such as controlling the LED. For this task, we used a Parallax Emic 2. We then add a socket to connect the glasses, a 3.5 mm AUX port for an external speaker together with an additional 2-pin header for optional use of an internal speaker, and a status LED to give the user feedback besides the audio output. Since the whole device has a very low power consumption, many modern power banks would eventually shut off because the current draw is too small for the power bank to detect a connected device. To prevent this behavior, we add an additional circuit that periodically draws a small amount of power on purpose. This is still a negligibly small amount of energy and can later be disabled in software when it is not necessary in a specific setup. After connecting the components, the resulting electric circuit can be seen in Figure 1 (a).
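The keep-alive mechanism itself is simple. Below is a minimal sketch of the idea, using the pin number and pulse length from the full listing at the bottom of this post; the full program schedules the pulse with the arduino-timer library every 5 seconds instead of blocking with delay().

const uint8_t keepAlivePin = 8;      // drives the small keep-alive load

void setup() {
  pinMode(keepAlivePin, OUTPUT);
}

void loop() {
  digitalWrite(keepAlivePin, HIGH);  // briefly switch the load on
  delay(300);                        // 300 ms pulse, as in the full program
  digitalWrite(keepAlivePin, LOW);   // and off again
  delay(4700);                       // repeat roughly every 5 seconds
}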

The circuit from Figure 1 (a) can then be used as the basis for a proper printed circuit board layout. The resulting PCB layout can be seen in Figure 1 (b). The corresponding Gerber files can be sent to a manufacturer like jlcpcb.com. The components then get soldered onto the board, and the main device is functionally ready to use.

To ensure portability of the device, a proper case is necessary. To keep the case as small as possible while providing the right connector cutouts and a speaker grill for the internal speaker, we created a 3D model of the case. We then printed the model on a 3D printer, relying on PLA as a suitable material due to its low cost and sufficient stability and longevity.

3.4 User Interface and Implementation

The following sections describe the logic of measuring the directions, the structure of the user interface and the implementation of this logic in Arduino code.

3.4.1 Logic of the eye measurement

The four sensors attached to the glasses measure the brightness of specific areas of the eye every 100 milliseconds. Each measurement is expressed as a number from 0 to 5000, representing the time (in microseconds) the sensor's output needs to decay after the surface has been lit by the sensor's infrared LED. The faster the decay, i.e., the lower the value, the brighter the surface, as brighter surfaces reflect more infrared light.

As seen in Figure 2, to check whether the eye is looking left, right, up or down, the brightness of the four spots must first be measured in a neutral eye position. To do this, the user must look straight ahead for 3 seconds when starting the device. These values serve as the baseline against which new measurements are compared. A direction is detected as soon as the value on the opposite side stays below the neutral value for 500 milliseconds. The reason for this is that when the eye looks to the right, for example, the left spot of the eye shows white. The value must differ from the baseline by a certain factor so that minimal movements are not recognised as a direction. This method gave the most stable results in our tests. After a direction is detected, direction detection pauses for 500 milliseconds to prevent double detections. The LED on the device, which flashes as soon as a direction has been detected, serves as an orientation aid. If two successive directions are measured within 5 seconds, we speak of a combination. We only allow combinations of two directions, which then lead to the output of a phrase.
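A minimal sketch of this calibrate-then-compare idea is shown below, using the same QTRSensors calls, sensor pins and 87% detection factor as the full listing at the bottom of this post; the 500 ms hold, pause and combination logic is omitted here.

#include <QTRSensors.h>

QTRSensors qtr;
uint16_t neutral[4];   // baseline captured while the user looks straight ahead
uint16_t values[4];    // current readings, 0-5000; lower = brighter

void setup() {
  Serial.begin(9600);
  qtr.setTypeRC();
  qtr.setSensorPins((const uint8_t[]) {2, 3, 4, 5}, 4);
  qtr.setTimeout(5000);
  delay(3000);          // user looks straight ahead for 3 seconds
  qtr.read(neutral);    // capture the neutral brightness
  for (uint8_t i = 0; i < 4; i++) {
    neutral[i] = uint32_t(neutral[i]) * 87 / 100;  // require a clear drop below neutral
  }
}

void loop() {
  qtr.read(values);
  for (uint8_t i = 0; i < 4; i++) {
    if (values[i] < neutral[i]) {
      Serial.println(i);  // the full program maps each sensor index to a direction
    }
  }
  delay(100);             // sample every 100 ms
}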

3.4.2 Profiles

A profile is the collection of all possible combinations of two of the 4 directions, which results in 16 combinations. With our solution, it is possible to use 8 profiles. We have prefabricated five of these profiles according to themes (Home, Caretaker, Doctor's Visit, Friends, Chess Game); the remaining three are left for the user to create.

All profiles can be completely modified via the customization tool on our website. As can be seen in the visualization of the “Caretaker” profile in Figure 3, we have also tried to separate the quadrants of the profiles thematically and logically in order to increase usability. In addition, the upper quadrant is the same in every profile: it contains necessary phrases that we consider useful in every profile, so users can express themselves without having to change the profile. The profiles can be selected via the menu.

If you close your eyes for 2 seconds, you switch back to the main menu, which can be seen in Figure 4. This duration can be customized in the settings of the customization tool. The menu itself is operated like the profiles, with combinations of eye movements. The volume can also be set here, and the device can be muted. To make operation easier for the user, we added additional acoustic and visual signals. For example, the LED always flashes when a direction has been detected. This gives the user good support, especially in the learning phase, in using the tool safely and efficiently. Furthermore, the selected menu items are announced acoustically in the menu. For example, when the user increases the volume, “Volume Up” is played.

3.4.3 Implementation

In the following, the key points and special aspects of the implementation in the Arduino code are explained.

Libraries

The Arduino library for the sensors is called QTRSensors. It enables easy communication between the Arduino and the infrared sensors and is necessary to set up and read the sensors. We use the SoftwareSerial library to communicate with the Emic. The Emic has its own library, but we do not use it because we would only use one of its functions (outputting a string); since that library requires a lot of memory, we decided to address the Emic manually via SoftwareSerial. Another important library for our device is arduino-timer. We use its timer functionality to loop through our process and coordinate the logic of the different queries using protothreading. avr/pgmspace is a library we use to store the variables containing the phrases in program memory to save Random Access Memory (RAM). This is important as the phrase strings are long and would otherwise exceed the RAM capacity of the Arduino.

Storing the phrases

The phrases we output with the Emic are stored in char variables which, as described, are saved not in RAM but in program memory. This is done via the PROGMEM keyword.

const char textdata_16[] PROGMEM = "I want to wash myself";

To be able to access the resulting 144 variables, we need an additional table which contains the pointers to the variables. This table is stored in the program memory as well.

const char *const data_table[] PROGMEM = {textdata_0, textdata_1, …
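Reading a phrase back out of program memory is then a two-step process, as implemented in the bufferTextdata() function of the full listing: pgm_read_word fetches the pointer stored in flash, and strcpy_P copies the flash-resident string into a RAM buffer.

#include <avr/pgmspace.h>

char textdataBuffer[100]; // must be large enough for the longest phrase

void bufferTextdata(uint8_t entrynumber) {
  strcpy_P(textdataBuffer, (char *)pgm_read_word(&(data_table[entrynumber])));
}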

Input Processing

To reliably detect user input, a robust input-processing routine is needed. As can be seen in Figure 5, the readSensors function is called every 100 ms. It triggers a measurement of the four sensors, each reporting a duration between 0 and 5000 µs, which can be interpreted as a measure of the brightness of the surface (in our case, the part of the eye and the surrounding area) in front of the sensor. Applying thresholds derived from an initial calibration during startup, we can evaluate the measurement and detect whether the user was looking up, down, left or right, closing their eyes, or doing none of these.

The detected direction is then used as input for the processing logic. The processing logic makes sure that the same direction has been detected for a defined timespan to avoid unintended input. After a first direction has been detected, it is rejected again if the user does not input a second direction within an appropriate time span. Once the logic receives a valid pair of directions or a valid detection of closed eyes, it triggers the corresponding method to perform the relevant command.

A diagram of the resulting logic can be seen in Figure 5.

Phrase Selection

We select the phrases stored in the table with the following logic. The combinations are numbered from 1 to 16; the starting point is Left + Left with 1, and the quadrants are numbered clockwise, so Down + Down is 16. To get the phrase for a combination in the corresponding profile, we multiply the number of the profile by 16 and then add the number of the corresponding combination. For the combination Up + Right in profile 3, the phrase is consequently picked at position 55 (3 × 16 + 7) of the table.
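In code, this lookup is a single index computation; note that the listing at the bottom of this post counts combination offsets from 0 rather than 1 (so Up + Right becomes offset 6) and entries 0 to 15 hold the menu phrases.

// Sketch of the index computation used in performCommand() in the full listing.
// activeProfile is 1..8, combinationOffset is 0..15.
uint8_t phraseIndex(uint8_t activeProfile, uint8_t combinationOffset) {
  return 16 * activeProfile + combinationOffset;
}
// Example: profile 3, Up + Right (offset 6) -> 16 * 3 + 6 = 54,
// i.e. position 55 when counting from 1 as in the text above.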

3.5 Homepage

The Speak4Me homepage (https://speak4me.github.io/speak4me/) offers a single point of contact for the entire solution. A step-by-step video playlist explains all relevant features to the customer. This includes a detailed construction blueprint and guides as well as product capabilities. Needed materials and recommended tools are also listed here.

The webpage is designed as multiple pages instead of one continuous blog-type layout to avoid sensory overload. As our product requires self-assembly by the user, it made sense to introduce new information only when necessary. Therefore, the entire setup process is presented in the form of time-coded Vimeo videos, which can be watched and replayed at the user's own pace.

When visiting the homepage, only our teaser is shown initially, while each additional video has its own subpage with additional information relevant to the current topic.

At the same time, the homepage provides the basic functionality of profile creation and modification. To make coding experience unnecessary for the user, the created configuration file includes not only the created profiles but also the entire needed code base for the Arduino. This makes the transfer a simple process that can be handled by anyone without in-depth knowledge.

Through the homepage, the user can customize phrases and settings via a single output file. Settings include timeouts, allowing faster or slower reaction by the system depending on need and preference. New users especially can test settings and find their own timeframes. When creating a profile, a PDF document can be generated showing the current phrase setup, so users can refer to the current configuration without needing a computer.

3.6 Tutorials

To further improve the instructions, several parts of the explanations are covered in videos. Visual and auditory information can be processed considerably faster by the brain (cf. Sibley 2012), and it is easier for the viewer to imitate the steps of the process. As a result, this approach seemed superior to textual instructions. The elements used to create the videos are screen recordings, filmed footage, pictures and voice-over.

As a general design standard, the videos were meant to be motivational. A specific part of this was communicating enough detail that people with less technical knowledge can also cope with all given challenges. Furthermore, a clear and minimalistic design approach was used to improve learning by preventing an overload of the sensory register. The videos were divided into seven parts.

4. Discussion

In closing we would like to share recommendations, limitations and possible future upgrades regarding the entire product.

4.1 Evaluation

The final prototype supports all features set out at the beginning. However, to optimize usability several factors should be considered while using the device.

With the current setup, peripheral vision may be reduced due to the presence of the sensor ring. While looking straight ahead the sensor ring is only slightly obstructive, but horizontal visibility might be limited by its presence. During assembly, the sensor ring should be placed in a central position to optimize functionality and minimize the loss of vision. Depending on the setup, the sensor ring might also be too close to the eye to be worn comfortably. Before printing the sensor ring, double-check how much space remains between the sensors and the eye; as each wearer is different, adjustments should be made accordingly. Due to the sensor design, placement is important and should be reconsidered should first trials not yield the expected results.

In its current design, the glasses require a direct cable connection to the device. The weight of the cable on the frame might move the glasses or make them more uncomfortable to wear than usual. To optimize readouts, the glasses should always be placed in the same position.

Our device includes a built-in speaker that has proven to provide good audio quality even over distance. Due to its size, the volume and range of the speaker can be limiting in some cases. To allow for flexibility, we have added an AUX jack to support external audio solutions should the onboard speaker not suffice.

Our solution is designed as a complete package, so no coding experience is needed to use and customize the device. However, the Arduino IDE is necessary to actually move the online-created config file onto the device; a simple Explorer-based copy is currently not possible. For installation and setup, please refer to the official Arduino homepage (see Arduino, 2020).

4.2 Theoretical and practical contribution

As our entire solution is designed around an Arduino as its foundation, we were able to develop a simple and inexpensive solution using standard technologies and frameworks. We have shown that an easy-to-build solution can provide considerable help to people in need. Our eye-to-speech setup, using sensors to track eye movement, can provide base-level communication with the environment at a very low cost. Speak4Me can provide value for people in need and could have applications in other fields we have not considered yet.

To provide other developers a basis for future changes, the entire solution and code base will be released as open source under the General Public License v3 (GPL) on GitHub (https://github.com/speak4me/speak4me).

4.3 Outlook and extensibility

Future iterations of Speak4Me could offer additional interfaces like WiFi/Bluetooth to enable open communication with other devices. A connected smartphone app could allow users to change and create profiles on the fly and transfer them directly to the device, without needing external devices like laptops or PCs. Those settings could also allow customization of the UI, such as specific movements controlling timeouts, resets or the mute functionality, based on user needs and abilities. Usability could further be improved by making Speak4Me attachable and removable to avoid permanent changes to glasses and frames. A simple clip design could hold the device in place while in use, while allowing it to be taken off on demand. This would also allow easy switching between glasses, such as sun- or reading glasses, without the need for time-consuming assembly. To optimize usability, Speak4Me could also be attached to different types of face ware in the future. Additional storage would allow for active profiles and combinations beyond the current 4x4x8 design.

Ideally, the number of active profiles would be in the hundreds to provide the widest possible range of communication. External storage like standard SD cards could increase the available storage, allowing even further customization and profiles to be stored permanently on the device. By using online APIs like Google Speak or Amazon Polly, future products might be able to type out words on demand instead of using predefined profiles and phrases.

In its current form the speech module only supports English and Spanish as output languages. Future iterations should offer support for at least the 10 most spoken languages to offer the best possible coverage.

Regarding distribution, Speak4Me could also be offered as a do-it-yourself package to further decrease the cost of the final product. Users could choose between ordering a completely built product or a cheaper package for self-assembly.

Code

Arduino Code
Required code for the Arduino. In the lines from 72 to 232 you can customize your expressions. If you would like to use a no-code solution to edit your expressions you can use the customization tool on your project website: https://speak4me.github.io/speak4me/profile-customization-tool.html
//sensor imports
#include <QTRSensors.h>

// For serial communication with the emic
#include <SoftwareSerial.h>

//To use timed actions -> protothreading
#include <arduino-timer.h>

// To save data to program space
#include <avr/pgmspace.h>

//Create timer to schedule the project parts
auto timer = timer_create_default();

//definitions
#define rxPin 6  // Connect SOUT pin of the Emic 2 module to the RX pin
#define txPin 7  // Connect SIN pin of the Emic 2 module to the TX pin
#define keepAlivePin 8
#define StatusLED 9


//CUSTOM_SETTINGS_DECLARATION_START
#define voiceStyle 1 // Male or female voice
#define initialVolume 1  // Integrate the updateVolume function; the customization tool provides values 1-5
#define closedDuration 2000 // How long to close the eyes to open the menu
#define keepAliveOpt true
#define statusLEDActive true // Learning LED which turns on when a direction is detected // Function to be implemented
#define dirDuration 500 // Duration for which a direction has to be measured to be recognized
#define dirPause 1000 // Cooldown after dir1 has been recognized
#define timeOutDuration 4000 // Timelimit after which direction 1 gets rejected
#define useAutoCalibration true
//CUSTOM_SETTINGS_DECLARATION_END


//variables: settings
bool muted = false;
uint8_t volume = initialVolume;
uint8_t activeProfile = 1;
bool inMenu = true;

// Margins for eye tracking detection
uint8_t factorL = 87;
uint8_t factorR = 87;
uint8_t factorUp = 90;
uint8_t factorDown = 83;

//variables: program logic
//Sensors

QTRSensors qtr;
uint16_t sensorValue[4];
uint16_t neutralSensorValue[4];

//Emic
SoftwareSerial emicSerial =  SoftwareSerial(rxPin, txPin);
bool emicReady;

//Input treatment
int8_t dir1 = -1; //0 1 2 3 for left Up right down; 4 for closed; -1 for null
int8_t dir2 = -1; //0 1 2 3 for left Up right down; 4 for closed; -1 for null
uint8_t status = 0; //0: Waiting for recognition; 1: timeout after dir1; 2: waiting for recognition 2
unsigned long signalTracker; //Variable to track for how long a certain signal has been detected
unsigned long timeoutTracker; //Variable to track the time when a timeout is reached



char textdataBuffer[100]; // To buffer PROGMEM readings - has to be large enough for the largest string it must hold

//Loading Text Data into Progmem
// Menu
const char textdata_0[] PROGMEM = "Leave Menu";
const char textdata_1[] PROGMEM = "Volume Up";
const char textdata_2[] PROGMEM = "Toggle Mute"; // Muted - Unmuted possible?
const char textdata_3[] PROGMEM = "Volume Down";
const char textdata_4[] PROGMEM = "No";
const char textdata_5[] PROGMEM = "Okay";
const char textdata_6[] PROGMEM = "Yes";
const char textdata_7[] PROGMEM = "Help";
const char textdata_8[] PROGMEM = "Profile 1";
const char textdata_9[] PROGMEM = "Profile 2";
const char textdata_10[] PROGMEM = "Profile 3";
const char textdata_11[] PROGMEM = "Profile 4";
const char textdata_12[] PROGMEM = "Profile 5";
const char textdata_13[] PROGMEM = "Profile 6";
const char textdata_14[] PROGMEM = "Profile 7";
const char textdata_15[] PROGMEM = "Profile 8";

// PLACEHOLDER Profile 1 - Home
//CUSTOM_EXPRESSION_DECLARATION_START
const char textdata_16[] PROGMEM = "I want to wash myself";
const char textdata_17[] PROGMEM = "I need to go to the toilet";
const char textdata_18[] PROGMEM = "I would like to brush my teeth";
const char textdata_19[] PROGMEM = "I would like to change my clothes";
const char textdata_20[] PROGMEM = "Okay";
const char textdata_21[] PROGMEM = "Yes";
const char textdata_22[] PROGMEM = "Help";
const char textdata_23[] PROGMEM = "No";
const char textdata_24[] PROGMEM = "I want time for myself";
const char textdata_25[] PROGMEM = "I want to sleep";
const char textdata_26[] PROGMEM = "I need a break";
const char textdata_27[] PROGMEM = "I would like to stop";
const char textdata_28[] PROGMEM = "It is too hot";
const char textdata_29[] PROGMEM = "I am hungry";
const char textdata_30[] PROGMEM = "I am thirsty";
const char textdata_31[] PROGMEM = "I want sweets";

// Profile 2 - Friends
const char textdata_32[] PROGMEM = "I dont like this";
const char textdata_33[] PROGMEM = "I am good, thanks";
const char textdata_34[] PROGMEM = "I am feeling not so good today";
const char textdata_35[] PROGMEM = "I am tired";
const char textdata_36[] PROGMEM = "Okay";
const char textdata_37[] PROGMEM = "Yes";
const char textdata_38[] PROGMEM = "Help";
const char textdata_39[] PROGMEM = "No";
const char textdata_40[] PROGMEM = "Do you want to hang out?";
const char textdata_41[] PROGMEM = "How are you?";
const char textdata_42[] PROGMEM = "What are your plans for the day??";
const char textdata_43[] PROGMEM = "Want to go for a walk?";
const char textdata_44[] PROGMEM = "Buz kz buz kz buz buz kschschsch";
const char textdata_45[] PROGMEM = "That is wonderful";
const char textdata_46[] PROGMEM = "eeeee";
const char textdata_47[] PROGMEM = "Haahaahaa huuuhuuu looooooooooooool";

// Profile 3 - Caretaker
const char textdata_48[] PROGMEM = "Tighten my shoe";
const char textdata_49[] PROGMEM = "I am freezing, I need warmer clothes";
const char textdata_50[] PROGMEM = "I am warm, I need less clothes";
const char textdata_51[] PROGMEM = "Change my clothes";
const char textdata_52[] PROGMEM = "Okay";
const char textdata_53[] PROGMEM = "Yes";
const char textdata_54[] PROGMEM = "Help";
const char textdata_55[] PROGMEM = "No";
const char textdata_56[] PROGMEM = "I want to lay down";
const char textdata_57[] PROGMEM = "Turn me";
const char textdata_58[] PROGMEM = "Scratch me";
const char textdata_59[] PROGMEM = "I want to sit";
const char textdata_60[] PROGMEM = "I need a massage";
const char textdata_61[] PROGMEM = "I need a medical treatment";
const char textdata_62[] PROGMEM = "I am in pain";
const char textdata_63[] PROGMEM = "Something is wrong";

// Profile 4 - Doctor visit
const char textdata_64[] PROGMEM = "I have a problem";
const char textdata_65[] PROGMEM = "I dont feel well today";
const char textdata_66[] PROGMEM = "I feel good today";
const char textdata_67[] PROGMEM = "I am in pain";
const char textdata_68[] PROGMEM = "Okay";
const char textdata_69[] PROGMEM = "Yes";
const char textdata_70[] PROGMEM = "Help";
const char textdata_71[] PROGMEM = "No";
const char textdata_72[] PROGMEM = "Ask me which part of the body is affected";
const char textdata_73[] PROGMEM = "Something is wrong";
const char textdata_74[] PROGMEM = "I need medication";
const char textdata_75[] PROGMEM = "I need treatment";
const char textdata_76[] PROGMEM = "Left";
const char textdata_77[] PROGMEM = "Higher";
const char textdata_78[] PROGMEM = "Right";
const char textdata_79[] PROGMEM = "Lower";

// Profile 5 - Chess
const char textdata_80[] PROGMEM = "8";
const char textdata_81[] PROGMEM = "5";
const char textdata_82[] PROGMEM = "6";
const char textdata_83[] PROGMEM = "7";
const char textdata_84[] PROGMEM = "4";
const char textdata_85[] PROGMEM = "1";
const char textdata_86[] PROGMEM = "2";
const char textdata_87[] PROGMEM = "3";
const char textdata_88[] PROGMEM = "D";
const char textdata_89[] PROGMEM = "A";
const char textdata_90[] PROGMEM = "B";
const char textdata_91[] PROGMEM = "C";
const char textdata_92[] PROGMEM = "H";
const char textdata_93[] PROGMEM = "E";
const char textdata_94[] PROGMEM = "F";
const char textdata_95[] PROGMEM = "G";

// profile 6 - to be customized
const char textdata_96[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_97[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_98[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_99[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_100[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_101[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_102[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_103[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_104[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_105[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_106[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_107[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_108[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_109[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_110[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_111[] PROGMEM = "Please use the customizer to create your own profile";

// profile 7 - to be customized
const char textdata_112[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_113[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_114[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_115[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_116[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_117[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_118[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_119[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_120[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_121[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_122[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_123[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_124[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_125[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_126[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_127[] PROGMEM = "Please use the customizer to create your own profile";

// profile 8 - to be customized
const char textdata_128[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_129[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_130[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_131[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_132[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_133[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_134[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_135[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_136[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_137[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_138[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_139[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_140[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_141[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_142[] PROGMEM = "Please use the customizer to create your own profile";
const char textdata_143[] PROGMEM = "Please use the customizer to create your own profile";
//CUSTOM_EXPRESSION_DECLARATION_END


const char *const data_table[] PROGMEM = {textdata_0, textdata_1, textdata_2, textdata_3, textdata_4, textdata_5, textdata_6, textdata_7, textdata_8, textdata_9, textdata_10,
                                          textdata_11, textdata_12, textdata_13, textdata_14, textdata_15, textdata_16, textdata_17, textdata_18, textdata_19, textdata_20,
                                          textdata_21, textdata_22, textdata_23, textdata_24, textdata_25, textdata_26, textdata_27, textdata_28, textdata_29, textdata_30,
                                          textdata_31, textdata_32, textdata_33, textdata_34, textdata_35, textdata_36, textdata_37, textdata_38, textdata_39, textdata_40,
                                          textdata_41, textdata_42, textdata_43, textdata_44, textdata_45, textdata_46, textdata_47, textdata_48, textdata_49, textdata_50,
                                          textdata_51, textdata_52, textdata_53, textdata_54, textdata_55, textdata_56, textdata_57, textdata_58, textdata_59, textdata_60,
                                          textdata_61, textdata_62, textdata_63, textdata_64, textdata_65, textdata_66, textdata_67, textdata_68, textdata_69, textdata_70,
                                          textdata_71, textdata_72, textdata_73, textdata_74, textdata_75, textdata_76, textdata_77, textdata_78, textdata_79, textdata_80,
                                          textdata_81, textdata_82, textdata_83, textdata_84, textdata_85, textdata_86, textdata_87, textdata_88, textdata_89, textdata_90,
                                          textdata_91, textdata_92, textdata_93, textdata_94, textdata_95, textdata_96, textdata_97, textdata_98, textdata_99, textdata_100,
                                          textdata_101, textdata_102, textdata_103, textdata_104, textdata_105, textdata_106, textdata_107, textdata_108, textdata_109, textdata_110,
                                          textdata_111, textdata_112, textdata_113, textdata_114, textdata_115, textdata_116, textdata_117, textdata_118, textdata_119, textdata_120,
                                          textdata_121, textdata_122, textdata_123, textdata_124, textdata_125, textdata_126, textdata_127, textdata_128, textdata_129, textdata_130,
                                          textdata_131, textdata_132, textdata_133, textdata_134, textdata_135, textdata_136, textdata_137, textdata_138, textdata_139, textdata_140,
                                          textdata_141, textdata_142, textdata_143
                                         }; //has to include all 144 strings in the end


void setup()
{
  pinMode(keepAlivePin, OUTPUT); // Output driving the keep-alive circuit
  pinMode(StatusLED, OUTPUT);    // Output driving the status LED

  // set the data rate for the hardware serial port
  Serial.begin(9600);

  // configure the sensors
  qtr.setTypeRC();
  qtr.setSensorPins((const uint8_t[]) {
    2, 3, 4, 5
  }, 4);
  
  qtr.setTimeout(5000);
  
  // set the data rate for the SoftwareSerial port
  pinMode(rxPin, INPUT);
  pinMode(txPin, OUTPUT);
  emicSerial.begin(9600);
  emicSerial.print('\n');             // Send a CR in case the system is already up
  while (emicSerial.read() != ':');   // When the Emic 2 has initialized and is ready, it will send a single ':' character, so wait here until we receive it
  emicSerial.flush();                 // Flush the receive buffer

  // set volume of emic
  emicSerial.print('V');
  emicSerial.print(String(volume));
  emicSerial.print('\n');

  // set voicestyle of emic
  emicSerial.print('N');
  switch (voiceStyle) {
    case 1: emicSerial.print('0'); break;
    case 2: emicSerial.print('8'); break;
  }
  emicSerial.print('\n');

  emicReady = true;

  if (useAutoCalibration) {
    autocalibrateSensors();
  } else {
    calibrateSensors();
    setupTimers();
  }
}

void setupTimers() {
  // Setting up the timers
  timer.every(100, readSensors);
  timer.every(100, updateEmicReady);
  timer.every(3000, printDebugInfo);
  if (keepAliveOpt) {
    timer.every(5000, keepAlive);
  }
}

void calibrateSensors() {
  qtr.read(neutralSensorValue);
  neutralSensorValue[0] = uint32_t(neutralSensorValue[0]) * factorL / 100;
  neutralSensorValue[1] = uint32_t(neutralSensorValue[1]) * factorUp / 100;
  neutralSensorValue[2] = uint32_t(neutralSensorValue[2]) * factorR / 100; // use the right-side factor here (factorL and factorR currently share the value 87)
  neutralSensorValue[3] = uint32_t(neutralSensorValue[3]) * factorDown / 100;
  Serial.println(neutralSensorValue[0]);
  Serial.println(neutralSensorValue[1]);
  Serial.println(neutralSensorValue[2]);
  Serial.println(neutralSensorValue[3]);
}

void autocalibrateSensors() {
  qtr.read(neutralSensorValue);
  
  letEmicSpeak("Left");
  while (emicSerial.read() != ':');
  emicReady = true;
  delay(2000);
  autocalibrateSensor(2);
  
  letEmicSpeak("Up");
  while (emicSerial.read() != ':');
  emicReady = true;
  delay(2000);
  autocalibrateSensor(3);
  
  letEmicSpeak("Right");
  while (emicSerial.read() != ':');
  emicReady = true;
  delay(2000);
  autocalibrateSensor(0);
  
  letEmicSpeak("Down");
  while (emicSerial.read() != ':');
  emicReady = true;
  delay(2000);
  autocalibrateSensor(1);
  
  letEmicSpeak("Calibration completed");
  while (emicSerial.read() != ':');
  emicReady = true;
  setupTimers();
}

void autocalibrateSensor(uint8_t sensornumber) {
  qtr.read(sensorValue);
  neutralSensorValue[sensornumber] = (4 * sensorValue[sensornumber] + neutralSensorValue[sensornumber]) / 5;
}

void setFactor(uint8_t sensornumber) {
  // not implemented yet
}
bool updateEmicReady() {
  if (emicSerial.read() == ':') {
    emicReady = true;
  }
  return true; // keep this timer task scheduled
}

void letEmicSpeak(char message[]) {
  if (emicReady) { // Checked before every speech output so that a new message is only sent to the Emic once it is ready
    emicReady = false;
    emicSerial.print('S');
    emicSerial.print(message);  // Send the desired string to convert to speech
    emicSerial.print('\n');
  }
}

void toggleMute() {
  muted = !muted;
}

void bufferTextdata(uint8_t entrynumber) {
  strcpy_P(textdataBuffer, (char *)pgm_read_word(&(data_table[entrynumber])));
}

// Gets the detected directions as an input and performs the suitable actions depending on active profile, muted, etc.
void performCommand(int8_t dir1, int8_t dir2) {

  int8_t combinedDirs = dir1 * 10 + dir2; //combine two dirs into one variable

  if (muted) {
    if (combinedDirs == 2) {
      toggleMute();
      letEmicSpeak("Unmuted");
    }
  } else {

    if (inMenu) { //menu commands
      switch (combinedDirs) {
        
        case 0: // leave menu
          inMenu = false;
          letEmicSpeak("Closing Menu");
          break;

        case 1: // Volume Up
          raiseVolume();
          break;

        case 2: // Toggle Mute
          toggleMute();
          letEmicSpeak("Muted");
          break;

        case 3: // Volume Down
          lowerVolume();
          break;

        case 10:
          bufferTextdata(4);
          letEmicSpeak(textdataBuffer);
          break;

        case 11:
          bufferTextdata(5);
          letEmicSpeak(textdataBuffer);
          break;

        case 12:
          bufferTextdata(6);
          letEmicSpeak(textdataBuffer);
          break;

        case 13:
          bufferTextdata(7);
          letEmicSpeak(textdataBuffer);
          break;

        case 20:
          changeProfile(4);
          break;

        case 21:
          changeProfile(1);
          break;

        case 22:
          changeProfile(2);
          break;

        case 23:
          changeProfile(3);
          break;

        case 30: // down
          changeProfile(8);
          break;

        case 31:
          changeProfile(5);
          break;

        case 32:
          changeProfile(6);
          break;

        case 33:
          changeProfile(7);
          break;
      }
      
    } else { //profile commands
      switch (combinedDirs) {
        
        case 0:
          bufferTextdata(16 * activeProfile);
          letEmicSpeak(textdataBuffer);
          break;

        case 1:
          bufferTextdata(16 * activeProfile + 1);
          letEmicSpeak(textdataBuffer);
          break;

        case 2:
          bufferTextdata(16 * activeProfile + 2);
          letEmicSpeak(textdataBuffer);
          break;

        case 3:
          bufferTextdata(16 * activeProfile + 3);
          letEmicSpeak(textdataBuffer);
          break;

        case 10:
          bufferTextdata(16 * activeProfile + 4);
          letEmicSpeak(textdataBuffer);
          break;

        case 11:
          bufferTextdata(16 * activeProfile + 5);
          letEmicSpeak(textdataBuffer);
          break;

        case 12:
          bufferTextdata(16 * activeProfile + 6);
          letEmicSpeak(textdataBuffer);
          break;

        case 13:
          bufferTextdata(16 * activeProfile + 7);
          letEmicSpeak(textdataBuffer);
          break;

        case 20:
          bufferTextdata(16 * activeProfile + 8);
          letEmicSpeak(textdataBuffer);
          break;

        case 21:
          bufferTextdata(16 * activeProfile + 9);
          letEmicSpeak(textdataBuffer);
          break;

        case 22:
          bufferTextdata(16 * activeProfile + 10);
          letEmicSpeak(textdataBuffer);
          break;

        case 23:
          bufferTextdata(16 * activeProfile + 11);
          letEmicSpeak(textdataBuffer);
          break;

        case 30:
          bufferTextdata(16 * activeProfile + 12);
          letEmicSpeak(textdataBuffer);
          break;

        case 31:
          bufferTextdata(16 * activeProfile + 13);
          letEmicSpeak(textdataBuffer);
          break;

        case 32:
          bufferTextdata(16 * activeProfile + 14);
          letEmicSpeak(textdataBuffer);
          break;

        case 33:
          bufferTextdata(16 * activeProfile + 15);
          letEmicSpeak(textdataBuffer);
          break;

        case 40:
          inMenu = true;
          letEmicSpeak("Menu");
          break;
      }
    }
  }
}

// processes a recognized (or unrecognized) eye direction and fires the according events depending on status, signalTracker and timeoutTracker
void processDirection(int8_t recognizedDir) {
  if (status == 0) {
    
    if (recognizedDir == -1) {
      dir1 = -1;
      
    } else if (recognizedDir != dir1) {
      dir1 = recognizedDir;
      signalTracker = millis();
      
    } else if (dir1 == 4) {
      if (millis() -  signalTracker > closedDuration) {
        dir1 = -1;
        performCommand(4, 0);
      }
    }
    
    else if (millis() -  signalTracker > dirDuration) {
      status = 1;
      timeoutTracker = millis();
      if (!muted) {
        letStatusLEDBlink();
      }
    }
    
  } else if (status == 1) {
    
    if (millis() - timeoutTracker > dirPause) {
      status = 2;
      timeoutTracker = millis();
      processDirection(recognizedDir);
    }
    
  } else if (status == 2) {

    if (millis() - timeoutTracker > timeOutDuration) {
      status = 0;
      dir1 = -1;
      dir2 = -1;
      processDirection(recognizedDir);
      
    } else if (recognizedDir == -1) {
      dir2 = -1;
      
    } else if (recognizedDir != dir2) {
      dir2 = recognizedDir;
      signalTracker = millis();
      
    } else if (millis() -  signalTracker > dirDuration) {
      performCommand(dir1, dir2);
      status = 0;
      dir1 = -1;
      dir2 = -1;
      timeoutTracker = millis();
    }
  }
}

// reads a set of sensor values from the hardware and estimates in which direction the user was looking
bool readSensors() {
  qtr.read(sensorValue);

  if (sensorValue[0] < neutralSensorValue[0] //if eyes closed
      && sensorValue[2] < neutralSensorValue[2]
      && sensorValue[1] < neutralSensorValue[1]
      && sensorValue[3] < neutralSensorValue[3]) {
    Serial.println("ReadSensors: Closed eyes");
    processDirection(4);
  } else if (sensorValue[3] < neutralSensorValue[3]) { //if looking up
    Serial.println("ReadSensors: U ");
    processDirection(1);
  } else if (sensorValue[1] < neutralSensorValue[1]) { //if looking down
    Serial.println("ReadSensors: D ");
    processDirection(3);
  } else if (sensorValue[0] < neutralSensorValue[0]) { //if looking right
    Serial.println("ReadSensors: R ");
    processDirection(2);
  } else if (sensorValue[2] < neutralSensorValue[2]) { //if looking left
    Serial.println("ReadSensors: L ");
    processDirection(0);
  } else {
    Serial.println("ReadSensors: X ");
    processDirection(-1);
  }

  return true;
}

void changeProfile(uint8_t profilenumber) {
  activeProfile = profilenumber;
  inMenu = false;
  bufferTextdata(7 + profilenumber);
  letEmicSpeak(textdataBuffer);
}

void raiseVolume() {
  if (volume < 5) {
    updateVolume(volume + 1);
    bufferTextdata(1);
    timer.in(150, [] { letEmicSpeak(textdataBuffer); });
  } else {
    letEmicSpeak("Maximal volume reached");
  }
}

void lowerVolume() {
  if (volume > 1) {
    updateVolume(volume - 1);
    bufferTextdata(3);
    timer.in(150, [] { letEmicSpeak(textdataBuffer); });
  } else {
    letEmicSpeak("Minimal volume reached");
  }
}

void updateVolume(int8_t newVolume) {
  if (emicReady) { // Should be checked before every Emic output to be sure it is ready to receive commands
    emicReady = false;
    volume = newVolume;
    emicSerial.print('V');
    switch (volume) {
      case 1: emicSerial.print("-40"); break;
      case 2: emicSerial.print("-20"); break;
      case 3: emicSerial.print("0"); break;
      case 4: emicSerial.print("10"); break;
      case 5: emicSerial.print("18"); break;
    }
    emicSerial.print('\n');
  }
}

void letStatusLEDBlink() {
  if (statusLEDActive) {
    digitalWrite(StatusLED, HIGH);
    timer.in(200, [] { digitalWrite(StatusLED, LOW); });
  }
}

bool keepAlive() { //uses the keep alive wiring to consume some power so that powerbanks do not turn off
  digitalWrite(keepAlivePin, HIGH);
  timer.in(300, [] { digitalWrite(keepAlivePin, LOW); });
  return true;
}

bool printDebugInfo() {
  Serial.print(sensorValue[0]);
  Serial.print('\t');
  Serial.print(sensorValue[1]);
  Serial.print('\t');
  Serial.print(sensorValue[2]);
  Serial.print('\t');
  Serial.print(sensorValue[3]);
  Serial.print('\t');
  Serial.println();

  return true;
}

void loop()
{
  timer.tick();
}
Speak4Me GitHub Repository
The repository contains the full source of the project including the webpage with the customization tool.

Custom parts and enclosures

3D printing files .stl for the case and retainer of sensors
Includes required files to print the case and the retainer of the sensors.
Circuit Board gerber and drill files
Includes all files that are required to order a professional circuit board for the main device.

Schematics

Architecture overview
