Voice Assistant

Project Overview

A simple voice assistant built with the Xmini-C3 board, featuring a microphone, speaker, and OLED display. The device uses on-device wake word detection and connects to Home Assistant Cloud for voice processing, enabling hands-free control of smart home devices.

Features

🎤 Built-in microphone on the I2S bus (via the ES8311 audio codec) for voice commands
🔊 I2S audio speaker for voice responses and announcements
📺 OLED display (SSD1306) showing assistant status icons
🌈 RGB LED status indicator with various effects (listening, processing, error states)
🎯 On-device wake word detection (supports “Okay Nabu”, “Hey Mycroft”, “Hey Jarvis”)
⏲️ Timer support with audio notifications
🔘 Physical button for stopping the alarm timer and factory reset
🔌 Powered via USB-C

Progress

✅ Set up on-device wake word detection
✅ Configure Home Assistant Cloud voice pipeline
✅ Test voice commands (lights, timers)
Future improvements (see below)

Future Improvement Ideas

Add more display pages showing additional information
Implement custom LED effects for different states
Add a local voice processing option (a local pipeline instead of the cloud)
Add volume control adjustments
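
As an illustration of the "more display pages" idea, the main configuration already defines a va_listening image that no page uses. The fragment here is a minimal sketch of how a second page could be wired in, assuming the ids from the main configuration file (my_display, va_idle, va_listening). The page name listening_page is hypothetical, and the fragment has not been tested on the board. It is meant to be merged by hand into the existing display and voice_assistant blocks, not pasted as-is.

```yaml
# Sketch only: merge into the existing display: and voice_assistant: blocks.
display:
  - platform: ssd1306_i2c
    id: my_display
    model: "SSD1306 128x64"
    address: 0x3C
    update_interval: never
    pages:
      - id: idle_page
        lambda: |-
          it.fill(COLOR_OFF);
          it.image((it.get_width() / 2), (it.get_height() / 2) + 8, id(va_idle), ImageAlign::CENTER);
      # Hypothetical extra page; "listening_page" is a name chosen for this example
      - id: listening_page
        lambda: |-
          it.fill(COLOR_OFF);
          it.image((it.get_width() / 2), (it.get_height() / 2) + 8, id(va_listening), ImageAlign::CENTER);

voice_assistant:
  on_listening:
    # Keep the existing light.turn_on action and append these two
    - display.page.show: listening_page
    - component.update: my_display  # redraw manually because update_interval is "never"
  on_end:
    # Append after the existing on_end actions
    - display.page.show: idle_page
    - component.update: my_display
```

Because the display uses update_interval: never, each page change is followed by component.update so that the new page is actually drawn.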

Reusability Note

This project uses the Xmini-C3 board, which has built-in I2S audio components. If you use a different ESP32 board, you’ll need to add external audio hardware and adjust the pin substitutions to match. Otherwise the configuration needs minimal customization: mostly updating the WiFi credentials and API key in your secrets file.

What You’ll Need

Hardware

1x Xmini-C3 - ESP32-C3 board with built-in I2S audio (ES8311 DAC, microphone, speaker)
1x USB-C cable (data-capable, for programming)
1x Power supply (USB charger, 5V/1A minimum)

Software

ESPHome installed
Home Assistant with Cloud subscription (for voice processing)
Home Assistant Voice Assistant configured

Setup Instructions

Configure secrets - Set your WiFi credentials, API encryption key, and OTA password in your secrets file
Flash the device - Use the provided YAML configuration to flash your Xmini-C3
Add to Home Assistant - The device should be auto-discovered by the ESPHome integration
Configure Voice Pipeline - In Home Assistant, set up your preferred voice assistant pipeline
Expose Entities to Voice Assistant - Select the entities you want the voice assistant to control. The fewer entities you expose, the faster the assistant responds. Consider adding aliases for easier control
Test wake word - Say “Okay Nabu” (or other configured wake words) to trigger the assistant
Try commands - Test with simple commands like “Turn on the lights” or “Set a timer”
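
The secrets step can be sketched as a secrets.yaml file placed next to the device YAML. The four key names are the ones the main configuration references with !secret. Every value is a placeholder, not a working credential. The API encryption key is expected to be a base64-encoded 32-byte value.

```yaml
# secrets.yaml (placeholder values only, replace each one with your own)
wifi_ssid: "YourNetworkName"
wifi_password: "YourWiFiPassword"

# Base64-encoded 32-byte key used by api: encryption:
mini_voice_assistant_api: "REPLACE_WITH_BASE64_32_BYTE_KEY"

# Password used by ota: for over-the-air updates
mini_voice_assistant_ota: "REPLACE_WITH_OTA_PASSWORD"
```

Keeping these values out of the device YAML makes it easier to share the configuration without leaking credentials.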

How It Works

Wake Word Detection: Runs locally on the device using microWakeWord models (configurable)
Voice Processing: Audio is sent to Home Assistant Cloud for speech-to-text and intent recognition
Feedback: The RGB LED and OLED display show the current state (idle, listening, processing, speaking)
Button Control: Press the boot button to silence a ringing timer, or hold it for 10 seconds to trigger a factory reset
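
Since the wake word models are configurable, the model list can be trimmed to the phrases you actually use. The fragment here is a sketch that assumes only “Okay Nabu” is wanted. Everything else mirrors the micro_wake_word block of the main configuration, and loading fewer models may reduce resource use on the ESP32-C3, although that has not been measured here.

```yaml
# Sketch: on-device wake word detection with a single model
micro_wake_word:
  microphone: external_mic
  on_wake_word_detected:
    # Hand the detected phrase over to the voice assistant pipeline
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
  vad:
  models:
    - model: okay_nabu
```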

Status

This project is complete and working. The voice assistant successfully:

Detects wake words locally on the device
Responds to voice commands via Home Assistant Cloud
Controls lights and other smart home devices
Handles timers with audio notifications
Provides visual feedback via the RGB LED and OLED display

There’s significant room for improvement in terms of customization, additional features, and local processing options.

Acknowledgments

The configuration is based on the M5Stack Atom Echo voice assistant from the ESPHome wake word voice assistants repository. It is mostly a copy of that configuration, adapted to the Xmini-C3 hardware.

Main Configuration File

If you’re using ESPHome Device Builder, create a new device. If you’re using the command line, create your YAML file (e.g. xmini-voice-assistant.yaml). Then use the following file as a guide (details on how to customize it are below).

Download the full configuration: xmini-voice-assistant.yaml

esphome:
  name: my-mini-voice-assistant
  friendly_name: My Mini Voice Assistant

esp32:
  variant: esp32c3
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESPTOOLPY_FLASHMODE_DIO: y
  flash_size: 16MB

# Enable Home Assistant API
api:
  encryption:
    key: !secret mini_voice_assistant_api

ota:
  - platform: esphome
    password: !secret mini_voice_assistant_ota

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

logger:

substitutions:
  boot_btn_pin: GPIO09
  i2c_sda_pin: GPIO03
  i2c_scl_pin: GPIO04
  neopixel_pin: GPIO02
  i2s_ws_pin: GPIO06
  i2s_bck_pin: GPIO08
  i2s_mck_pin: GPIO10
  i2s_do_pin: GPIO05
  i2s_di_pin: GPIO07
  mute_pin: GPIO11

i2c:
  sda: ${i2c_sda_pin}
  scl: ${i2c_scl_pin}

output:
  - platform: gpio
    pin: ${mute_pin}
    id: mute_control
    inverted: true

audio_dac:
  - platform: es8311
    id: my_dac
    use_microphone: false
    bits_per_sample: 16bit
    #sample_rate: 48000
    sample_rate: 16000
    address: 0x18

button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset

i2s_audio:
  - id: i2s_audio_bus
    i2s_lrclk_pin: ${i2s_ws_pin}
    i2s_bclk_pin: ${i2s_bck_pin}
    i2s_mclk_pin: ${i2s_mck_pin}

microphone:
  - platform: i2s_audio
    id: external_mic
    adc_type: external
    i2s_din_pin: ${i2s_di_pin}


#https://esphome.io/components/speaker/i2s_audio/
speaker:
  - platform: i2s_audio
    id: my_speaker
    dac_type: external
    i2s_dout_pin: ${i2s_do_pin}
    sample_rate: 16000
    channel: mono
    bits_per_channel: 16bit
    buffer_duration: 100ms

media_player:
  - platform: speaker
    name: None
    id: my_media_player
    announcement_pipeline:
      speaker: my_speaker
      format: WAV
    codec_support_enabled: false
    buffer_size: 6000
    volume_min: 0.4
    files:
      - id: timer_finished_wave_file
        file: https://github.com/esphome/wake-word-voice-assistants/raw/main/sounds/timer_finished.wav
    on_announcement:
      - if:
          condition:
            - microphone.is_capturing:
          then:
            - script.execute: stop_wake_word
      - light.turn_on:
          id: my_indicator
          blue: 100%
          red: 0%
          green: 0%
          brightness: 100%
          effect: none
    on_idle:
      - script.execute: start_wake_word
      - script.execute: reset_led

voice_assistant:
  id: va
  micro_wake_word:
  microphone:
    microphone: external_mic
    channels: 0
    gain_factor: 4
  media_player: my_media_player
  noise_suppression_level: 2
  auto_gain: 31dBFS
  on_listening:
    - light.turn_on:
        id: my_indicator
        blue: 100%
        red: 0%
        green: 0%
        effect: "Slow Pulse"
  on_stt_vad_end:
    - light.turn_on:
        id: my_indicator
        blue: 100%
        red: 0%
        green: 0%
        effect: "Fast Pulse"
  on_tts_start:
    - light.turn_on:
        id: my_indicator
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: none
  on_end:
    # Handle the "nevermind" case where there is no announcement
    - wait_until:
        condition:
          - media_player.is_announcing:
        timeout: 0.5s
    # Restart only mWW if enabled; streaming wake words automatically restart
    - if:
        condition:
          - lambda: |-
              return id(wake_word_engine_location).current_option() == "On device";
        then:
          - wait_until:
              - and:
                  - not:
                      voice_assistant.is_running:
                  - not:
                      speaker.is_playing:
          - lambda: id(va).set_use_wake_word(false);
          - micro_wake_word.start:
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: my_indicator
        red: 100%
        green: 0%
        blue: 0%
        brightness: 100%
        effect: none
    - delay: 2s
    - script.execute: reset_led
  on_client_connected:
    - delay: 2s  # Give the api server time to settle
    - script.execute: start_wake_word
  on_client_disconnected:
    - script.execute: stop_wake_word
  on_timer_finished:
    - script.execute: stop_wake_word
    - wait_until:
        not:
          microphone.is_capturing:
    - switch.turn_on: timer_ringing
    - light.turn_on:
        id: my_indicator
        red: 0%
        green: 100%
        blue: 0%
        brightness: 100%
        effect: "Fast Pulse"
    - wait_until:
        - switch.is_off: timer_ringing
    - light.turn_off: my_indicator
    - switch.turn_off: timer_ringing

binary_sensor:
  - platform: gpio
    pin:
      number: ${boot_btn_pin}
      inverted: true
      mode:
        input: true
        pullup: true
    name: Button
    id: boot_btn
    disabled_by_default: true
    entity_category: diagnostic
    on_multi_click:
      - timing:
          - ON for at least 50ms
          - OFF for at least 50ms
        then:
          - if:
              condition:
                switch.is_on: timer_ringing
              then:
                - switch.turn_off: timer_ringing
              else:
                - script.execute: start_wake_word
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn

light:
  - platform: esp32_rmt_led_strip
    id: my_indicator
    name: None
    disabled_by_default: true
    entity_category: config
    pin: ${neopixel_pin}
    default_transition_length: 0s
    chipset: ws2812
    num_leds: 1
    rgb_order: GRB
    restore_mode: ALWAYS_OFF
    effects:
      - pulse:
          name: "Slow Pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "Fast Pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%

script:
  - id: reset_led
    then:
      - if:
          condition:
            - lambda: |-
                return id(wake_word_engine_location).current_option() == "On device";
            - switch.is_on: use_listen_light
          then:
            - light.turn_on:
                id: my_indicator
                red: 100%
                green: 89%
                blue: 71%
                brightness: 60%
                effect: none
          else:
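            - if:
                condition:
                  # Wake word runs in Home Assistant and the listen light is enabled.
                  # (Repeating the "On device" check here would make this branch unreachable.)
                  - lambda: |-
                      return id(wake_word_engine_location).current_option() == "In Home Assistant";
                  - switch.is_on: use_listen_light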
                then:
                  - light.turn_on:
                      id: my_indicator
                      red: 0%
                      green: 100%
                      blue: 100%
                      brightness: 60%
                      effect: none
                else:
                  - light.turn_off: my_indicator
  - id: start_wake_word
    then:
      - if:
          condition:
            and:
              - not:
                  - voice_assistant.is_running:
              - lambda: |-
                  return id(wake_word_engine_location).current_option() == "On device";
          then:
            - lambda: id(va).set_use_wake_word(false);
            - micro_wake_word.start:
      - if:
          condition:
            and:
              - not:
                  - voice_assistant.is_running:
              - lambda: |-
                  return id(wake_word_engine_location).current_option() == "In Home Assistant";
          then:
            - lambda: id(va).set_use_wake_word(true);
            - voice_assistant.start_continuous:
  - id: stop_wake_word
    then:
      - if:
          condition:
            lambda: |-
              return id(wake_word_engine_location).current_option() == "In Home Assistant";
          then:
            - lambda: id(va).set_use_wake_word(false);
            - voice_assistant.stop:
      - if:
          condition:
            lambda: |-
              return id(wake_word_engine_location).current_option() == "On device";
          then:
            - micro_wake_word.stop:

switch:
  - platform: template
    name: Use listen light
    id: use_listen_light
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - script.execute: reset_led
    on_turn_off:
      - script.execute: reset_led
  - platform: template
    id: timer_ringing
    optimistic: true
    restore_mode: ALWAYS_OFF
    on_turn_off:
      # Turn off the repeat mode and disable the pause between playlist items
      - lambda: |-
          id(my_media_player)
            ->make_call()
            .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_REPEAT_OFF)
            .set_announcement(true)
            .perform();
          id(my_media_player)->set_playlist_delay_ms(speaker::AudioPipelineType::ANNOUNCEMENT, 0);
      # Stop playing the alarm
      - media_player.stop:
          announcement: true
    on_turn_on:
      # Turn on the repeat mode and pause for 1000 ms between playlist items/repeats
      - lambda: |-
          id(my_media_player)
            ->make_call()
            .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_REPEAT_ONE)
            .set_announcement(true)
            .perform();
          id(my_media_player)->set_playlist_delay_ms(speaker::AudioPipelineType::ANNOUNCEMENT, 1000);
      - media_player.speaker.play_on_device_media_file:
          media_file: timer_finished_wave_file
          announcement: true
      - delay: 15min
      - switch.turn_off: timer_ringing

select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - if:
          condition:
            lambda: return x == "In Home Assistant";
          then:
            - micro_wake_word.stop:
            - delay: 500ms
            - lambda: id(va).set_use_wake_word(true);
            - voice_assistant.start_continuous:
      - if:
          condition:
            lambda: return x == "On device";
          then:
            - lambda: id(va).set_use_wake_word(false);
            - voice_assistant.stop:
            - delay: 500ms
            - micro_wake_word.start:

micro_wake_word:
  microphone: external_mic
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
  vad:
  models:
    - model: okay_nabu
    - model: hey_mycroft
    - model: hey_jarvis

image:
  - file: mdi:robot
    id: va_listening
    type: binary
    resize: 48x48
  - file: mdi:robot-happy
    id: va_idle
    resize: 48x48
    type: binary

display:
  - platform: ssd1306_i2c
    id: my_display
    model: "SSD1306 128x64"
    address: 0x3C
    update_interval: never
    pages:
      - id: idle_page
        lambda: |-
          it.fill(COLOR_OFF);
          it.image((it.get_width() / 2), (it.get_height() / 2) + 8, id(va_idle), ImageAlign::CENTER);

My ESPHome Workshop

Required Devices

XiaGe Xmini-C3 AI Voice Development Board

Native API

Image

I2C Bus

Over-The-Air Updates (OTA)

GPIO Output

Script

WiFi

Template Switch