How to Build a Voice Assistant App with Python and a Speech API

We are all familiar with Google Assistant, Siri, and Alexa. The good news is that you can build something comparable without a large corporation behind you. Using Python and a few libraries, you can build your own small voice assistant that listens, understands, and responds. It can tell you the time, look up information on Wikipedia, tell jokes, and even open YouTube, but don't expect it to cook dinner for you. The best part? You'll understand how it works.

You should take Uncodemy's AI using Python course if you become interested in this topic and want to learn more about AI projects. If you're serious about going from modest projects to full-on AI programming, this is the best fit.

What we're building

This is the fundamental loop:

  • The assistant hears what you say.

  • That speech is converted to text.

  • It works out what you want.

  • It responds to you aloud.

That's all. Basic cycle. It reacts when you speak.
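To make the cycle concrete before any audio is involved, here is a text-only sketch of the four steps. Everything in it is a stand-in: the function names hear, to_text, decide, and respond are made up for illustration, and the microphone and speaker are replaced with plain strings.

```python
def hear():
    """Stand-in for the microphone: returns a fixed phrase instead of audio."""
    return "What TIME is it"


def to_text(heard):
    """Stand-in for speech-to-text: just normalizes the phrase."""
    return heard.lower().strip()


def decide(text):
    """Pick an intent label based on keywords in the text."""
    if "time" in text:
        return "tell_time"
    if "joke" in text:
        return "tell_joke"
    return "unknown"


def respond(intent):
    """Stand-in for text-to-speech: returns the reply as a string."""
    replies = {
        "tell_time": "I would read out the clock here.",
        "tell_joke": "I would tell a joke here.",
    }
    return replies.get(intent, "I don't know how to do that yet.")


def run_once():
    """One pass through the hear -> text -> decide -> respond cycle."""
    return respond(decide(to_text(hear())))


print(run_once())
```

The real assistant built in the steps below follows the same shape, with a microphone in place of hear and a speech engine in place of respond.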

Let's keep it small: start with telling the time, looking things up on Wikipedia, and telling a joke. Features like setting reminders or checking the weather can be added later.

Tools you'll need

  • Python 3 installed on your computer

  • A mic (your laptop mic is fine)

  • Internet (for the speech recognition part)

  • These Python libraries:

    • SpeechRecognition
    • pyttsx3
    • pyaudio
    • wikipedia
    • pyjokes

Run this to install everything:

    pip install SpeechRecognition pyttsx3 pyaudio wikipedia pyjokes

Sometimes pyaudio is a pain to install. If it fails, try searching for "install pyaudio wheel" for your system; it's a common fix.
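If you want to confirm the install worked before moving on, a quick check along these lines may help. It is only a sketch: the helper name missing_modules is made up for illustration, and the list assumes the usual import names for the five libraries (note that SpeechRecognition is imported as speech_recognition).

```python
import importlib.util

# Import names, which are not always the same as the pip package names.
REQUIRED_MODULES = ["speech_recognition", "pyttsx3", "pyaudio", "wikipedia", "pyjokes"]


def missing_modules(names):
    """Return the names that Python cannot find in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All libraries found.")
```

This only checks that the modules can be located; it does not test your microphone or speakers.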

Step 1: Make it talk

Before we teach it to listen, let's get the speaking part working.

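    import pyttsx3

    # Create the text-to-speech engine once and reuse it.
    engine = pyttsx3.init()

    def speak(text):
        # Print the reply too, so you can follow along without sound.
        print(f"Assistant: {text}")
        engine.say(text)
        engine.runAndWait()

    speak("Hello, I'm ready.")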

Run this. If your computer talks back, you’re good.

Step 2: Make it listen

Now let’s capture your voice and turn it into text.

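    import speech_recognition as sr

    def listen():
        r = sr.Recognizer()
        # Record one phrase from the default microphone.
        with sr.Microphone() as source:
            print("Listening...")
            audio = r.listen(source)
        try:
            # Send the audio to Google's web speech service (needs internet).
            text = r.recognize_google(audio)
            print(f"You said: {text}")
            return text.lower()
        except sr.UnknownValueError:
            # Audio was captured but could not be understood.
            speak("Sorry, I didn't get that.")
            return ""
        except sr.RequestError:
            # The speech service could not be reached.
            speak("Speech service is down.")
            return ""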

Now it listens to you and prints what you said. If it can’t figure it out, it just says sorry instead of breaking.

Step 3: Add commands

Okay, let’s teach it what to do when it hears you.

    import datetime
    import wikipedia
    import pyjokes

    def handle_command(command):
        # Returns True to keep the loop running, False to stop.
        if "time" in command:
            now = datetime.datetime.now().strftime("%I:%M %p")
            speak(f"The time is {now}")
            return True

        if "wikipedia" in command:
            topic = command.replace("wikipedia", "").strip()
            if not topic:
                speak("Tell me what to search on Wikipedia.")
                return True
            speak("Searching Wikipedia...")
            try:
                summary = wikipedia.summary(topic, sentences=2)
                speak(summary)
            except Exception:
                # Covers missing pages, ambiguous topics, and network errors.
                speak("I couldn't find that topic.")
            return True

        if "joke" in command:
            joke = pyjokes.get_joke()
            speak(joke)
            return True

        if "exit" in command or "quit" in command:
            speak("Goodbye.")
            return False

        speak("I don't know how to do that yet.")
        return True

Then glue everything together:

    if __name__ == "__main__":
        speak("Hello, how can I help?")
        running = True
        while running:
            query = listen()
            if query:
                running = handle_command(query)

Now test it. Say:

  • “time” → it tells you the time.
  • “Wikipedia Python” → it reads a short summary.
  • “tell me a joke” → you’ll get something silly.
  • “exit” → it stops.

Step 4: Improve its usability

It's rough, but it works for now. Some immediate enhancements:

To account for background noise, call this inside the "with sr.Microphone() as source:" block, before r.listen(source):

    r.adjust_for_ambient_noise(source, duration=0.5)

Keep answers brief. Long speeches are dull.

Handle errors with friendly responses.

These small details count. They make the assistant feel more like an assistant and less like a robot.
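One way to keep answers brief is to trim any long text before it is spoken. The helper below is a rough sketch, not part of any library: the name shorten and the default limits of two sentences and 300 characters are assumptions you can tune.

```python
import re


def shorten(text, max_sentences=2, max_chars=300):
    """Keep the first few sentences, then cut at a word boundary if still too long."""
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    short = " ".join(sentences[:max_sentences])
    if len(short) > max_chars:
        # Drop the last (possibly partial) word and mark the cut.
        short = short[:max_chars].rsplit(" ", 1)[0] + "..."
    return short


print(shorten("One. Two. Three."))
```

In the assistant, you could pass a reply through shorten before handing it to speak, for example speak(shorten(summary)).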

Step 5: Add more features

Once the basics work, adding new tricks is fun.

Open YouTube:

    import webbrowser  # goes at the top of the file

    # Inside handle_command, before the final fallback:
    if "open youtube" in command:
        webbrowser.open("https://youtube.com")
        speak("Opening YouTube.")
        return True

Weather reports: use the OpenWeatherMap API.

Reminders: use the schedule library.

To-do list: save and read from a text file.
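As an illustration of the to-do idea, the sketch below appends items to a plain text file and reads them back. The file name todo.txt and the function names add_todo and read_todos are assumptions made for this example.

```python
from pathlib import Path

TODO_FILE = Path("todo.txt")  # hypothetical file name; change as needed


def add_todo(item, path=TODO_FILE):
    """Append one item to the to-do file, one item per line."""
    with path.open("a", encoding="utf-8") as f:
        f.write(item.strip() + "\n")


def read_todos(path=TODO_FILE):
    """Return all saved items, or an empty list if the file does not exist yet."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

A command handler could then call add_todo with the words that follow "add to my list" and read the result of read_todos aloud with speak.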

Each new feature is just another “if” block.
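Once the chain of "if" blocks gets long, an alternative some people prefer is a lookup table of keywords and handler functions. This is a different technique from the one used above, shown only as a sketch: the names COMMANDS and dispatch are hypothetical, and the handlers return text instead of calling speak so the example runs without audio.

```python
import datetime


def tell_time(command):
    """Return the current time as a sentence."""
    now = datetime.datetime.now().strftime("%I:%M %p")
    return f"The time is {now}"


def say_goodbye(command):
    """Return a farewell message."""
    return "Goodbye."


# (keyword, handler) pairs, checked in order; the first match wins.
COMMANDS = [
    ("time", tell_time),
    ("exit", say_goodbye),
    ("quit", say_goodbye),
]


def dispatch(command):
    """Find the first keyword contained in the command and run its handler."""
    for keyword, handler in COMMANDS:
        if keyword in command:
            return handler(command)
    return "I don't know how to do that yet."


print(dispatch("what time is it"))
```

With this layout, adding a feature means writing one function and adding one line to COMMANDS.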

Step 6: Personality matters

Nobody likes a monotone assistant. Change the voice and pace.

    voices = engine.getProperty('voices')
    # Not every system has a second voice installed, so check first.
    if len(voices) > 1:
        engine.setProperty('voice', voices[1].id)   # Switch voice
    engine.setProperty('rate', 175)                 # Adjust speed

You can also make it greet you differently in the morning, afternoon, and evening. Small touches make it fun.
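A time-based greeting can be as small as the sketch below. The hour cutoffs (morning from 5, afternoon from 12, evening from 18) and the function names are assumptions; adjust them to taste.

```python
import datetime


def greeting_for(hour):
    """Map an hour of the day (0-23) to a greeting."""
    if 5 <= hour < 12:
        return "Good morning"
    if 12 <= hour < 18:
        return "Good afternoon"
    return "Good evening"


def current_greeting():
    """Greeting for the current local time."""
    return greeting_for(datetime.datetime.now().hour)


print(current_greeting())
```

In the main script, you could replace the fixed hello with something like speak(current_greeting() + ", how can I help?").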

Step 7: Organize your code

If you keep everything in one file, it’ll get messy fast. Break it into files:

  • speech_io.py → listening and speaking
     
  • commands.py → what it can do
     
  • main.py → runs the loop
     

Cleaner, easier to grow.

Step 8: What about “Hey Assistant”?

That’s called a wake word. It’s possible, but it means listening continuously in the background, which takes more resources. For now, just press Enter to start listening or use a hotkey. Later, you can explore proper wake word engines.
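The press-Enter approach might look like the sketch below. The function name push_to_talk and the "q to quit" convention are assumptions for illustration; listen and handle are passed in as arguments, so you can hand it the listen and handle_command functions from the earlier steps.

```python
def push_to_talk(listen, handle, wait=input):
    """Listen once per Enter press until the user quits or handle returns False."""
    while True:
        typed = wait("Press Enter to talk (or type q to quit): ")
        if typed.strip().lower() == "q":
            break
        query = listen()
        # An empty query means nothing was understood; just prompt again.
        if query and not handle(query):
            break
```

Usage would be push_to_talk(listen, handle_command) in place of the while loop in the main block. The wait parameter defaults to the built-in input and exists mainly so the loop can be tried without a keyboard.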

Step 9: Add a simple GUI

With tkinter, you can make a tiny window with a mic button and a box that shows what was said. Not required, but handy if you want to demo it to friends.

Step 10: Expect errors

Sometimes the mic doesn’t pick up. Sometimes the speech API is down. Don’t sweat it—just handle the errors gracefully. Say “I can’t connect” instead of freezing.
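One way to avoid freezing is to wrap each action so that any failure turns into a spoken apology. The wrapper below is a sketch under assumptions: the name run_safely and the fallback wording are made up, and speak defaults to print so the example runs without a speech engine.

```python
import logging


def run_safely(action, speak=print, fallback="Sorry, I can't connect right now."):
    """Call action(); on any error, log it and say a friendly fallback instead."""
    try:
        action()
        return True
    except Exception:
        # Keep the details in the log for debugging; the user hears only the fallback.
        logging.exception("Command failed")
        speak(fallback)
        return False
```

For example, run_safely(lambda: handle_command(query), speak=speak) would keep the main loop alive even if a network call inside a command raises an error.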

Where to take this next

You now have a working voice assistant. From here, you can:

  • Pull in daily news headlines.

  • Add translations.

  • Send quick emails.

  • Play music.

  • Take voice notes.

Each little project adds new skills.

And if you find yourself thinking “this is fun, I want to really learn AI,” that’s when a structured course comes in. The AI using Python course in Noida by Uncodemy is a good fit here because it moves beyond hobby projects into serious AI development—natural language processing, data handling, and building smarter apps.

Final thoughts

You can actually build your own voice assistant. You don't require a lot of code or expensive equipment. Only a microphone, a few libraries, and Python.

Start small with the time, Wikipedia, and jokes. Then expand: add a GUI, more features, and some personality. Before you know it, you'll have an assistant that feels genuinely helpful.

If you want to turn this curiosity into a real skill, don't just keep experimenting aimlessly. Learn the fundamentals through a structured course, such as Uncodemy's AI using Python course. It serves as a bridge between "weekend projects" and "real-world AI skills."

So take out your laptop, type a few lines of code, and greet your new helper. You'll be surprised how quickly it comes to life.
