Course subjects
Module 1: Introduction to AI and Sound
What is AI?
AI in Daily Life: Audio Examples
Basics of Sound Waves: Amplitude and Frequency
Digital Audio Fundamentals
Module 2: Harnessing AI Across Audio Domains
AI for Audio Enhancement and Restoration
AI for Audio Accessibility and Personalisation
AI in Speech and Voice Technologies
Popular Audio Libraries: Librosa, PyAudio
Use Case: AI-Driven Real-Time Captioning and Translation for Live Events
Case Study: Personalised Hearing Aid Adaptation Using AI and Smart Earbuds
Hands-on: Voice Emotion Detection Using Deepgram's Voice AI Platform
Module 3: Machine Learning & AI for Audio
Machine Learning Models for Audio Applications
Deep Learning & Advanced AI Techniques for Audio
Audio-Specific Architectures: CNNs, RNNs, Transformers
Transfer Learning in Audio AI
Use Case: Speech-to-Text Transcription for Medical Records
Case Study: AI-Powered Music Generation with Deep Learning
Hands-on: Build a Speech-to-Text Model Using TensorFlow
Module 4: Speech Recognition and Text-to-Speech
Fundamentals of Speech Recognition & Phonetics
API-Based ASR Solutions
Building Custom ASR Models with Transformers
Introduction to TTS & Voice Cloning
Use Case: Automating Meeting Transcriptions with Google Speech-to-Text API
Case Study: Custom Transformer-Based ASR Model for Multilingual Customer Support
Hands-on: Transcribe Audio with an ASR API; Generate Speech from Text
Module 5: Audio Enhancement & Noise Reduction
Common Audio Issues
AI-Based Noise Filtering & Enhancement
Use Case: Enhancing Audio Quality for Remote Work Calls Using AI Noise Reduction
Case Study: Krisp’s AI-Powered Noise Cancellation in Podcast Production
Hands-on: Use Krisp or Adobe Enhance Speech to Clean Noisy Audio
Module 6: Emotion & Sentiment Detection from Audio
Introduction to Emotion Detection
AI Models for Emotion Detection: RNNs, LSTMs, CNNs
Challenges: Bias, Multilingual Contexts, Reliability
Use Case: Enhancing Customer Service with Emotion Detection from Speech
Case Study: IBM Watson Tone Analyzer for Real-Time Emotion Recognition
Hands-on: Use IBM Watson Tone Analyzer or Similar APIs to Analyse Speech Samples
Module 7: Ethical and Privacy Considerations
Deepfakes and Voice Cloning Risks
Privacy and Data Security
Bias and Fairness in Audio AI
Use Case: Implementing Ethical Voice Data Collection and Consent Management
Case Study: Addressing Bias and Privacy in Audio AI Under GDPR Compliance
Hands-on: Detect Fake Audio Clips; Create an Ethical AI Checklist
Module 8: Advanced Applications & Future Trends
Sound Event Detection & Classification
Audio Search and Indexing
Innovations: Multimodal AI, Edge Computing, 3D Audio
Emerging Careers in Audio AI