Ethical Use of AI in the Classroom

This post will be less funny and entertaining than usual, and more focused on lesson planning and content creation around machine learning in the classroom, along with some projects you can start. As such, it is not posted on the main page, but under the new heading, Classroom Content.

Singing and speaking machine learning is a niche area of the AI field. Its focus is building models of voices that can then say anything the user wants. Yes, it can be used maliciously, but that’s where you as educators can step in to control it and teach its proper uses. One area of focus is voice actors, such as Majel Barrett-Roddenberry, a name you’ve likely never heard (outside of the surname, ya Trekkie). She is the voice of the Star Trek computer. She has long since passed away, but she was a pro who prerecorded lines for use in future episodes and series, which is why we can still hear her today.

Now imagine the voice actor (VA) for Homer Simpson passes away. We now have no more Homer Simpson, right? Wrong. Enter AI and the SVC/RVC and SIT toolkits.

The SIT toolkit, provided free under its license, is your starting point. Get it today and have fun with it, and see if you can generate a model using the examples and sample content below. The SIT toolkit is very easy to use and portable, so it can be distributed through Google Classroom and similar platforms. The code is also checked by GitHub’s and Hugging Face’s own built-in security and malicious code checks. It does not require network or internet access.

By now, you might see the problem of malicious use arising. Most people instantly think, “the kids in my class will use it to make fake phone calls to their parents and make them say things!” First, no. Stop that thinking right away, because you’ll soon learn how complex this is and how far that belief is from reality. Maybe a few fewer movies about AI is what your 2024 holiday season should entail 😉

Once you have the toolkit, you need to provide it with data. This is the biggest blocker for making AIs based on anyone’s voice, not just yours, dear reader. In a recent personal project, I wanted to isolate and generate a model of Dr. House, from the TV series starring Hugh Laurie. Using my toolkit, I started with 176 TV episodes averaging 45 minutes each. After isolating and trimming ONLY the portions of those episodes where Dr. House is actively speaking, I was left with fewer than 5 hours of data. To better understand what this data file looks like internally, think of one person speaking into a microphone (studio quality, sound booth, etc.). There is NO sentence structure and there are no pauses; they don’t breathe, and it never stops. It’s a constant, never-ending stream of someone babbling random words, for example:

“lupusdoctorhousevicodintreatmentsentencehelloillburgermosquitocia”
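To put those House numbers in perspective, here is a quick back-of-the-envelope sketch in Python. It uses only the figures quoted in this post (176 episodes, roughly 45 minutes each, fewer than 5 hours kept). The function and variable names are my own, made up for illustration, and are not part of the SIT toolkit.

```python
# Figures quoted in this post; everything else here is illustrative.
EPISODES = 176
AVG_MINUTES_PER_EPISODE = 45
USABLE_HOURS = 5  # upper bound: the post says "fewer than 5 hours"


def raw_hours(episodes: int, avg_minutes: float) -> float:
    """Total hours of source footage before any trimming."""
    return episodes * avg_minutes / 60


def usable_fraction(usable_hours: float, total_hours: float) -> float:
    """Share of the source footage that survives as training data."""
    return usable_hours / total_hours


total = raw_hours(EPISODES, AVG_MINUTES_PER_EPISODE)
fraction = usable_fraction(USABLE_HOURS, total)

print(f"Raw footage: {total:.0f} hours")
print(f"Usable share: at most {fraction:.1%}")
```

In other words, about 132 hours of professionally recorded television yields, at best, under 4% usable speech for a single character.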

Can you honestly say that your students have the ability to collect that much data in crystal clear, studio-quality audio? I strongly doubt it. Even audio stealthily captured from hidden microphones or devices in pockets or desks can’t reach the clarity required to make the model speak accurately, professionally, and, most importantly, believably.

To create a lesson plan, you will need a dataset of audio, which you can find publicly released on Hugging Face for many actors, VAs, cartoons, and just about everything else you can think of. I have provided a raw, untrained, and copyright-free one in the link dump below for easy planning. For younger classrooms, you can wait for the next post, where we will generate a speaking AI for “President Biden” or “Kim Kardashian” and have it repeat whatever your students want it to say, skipping the development side of things.

Happy 2025, everyone, and don’t forget to be awesome!

Link Collection:

Speaker Identification Toolkit, aka SIT:
https://github.com/ThatJeffGuy/speaker-identification-toolkit

Pretrained Models:
https://huggingface.co/QuickWick/Music-AI-Voices/tree/main

Untrained Dataset:
https://huggingface.co/datasets/ScottishHaze/PayMoneyWubby

** The RVC Toolkit and SVC Toolkit are needed for the future post mentioned above, but you can download them ahead of time here. I do not recommend setting them up yet, as our process may change:

RVC Toolkit:
https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/tree/main

SVC Toolkit:
https://github.com/PlayVoice/whisper-vits-svc/tree/bigvgan-mix-v2
