Using AI to digitally clone myself (AKA creating a Puny-Punya)

Sunday, March 19, 2023

Note: The photo-manipulated image of me holding my own head was created almost 20 years ago by Paul Kurf, a student in my learning by design class! Image design & layout: Punya


Ethan Mollick is a professor at Wharton who has been doing some of the most interesting work playing with ChatGPT and other generative language models, particularly exploring their role in teaching. If you haven’t been following him, and are even peripherally interested in these issues, do check out his Substack: One Useful Thing.

One of his experiments involved creating a video of himself that was entirely AI-generated (see A quick and easy guide to cloning yourself).

Of course I had to try this for myself to create my own mini-Me (or should it be Puny-Punya?).

I decided to create a short video of myself introducing an episode of Just an Hour, a series of informal conversations that Betty Gee and I co-host every Friday for faculty and doctoral students. This felt appropriate for two reasons. First, that particular session was devoted to generative AI in higher education, and it seemed right to provide an example of what this could look like. Second, I knew I was going to miss that particular episode, since I was flying back from the SITE conference, so it made sense to share a short video introducing the session and sending my regrets for missing the meeting.

With that in mind, I asked ChatGPT to write a short script on my behalf, in first person. In parallel, I trained Eleven Labs to generate my voice by reading it 2 to 3 minutes of prose from my website. Eleven Labs took that recording and created a virtual voice for me, which could then read out any prose it was given, in that voice. I gave it the prose created by ChatGPT and it read it back to me in my “digitally generated” voice. Finally, I uploaded a photo of myself and the MP3 to the D-ID website. (I had previously used this website to create the video for my learning styles blog post.)

Within minutes I had a video I could download.
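
For readers who think in code, the workflow described above can be sketched as three chained steps. This is only an illustrative Python sketch under stated assumptions: the function names, the file names, and the Artifact type are all hypothetical, and none of the functions actually call ChatGPT, Eleven Labs, or D-ID. They are stand-ins that simply record what each step hands to the next.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """One intermediate product of the (hypothetical) pipeline."""
    kind: str     # "text", "audio", or "video"
    made_by: str  # which step produced it
    note: str     # stand-in for the real content or file


def write_script(prompt: str) -> Artifact:
    # Placeholder: in the post, a chatbot wrote the first-person script.
    return Artifact("text", "chatbot", f"script for: {prompt}")


def clone_voice(script: Artifact, voice_sample: str) -> Artifact:
    # Placeholder: a voice service trained on a short recording reads the script.
    if script.kind != "text":
        raise ValueError("voice step expects a text script")
    return Artifact("audio", "voice service",
                    f"{voice_sample} reading [{script.note}]")


def animate_photo(photo: str, audio: Artifact) -> Artifact:
    # Placeholder: a talking-head service combines one photo with the audio.
    if audio.kind != "audio":
        raise ValueError("video step expects an audio track")
    return Artifact("video", "talking-head service",
                    f"{photo} speaking [{audio.note}]")


def make_clone_video(prompt: str, voice_sample: str, photo: str) -> Artifact:
    # Chain the three steps: script, then voice, then video.
    script = write_script(prompt)
    audio = clone_voice(script, voice_sample)
    return animate_photo(photo, audio)


if __name__ == "__main__":
    video = make_clone_video(
        prompt="introduce this week's episode and send regrets",
        voice_sample="voice_sample.mp3",  # hypothetical file name
        photo="headshot.jpg",             # hypothetical file name
    )
    print(video.kind, "<-", video.note)
```

The point of the sketch is the shape of the process: each tool consumes the previous tool's output, and the only human inputs are a prompt, a short voice recording, and a single photo.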

So just to be clear, the text, audio and video in this clip were all generated by AI (with minimal input from my side).

So does this video look and sound like me? I would ask you to be the judge of that, but before we get to the videos, a few things to note.

  1. This entire process took me around 30 minutes (and that includes the time it took to create the accounts, train Eleven Labs on my voice, take a picture, upload it, and so on).
  2. Also, all this was done by spending just one dollar! Eleven Labs needed me to sign up and pay for the service but since they were having a sale I got to use it for that amazing price!
  3. The final product is sort of weird, and does not look extremely realistic. But just the fact that it even exists is amazing. I am sure if I had spent more time and money I could have had a much better product.
  4. I also experimented with creating a low-res version of the video and strangely enough it actually seems more realistic and believable than the hi-res one. It is almost as if the graininess of the video makes us ignore the other glitches.
  5. Last but surely not least, Eleven Labs totally messed with my voice and accent. If you have heard me speak, or recognize my voice, you will immediately realize that my voice has been changed quite dramatically, removing almost all traces of my Indian accent. Again, given my previous experience with these technologies, I am not surprised at all by this!

With that, here are the two videos, first the hi-res version followed by the low-res version. What do you think?

The Hi-Res version of the video.

The Low-Res version of the video.

The key question, of course, that we should all be asking ourselves: What does it mean when the cost of creating content like this drops to zero?


Addendum

For the record, below is the image that Paul Kurf created for me almost 20 years ago!

Despite my complaints about how Eleven Labs messed up my accent, it clearly does a better job with other voices, as evidenced by this story I read today about how an AI-generated voice allowed someone to break into a bank account.
