Tag Archives: A.I.

Using A.I. to Create Music with Ampermusic and Jukedeck

For the last 14 years I’ve created the Audio Art Tour for Burning Man. It’s kind of a docent led audio guide to the major art installations out there, similar to an audio guide you might get at a museum.

Burning Man always has a different ‘theme’ and this year it was ‘I, Robot’. I generally try and find background music related to the theme. EDM is big at Burning Man, land of 10,000 DJs, so I could’ve just grabbed some electronic tracks that sounded robotic. Easy enough to do. However I  decided to let Artificial Intelligence algorithms create the music! (You can listen to the tour and hear the different tracks)

This turned out to be not so easy, so I’ll break down what I had to do to get seven unique sounding, usable tracks. I had a bit more success with AmperMusic, which is also currently free (unlike Jukedeck), so I’ll discuss that first.

Getting the Tracks

The problem with both services was getting unique sound tracks. The A.I. has a tendency of creating very similar sounding music. Even if you select different styles and instruments you often end up with oddly similar music. This problem is compounded by Amper’s inability to render more than about 30 seconds of music.

Using Artificial Intelligence and machine learning to create music

What I found I had to do was let it generate 30 seconds randomly or with me selecting the instruments. I did this repeatedly until I got a 30 second sample I liked. At which point I extended it out to about 3 or 4 minutes and turned off all the instruments but two or three. Amper was usually able to render that out. Then I’d turn off those instruments and turn back on another three. Then render that. Rinse, repeat until you’ve rendered all the instruments.

Now you’ve got a bunch of individual tracks that you can combine to get your final music track. Combine them in Audition or even Premiere Pro (or FCP or whatever NLE) and you’re good to go. I used that technique to get five of the tracks.

Jukedeck didn’t have the rendering problem but it REALLY suffered from the ‘sameness’ problem. It was tough getting something that really sounded unique. However, I did get a couple good tracks out of it.

Problems Using Artificial Intelligence

This is another example of A.I. and Machine Learning that works… sort of. I could have found seven stock music tracks that I like much faster (this is what I usually do for the Audio Art Tour).  The amount of time it took me messing around with these services was significant. Also, if Jukedeck is any indication, a music track from one of these services will cost as much as a stock music track. Just go to Pond5 to see what you can get for the same price. With a much, much wider variety. I don’t think living, breathing musicians have much to worry about. At least for now.

That said, I did manage to get seven unique, cool sounding tracks out of them. It took some work, but it did happen.

As with most A.I./ML, it’s difficult to see what the future looks like. There has certainly been a ton of advances, but I think in a lot of cases, it’s some of the low hanging fruit. We’re seeing that with Speech-to-text algorithms in Transcriptive where they’re starting to plateau and cluster around the same accuracy levels. The fruit (accuracy) is now pretty high up and improvement are tough. It’ll be interesting to see what it takes to break through that. More data? Faster servers? A new approach?

I think music may be similar. It seems like it’s a natural thing for A.I. but it’s deceptively difficult to do in a way that mimics the range and diversity of styles and sounds that many human musicians have. Particularly a human armed with a synth that can reproduce an entire orchestra. We’ll see what it takes to get A.I. music out of the Valley of Sameness.

 

Artificial Intelligence is The New VR

Couple things stood out to me at NAB.

1) Practically every company exhibiting was talking about A.I.-something.

2) VR seemed to have disappeared from vendor booths.

The last couple years at NAB, VR was everywhere. The Dell booth had a VR simulator, Intel had a VR simulator, booths had Oculuses galore and you could walk away with an armful of cardboard glasses… this year, not so much. Was it there? Sure, but it was hardly to be seen in booths. It felt like the year 3D died. There was a pavilion, there were sessions, but nobody on the show floor was making a big deal about it.

In contrast, it seemed like every vendor was trying to attach A.I. to their name, whether they had an A.I. product or not. Not to mention, Google, Amazon, Microsoft, IBM, Speechmatics and every other big vendor of A.I. cloud services having large booths touting how their A.I. was going to change video production forever.

I’ve talked before about the limitations of A.I. and I think a lot of what was talked about at NAB was really over promising what A.I. can do. We spent most of the six months after releasing Transcriptive 1.0 developing non-A.I. features to help make the A.I. portion of the product more useful. The release were announcing today and the next release coming later this month will focus on getting around A.I. transcripts completely by importing human transcripts.

There’s a lot of value in A.I. It’s an important part of Transcriptive and for a lot use cases it’s awesome. There are just also a lot of limitations.  It’s pretty common that you run into the A.I. equivalent of the Uncanny Valley (a CG character that looks *almost* human but ends up looking unnatural and creepy), where A.I. gets you 95% of the way there but it’s more work than it’s worth to get the final 5%. It’s better to just not use it.

You just have to understand when that 95% makes your life dramatically easier and when it’s like running into a brick wall. Part of my goal, both as a product designer and just talking about it, is to help folks understand where that line in the A.I. sand is.

I also don’t buy into this idea that A.I. is on an exponential curve and it’s just going to get endlessly better, obeying Moore’s law like the speed of processors.

When we first launched Transcriptive, we felt it would replace transcriptionists. We’ve been disabused of that notion. ;-) The reality is that A.I. is making transcriptionists more efficient. Just as we’ve found Transcriptive to be making video editors more efficient. We had a lot of folks coming up to us at NAB this year telling us exactly that. (It was really nice to hear. :-)

However, much of the effectiveness of Transcriptive comes more from the tools that we’ve built around the A.I. portion of the product. Those tools can work with transcripts and metadata regardless of whether they’re A.I. or human generated. So while we’re going to continue to improve what you can do with A.I., we’re also supporting other workflows.

Over the next couple months you’re going to see a lot of announcements about Transcriptive. Our goal is to leverage the parts of A.I. that really work for video production by building tools and features that amplify those strengths, like PowerSearch our new panel for searching all the metadata in your Premiere project, and build bridges to other technology that works better in other areas, such as importing human created transcripts.

Should be a fun couple months, stay tuned! btw… if you’re interested in joining the PowerSearch beta, just email us at cs@nulldigitalanarchy.com.

Addendum: Just to be clear, in one way A.I. is definitely NOT VR. It’s actually useful. A.I. has a lot of potential to really change video production, it’s just a bit over-hyped right now. We, like some other companies, are trying to find the best way to incorporate it into our products because once that is figured out, it’s likely to make editors much more efficient and eliminate some tasks that are total drudgery. OTOH, VR is a parlor trick that, other than some very niche uses, is going to go the way of 3D TV and won’t change anything.

Jim Tierney
Chief Executive Anarchist
Digital Anarchy

Just Say No to A.I. Chatbots

For all the developments in artificial intelligence, one of the consistently worst uses of it is with chatbots. Those little ‘Chat With Us’ side bars on many websites. Since we’re doing a lot with artificial intelligence (A.I.) in Transcriptive and in other areas, I’ve gotten very familiar with how it works and what the limitations are. It starts to be easy to spot where it’s being used, especially when it’s used badly.

So A.I. chatbots, which really doesn’t work well, have become a bit of a pet peeve of mine. If you’re thinking about using them for your website, you owe it to yourself to  click around the web and see how often ‘chatting’ gets you a usable answer. It’s usually just frustrating. You go a few rounds with a cheery chatbot before getting to what you were going to do in the first place… send a message that will be replied to by a human. Total waste of time and doesn’t answer the questions.

Artificial intelligence isn't great for chatbotsDo you trust cheery, know-nothing chatbots with your customers?

The main problem is that chatbots don’t know when to quit. I get it that some business receive the same question over and over… where are you located? what are your hours? Ok, fine, have a chatbot act as a FAQ. But the chatbot needs to quickly hand off the conversation to a real person if the questions go beyond what you could have in an FAQ. And frankly, an FAQ would be better than trying to fake-out people with your A.I. chatbot. (honesty and authenticity matter, even on the web)

A.I. is just not great at reading comprehension. It can get the jist of things usually, which I think is useful for analytics and business intelligence. But this doesn’t allow it to respond with any degree of accuracy or intelligence. For responding to customer queries it produces answers that are sort of close… but mostly unusable. So, the result is frustrated customers.

Take a recent experience with Audi. I’m looking at buying a new car and am interested in one of their SUVs. I went onto an Audi dealer site to inquire about a used one they had. I wanted to know 1) was it actually in stock and 2) how much of the original warranty was left since it was a 2017? There was a button to send a message which I was originally going to use but decided to try the chat button that was bouncing up and down getting my attention.

So, I asked those questions in the chat. If it had been a real person, they definitely could have answered #1 and probably #2, even if they were just an assistant. But no, I ended in the same place I would’ve been if I’d just clicked ‘send a message’ in the first place. But first, I had to get through a bunch of generic answers that didn’t answer any of my questions and just dragged me around in circles. This is not a good way to deal with customers if you’re trying to sell them a $40,000 car.

And don’t get me started on Amazon’s chatbots. (and emailbots for that matter)

It’s also funny to notice how the chatbots try and make you think it’s human, with misspelled words and faux emotions. I’ve had a chatbot admonish me with ‘I’m a real person…’ when I called it a chatbot. It then followed that with another generic answer that didn’t address my question. The Pinocchio chatbot… You’re not a real boy, not a real person and you don’t get to pass Go and collect $200. (The real salesperson I eventually talked to confirmed it was a chatbot.)

I also had one threaten to end the chat if I didn’t watch my language, which was not aimed at the chatbot. I just said, “I just want this to f’ing work”. A little generic frustration. However, after it told me to watch my language, I went from frustrated to kind of pissed. So much for artificial intelligence having emotional intelligence. Getting faux-insulted over something almost any real human would recognize as low grade frustration, is not going to make customers happier.

I think A.I. has some amazing uses, Transcriptive makes great use of A.I. but it also has a LOT of shortcomings. All of those shortcomings are glaringly apparent when you look at chatbots. There are, of course, many companies trying to create conversational A.I. but so far the results have been pretty poor.

Based on what I’ve seen developing products with A.I., I think it’s likely it’ll be quite a while before conversational A.I. is a good experience on a regular basis. You should think very hard about entrusting your customers to it. A web form or FAQ is going to be better than a frustrating experience with a ‘sales person’.

Not sure what this has to do with video editing. Perhaps just another example of why A.I. is going to have a hard time editing anything that requires comprehending the content. Furthering my belief that A.I. isn’t going to replace most video editors any time soon.

Getting transcripts for Premiere Multicam Sequences

Using Transcriptive with multicam sequences is not a smooth process and doesn’t really work. It’s something we’re working on coming up with a solution for but it’s tricky due to Premiere’s limitations.

However, while we sort that out, here’s a workaround that is pretty easy to implement. Here are the steps:

1- Take the clip with the best audio and drop it into it’s own sequence.
Using A.I. to transcribe Premiere Multicam Sequences
2- Transcribe that sequence with Transcriptive.
3- Now replace that clip with the multicam clip.
Transcribing multicam in Adobe premiere pro

4- Voila! You have a multicam sequence with a transcript. Edit the transcript and clip as you normally would.

This is not a permanent solution and we hope to make it much more automatic to deal with Premiere’s multicam clips. In the meantime, this technique will let you get transcripts for multicam clips.

Thanks to Todd Drezner at Cohn Creative for suggesting this workaround.