Tag Archives: Adobe

Testing A.I. Transcript Accuracy (most recent test)

Periodically we test various AI services to see if we should be using something different on the backend of Transcriptive-A.I. We're more interested in having the most accurate A.I. than we are in sticking with a particular service (or trying to develop our own). The different services have different costs, which is why Transcriptive Premium costs a bit more, but not being tied to one service gives us more flexibility in deciding which to use.

This latest test will give you a good sense of how the different services compare, particularly in relation to Adobe’s transcription AI that’s built into Premiere.

The Tests

Short Analysis (i.e. TL;DR):

For well-recorded audio, all the A.I. services are excellent. There isn't a lot of difference between the best and worst A.I.… maybe one or two words per hundred. There is a BIG drop-off as audio quality gets worse, and you can really see this with Adobe's service and the regular Transcriptive-A.I. service.

A 2% difference in accuracy is not a big deal. As you start getting up around 6-7% and higher, the additional time it takes to fix errors in the transcript starts to become really significant. Every additional 1% in accuracy means 3.5 minutes less of cleanup time (for a 30-minute clip). So small improvements in accuracy can make a big difference if you (or your Assistant Editor) need to clean up a long transcript.

So when you see an 8% difference between Adobe and Transcriptive Premium, realize it's going to take you about 25-30 minutes longer to clean up a 30-minute Adobe transcript.
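If you want to run that math on your own projects, here's the back-of-the-envelope version (a quick sketch based on the 3.5-minutes-per-percent rule of thumb above; it assumes the savings scale linearly with clip length):

```python
# Back-of-the-envelope cleanup-time estimate, based on the rule of thumb above:
# every additional 1% of accuracy saves ~3.5 minutes of cleanup on a 30-minute clip.
# Assumes (our assumption) that the savings scale linearly with clip length.

def extra_cleanup_minutes(accuracy_gap_percent: float, clip_minutes: float = 30.0) -> float:
    minutes_per_percent_per_30min = 3.5
    return accuracy_gap_percent * minutes_per_percent_per_30min * (clip_minutes / 30.0)

# The 8% gap mentioned above, on a 30-minute clip:
print(extra_cleanup_minutes(8))       # 28.0 -> roughly 25-30 minutes of extra cleanup
# The same gap on a 60-minute interview:
print(extra_cleanup_minutes(8, 60))   # 56.0
```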

Takeaway: For high-quality audio, you can use any of the services… Adobe's free service or the $0.04/min TS-AI service. For audio of medium to poor quality, you'll save yourself a lot of time by using Transcriptive Premium. (Getting Adobe transcripts into Transcriptive requires jumping through a couple of hoops… Adobe didn't make it as easy as they could've, but it's not hard. Here's how to import Adobe transcripts into Transcriptive.)

(For more info on how we test, see this blog post on testing AI accuracy)

Long Analysis

When we do these tests, we look at two graphs: 

  1. How each A.I. performed for specific clips
  2. The accuracy curve for each A.I., which shows how it did from its best result to its worst.

The important thing to realize when looking at the Accuracy Curves (#2 above) is that the corresponding points on each curve are usually different clips. The best clip for one A.I. may not have been the best clip for a different A.I. I find this Overall Accuracy Curve (OAC) to be more informative than the ‘clip-by-clip’ graph. A given A.I. may do particularly well or poorly on a single clip, but the OAC smooths the variation out and you get a better representation of overall performance.
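To make that concrete, here's roughly how an OAC gets built (a small sketch, not our actual charting code), using the per-clip numbers from the results table later in this post: each service's scores are simply sorted from best to worst, independently of which clip produced them.

```python
# Build an Overall Accuracy Curve (OAC): sort each service's per-clip accuracy
# from best to worst, independently of the other services. The numbers are the
# per-clip results from the table later in this post.
results = {
    "TS A.I.":      [97.2, 97.6, 91.1, 92.3, 89.1, 85.5],
    "Adobe":        [97.2, 97.2, 88.6, 96.9, 93.9, 80.7],
    "Speechmatics": [97.8, 99.5, 95.1, 98.0, 96.1, 89.8],
    "TS Premium":   [100.0, 97.6, 97.6, 97.4, 96.1, 92.8],
}

oac = {service: sorted(scores, reverse=True) for service, scores in results.items()}

for service, curve in oac.items():
    # Point N on each curve is that service's Nth-best clip, not the same clip across services.
    print(f"{service:13s} {curve}")
```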

Take a look at the charts for this test (the audio files used are available at the bottom of this post):



Overall accuracy curve for AI Services

All of the A.I. services will fall off a cliff, accuracy-wise, as the audio quality degrades. Anything below about 90% accuracy is probably better handled by a human, and certainly anything below 80%. At 80% it will very likely take more time to clean up the transcript than to just transcribe it manually from scratch.

The two things I look for in the curve are where it breaks below 95% and where it breaks below 90%. And, of course, how that compares to the other curves. The longer the curve stays above those percentages, the more audio degradation a given A.I. can deal with.

You're probably thinking, well, that's just six clips! True, but if you choose six clips with a good range of quality, from great to poor, the curve will look roughly the same as it would with more clips. Here's the full test with about 30 clips:

Accuracy of Adobe vs. Transcriptive, full test results

While the curves look a little different (the regular TS A.I. looks better in this graph), it mostly follows the pattern of the six-clip OAC. And the 'cliffs' become more apparent… points where a given level of audio quality causes A.I. performance to drop to a lower tier. Most of the A.I.s will stay at a certain accuracy for a while, then drop down, hold there for a bit, drop down again, etc., until the audio degrades so much that the A.I. basically fails.

Here are the actual test results:

Clip        TS A.I.   Adobe     Speechmatics   TS Premium
Interview   97.2%     97.2%     97.8%          100.0%
Art         97.6%     97.2%     99.5%          97.6%
NYU         91.1%     88.6%     95.1%          97.6%
LSD         92.3%     96.9%     98.0%          97.4%
Jung        89.1%     93.9%     96.1%          96.1%
Zoom        85.5%     80.7%     89.8%          92.8%
Remember: Every additional 1% in accuracy means 3.5 minutes less of cleanup time (for a 30-minute clip).

So that's the basics of testing different A.I.s! Here are the clips we used for the smaller test to give you an idea of what's meant by 'High Quality' or 'Poor Quality'. The more jargon, background noise, accents, soft speaking, etc. there is in a clip, the harder it'll be for the A.I. to produce good results. And you can hear that below. You'll notice that all the clips are 1 to 1.5 minutes long. We've found that as long as the clip is representative of the longer clip it's taken from, you don't get any additional info from the whole clip. An hour-long clip will produce similar results to one minute, as long as that one minute has the same speakers, jargon, background noise, etc.

Any questions or feedback, please leave a note in the Comments section! (or email us at cs@digitalanarchy.com)


‘Art’ test clip


‘Interview’ test clip


‘Jung’ test clip


‘NYU’ test clip


‘LSD’ test clip


‘Zoom’ test clip

Adobe Transcripts and Captions & Transcriptive: Differences and How to Use Them Together

Adobe just released a big new Premiere update that includes their Speech-to-Text service. We've had a lot of questions about whether this kills Transcriptive or not (it doesn't… check out the new Transcriptive Rough Cutter!). So I thought I'd take a moment to talk about some of the differences and similarities, and how to use them together.

The Adobe system is basically what we did for Transcriptive 1.0 in 2017. Since then, Transcriptive Rough Cutter has evolved into an editing and collaboration tool, not just something you use to get transcripts.

The Adobe solution is really geared towards captions. That's the problem they were trying to solve, and you can see it in the fact that you can only transcribe sequences. And only one at a time. So if you want captions for your final edit, it's awesome. If you want to transcribe all your footage so you can search it, pull out selects, etc.… it doesn't do that.

So, in some ways the Transcriptive suite (Transcriptive Rough Cutter, PowerSearch, TS Web App) is more integrated than Adobe's own service, allowing you to transcribe clips and sequences and then search, share, or assemble rough cuts with those transcripts. There are a lot of ways using text in the editing process can make life easier for an editor, beyond just creating captions.

Sequences Only

Adobe's Text panel for transcribing sequences

The Adobe transcription service only works for Sequences. It’s really designed for use with the new Caption system they introduced earlier this year.

Transcriptive can transcribe media and sequences, giving the user a lot more flexibility. One example: they can transcribe media first, use that to find soundbites or information in the clips, and build a sequence off that. As they edit the sequence, add media, or make changes, they can regenerate the transcript without any additional cost. The transcripts are attached to the media… so Transcriptive just looks for which portions of the clips are in the sequence and grabs the transcript for those portions.
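Conceptually, that lookup works something like the sketch below (a simplified illustration, not Transcriptive's actual code; the `clip_transcripts` and `sequence_segments` data structures are hypothetical). Because every word in a clip's transcript carries source timecode, building a sequence transcript is just a matter of pulling the words that fall inside each clip segment used in the timeline.

```python
# Simplified illustration of pulling a sequence transcript from clip transcripts.
# Each clip's transcript is a list of (start_seconds, end_seconds, word) tuples in
# the clip's own time, and each sequence segment says which portion of which clip
# is used in the timeline. (Hypothetical data structures, for illustration only.)

clip_transcripts = {
    "interview.mov": [(0.0, 0.4, "So"), (0.4, 0.9, "tell"), (0.9, 1.2, "me"), (1.2, 1.8, "about")],
    "broll.mov":     [(5.0, 5.5, "The"), (5.5, 6.1, "gallery"), (6.1, 6.7, "opened")],
}

sequence_segments = [
    # (clip name, in point within the clip, out point within the clip)
    ("interview.mov", 0.0, 1.3),
    ("broll.mov", 5.0, 6.7),
]

def sequence_transcript(segments, transcripts):
    words = []
    for clip, clip_in, clip_out in segments:
        for start, end, word in transcripts[clip]:
            if start >= clip_in and end <= clip_out:   # the word falls inside the used portion
                words.append(word)
    return " ".join(words)

print(sequence_transcript(sequence_segments, clip_transcripts))
# -> "So tell me The gallery opened"
```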

Automatic Rough Cut

Rough Cut: There are two ways of assembling a 'rough cut' with Transcriptive Rough Cutter. The first is what we're calling Selects, which is basically what I mention above in the 'Sequences Only' paragraph: search for a soundbite, set In/Out points in the transcript of the clip that contains it, and insert that portion of the video into a sequence.

Then there's the Rough Cut feature, where Transcriptive RC will take a transcript that you edit and assemble a sequence automatically: creating edits where you've deleted or struck through text and removing the video that corresponds to those text edits. This is not something Adobe can do or has given any indication they will do, so far anyway.
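The idea behind it looks something like this sketch (a simplified illustration of the concept, not the actual feature; the word list and `deleted` flag are hypothetical): because every word has timecode, deleting text tells you exactly which ranges of the source to leave out of the assembled sequence.

```python
# Simplified illustration of turning transcript edits into a rough cut. Each word
# carries its timecode plus a flag for whether the editor deleted (or struck
# through) it in the transcript. Consecutive kept words are merged into the clip
# ranges that get assembled into a sequence. (Hypothetical data, for illustration.)

words = [
    {"word": "Welcome", "start": 0.0, "end": 0.5, "deleted": False},
    {"word": "um",      "start": 0.5, "end": 0.8, "deleted": True},
    {"word": "to",      "start": 0.8, "end": 1.0, "deleted": False},
    {"word": "the",     "start": 1.0, "end": 1.2, "deleted": False},
    {"word": "show",    "start": 1.2, "end": 1.7, "deleted": False},
]

def kept_ranges(words):
    """Merge consecutive kept words into (start, end) ranges for the sequence."""
    ranges = []
    for w in words:
        if w["deleted"]:
            continue
        if ranges and abs(ranges[-1][1] - w["start"]) < 1e-6:
            ranges[-1][1] = w["end"]                    # word continues the current edit
        else:
            ranges.append([w["start"], w["end"]])       # a cut: start a new edit
    return [tuple(r) for r in ranges]

print(kept_ranges(words))
# -> [(0.0, 0.5), (0.8, 1.7)]  two edits, with the deleted "um" cut out
```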

Editing with text in Premiere Pro and Transcriptive Rough Cutter

Collaboration with The Transcriptive Web App

One key difference is the ability to send transcripts to someone who does not have Premiere. They can edit those transcripts in a web browser, add comments, and then send it all back to you. They can even delete portions of the text, and you can use the Rough Cut feature to assemble a sequence based on that.

Searching Your Premiere Project

PowerSearch: This separate panel (included with TS) lets you search every piece of media in your Premiere project that has a transcript in metadata or in clip/sequence markers. Premiere is pretty lacking in the search department, and PowerSearch gives you a search engine for Premiere. It only works for media/sequences transcribed by Transcriptive. Adobe, in their infinite wisdom, made their transcript format proprietary and we can't read it. So unless you export it out of Premiere and then import it into Transcriptive, PowerSearch can't read the text, unfortunately.
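The concept is simple enough to show in a few lines (purely illustrative, not PowerSearch's implementation; the `project_transcripts` data is hypothetical): look through every transcript in the project for a phrase and report the clip and timecode of each hit so you can jump straight to it.

```python
# Illustrative sketch of searching transcripts across a project: find every clip or
# sequence whose transcript contains a phrase and report where it occurs, so the
# editor can jump to that timecode. (Hypothetical data, for illustration only.)

project_transcripts = {
    "interview_01.mov": [(12.4, "we opened the gallery in 2015"), (48.0, "the budget was tight")],
    "interview_02.mov": [(3.2, "the gallery almost closed twice")],
    "rough_cut_v1":     [(95.5, "our gallery is a community space")],
}

def search_project(transcripts, phrase):
    phrase = phrase.lower()
    hits = []
    for item, segments in transcripts.items():
        for timecode, text in segments:
            if phrase in text.lower():
                hits.append((item, timecode, text))
    return hits

for item, timecode, text in search_project(project_transcripts, "gallery"):
    print(f"{item} @ {timecode:6.1f}s  {text}")
```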

Easier to Export Captions

Transcriptive RC lets you output SRT, VTT, SCC, MCC, SMPTE, or STL just by clicking Export. You can then use these in any other program. With Adobe you can only export SRT, and even that takes multiple steps. (You can get other file formats when you export the rendered movie, but you have to render the timeline to have it generate those.)

I assume Adobe is trying to make it difficult to use the free Adobe transcripts anywhere other than Premiere, but I think it's a bit shortsighted. You can't even get the caption file if you render out audio… you have to render a movie. Of course, the workaround is just to turn off all the video tracks and render out black frames. So it's not that hard to get the caption files, you just have to jump through some hoops.
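For reference, the caption formats themselves aren't complicated. SRT, for example, is just numbered cues with start/end timestamps. Here's a minimal sketch of writing one from (start, end, text) timings (illustrative only; Transcriptive and Premiere generate these files for you):

```python
# Minimal sketch of writing an SRT caption file from (start, end, text) cues.
# SRT is just numbered blocks with HH:MM:SS,mmm timestamps separated by blank lines.

def srt_timestamp(seconds):
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def write_srt(cues, path):
    with open(path, "w", encoding="utf-8") as f:
        for i, (start, end, text) in enumerate(cues, start=1):
            f.write(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n\n")

write_srt([(0.0, 2.5, "Welcome to the show."),
           (2.5, 5.0, "Today we're talking about transcripts.")], "captions.srt")
```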

Sharing Adobe Transcripts with Transcriptive Rough Cutter and Vice Versa

I've already written a blog post specifically about how to use Adobe transcripts with Transcriptive. But, in short… you can use Adobe transcripts in Transcriptive by exporting the transcript as plain text and using Transcriptive's Alignment feature to sync the text up to the clip or sequence. Every word will have timecode just as if you'd transcribed it in Transcriptive. (This is a free feature.)

AND… If you get your transcript in Transcriptive Rough Cutter, it’s easy to import it into the Adobe Caption system… just Export a caption file format Premiere supports out of Transcriptive RC and import it into Premiere. As mentioned, you can Export SRT, VTT, MCC, SCC, SMPTE, and STL.

Two A.I. Services

Transcriptive Rough Cutter gives you two A.I. services to choose from, allowing you to use whatever works best for your audio. Transcriptive is also usually more accurate than Adobe's service, especially on poor-quality audio. That said, the Adobe A.I. is good as well, but on a long transcript, even a percentage point or two of accuracy adds up to a significant amount of time saved cleaning up the transcript.

Why we charge upgrade fees

Most of the updates we release are free for users who have purchased the most recent version of the plugin. However, because we are not subscription-based (we still do that old-fashioned perpetual license thing), if you don't own the latest version of the plugin… you have to upgrade to it.

It requires a TON of work to keep software working with all the changes Apple, Adobe, Nvidia, and everyone else keep making. Most of this work we do for free because the changes are small and incremental. Every time you see Beauty Box v4.0.1 or 4.0.7 or 4.2.4 (the current one)… you can assume a lot of work went into that, and you don't have to pay anything. However, eventually the changes add up, or Apple (most of the time it's Apple) does some crazy thing that means we need to rewrite large portions of the plugin. In either case, we rev the version number (e.g., 4.x to 5.0) and an upgrade is required.

We do not go back and ‘fix’ older versions of the software. We only update the most recent one. Such is the downside of Perpetual licenses. You can use that license forever, but if your host app or OS changes and that change breaks the version of the plugin you have… you need to upgrade to get a fix.

If one of your clients comes to you with a video you did for them in HD and says, 'Hey, I need this in 4K,' would you redo the video for free? Probably not. They have a perpetual license for the HD version. It doesn't entitle them to new versions of the video forever.

We want to support our customers. The reason we develop this stuff is because it’s awesome to see the cool things you all do with what we throw out there. If we didn’t have to do any work to maintain the software, we wouldn’t charge upgrade fees. Unfortunately, it is a lot of work. We want to support you, but if we go out of business, that’s probably not going to benefit either of us.

Apple may say it only takes two hours to recompile for Silicon and that may be true. But to go from that to a stable plugin that can be used in a professional environment and support different host apps and graphics cards and all that… it’s more like two months or more.

So, that's why we charge upgrade fees. You're paying for all the coding, design, and testing that goes into creating a professional product that you can rely on. Not to mention the San Francisco-based support team to help you out with all of it. We're here to help you be successful. The flipside is we need to do what's necessary to make sure we're successful ourselves.

Why we charge crossgrade fees

It’s a lot of work supporting different host apps. Every company has a different API (application programming interface) and they usually work very differently from each other. So development takes a lot of time, as does testing, as does making sure our support staff knows each host app well enough to troubleshoot and help you with any problems.

Our goal with all our software is to provide a product that 1) does what it claims to do as well as or better than anything else available, 2) is reasonably bug-free, and 3) is completely supported if you call in with a problem (yes, you can still call us and, no, you won't be routed to an Indian call center). All of that is expensive. But we pride ourselves on great products with great support at a reasonable cost. By having crossgrades we can do all of the above, since you're not paying for things you don't need.

If you create a video for a client in HD and then they tell you they want the video in a vertical format for mobile, do you do it for free? Probably not. While clients might think you just need to re-render it, you know that between making the video compelling in the new format, making sure all the text is readable, and countless other small things… it requires a fair amount of work.

That’s the way it is with developing for multiple APIs. So the crossgrade fee covers those costs. And since all of our plugins are perpetual licenses, you don’t have to pay a subscription fee forever to keep using our products.

If we didn't charge crossgrade fees, we'd have to include the cost of developing for all applications in the initial price of the plugin (which is what some companies do). This way you only pay for what you need. Most customers only use one host application, so this results in a lower initial cost. Only users who require multiple hosts have to pay for them.

And we don't actually charge per application. For example, After Effects and Premiere use the same API, so if you buy one of our plugins for Adobe, it works in both.

The crossgrades come as a surprise to some customers, but there really are good reasons for them. I wanted you all to understand what they are and how much work goes into our products.

Why Doesn’t FCP X Support Image Sequences for Time Lapse (among other reasons)

In the process of putting together a number of tutorials on time lapse (particularly stabilizing it), I discovered that FCP X does not import image sequences. If you import 1500 images whose names are sequentially numbered, it imports them as 1500 separate images. This is a pretty huge fail on the part of FCP. Since it is a video application, I would expect it to do what every other video application does and recognize the image sequence as VIDEO. Even PHOTOSHOP is smart enough to let you import a series of images as an image sequence and treat it as a video file. (And, no, you should not be using the caveman-like video tools in Photoshop for much of anything, but I'm just sayin' it imports it correctly.)

There are ways to get around this. Mainly, use some other app or QuickTime to turn the image sequence into a video file. I recommend shooting RAW when shooting time lapse, so this means you have to pull the RAW sequence into one of the Adobe apps anyway (Lightroom, After Effects, Premiere) for color correction. It would be much nicer if FCP just handled it correctly without having to jump through the Adobe apps. Once you're in the Adobe system, you might as well stay there, IMO.

No, I'm not an FCP X hater. I just like my apps to work the way they should… just as I tore into Premiere and praised FCP for its .f4v (Flash video) support in this blog post.

Time Lapse image sequence in Final Cut Pro failing to load as a single video file

 

What’s wrong with this picture?

Nvidia GeForce GTX 570 in a Macintosh

All the speed tests we’ve done with Beauty Box on Windows show the Nvidia GeForce video cards to outpace their much more expensive cousins, the Quadros, significantly. A GTX 570 (~$270) is about 25-30% faster than a Quadro 4000 ($800).

Since Beauty Box can involve some render time, we’ve wished that Apple would authorize one of the newer GeForce cards for the Mac. No such luck. So we’re tired of waiting. We took a stock PNY GeForce 570 and put it into our MacPro. And lo! It works!

So… what’d we do and what are the caveats? This was not a 570 with ‘flashed’ ROM. This was just a straight up 570 which we use in one of our PC machines. Nothing fancy. We did need to download a few things:

– Latest Nvidia driver for the Mac, which can be found here: http://www.nvidia.com/object/macosx-304.00.05f02-driver.html

– Latest CUDA drivers for the Mac, which can be found here: http://www.nvidia.com/object/mac-driver-archive.html (as of this writing, v5.0.37 was the latest)

– If you're using Premiere, you need to update the cuda_supported_cards.txt file to add the name of the video card. In this case it would be: 'GeForce GTX 570'. To do this, go to the Premiere.app file, right-click on it, and select 'Show Package Contents'. Once you do that, this is what you'll see:

Once that's done, you are good to go!
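If you'd rather script that last step than edit the file by hand, something like this does it (a sketch; the path assumes a default Premiere Pro CS6 install, so adjust it for your version, and you'll likely need to run it with admin rights):

```python
# Sketch: add the card name to Premiere's cuda_supported_cards.txt so Premiere
# will offer GPU acceleration for it. The path below is an assumption based on a
# default Premiere Pro CS6 install; adjust it for your version. Needs admin rights.

path = ("/Applications/Adobe Premiere Pro CS6/"
        "Adobe Premiere Pro CS6.app/Contents/cuda_supported_cards.txt")
card = "GeForce GTX 570"

with open(path, "r", encoding="utf-8") as f:
    cards = [line.strip() for line in f if line.strip()]

if card not in cards:
    cards.append(card)
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(cards) + "\n")
```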

Now the caveats…

Continue reading Nvidia GeForce GTX 570 in a Macintosh

Evil Geeks vs. Evil Marketers

I've always said that I'd prefer to have an Evil Geek (Bill Gates) rule the world instead of an Evil Marketing Guy (Steve Jobs). Sort of like the difference between having the nerds or the cool kids run your high school. And sure enough, now that Steve has a dominant platform, he's running it like the cool kids would.

I mean, seriously. Geek evil is sort of like 'Pinky and the Brain' evil. Yeah, they might take over the world, but that's what they plan every night. And even if they succeed, all they'll end up doing is having chair-jumping contests and all-night Star Trek marathons (how else do you explain much of Microsoft's software?).

Marketers, like Steve, are different.

Continue reading Evil Geeks vs. Evil Marketers

Photowalkin’ (and camera lenses)

Yesterday, I joined Photoshop product manager Bryan O'Neil-Hughes for his Photowalk. This was part of an effort by NAPP to get folks out taking pictures; there were photowalks all over the country because of it.

It's a pretty cool idea and was great fun. Adobe rockstar Julieanne Kost joined us along with a few other Adobe folks. The walk itself was fairly short and mostly covered a few blocks around the Adobe campus in San Jose. You'd be surprised at how long it takes 50 photographers to go a few blocks. In any event, this led to many photos of the Adobe building (there also seemed to be a good number of photographers taking pictures of photographers).

Adobe through the leaves

When you go on walks like this, it’s interesting what your choice of lens does to your photos.

Continue reading Photowalkin’ (and camera lenses)

The Demise of Adobe

… has been rather exaggerated. Ok, way over-exaggerated.

Layoffs happen at big companies. When things are great you tend to hire based on great expectations. It's better to have too much capacity and grow into it than to be overwhelmed. The flip side is that when things slow down, you need to trim back, and unfortunately, that means layoffs. An 8% reduction in workforce really isn't something that should be seen as that concerning. At least from an end user's perspective… for the folks getting laid off… yeah, it sucks. Although Adobe has been known to give nice severance packages.

Adobe laid off 150 people in 2001, and Macromedia laid off 170, which was 10% of its staff at the time (partially because of a merger, but if things had been booming I don't think the number would have been nearly as high). So layoffs are hardly unprecedented. If Adobe and Macromedia survived the dot-com implosion, I'm sure they'll do OK this time around.

The other factor in all this is that it's incredibly difficult to get loans or other financing right now. You would think (and this is a WHOLE other rant) that with the banks getting all this taxpayer money, they'd be back in business making loans. But no. Things are tighter now than they were six months ago.

So… companies like Adobe really need to conserve the cash they have on hand. They don't have as much flexibility to 'wait and see'.

This was, at least from Adobe’s perspective, a smart and necessary thing to do. Digital Anarchy is dependent on Adobe products, and I’m not reading anything into this other than just the normal reaction to the reduced expectations that happen in a recession (We’ve been in one for about 9-12 months at this point).

For Digital Anarchy, we’re proceeding much like Adobe (minus the layoffs… we don’t have enough people as it is :-), cutting the costs we can and continuing to release products. We’ve got four products on schedule to be released over the next 3-4 months. With any recession you can’t stop investing in new products, but you do need to watch your costs very carefully. That’s all Adobe is doing.

cheers, Jim

—————————-
Jim Tierney
www.digitalanarchy.com
Digital Anarchy
Filters for Photography & Photoshop
f/x tools for revolutionaries
—————————-

Adobe CS4 Launch Event

Went to the filming of the Adobe launch event on Monday, which was interesting. I'm not exactly sure who it was aimed at or what the purpose of it was, but I can't say I was overly impressed by it. The products are cool enough, with some great new features, but the event was trying too hard to be Oprah or something and just didn't work. It would've been better if they'd filmed the hipster designers talking about some cool project they'd used CS4 on and shown the clips, instead of having said hipster designers come on stage and fumble through a product demo. Ben Grossman from the Syndicate did a good job, but he didn't talk about his stuff, just the standard Adobe demo material. I would've been much more impressed by a 3-5 minute clip of him showing where CS4 was used in the Radiohead video.

Then again, I’m just a jaded and cranky plugin developer. Maybe it worked for everyone else. ;-)

Continue reading Adobe CS4 Launch Event

Party on @ Adobe Creative CS4 launch

Yesterday I went with Jim Tierney, our company president, to the Adobe CS4 launch event. It was at Adobe's headquarters in San Francisco, which is also where Digital Anarchy is based, and perhaps 150 folks were there: market leaders like authors and studio heads and — ahem — software folks like us. The slogan of Adobe Creative Suite CS4 is 'Shortcut to Brilliant', and the theme of this CS4 event was the three categories of improvements that CS4 brings: time-savers, integration, and innovation.

The presentations were done really well. All of the presenters were polished and practiced, but they seemed to ad-lib just enough to make their words feel real. After two well-chosen talking heads, 'real' users like designers and editors came onstage to show off what they'd done in a week with their new CS4 tools. As the application and media changed, so did the lighting: for the Photoshop presentation, the lights were blue; for Illustrator and web/interactive design, everything was red. I liked the mood that was set, and the enthusiasm was high but not artificially so.

An interesting tidbit from the keynote speaker was that there are over a billion consumers worldwide who will never use a computer to connect to the internet. Their online connection is a mobile device. Makes sense when you count emerging but still largely rural markets like Africa and India, but really, I hadn't thought about how digital habits differ around the world. The speaker's point was that this is why the flexibility of the final graphical product is so important now.

On to the Adobe software…

For the Photoshop CS4 presentation, which is the app Digital Anarchy now focuses on, the discussion was mainly tool-driven. There is a 3D panoramic stitcher that looked pretty cool, though I must admit I haven't yet explored CS3's stitching features. Adobe has also added content-aware scaling, which decides on and eliminates unimportant details for smarter scaling.

I was more impressed by the overall integration (yep, one of the three featured topics of the event) within the CS4 suite. Really, it seemed to me that many of the strong features of certain apps have been propagated over to other apps, and often that cross-pollination involves former Macromedia functionality.

For instance, Illustrator now has a Blob brush that lets you draw and edit vector chunks in exactly the same way that Flash always has. Illustrator also FINALLY has the multi-page capabilities that FreeHand had over a decade ago. And Flash's new timeline and inverse kinematics animation reminds me a lot of After Effects functionality.

Fireworks was also pretty impressive. I remember hating that app years ago when I taught web design because it felt very isolated from any true workflow. Now Fireworks can baton-twirl in utter sophistication with Photoshop, and it even saves out interactive PDFs.

Well, that’s my round-up for now. I can’t wait to sink my fingers into Photoshop CS4 this week. The event presenters were lauding Adobe.tv as the place to go for free training and I intend to check out that site.

regards -Debbie

Today’s Blog Brought To You By The letter “A”

Random thought of the day as we get ready to release our first product for Avid

I find it odd that the four major companies in our industry all start with an 'A': Adobe, Apple, Autodesk, and Avid. It makes me miss the Discreet name even more. I still think it was an idiotic move to kill the Discreet brand… one of the best brands the industry has ever had, and they punted it. Dumb. "Autodesk Media and Entertainment" just rolls off the tongue like a dead moose and evokes the image of legions of corporate AutoCAD drones creating PowerPoint presentations that get turned into YouTube videos. Mmm… exciting.

Anyways… moving along before I get kicked out of Autodesk's developer program…

Actually that’s enough random thoughts for one day.

cheers, Jim

On The Subject of NAB (and Avid)

Now that Avid has pulled out of NAB and won't be exhibiting in 2008, there have been a lot of users and other folks wondering what it means and what the industry thinks of it. The immediate reaction of the entire industry was to exclaim, "No shit?" and, 2.3 seconds later, after the full import of what that meant hit them, to call their NAB sales rep and promise all manner of favors if they could move their booth to front and center of the show floor.

Since I'm hardly above such things ("I was young and poor and needed the booth space"), I joined in, attempting to move our Plugin Pavilion into the now vacant space of the Avid Developer Community booth. I even had the person from Avid who managed the ADC call NAB on our behalf. All that got me was a terse email from our NAB rep saying we would definitely NOT be getting it. It's the new sport in HD: groveling for Avid's booth space. Look for it on the LVCC cafeteria monitors (instead of the usual strip club ads).

Continue reading On The Subject of NAB (and Avid)