Vista Audio for Musicians? Review
The new Windows Vista operating system from Microsoft is basically upon us now. So I've spent a fair number of hours trawling through a small mass of Microsoft techno-babble (around 2 years' worth FFS!!) to get a grasp of what goodness - if any - there is in store for us computer-musicians... 'cos let's face it, sooner or later, most of us will be confirmed "Vistarians"!
So I suppose the question is: Is it worth upgrading to Vista, sooner, or later?
And that would be a good question since little has seemingly changed in the Windows audio realm since Windows 98... And even that has some of the DNA from the Windows 3.1 audio engine in it!
But let me put you out of your misery with the short answer before we dive in: Yes! The Windows Vista audio engine has (finally) been completely rewritten from the ground up!
What's more, it's been built with an eye on the pro music community too! In fact, one of Microsoft's engineers - Steve Ball - is a keen muso himself, which provides at least some reassurance that our side is being represented.
A lot of real tangible advancements have been made and there's much to talk about: My problem is trying to explain it all without boring you!
You see, being the discerning musicians that we are, naturally, we have a higher technical understanding of audio sound than your usual punter. Yet a fairly significant proportion of what I've had to digest has been code-speak, which would definitely bore you! However, within that code-speak is a bunch of concepts that highlight significant changes worth knowing.
So this is my attempt at stripping out - and deciphering - the relevant from the irrelevant parts, and packaging what's applicable to the music-maker into digestible nuggets that will hopefully satisfy our need to understand what, exactly, we are investing our time and money into.
I hope to give you a glimpse below the bonnet so you come away knowing what to look for in your future software and hardware installations to get the best from them (and Vista).
So, whilst I'll cover the headline features we'll be bombarded with by the marketing depts over the coming years, they are features geared more for us as listeners, and we want the music-making story behind Vista... So that is what we've got (hopefully!).
I hope it proves insightful.
First, the bad news...
MIDI in Vista
Few, if any, changes have been made to MIDI within Vista, according to Larry Osterman, a key Vista developer working on the audio engine.
"We're not doing a huge amount with MIDI in Vista"
Larry Osterman - Microsoft
But then again, I'm thinking the improvements that come with the Vista audio engine will no doubt be inherited by its MIDI implementation as a side-effect anyway... after all, what more could really be done with MIDI? It's not a sound and the General MIDI (GM) standard hasn't changed its spec (as far as I know anyway!).
To get a better understanding of the way Vista handles audio, we need to get under the bonnet and look at the new architecture of the audio engine, albeit a simplified view that applies to us.
A Stricter Audio Engine
Firstly, the hardware and software we'll be using will have to comply with a stricter set of rules. This is to ensure that it can perform to the higher level of required quality, and that the multiple new audio features in Vista work consistently across-the-board; whether that be installation, user experience or the actual software/hardware operation.
Vista Logo Requirement
We'll chip away at this as we go but essentially: Hardware and software manufacturers have to meet a specified standard, and pass certain operational tests (including audio fidelity tests) to qualify for the Vista Logo ("Made for Vista").
For audio, there are two levels of "qualification" that hardware/software manufacturers can choose to meet: Basic and Professional.
From the point of view of someone buying Vista, it's not clear what the differences between the two logos are exactly, but I suspect it's related to the differences described in the User, and Exclusive Modes, below.
I daresay they will probably mean more to us, as musicians, as well as the likes of serious gamers etc... than the average Joe-user whose needs are seemingly more than adequately catered for under the "basic" Vista Logo requirement.
At minimum, hardware (or software) that meets the logo requirement should mean that it will work with Vista's new set of pre-packaged audio features, out-of-the-box, and will gracefully upgrade with Vista as it matures (Yes, several mentions of Microsoft adding new features and richness over time!).
For instance, one of the new features of Vista is that it can discover exactly what is connected to it... including speakers and headphones! But the hardware that facilitates this must first tell Vista about itself; such as, how many speaker output jacks it has; what size and type of jacks they are etc... In turn, Vista can present this information to us in the new audio control panel so we can see at-a-glance.
For reference: At the time of writing, the Vista Logo Requirement to look for is 3... or:
WLP 3.0 compliance
More, as we read...
It's only a logo! The benefits are only skin-deep, surely?
Well, I wouldn't blame you for thinking that, being a cynical consumer myself, but read on... the benefits of meeting the Vista Logo Requirement trickle right through the audio system as far as I can see... right through to the chip in fact (See: High Definition Audio, below).
Robert Scoble (who I believe was a Microsoft employee at the time), spoke of a third-party vendor that had complained of the Logo Requirement being too "onerous" to comply with, in one of the videos I watched.
Now whether this proves to be a barrier for innovative software-makers in the future remains to be seen, but if you thought that the "Made for Vista" badge on the side of a box was just marketing hype, well... maybe it isn't after all!
Universal Audio Architecture (UAA)
The Universal Audio Architecture (UAA) represents the wrapper, if you will, for the inner components that perform various functions on the audio streams.
Microsoft wants the audio computing industry to adopt a new standardised way of talking, and listening to Windows. This is mapped-out by the UAA.
By complying with the UAA spec, hardware and software can basically enjoy full access to all the new goodness we will talk about.
As a result, because each side understands each other - from-the-off - it means that when you install new hardware, it will (or should), just work out of the box... without additional driver downloads and hassle. (Although "premium" software may still require additional drivers in order to experience their added-value.)
One of the (side) benefits for us - as users - of having this higher specification is that hardware manufacturers will probably have to make better quality products, if they want to play in the Windows Vista playground anyway.
High Definition Audio (HDA)
OK, now we're starting to get to the interesting meat.
High Definition Audio technology is an Intel® specification (in collaboration with Microsoft and others in the hardware industry) and is an important component of the UAA. For the serious audio tech-heads; this is an evolution on the AC97 chip technology you may have come across (or heard of).
On the surface, HDA technology gives Vista native multi-channel audio playback capability (surround-sound with up to eight channels at 192 kHz/32-bit quality) out-of-the-box. Of particular interest to us is that it also provides "dedicated system bandwidth for critical audio functions" (See "Glitch-Resilience", below) - In a nutshell, no more audio drop-outs etc...!
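To put those headline numbers into perspective, here's a rough back-of-an-envelope sum, sketched in Python. It assumes plain uncompressed PCM audio, and the function name `pcm_bit_rate` is my own invention for illustration; it doesn't come from Intel's or Microsoft's specs.

```python
def pcm_bit_rate(channels, sample_rate_hz, bits_per_sample):
    """Raw data rate of uncompressed PCM audio, in bits per second."""
    return channels * sample_rate_hz * bits_per_sample


# The HDA headline figure: eight channels at 192 kHz / 32-bit.
hda_max = pcm_bit_rate(channels=8, sample_rate_hz=192_000, bits_per_sample=32)

# Ordinary CD audio for comparison: stereo at 44.1 kHz / 16-bit.
cd_audio = pcm_bit_rate(channels=2, sample_rate_hz=44_100, bits_per_sample=16)

print(hda_max / 1_000_000)  # megabits per second for the HDA case
print(hda_max / cd_audio)   # how many CD-quality streams that equals
```

By that sum, the top-end HDA stream works out at roughly 49 megabits per second, or about 35 times the data rate of a CD, which gives some feel for why dedicated bandwidth matters.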
Another headline feature of HDA is it enables the multi-streaming of two separate audio sources to different devices (CD to kitchen; movie to living room, for example). I can see how this might be useful in studios or where you have your singer recording in the bathroom, for instance.
Getting to Know You
Plugged your microphone into the speaker jack?
'Don't worry, I recognise that you're occasionally stupid - I'll change the jack to function as a microphone jack for you instead!'
Below the bonnet, one of the key features of HDA is that compliant hardware can now specifically tell Windows how it is set up. We touched on this earlier when we talked about new hardware telling Vista how many output jacks it has etc...
I'm thinking (and hoping) that it opens the way for the new MIDI controller you plug-in to USB, to correctly "auto-map" itself to whatever audio software, soft-synth etc... you choose to use, without the hassle that still exists out there today.
The Death of the Soundcard?
All this HDA goodness, for most people, is going to mean no more separate soundcard - 'Cos it's all there basically! And whilst we, as musicians with particular eccentric needs, may not be able to benefit from this luxury, it does mean our music should sound better to our groupies!
So I don't think it will be the death of soundcards exactly, but it will probably mean they will become more specialised if they are to survive in the Joe-user consumer market (Gaming perhaps?).
Updates and Windows Downloads
If a third-party manufacturer does not work with any of these technologies, it is highly likely that you will have to download additional software and drivers to get going. Plus, you're more than likely not going to receive updates (for it) via the automatic downloads from Windows going forward.
So to wrap this bit up then: The 3 levels we have dipped into so far (there are other internal components, I'm just covering the important good stuff), serve - amongst other things - to ensure there is a clear, understandable pipeline that successfully connects the hardware/software, through the Vista operating system, and right down to the chip. By getting this far, Vista can now say, 'Right... now we know each other, I can provide these services to you, and for you, and also present the results to my end-user.'... which is you (as in you!)!
Music Software and Windows Vista Audio
So let's move on and look a bit deeper at what comes with Vista, and how it will work with future music apps.
I'm going to start off by briefly explaining the two ways (in simple terms) that music applications can work within the new Vista audio engine.
The signal path is radically different in Vista and this impacts the way our software and hardware will perform. By understanding a little of the how and why, I'm hoping it will help us to make more intelligent choices about the music software and hardware we choose to run on it.
There are basically two "environments" within Vista in which developers of music software can choose to have their creations operate, and each affords different advantages:
These are User Mode and Exclusive Mode
Windows Vista comes "pre-packaged" with a rich audio feature-set, as standard (listed below).
Most audio applications will probably take advantage of working with this "ready-made" feature-set because, from a developer's point-of-view, not only is it easier to understand and interact with (in code), it also offers most, if not all (plus more), of the functionality that their applications will ever need to call on.
Another key benefit is that software-makers can get their product out to-market quicker (Read: Cheaper).
This is the User Mode realm.
In essence, working within the "Vista-managed" User Mode realm, will free software-makers from the worries of where, and how, audio is handled once the software passes the audio stream over to Vista for processing. Its audio engine can look after the necessary housekeeping tasks like sample conversion (if needed), effects, volume etc... on its behalf.
As mentioned, Windows Vista comes pre-packaged with several new features as standard (see below). Some of these "basic" new features, Microsoft claims, are features usually found only on high-end audio systems and are executed with equal, if not superior quality to those systems too! They are available thanks to Intel's® High Definition Audio (HDA) technology (mentioned above).
We should look at some of these features - Even though they are more "listening features" than musician features, they are still important to us.
Vista's Audio Feature-Set:
- Loudness Equalization DSP: When changing from one sound source to another, maintains equal loudness - No heart-attacks!
- Forward Bass Management: Typically, where a multi-channel (3 or more) speaker system employs a sub-woofer, the bass from all channels is "directed" to play through sub-woofer. Crossover freq is adjustable.
- Reverse Bass Management: Multi-channel system (3 or more). No sub-woofer. At least 2 main (full freq range) speakers. Bass is "mapped" to play thru' the main 2 speakers (typically front left & right).
- Low Frequency Protection
- Speaker Fill (SF): You have a 5.1 multi-channel audio setup. SF will recreate that speaker experience from a 2-channel stereo audio source.
- Room Correction: Place a Mic where you sit and Vista calibrates your speaker levels for max "embrace", taking account of the room's freq response, sound reflections, delays etc... Claims to be better than typical high-end systems. Useful for "weird" room/seating layouts.
- Virtual Surround (VS): Where multi-channel (3 or more) audio source is fired thru a 2-channel stereo soundcard, to external multi-channel receiver with surround-sound enhancement. VS encodes signals so they can be "squeezed" thru soundcard, then decoded back to multi-channel format by receiver.
- Channel Phantoming (CP): With 3-channel speaker system (Left, Right, Sub-woofer), CP makes "best use of the speakers that you do have", presumably to give the illusion of more channels, if the audio source is 5.1, for example.
- Virtualized Surround Sound over Headphones: Using a technology known as Head-Related Transfer Functions (HRTF), takes the "physics of your head" and creates an "outside of the head" experience of sounds moving front-to-rear, left, right and near and far. So it's like you're not wearing headphones - This is high-end stuff!
- Bass Boost
- Noise Suppression
- Automatic Gain Control
- Voice Activity Detection
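To make one of those list entries a bit more concrete, here's a minimal sketch of the Forward Bass Management idea in Python. I don't know what filters Vista actually uses, so this assumes a very simple one-pole low-pass filter as the crossover, and the function names (`one_pole_lowpass`, `forward_bass_management`) are made up purely for illustration.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Very basic low-pass filter (assumed crossover; Vista's real filter is unknown to me)."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out, prev = [], 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def forward_bass_management(channels, crossover_hz, sample_rate_hz):
    """Strip the bass from every main channel and send the sum of it to the sub-woofer."""
    lows = [one_pole_lowpass(ch, crossover_hz, sample_rate_hz) for ch in channels]
    # Each main speaker keeps whatever is left once its bass has been removed.
    mains = [[x - low for x, low in zip(ch, low_ch)] for ch, low_ch in zip(channels, lows)]
    # The sub-woofer feed is the bass from all channels added together.
    sub = [sum(frame) for frame in zip(*lows)]
    return mains, sub
```

Raising or lowering `crossover_hz` is the sketch's equivalent of the adjustable crossover frequency mentioned in the list. Reverse Bass Management would be the mirror image: the bass gets added back into the two main speakers instead of a sub.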
In some ways, you could argue that what Vista gives (to musicians), it also taketh away; because all of the above, essentially reshapes, in some way, what we artistically intended in the final recording. No doubt as Vista percolates onto the world's desktops, the user-experience - and expectation - of how audio sounds are rendered, may well change with these new technologies.
It's going to be interesting to see how this will translate into how we write music, as well as the functionality of music software and the way sound-engineering practices may change in the future.
... We may well be writing and mixing songs in 7.1 multi-channel surround-sound, as standard, in 5 yrs time. Lots to learn!
So anyway, those are just some of the default "basic" features included in Vista (I'll cover some more in a minute) but I did say there were two ways third-party developers could build their software to work with the audio engine: Vista provides a route for them to build apps that have full control of audio streams starting from the input, right through to the output (I/O) points.
This is important for us as it provides an extremely fast route for signals to reach the hardware (soundcard for example) which should mean we will be worrying less about the latency problem!
User and Exclusive Mode Differences
Simply put: User Mode is when the audio app is working within Vista's "basic" audio infrastructure. Because Vista provides built-in services as part of that infrastructure - like the pre-packaged effects - developers don't need to code those features themselves... It's a more managed environment as Vista takes care of most of the heavy-lifting for them.
Exclusive Mode, conversely, hands over almost complete control to the developer's applications exclusively, removing the layers of User Mode code that might otherwise "get in the way" of specialised custom software. It also makes for an extremely fast signal path to the output device as a result.
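For the code-curious, the difference between the two modes can be sketched as two signal paths. This is a toy model in Python and not the real Vista API: the function names `shared_path` and `exclusive_path` are my own, and I'm assuming the managed path does nothing more than mix every app's stream and apply a volume.

```python
def shared_path(app_streams, volume):
    """User Mode (assumed): the engine mixes every app's stream, then applies volume."""
    mixed = [sum(frame) for frame in zip(*app_streams)]
    return [sample * volume for sample in mixed]


def exclusive_path(app_stream):
    """Exclusive Mode (assumed): one app's samples reach the hardware buffer untouched."""
    return list(app_stream)


sequencer = [0.2, 0.4, -0.3]
messenger = [0.1, 0.1, 0.1]

managed = shared_path([sequencer, messenger], volume=0.5)  # mixed and scaled by the engine
direct = exclusive_path(sequencer)                         # exactly what the app produced
```

The point of the sketch is simply that in the first path something sits between the app and the output, and in the second path nothing does.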
I don't believe it's important to know what User Mode and Exclusive Mode are - I use the terms in case they prove useful in the future for understanding the differences, under advanced scenarios... Hey, ya never know!
Why's this important?
Well, up until Vista, in order for audio apps to achieve a similar "closeness" to the input and output points in XP, software-makers like Steinberg and Cakewalk, for example, had to write their own audio engines that sat completely outside of XP's own audio engine.
This means additional software and drivers... A payload we're all, no doubt, only too familiar with!
These drivers interact directly within the sacred walls of the... Windows kernel! And that's deep man!
Any cock-ups or conflicts occurring inside the kernel inevitably result in the dreaded BSOD (Blue Screen of Death). This is currently the leading reason for Windows instability. And Microsoft want an end to that risk.
So wonch'ya please make some noise for... Exclusive Mode!
So now, to achieve the same goal - but without the "work-arounds" just mentioned - Vista provides an Exclusive (Mode) "play-area" in which audio apps can choose to live: A dedicated space for music software to operate within the unrestricted audio slipstream flowing to the hardware.
By providing near direct access to the hardware buffer, the need to write those potentially troublesome drivers is basically removed (in theory anyway - there may be caveats where software has custom features which call for them. Also, as far as I can determine, they may be needed if software includes some XP features during the XP-to-Vista transitional phase.)
Additionally, and importantly, operating in Exclusive Mode guarantees that Vista won't put anything in-between the music software, and the hardware output buffer (such as the built-in effects and housekeeping processes provided in User Mode).
This provides a space for applications to provide their own custom audio processing functions, in addition to enjoying extremely fast - thus low latency - audio throughput.
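Why does a shorter path mean lower latency? A big part of it is buffer size: the fewer samples that have to queue up before the hardware plays them, the shorter the wait. Here's the sum as a small Python sketch. The buffer sizes are just typical-looking numbers I've picked for illustration, not figures from Microsoft, and `buffer_latency_ms` is a made-up name.

```python
def buffer_latency_ms(buffer_frames, sample_rate_hz):
    """Time it takes to play out one buffer of audio, in milliseconds."""
    return 1000.0 * buffer_frames / sample_rate_hz


# Illustrative buffer sizes only (assumed, not quoted from any Vista document).
for frames in (64, 256, 1024):
    print(frames, "frames ->", round(buffer_latency_ms(frames, 44_100), 1), "ms")
```

At 44.1 kHz, a 1024-frame buffer adds around 23 ms on its own, whereas 64 frames adds about 1.5 ms, which is the kind of difference you can feel when playing a soft-synth live.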
This sounds like great news and is where I really wanted to lead you during this journey.
But with Exclusive Mode, comes a cost... Software-makers don't have the luxuries afforded in User Mode meaning they must provide all the custom features themselves, from scratch! Also, from what I understand, there will only likely be a very few select number of 3rd-party vendors able, and/or willing, to exploit Exclusive Mode because of the complexity of code required. So time will tell if we'll ever see more innovative applications coming from different players than the ones we know today, namely; Steinberg, Cakewalk, Pro Tools et al... I personally think we will as it still sounds as though a lot of barriers have been removed, compared to the pre-Vista days.
If all that was hard to digest, I apologise. However, it constitutes a significant development for our world and is worth knowing if we are to better understand what software to buy in the future.
Phew! It gets a bit easier from this point forward, honestly (for me anyway)!
So what else have we got?
Glitch Resilience
With Vista, you should no longer need to worry about background tasks sucking the life away from audio performance, such as anti-virus programs which insist on policing your computer like a badly-timed drug-bust.
Symptoms of this phenomenon are no doubt familiar to us all: Stuttering sound, that annoying "looping" effect, reminiscent of the stuck needle on turntables, or, in extreme circumstances; the total loss of audio completely!
Vista, more or less, guarantees total priority to audio-throughput, even when opening other applications!
Glitch resilience should make recording a much more assured experience now... even with other apps open.
For the curious, there is some useful further information available on Glitch Resilience and other deep technical utterances, including a stress test demonstrating glitch-free audio performance on a computer subjected to 100% CPU load.
End-Point Discovery
"End-Point" is another way of basically expressing things like microphones, speakers, headphones etc... in other words, the final destination, which could also be across the internet, remember.
End-point discovery allows Vista to detect what is, and what's not, actually connected to your computer eg... Mic, headphones etc... and it can show you visually on Vista's new audio control panel too, which will prove useful in tracking down exactly where that errant sound is coming from... or why you can't hear it even!
Also, because Vista knows what devices are connected, it won't allow apps to stream audio to anything that isn't.
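As a rough picture of that behaviour, here's a tiny Python model of an end-point list that refuses streams to anything not plugged in. The `EndpointRegistry` class and its methods are entirely hypothetical, invented for this sketch; the real mechanism inside Vista is no doubt far more involved.

```python
class EndpointRegistry:
    """Hypothetical stand-in for Vista's knowledge of what is plugged in."""

    def __init__(self):
        self._connected = set()

    def plug_in(self, endpoint):
        self._connected.add(endpoint)

    def unplug(self, endpoint):
        self._connected.discard(endpoint)

    def connected(self):
        """What the control panel could show at a glance."""
        return sorted(self._connected)

    def open_stream(self, app, endpoint):
        # Refuse to stream to anything that isn't actually connected.
        if endpoint not in self._connected:
            raise LookupError(f"{endpoint!r} is not connected, so {app!r} cannot stream to it")
        return (app, endpoint)


registry = EndpointRegistry()
registry.plug_in("speakers")
registry.plug_in("microphone")
stream = registry.open_stream("Cubase", "speakers")
```

Asking this registry to open a stream to "headphones" before they've been plugged in raises an error straight away, rather than letting the sound disappear into thin air.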
Vista gives you per-application control of volume, plus you can route applications to only output sound to a particular end-point, like a soundcard or USB headphones, for example.
So, if you suddenly get "dinged" by someone on instant messenger, you can have that IM sound routed through to your headphones, whilst Cubase continues to playback through your speakers, uninterrupted! That should be another cool recording safeguard.
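Here's the Cubase-and-messenger scenario as a small Python sketch. It assumes each application has its own volume and a chosen end-point, and that sounds sharing an end-point simply get added together. The `mix_per_endpoint` function and the dictionary layout are my own illustration, not anything lifted from Vista.

```python
def mix_per_endpoint(sessions):
    """Scale each app by its own volume, then mix only the apps sharing an end-point."""
    outputs = {}
    for session in sessions:
        scaled = [s * session["volume"] for s in session["samples"]]
        current = outputs.get(session["endpoint"])
        if current is None:
            outputs[session["endpoint"]] = scaled
        else:
            outputs[session["endpoint"]] = [a + b for a, b in zip(current, scaled)]
    return outputs


sessions = [
    {"app": "Cubase", "endpoint": "speakers", "volume": 1.0, "samples": [0.5, -0.5]},
    {"app": "Messenger", "endpoint": "usb headphones", "volume": 0.5, "samples": [1.0, 1.0]},
]
outputs = mix_per_endpoint(sessions)
```

In the result, the speakers carry only the Cubase samples, and the messenger "ding" turns up (at half volume) on the headphones alone.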
Related to per-application control: System event sounds - We're talking about the sounds that are thrown-out when email arrives, mouse-clicks, IM "dings" etc... at full volume, bang-slap in the middle of your best recording-take!
These sounds now have a combined master volume control so you can turn them down separately from application sounds.
Additionally, the volume of system sounds can be "tied" to other volume controls, such that, if, for example, you set the volume of system events to be 50% quieter than all open apps, it will stay tied at 50% even when you increase or decrease the volume in your applications... in other words; relative volume control.
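The "tied" volume idea boils down to a ratio. A minimal Python sketch, assuming volumes are simple 0.0 to 1.0 slider positions (real volume controls may well work in decibels instead) and using a function name, `tied_system_volume`, that I've made up:

```python
def tied_system_volume(app_volume, ratio):
    """System-sound volume that stays at a fixed ratio of the application volume."""
    return max(0.0, min(1.0, app_volume * ratio))


# System sounds tied at 50% of the application level (assumed linear sliders).
for app_volume in (0.4, 0.8, 1.0):
    print(app_volume, "->", tied_system_volume(app_volume, ratio=0.5))
```

Whatever you do to the application volume, the system sounds follow along at half of it.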
Microphone Array Support
This sounds cool.... lemme explain:
You're recording a meeting, let's say. You set up four mics around the room to form a Mic array. Windows Vista can use those Mics to intelligently filter-out background noise and focus on the speaker (or singer) exclusively!
Vista can determine from those mics, which one is "most important", and use the others to subtract extraneous noise from the final signal. I'm not sure, but I think some outside TV broadcast units use similar technology to filter out traffic noise, for example.
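A very crude way to picture that in code: pick the loudest mic as the "most important" one, treat the average of the others as the background noise, and subtract it. Real microphone arrays use far cleverer processing (timing differences between mics, adaptive filters and so on), so treat this Python sketch as a cartoon of the principle. The function names are hypothetical.

```python
def energy(samples):
    """Sum of squared samples: a simple measure of how loud a signal is."""
    return sum(x * x for x in samples)


def clean_from_array(mics):
    """Pick the loudest mic as primary and subtract the average of the others from it."""
    if len(mics) < 2:
        raise ValueError("a mic array needs at least two microphones")
    primary_index = max(range(len(mics)), key=lambda i: energy(mics[i]))
    others = [m for i, m in enumerate(mics) if i != primary_index]
    # Assumption: the other mics mostly hear the same background noise as the primary.
    noise = [sum(frame) / len(others) for frame in zip(*others)]
    cleaned = [p - n for p, n in zip(mics[primary_index], noise)]
    return primary_index, cleaned


voice = [0.5, -0.5, 0.5, -0.5]
hum = [0.1, 0.1, -0.1, -0.1]
near_singer = [v + h for v, h in zip(voice, hum)]
primary_index, cleaned = clean_from_array([hum, near_singer, hum, hum])
```

In this idealised case, where every mic hears identical hum, the cleaned signal comes back as the voice on its own. A real room is never that tidy, which is where the clever stuff comes in.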
Vista's internal support for microphone arrays provides:
- Improved acoustic echo cancellation
- Stationary noise suppressor
- Automatic gain control
- Wideband quality of sound capturing and processing
One of the ball-aches of recording through a Mic into the PC, is the sound of its fan, or the sudden whirling noise of the hard-drive. Mic arrays might prove to be a promising solution to this.
Robert Scoble - Then with Channel 9 - interviewed Steve Ball and Larry Osterman (on video) from the MS audio team in Sept 2005, during which a Mic array prototype was produced to illustrate how they may look. We could also be seeing them built-in to monitors (See diagram). However, for most of us here in music-land, I suspect we'll see other higher-end solutions come onto the market.
An example of how Mic arrays may look (and work) in PC monitors.
Sound Format Rendering
Up until Vista, if you played a 44kHz audio file, for instance, and a system sound with a sample rate of 22kHz was emitted at the same time, then Windows would render the rest of the audio file at 22kHz. Only when the system became quiet again would Windows reset to the format used by the next audio event.
Vista allows us to specify the format we want and from then-on, all output is rendered according to the settings we make, regardless of the format of the content being played.
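In other words, everything gets converted to one chosen output format before it is mixed. Here's a minimal Python sketch of that idea, assuming a basic linear-interpolation resampler (Vista's own converter is surely more sophisticated) and using made-up function names.

```python
from itertools import zip_longest


def resample_linear(samples, src_rate, dst_rate):
    """Convert a signal to a new sample rate using straight-line interpolation."""
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


def render_fixed_format(sources, output_rate):
    """Resample every (samples, rate) source to output_rate, then mix them together."""
    converted = [resample_linear(samples, rate, output_rate) for samples, rate in sources]
    return [sum(frame) for frame in zip_longest(*converted, fillvalue=0.0)]


music = ([0.1, 0.1, 0.1, 0.1], 44_100)   # the audio file
ding = ([0.0, 1.0], 22_050)              # a lower-rate system sound
mixed = render_fixed_format([music, ding], output_rate=44_100)
```

The music keeps its full sample rate throughout, and the system sound is brought up to match it, instead of the music being dragged down.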
Hmm... Have I left anything out? (Lots probably)
Oh yeah... Sound Recorder in Vista can record for longer than one minute! RARRR!
So there you go! I would describe this as an intermediate-level overview of the new Vista audio engine. I haven't really got into the hard numbers like signal-to-noise ratios and what-not, largely because they were difficult to find. But I will try and add them later at some point... unless you know yourself of course! In which case, drop a comment and let us all know.
Update: Some of the Vista audio specs are beginning to stream in.
Disclaimer: This article is my interpretation of how the Vista audio engine will be implemented in the finished release. I'm no expert on the deep inner-workings of how software audio engines either work, or are built. At the time of writing, I had not encountered Vista, either in its pre-release beta incarnation, or final build. The above information was gleaned from various documents and video interviews carried out and/or written over the course of around 2 or so years. Therefore (here we go!), certain stated features may have been dropped from the final release, so will therefore constitute a complete waste of my time (and yours in reading this!). Further, my understanding of the documents, in part, or in full, may be totally out-of-whack with reality, or, merely so diluted (this bit is your fault, so watch it) that its value is meaningless. This, on account that I attempted to make it muso-friendly, but prolly went too far. Sooo...
Comments, corrections and additions welcome.
If you're a software developer who has arrived here looking for technical guidance on working directly with the Vista audio engine, I've included some useful links below.
Optimal Sound have some useful information on their website for application developers seeking guidance on gaining Vista Logo qualification plus all the necessary steps leading up to it. Optimal's president, David Roach, is a leading (outside) authority on designing audio hardware and software for Windows XP and Vista. He co-authored the book, "High Definition Audio for the Digital Home: Proven Techniques for Getting It Right the First Time" - I've never met, written to nor spoken to the bloke and feel as though I should be on some sort of kickback after that unprompted pimping! Nevertheless, credit where it's due: Optimal Sound's site was a useful point of reference for this article, plus he's in one of the Microsoft interviews I caught so he obviously has the inside-track to the source... which should be reassuring to you.