Discovering Reason is a series of articles created especially for people who have been using Reason for some time, yet can't help but feel they've only scratched the surface. While many of them were written for much older Reason versions, they're more retro or classic than out of date.
Reason's endless possibilities are not always obvious and there's a myriad of nifty tricks hidden in this open-ended production environment. We are creatures of habit, and it's easy to become lazy and get stuck in routines - routines which are often a heritage from other production environments that emphasise quantity and diversity rather than flexibility and experimentalism.
The articles will assume that you have a fair amount of experience with Reason, and will not cover all the details of certain basic operations. Consult the Reason Operation Manual if you stumble upon something unfamiliar.
This tutorial discusses the concepts of control voltages (CVs) and Gates in Propellerhead's software. Of course, they're not really voltages because everything is happening within the software running on your PC or Mac. But the concepts are the same so, before going on to discuss how to use them, let's first take the time to understand where CVs and Gates came from, what they are, and what they do.
We'll begin by considering one of the most fundamental concepts in synthesis: there is no sound that you can define purely in terms of its timbre. Even if it seems to exhibit a consistent tone and volume, there must have been a moment when it began and a moment when it will end. This means that its loudness is contoured in some fashion. Likewise, it's probable that its tone is also evolving in some way. So let's start by considering an unvarying tone generated by an oscillator and make its output audible by playing it through a signal modifier (in this case, an amplifier) and then onwards to a speaker of some sort. We can represent this setup as figure 1.
Figure 1: A simple sound generator
Now imagine that the amplifier in figure 1 is your hi-fi amp, and that the volume knob is turned fully anticlockwise. Clearly, you will hear nothing. Next, imagine taking hold of the knob and rotating it clockwise and then fully anticlockwise again over the course of a few seconds. Obviously, you will hear the sound of the oscillator evolve from silence through a period of loudness and then back to silence again. In other words, your hand has acted as a controller, altering the action of the modifier and therefore changing what you hear even though the audio generated by the source has itself remained unchanged.
Twisting one or more knobs every time that you want to hear a note isn't a sensible way to go about things, so early synthesiser pioneers attempted to devise a method that would allow them to control their sources and modifiers electronically. They found that they could design circuits that responded in desirable ways if voltages were applied at certain points called control inputs. So, for example, an amplifier could be designed in such a way that, when the voltage presented to its control input was 0V, its gain was -∞dB (which you would normally call 'zero', or 'off') and when the voltage was, say, +10V, the amplifier provided its maximum gain. Thus the concepts of control voltages (CVs) and voltage-controlled amplifiers (VCAs) were born. (Figure 2.)
Figure 2: Shaping the output from the amplifier
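To make the idea of a voltage-controlled amplifier a little more concrete, here is a minimal Python sketch. The linear mapping from 0V to +10V and the function names are assumptions made purely for illustration; real VCAs may respond linearly or exponentially, and nothing here describes a specific circuit.

```python
import math

MAX_CV = 10.0  # assumed control voltage at which the amplifier gives maximum gain


def vca_gain(cv_volts):
    """Linear gain factor (0..1) for a control voltage, clamped to the 0..MAX_CV range."""
    return min(max(cv_volts / MAX_CV, 0.0), 1.0)


def gain_db(gain):
    """Express a linear gain factor in decibels; a gain of zero is -infinity dB."""
    return -math.inf if gain == 0 else 20.0 * math.log10(gain)


def vca(sample, cv_volts):
    """Scale one audio sample by the gain that the control voltage selects."""
    return sample * vca_gain(cv_volts)
```

With these assumptions, a CV of 0V silences the signal and a CV of +10V passes it at full level, which is the behaviour that the hand on the volume knob provided earlier.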
The next thing that was needed was a mechanism to determine the magnitude of the control voltage so that notes could be shaped in a consistent and reproducible fashion. Developed in a number of different forms, this device was the contour generator, although many manufacturers have since called it (less accurately) an envelope generator, or “EG”. The most famous of these is called the ADSR; an acronym that stands for Attack/Decay/Sustain/Release. These names represent the four stages of the contour. Three of them - Attack, Decay, and Release - are measures of time, while the Sustain is a voltage level that is maintained for a period determined by... well, we'll come to that in a moment. (Figure 3.)
Figure 3: The ADSR contour generator
If we now connect this device to the VCA in figure 2 it should be obvious that the contour shapes the loudness of the note as time passes. (Figure 4.) But how do we trigger it?
Figure 4: Amplitude control using a triggered contour generator
Let's consider what happens at the moment that you press a key on a typical analogue monosynth. Many such synths generate three control voltages every time that you do so. The first determines the pitch of the sound produced, so we can replace the concept of the oscillator in figure 1 with that of the Voltage Controlled Oscillator (VCO). The second is called a Trigger. This is a short pulse that initiates the actions of devices such as contour generators. The third is called a Gate. Like the Trigger, the Gate's leading edge can tell other circuits that you have pressed a key but, unlike the Trigger, its voltage is generated for the whole time that you keep the key depressed, which means that it can also tell those other circuits when you release the key. (Figure 5.)
Figure 5: Pitch CV, Trigger & Gate signals
If we now return to the contour generator it's clear that the Gate is important because it tells the ADSR how long to hold the Sustain level before entering the Release phase. This means that I can redraw figure 6 to show the ADSR being triggered and how the note's shape is influenced by the duration of the Gate. (Figure 6.)
Figure 6: Controlling the ADSR
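The relationship between the Gate and the four ADSR stages can be sketched in a few lines of Python. This is only an illustration: the straight-line segments, the 0 to 1 output range and the parameter names are my assumptions, and hardware contour generators typically produce curved segments.

```python
def adsr_level(t, gate_time, attack, decay, sustain, release):
    """Contour level (0..1) at t seconds after the Gate opens.

    gate_time is how long the key is held; attack, decay and release are
    times in seconds (all greater than zero) and sustain is a level (0..1).
    """
    def while_held(x):
        # Level reached x seconds into a Gate that is still open.
        if x < attack:
            return x / attack
        if x < attack + decay:
            return 1.0 - (1.0 - sustain) * (x - attack) / decay
        return sustain

    if t < 0:
        return 0.0
    if t < gate_time:
        return while_held(t)
    # The Gate has closed: fall linearly to zero from the level reached so far.
    elapsed = t - gate_time
    if elapsed >= release:
        return 0.0
    return while_held(gate_time) * (1.0 - elapsed / release)
```

Notice that the Sustain stage has no duration of its own in the argument list. It simply lasts until gate_time is reached, which is the point made above about the Gate.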
It should now be obvious why we need timing signals, but why do we need two of them? There are many synthesisers that work with just a pitch CV and a Gate, but consider what happens when you hold one key down continuously (resulting in a continuous Gate) and press other keys, perhaps to create trills or some other musical effects. If there are no subsequent Triggers, the Gate holds the ADSR at the sustain level until you release all the keys so, after the first note, none of the subsequent ones are shaped correctly. This is called "single triggering". However, if a Trigger is generated every time that a note is pressed the contour generator is re-initiated whether or not a previous note is held, and subsequent notes are shaped as intended. ("Multi-triggering".)
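The difference between single triggering and multi-triggering can be illustrated with a small Python sketch. The event representation and the function name are hypothetical; the code only counts when the contour generator would be (re)started.

```python
def contour_starts(key_presses, key_releases, multi_trigger):
    """Return the times at which the contour generator is (re)started.

    key_presses and key_releases are lists of times in seconds.
    """
    events = [(t, 1) for t in key_presses] + [(t, -1) for t in key_releases]
    # Process events in time order; at the same instant, releases come first.
    events.sort(key=lambda event: (event[0], event[1]))
    held = 0      # number of keys currently down (the Gate is high while held > 0)
    starts = []
    for time, change in events:
        if change == 1:
            # The Gate's leading edge always starts the contour. With
            # multi-triggering, every key press also sends a Trigger.
            if held == 0 or multi_trigger:
                starts.append(time)
        held += change
    return starts
```

For example, if one key is held from 0 to 2 seconds while two more notes are played at 0.5 and 1.0 seconds, single triggering starts the contour once, whereas multi-triggering starts it three times.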
Putting it all together
At this point, we have a VCO controlled by a pitch CV, plus an ADSR contour generator whose initiation and duration are controlled by a Trigger and a Gate. In principle, this is all that we need to programme and play a wide range of musically meaningful sounds, but let's now extend the model by adding a second ADSR and a voltage controlled low-pass filter (VC-LPF) to affect the brightness of the sound. Shown in figure 7 (in which the red lines are control signals of one form or another, and the black lines are audio signals) this is all we need for a basic synthesiser.
Figure 7: A basic voltage-controlled synthesiser
Of course, real synths embody a huge number of embellishments to these concepts, but we can ignore these. What's important here is to understand the differences between the three control signals and to be able to recognise one from another. However, this is not always as straightforward as it might seem. Imagine that you replace ADSR 2 in figure 7 with the triangle wave output from a low frequency oscillator. Now, instead of having an articulated note, you have one that exhibits tremolo. In other words, the oscillator's triangle wave is acting as a CV. But what happens if we simultaneously replace the Trigger in figure 7 with the pulse wave output from the same LFO? The LFO is now generating a stream of triggers that regularly reinitialise ADSR 1 to shape the brightness of the sound in a repeating fashion. As figure 8 shows, it's not WHAT generates a signal that determines whether it's a CV, Trigger or Gate, it's WHERE you apply that signal that counts.
Figure 8: An LFO acting simultaneously as a CV generator and a Trigger generator
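To underline that the same LFO can serve as both a CV source and a Trigger source, here is a rough Python sketch. The step rate, the 0 to 1 signal range and the function names are assumptions for illustration only.

```python
def triangle(phase):
    """Triangle wave in the range 0..1 for a phase of 0..1."""
    return 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)


def pulse(phase):
    """Pulse wave: high for the first half of each cycle, low for the second."""
    return 1.0 if phase < 0.5 else 0.0


def run_lfo(rate_hz, duration_s, steps_per_second=100):
    """Return (cv, triggers) produced by one LFO over duration_s seconds."""
    cv = []        # triangle output, suitable for use as a control voltage
    triggers = []  # times at which the pulse output rises, usable as Triggers
    previous = 0.0
    for n in range(int(duration_s * steps_per_second)):
        t = n / steps_per_second
        phase = (t * rate_hz) % 1.0
        cv.append(triangle(phase))
        level = pulse(phase)
        if level > previous:  # a rising edge acts as a Trigger
            triggers.append(t)
        previous = level
    return cv, triggers
```

The two lists come from the same oscillator. What makes one a CV and the other a stream of Triggers is only where you would connect them.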
Of course, oscillators, filters and amplifiers need not be affected only by pitch CVs, contour generators and LFOs. There are myriad other voltages that can be presented to their control inputs, including velocity, aftertouch and joystick CVs, the outputs from S&H generators and voltage processors, envelope followers, pitch-to-CV converters, and many more devices. Discussing these could form the basis of an entire series of tutorials, but at this point we have to take a different direction...
So far, we've only discussed how things affect other things within a single synthesiser, but CVs and Gates also make it possible for separate instruments to talk to one another. Take the simple example of an analogue sequencer driving a monosynth. In this case, the voltages generated by the synth's keyboard are replaced by those generated within the sequencer, which sends a stream of Gates to tell the synth when to play and how to articulate the notes, and a stream of CVs to tell it what pitches those notes should be. If you're lucky, the sequencer may even be capable of producing a second CV (usually called an Auxiliary CV) that can affect other aspects of the sound while the sequence is playing. (Figure 9.)
Figure 9: Connecting a synth and a sequencer
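As a sketch of what a sequencer sends to a synth, the following Python turns a list of steps into Gates and pitch CVs. The 1V/octave scaling is a common convention but is an assumption here, as are the step length, the gate length and all of the names.

```python
VOLTS_PER_SEMITONE = 1.0 / 12.0  # assumed 1V/octave pitch scaling


def sequence_to_signals(steps, step_seconds=0.25, gate_fraction=0.5):
    """Turn a list of steps into timed control events.

    Each step is a number of semitones above the base note, or None for a rest.
    Returns a list of (gate_on, gate_off, pitch_cv) tuples.
    """
    events = []
    for index, semitones in enumerate(steps):
        if semitones is None:
            continue  # a rest: no Gate is sent, so the synth stays silent
        gate_on = index * step_seconds
        gate_off = gate_on + step_seconds * gate_fraction
        events.append((gate_on, gate_off, semitones * VOLTS_PER_SEMITONE))
    return events
```

Calling sequence_to_signals([0, 12, None, 7]) yields three events, the second of which carries a pitch CV one volt (one octave) above the first under the assumed scaling.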
But that's not all because, with suitably equipped instruments capable of generating and receiving various control voltages, the CVs, Triggers and Gates generated by one synthesiser can control what happens on a second (or a third, or a fourth...). Furthermore, CVs generated by any of them can be mixed together (or not) and directed to control the actions of things such as effects units, sequencer controls, panning mixers and much more. (See figure 10.) It's simple, and can be wonderfully creative. The only problem is that, with a fistful of patch cables and a bit of imagination, you'll soon be on your way to creating a multi-coloured mess of electrical spaghetti.
Figure 10: Connecting multiple devices using CVs and Gates
CVs and Gates in Reason
You might be wondering what all of the above has to do with digital environments such as Reason or Record. The answer is that control signals do not have to be analogue voltages; there's no reason why CVs should not be represented digitally as streams of numbers, nor any reason why Triggers and Gates should not be represented as streams of Note On/Off messages. Furthermore, there's no reason why the human interface of a digital system need look like pages of computer code because it should be possible to represent the connections using graphical representations of cables, just as it's possible for softsynths to represent their parameter values using on-screen knobs and sliders.
Let's turn to Reason (or Record) and create instances of, say, Thor and Malström and then hit the TAB key to reveal their back panels. (Figure 11.)
Figure 11: The rear panels of Thor and Malström
If you've not looked at these before, you may be surprised to discover the wealth of control options that they offer. What's more, there's nothing arcane about them because, as postulated above, they are indeed represented in the form of CV & Gate inputs and outputs that allow you to interconnect all manner of disparate devices – not just the synthesisers, but the sequencers, effects units and other modules within the software – and make them do things that would not otherwise be possible.
Create an instance of the PH-90 Phaser and look at its rear panel... there are the frequency and rate CVs that I implied in figure 10. Create a Mixer and look at its rear panel... there are the level and pan CVs that I drew. Create a Spider CV merger and splitter... there's your CV Mixer. Create an RPG-8 arpeggiator or a Matrix sequencer... you'll find CVs, CVs everywhere!
At this point, it can be tempting to start connecting dozens of virtual cables, and you'll soon be on your way to that mass of multi-coloured spaghetti that I mentioned above. So, to conclude this tutorial and illustrate what we've learned, I'm going to discuss a simple audio example that uses just two auxiliary CVs, each of which is generated by one instance of the Matrix sequencer.
Like its inspirations from the 1970s, Matrix generates three control signals: a pitch CV (which Propellerhead calls a Note CV), a Gate, and an auxiliary CV called the Curve CV. If you create an instance of Matrix with Thor already in place, it will automatically connect its Note CV output to Thor's main CV input, and its Gate output to Thor's Gate input, ready for use. However, I'm also going to connect its Curve CV output to Thor's CV1 input (see figure 12) and, turning to the front of the rack, use the modulation matrix in Thor to direct CV1 to the Y-axis of the Formant Filter in my patch. (Figure 13.)
Figure 12: Connecting Matrix and Thor
Figure 13: Adding interest to the sound by controlling the filter with the Matrix's auxiliary CV
I'm now ready to programme a short sequence. I'll use the Note CV to determine the notes played, and the Curve CV to determine the characteristics of the Formant filter in the patch for each note. The Gate will act just as on the analogue synths described above, controlling the ADSRs that shape the notes generated by Thor. Sound #1 demonstrates this sequence without the Curve CV connected:
While sound #2 demonstrates it with the Curve CV doing its thing:
Next, I've added this sequence to a backing sequence generated simultaneously using a second instance of Matrix, a second instance of Thor, plus Redrum and a few other bits and pieces. Again, I've recorded two versions – one with the Curve CV disconnected:
...and one with it connected.
Finally, I've used an instance of Spider plus the Curve CV in the second instance of Matrix to pan the pitched parts left/right/left while the sequence is playing, and (using a CV inverter) to pan the rhythm part right/left/right to create a cross-panning effect between the drums and the bass lines:
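The cross-panning idea can be expressed numerically in a short Python sketch. It assumes a bipolar control signal in which -1 means hard left and +1 means hard right; that range and the names are illustrative assumptions rather than a description of how Reason represents CVs internally.

```python
def invert_cv(values):
    """Mirror a bipolar control signal around zero, as a CV inverter would."""
    return [-v for v in values]


# A hypothetical curve sweeping left, centre, right, centre.
curve = [-1.0, 0.0, 1.0, 0.0]
pitched_pan = curve            # pans the pitched parts
rhythm_pan = invert_cv(curve)  # pans the rhythm part the opposite way
```

At every step the two pan positions are mirror images of each other, so the parts swap sides as the sequence plays.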
Once you start to experiment with your own sounds, sequences and effects, you'll soon find that CVs and Gates are superb tools, and you'll never get tired of finding out what happens when you ask this device to control some aspect of that device and make it do something weird'n'wonderful just to see what the results might be. The only caveat is that – even in a digital system – the multi-coloured spaghetti is never far away. (See figure 14.) Now it's time for you to cook some of your own.
Figure 14: Some of the connections that generated sound #5
Yours truly is old enough to have been around when digital samplers first arrived. Admittedly I never touched a Fairlight or an Emulator back when they were fresh from the factory – those products were way out of a teenager's league – but I distinctly remember the first time I laid hands on an S612, Akai's first sampler. Its modest 128 kB RAM could hold a single 1-second sample at maximum quality (32 kHz) – but none the less it was pure magic to be able to record something with a mic and instantly trigger it from the keyboard. I spotted that particular Akai sampler hooked up in my local music store, and tried it out by sampling myself strumming a chord on a Spanish guitar. My first sample...!
As the years went by, I gradually became spoiled like everyone else; there were tons of high quality sample libraries available on floppies, and soon enough the market was swamped with dedicated sample playback instruments such as the S1000PB, the E-mu Proteus, the Korg M1 and the Roland U-series to name but a few. This trend carried over into software instruments; manufacturers and others kept sampling like crazy for us so it seemed more or less superfluous to do it yourself. Propellerhead was no exception – with no sampling facilities and no hard disk recording, Reason remained a playback device for canned samples for almost 10 years – but in Reason 5 and Record 1.5, they got around to adding a sampling feature. The delay was typical Propellerhead: don't do it unless it's done right. The trick to doing it right was to bring back the simplicity and instant gratification of those early samplers – just plug in a source, hit the sample button, perform a quick truncate-and-normalize in the editor, and start jamming away.
Setting Up for Sampling
The sampling feature is part of the Audio I/O panel on the hardware interface. On the front panel there's an input meter, a monitor level control, a button for toggling the monitoring on/off and an Auto toggle switch – activate this and monitoring will be turned on automatically while you're recording a sample (figure 1).
Figure 1: Reason's Hardware interface now sports a sampling input
On the back, there's a stereo input for sampling and this will be hooked up to audio inputs 1 and 2 by default. In case you're sampling a mono source such as a mic, and the source is audio input 1, make sure to remove the cable between audio input 2 and sampling input R - otherwise you'll get a stereo sample with only the left channel (figure 2).
Figure 2: The default routing is Audio Input 1/2 to Sampling Input L/R
Now, if you're sampling with a mic you may want some help with keeping the signal strong and nice without clipping, and perhaps a touch of EQ as well. In that case you can pass the mic signal through Reason devices such as the MClass suite and then sample the processed signal (figure 3).
Figure 3: You can process your audio input prior to sampling, for example with EQ
Sampling devices or the entire rack
You might wonder "hey, what's the point of sampling Reason itself when I can just export an audio loop?" Well, there are situations where that method just doesn't cut it. A few examples:
When you want to sample yourself playing an instrument, singing, rapping or scratching, you're running the audio through Reason's effect units, and you want to capture the effects with the sample.
When you want to edit something quickly without leaving Reason, for instance if you want to reverse a segment of your song or reverse single sounds such as a reverb 'tail'. Just wire it into the sampling input, hit the sample button on the Kong/NN-XT/NN19/Redrum, pull up the editor, reverse the sample, done.
When you want to sample the perfect 'snapshot' of instruments doing something that's non-repeatable, for example when a sample-and-hold LFO happens to do something you wish it would do throughout the entire song.
When you're doing live tweaking that can't be automated, such as certain Kong and NN-XT parameters.
It's a straightforward process where you simply take any Reason devices you want to sample and plug them into the Sampling Input. In this illustration, Reason has been set up to sample from a Dr. OctoRex (figure 4).
Figure 4: Routing Reason devices to the sampling input is this simple
When you sample straight from the Reason rack, you'll want to make sure to enable the monitoring, because unlike situations where you sample an external acoustic source, you'll hear absolutely nothing of what you're doing unless monitoring is enabled (figure 5).
Figure 5: Turn on monitoring so you can hear what you're doing
These utilities let you sample just about anything straight into Reason: iTunes, Windows Media Player, YouTube, Spotify, anything with audio (including other music and audio applications of course). Now let's have a look at using one of these utilities, Soundflower. The setup procedure is similar for all these utilities, whether on PC or Mac (consult the included documentation for specifics).
After installing Soundflower, start up Soundflowerbed (it's in the Applications folder). Now you should have a flower icon on the Menu bar. Pull down the menu and make sure that the built-in output is selected in the Soundflower 2CH section: (figure 6)
Figure 6: Make sure to select the built-in output
Next, open System Preferences and select "Soundflower (2ch)" as both input and output:
Figure 7: Open System Preferences to select the correct input
And finally, open Reason Preferences and select Soundflower (2ch) as the driver on the Audio tab:
Figure 8: Now set up the input in Reason's Preferences
Now you have set everything up so that Reason can tap the audio stream, and anything playing back over the built-in output can be sampled. Though in this particular example, do not enable monitoring on the Sampling Input panel since this will create a feedback loop!
The Sample Editor
The editor is very simple and straightforward to use (it lets you trim, reverse, normalize and loop samples) so we won't go through the basics here. Instead we'll explore the fact that you can edit not just your own samples, but even those contained in ReFills, and the edited samples can then be self-contained in song files so that you can share songs without the recipient owning the actual ReFill(s).
So, any sample you load into NN-XT, Kong, NN19 or Redrum can be edited manually. This means that you can also edit individual REX slices, since those can be loaded as patches into NN19 and NN-XT. While the new REX player Dr. OctoRex lets you reverse individual slices, it doesn't always yield the desired results since there's often a very long 'tail' at the beginning of the slice so it has to be triggered way in advance. And sometimes there are 'bad' slices that aren't trimmed properly, end too abruptly, and issues like that. But by editing the slices manually you can shape them just the way you want: trim, reverse, fade in/fade out, normalize slices you find too quiet even at max volume, etc. Or you can create weird 'beat repeat' glitches by making very short loops on individual slices.
Here's how you do it: Create a Dr. OctoRex. Load up the REX file of your choice. Copy the notes to the Track. Then create an NN-XT and move or copy the OctoRex notes to the empty NN-XT track. Now, open the patch browser and load the same REX file into the NN-XT. The slices will show up in the NN-XT Editor window like this: (figure 9)
Figure 9: A Rex loop opened in the NN-XT
Now you have REX playback with the option to edit each slice. Simply select a sample on the NN-XT Editor display, and then right-click and select "Edit Sample" or hold [Alt/Option] while clicking on the Sample button. (figure 10)
Figure 10: Reason's built in sample editor
Here's one of those samples that are difficult to use when reversed, because it's about 10% percussive sound and the rest is just a hint of reverb. So what we'll do is reverse it and trim off most of the 'tail' by using the Crop function... (figure 11)
Figure 11: A long tail can be cropped
...and then, create a short fade-in on the trimmed tail by highlighting the beginning of the slice and clicking the Fade In button... (figure 12)
Figure 12: Fading in helps the sample start smoothly
...and now we have a more manageable reversed slice.
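For the curious, the editing steps just described (reverse, crop, fade in, and the normalize mentioned earlier) amount to very simple operations on a list of sample values. The Python below is only a sketch of the idea with hypothetical function names; it says nothing about how Reason's editor is implemented.

```python
def reverse(samples):
    """Play the sample backwards."""
    return samples[::-1]


def crop(samples, start, end):
    """Keep only the samples from index start up to (not including) end."""
    return samples[start:end]


def fade_in(samples, length):
    """Ramp the first `length` samples linearly from silence to full level."""
    length = min(length, len(samples))
    faded = [s * i / length for i, s in enumerate(samples[:length])]
    return faded + samples[length:]


def normalize(samples, peak=1.0):
    """Scale the sample so that its loudest value reaches `peak`."""
    loudest = max((abs(s) for s in samples), default=0.0)
    if loudest == 0:
        return list(samples)  # silence cannot be scaled up
    return [s * peak / loudest for s in samples]
```

A percussive hit followed by a long quiet tail, once reversed, begins with the tail. Cropping removes most of it, and the fade-in avoids a click where the crop begins.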
As mentioned earlier, another thing you can do with slices is to create very short loops to get that glitchy digital hiccup feel. It might look something like this: (figure 13)
Figure 13: A short sample with the loop mode set to back/forth
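A back/forth loop simply bounces the playback position between the loop start and loop end. The following Python sketch shows the order in which sample positions would be read; the function name and the inclusive loop points are assumptions for illustration.

```python
def ping_pong_indices(loop_start, loop_end, count):
    """Return `count` sample positions bouncing between loop_start and loop_end (inclusive)."""
    forward = list(range(loop_start, loop_end + 1))
    # Travel back without repeating the two end points.
    cycle = forward + forward[-2:0:-1]
    return [cycle[i % len(cycle)] for i in range(count)]
```

With a loop of only a handful of samples the cycle repeats so quickly that you hear a buzz or stutter rather than a recognisable sound, which is where the glitchy hiccup comes from.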
Here's an example of a factory soundbank REX loop where individual slices have been reversed and/or looped in the editor. Bars 1-2 and 5-6 play the original loop for comparison, bars 3-4 and 7-8 feature the tweaked loops played by the NN-XT.
As a quick demonstration of what you can do with just a mic and whatever potential sampling sources you have lying around, I went to the kitchen and spent half an hour recording random stuff with a Sennheiser e945 hooked up to an Apogee Duet. The song file below is based entirely on the following self-contained samples:
Bass drum: Large empty drawer closing
Snare drum: Small drawer (full of utensils) closing
Closed HH: Water tap, short burst
Open HH: Water tap, long burst
Additional percussion: Scissors, more drawers closing
Bass 1: Microwave starting up
Bass 2: Slapping the top of the microwave
Bells: Butterknife against large beer glass
Poly synth: Microwave "ready beep"
FX drone: Dishwasher starting up
Sampling may seem a chore at first, with lots of trial and error, but it's actually quite easy, and ultimately it's rewarding to make music with your own unique sounds. When someone remarks "cool snare, where'd you get that?" you'll be able to say "oh, that's just me vacuuming up some Lego bricks". So grab a mic and go on a quest for hidden sonic treasures in your garage, your kitchen, your car... but go easy with the bathroom samples.
The final filter in Thor's armoury is a rather special one named a Formant filter, so-called because it imposes formants on any signal passed through it. But what are formants, and why would you want to impose them on anything?
Let's start to answer this by reminding ourselves of the four types of filters most commonly found in synthesizers. These are the low-pass filter (figure 1) the high-pass filter (figure 2) the band-reject or 'notch' filter (figure 3) and the band-pass filter (figure 4). Our journey into formant synthesis begins with the fourth of these.
Figure 1: low-pass filter
Figure 2: high-pass filter
Figure 3: band reject (notch) filter
Figure 4: band pass filter
A simple 6dB/oct band-pass filter is a fairly weak shaper of a signal, but if you place a number of these with the same centre frequency in series, the width of the pass-band becomes narrower and narrower until only a limited range of frequencies is allowed through. (Figures 5 and 6.)
Figure 5: The responses of placing two band-pass filters in series
Figure 6: The responses of placing 4 band-pass filters in series
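The narrowing effect of placing band-pass filters in series can be checked with a little arithmetic. The Python sketch below uses the textbook magnitude response of a second-order band-pass filter (6dB/oct skirts on either side of the centre); the Q of 1 and the function names are assumptions chosen for illustration.

```python
import math


def bandpass_gain(freq, centre, q=1.0):
    """Magnitude response of a simple second-order band-pass filter."""
    detune = q * (freq / centre - centre / freq)
    return 1.0 / math.sqrt(1.0 + detune * detune)


def series_gain(freq, centre, stages, q=1.0):
    """Identical filters in series multiply their gains."""
    return bandpass_gain(freq, centre, q) ** stages


def to_db(gain):
    """Convert a linear gain factor to decibels."""
    return 20.0 * math.log10(gain)
```

Because the gains multiply, the attenuation in decibels at any frequency away from the centre is simply the single-filter figure times the number of stages, while the gain at the centre frequency stays at 0dB. That is why the pass-band narrows as filters are added.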
Now imagine the case in which you place, say, three of these multiple band-pass filters in parallel. If you set the centre frequency to be different for each signal path, you obtain three distinct peaks in the spectrum (see figure 7) and the filters attenuate any signal lying outside these bands. As you can imagine, any sound filtered in this way adopts a distinctive new character.
(A similar result can be obtained using parallel peaking filters or even low-pass and high-pass filters with high resonance values, and a number of venerable keyboards in the 1970s used architectures based on these. Although not strictly equivalent, the results look similar and for many synthesis purposes are interchangeable.)
Figure 7: Multiple BPFs in parallel
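A parallel bank of band-pass filters can be sketched in Python by summing the complex responses of the individual paths. The centre frequencies of 300Hz, 1000Hz and 3000Hz and the Q of 10 are arbitrary values picked for illustration, not measured formants, and the function names are hypothetical.

```python
def bandpass_response(freq, centre, q):
    """Complex response of a simple second-order band-pass filter."""
    return 1.0 / complex(1.0, q * (freq / centre - centre / freq))


def filter_bank_gain(freq, centres, q=10.0):
    """Gain of several band-pass filters in parallel (their outputs are summed)."""
    return abs(sum(bandpass_response(freq, c, q) for c in centres))


CENTRES = [300.0, 1000.0, 3000.0]  # illustrative centre frequencies in Hz
```

Evaluating filter_bank_gain at each centre frequency gives a gain close to 1, whereas frequencies lying between the bands are strongly attenuated. Those are the distinct spectral peaks described above.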
If we wanted to pursue this path further, it would take us into a whole new domain of synthesis called physical modeling. This is because the characteristic resonances of acoustic instruments - the bumps in the instruments' spectral shapes - are recognisable from one instrument to the next. For example, all violas are of similar shape, size, and construction, so they possess similar resonances and exhibit a consistent tonality that allows your ears to distinguish them from say, classical guitars generating the same pitch. It therefore follows that imitating these resonances is a big step forward toward realistic synthesis. Today, however, we're going to restrict ourselves to the special case of this that is sometimes called 'vocal synthesis'.
The Human Voice
Because you share the architecture of your sound production system with billions of other people, it's safe to say that all human vocalizations - whatever the language, accent, age or gender - share certain acoustic properties. To be specific, we all push air over our vocal cords to generate pitched signals, and we can tighten and relax these cords to change the pitch that we produce. Furthermore, we all produce broadband noise.
The pitched sounds are generated deep in our larynx, so they must pass through our throats, mouths, and noses before they reach the outside world through our lips and nostrils. And, like any other cavity, this 'vocal tract' exhibits resonant modes that emphasise some frequencies while suppressing others. In other words, the human vocal system comprises a pitch-controlled oscillator, a noise generator, and a set of band-pass filters! The resonances of the vocal tract and the spectral peaks that they produce are the formants that I keep referring to, and they make it possible for us to differentiate different vowel sounds from one another. (Consonants are, to a large degree, noise bursts shaped by the tongue and lips and, in general, you synthesise these using amplitude contours rather than spectral shapes.)
Table 1 shows the first three formant frequencies for some common English vowels spoken by a typical male adult. As you can see, they do not follow any recognisable harmonic series, and are distributed in seemingly random fashion throughout the spectrum.
Table 1: Some examples of the first three formants in English vowel sounds
Given table 1 and a set of precise filters you might think that you should be able to create passable imitations of these sounds but, inevitably, things are not quite that simple. It's not just the centre frequencies of the formants that affect the quality of the sound, but also the narrowness of their pass-bands (their Qs) and their gains. So we can extend the information in table 1 to create formants that are more accurate. Let's take "ee" as an example... (See table 2).
Table 2: Adding amplitude and width to the formant values
This is an improvement, but it isn't the end of the story, because the sound generated by a set of static band-pass filters is, umm... static, whereas human vowel sounds are not. To undertake true speech synthesis, we need to make the band-pass filters controllable, applying controllers to all of their centre frequencies, Qs and gains. Unfortunately, this is beyond the scope of this tutorial; so let's now turn our attention to creating vocal-like sounds using the Formant Filters in Thor.
Creating a choral patch
Thor's Formant Filter imposes four peaks upon any wide-band signal fed through it, and we can see these if we apply white noise to its input and view its output using a spectrum analyser. You can move the peaks by adjusting the X and Y positions and the Gender knob, but the interactions between these are too complex to describe here. So instead, I created the simple patch in figure 8, and used this to capture four images and four audio samples, one of each taken at each corner of the X/Y display, all with the Gender knob at its mid position. You see the results in figures 9 to 12, and hear them in sounds #1 to #4.
Figure 8: Measuring the output spectrum of the Formant Filter
Figure 9: Output spectrum of Formant Filter: input = white noise, X=min, Y=min (Sound #1)
Figure 10: Output spectrum of Formant Filter: input = white noise, X=max, Y=min (Sound #2)
Figure 11: Output spectrum of Formant Filter: input = white noise, X=min, Y=max (Sound #3)
Figure 12: Output spectrum of Formant Filter: input = white noise, X=max, Y=max (Sound #4)
These responses don't imitate the formants of a human voice in a scientifically accurate way but they are nonetheless quite capable of conveying the impression of a human voice if we replace the noise at the input with a signal that is closer to that generated by the vocal cords. I have chosen a pulse wave with a value of 23 (a duty cycle of about 18%) and shaped the output with a gentle ASR amplitude contour. With no effects applied, the patch looks like figure 13, and you can hear it in sound #5:
Figure 13: Generating a simple pulse wave
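If the term duty cycle is unfamiliar, this small Python sketch may help. It builds one cycle of a pulse wave that is high for a given fraction of the time; the resolution of 100 samples per cycle and the function name are assumptions for illustration, and it does not model how Thor's oscillator maps its pulse width value to a duty cycle.

```python
def pulse_wave(duty, samples_per_cycle=100, cycles=1):
    """Pulse wave that is high (+1) for the fraction `duty` of each cycle and low (-1) otherwise."""
    high = round(duty * samples_per_cycle)
    return [1.0 if (n % samples_per_cycle) < high else -1.0
            for n in range(samples_per_cycle * cycles)]


wave = pulse_wave(0.18)  # roughly the duty cycle used in this patch
```

A duty cycle of 18% means the wave spends 18 samples out of every 100 in its high state, giving a thinner, more nasal tone than a square wave (50%).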
Now let's apply the Formant Filter. I've inserted this into the Filter 1 slot, set the Gender to a value of 46 and set the X/Y values to 46 and 38. (See figure 14.) There is nothing magical about these numbers; I just happen to like the results that they give, especially when I add additional elements into the patch. You'll also see that the key tracking is set to its maximum, which means that the spectral peaks move within the spectrum as the pitch changes. This is not strictly accurate but I find that, for this patch, the high notes are too dull if I leave the tracking at zero.
Figure 14: Formant filtering the pulse wave
The patch now exhibits a 'vocal' timbre, but it's rather too static for my taste, so before recording a sample I've enhanced it a little by adding some movement to the positions of the filter peaks. I did this by applying a small amount of smoothed random modulation to the X position using LFO1 and to the Y position using LFO2. The resulting sound (shown in figure 15 and heard in sound #6) now has a touch of subtle instability that makes it a little more human than before:
Nonetheless, it sounds like nothing so much as a late-70s vocal synth with the ensemble button switched off. Ah... there's the clue. Leaving the patch in figure 15 untouched and invoking some external ensemble, EQ and reverb results in sound #7. Luscious!!
Figure 15: A vocal patch
Because the human voice comprises noise as well as tonal signal, we can enhance this still further by adding white noise at low amplitude to the input signal. Figure 16 shows this and, though the difference is again subtle, it can be a worthwhile improvement.
Figure 16: Adding noise to the vocal patch
Of course, you might say that the addition of the external effects made the last sound what it is, and to some extent that would be true, but let's check what the latest patch sounds like without the Formant Filter:
As you can hear, it has the nuance of a vocal timbre, but at best you might call it a 'StringVox' patch. Clearly, it's the interaction of the filtered sound and the ensemble that achieves the desired effect, which is something that Roland demonstrated more than thirty years ago when they released the wonderful VP-330 Vocoder Plus, whose unaffected vocal sound was little more than a nasal "aah" but whose ensemble defined the choral sounds of the late 70s and early 80s.
Now let's ask what might happen if we replace the pulse wave that forms the basis of the previous sounds with something that is already inherently voice-like. We can investigate this by replacing the Analogue Osc with a Wavetable Osc, selecting the Voice table and choosing a suitable position within it. Figure 17 and sound #9 demonstrate this and, as you can hear, a different - but still very useable - vocal timbre results:
Figure 17: A wavetable-based formant-filtered vocal sound
You might think that you always have to start with a quasi-vocal waveform to obtain a vocal sound, but this is far from true. Take the swarm of giant, angry insects in sound #10, which was created using the patch in figure 18:
This is the unfiltered output from a Multi Osc with random detune being swept by the Mod Env from a high to a low value at the start of the note, the pitch being swept upward at the same time, and a delayed vibrato being supplied by LFO1. If we now add a Formant Filter to this patch (figure 19) the nature of the sound changes dramatically, becoming vocal in timbre and sounding almost like a male ensemble in a reverberant space, even though no effects have been applied:
Figure 18: Turning a swarm of angry insects into a male voice choir
Figure 19: Turning a swarm of angry insects into a male voice choir (cont.)
There are of course many other things we can do with vocal synthesis. Returning to the wavetable oscillator, I have created a new patch with the Gender set to maximum and the X/Y position in the centre of the display. (Figure 20.) I have added four paths in the modulation matrix to refine this, with a touch of vibrato supplied by LFO1, some random pitch deviation supplied by LFO2, a short sweep through part of the wavetable generated by the Mod Env, and an organ-like amplitude envelope generated by the Global Env that curtails every note eight seconds after you play it. ("Ah-ha!" I hear you say.) You might think that the use of a vocal wavetable and a Formant Filter with the Gender set to maximum would produce a female vocal timbre, but instead it emulates the strange tonal quality of a Mellotron. This is because the Mellotron's tape replay system exhibits strong peaks in its output spectrum, so the use of a formant filter is a good way to imitate this. Sound #12 demonstrates the patch in figure 20 played without external effects:
While sound #13 demonstrates what is possible when ensemble is applied:
Figure 20: Melly strings
Finally, we come to the famous 'talking' synthesiser patch. There are many variants of this, mostly based around the sounds "ya-ya-ya-ya-ya" or "wow-ow-ow-ow-ow", but they all boil down to moving the formant peaks while the sound is playing. If we had a complex, scientifically accurate synthesiser, we could reproduce genuine vowel sounds, but few if any commercially available synths are capable of this. Figure 21 shows a Thor patch that says "wow" by shifting the Gender, X and Y values by appropriate amounts while opening and closing the audio amplifier. With no external effects applied, we obtain sound #14 from this. Wow!
Figure 21: Wow!
To be honest, concentrating on vocal and SynthVox sounds only scratches the surface of formant synthesis, and you can use formant filters to create myriad other sounds ranging from orchestral instruments to wild, off-the-wall effects. But, unfortunately, there's no space to demonstrate them here because we've come to the end of my tutorials introducing Thor's filters. I hope that they have given you some new ideas and - as I suggested when I concluded my tutorials on Thor's oscillators - have illustrated why there is so much more to synthesis than tweaking the cut-off knobs of resonant low-pass filters. Thank you for reading; I appreciate it.
When we talk about an audio signal generated by an analogue (or virtual analogue) oscillator, we often describe it using three characteristics: its waveform, its frequency, and its amplitude. These, to a good approximation, determine its tone, its perceived pitch, and its volume, respectively. But there is a fourth characteristic that is less commonly discussed, and this is called the ‘phase’ of the signal.
Consider the humble 100Hz sine wave. You might think that this can be described completely by its frequency and its amplitude and, in practice, this is true provided that you hear it in isolation. But now consider two of these waves, each having the same frequency and amplitude. You can generate these by taking a single sine wave and splitting its output, passing one path through a delay unit as shown in figure 1. If no delay is applied, the two waves are said to be ‘in phase’ with one another (or, to express it another way, they have a phase difference of 0º) and, as you would imagine, you could mix them together to produce the same sound, but louder.
Figure 1: Adding in-phase sine waves
Now let’s consider what happens when the waves are ‘out of phase’, with one climbing away from ‘zero’ at the same time as the other drops away, as shown in figure 2. In this case, the second waveform is offset by half a cycle (a phase difference of 180º) with respect to the first and, if you mix them, they cancel each other out. In isolation, both waves sound identical, but mixing them results in silence. Given that this signal has a frequency of 100Hz, its period is 10ms. Consequently, the offset between the two waves in figure 2 (half a period) is exactly 5ms. This is a remarkable result because it tells you that, for a single frequency, a precise delay and a mixer can act as a perfect filter!
Figure 2: Adding out-of-phase sine waves
The results illustrated in figures 1 and 2 demonstrate perfect addition and perfect cancellation of a 100Hz sine wave but, if you combine the waves with phase differences other than 0º or 180º, you obtain different degrees of addition or cancellation. As you sweep the relative phase of the two signals from 0º to 360º the output from the system varies as shown in figure 3.
Figure 3: The output amplitude obtained as the phase difference is swept from 0º to 360º
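The curve described above can be checked with a few lines of Python. This is only a sketch: it assumes two unit-amplitude sine waves, uses the identity sin(x) + sin(x - φ) = 2·cos(φ/2)·sin(x - φ/2), and the function names are mine rather than anything taken from Reason.

```python
import math

def summed_amplitude(phase_deg):
    """Peak amplitude of sin(x) + sin(x - phase) for two unit-amplitude sines."""
    return abs(2.0 * math.cos(math.radians(phase_deg) / 2.0))

def measured_amplitude(phase_deg, steps=3600):
    """Brute-force check: sample one cycle of the mix and keep the largest absolute value."""
    phase = math.radians(phase_deg)
    return max(
        abs(math.sin(x) + math.sin(x - phase))
        for x in (2.0 * math.pi * i / steps for i in range(steps))
    )

# Sweep the phase difference from 0 to 360 degrees in 45 degree steps.
for phase_deg in range(0, 361, 45):
    formula = summed_amplitude(phase_deg)
    measured = measured_amplitude(phase_deg)
    print(f"{phase_deg:3d} degrees: formula {formula:.3f}, measured {measured:.3f}")
```

The amplitude falls smoothly from double the input at 0º to nothing at 180º, then climbs back to double at 360º.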
So far, so good… A phase difference of 180º results in cancellation while a phase difference of 0º or 360º results in maximum reinforcement. But there’s nothing stopping us from applying delays that result in phase differences greater than 360º, and a phase difference of 361º has the same effect as one of 1º, 362º has the same effect as 2º… and so on ad infinitum. So, if you take a sine wave, split it into two paths, apply a delay to one path and mix the signals again, the sound will come and go as you adjust the delay time. In the case of the 100Hz sine wave, cancellation occurs with delays of 5ms, 15ms, 25ms…, and maximum reinforcement occurs with delays of 10ms, 20ms, 30ms… and so on. (Figure 4.)
Figure 4: Extending the delay time beyond one cycle
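If you would like to verify these delay times yourself, the arithmetic can be sketched as follows. The helper names are hypothetical and the only assumption is that the phase difference is the delay expressed as a fraction of the period.

```python
def phase_difference_deg(freq_hz, delay_s):
    """Phase offset produced by a delay, wrapped into the range 0 to 360 degrees."""
    return (360.0 * freq_hz * delay_s) % 360.0

def cancelling_delays_ms(freq_hz, count):
    """Delays of half a period, one and a half periods, two and a half periods..."""
    period_ms = 1000.0 / freq_hz
    return [(k + 0.5) * period_ms for k in range(count)]

def reinforcing_delays_ms(freq_hz, count):
    """Delays of one, two, three... whole periods."""
    period_ms = 1000.0 / freq_hz
    return [(k + 1) * period_ms for k in range(count)]

print(cancelling_delays_ms(100.0, 3))
print(reinforcing_delays_ms(100.0, 3))
print(phase_difference_deg(100.0, 0.0125))  # 12.5 ms is one and a quarter periods
```

For the 100Hz sine wave this gives cancellation at 5, 15 and 25ms and reinforcement at 10, 20 and 30ms, as described above.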
Now let’s turn this concept on its head. If there are many delay times that will result in cancellation of a 100Hz signal, it seems reasonable to surmise that there are many frequencies that will be cancelled by a given delay time. This turns out to be true, and the lowest frequency cancelled by a given delay “D” expressed in seconds is 1/(2D). The next cancellation occurs at 3/(2D), the next at 5/(2D), the next at 7/(2D) and so on. Returning to our example, therefore, a delay of 5ms generates a fundamental cancellation at 100Hz, but it also cancels signals lying at 300Hz, 500Hz, 700Hz… and so on, all the way up through the spectrum, with frequencies lying between these figures attenuated as shown in figure 5. The resulting filter response looks much like a broad-toothed comb.
Figure 5: The comb filter
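The shape of the comb can be sketched numerically. The code below assumes the simplest possible arrangement, the dry signal mixed equally with one delayed copy, for which the gain at frequency f is 2·|cos(πfD)|. The function names are illustrative only.

```python
import math

def notch_frequencies(delay_s, count):
    """First `count` cancelled frequencies: odd multiples of 1/(2D)."""
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]

def comb_gain(freq_hz, delay_s):
    """Gain applied to a sine wave when it is mixed with a copy delayed by delay_s."""
    return abs(2.0 * math.cos(math.pi * freq_hz * delay_s))

delay_s = 0.005  # the 5 ms delay from the example
print([round(f, 1) for f in notch_frequencies(delay_s, 4)])

# Walk from one notch to the next to see a single tooth of the comb.
for freq_hz in range(100, 301, 25):
    print(f"{freq_hz:4d} Hz: gain {comb_gain(freq_hz, delay_s):.3f}")
```

The gain rises from zero at 100Hz to its maximum at 200Hz and falls back to zero at 300Hz, and the same tooth then repeats every 200Hz up the spectrum.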
Of course, sine waves are rather special cases, but complex signals such as music and speech can be represented by an infinite number of sine waves that describe all the frequencies present. So, for a given delay, every signal component will be reinforced or attenuated according to its frequency and the delay applied, and the resulting holes in the output spectrum are what give the comb filter its instantly recognisable character.
Finally, let’s consider what happens when the delay time is not constant but fluctuates. In this case, the holes move across the spectrum, creating effects that include flanging, phasing and chorusing. Furthermore, these and other effects can be reinforced by positive feedback (called ‘regeneration’ or ‘resonance’) to create some of the most interesting sounds and effects obtainable from subtractive synthesis.
Using the comb filter in Thor
I’ve discussed the theory behind comb filters in a bit more depth than usual because, when you know how they do what they do, you can design some amazing sounds using them. But you probably won’t stumble upon these by accident.
Let’s start by proving the concepts outlined above. Figure 6 shows a basic Thor patch, with a sine wave generated by Osc 1 being swept upward by applying the Filter Envelope to the oscillator frequency in the modulation matrix. There’s no filtering applied, just a gentle Attack and Release in the Amp Env to smooth the start and finish of the sound. If you play a note, you obtain the smooth upward sweep you would expect:
Figure 6: Generating an upward pitch sweep
Now add a comb filter as shown in figure 7, with the filter frequency set to its minimum of 39.4Hz (approximately 25.4ms). Playing the same note as before demonstrates that the amplitude of the signal now changes from silent to loud to silent to loud as the pitch increases, with silences at 59.0Hz, 98.4Hz, 137.8Hz… and so on up the spectrum:
Not convinced? OK, then look at figure 8, which shows part of the audio in sound #2. Silent, loud, silent, loud, silent, loud… exactly as predicted.
Figure 7: Comb filtering a pitch sweep in Thor
Figure 8: The amplitude of the filtered signal as the pitch sweeps upward
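The silent and loud points can also be reproduced outside Reason. The following is a rough model rather than a description of Thor's internals: it assumes a 48kHz sample rate, rounds the 25.4ms delay to a whole number of samples, and mixes the dry signal equally with the delayed copy. All of the names are mine.

```python
import math

SAMPLE_RATE = 48_000  # assumed sample rate, not taken from Thor

def sine(freq_hz, seconds):
    """Unit-amplitude sine wave as a list of samples."""
    count = int(seconds * SAMPLE_RATE)
    return [math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(count)]

def feedforward_comb(signal, delay_samples):
    """Add each sample to the one delay_samples earlier (silence before the start)."""
    return [
        x + (signal[n - delay_samples] if n >= delay_samples else 0.0)
        for n, x in enumerate(signal)
    ]

def steady_state_rms(signal, skip):
    """RMS level after discarding the first `skip` samples."""
    tail = signal[skip:]
    return math.sqrt(sum(v * v for v in tail) / len(tail))

# 25.4 ms rounded to a whole number of samples.
delay_samples = round(0.0254 * SAMPLE_RATE)
delay_s = delay_samples / SAMPLE_RATE

def level_at(freq_hz):
    """Steady-state output level of the comb for a sine of the given frequency."""
    filtered = feedforward_comb(sine(freq_hz, 0.5), delay_samples)
    return steady_state_rms(filtered, delay_samples)

for k in range(1, 4):
    notch_hz = (2 * k + 1) / (2.0 * delay_s)  # odd multiples of 1/(2D)
    peak_hz = (k + 1) / delay_s               # whole multiples of 1/D
    print(f"notch {notch_hz:6.1f} Hz: {level_at(notch_hz):.4f}   "
          f"peak {peak_hz:6.1f} Hz: {level_at(peak_hz):.4f}")
```

Because the delay has been rounded to whole samples, the notch frequencies printed here differ very slightly from the figures quoted above.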
A more interesting result is obtained if you replace the sine wave with something harmonically complex such as a sawtooth wave. A 100Hz sawtooth comprises components at 100Hz, 200Hz, 300Hz, 400Hz… all the way up to the upper limit of the spectrum that your audio system can handle. Consequently, if you sweep the pitch, you won’t hear the amplitude of the signal change very much, but you’ll hear its harmonic content change as various harmonics are reinforced or silenced. You can hear an unfiltered sawtooth sweep in sound #3:
And the same sweep passing through a comb filter in sound #4.
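To see why the harmonic content changes rather than the overall level, you can compute what the comb does to each harmonic in turn. This sketch assumes an ideal sawtooth whose kth harmonic has a relative level of 1/k, and the same simple 'signal plus delayed copy' comb as before. The function names are hypothetical.

```python
import math

def comb_gain(freq_hz, delay_s):
    """Gain of a sine wave mixed equally with a copy delayed by delay_s."""
    return abs(2.0 * math.cos(math.pi * freq_hz * delay_s))

def filtered_sawtooth_harmonics(fundamental_hz, delay_s, count):
    """(frequency, level) pairs for the first `count` harmonics after the comb."""
    return [
        (k * fundamental_hz, comb_gain(k * fundamental_hz, delay_s) / k)
        for k in range(1, count + 1)
    ]

# Compare three pitches passing through the same 5 ms comb.
for fundamental_hz in (100.0, 110.0, 125.0):
    print(f"Sawtooth at {fundamental_hz:.0f} Hz")
    for freq_hz, level in filtered_sawtooth_harmonics(fundamental_hz, 0.005, 6):
        print(f"  {freq_hz:6.1f} Hz: level {level:.3f}")
```

With a 5ms delay and a 100Hz sawtooth, every odd harmonic lands on a notch and every even harmonic lands on a peak. Move the pitch a little and a quite different set of harmonics survives.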
Next, rather than sweep the pitch of the oscillator to alter the harmonic content of the waveform, let’s leave the pitch of the oscillator constant and alter the frequency of the comb filter. To demonstrate what this can do, let’s start by playing a simple chord with no comb filtering applied:
We can make this more interesting by reintroducing the filter and modulating its frequency using LFO1 as shown in figure 9. If we set the LFO rate to an appropriate value (which turns out to be around 6Hz), we obtain sound #6:
Amazing… this is like turning on the Ensemble button of a cheap string synth! However, because there’s just a single modulation, you can hear a distinct “wa-wa-wa-wa-wa” in the timbre, so we can remove this and add greater depth to the patch by applying a second modulator to Filter 1 or by adding a second audio path, placing a second comb filter in the Filter 2 slot and modulating this at a different rate. I have chosen the latter approach with LFO2 as the modulator, using a rate of around 1Hz and a somewhat greater modulation depth than I used in the first path. You can see the patch in figure 10, and hear the result in sound #7. This is much nicer:
Figure 9: Modulating the frequency of the comb filter
Figure 10: Passing a sawtooth wave through two comb filters in series
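For the curious, here is one way that a modulated comb might be modelled in code. It is a sketch under several assumptions of my own: a 48kHz sample rate, a delay equal to one divided by the swept comb frequency, linear interpolation between samples, and an equal mix of dry and delayed signal. The 6Hz and 1Hz LFO rates come from the patches above, but the comb frequency and modulation depths are invented for illustration.

```python
import math

SAMPLE_RATE = 48_000  # assumed sample rate

def sawtooth(freq_hz, seconds):
    """Naive (non-band-limited) sawtooth between -1 and 1."""
    count = int(seconds * SAMPLE_RATE)
    return [2.0 * ((freq_hz * n / SAMPLE_RATE) % 1.0) - 1.0 for n in range(count)]

def modulated_comb(signal, comb_hz, depth, lfo_hz):
    """Mix the input with a delayed copy whose delay is 1 / (swept comb frequency).

    depth is the fractional sweep of the comb frequency (0.05 means +/-5%).
    """
    if comb_hz <= 0.0 or not 0.0 <= depth < 1.0:
        raise ValueError("need comb_hz > 0 and 0 <= depth < 1")
    out = []
    for n, x in enumerate(signal):
        lfo = math.sin(2.0 * math.pi * lfo_hz * n / SAMPLE_RATE)
        delay_samples = SAMPLE_RATE / (comb_hz * (1.0 + depth * lfo))
        position = n - delay_samples  # fractional read position in the input
        index = math.floor(position)
        if index < 0:
            delayed = 0.0  # nothing has emerged from the delay yet
        else:
            frac = position - index
            following = signal[index + 1] if index + 1 < len(signal) else 0.0
            delayed = (1.0 - frac) * signal[index] + frac * following
        out.append(x + delayed)
    return out

dry = sawtooth(110.0, 0.25)
one_filter = modulated_comb(dry, comb_hz=200.0, depth=0.05, lfo_hz=6.0)
two_filters = modulated_comb(one_filter, comb_hz=200.0, depth=0.10, lfo_hz=1.0)
print(len(dry), len(one_filter), len(two_filters))
```

Feeding the output of the first call into the second mirrors the series arrangement in figure 10, with the slower LFO given the greater depth.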
This ‘dual-filtered’ sound is perfectly useable, but it retains perhaps too much of the brightness of the initial sawtooth wave, so you should EQ this to taste. I have done so by placing a low-pass filter in the Filter 3 slot (figure 11). With a cut-off frequency of around 2.5kHz, I obtained sound #8:
Figure 11: Removing some of the brightness from the previous patch
If this still isn’t fat enough for you, you can add more oscillators to the patch. Figure 12 shows it with three detuned sawtooth oscillators instead of the single oscillator used above. Even without filtering, this would be an imposing patch, but when played through the dual filters, it’s a monster:
With the low-pass filter in the Filter 3 slot, it’s heavier and darker:
To illustrate this further, I’ve played a well-known chord sequence and recorded it as sound #11 (without EQ):
and as sound #12 (with EQ). Use with care!
Figure 12: The multi-oscillator dual-comb filtered patch
Creating something a bit more radical
There’s a good reason why comb filters are so good at creating ensemble sounds: ensemble circuits typically comprise two, three or four variable delay lines followed by a mixer. Or, to put it another way, they comprise two, three or four modulated comb filters. But there’s something that a synthesiser’s comb filter has that a typical ensemble effect does not, and that’s a feedback loop. In a low-pass filter, the gain of this loop determines the amount by which the filter amplifies any signal components lying at or near its cut-off frequency. In a comb filter, increasing the value of the RES will emphasise ALL of the ‘humps’ in the filter’s response. So let’s return to the sawtooth sweep that created sound #4, but with the RES knob turned to its maximum rather than its minimum. As you can hear, the result is radically different:
Another interesting result is obtained if you leave the pitch of the oscillator unaffected and sweep the cut-off frequency of the comb filter. In sound #14 you can hear the harmonics of the sawtooth wave being picked out as the peaks of the comb travel upward in frequency until, at the end, they are so high up the spectrum that the sawtooth wave is heard almost unaffected:
This is all well and good, but it’s when you realise that a highly resonant comb filter can generate a harmonic series that things get really interesting. Figure 13 shows a patch that uses a resonant comb filter to carve the spectrum from a noise generator into something resembling a harmonic oscillator, with the overall shape of the resulting spectrum determined by the colour of the initial noise. If you modulate the noise colour using an LFO as shown, you can create some wonderful, evolving textures. The three bass notes in sound #15 demonstrate this, sounding much like an E-bow applied to a trans-Atlantic cable:
Taking this to the extreme, I added some external effects (chorus, EQ and reverb) to produce sound #16. This might win me the job as sound designer for the next instalment of the Aliens saga, but it’s still just noise passed through a single comb filter!
Figure 13: Comb-filtered noise
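The reason that a resonant comb turns noise into something pitched can be sketched with a little arithmetic. The model below is an assumption on my part, not a description of how Thor implements its RES control: it treats the filter as a delay with positive feedback, y[n] = x[n] + g·y[n - D], where g is a hypothetical feedback amount between 0 and 1 and D is one divided by the comb frequency.

```python
import math

def resonant_comb_gain(freq_hz, comb_hz, feedback):
    """Gain at freq_hz of a delay of 1/comb_hz seconds with positive feedback."""
    if not 0.0 <= feedback < 1.0:
        raise ValueError("feedback must be at least 0 and less than 1")
    angle = 2.0 * math.pi * freq_hz / comb_hz
    return 1.0 / math.sqrt(1.0 - 2.0 * feedback * math.cos(angle) + feedback ** 2)

comb_hz = 55.0  # illustrative comb frequency for a bass note
for feedback in (0.0, 0.5, 0.9):
    on_peak = resonant_comb_gain(2 * comb_hz, comb_hz, feedback)
    between = resonant_comb_gain(2.5 * comb_hz, comb_hz, feedback)
    print(f"feedback {feedback:.1f}: gain on a peak {on_peak:.2f}, between peaks {between:.2f}")
```

In this model the peaks lie at whole multiples of the comb frequency, which is a harmonic series. As the feedback approaches 1 the peaks tower over the troughs, so noise passed through the filter takes on a definite pitch.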
You can experiment with the patch in figure 13 to create a huge range of sounds ranging from menacing basses through to delicate, aetherial pads and voices. But, before finishing, there’s another use for the comb filter that I would like to demonstrate. If you pass a cyclic waveform through a highly resonant comb filter, you enter a new realm of sound design that involves convolving one harmonic series using a second. Think of it like this… you’re feeding a complete harmonic series into the filter, but you’ll only hear an output if one or more of the harmonics lie at the same frequencies as one or more of the filter’s pass-bands. Once you’ve got your head around this, you can devise all sorts of ways to generate interesting sounds. Figure 14 shows a patch in which a sawtooth wave that’s tracking the keyboard conventionally at 100% is being modified by a comb filter that’s tracking the keyboard at 90%. This means that different harmonics are being picked out depending upon the pitch, so each note that you play has a slightly different timbre from the others. You can set this up in many ways, but I rather like the bell-like timbre in sound #17 which, although it sounds a bit like AM or FM synthesis, would be all but impossible to obtain using other methods:
Figure 14: Using a comb filter to synthesise bells and chimes
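The effect of mismatched key tracking can be illustrated with some simple arithmetic. This sketch assumes that a tracking amount scales the number of semitones moved away from a reference note, and that the oscillator and the comb filter share the same frequency at that reference note. Both assumptions, the 220Hz reference and the function names are mine, so treat the numbers as illustrative rather than as Thor's exact behaviour.

```python
def tracked_frequency(ref_hz, semitones_from_ref, tracking):
    """Frequency reached when keyboard tracking is scaled by `tracking` (1.0 = 100%)."""
    return ref_hz * 2.0 ** (tracking * semitones_from_ref / 12.0)

def harmonic_positions(semitones_from_ref, ref_hz=220.0, comb_tracking=0.9, count=6):
    """Position of each sawtooth harmonic, in multiples of the comb frequency.

    A whole number means that the harmonic sits exactly on a peak of the comb.
    """
    osc_hz = tracked_frequency(ref_hz, semitones_from_ref, 1.0)
    comb_hz = tracked_frequency(ref_hz, semitones_from_ref, comb_tracking)
    return [k * osc_hz / comb_hz for k in range(1, count + 1)]

for semitones in (0, 7, 12, 24):
    positions = ", ".join(f"{p:.2f}" for p in harmonic_positions(semitones))
    print(f"{semitones:2d} semitones above the reference: {positions}")
```

At the reference note every harmonic sits on a peak. As you play higher, the harmonics drift progressively away from the peaks, and the higher harmonics drift fastest, which is why each note has a slightly different timbre from the others.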
There are much more radical things that you can do with this patch, and sweeping the filter frequency manually will reveal some of these. You could also add a second comb filter and come up with all manner of interesting ways of making the resulting signals interact with each other and the source signal(s) you choose to put through them. To be honest, I could go on for days showing you how to get sounds out of comb filters. They are among the most powerful and radical sound-shaping tools at your disposal. Please use them.