Thursday, December 18, 2014

Generating Audio PSK31 with an Arduino

I have been playing around with the idea of generating PSK31 audio with an Arduino.  In a previous post a few years ago, I implemented a similar approach using the Parallax Propeller processor.  I never took that implementation beyond a proof of concept, but hope that others will find this interesting and perhaps useful.  Some of this information is copied from this previous post for ease of understanding.

PSK31 is a digital communications mode which is intended for live keyboard-to-keyboard conversations, similar to radioteletype. The data rate is 31.25 baud (about 50 words per minute).  PSK31's ITU emission designator is 60H0J2B. It uses BPSK modulation without error correction.

Instead of traditional frequency-shift keying, the information is transmitted by patterns of polarity-reversals (sometimes called 180-degree phase shifts). One way to think about this would be to swap antenna terminals on each phase reversal.

The 31.25 baud data rate was chosen so that the system will handle hand-sent typed text easily. There is a problem with PSK keying, namely the effect of key-clicks.  If hard keying of phase reversals were done, the result would be a very broad emission.  The solution is to filter the output or to shape the envelope amplitude of each bit, which amounts to the same thing.

In PSK31 a cosine shape is used.  Phase reversals are done at the minimum amplitude points.  The spectrum during a continuous sequence of polarity reversals at 31.25 baud will consist of two pure tones at 15.625 Hz above and below the center frequency, with no splatter.  A binary 0 is represented by a phase reversal and a binary 1 by no phase reversal.
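The two-tone result follows from a product-to-sum identity.  As a sketch, assume an idle signal whose cosine envelope passes through zero once per bit, with T = 32 ms as the bit time and f_c as the audio center frequency (both symbols are introduced here just for illustration):

```latex
s(t) = \cos\!\left(\frac{\pi t}{T}\right)\cos\!\left(2\pi f_c t\right)
     = \tfrac{1}{2}\cos\!\left(2\pi\left(f_c + \tfrac{1}{2T}\right)t\right)
     + \tfrac{1}{2}\cos\!\left(2\pi\left(f_c - \tfrac{1}{2T}\right)t\right),
\qquad
\frac{1}{2T} = \frac{31.25\ \text{Hz}}{2} = 15.625\ \text{Hz}
```

The sign change of the envelope at each zero is the phase reversal, so a steady stream of reversals is nothing more than two equal tones spaced 31.25 Hz apart.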



The audio tone chosen is 1 kHz constructed from 32 eight-bit samples at full amplitude per cycle.  The period of a 1 kHz tone cycle is 1 millisecond.  Each of the 32 samples per cycle has a period of 31.25 microseconds.  The audio samples are eight-bit values from 0x00 to 0xFF with the zero crossing value at the midpoint 0x80.
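The timing figures can be checked with a few lines of arithmetic.  This is only a sketch for a PC compiler; the constant and function names are made up for the example and are not used elsewhere in this post.

```cpp
// Timing figures for the audio tone (all names here are illustrative).
constexpr double kToneHz          = 1000.0;  // audio tone frequency
constexpr int    kSamplesPerCycle = 32;      // table entries per tone cycle
constexpr double kBaud            = 31.25;   // PSK31 symbol rate

// Sample rate needed to play kSamplesPerCycle entries per tone cycle.
constexpr double sampleRateHz() { return kToneHz * kSamplesPerCycle; }

// Time between samples, in microseconds.
constexpr double samplePeriodUs() { return 1.0e6 / sampleRateHz(); }

// Number of samples in one PSK31 bit.
constexpr double samplesPerBit() { return sampleRateHz() / kBaud; }
```

So the target is 32,000 samples per second, one sample every 31.25 microseconds, and 1024 samples in each bit.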

The PSK31 character bit time is 32 ms constructed of 1024 samples.  A binary zero is represented by a phase reversal while a binary one is represented by the absence of a phase reversal.  Characters are encoded with a variable bit length code (varicode) in which the more frequently used characters get the shorter codes.  Characters are encoded with a bit pattern where there are no sequential zero bits.  Two zero bits in a row signify the end of a character.
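To make the varicode framing concrete, here is a minimal sketch of an encoder for a PC compiler.  It carries only four patterns (space, e, t and o); a real encoder needs the complete published varicode table, and the function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Varicode patterns for a few common characters only; a real encoder needs
// the complete published table. Returns "" for characters not listed here.
std::string varicodeFor(char c)
{
  switch (c) {
    case ' ': return "1";
    case 'e': return "11";
    case 't': return "101";
    case 'o': return "111";
    default:  return "";
  }
}

// Build the transmitted bit string: each character's pattern followed by the
// two zero bits that mark the end of the character.
std::string encodeVaricode(const std::string &text)
{
  std::string bits;
  for (char c : text) {
    const std::string code = varicodeFor(c);
    assert(!code.empty() && "character missing from this partial table");
    bits += code + "00";
  }
  return bits;
}
```

Because no pattern contains two zeros in a row, the receiver can split the stream back into characters at every "00" without any other framing.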

In order to implement the ramp up/down  scenarios at phase reversals, I have constructed a couple of tables of sinusoid information.  The first consists of 512 table entries defining the 16 cycles of ramping up from zero to full amplitude.  I have plotted the data in Excel as follows:


Once I get up to full amplitude, I use a separate 32-entry table of eight-bit samples consisting of a single sinusoid.  For a binary zero, I ramp down (512 samples) by processing the above table in reverse, reverse the phase and immediately ramp back up again (another 512 samples), this time processing the table in the forward direction.  For a binary one, I use the 32-sample table below and repeat it 32 times for 1024 total samples.



I considered using my software DDS example for generating PSK, but decided instead to do a very similar implementation that was not quite so general purpose in nature in order to gain some performance enhancements.  I certainly don't need 32 bits of frequency setting resolution when generating audio tones and I think I need more phase points per cycle of the tone generated.  The example below is not at odds with my DDS example, it is just a special case of the more generic implementation previously described.

So far, I just have it generating a sequence of PSK31 zero bits, which represents an idle state with no data to transmit.  I want to see how this signal looks before I get busy implementing a full PSK31 encoder.

Firstly, we need our sinusoidal ramp up information.  This table could be compressed down to about 16 bytes if you are willing to do more calculations during interrupt processing.  Only 1/4 of one cycle (90 degrees) of data needs to be stored at full amplitude values.  The upper two bits could indicate which 90 degrees we are generating (0-89 degrees, 90-179 degrees, 180-269 degrees and 270-359 degrees).  Furthermore, we could calculate any number of amplitude values between zero and full amplitude at the cost of more CPU time during interrupt processing.  For now, I am storing the entire table and just processing one table entry per interrupt.  I have one extra byte at the end that is technically not necessary representing the final phase point at the zero crossing.

// 16 cycles of 32 samples each (512 bytes) of ramp-up sinusoid information,
// plus one final sample at the closing zero crossing (513 bytes total)
char zero[] = { 0x80,0x80,0x80,0x80,0x81,0x81,0x82,0x82,
                0x83,0x83,0x83,0x83,0x83,0x82,0x82,0x81,
                0x7F,0x7E,0x7D,0x7B,0x7A,0x79,0x78,0x77,
                0x76,0x76,0x76,0x77,0x78,0x79,0x7B,0x7D,
                0x80,0x82,0x85,0x87,0x89,0x8B,0x8D,0x8E,
                0x8F,0x8F,0x8F,0x8D,0x8C,0x89,0x86,0x83,
                0x7F,0x7C,0x78,0x75,0x71,0x6E,0x6C,0x6B,
                0x6A,0x6A,0x6B,0x6C,0x6F,0x72,0x76,0x7B,
                0x80,0x84,0x89,0x8E,0x92,0x96,0x99,0x9A,
                0x9B,0x9B,0x9A,0x98,0x94,0x90,0x8B,0x85,
                0x7F,0x79,0x73,0x6E,0x69,0x64,0x61,0x5F,
                0x5E,0x5E,0x60,0x62,0x66,0x6C,0x72,0x78,
                0x80,0x87,0x8E,0x95,0x9B,0xA0,0xA4,0xA6,
                0xA7,0xA7,0xA5,0xA2,0x9D,0x97,0x90,0x88,
                0x7F,0x77,0x6F,0x67,0x60,0x5A,0x56,0x53,
                0x52,0x52,0x55,0x59,0x5E,0x65,0x6D,0x76,
                0x80,0x89,0x92,0x9B,0xA3,0xA9,0xAE,0xB2,
                0xB3,0xB2,0xB0,0xAB,0xA5,0x9D,0x94,0x8A,
                0x7F,0x75,0x6A,0x61,0x58,0x51,0x4B,0x48,
                0x46,0x47,0x4A,0x4F,0x56,0x5F,0x69,0x74,
                0x80,0x8B,0x97,0xA1,0xAB,0xB3,0xB9,0xBD,
                0xBE,0xBD,0xBA,0xB4,0xAD,0xA3,0x98,0x8C,
                0x7F,0x73,0x66,0x5B,0x50,0x48,0x41,0x3D,
                0x3C,0x3D,0x40,0x46,0x4F,0x59,0x65,0x72,
                0x80,0x8D,0x9B,0xA7,0xB2,0xBC,0xC2,0xC7,
                0xC9,0xC8,0xC4,0xBD,0xB4,0xA9,0x9C,0x8E,
                0x7F,0x71,0x62,0x55,0x49,0x3F,0x38,0x33,
                0x31,0x33,0x37,0x3E,0x47,0x53,0x61,0x70,
                0x80,0x8F,0x9F,0xAD,0xB9,0xC4,0xCC,0xD1,
                0xD2,0xD1,0xCD,0xC5,0xBB,0xAE,0xA0,0x90,
                0x7F,0x6F,0x5F,0x50,0x42,0x37,0x2F,0x2A,
                0x28,0x29,0x2E,0x36,0x41,0x4E,0x5D,0x6E,
                0x80,0x91,0xA2,0xB2,0xC0,0xCB,0xD4,0xD9,
                0xDB,0xDA,0xD5,0xCD,0xC1,0xB3,0xA3,0x92,
                0x7F,0x6D,0x5B,0x4B,0x3C,0x30,0x27,0x21,
                0x1F,0x21,0x26,0x2F,0x3B,0x49,0x5A,0x6C,
                0x80,0x93,0xA5,0xB6,0xC6,0xD2,0xDC,0xE1,
                0xE4,0xE2,0xDC,0xD3,0xC7,0xB8,0xA6,0x93,
                0x7F,0x6C,0x58,0x46,0x37,0x2A,0x20,0x1A,
                0x18,0x19,0x1F,0x29,0x35,0x45,0x57,0x6B,
                0x80,0x94,0xA8,0xBB,0xCB,0xD8,0xE2,0xE9,
                0xEB,0xE9,0xE3,0xD9,0xCC,0xBC,0xA9,0x95,
                0x7F,0x6A,0x56,0x43,0x32,0x24,0x1A,0x13,
                0x11,0x13,0x19,0x23,0x31,0x42,0x55,0x6A,
                0x80,0x95,0xAB,0xBE,0xCF,0xDD,0xE8,0xEF,
                0xF1,0xEF,0xE9,0xDE,0xD0,0xBF,0xAB,0x96,
                0x7F,0x69,0x53,0x3F,0x2E,0x1F,0x15,0x0E,
                0x0B,0x0D,0x14,0x1F,0x2D,0x3F,0x53,0x69,
                0x80,0x96,0xAD,0xC1,0xD3,0xE2,0xED,0xF4,
                0xF6,0xF4,0xED,0xE2,0xD4,0xC2,0xAD,0x97,
                0x7F,0x68,0x52,0x3D,0x2B,0x1C,0x10,0x09,
                0x07,0x09,0x10,0x1B,0x2A,0x3C,0x51,0x68,
                0x80,0x97,0xAE,0xC3,0xD6,0xE5,0xF0,0xF7,
                0xFA,0xF8,0xF1,0xE6,0xD6,0xC4,0xAF,0x98,
                0x7F,0x67,0x50,0x3B,0x28,0x19,0x0D,0x06,
                0x04,0x06,0x0D,0x18,0x28,0x3A,0x50,0x67,
                0x80,0x98,0xAF,0xC5,0xD8,0xE7,0xF3,0xFA,
                0xFD,0xFA,0xF3,0xE8,0xD8,0xC5,0xB0,0x98,
                0x7F,0x67,0x4F,0x3A,0x27,0x17,0x0B,0x04,
                0x01,0x04,0x0B,0x17,0x26,0x39,0x4F,0x67,
                0x80,0x98,0xB0,0xC6,0xD9,0xE9,0xF4,0xFC,
                0xFE,0xFC,0xF5,0xE9,0xD9,0xC6,0xB0,0x98,
                0x7F,0x67,0x4F,0x39,0x26,0x16,0x0A,0x03,
                0x01,0x03,0x0A,0x16,0x26,0x39,0x4F,0x67,
                0x80
              };
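A table of this shape can be regenerated on a PC rather than typed in.  The sketch below assumes a model of a 32-sample-per-cycle sine scaled by a quarter-sine envelope and offset to 0x80.  Treat that formula as an approximation and not the exact recipe behind the table above; the function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Regenerate one ramp-up table entry, n = 0..512 (hypothetical helper).
// Assumed model: a 32-sample-per-cycle sine, scaled by a quarter-sine
// envelope that rises from 0 to 1 over 512 samples, offset to 0x80.
uint8_t rampSample(int n)
{
  assert(n >= 0 && n <= 512);
  const double pi       = 3.14159265358979323846;
  const double envelope = std::sin(pi * n / 1024.0);      // 0 -> 1 over the ramp
  const double tone     = std::sin(2.0 * pi * n / 32.0);  // 32 samples per cycle
  return static_cast<uint8_t>(std::floor(128.0 + 127.0 * envelope * tone));
}
```

At the peaks and troughs this agrees with the table (for example 0x83 at entry 8 and 0xFE at entry 488).  Entries that sit exactly on a zero crossing can come out one count different depending on floating-point rounding.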

The last full cycle in this table (the 32 bytes just before the final zero-crossing sample) is a single sinusoid at full amplitude, so rather than create a separate table for this data, I will just index into the above array and use those 32 bytes.

#define one (&zero[15*32])
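The quarter-wave compression mentioned earlier can be sketched for the full-amplitude cycle as follows.  This is a PC-side illustration under my own assumptions: the nine-entry table holds rounded values of 127*sin(2*pi*n/32), and the names are hypothetical.  Because of rounding, the output can differ by one count from the stored table.

```cpp
#include <cassert>
#include <cstdint>

// One quarter cycle (0 to 90 degrees) of a full-amplitude sine, 9 entries:
// round(127 * sin(2*pi*n/32)) for n = 0..8.
static const int8_t kQuarter[9] = { 0, 25, 49, 71, 90, 106, 117, 125, 127 };

// Rebuild sample n (0..31) of a full cycle from the quarter table.
// The upper two bits of the 5-bit phase pick the quadrant, the lower
// three bits pick the entry within it.
uint8_t cycleSample(uint8_t n)
{
  assert(n < 32);
  const uint8_t quadrant = n >> 3;
  const uint8_t idx      = n & 7;
  int value;
  switch (quadrant) {
    case 0:  value =  kQuarter[idx];     break;  //   0..89 degrees
    case 1:  value =  kQuarter[8 - idx]; break;  //  90..179 degrees
    case 2:  value = -kQuarter[idx];     break;  // 180..269 degrees
    default: value = -kQuarter[8 - idx]; break;  // 270..359 degrees
  }
  return static_cast<uint8_t>(0x80 + value);  // 0x80 is the zero crossing
}
```

The cost is a shift, a mask and a branch per sample, which is exactly the kind of extra interrupt-time work that storing the whole table avoids.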

// Useful macros for setting and resetting bits
#define cbi(sfr, bit) (_SFR_BYTE(sfr) &= ~_BV(bit))
#define sbi(sfr, bit) (_SFR_BYTE(sfr) |= _BV(bit))

The following variables are marked as volatile as they are used in an interrupt service routine.  We keep track of which phase point (index into above table) we are processing and a count of the remaining phase points in the table.  When ramping down from full volume to zero, we process the phase table in reverse order.  Reversing the processing order has the effect of reversing the phase, so in order to be phase continuous for an entire PSK31 bit, we must reverse the phase by negating the table entry when ramping down.  Additionally, every time we cross through zero, we also have to reverse the phase.

volatile char *pbSine = &zero[0];  // Index into sinusoid table
volatile int   cbSine = 512;       // Length of  sinusoid table
volatile char  ix     = 1;         // Increment to get to next phase point
volatile char  phase  = 1;         // PSK31 phase reversal
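Negating a table entry works as a phase reversal because of the way eight-bit arithmetic wraps around.  Mirroring a sample about the 0x80 zero crossing means computing 256 minus the sample, and that is the same thing as negating it modulo 256.  A small PC-side sketch, with a hypothetical function name:

```cpp
#include <cstdint>

// Invert the phase of one offset-binary sample (zero crossing at 0x80).
// Mirroring about 0x80 is 2*0x80 - s = 256 - s, which is the same as
// negating s and keeping only the low eight bits.
constexpr uint8_t invertSample(uint8_t s)
{
  return static_cast<uint8_t>(-static_cast<int>(s));
}
```

For example 0x83 (three counts above the midpoint) becomes 0x7D (three counts below it).  The one value that does not mirror is 0x00, which maps to itself, but the table never goes that low.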

The following is not quite correct for PSK31, though it is close.  We need to process a phase point every 31.25 us, which is 32 kHz.  Currently I am running timer2 from the undivided system clock (16 MHz) in phase correct PWM mode, which counts up to 255 and back down again.  That is a divide by 510 (2 x 255), giving about 31.4 kHz or roughly 31.9 us for every phase point, so I am a little slow.  I will tend to this issue a little later.  Right now I am just trying to see if I can generate the waveforms.

// Setup timer2 with prescaler = 1, PWM mode to phase correct PWM
// See the ATMega datasheet for all the gory details
void timer2Setup()
{
  // Clock prescaler = 1
  sbi (TCCR2B, CS20);    // 001 = no prescaling
  cbi (TCCR2B, CS21);
  cbi (TCCR2B, CS22);

  // Phase Correct PWM
  cbi (TCCR2A, COM2A0);  // 10 = clear OC2A on compare match when up counting
  sbi (TCCR2A, COM2A1);  //      set OC2A on compare match when down counting

  // Mode 1
  sbi (TCCR2A, WGM20);   // 001 = Mode 1, phase correct PWM with a top value of 0xff
  cbi (TCCR2A, WGM21);   //       (Mode 5, 101, would use OCR2A as the top value instead)
  cbi (TCCR2B, WGM22);
}
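The datasheet gives the phase correct PWM period as two times the top value in timer clocks, so with a top of 0xFF the divisor is 510, not a round 512.  The sketch below just does that arithmetic on a PC; the function names are hypothetical.

```cpp
// Phase correct PWM counts up to TOP and back down again, so one PWM period
// (and one overflow interrupt) takes 2 * TOP timer clocks.
constexpr double phaseCorrectPwmHz(double clockHz, int prescaler, int top)
{
  return clockHz / (2.0 * prescaler * top);
}

// Period in microseconds for the same settings.
constexpr double phaseCorrectPeriodUs(double clockHz, int prescaler, int top)
{
  return 1.0e6 / phaseCorrectPwmHz(clockHz, prescaler, top);
}
```

With a 16 MHz clock and a top of 255 this gives about 31.37 kHz.  A top of 250 would give exactly 32 kHz, but on timer2 that means a mode where OCR2A holds the top value, so the samples would have to go out through OCR2B and be scaled to the range 0 to 250.  I have not tried that.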

Now for the interrupt service routine of the timer, I just fetch the next phase point out of the table and set it as the amplitude value.  If we are ramping down we negate the value to reverse the phase and when starting to ramp up again, we reverse the phase again.  When this is all figured out, I would likely rewrite all this in assembly.

// Timer 2 interrupt service routine (ISR).
ISR(TIMER2_OVF_vect)
{
  // Set current amplitude value for the sine wave being constructed 
  // taking care to invert the phase when processing the table in 
  // reverse order.
  OCR2A = *(pbSine) * ix * phase;
  pbSine += ix;

  // Pre-decrement so that exactly 512 entries are played in each direction
  // and the pointer never steps past zero[512].
  if (0 == --cbSine)
  {
    cbSine = 512;
    // When we get done ramping down, phase needs to change
    if (ix < 0) phase = -phase;
    ix = -ix;
  }
}
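The counter test in the ISR decides how many table entries are played in each direction.  Tracing just the index arithmetic on a PC, for both a post-decrement test (0 == count--) and a pre-decrement test (0 == --count) starting from 512, shows the difference.  This is a host-side sketch with hypothetical names, not code for the Arduino.

```cpp
#include <cassert>

// Result of walking the table index the way the ISR walks pbSine.
struct WalkResult {
  int minIndex;       // lowest table index read
  int maxIndex;       // highest table index read
  int samplesPerBit;  // interrupts for one ramp up plus one ramp down
};

// Simulate one ramp up followed by one ramp down.
// preDecrement selects "0 == --count" instead of "0 == count--".
WalkResult walkTable(bool preDecrement)
{
  int index = 0, count = 512, step = 1;
  int tick = 0, turnarounds = 0;
  WalkResult r = { 0, 0, 0 };
  while (turnarounds < 2) {
    assert(index >= 0);                          // never before the table start
    if (index < r.minIndex) r.minIndex = index;  // "read" the table entry
    if (index > r.maxIndex) r.maxIndex = index;
    index += step;
    ++tick;
    const bool wrapped = preDecrement ? (0 == --count) : (0 == count--);
    if (wrapped) {
      count = 512;
      step = -step;
      ++turnarounds;
    }
  }
  r.samplesPerBit = tick;
  return r;
}
```

With the post-decrement form the walk reads index 513, one past the end of the 513-entry table, and takes 1026 interrupts per bit.  With the pre-decrement form it stays within 0 to 512 and takes 1024.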

Setup is assuming an ATMega2560 board.  I use pin 10 for Timer 2 PWM output.  There is nothing (yet) to do in the main loop.  This code just generates a slightly too slow string of PSK31 zero bits.

void setup() 
{
  // PWM output for timer2 is pin 10 on the ATMega2560
  // If you use an ATMega328 (such as the UNO) you need to make this pin 11
  // See https://spreadsheets.google.com/pub?key=rtHw_R6eVL140KS9_G8GPkA&gid=0
  pinMode(10, OUTPUT);      // Timer 2 PWM output on the Mega2560 is pin 10

  // Set up timer2 as a phase correct PWM clock at roughly 32 kHz
  timer2Setup();

  sbi (TIMSK2,TOIE2);    // Enable the timer 2 overflow interrupt.
}

void loop() 
{

}

I have placed a simple RC low pass filter on the output, so the integration is not very good, but looking at this on the scope, we see the following:



This looks pretty good, but I have some concerns about what I see here.  One concern is at the point where the ramp up table is processed in reverse order with a phase change.  There is a bit of a glitch here, so I may need to clean up my phase points in this area slightly.



Additionally, there is a lot of noise as the amplitude is ramping down that I need to investigate as can be seen here.



This appears to just be an artifact of my simple RC filter integrator that I am using to remove high frequency components of the PWM output from the Arduino.
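A quick calculation shows why a single RC stage struggles here.  The sketch below uses the standard single-pole formulas; the function names are hypothetical, and the 1 kilohm and 100 nF values mentioned afterwards are placeholders for illustration, not measured values from this circuit.

```cpp
#include <cassert>
#include <cmath>

// Cutoff (-3 dB) frequency of a single-pole RC low pass filter, in Hz.
double rcCutoffHz(double ohms, double farads)
{
  assert(ohms > 0.0 && farads > 0.0);
  const double pi = 3.14159265358979323846;
  return 1.0 / (2.0 * pi * ohms * farads);
}

// Magnitude response of that filter at frequency hz (1.0 = no attenuation).
double rcGain(double hz, double cutoffHz)
{
  const double ratio = hz / cutoffHz;
  return 1.0 / std::sqrt(1.0 + ratio * ratio);
}
```

With 1 kilohm and 100 nF the cutoff is about 1.6 kHz.  The 1 kHz tone then gets through at roughly 85% of its amplitude, but the PWM carrier near 31 kHz is only knocked down to about 5%, so a fair amount of ripple is still riding on the output.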



So, I am a long way from generating PSK31 signals, but wanted to share the progress as I head in that direction.  As always, comments and questions are welcome by posting here or by emailing ko7m at arrl dot net.

Saturday, November 29, 2014

Debugging Interrupt Service Routines

I have been spending some time writing AVR Interrupt Service Routines of late and have decided to share some debugging techniques that I have found useful.

When writing ISRs, especially those used to service timer interrupts, as the timer period shortens, it becomes necessary to be especially prudent about the amount of code you place in the ISR.

Obviously, the ISR cannot execute longer than the timer period or timer interrupts will be missed.  When an ISR is entered, interrupts are disabled so that the ISR itself cannot be interrupted unless interrupts are explicitly re-enabled in the ISR itself.  This is something that should be done very carefully (if at all) after understanding all the ramifications.

It is a mistake to put Serial.print() code in your ISR code as such code takes a very long time to complete when interrupts are disabled.  It may appear to work for a while with very short output messages, but you are fooling yourself as to the usefulness of this output.

So, how does one debug ISR code?  I find it very useful to use an oscilloscope connected to a digital pin and to toggle that pin when events of interest are seen.  For example I might set the pin high when entering the ISR and low when exiting the ISR.  Along with allowing me to see that my interrupt service routine is in fact being called, I can see how often it is called by measuring the time between pin transitions from low to high.  Additionally, I can see how long the ISR itself is taking to run by measuring the time between the pin going high and it going low again.

By way of example, I have set up timer2 on an ATMega2560 to interrupt every 32 microseconds. 

// Setup timer2 with prescaler = 1, PWM mode to phase correct PWM
// Timer will count to maximum and then back down again (effectively a divide by
// 510 since timer2 is an 8 bit counter).  This will generate an interrupt roughly
// every 32 us.  16000000 / 510 = 31372 Hz, or about 31.9 us (.0000319 seconds)
// See the ATMega datasheet for all the gory details
void timer2Setup()
{
  TIMSK2 = 0;
  TCCR2A = _BV(COM2A1) | _BV(COM2B1) | _BV(WGM20);
  TCCR2B = _BV(CS20);
}

And then enable timer2 when I am ready for the timer to start.

  // Useful macros for setting and resetting bits
  #define cbi(sfr, bit) (_SFR_BYTE(sfr) &= ~_BV(bit))
  #define sbi(sfr, bit) (_SFR_BYTE(sfr) |= _BV(bit))

  sbi (TIMSK2, TOIE2);      // Enable timer 2.

To handle the ISR, I have created a trivial handler that merely sets digital pin 8 high and then sets it back low.

  // Timer 2 interrupt service routine (ISR).
  ISR(TIMER2_OVF_vect)
  {
    digitalWrite(8, HIGH);
    digitalWrite(8, LOW);
  }

Now, looking at this on the scope, we see a positive going pulse on pin 8 every 32us indicating that the ISR is being entered at that rate.


But another very interesting fact is revealed by this scope trace.  The amount of time it takes the ISR to simply toggle a pin high and then low is an amazing 7.6us.  This is a freaking eternity in the relative scheme of things.  I have only 32us in which to run all of my ISR code, and just toggling a pin high then low takes about 24% of the available time.



If your ISR code is very short and sweet, this may be a non-issue.  However, everything you can do to shorten ISR code length should be done because the ISR runs with interrupts disabled.  Nothing else can run until you return from your ISR.

A silly example would be an ISR that runs every 32us and takes nearly 100% of the available 32us to complete.  This would mean that precious little time would be available for your main loop to run in your Arduino sketch.  In this particular example, there is no code in the main loop, so it is of no consequence.  However, this is not the norm.  As a general rule you want to minimize the time spent with interrupts disabled, so some time spent on optimizing your ISR code can pay big dividends in overall performance.  Any section of your code can be timed quite accurately using this method, which allows experimenting with different techniques to speed up your code, or at least understanding the effect of any code change on performance.

So, what can we do to optimize this simple two line ISR?  Ditch the use of digitalWrite and manipulate the port directly.  The beauty of digitalWrite is that it allows you to abstract the notion of "digital pin 8" from the particular hardware you are running on.  If you are running on an Arduino UNO (or any ATMega328 device) then digital pin 8 is PORTB bit 0 (PB0).  However, if you are running on an ATMega2560 as I am, then this same digital pin 8 is PORTH bit 5.  The digitalWrite implementation hides all of this mess from you very nicely, but at the expense of code speed.

So, let's take the knowledge we have of what processor we are running on and what port and bit the digital pin is assigned to and optimize this code a little.  I am running on an ATMega2560, so digital pin 8 is PH5.  Modifying the code to manipulate the desired bit directly means the code is hardware specific, but we gain a significant speed improvement in the code.  This code is for the ATMega2560.

  // Timer 2 interrupt service routine (ISR).
  ISR(TIMER2_OVF_vect)
  {
    sbi(PORTH,5);
    cbi(PORTH,5);
  }

Now looking at the scope, we see that the setting and resetting of the pin only takes 320ns to execute, a significant savings.
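The two scope measurements are easier to compare once they are turned into CPU cycles and a share of the interrupt period.  A minimal sketch of that arithmetic for a PC compiler, with hypothetical function names:

```cpp
// Share of the timer period used by the ISR, as a percentage.
constexpr double isrLoadPercent(double isrUs, double periodUs)
{
  return 100.0 * isrUs / periodUs;
}

// Approximate CPU clock cycles consumed in a measured time span.
constexpr double cyclesFor(double spanUs, double clockMHz)
{
  return spanUs * clockMHz;
}
```

At 16 MHz the digitalWrite version costs about 122 cycles and nearly 24% of a 32us period, while the direct port version costs about 5 cycles and 1% of the period.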



If running on an ATMega328 such as found in the UNO, change the ISR as follows:

  // Timer 2 interrupt service routine (ISR).
  ISR(TIMER2_OVF_vect)
  {
    sbi(PORTB,0);
    cbi(PORTB,0);
  }
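If you move between boards often, the port and bit can be picked at compile time so the ISR body stays the same.  This is a sketch and not something I have run on hardware in this form.  The DEBUG_PORT and DEBUG_BIT names and the two helper functions are made up for the example, and the last branch is only a stand-in so the same file can be compiled and checked on a PC.

```cpp
#include <stdint.h>

#if defined(__AVR_ATmega2560__)
  #include <avr/io.h>
  #define DEBUG_PORT PORTH      // digital pin 8 on the Mega2560 is PH5
  #define DEBUG_BIT  5
#elif defined(__AVR_ATmega328P__)
  #include <avr/io.h>
  #define DEBUG_PORT PORTB      // digital pin 8 on the UNO is PB0
  #define DEBUG_BIT  0
#else
  // Stand-in so the sketch also compiles on a PC (hypothetical, for testing).
  #include <assert.h>           // only needed by host-side checks
  static volatile uint8_t hostFakePort = 0;
  #define DEBUG_PORT hostFakePort
  #define DEBUG_BIT  0
#endif

// Drive the debug pin high.
static inline void debugPinHigh()
{
  DEBUG_PORT = static_cast<uint8_t>(DEBUG_PORT | (1u << DEBUG_BIT));
}

// Drive the debug pin low.
static inline void debugPinLow()
{
  DEBUG_PORT = static_cast<uint8_t>(DEBUG_PORT & ~(1u << DEBUG_BIT));
}
```

The ISR then just calls debugPinHigh() on entry and debugPinLow() on exit.  The pin still has to be set as an output in setup(), for example with pinMode(8, OUTPUT).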

Hopefully you will find this technique useful in your general debugging as well as when debugging ISR code in particular.  As always if you need assistance, drop me a line at ko7m at arrl dot net and I will be happy to assist in any way I can.

Wednesday, November 12, 2014

VNA fixture PCB available from OshPark.com

Due to a number of requests to share my VNA fixture design, I have made the boards available at OshPark.com using the following link.  Boards can be ordered directly at that link in multiples of three very inexpensively (about $11).  Drop me a note at ko7m at arrl dot net should you encounter any difficulty in obtaining them.




The tactile switches used on the board are available at mouser.com at this link.  As always, I would be happy to help you succeed in implementing this board should you so desire.

Tuesday, November 11, 2014

Characterizing Crystals

Still learning how to use the VNA.  I grabbed a handful of crystals, clipped them to the VNA fixture described in my previous post, and collected the following information:



I am not sure that I am doing things correctly, and certainly am not yet in a position to interpret the results seen above.  More study and experimentation will be required, but it looks like I will certainly have the necessary information to design filters.



More to come...

Monday, November 10, 2014

VNA fixture

In a previous blog post I mentioned purchasing a batch of 20 MHz crystals from China with the intent of building some crystal filters.  I am facing the need to be able to characterize my new collection of eBay crystals.  I have looked at many of the suggested techniques and out of them all, I like the idea of using a vector network analyzer for the task.

A few years ago, I purchased one of these units from SDR-Kits.



I purchased this at a time when support on Windows platforms for the necessary drivers was quite sub-optimal and as a result this device has not seen much use over the years.  However, I have dug it out, dusted it off and upgraded the firmware and USB drivers, which are now nicely supported on all versions of Windows.

Now, I need a nice fixture that can be used to characterize a batch of Crystals. Leveraging the fine work of Eldon (WA0UWH) I have built a fixture that can be used with this VNA for the task at hand.


The idea behind this is to provide a couple of places where components can be connected to the fixture and swept with the VNA to be able to visualize their characteristics over a wide frequency range.

The 8 pin header is provided with three ground attach points, one signal attach point and two pins on each side that connect together.  The idea is to be able to create simple networks of discrete components and be able to sweep them or to just sweep individual components.



The buttons provide the ability to short or terminate the VNA with a 50 ohm load.  As part of the calibration process when the fixture is attached, you want to remove the effects of the fixture itself from the measurements, so the VNA is calibrated with short, through, open and load conditions before attaching components to be measured.

The two resistors are 100 ohm 0805 units that are connected in parallel across the SMA connector when the nearest button is depressed, providing a 50 ohm load.  The bottom button shorts the SMA connector when depressed.

The bare copper area at the bottom of the board allows for placing surface mount components across the gap.  These can be held in place with a non-conductive clothes pin or a conductive clip as appropriate.  When measuring surface mount crystals for example, a conductive clip will allow grounding of the crystal case as the back side of the board is an open ground plane while holding the connectors firmly against the top side of the board.



Hopefully this little fixture will help expand the usefulness of the VNA in my projects.  I will share my experiences using this fixture to characterize crystals in a separate post.

Friday, November 7, 2014

At the tone...

Fun with Arduinos and scopes in XY display mode...  Looks good on good old analogue and high end digital scopes.  Not so much on my $400 digital model.


Dual channel 8 bit DACs and a bunch of table look-up data, and a cheap RTC if you want accurate time.



Commercial versions of hardware able to produce such a display are available.  

Thursday, November 6, 2014

New Minima-like build

My good friend Wayne NB6M has kindly loaned me his Minima-like build using my controller shield for the Arduino.  His front panel is very similar to his original Minima build, but is now sporting a 20x4 display.  He has removed the reset button from the front panel and added input for paddles in anticipation of me actually finishing the integration of my keyer code to the Minima code base.



Looking at the back of the panel, we can see the Arduino Uno and my controller shield mounted on the back of the display board.  Wayne has used #12 bare copper wire soldered to the front panel to provide attach points for the Uno and shield.  My shield will be modified to provide through-hole plating and solder pads so that it can be soldered in place.

The display board has been converted to i2c with a backpack board and the rotary encoder uses pins freed up by the display being converted to i2c.  The current shield design does not incorporate the proposed pins for the encoder from the discussion list, but will be modified in the final run to be compliant.


Wayne is using a pretty conventional IF strip from the Minima, but has chosen to replace the KISS mixer and BFO mixer with ADE-1 devices.  His audio section is from a pre-existing project re-purposed for this project.  The two SMA connectors connect to the VFO and BFO Si570 outputs from my controller shield.  No low pass filter sections yet.  The current configuration makes a pretty nice general coverage receiver.


I have handed off Wayne's other Minima build to Eldon so he will have a working radio to test with during his software development efforts.