Execution times & speeding up programs

 
andyd



Joined: 08 Mar 2007
Posts: 30

Execution times & speeding up programs
Posted: Sun Apr 08, 2007 10:57 am

I've posted a few times about something I'm trying to do, but I'm still having problems with it. I'm trying to make a PIC16F88 decode ADPCM audio and output it via the PWM into an LPF and then an amp & speaker, but I'm having some serious issues getting it to produce anything intelligible, which I think is down to how long the process takes.

The sample rate of my audio is 8kHz, so in 1 second the PIC will need to decode and output 8000 samples, meaning 1 sample every 125us. Right?

Well when I set the decode/output loop to run 8000 times (I have a 1 second long sample), it takes about 4 or 5 seconds.

Is there any way of finding out the number of instruction cycles the routine takes up and then relating this to clock speed? I'm trying to run the PIC on its 8MHz internal oscillator to reduce the number of external components needed (as well as power etc.). I've got a feeling that what's taking up the time is the EEPROM read, and ideally I'd like to have the PIC doing this while it's doing other parts of the routine, but I'm having enough trouble getting my head round single threaded apps!

Code is below, any suggestions are welcome:

Decode routine call (wait is triggered by an interrupt on timer2):
Code:

   prevsample = 0;   // Clear ADPCM previous sample
      previndex = 0;   // Clear ADPCM previous index
      index = 0;   
      diffq = 0;
      step = StepSizeTable[index];   

      address_upper = 0b00000000;   
      address_lower = 0b00000001;
   
      fputs("Press button to start playback.", PC);
      
      while(input(PIN_B3)) // Wait for button press
      {
   
      }
      
      for(i=0;i<8000>>4) & 0x0f);
      
         Write10bitPWM(sample);
         while(wait)
         {

         }
         
         wait = 1;
         sample = ADPCMDecoder(code & 0x0f);
      
         Write10bitPWM(sample);
         while(wait)
         {

         }
      }         


ADPCM decode routine:
Code:

signed long ADPCMDecoder(char code) // ADPCM decoding routine
{

   /* Restore previous values of predicted sample and quantizer step
      size index
   */
   
   predsample = prevsample;
    index = previndex;


   /* Find quantizer step size from lookup table using index
   */
   step = StepSizeTable[index];


   /* Inverse quantize the ADPCM code into a difference using the
      quantizer step size
   */
   diffq = step >> 3;
   if(code & 4) diffq += step;
   if(code & 2) diffq += step>>1;
   if(code & 1) diffq += step>>2;

   /* Add the difference to the predicted sample
   */
   if( code & 8 ) predsample -= diffq;
   else predsample += diffq;


   /* Check for overflow of the new predicted sample */

   if(predsample > 32767)
      predsample = 32767;
   else if(predsample < -32768)
      predsample = -32768;

   /* Find new quantizer step size by adding the old index and a
      table lookup using the ADPCM code
   */
   index += IndexTable[code ];

   /* Check for overflow of the new quantizer step size index
   */

   if( index < 0 ) index = 0;
   if( index > 88 ) index = 88;
   

   /* Save predicted sample and quantizer step size index for next
      iteration
   */
   prevsample = predsample;
    previndex = index;

   /* Return the new speech sample */
   return(predsample);
}


EEPROM read:
Code:

unsigned char eeprom_read(unsigned char address_upper, unsigned char address_lower)
{
   unsigned char temp_byte;
   
   i2c_start();               // Start communication
   i2c_write(0xA0);                // Send control code & address of EEPROM then set to write mode
   i2c_write(address_upper);       // Write upper address bits
   i2c_write(address_lower);       // Write lower address bits
   
   i2c_start();               // Start communication
   i2c_write(0xA1);                // Send control code & address of EEPROM then set to read mode
   temp_byte = i2c_read(0);        // Read data without acknowledge bit
   
   i2c_stop();                     // Stop communication
   
   return temp_byte;
}


PWM output:
Code:

void Write10bitPWM(signed long sample)
{
   unsigned long pwmout = 0;

   pwmout = 0x8000 + sample;   // Offset around 0x8000

   pwmout = pwmout >> 7;      // Scale to 9 bit by shifting right 7 bits


   bit_clear(CCP1CON.5);
   if(pwmout & 0b000000010) bit_set(CCP1CON.5); // Set second most LSB

   bit_clear(CCP1CON.4);
   if(pwmout & 0b000000001) bit_set(CCP1CON.4); // Set most LSB

   pwmout = pwmout >> 2;      // Scale to 7 bit by shifting right 2 bits

   CCPR1L = pwmout;         // Write resulting 7 bits to CCPR1L
}


Interrupt for "wait":
Code:
#INT_GLOBAL
void timer_isr()
{
   #asm
   //Store current state of processor
   MOVWF save_w
   SWAPF status,W
   BCF   status,5
   BCF   status,6
   MOVWF save_status
   // Nothing else changes in your interrupt
   #endasm
   wait = 0;
   clear_interrupt(INT_TIMER2); 
   #asm
   // restore processor and return from interrupt
   SWAPF save_status,W
   MOVWF status
   SWAPF save_w,F
   SWAPF save_w,W
   #endasm
}
PCM programmer



Joined: 06 Sep 2003
Posts: 21708

Posted: Sun Apr 08, 2007 1:31 pm

Quote:

I've got a feeling that what's taking up the time is the EEPROM read

Then use "page mode" instead of reading individual bytes.
http://www.ccsinfo.com/forum/viewtopic.php?t=17036&highlight=read_eeprom_block
ckielstra



Joined: 18 Mar 2004
Posts: 3680
Location: The Netherlands

Posted: Sun Apr 08, 2007 5:09 pm

Some more hints:

Code:
      for(i=0;i<8000>>4) & 0x0f);
1) When posting code, please select the 'Disable HTML in this post' option. Parts of your code are now missing, which makes it unreadable. It is best to set this option as the default in your personal profile.

2) As you already suspected, the EEPROM read routine is a problem. You are sending 4 bytes and receiving 1 byte. Including all control bits, this adds up to 48 bits per read. You didn't say which EEPROM you are using or at what speed the I2C bus is clocked, but assuming a 400 kHz clock you can do at most about 8,333 readings per second (excluding all timing overhead).

Is the sound you are trying to produce stored in the EEPROM? If so, then you are reading the EEPROM at sequential addresses, and PCM programmer's suggestion will give a real performance boost.
With consecutive reads, the data transmitted over I2C is reduced to 9 bits per byte read, a theoretical maximum of 44,444 readings per second.

3) Another (relatively small) optimization is to get rid of the overhead of the interrupt function. You are already using a highly optimized interrupt function, but its only purpose is accurate time synchronisation, i.e. you are waiting for Timer2 to expire. With a slightly different approach you can optimize the interrupt handler away.

Every time a timer overflows, it sets the corresponding Peripheral Interrupt Request (PIR) flag, regardless of the corresponding interrupt enable mask bit. Using this knowledge, you can keep the Timer2 interrupt disabled and still have your main loop test the timer's PIR flag:
Code:
#byte PIR1 = 0x0C
#bit TMR2IF = PIR1.1
void main()
{
  setup_timer2(...);
  disable_interrupts(INT_TIMER2);  // Note the _disabling_ of the interrupt.
  clear_interrupt(INT_TIMER2);

  ...

  for (...)
  {
    Write10bitPWM(sample);
    while (TMR2IF == 0)
    {}; // Wait until Timer2 overflows
    TMR2IF = 0;   // Reset Timer2 overflow flag. Alternatively use clear_interrupt(INT_TIMER2);

   ...

  }
}

// Note that there is no interrupt handler function anymore
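
As a rough check on the I2C figures in hint 2, the arithmetic can be written out in plain C. This is only a back-of-envelope sketch: the 400 kHz clock is the assumption made in the post, `max_reads_per_second` is a made-up helper name, and all software overhead is ignored. Dividing 400 kHz by 48 bits gives about 8,333 byte reads per second, and dividing by 9 bits gives about 44,444.

```c
#include <stdint.h>

/* Bits on the wire for one random-address read of a two-byte-address EEPROM:
   START, control byte (write), upper address, lower address, repeated START,
   control byte (read), one data byte, STOP.
   Every byte costs 8 data bits plus 1 ACK/NACK bit. */
#define RANDOM_READ_BITS     (1 + 9 + 9 + 9 + 1 + 9 + 9 + 1)   /* = 48 */

/* Once a sequential read is under way, each further byte costs 9 bits. */
#define SEQUENTIAL_READ_BITS 9

/* Upper bound on byte reads per second for a given SCL clock. */
uint32_t max_reads_per_second(uint32_t scl_hz, uint32_t bits_per_read)
{
    return scl_hz / bits_per_read;
}
```

Since the main loop decodes two 4-bit ADPCM codes from every byte, 8000 samples per second need 4000 byte reads per second, so both figures are above the requirement on paper. The margin with random reads is small once the code around the I2C calls is counted.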
andyd



Joined: 08 Mar 2007
Posts: 30

Posted: Mon Apr 09, 2007 7:00 am

Apologies for the HTML thing, completely forgot!

Here's the main routine again:
Code:

   prevsample = 0;   // Clear ADPCM previous sample
      previndex = 0;   // Clear ADPCM previous index
      index = 0;   
      diffq = 0;
      step = StepSizeTable[index];   

      address_upper = 0b00000000;   
      address_lower = 0b00000001;
   
      fputs("Press button to start playback.", PC);
      
      while(input(PIN_B3)) // Wait for button press
      {
   
      }
      
      for(i=0; i<8000; i++)
      {
         wait = 1;
         code = eeprom_read(address_upper, address_lower);

         address_lower++;      // Add 1 to lower address byte
         if(address_lower == 0x00) address_upper++;   // If lower address = 0, add 1 to upper address byte

         sample = ADPCMDecoder((code>>4) & 0x0f); // Decode upper half of byte

         Write10bitPWM(sample);
         while(wait)
         {

         }
         
         wait = 1;
         sample = ADPCMDecoder(code & 0x0f); // Decode lower half of byte

      
         Write10bitPWM(sample);
         while(wait)
         {

         }
      }         


The EEPROM is a Microchip 24AA1025, I2C setup line is:

Code:
#use I2C(Master, sda = PIN_B1, scl = PIN_B4, FAST)


I'm aware that the EEPROM has a page read feature, but if I read a whole page at a time, do I not need a large array to store it in? My PIC doesn't have a huge amount of RAM and the compiler normally throws a wobbly when I declare an array of anything in the region of about 100 bytes...
PCM programmer



Joined: 06 Sep 2003
Posts: 21708

Posted: Mon Apr 09, 2007 1:25 pm

Quote:

My PIC doesn't have a huge amount of RAM and the compiler normally
throws a wobbly when I declare an array of anything in the region of
about 100 bytes...

Here are some threads with sample code for accessing arrays which are
larger than one RAM bank on the 16F PICs. I'm sure there are even
more examples in the archives. I just didn't find them all.

http://www.ccsinfo.com/forum/viewtopic.php?t=20955&highlight=writebigarray

http://www.ccsinfo.com/forum/viewtopic.php?t=21776&highlight=array+banks

http://www.ccsinfo.com/forum/viewtopic.php?t=5598&highlight=array+banks
ckielstra



Joined: 18 Mar 2004
Posts: 3680
Location: The Netherlands

Posted: Mon Apr 09, 2007 3:00 pm

Quote:
I'm aware that the EEPROM has a page read feature, but if I read a whole page at a time, do I not need a large array to store it in?
The EEPROM has a page write and a sequential read feature. For the sequential read it is not required to have a large buffer in the PIC, you can just sequentially read a byte at a time when you need it.
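
A minimal sketch of that byte-at-a-time sequential read, assuming the same control bytes (0xA0 / 0xA1) and two-byte addressing as the eeprom_read() routine posted earlier. The function names `eeprom_seq_begin`, `eeprom_seq_next` and `eeprom_seq_end` are hypothetical. So that the sketch can be compiled and tried on a PC, the I2C calls are replaced by small stand-ins working on a fake memory array; on the PIC those stand-ins would be dropped and the CCS built-ins used instead.

```c
#include <stdint.h>

/* Host-side stand-ins for the CCS I2C built-ins (assumption: on the PIC
   these are provided by the compiler through #use i2c). */
static uint8_t  fake_eeprom[16] = {0x12, 0x34, 0x56, 0x78, 0x9A};
static uint16_t ee_ptr;      /* models the EEPROM's internal address counter */
static uint8_t  addr_phase;  /* 0 = expect control byte, 1/2 = address bytes */
static uint32_t bus_bits;    /* bits clocked so far, for comparing schemes   */

static void i2c_start(void) { bus_bits += 1; addr_phase = 0; }
static void i2c_stop(void)  { bus_bits += 1; }

static void i2c_write(uint8_t b)
{
    bus_bits += 9;                          /* 8 data bits + ACK */
    if (addr_phase == 0) {
        addr_phase = (b & 1) ? 3 : 1;       /* read mode: no address follows */
    } else if (addr_phase == 1) {
        ee_ptr = (uint16_t)((uint16_t)b << 8);
        addr_phase = 2;
    } else if (addr_phase == 2) {
        ee_ptr |= b;
        addr_phase = 3;
    }
}

static uint8_t i2c_read(uint8_t ack)
{
    (void)ack;                              /* the fake device ignores ACK/NACK */
    bus_bits += 9;
    return fake_eeprom[ee_ptr++ % sizeof fake_eeprom];
}

/* Send the start address once and leave the EEPROM in read mode. */
void eeprom_seq_begin(uint8_t address_upper, uint8_t address_lower)
{
    i2c_start();
    i2c_write(0xA0);             /* control code, write mode */
    i2c_write(address_upper);
    i2c_write(address_lower);
    i2c_start();                 /* repeated start */
    i2c_write(0xA1);             /* control code, read mode */
}

/* Fetch the next byte; the ACK tells the EEPROM to keep going. */
uint8_t eeprom_seq_next(void)
{
    return i2c_read(1);
}

/* Read one last byte without ACK and release the bus. */
uint8_t eeprom_seq_end(void)
{
    uint8_t last = i2c_read(0);
    i2c_stop();
    return last;
}
```

In the playback loop, eeprom_seq_begin() would be called once before the loop, eeprom_seq_next() once per pair of samples, and eeprom_seq_end() for the final byte, so no buffer larger than one byte is needed.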
andyd



Joined: 08 Mar 2007
Posts: 30

Posted: Sun Apr 15, 2007 10:02 am

Ah, ok. Well I've now implemented a sequential read and am now only outputting 8 bits to the PWM (removes a couple of extra bitshifts and AND functions), but it's still not quite fast enough. If I give it a file which contains a 1 kHz sine wave and look at the output of the filter on a scope, I see a 615-ish Hz sine.

Any other ideas on making it faster? I did think about pre-buffering part of the compressed audio (as I assume reading from the PIC's RAM will be faster than an external EEPROM), but the files are in the order of a couple of kB each, which is too big for the PIC's RAM, so I'd still have to be reading data from the EEPROM into the buffer while the decode routine was happening and so don't think it'd help?

Any other suggestions on making it faster or am I just stuck with slow audio unless I use a faster clock frequency?
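
The 615 Hz reading can be turned into an estimate of how long each sample really takes. This is a rough sketch that assumes the test file is a clean 1 kHz tone recorded at 8 kHz and that the scope reading is accurate. The helper names are made up for illustration.

```c
/* A tone recorded at tone_hz but heard at measured_hz means the samples are
   being output at the nominal rate scaled by the same ratio. */
double effective_sample_rate_hz(double nominal_rate_hz,
                                double tone_hz,
                                double measured_hz)
{
    return nominal_rate_hz * measured_hz / tone_hz;
}

/* Time spent on each sample, in microseconds, at a given output rate. */
double sample_period_us(double rate_hz)
{
    return 1.0e6 / rate_hz;
}
```

With these numbers the effective rate is 8000 * 615 / 1000 = 4920 samples per second, which is a little over 203 us per sample against the 125 us target.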
Hans Wedemeyer



Joined: 15 Sep 2003
Posts: 226

Inline this
Posted: Sun Apr 15, 2007 11:36 am

If you have code space then inline

Write10bitPWM(sample);

This avoids pushing and popping the stack.

I have similar code that ticks at 15 us, but running at 40 MHz on a PIC18!

Not only will the PIC18 give you a faster clock, but many of the chips also have more RAM if you need it.

You may find a pin-for-pin compatible PIC18 to replace the PIC16.
andyd



Joined: 08 Mar 2007
Posts: 30

Posted: Sun Apr 15, 2007 11:46 am

Could you explain what you mean by inline? :)
ckielstra



Joined: 18 Mar 2004
Posts: 3680
Location: The Netherlands

Posted: Sun Apr 15, 2007 2:30 pm

andyd wrote:
Could you explain what you mean by inline? :)
Check your C manual for the #inline directive. When compiling, the compiler often has to choose between optimizing for speed and optimizing for code size; with the #inline directive you tell the compiler that speed is what matters for the indicated function. An inline function can execute faster because its code is placed directly at every location where it is called, which saves storing variables on the stack, a goto and a return instruction. The disadvantage is the increased code space, as the function has to be copied 'in line' at every location where it is called.
The opposite of #inline is the #separate directive, this is the default compiler setting.
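
As an illustration only, here is the same idea in standard C, where the request is spelled with the `inline` keyword rather than the CCS #inline directive. The function `write_8bit_pwm` and the `duty_register` variable are hypothetical stand-ins (on the PIC the write would go to CCPR1L), and the 8-bit scaling is an assumption based on the 8-bit output mentioned earlier in the thread.

```c
#include <stdint.h>

/* Stand-in for the PWM duty cycle register so the sketch builds on a PC. */
static volatile uint8_t duty_register;

/* In CCS C, the line above this function would carry the #inline directive.
   Standard C uses the inline keyword to make the same request. */
static inline void write_8bit_pwm(int16_t sample)
{
    /* Shift the signed sample up to an unsigned range centred on 0x8000,
       then keep the top 8 bits as the duty cycle. */
    uint16_t offset = (uint16_t)((uint16_t)sample + 0x8000u);
    duty_register = (uint8_t)(offset >> 8);
}
```

Whether this actually saves time depends on the compiler and on how often the function is called, so it is worth comparing the cycle counts with and without the directive.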

Quote:
Any other suggestions on making it faster or am I just stuck with slow audio unless I use a faster clock frequency?
In your main loop you are setting wait=1 twice, this is dangerous as this variable might already have been cleared in the interrupt routine.

Decoding 8000 samples per second with a PIC16 running at 8 MHz leaves you with only 250 instruction cycles per sample. This is tight but should be possible.
Giving general advice is like shooting from the hip. It is much better to take some measurements on your code so that you _know_ where the problems are. For example, in MPLAB you can run your program in the simulator and use MPLAB's stopwatch function to measure the time used by each function. I don't think it is possible to simulate the EEPROM, but you can use an oscilloscope to measure the real hardware and then replace eeprom_read() with a stub function containing an equal delay_us().
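
The 250-cycle budget, and the conversion from a stopwatch cycle count back to time, can be sketched in a few lines of plain C. The helper names are made up for illustration. The one hardware fact relied on is that a PIC16 executes one instruction cycle for every four oscillator clocks.

```c
#include <stdint.h>

/* One instruction cycle takes four oscillator clocks on a PIC16. */
uint32_t instruction_cycles_per_sample(uint32_t fosc_hz, uint32_t sample_rate_hz)
{
    return (fosc_hz / 4u) / sample_rate_hz;
}

/* Convert a measured instruction cycle count into microseconds. */
uint32_t cycles_to_us(uint32_t cycles, uint32_t fosc_hz)
{
    return (uint32_t)(((uint64_t)cycles * 4u * 1000000u) / fosc_hz);
}
```

At 8 MHz and 8000 samples per second this gives 250 cycles, or 125 us, per sample. Everything in the loop (EEPROM read, decode and PWM write) has to fit inside that budget.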
All times are GMT - 6 Hours