Multiple clock sources and clock switching
gaugeguy



Joined: 05 Apr 2011
Posts: 303

Posted: Tue May 14, 2019 3:14 pm

Adding a decimal in any position can be done by making a string of characters from the packed BCD with the '.' inserted as needed between digits.
Each step is very quick, and by dividing the work into several steps you avoid printf trying to do all of the conversion and formatting at once.
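
As a rough illustration of the idea, here is a minimal sketch in portable C (not CCS-specific, and not code from this thread). The name bcd4_to_str and its parameters are made up for the example; it assumes a four-digit packed BCD value held in 16 bits.

```c
#include <stdint.h>

/* Write the four digits of a packed-BCD value into buf, placing a '.'
 * `decimals` digits from the right (0 = no point).
 * buf must hold at least 6 bytes. Hypothetical helper for illustration. */
void bcd4_to_str(uint16_t bcd, uint8_t decimals, char *buf)
{
    uint8_t digit;

    for (digit = 4; digit > 0; --digit) {
        if (digit == decimals)
            *buf++ = '.';
        *buf++ = (char)('0' + ((bcd >> 12) & 0x0F)); /* top nibble first */
        bcd <<= 4;                                   /* fixed-count shift */
    }
    *buf = '\0';
}
```

Calling bcd4_to_str(0x1234, 2, buf) leaves "12.34" in buf. Each pass only masks, adds and shifts by a constant, so no division or printf formatting is involved.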
blowtorch



Joined: 11 Jun 2013
Posts: 35
Location: Cape Town

Posted: Tue May 14, 2019 5:34 pm

So, results are very encouraging. Thank you for the idea of using BCD...

I wrote some test code in the sim, then ported into my main code. This involved quite a few changes, so more than likely I have screwed up something somewhere, it's 01:23AM here.

Initial results show a big improvement. I have some ideas to optimise it further, specifically for the GLCD I am using: write a dedicated "number" display function that converts to BCD and then uses each BCD digit as an index into a "number only" lookup table for the 5x7 font. The cost is only 5 x 10 bytes, and I will be able to get rid of printf entirely...
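
A minimal sketch of how such a digit-only lookup could work, in portable C. Everything here is an assumption for illustration: the table name digit_font, the function bcd4_to_columns, the column byte values (taken from a commonly used 5x7 layout, bit 0 = top row) and the choice to write into a RAM buffer rather than straight to the display.

```c
#include <stdint.h>

/* Column data for '0'..'9', five bytes per digit: the 5 x 10 byte table.
 * Illustrative values only; substitute the font your display code uses. */
static const uint8_t digit_font[10][5] = {
    {0x3E, 0x51, 0x49, 0x45, 0x3E}, /* 0 */
    {0x00, 0x42, 0x7F, 0x40, 0x00}, /* 1 */
    {0x42, 0x61, 0x51, 0x49, 0x46}, /* 2 */
    {0x21, 0x41, 0x45, 0x4B, 0x31}, /* 3 */
    {0x18, 0x14, 0x12, 0x7F, 0x10}, /* 4 */
    {0x27, 0x45, 0x45, 0x45, 0x39}, /* 5 */
    {0x3C, 0x4A, 0x49, 0x49, 0x30}, /* 6 */
    {0x01, 0x71, 0x09, 0x05, 0x03}, /* 7 */
    {0x36, 0x49, 0x49, 0x49, 0x36}, /* 8 */
    {0x06, 0x49, 0x49, 0x29, 0x1E}  /* 9 */
};

/* Copy the glyphs for a four-digit packed-BCD value into out:
 * five font columns plus one blank spacer per digit, 24 bytes in total.
 * Assumes every nibble of bcd is 0..9. */
void bcd4_to_columns(uint16_t bcd, uint8_t *out)
{
    uint8_t n, col;

    for (n = 0; n < 4; ++n) {
        const uint8_t *glyph = digit_font[(bcd >> 12) & 0x0F];
        for (col = 0; col < 5; ++col)
            *out++ = glyph[col];
        *out++ = 0x00;  /* spacer column between digits */
        bcd <<= 4;      /* next digit into the top nibble */
    }
}
```

On a real GLCD the inner loop would presumably send each column byte to the display instead of storing it.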
Ttelmah



Joined: 11 Mar 2010
Posts: 19518

Posted: Tue May 14, 2019 11:28 pm

temtronic wrote:
hmm, wonder if /2,/2,/2,/2,/2 is faster than /10 ?
My PIC PC is 'down for service'....


Yes. /32 by shifting is a lot faster than /10.
Ttelmah



Joined: 11 Mar 2010
Posts: 19518

Posted: Wed May 15, 2019 2:49 am

As a comment, it is worth realising that although the /10 is well written,
it won't be using code designed specifically to perform /10. It will
just be the standard integer division code, so it will loop through all the
bits involved to perform the division. With this in mind I decided to 'try my
hand' at writing a more efficient division that performs just /10. 'No
guarantees', this is only a first attempt!...
Code:

typedef struct {
   unsigned int16 quot;
   unsigned int8 rem;
} div_vals;

//Divide a 16bit unsigned value by 10 using only shifts and adds
div_vals div_10(unsigned int16 source)
{
   div_vals temp;
   temp.quot=(source>>1)+(source>>2); //source*0.75
   temp.quot+=temp.quot>>4;           //source*0.797
   temp.quot+=temp.quot>>8;           //close to source*0.8
   temp.quot>>=3;                     //close to source/10, may be 1 low
   temp.rem=source-(((temp.quot<<2)+temp.quot)<<1); //source-quot*10
   if (temp.rem>9)
   {
      //estimate was one too small, so correct it
      temp.quot+=1;
      temp.rem-=10;
   }
   return temp;
}

//Called like this:
   unsigned int16 test=12345;
   div_vals result;
   
   result=div_10(test);

//gives 1234 in result.quot, and 5 in result.rem


It looks to be about 3 to 4 times faster than the CCS division using /10.

Might be worth a play!...
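
For anyone who wants to check the arithmetic on a PC before trying it on the PIC, here is a portable C sketch of the same shift-and-add steps. The names div10_result and div10_u16 are made up for this example, and it uses stdint.h types instead of the CCS ones.

```c
#include <stdint.h>

typedef struct {
    uint16_t quot;
    uint8_t  rem;
} div10_result;

/* Shift-and-add division of a 16-bit unsigned value by 10.
 * The estimate q can be at most one too small, which the final
 * remainder check corrects. */
div10_result div10_u16(uint16_t n)
{
    div10_result r;
    uint16_t q;

    q  = (uint16_t)((n >> 1) + (n >> 2));  /* n * 0.75       */
    q += q >> 4;                           /* n * 0.796875   */
    q += q >> 8;                           /* close to n*0.8 */
    q >>= 3;                               /* close to n/10  */
    r.rem = (uint8_t)(n - (uint16_t)(((q << 2) + q) << 1)); /* n - q*10 */
    if (r.rem > 9) {
        q += 1;
        r.rem -= 10;
    }
    r.quot = q;
    return r;
}
```

Because the input is only 16 bits wide, it is cheap to compare every possible value against the ordinary / and % operators on a desktop compiler.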
blowtorch



Joined: 11 Jun 2013
Posts: 35
Location: Cape Town

Posted: Wed May 15, 2019 6:11 am

OK, here is some updated code, designed for the sim in order to easily get timings etc, without external events (think interrupts) messing with the numbers...

Last night a Google search turned up some nice code on a Microchip forum that does a BCD conversion using simple subtraction instead of division. I adapted it and changed it to output ASCII. The variables are named so that they should be self-explanatory.

The first function, 'uint16_to_ascii_4', takes a 16 bit unsigned int and writes four ASCII digits to a string. No bounds checking is done, so the value must not exceed 9999. The second function, 'uint16_to_ascii_3', does the same for three digits, up to 999.


Code:
#include <16LF18345.h>
#pin_select U1TX=PIN_B7
#pin_select U1RX=PIN_B6

#use delay(clock=4MHZ)
#use rs232(UART1,baud=111111,parity=N,bits=8,stream=jn1out) // RS232 available
#include <stdio.h>
#include <stdlib.h>
typedef unsigned int8 uint8;
typedef unsigned int16 uint16;
typedef unsigned int32 uint32;
typedef signed int8 sint8;
typedef signed int16 sint16;
typedef signed int32 sint32;

void uint16_to_ascii_4(uint16 num_16, char* dest_ptr)
{
    *dest_ptr = 0;
    while (num_16 & 0x3C00) {
        num_16 -= 1000;
        *dest_ptr += 1;
    }
    if (num_16 >= 1000) {
        num_16 -= 1000;
        *dest_ptr += 1;
    }
    *dest_ptr |= 48;
    dest_ptr++;
    *dest_ptr = 0;
    while (num_16 & 0x0780)
    {
        num_16 -= 100;
        *dest_ptr += 1;
    }
    if (num_16 >= 100) {
        num_16 -= 100;
        *dest_ptr += 1;
    }
    *dest_ptr |= 48;
    dest_ptr++;
    *dest_ptr = 0;
    while (num_16 & 0x70)
    {
        num_16 -= 10;
        *dest_ptr += 1;
    }
    if (num_16 >= 10) {
        num_16 -= 10;
        *dest_ptr += 1;
    }
    *dest_ptr |= 48;
    dest_ptr++;
    *dest_ptr = (unsigned char) num_16 | 48;
}

void uint16_to_ascii_3(uint16 num_16, char* dest_ptr)
{
    *dest_ptr = 0;
    while (num_16 & 0x0780) // ((int)num_16 > 0)
    {
        num_16 -= 100;
        *dest_ptr += 1;
    }
    if (num_16 >= 100) {
        num_16 -= 100;
        *dest_ptr += 1;
    }
    *dest_ptr |= 48;
    dest_ptr++;
    *dest_ptr = 0;
    while (num_16 & 0x70) // (num_16 > 0)
    {
        num_16 -= 10;
        *dest_ptr += 1;
    }
    if (num_16 >= 10) {
        num_16 -= 10;
        *dest_ptr += 1;
    }
    *dest_ptr |= 48;
    dest_ptr++;
    *dest_ptr = (unsigned char) num_16 | 48;
}

void main()
{

    char str_secs[5]; // field length of 4 + 1 for null
    char str_millis[4]; // field length of 3 + 1 for null
    char whole_field[9];
    uint32 big_millis = 9999999;
    uint16 secs = 9999;
    uint16 millis = 999;
   

    printf("\r\nBCD vs printf Test\r\n");
    str_secs[4] = '\0';
    str_millis[3] = '\0';
    whole_field[4]='.';
    whole_field[8]='\0';
   
    delay_cycles(1); // dummy instruction - 1st break point
    printf("%08.3w", big_millis);
    delay_cycles(1); // dummy instruction - 2nd break point
    // above printf takes 22.166ms for 123456
    // above printf takes 22.237ms for 999999

    printf("\r\n");

    delay_cycles(1); // dummy instruction - 1st break point
    uint16_to_ascii_4(secs, str_secs);
    uint16_to_ascii_3(millis, str_millis);
    printf("%s.%s", str_secs, str_millis);
    delay_cycles(1); // dummy instruction - 2nd break point
    //above 2 bcd conversions and printf takes 2.316ms for 123 456
    //above 2 bcd conversions and printf takes 3.052ms for 9999 and 999
   
    printf("\r\n");

    delay_cycles(1); // dummy instruction - 1st break point
    uint16_to_ascii_4(secs, &whole_field[0]);
    uint16_to_ascii_3(millis, &whole_field[5]);
    printf("%s", whole_field);
    delay_cycles(1); // dummy instruction - 2nd break point
    //above 2 bcd conversions and printf takes 2.318ms for 123 and 456
    //above 2 bcd conversions and printf takes 3.054ms for 9999 and 999
   
    printf("\r\n");
   
   
   
    sleep();
}



Note the timing in the comments! For the first number, printf took 22.1ms and the custom code did the equivalent in 2.3ms, almost 10 times faster. The worst case is the biggest number that will fit; there printf took 22.2ms and the custom functions took 3ms, still a sevenfold improvement...

Yay! I think the improvement will be slightly better when driving a graphics LCD, because one can have a dedicated number display function that does a streamlined convert and directly indexes the byte array (font) used for display...
Even more yay!
gaugeguy



Joined: 05 Apr 2011
Posts: 303

Posted: Wed May 15, 2019 6:58 am

Here is a BCD conversion routine that may help. This can be expanded to more digits.
It can be done slightly more efficiently in assembly but this isn't too bad.

Code:

// 16 bit 4 digit BCD conversion routine
unsigned int16 Int16toBCD4(unsigned int16 local_convert)
{
   //converts 16bit value, to four BCD digits. Tries to do it fairly
   //efficiently, both in size, and speed.
   unsigned int16 bit_cnt = 16;
   unsigned int16 BCD;
   BCD=0;
   {
      do
      {
         if ((BCD & 0x000F)>=0x0005) BCD+=0x0003;
         if ((BCD & 0x00F0)>=0x0050) BCD+=0x0030;
         if ((BCD & 0x0F00)>=0x0500) BCD+=0x0300;
         if ((BCD & 0xF000)>=0x5000) BCD+=0x3000;
         shift_left(&BCD,2,shift_left(&local_convert,2,0));
      }
      while (--bit_cnt != 0);
   }
   return BCD;
}
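
For testing off-target, the same shift and add-3 loop can be written in portable C. This is only a sketch: u16_to_bcd4 is a made-up name, and the two nested CCS shift_left() calls (which move the top bit of the input into the bottom of the BCD value) are replaced by ordinary shifts. Like the routine above, it is only meaningful for inputs of 0 to 9999.

```c
#include <stdint.h>

/* Convert 0..9999 to four packed-BCD digits with the shift and add-3
 * method: before each shift, any digit that is 5 or more gets 3 added
 * so that doubling it carries correctly into the next digit. */
uint16_t u16_to_bcd4(uint16_t value)
{
    uint16_t bcd = 0;
    uint8_t  bit;

    for (bit = 0; bit < 16; ++bit) {
        if ((bcd & 0x000F) >= 0x0005) bcd += 0x0003;
        if ((bcd & 0x00F0) >= 0x0050) bcd += 0x0030;
        if ((bcd & 0x0F00) >= 0x0500) bcd += 0x0300;
        if ((bcd & 0xF000) >= 0x5000) bcd += 0x3000;
        /* shift the top bit of value into the bottom of bcd */
        bcd = (uint16_t)((bcd << 1) | (value >> 15));
        value = (uint16_t)(value << 1);
    }
    return bcd;
}
```

The loop always runs 16 times, so the execution time does not depend on the value being converted.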
Ttelmah



Joined: 11 Mar 2010
Posts: 19518

Posted: Wed May 15, 2019 7:25 am

Problem with the subtraction approach is it'll be slower on larger numbers.
Test with 49999, and you may find it is not as good as you think....

Gaugeguy's approach of shifting and, for each digit, adding 3 if it is >=5,
is normally considered the most efficient relatively easy to code algorithm.
blowtorch



Joined: 11 Jun 2013
Posts: 35
Location: Cape Town

Posted: Wed May 15, 2019 8:38 am

Ttelmah wrote:
Problem with the subtraction approach is it'll be slower on larger numbers.
Test with 49999, and you may find it is not as good as you think....


Agreed, it is measurably slower. The numbers are in the comments in my previous post. The total time to convert 0123 and 456 was 2.3ms, whereas converting 9999 and 999 took 3ms, roughly 30% longer.
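
One way to see why the time depends on the value is to count the subtractions. The sketch below is portable C written for illustration only (subtract_steps and place are made-up names, and it is not the conversion code from my earlier post). It assumes a plain repeated-subtraction loop per decimal place.

```c
#include <stdint.h>

/* Count the subtractions a repeated-subtraction conversion performs.
 * Each digit above the units place costs one subtraction per unit,
 * so the total is the sum of all digits except the last one. */
uint8_t subtract_steps(uint16_t value)
{
    static const uint16_t place[4] = { 10000, 1000, 100, 10 };
    uint8_t steps = 0;
    uint8_t i;

    for (i = 0; i < 4; ++i) {
        while (value >= place[i]) {
            value -= place[i];
            ++steps;
        }
    }
    return steps;
}
```

By this count 123 and 456 need 3 + 9 = 12 subtractions, while 9999 and 999 need 27 + 18 = 45. The measured difference is much smaller than that ratio, presumably because the printf and serial output account for most of the time.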

Thanks Gaugeguy, I will code and test your method next, then feedback comparison.
dluu13



Joined: 28 Sep 2018
Posts: 395
Location: Toronto, ON

Posted: Wed May 15, 2019 8:39 am

Thanks for these posts, everyone. I've been noticing some lag myself with %lw when watching on my logic analyzer. We'll see how this goes :D

I'm gonna have to play with these myself to test it out!
blowtorch



Joined: 11 Jun 2013
Posts: 35
Location: Cape Town

Posted: Wed May 15, 2019 9:14 am

Loosely related, how can one calculate the time taken to service a timer based ISR? I put the isr code into the sim, and used the stopwatch feature to time the 2 different paths through the code. This came out to 9 and 17 us respectively. But what to add to get the total isr service time?
dluu13



Joined: 28 Sep 2018
Posts: 395
Location: Toronto, ON

Posted: Wed May 15, 2019 10:01 am

I just tried the BCD stuff using gaugeguy's converter (five digits). Here's the code I tested with. I tested straight printing out the ints, scaling them with lw, scaling with float, and then BCD. Ints and BCD were not scaled, but I added a decimal point at the end of the number just to have the same number of chars printed.

As expected, floats were the slowest, coming in at 60ms to print everything. lw was next, coming in at 5.9ms. Straight ints and BCD came in at a tie at 5.4ms. Now, if I were to scale the BCD and make it add the decimal point where I want I don't know how much more time that will take. However, lw is pretty fast...

Code:
/*
 * File:   CuriosityPrint.c
 * Author: dluu
 *
 * Created on Apr 5, 2019
 */
#include<24FJ128GA204.h>

#FUSES NOWDT, NODEBUG, NOWRT, NOPROTECT, NOJTAG, ICSP1
#FUSES NOLVR, NOBROWNOUT, NOIOL1WAY, NODSBOR, NODSWDT
#FUSES NOALTCMPI, FRC_PLL, PLL_FROM_FRC, PLL8X

#PIN_SELECT U3RX=PIN_B5
#PIN_SELECT U3TX=PIN_B6

#USE DELAY(clock=32MHZ)
#USE RS232(BAUD=115200, UART3, BITS=8, PARITY=N, STOP=1, STREAM=PC, ERRORS, RECEIVE_BUFFER=128)

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

uint32_t Int16toBCD8(uint16_t local_convert)
{
    //converts 16bit value, to five BCD digits. Tries to do it fairly
    //efficiently, both in size, and speed.
    uint16_t bit_cnt = 16;
    uint32_t BCD;
    BCD = 0;
    {
        do
        {
            if ((BCD & 0x0000000F) >= 0x00000005) BCD += 0x00000003;
            if ((BCD & 0x000000F0) >= 0x00000050) BCD += 0x00000030;
            if ((BCD & 0x00000F00) >= 0x00000500) BCD += 0x00000300;
            if ((BCD & 0x0000F000) >= 0x00005000) BCD += 0x00003000;
            if ((BCD & 0x000F0000) >= 0x00050000) BCD += 0x00030000;
//            if ((BCD & 0x00F00000) >= 0x00500000) BCD += 0x00300000;
//            if ((BCD & 0x0F000000) >= 0x05000000) BCD += 0x03000000;
//            if ((BCD & 0xF0000000) >= 0x50000000) BCD += 0x30000000;
            shift_left(&BCD, 3, shift_left(&local_convert, 2, 0));
        }
        while (--bit_cnt != 0);
    }
    return BCD;
}

int main(void)
{
    uint16_t test[] = {11111, 22222, 33333, 44444, 55555, 12222, 23333, 34444, 45555};

    delay_ms(100);

    fprintf(PC, "\r\n\r\n");
   
    fprintf(PC, "test lu: ");
    output_high(PIN_B13);
    for (int i = 0; i < 9; ++i)
    {
        fprintf(PC, "%lu.,", test[i]);
    }
    output_low(PIN_B13); // 5.4 ms
    fprintf(PC, "\r\n");

    fprintf(PC, "test lw: ");
    output_high(PIN_A9);
    for (int i = 0; i < 9; ++i)
    {
        fprintf(PC, "%1.3lw,", test[i]);
    }
    output_low(PIN_A9); // 5.9 ms
    fprintf(PC, "\r\n");

    fprintf(PC, "test float: ");
    output_high(PIN_A10);
    for (int i = 0; i < 9; ++i)
    {
        fprintf(PC, "%1.3f,", (float) test[i] / 1000);
    }
    output_low(PIN_A10); // 60 ms
    fprintf(PC, "\r\n");

    fprintf(PC, "test bcd: ");
    output_high(PIN_C3);
    for (int i = 0; i < 9; ++i)
    {
        fprintf(PC, "%lx.,", Int16toBCD8(test[i]));
    }
    output_low(PIN_C3); // 5.4 ms
    fprintf(PC, "\r\n");

    while (1)
    {
    }

    return 0;
}


Now to figure out how to insert a decimal at the desired nibble


EDIT:
Code:
#define BCDNIBBLES 5

void printScaledBCD(uint16_t num, uint8_t decimalPlaces)
{
    uint32_t BCD5 = Int16toBCD5(num); // the five-digit converter above, posted as Int16toBCD8
    if (decimalPlaces == BCDNIBBLES) fprintf(PC, "0");
    for (int i = 0; i < BCDNIBBLES; ++i)
    {
        if (BCDNIBBLES - i == decimalPlaces) fprintf(PC, ".");
        fprintf(PC, "%x", (BCD5 >> ((BCDNIBBLES - 1 - i) << 2))&0x0F);
    }
}


Adding the decimal point is about 0.1ms slower than not adding it.
I think I can use this in my code to gain about 7% speed over lw when printing numbers.

Code:
fprintf(PC, "test BCD dec: ");
output_high(PIN_B8);
for (int i = 0; i < 9; ++i)
{
    printScaledBCD(test[i], 3); // converts to BCD internally
    fprintf(PC, ",");
}
output_low(PIN_B8); // 5.5 ms
fprintf(PC, "\r\n");


Code:
void ScaledBCDtoStr(uint16_t num, uint8_t decimalPlaces, char * buf) // very slow...
{
    uint32_t BCD5 = Int16toBCD5(num);
    uint8_t decimal = 0;
    uint8_t j = 0;
    if (decimalPlaces > 0) decimal = 1;
    if (decimalPlaces == BCDNIBBLES) fprintf(PC, "0");
   
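    // NOTE: with decimal == 1 this makes BCDNIBBLES+1 passes; on the last
    // pass (BCDNIBBLES - 1 - i) is negative, so the shift count below is invalid
    for (int i = 0; i < BCDNIBBLES+decimal; ++i)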
    {
        if (BCDNIBBLES - i == decimalPlaces)
        {
            buf[j] = '.';
            ++j;
        }
        buf[j] = ((BCD5 >> ((BCDNIBBLES - 1 - i) << 2))&0x0F) + 0x30;
        ++j;
    }

    buf[BCDNIBBLES+decimal] = '\0';
}

fprintf(PC, "test BCD str: ");
char bcdstr[10];
output_high(PIN_B9);
for (int i = 0; i < 9; ++i)
{
    ScaledBCDtoStr(test[i], 3, bcdstr);
    fprintf(PC, "%s,", bcdstr);
}
output_low(PIN_B9); // over 200 ms...
fprintf(PC, "\r\n");


Puzzlingly, this takes over 200 ms... My ScaledBCDtoStr function is very slow... Are array accesses slow?
gaugeguy



Joined: 05 Apr 2011
Posts: 303

Posted: Thu May 16, 2019 8:22 am

I have not looked at the listing for this, but here is what I think is happening.
The array access is doing the index calculation every time through the loop and this takes time.
If you switch to using a pointer instead of an array inside the loop I think it will not keep recalculating the offset each time and should save a significant amount of time.
Ttelmah



Joined: 11 Mar 2010
Posts: 19518

Posted: Thu May 16, 2019 8:53 am

The real killer is this:

((BCD5 >> ((BCDNIBBLES - 1 - i) << 2))&0x0F) + 0x30;

Rotation by a variable is done by having a one bit rotation, and looping
round, counting until the number of bits needed has been shifted. The result
is that this is going to involve hundreds of instruction times....
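
A sketch of one way round both points, in portable C rather than CCS C: fill the string from the right-hand end through a pointer, taking the lowest nibble each time and shifting by a constant 4 bits. The names bcd5_to_str and BCD_DIGITS are made up for the example, and it has not been timed on a PIC. It assumes the value is already in packed BCD, and it only inserts a point when decimals is between 1 and BCD_DIGITS-1.

```c
#include <stdint.h>

#define BCD_DIGITS 5

/* Format a five-digit packed-BCD value into buf with `decimals` digits
 * after the point. decimals of 0, or >= BCD_DIGITS, means no point.
 * buf must hold at least BCD_DIGITS + 2 bytes. */
void bcd5_to_str(uint32_t bcd, uint8_t decimals, char *buf)
{
    uint8_t point = (decimals > 0 && decimals < BCD_DIGITS);
    char *p = buf + BCD_DIGITS + point;  /* start at the terminator */
    uint8_t n;

    *p = '\0';
    for (n = 0; n < BCD_DIGITS; ++n) {   /* exactly one pass per digit */
        if (point && n == decimals)
            *--p = '.';
        *--p = (char)('0' + (bcd & 0x0F)); /* lowest nibble */
        bcd >>= 4;                         /* fixed-count shift */
    }
}
```

With this layout bcd5_to_str(0x12345, 3, buf) leaves "12.345" in buf. The shift count never varies, and the loop count does not depend on whether a decimal point is requested.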
Page 2 of 2. All times are GMT - 6 Hours