How to light a LED via IPC with the least resource consumption?

Abby
·Jan 13, 2021·

I found an interesting challenge for myself.

Set a topic: Tailor the RT-Thread open-source real-time operating system

Set a goal: Use an IPC mechanism of RT-Thread (v3.1.2 or later) to flash an LED at 1 Hz, and tailor the system to be as small as possible.

Hardware platform: STM32H750

Let's start!

Tip 1 (Make the smallest bare metal lighting system)

What is the minimum procedure for the STM32 bare metal LED lighting?

There are three ways to proceed:

  • Select LL library on STM32CUBEMX
  • Select HAL library on STM32CUBEMX
  • Write main.c by hand and light the LED directly through registers

A project built on the HAL library contains far more code than one built on the LL library, and our LED program may not need that many library files at all, so I chose to write main.c by hand and light the LED directly.

First, create an empty project that can run main.c. With a while loop in main, you can step through it in the debugger to confirm that the project works.

Second, light the LED. Figure out how many registers must be configured to light an LED on the STM32. The required steps are:

  • Turn on the RCC clock
  • Configure OUTPUT mode for PIN
  • Drive the pin high or low

Here is the main function after debugging:

#include "stm32h7xx.h"
#define GPIO_PIN_8                 ((uint16_t)0x0100)  /* Pin 8 selected    */

int main(void)
{
    int i = 0;                              /* software delay counter */
    uint8_t abc = 0;                        /* selects high or low on the next toggle */
    RCC->AHB4ENR |= RCC_AHB4ENR_GPIOIEN;    /* 1) Turn on the GPIOI clock */
    (void)RCC->AHB4ENR;                     /* short delay after enabling the clock (replaces the original dummy i++) */
    GPIOI->MODER = 0xFFFDFFFF;              /* 2) PI8 as output: MODER bits 17:16 = 01 */
    for (;;)
    {
        if (i == 1000000)
        {
            if (abc == 1)
            {
                GPIOI->BSRR = GPIO_PIN_8;                   /* 3) Drive PI8 high */
                abc = 0;
            }
            else
            {
                abc = 1;
                GPIOI->BSRR = (uint32_t)GPIO_PIN_8 << 16;   /* 3) Drive PI8 low */
            }
            i = 0;
        }
        i++;
    }
}

I derived this code by studying what the HAL and LL libraries do. Flash it to the board and the LED starts blinking.
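The constants in the listing above can be derived rather than memorized. The sketch below is a host-side illustration, not part of the project: the helper names are hypothetical, and it assumes MODER starts with all bits set (every pin in analog mode) before the write.

```c
#include <stdint.h>

/* Each pin owns two MODER bits: 00 input, 01 output, 10 alternate, 11 analog. */
static uint32_t moder_set_output(uint32_t moder, unsigned pin)
{
    moder &= ~(UINT32_C(3) << (pin * 2));   /* clear both mode bits of this pin */
    moder |=  (UINT32_C(1) << (pin * 2));   /* 01 = general-purpose output */
    return moder;
}

/* BSRR: writing a 1 to bits 0..15 sets a pin, to bits 16..31 resets it. */
static uint32_t bsrr_set(unsigned pin)   { return UINT32_C(1) << pin; }
static uint32_t bsrr_reset(unsigned pin) { return UINT32_C(1) << (pin + 16); }
```

For pin 8 this gives 0xFFFDFFFF for MODER, 0x0100 for the set mask, and 0x01000000 for the reset mask, which match the literal values used in main.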

1.png

Here we can see that there is relatively little code.

The project contains only a main.c file and a system_stm32h7xx.c file, plus the startup file.

In fact, the LED can blink with just these three files. At this point we can tailor system_stm32h7xx.c. The *.s startup file depends on the IDE, because each toolchain has its own assembly syntax, so it may change later, but you can trim it in the same way.

You can also check out the 01_led_mini_system project under the current folder.

At this point, we can use Keil to get familiar with how to build the minimum system in this bare metal case.

startup_stm32h750xx.s

In startup_stm32h750xx.s, the peripheral interrupt vector table takes up a lot of code size and can be trimmed:

  • Peripheral interrupt entries can be deleted
  • Internal (core) interrupt entries can also be deleted if they are not used
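To get a feel for how much the vector table costs, here is a rough estimate as a sketch. The helper name is hypothetical; it assumes a Cortex-M layout of 16 core entries (initial stack pointer plus 15 system exceptions) followed by one 32-bit entry per peripheral interrupt.

```c
#include <stdint.h>

#define CORE_VECTORS 16u   /* initial SP + 15 Cortex-M system exceptions */

/* Size of the vector table in bytes for a given number of peripheral IRQ entries. */
static uint32_t vector_table_bytes(uint32_t peripheral_irqs)
{
    return (CORE_VECTORS + peripheral_irqs) * 4u;   /* one 32-bit address per entry */
}
```

With an assumed 150 peripheral entries (check the device header for the real count), the table is 664 bytes. Keeping only the core entries up to SysTick, which is the last of the 16, brings it down to 64 bytes.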

SystemInit

Some unnecessary code in this function can be removed, which saves almost 0.04KB of code. The optimized version is as follows:

void SystemInit (void)
{

  /* FPU settings ------------------------------------------------------------*/
  #if (__FPU_PRESENT == 1) && (__FPU_USED == 1)
    SCB->CPACR |= ((3UL << (10*2))|(3UL << (11*2)));  /* set CP10 and CP11 Full Access */
  #endif
  /* Reset the RCC clock configuration to the default reset state ------------*/
  /* Set HSION bit */
  RCC->CR |= RCC_CR_HSION;

  /* Reset CFGR register */
  RCC->CFGR = 0;

  /* Reset HSEON, CSSON, CSION, RC48ON, CSIKERON, PLL1ON, PLL2ON and PLL3ON bits */
  RCC->CR &= 0xEAF6ED7FU;

  /* Reset HSEBYP bit */
  RCC->CR &= 0xFFFBFFFFU;

  /* Disable all interrupts */
  RCC->CIER = 0;

  SCB->VTOR = FLASH_BANK1_BASE | VECT_TAB_OFFSET; /* Vector Table Relocation in Internal FLASH */
}
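The two masks applied to RCC->CR are easier to trust once they are rebuilt from individual bits. The sketch below does that on the host. The bit positions are my assumption from the comment in the listing and the usual STM32H7 register layout, so verify them against the CMSIS device header before relying on them.

```c
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Assumed bit positions in RCC->CR (verify against stm32h750xx.h). */
enum {
    CR_CSION = 7, CR_CSIKERON = 9, CR_HSI48ON = 12, CR_HSEON = 16,
    CR_HSEBYP = 18, CR_CSSON = 19, CR_PLL1ON = 24, CR_PLL2ON = 26, CR_PLL3ON = 28
};

/* Mask that clears every clock-enable bit named in the SystemInit comment. */
static uint32_t rcc_cr_clock_off_mask(void)
{
    uint32_t clear = BIT(CR_CSION) | BIT(CR_CSIKERON) | BIT(CR_HSI48ON) |
                     BIT(CR_HSEON) | BIT(CR_CSSON) |
                     BIT(CR_PLL1ON) | BIT(CR_PLL2ON) | BIT(CR_PLL3ON);
    return ~clear;
}

/* Mask that clears only HSEBYP. */
static uint32_t rcc_cr_hsebyp_mask(void)
{
    return ~BIT(CR_HSEBYP);
}
```

Under these assumptions the first mask evaluates to 0xEAF6ED7F and the second to 0xFFFBFFFF, the same literals that SystemInit uses.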

Tip 2 (IPC minimum resource consumption)

First, choose the kernel version. RT-Thread comes in a Standard version and a Nano version, and for this case the Nano version is the better fit, so let's start with RT-Thread Nano v3.1.3. RT-Thread Nano provides semaphore, mutex, event, mailbox, and message queue. They are similar in size, so any of them will do, because we still need to tailor the kernel afterwards.

Next, consider the number of threads.

In RT-Thread the main function runs as a thread, and idle is a thread too. I think these two threads are enough, because we also have an interrupt: SysTick. Here I use SysTick to wake up the main thread, which then toggles the LED. The question is how to light the LED with the fewest resources. My plan was to get it working on the master branch first and confirm that the LED lights up.

Add the RTOS test in Keil

#include "stm32h7xx.h"
#include "rtthread.h"
#define GPIO_PIN_8                 ((uint16_t)0x0100)  /* Pin 8 selected    */
struct rt_semaphore dynamic_sem;

int main(void)
{
    int i = 0;                              /* toggle counter, must start from a known value */
    static rt_err_t result;
    rt_sem_init(&dynamic_sem, "dsem", 0, RT_IPC_FLAG_FIFO);
    RCC->AHB4ENR |= RCC_AHB4ENR_GPIOIEN;    /* turn on the GPIOI clock */
    GPIOI->MODER = 0xFFFDFFFF;              /* PI8 as output */
    while(1)
    {
        /* block until SysTick_Handler releases the semaphore */
        result = rt_sem_take(&dynamic_sem, RT_WAITING_FOREVER);
        if(i%2 == 0)
        {
            GPIOI->BSRR = GPIO_PIN_8;
        }
        else
        {
            GPIOI->BSRR = GPIO_PIN_8 << 16;
        }
        i++;
    }
}

uint32_t count = 0;
extern struct rt_semaphore dynamic_sem;
void SysTick_Handler(void)
{
    /* enter interrupt */
    rt_interrupt_enter();
    count++;
    if(count >= RT_TICK_PER_SECOND)
    {
        /* one second has passed: wake the main thread */
        count = 0;
        rt_sem_release(&dynamic_sem);
    }
    rt_tick_increase();

    /* leave interrupt */
    rt_interrupt_leave();
}
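The interaction between the tick handler and the main thread can be checked on a PC before flashing. The following is a host-side model only: the struct and function names are hypothetical, and a plain counter stands in for the RT-Thread semaphore.

```c
#include <stdint.h>

#define TICKS_PER_SECOND 1000u   /* mirrors RT_TICK_PER_SECOND in this sketch */

struct blink_model {
    uint32_t count;     /* ticks since the last release */
    uint32_t sem;       /* pending semaphore releases */
    int      led_on;    /* current LED state */
    uint32_t toggles;   /* number of LED toggles so far */
};

/* What SysTick_Handler does on every tick. */
static void model_tick(struct blink_model *m, uint32_t ticks_per_release)
{
    if (++m->count >= ticks_per_release) {
        m->count = 0;
        m->sem++;                    /* stands in for rt_sem_release() */
    }
}

/* What the main thread does each time rt_sem_take() returns. */
static void model_thread(struct blink_model *m)
{
    while (m->sem > 0) {
        m->sem--;
        m->led_on = !m->led_on;
        m->toggles++;
    }
}

/* Run the model for a number of ticks and report how often the LED toggled. */
static uint32_t simulate(uint32_t ticks, uint32_t ticks_per_release)
{
    struct blink_model m = {0, 0, 0, 0};
    for (uint32_t t = 0; t < ticks; t++) {
        model_tick(&m, ticks_per_release);
        model_thread(&m);
    }
    return m.toggles;
}
```

Releasing every TICKS_PER_SECOND ticks toggles the LED once per second, so a full on/off cycle takes two seconds. If 1 Hz is meant as one full cycle per second, releasing every TICKS_PER_SECOND / 2 ticks would be the value to try.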

Here I simply added the RTOS pack in Keil and generated the project directly.

2.png

At this point the project is already very small. It is in 02_led_rtthread_mini_system_keil:

Program Size: Code=10394 RO-data=1450 RW-data=72 ZI-data=1944
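The four numbers in the Keil report map onto flash and RAM as follows: code, read-only data, and the initial values of read-write data live in flash, while read-write and zero-initialized data occupy RAM at run time. A small sketch with hypothetical helper names:

```c
#include <stdint.h>

/* Flash holds the code, the constants, and the initial values of RW data. */
static uint32_t flash_bytes(uint32_t code, uint32_t ro_data, uint32_t rw_data)
{
    return code + ro_data + rw_data;
}

/* RAM holds RW data (copied from flash at startup) and zero-initialized data. */
static uint32_t ram_bytes(uint32_t rw_data, uint32_t zi_data)
{
    return rw_data + zi_data;
}
```

For the report above this works out to 11916 bytes of flash and 2016 bytes of RAM.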

Tip 3 (Integrate Project)

  • Start by building a new nano project with RT-Thread Studio
  • Then replace it with the code in Keil

system_init

Let's optimize the code outside of the RTOS, as described earlier for SystemInit.

The files optimized here are startup_stm32h750xx.s and system_stm32h7xx.c.

After integration, we can see our code size:

4.png

This is project 03_rtstudio_mini_rtthread.

Tip 4 (Modify the compiler options)

The relevant compiler options are:

  • Optimize for size: -Os
  • Place each function and each data item in its own section, so the linker can drop the unused ones: -ffunction-sections -fdata-sections
  • Do not link the standard libraries: -nostdlib
  • Do not enable the hardware FPU; use software floating point instead (the current project does not need the FPU for now, so this saves code)

Let's apply them one at a time to see how much code size each one saves. Note that the whole project is built with a Makefile, which keeps cached build outputs, so clean the project and build it again each time, or simply do a full rebuild.

Optimize with -Os

3.png

In the optimization interface, select -Os.

The code size is almost halved, to 5.8KB.

5.png

Optimize with -ffunction-sections

  • -ffunction-sections: when a source file is compiled, each function is assigned its own section, so the linker can discard the sections nothing refers to. This option is used often.

  • -fdata-sections: when a source file is compiled, each data item is assigned its own section.

Select -ffunction-sections in the optimization interface: almost 2.0KB
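As a small illustration of what per-function sections buy, consider a file with one function that is called and one that is not. The function names and the build commands in the comment are assumptions for the sketch, using standard GCC and GNU ld flags; the actual project sets these through the RT-Thread Studio interface.

```c
/* Build sketch (GCC, shown for illustration only):
 *   arm-none-eabi-gcc -Os -ffunction-sections -fdata-sections -c demo.c
 *   arm-none-eabi-gcc ... -Wl,--gc-sections demo.o -o demo.elf
 */

/* Compiled into its own section, .text.led_state_next, and kept because it is called. */
int led_state_next(int state)
{
    return !state;
}

/* Compiled into .text.never_called; with --gc-sections the linker can drop it
 * because nothing references that section. */
int never_called(int x)
{
    return x * 42;
}
```

Without -ffunction-sections both functions share one .text section of the object file, so the linker has to keep or drop them together.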

6.png

Because we don't have much data, -fdata-sections does not help much. I checked, and there is basically no code reduction.

Optimize the standard library

-nostdlib: do not link the standard libraries
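One consequence of -nostdlib is that any library routine the image still needs has to come from the project itself. RT-Thread ships its own versions such as rt_memset for that reason. As a sketch of how small such a replacement can be, here is a byte-wise fill with a hypothetical name (not the RT-Thread implementation):

```c
#include <stddef.h>

/* Minimal byte-wise fill; trades speed for size. */
void *mini_memset(void *dst, int value, size_t n)
{
    unsigned char *p = dst;
    while (n--)
        *p++ = (unsigned char)value;
    return dst;
}
```

A word-at-a-time version would be faster, but for a blinking LED the smallest form is the better trade.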

7.png

Optimised down to a size of about 1.3KB:

8.png

Optimize FPU

Some code can be optimized using the software FPU.

9.png

After optimization, the code is:

10.png

Most of the work is done at this point. Now let's look at the official RT-Thread Nano footprint data:

A9FFF709-7AC8-4074-AD5E-361B289DE5BE.png

optimization level 3

There is still room to improve before we catch up with the figures that RT-Thread officially publishes.

Be sure to flash the board after each step to check that the LED still blinks.

Tip 5 (Configure rtconfig.h in RT-Thread)

Some of the RT-Thread default configurations are not required:

Most of the configuration in rtconfig.h can be removed. Try removing an option, then compile and check that the LED still blinks.

Here's the configuration I left behind.

#define RT_USING_SEMAPHORE // Needed for the IPC communication

#define RT_THREAD_PRIORITY_MAX  3  // Number of priority levels; this can be reduced to 3

#define RT_USING_USER_MAIN  // This involves stdlib, so keep it to save space

#define RT_MAIN_THREAD_STACK_SIZE     128  // Value found by experiment

#define RT_USING_CPU_FFS  // Replaces the __lowest_bit_bitmap lookup with CPU instructions

#define RT_USING_COMPONENTS_INIT

#define RT_TICK_PER_SECOND  1000

Let's check how much each option saves:

RT_USING_CPU_FFS

This option implements find-first-set (FFS) with CPU instructions, replacing a fairly large lookup array:

This saves almost 0.3KB:

11.png
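To show where the saving comes from, the sketch below puts a table-driven find-first-set next to one that lets the compiler pick a CPU instruction sequence. It is a host-side illustration with hypothetical names, not the RT-Thread source: the table is built at run time here, whereas in firmware it would be a 256-byte constant in flash, and __builtin_ffs is a GCC/Clang builtin.

```c
#include <stdint.h>

static uint8_t lowest_bit[256];   /* index of the lowest set bit for each byte value */

static void build_lowest_bit(void)
{
    for (unsigned v = 1; v < 256; v++) {
        unsigned bit = 0;
        while (((v >> bit) & 1u) == 0)
            bit++;
        lowest_bit[v] = (uint8_t)bit;
    }
}

/* Table-driven version: returns the 1-based position of the lowest set bit, 0 if none. */
static int ffs_table(uint32_t value)
{
    if (lowest_bit[2] == 0)        /* entry 2 is 1 once the table has been built */
        build_lowest_bit();
    if (value == 0)
        return 0;
    if (value & 0xffu)
        return lowest_bit[value & 0xffu] + 1;
    if (value & 0xff00u)
        return lowest_bit[(value >> 8) & 0xffu] + 9;
    if (value & 0xff0000u)
        return lowest_bit[(value >> 16) & 0xffu] + 17;
    return lowest_bit[(value >> 24) & 0xffu] + 25;
}

/* CPU version: the compiler emits a short instruction sequence and needs no table. */
static int ffs_cpu(uint32_t value)
{
    return __builtin_ffs((int)value);
}
```

Both functions return the same result for every input; the second one simply has no table to store.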

RT_USING_CONSOLE

The console system takes almost 2KB.

12.png

RT_MAIN_THREAD_STACK_SIZE

RT_MAIN_THREAD_STACK_SIZE is set so that the main thread stack takes very little RAM.

The other options offer limited room for optimization, so I'll skip them. Now we can put the configuration below in rtconfig.h to see how much code has been reduced:

#define RT_THREAD_PRIORITY_MAX  3

#define RT_TICK_PER_SECOND  1000

#define RT_ALIGN_SIZE   4

#define RT_NAME_MAX    4

#define RT_USING_COMPONENTS_INIT

#define RT_USING_USER_MAIN

#define RT_MAIN_THREAD_STACK_SIZE     128

#define RT_USING_CPU_FFS

#define RT_USING_SEMAPHORE

Finally, the code stays at this size:

13.png

Now we come to 04_nostdlib_mini_rtthread

Tip 6 (Tailor code according to map file)

This part is boring, but it is important to know.

We will mainly look at the map file.

The .map file is easy to find in the Debug folder, so I won't cover that. Instead, I'll show how to read the size of a function from it. Here's a fragment taken from the map file:

 .text.rt_tick_increase
                0x080000f0       0x28 ./rt-thread/src/clock.o
                0x080000f0                rt_tick_increase
 .text.rti_end  0x08000118        0x4 ./rt-thread/src/components.o
 .text.main_thread_entry
                0x0800011c        0x4 ./rt-thread/src/components.o
                0x0800011c                main_thread_entry
 .text.rti_board_end
                0x08000120        0x4 ./rt-thread/src/components.o
 .text.rti_start
                0x08000124        0x4 ./rt-thread/src/components.o
 .text.rti_board_start
                0x08000128        0x4 ./rt-thread/src/components.o
 .text.rt_application_init
                0x0800012c       0x3c ./rt-thread/src/components.o
                0x0800012c                rt_application_init

  • The STM32 code starts at 0x08000000 in the ROM, so start with this address inside the map.

Reading the fragment:

The function rt_tick_increase starts at address 0x080000f0 and is 0x28 bytes in size.

The function rt_application_init starts at address 0x0800012c and is 0x3c bytes in size.

From here, we reduce the code little by little.
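Reading sizes off the map by eye gets tedious, so a few lines of C can pull the address and size out of each entry. This is a sketch for the line shape shown in the fragment above; the function name is hypothetical, and real map files contain other line shapes that it simply rejects.

```c
#include <stdio.h>

/* Parses "<0xaddress> <0xsize> <object file>" as it appears in the map fragment.
 * Returns 1 when both numbers were read, 0 for any other kind of line. */
static int parse_map_line(const char *line, unsigned long *addr, unsigned long *size)
{
    /* The literal "0x" in the format keeps symbol names from being read as hex. */
    return sscanf(line, " 0x%lx 0x%lx", addr, size) == 2;
}
```

A quick sanity check on the fragment: 0x080000f0 plus 0x28 is 0x08000118, which is exactly where the next entry, .text.rti_end, begins. Summing the sizes per object file then shows which source files are worth trimming first.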

Tailor timer

Since no timer is used, timer.c is basically unused. We can comment out the whole file along with the related calls. The timer code accounts for about 0.6KB in total.

14.png

Optimize rt_memset and rt_memcpy

In kservice.c, the whole file and the calls into it can be commented out, which frees almost 0.08KB.

15.png

rt_thread_exit

This function exits a thread. It takes 0.08KB of code in thread.c that we can cut.

16.png

idle task

Some of the actions in the idle task can be commented out in idle.c.

Keep only a while loop. The LED still blinks, and we save 0.15KB.

17.png

hardfault handle

The hard fault handler takes up some code size. It is implemented in context_gcc.S, and GCC does not optimize .S files: unused assembly is not removed automatically, so we can only trim it by hand, bit by bit. Removing the handler saves almost 0.08KB.

  • startup_stm32h750xx should also remove the hard fault handler function.

18.png

rt_critical

Because there is no contention between threads, rt_enter_critical and rt_exit_critical in scheduler.c can be removed, which saves 0.09KB.

19.webp

rt_components_init

Some options for components can also be optimized to save about 0.04KB.

rt_hw_interrupt_enable

We have few threads here, so the interrupt enable/disable calls in thread.c can be trimmed, saving about 0.2KB.

Enable.webp

rt_interrupt_enter

There are few interrupts, so the interrupt entry and exit bookkeeping around rt_interrupt_enter can be tailored, saving almost 0.4KB.

21.webp

Also tailor rt_hw_interrupt_thread_switch.

rt_components_board_init

Remove this function to save 0.03KB

22.png

rt_ipc_list_suspend

The case RT_IPC_FLAG_PRIO branch is not needed. Commenting it out directly saves 0.05KB.

23.png

Optimize inline function

When an inline function is compiled, its whole body is copied into every caller, which can result in an excessive amount of code.

rt_service.h

I thought that changing the inline functions into regular functions would reduce the code, but when I tried it, the code size actually increased by 0.07KB.

This shows that a well-written inline function can actually reduce code size.

24.png
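A likely reason is that list helpers of this kind are only a handful of stores, so the call overhead (argument setup, branch, return) can cost as much as the body itself. The sketch below shows the general shape of such helpers with hypothetical names; it is modeled on a circular doubly linked list and is not copied from the RT-Thread header.

```c
/* Circular doubly linked list node; an empty list points at itself. */
struct node {
    struct node *next;
    struct node *prev;
};

static inline void node_init(struct node *n)
{
    n->next = n->prev = n;
}

/* Insert n right after l: four pointer stores, no loops, no branches. */
static inline void node_insert_after(struct node *l, struct node *n)
{
    l->next->prev = n;
    n->next = l->next;
    l->next = n;
    n->prev = l;
}

static inline int node_is_empty(const struct node *l)
{
    return l->next == l;
}
```

When the arguments are known at the call site, the compiler can often fold an inlined body like this into even fewer instructions than the generic out-of-line version needs.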

rt_thread

The struct rt_thread has something that can be optimized to reduce code size

struct rt_timer thread_timer; this member-related code can be deleted.

25.png

That's nearly everything.

The final project is 05_final_cut_mini_system.

Result

26.png
