Changing `drawRGBBitmap(..., uint16_t *,...)` to virtual enables significant performance gains.

Open MHotchin opened this issue 4 years ago • 3 comments

Issue type: enhancement

  • Board: Wemos D1 R32, Mega 2560

If the method `drawRGBBitmap(int16_t x, int16_t y, uint16_t *bitmap, int16_t w, int16_t h)` is made virtual, a driver for any display that supports a 'bulk transfer' mechanism can override it and cut render time by 90% or more.

Code compatibility should be 100%. The implementation is simple: move the method declaration up to the 'virtual' section of the class. Drivers that do not override the method keep using the existing implementation and see very little change.
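A minimal sketch of why the `virtual` keyword matters here, using hypothetical stand-in classes (`GfxBase`, `CountingDisplay`, `BulkDisplay`) rather than the real Adafruit_GFX or Waveshare_ILI9486 code. The drivers only count bus transactions, on the assumption that the default path costs one addressing step per pixel and an overriding driver needs one per tile. Real drivers will differ in detail.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the GFX base class; NOT the real Adafruit_GFX.
class GfxBase {
public:
    virtual ~GfxBase() = default;

    // Per-pixel primitive that every driver has to provide.
    virtual void drawPixel(int16_t x, int16_t y, uint16_t color) = 0;

    // Declared virtual, as the issue proposes. The default body is a
    // simplified per-pixel loop, so drivers that do not override it
    // behave exactly as before.
    virtual void drawRGBBitmap(int16_t x, int16_t y, uint16_t *bitmap,
                               int16_t w, int16_t h) {
        assert(bitmap != nullptr);
        for (int16_t j = 0; j < h; j++) {
            for (int16_t i = 0; i < w; i++) {
                drawPixel(x + i, y + j, bitmap[j * w + i]);
            }
        }
    }
};

// Hypothetical driver without an override: counts one transaction per pixel.
class CountingDisplay : public GfxBase {
public:
    unsigned long transactions = 0;
    unsigned long pixels = 0;

    void drawPixel(int16_t, int16_t, uint16_t) override {
        transactions++;  // assumed cost: one addressing step per pixel
        pixels++;
    }
};

// Hypothetical driver with an override: counts one transaction per tile.
class BulkDisplay : public CountingDisplay {
public:
    void drawRGBBitmap(int16_t, int16_t, uint16_t *bitmap,
                       int16_t w, int16_t h) override {
        assert(bitmap != nullptr);
        transactions++;  // assumed cost: one addressing step per tile
        pixels += static_cast<unsigned long>(w) * static_cast<unsigned long>(h);
    }
};

// Covers a 256x256 area with 16x16 tiles through a base-class reference.
// The override is only reached because drawRGBBitmap is virtual.
void tileScreen(GfxBase &gfx) {
    constexpr int16_t kTile = 16;
    constexpr int16_t kArea = 256;
    static uint16_t tile[kTile * kTile] = {};
    for (int16_t y = 0; y < kArea; y += kTile) {
        for (int16_t x = 0; x < kArea; x += kTile) {
            gfx.drawRGBBitmap(x, y, tile, kTile, kTile);
        }
    }
}
```

Calling `tileScreen` on a `CountingDisplay` and then on a `BulkDisplay` writes 65,536 pixels in both cases, but with 65,536 counted transactions in the first and 256 in the second. If `drawRGBBitmap` were not virtual, the call through `GfxBase &` would always take the per-pixel path.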

I have included a sketch that demonstrates this. It creates a 16x16 tile, then uses it to cover a 256x256 area of the display, for a total of 64K (65,536) pixels written.

I ran the test three times: first with the original code, then with the method changed to virtual and nothing else, then with an optimized override.


Using a Wemos D1 R32 + Waveshare 4" screen, I get the following times:
//
//  Using a Wemos D1 R32 + Waveshare 4" TFT screen.  ESP32 board, SPI ILI9486 display
//
//  16x16 tiles
//  Render Times:
//  Original  = 1261 mSec
//  Virtual   = 1261 mSec = +0%
//  Optimized = 124 mSec  = -90.1%
//
//  Sizes (Program / Min RAM)
//  Original  = 225,533 bytes / 16036 bytes 
//  Virtual   = 225,541 bytes / 16036 bytes
//  Optimized = 226,029 bytes / 16036 bytes

Using a Mega 2560 (plus same display), I get the following:
//
//  Using Mega 2560 + Waveshare 4" TFT screen
//
//  16x16 tiles
//  Render Times:
//  Original  = 6229 mSec
//  Virtual   = 6315 mSec = +1.3%
//  Optimized = 298 mSec  = -95.2%
//
//  Sizes (Program / Min RAM)
//  Original  = 11,232 bytes / 306 bytes 
//  Virtual   = 11,392 bytes / 308 bytes
//  Optimized = 11,632 bytes / 308 bytes
//

Results for 32x32 tiles are similar, with slightly higher savings in time.
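For reference, the percentages in the tables can be reproduced with a small helper. This is only a sketch; `percentChange` and `speedup` are illustrative names, not part of any library. The quoted figures seem to match the exact values truncated to one decimal place rather than rounded.

```cpp
#include <cassert>

// Relative change of `measured` against `baseline`, in percent.
// Negative values mean the measured run was faster than the baseline.
double percentChange(double baseline, double measured) {
    assert(baseline > 0.0);
    return (measured - baseline) / baseline * 100.0;
}

// How many times faster the measured run is than the baseline.
double speedup(double baseline, double measured) {
    assert(measured > 0.0);
    return baseline / measured;
}

// Examples using the 16x16 render times from the issue (milliseconds):
//   percentChange(1261, 124)  is about -90.17  (ESP32, optimized)
//   percentChange(6229, 298)  is about -95.22  (Mega 2560, optimized)
//   percentChange(6229, 6315) is about  +1.38  (Mega 2560, virtual only)
```

Expressed as a ratio, the optimized override is roughly 10 times faster on the ESP32 (`speedup(1261, 124)`) and roughly 21 times faster on the Mega 2560 (`speedup(6229, 298)`).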


The other `drawRGBBitmap()` methods need not be virtual, since each one does some processing on each pixel before writing it - there's no opportunity for bulk transfer.

Complete text of sketch follows:
/*
 Name:		BlitTest.ino
 Created:	2020-07-11 7:51:36 PM
 Author:	Michael
*/


#include <Arduino.h>
#include <SPI.h>
#include <Adafruit_GFX.h>

#include <Waveshare_ILI9486.h>

namespace
{
	Waveshare_ILI9486 MyTFT;

	constexpr size_t BMP_WIDTH = 16;
	constexpr size_t BMP_HEIGHT = 16;

	constexpr size_t NUM_PIXELS = BMP_HEIGHT * BMP_WIDTH;

	constexpr size_t MAX_WIDTH = 256;
	constexpr size_t MAX_HEIGHT = 256;

	constexpr size_t TILES_WIDTH = MAX_WIDTH / BMP_WIDTH;
	constexpr size_t TILES_HEIGHT = MAX_HEIGHT / BMP_HEIGHT;
}


// the setup function runs once when you press reset or power the board
void setup() 
{
	Serial.begin(115200);

	SPI.begin();

	MyTFT.begin();
	MyTFT.fillScreen(0);
}

// the loop function runs over and over again until power down or reset
void loop() 
{
	uint16_t buffer[NUM_PIXELS];

	//  Fill buffer with random pixels
	for (size_t i = 0; i < NUM_PIXELS; i++)
	{
		buffer[i] = random(UINT16_MAX);
	}

	auto tStart = millis();

	for (size_t i = 0; i < TILES_HEIGHT; i++)
	{
		for (size_t j = 0; j < TILES_WIDTH; j++)
		{
			MyTFT.drawRGBBitmap(j * BMP_WIDTH, i * BMP_HEIGHT, buffer, BMP_WIDTH, BMP_HEIGHT);
		}
	}

	auto tEnd = millis();

	Serial.print("Render time: ");
	Serial.print(tEnd - tStart);
	Serial.println(" mSec.");

	delay(3000);
}

//
//  Using a Wemos D1 R32 + Waveshare 4" TFT screen.  ESP32 board, SPI ILI9486 display
//
//  16x16 tiles
//  Render Times:
//  Original  = 1261 mSec
//  Virtual   = 1261 mSec = +0%
//  Optimized = 124 mSec  = -90.1%
//
//  Sizes (Program / Min RAM)
//  Original  = 225,533 bytes / 16036 bytes 
//  Virtual   = 225,541 bytes / 16036 bytes
//  Optimized = 226,029 bytes / 16036 bytes
//
//
//
//  32x32 tiles:
//  Render Times:
//  Original  = 1259 mSec
//  Optimized = 120 mSec = -90.4%


//
//  Using Mega 2560 + Waveshare 4" TFT screen
//
//  16x16 tiles
//  Render Times:
//  Original  = 6229 mSec
//  Virtual   = 6315 mSec = +1.3%
//  Optimized = 298 mSec  = -95.2%
//
//  Sizes (Program / Min RAM)
//  Original  = 11,232 bytes / 306 bytes 
//  Virtual   = 11,392 bytes / 308 bytes
//  Optimized = 11,632 bytes / 308 bytes
//
//
//
//  32x32 tiles
//  Render times:
//  Original  = 6225 mSec
//  Virtual   = 6305 mSec = +1.2%
//  Optimized = 279 mSec  = -95.5%

MHotchin, Jul 12 '20 05:07