Adafruit-GFX-Library
Changing `drawRGBBitmap(..., uint16_t *,...)` to virtual enables significant performance gains.
Issue type: enhancement
- Board: Wemos D1 R32, Mega 2560
If the method `drawRGBBitmap(int16_t x, int16_t y, uint16_t *bitmap, int16_t w, int16_t h)` is changed to virtual, any display driver that supports a 'bulk transfer' mechanism can override it. In the tests below, this cut render time by 90% or more.
Code compatibility should be 100%. The implementation is simple: just move the method declaration up to the 'virtual' section. Drivers that do not override the method keep using the existing implementation and see very little change in render time or program size.
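To make the dispatch argument concrete, here is a minimal host-side sketch. It is not the library's code: `GfxBase`, `PlainDisplay`, `BulkDisplay`, `setAddrWindow`, and `pushPixels` are hypothetical stand-ins for Adafruit_GFX and two drivers, and the counters only model bus transactions rather than real SPI traffic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for Adafruit_GFX: only what is needed to show dispatch.
class GfxBase {
public:
    virtual ~GfxBase() = default;

    // Per-pixel primitive that every driver must provide.
    virtual void drawPixel(int16_t x, int16_t y, uint16_t color) = 0;

    // Declared virtual, as this issue proposes. The default body is a generic
    // per-pixel loop, so drivers that do not override it behave as before.
    virtual void drawRGBBitmap(int16_t x, int16_t y, uint16_t *bitmap,
                               int16_t w, int16_t h) {
        assert(bitmap != nullptr);
        for (int16_t j = 0; j < h; j++) {
            for (int16_t i = 0; i < w; i++) {
                drawPixel(x + i, y + j, bitmap[j * w + i]);
            }
        }
    }
};

// Hypothetical driver that keeps the inherited loop: one transaction per pixel.
class PlainDisplay : public GfxBase {
public:
    std::size_t transactions = 0;
    void drawPixel(int16_t, int16_t, uint16_t) override { transactions++; }
};

// Hypothetical driver with a bulk path: one window setup, then one block write.
class BulkDisplay : public GfxBase {
public:
    std::size_t transactions = 0;
    std::size_t pixelsSent = 0;

    void drawPixel(int16_t, int16_t, uint16_t) override {
        transactions++;
        pixelsSent++;
    }

    void drawRGBBitmap(int16_t x, int16_t y, uint16_t *bitmap,
                       int16_t w, int16_t h) override {
        setAddrWindow(x, y, w, h);
        pushPixels(bitmap, static_cast<std::size_t>(w) * static_cast<std::size_t>(h));
    }

private:
    // Models the command that selects the target rectangle on the panel.
    void setAddrWindow(int16_t, int16_t, int16_t, int16_t) { transactions++; }

    // Models streaming the whole pixel buffer in a single transfer.
    void pushPixels(const uint16_t *, std::size_t count) {
        transactions++;
        pixelsSent += count;
    }
};
```

Calling `drawRGBBitmap()` through a `GfxBase &` reaches the override only because the method is virtual; in this model a 16x16 tile costs 256 transactions on `PlainDisplay` and 2 on `BulkDisplay`. Real savings depend on the bus and the driver.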
I have included a sketch that demonstrates this. It creates a 16x16 tile, then uses it to cover a 256x256 area of the display, for a total of 64K pixels written.
I ran the test three ways: with the original code, with the method simply changed to virtual, and with an optimized override.
Using a Wemos D1 R32 + Waveshare 4" screen, I get the following times:
//
// Using a Wemos D1 R32 + Waveshare 4" TFT screen. ESP32 board, SPI ILI9486 display
//
// 16x16 tiles
// Render Times:
// Original = 1261 mSec
// Virtual = 1261 mSec = +0%
// Optimized = 124 mSec = -90.1%
//
// Sizes (Program / Min RAM)
// Original = 225,533 bytes / 16036 bytes
// Virtual = 225,541 bytes / 16036 bytes
// Optimized = 226,029 bytes / 16036 bytes
Using a Mega 2560 (plus same display), I get the following:
//
// Using Mega 2560 + Waveshare 4" TFT screen
//
// 16x16 tiles
// Render Times:
// Original = 6229 mSec
// Virtual = 6315 mSec = +1.3%
// Optimized = 298 mSec = -95.2%
//
// Sizes (Program / Min RAM)
// Original = 11,232 bytes / 306 bytes
// Virtual = 11,392 bytes / 308 bytes
// Optimized = 11,632 bytes / 308 bytes
//
Results for 32x32 tiles are similar, with slightly higher savings in time.
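The percentages follow directly from the render times. As a quick check, the helpers below (names chosen for this example only) compute the relative change and the equivalent speed-up factor. The figures quoted in this issue match the computed values truncated to one decimal place.

```cpp
#include <cassert>

// Relative change of afterMs versus beforeMs, in percent (negative = faster).
double relativeChangePercent(double beforeMs, double afterMs) {
    assert(beforeMs > 0.0);
    return (afterMs - beforeMs) / beforeMs * 100.0;
}

// How many times faster the new render time is than the old one.
double speedupFactor(double beforeMs, double afterMs) {
    assert(afterMs > 0.0);
    return beforeMs / afterMs;
}
```

For the 16x16 runs, `speedupFactor(1261, 124)` is a little over 10 on the ESP32 and `speedupFactor(6229, 298)` is just under 21 on the Mega 2560.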
The other `drawRGBBitmap()` methods need not be virtual, since each one does some processing on each pixel before writing it - there's no opportunity for bulk transfer.
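A host-side sketch of the masked case shows why. It is loosely modelled on a 1-bit mask, and the layout (most significant bit first, rows padded to whole bytes) is an assumption of this example; `PixelWrite` and `drawMasked` are hypothetical names, not library API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One recorded pixel write, standing in for a call to the display.
struct PixelWrite {
    int16_t x;
    int16_t y;
    uint16_t color;
};

// Writes only the pixels whose mask bit is set and returns the writes made.
// Assumed mask layout: 1 bit per pixel, MSB first, rows padded to whole bytes.
std::vector<PixelWrite> drawMasked(int16_t x, int16_t y,
                                   const uint16_t *bitmap, const uint8_t *mask,
                                   int16_t w, int16_t h) {
    std::vector<PixelWrite> writes;
    const int bytesPerRow = (w + 7) / 8;
    for (int16_t j = 0; j < h; j++) {
        for (int16_t i = 0; i < w; i++) {
            const uint8_t bits = mask[j * bytesPerRow + i / 8];
            // The per-pixel test decides whether this pixel is sent at all,
            // so the output is generally not one contiguous rectangle.
            if (bits & (0x80 >> (i & 7))) {
                writes.push_back({static_cast<int16_t>(x + i),
                                  static_cast<int16_t>(y + j),
                                  bitmap[j * w + i]});
            }
        }
    }
    return writes;
}
```

With a mask byte of `0xA0`, only the first and third pixels of an 8-pixel row are written, leaving gaps that a single block transfer cannot express.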
Complete text of sketch follows:
/*
Name: BlitTest.ino
Created: 2020-07-11 7:51:36 PM
Author: Michael
*/
#include <Arduino.h>
#include <SPI.h>
#include <Adafruit_GFX.h>
#include <Waveshare_ILI9486.h>
namespace
{
    Waveshare_ILI9486 MyTFT;
    constexpr size_t BMP_WIDTH = 16;
    constexpr size_t BMP_HEIGHT = 16;
    constexpr size_t NUM_PIXELS = BMP_HEIGHT * BMP_WIDTH;
    constexpr size_t MAX_WIDTH = 256;
    constexpr size_t MAX_HEIGHT = 256;
    constexpr size_t TILES_WIDTH = MAX_WIDTH / BMP_WIDTH;
    constexpr size_t TILES_HEIGHT = MAX_HEIGHT / BMP_HEIGHT;
}
// the setup function runs once when you press reset or power the board
void setup()
{
    Serial.begin(115200);
    SPI.begin();
    MyTFT.begin();
    MyTFT.fillScreen(0);
}
// the loop function runs over and over again until power down or reset
void loop()
{
    uint16_t buffer[NUM_PIXELS];
    // Fill buffer with random pixels
    for (size_t i = 0; i < NUM_PIXELS; i++)
    {
        buffer[i] = random(UINT16_MAX);
    }
    auto tStart = millis();
    for (size_t i = 0; i < TILES_HEIGHT; i++)
    {
        for (size_t j = 0; j < TILES_WIDTH; j++)
        {
            MyTFT.drawRGBBitmap(j * BMP_WIDTH, i * BMP_HEIGHT, buffer, BMP_WIDTH, BMP_HEIGHT);
        }
    }
    auto tEnd = millis();
    Serial.print("Render time: ");
    Serial.print(tEnd - tStart);
    Serial.println(" mSec.");
    delay(3000);
}
//
// Using a Wemos D1 R32 + Waveshare 4" TFT screen. ESP32 board, SPI ILI9486 display
//
// 16x16 tiles
// Render Times:
// Original = 1261 mSec
// Virtual = 1261 mSec = +0%
// Optimized = 124 mSec = -90.1%
//
// Sizes (Program / Min RAM)
// Original = 225,533 bytes / 16036 bytes
// Virtual = 225,541 bytes / 16036 bytes
// Optimized = 226,029 bytes / 16036 bytes
//
//
//
// 32x32 tiles:
// Render Times:
// Original = 1259 mSec
// Optimized = 120 mSec = -90.4%
//
// Using Mega 2560 + Waveshare 4" TFT screen
//
// 16x16 tiles
// Render Times:
// Original = 6229 mSec
// Virtual = 6315 mSec = +1.3%
// Optimized = 298 mSec = -95.2%
//
// Sizes (Program / Min RAM)
// Original = 11,232 bytes / 306 bytes
// Virtual = 11,392 bytes / 308 bytes
// Optimized = 11,632 bytes / 308 bytes
//
//
//
// 32x32 tiles
// Render times:
// Original = 6225 mSec
// Virtual = 6305 mSec = +1.2%
// Optimized = 279 mSec = -95.5%