Text rendering with OpenGL / FreeType / C++


Have you ever tried rendering text using OpenGL? Well, I have, and there are lots of different ways depending on what you want to do. Some are simple with limited features, some are hard with lots of features, and some are hard with almost no features. So what to go for?
Well, I need a FAST way of rendering! Drawing lots of different texts in different places on the screen with very little performance overhead. So my main priority is SPEED! The second thing I want is quality: crisp glyphs with nice straight lines. And last but not least, I need to be able to rotate text.
If you are looking for a "this vs that" comparison, go back and google again :-) Here I will only go through my choice and why I did what I did. But there is lots of information here that also applies to other implementations.

First off, let me say that the code is written for "old school" OpenGL (the fixed-function pipeline), so there will be no shaders etc. If you are rendering with OpenGL 3+, expect to have to modify a lot of the code to get it to work.

So what's my approach:
1. Use a third-party library to render the alphabet onto a texture...
2. Create an "alphabet map" containing all the necessary data to render letters...
3. Write each letter using a single textured quad...
4. Batch render and optimize to maximize performance...

First let's get the first obstacle out of the way: which third-party library to use? Well, there are many out there, and some have more features than the one I picked, but I went with "FreeType", which really hit all my requirements. It's simple to use, has all the basic rendering options, and it's free :-)
If you need a FreeType tutorial you should probably take a look at their "Tutorial", but I will still take you through the process, just for fun and so that you understand my code better.

The first thing I coded was the function that creates the texture (and, later on, the alphabet map):

void TextRender::CreateTexture(std::vector<int>& fontSizes) // I needed several font sizes, so the parameter is a list of font sizes
{
  //... Initialization stuff was placed here ...

  FT_Error error = FT_Init_FreeType( &library );
  if( error )
  {
    std::cout << "... an error occurred during library initialization ..." << std::endl;
    return;
  }

  error = FT_New_Face( library,
    "c:/Windows/Fonts/Arial.ttf", // I personally only need one font, so instead of supporting more I just load this one ... and I'm on Windows, so ... yeah ...
    0,
    &face );

  if(error == FT_Err_Unknown_File_Format)
  {
    std::cout << "... the font file could be opened and read, but it appears" << std::endl;
    std::cout << "... that its font format is unsupported" << std::endl;
    return; // No usable face, so stop here
  }
  else if( error )
  {
    std::cout << "... another error code means that the font file could not" << std::endl;
    std::cout << "... be opened or read, or that it is broken..." << std::endl;
    return; // No usable face, so stop here
  }


So let me take you through the first part real quick. All we do here is start up FreeType and load the font file :-) Very basic and simple (and the code is more or less taken from the FreeType tutorial)...

  int InitialSpacing = 2;
  int CharacterSpacing = 4;

Now what's all this spacing stuff for? Well, texture samplers transform texture coordinates into pixel colors (yes, this is a very basic description), and in that process they can look at neighbouring pixels in the texture. So to ensure that we stay within the texture and do not sample from other characters, we need blank pixels between each character and around the edge of the texture.
As the picture shows, if the sampling is just a tiny bit off we get blurry results...

  FT_GlyphSlot& g = face->glyph;
  int w = InitialSpacing;
  int h = 0;
  int hTotal = 0;
  int wTotal = 0;

  int maxTextureSize = 8192;

First we just make it a bit easier for ourselves so we don't have to write "face->glyph" over and over. But notice the maxTextureSize definition that I set to 8k. This is because different graphics cards have different maximum texture sizes and I really need to support most of them. Follow this link, GL_MAX_TEXTURE_SIZE, to see a list of graphics cards and their capabilities. There you can see that I support roughly 80% of all chipsets, and the ones I don't support are so low-end that I don't think I will ever come across them in my application.

  for(auto fontsize : fontSizes) {

    error = FT_Set_Char_Size(
      face,    /* handle to face object           */
      0,       /* char_width in 1/64th of points  */
      fontsize*64,   /* char_height in 1/64th of points */
      96,     /* horizontal device resolution    */
      96 );   /* vertical device resolution      */

    for(int i = startChar; i < endChar; i++) {
      if(FT_Load_Char(face, i, FT_LOAD_RENDER)) {
        fprintf(stderr, "Loading character %c failed!\n", i);
        continue;
      }

      int wSize = g->bitmap.width + CharacterSpacing;
      if(w+wSize > maxTextureSize) {
        hTotal += h; // Going to the next row: add the current row height first
        h = 0;
        w = InitialSpacing; // Same left margin as the upload loop uses for a new row
      }
      w += wSize;
      wTotal = __max(wTotal, w);

      h = __max(h, g->bitmap.rows);
    }
    hTotal += h; // Add current row
  }

  atlas_width = wTotal;
  atlas_height = hTotal;  

Now this part is a bit interesting, because all I really do here is calculate how big my final texture needs to be :-) I add space for each character plus the spacing. And if a row gets wider than maxTextureSize, I start a new row :-) Yes, that's right: it supports wrapping in case the fonts are so big that I have to use multiple rows in order to stay below the maximum texture size :-)
The only thing I really don't check for is having too many rows (filling the entire 8k x 8k texture), but that would require huge fonts or a lot of different font sizes, and currently I'm only rendering with 5 font sizes ranging from 8 to 18 (passed to FreeType as point sizes at 96 dpi). So I'm okay with that...
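The sizing pass can be looked at in isolation from FreeType. Here is a minimal sketch of the same row-wrapping idea, assuming the glyph sizes have already been collected; GlyphSize, AtlasSize and measureAtlas are names made up for this illustration and are not part of my TextRender class.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the bitmap size FreeType reports for one glyph.
struct GlyphSize { int width; int rows; };

struct AtlasSize { int width; int height; };

// Measure the atlas needed when glyphs are packed left to right and a new
// row is started whenever the next glyph would cross maxWidth.
AtlasSize measureAtlas(const std::vector<GlyphSize>& glyphs,
                       int maxWidth, int initialSpacing, int characterSpacing)
{
    int x = initialSpacing;  // pen position in the current row
    int rowHeight = 0;       // tallest glyph seen in the current row
    int widest = 0;          // widest row so far
    int height = 0;          // summed height of all finished rows

    for (const GlyphSize& g : glyphs) {
        const int cell = g.width + characterSpacing;
        assert(initialSpacing + cell <= maxWidth && "glyph wider than the atlas");
        if (x + cell > maxWidth) {  // start a new row
            height += rowHeight;
            rowHeight = 0;
            x = initialSpacing;
        }
        x += cell;
        widest = std::max(widest, x);
        rowHeight = std::max(rowHeight, g.rows);
    }
    return { widest, height + rowHeight };  // include the unfinished last row
}
```

Comparing the returned height against the texture limit would be one way to add the "too many rows" check that I skip.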
You may also notice the "fontsize*64". As the comment states, that is because FreeType defines these sizes in 1/64th of a point... For more info see their documentation...
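The 1/64 units are easier to follow with a few helpers. This is only a sketch: the function names are made up for this example, and the point-to-pixel formula relies on one point being 1/72 of an inch.

```cpp
#include <cassert>

// Whole units to 26.6 fixed point (64 sub-units per unit), like "fontsize*64".
inline long toFixed26_6(int whole)
{
    return static_cast<long>(whole) * 64;
}

// 26.6 fixed point back to whole units, truncating, like the ">> 6" on the advance.
inline int fromFixed26_6(long fixed)
{
    assert(fixed >= 0);  // right-shifting negative values is not portable before C++20
    return static_cast<int>(fixed >> 6);
}

// Nominal pixel size of a point size at a given resolution (1 point = 1/72 inch).
inline double pointsToPixels(double points, double dpi)
{
    return points * dpi / 72.0;
}
```

With the 96 dpi used in the code, a 12 point font comes out at a nominal 16 pixels, which is why the glyph bitmaps are larger than the number passed in as the font size.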

And now for some magic:

  glGenTextures(1, &tex);
  glBindTexture(GL_TEXTURE_2D, tex);
  glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, wTotal, hTotal, 0, GL_LUMINANCE_ALPHA, GL_UNSIGNED_BYTE, 0);

Okay, it's not really magic: I create a texture with the size that I calculated above. The interesting part is the "GL_RGBA, GL_LUMINANCE_ALPHA and GL_UNSIGNED_BYTE" part. As used here, FreeType delivers 8-bit grayscale data (a single unsigned byte of coverage per pixel), but I need real RGBA so that I can play with color, transparency etc. So I create the texture as RGBA and upload LUMINANCE_ALPHA values as UNSIGNED_BYTE. That means each pixel is uploaded as two unsigned bytes: the luminance byte is copied into the Red, Green and Blue channels, and the alpha byte goes into the Alpha channel :-) I set the luminance to 255 and use the FreeType byte as alpha, so I end up with a white picture whose transparency corresponds to the glyph coverage. E.g. pixels outside the glyph are transparent, pixels inside it are opaque, and the antialiased edge pixels are semi-transparent. This is done so that I can blend the antialiased pixels with my background and render text nicely on top of other stuff :-)
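To make the two-bytes-per-pixel layout concrete, here is a small stand-alone sketch of the expansion step. It assumes an 8-bit coverage bitmap whose rows are "pitch" bytes apart (pitch positive and at least as large as the width); expandToLuminanceAlpha is a name invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand 8-bit coverage (0 = empty, 255 = fully covered) into interleaved
// luminance/alpha pairs: luminance is always white, alpha carries the coverage.
// 'pitch' is the number of bytes per source row (assumed positive).
std::vector<std::uint8_t> expandToLuminanceAlpha(const std::uint8_t* coverage,
                                                 int width, int rows, int pitch)
{
    assert(width >= 0 && rows >= 0 && pitch >= width);
    std::vector<std::uint8_t> out(static_cast<std::size_t>(2) * width * rows);
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t dst = 2 * (static_cast<std::size_t>(y) * width + x);
            out[dst]     = 255;                      // luminance, ends up in R, G and B
            out[dst + 1] = coverage[y * pitch + x];  // alpha = glyph coverage
        }
    }
    return out;
}
```

The returned buffer is tightly packed, which matches the GL_UNPACK_ALIGNMENT of 1 set before the upload.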

  int x = InitialSpacing;
  int yOffset = 0;
  int maxHeight = 0;

  for(int sizeIndex = 0; sizeIndex < fontSizes.size(); sizeIndex++) {

    int fontsize = fontSizes[sizeIndex];
    CharacterData[sizeIndex].texturePointSize = fontsize;
    CharacterInfo* characterMap = CharacterData[sizeIndex].characterMap;

    error = FT_Set_Char_Size(
      face,    /* handle to face object           */
      0,       /* char_width in 1/64th of points  */
      fontsize*64,   /* char_height in 1/64th of points */
      96,     /* horizontal device resolution    */
      96 );   /* vertical device resolution      */


    for(int i = startChar; i < endChar; i++) 
    {
      if(FT_Load_Char(face, i, FT_LOAD_RENDER))
        continue;

      int width = g->bitmap.width;
      int height = g->bitmap.rows;

      int widthTotal = width + CharacterSpacing;

      if(x + widthTotal > maxTextureSize)
      { 
        yOffset += maxHeight;
        maxHeight = 0;
        x = InitialSpacing;
      }

      maxHeight = __max(maxHeight, height);

      if(FT_HAS_VERTICAL(face)) {
        characterMap[i].advanceX = g->metrics.horiAdvance;
        characterMap[i].advanceY = g->advance.y;
      } else {
        characterMap[i].advanceX = g->advance.x;
        characterMap[i].advanceY = g->advance.y;
      }

      characterMap[i].bitmapWidth = g->bitmap.width;
      characterMap[i].bitmapHeight = g->bitmap.rows;

      characterMap[i].bitmapLeft = g->bitmap_left;
      characterMap[i].bitmapTop = g->bitmap_top;

      characterMap[i].textureOffsetX = x;
      characterMap[i].textureOffsetY = yOffset;

      GLubyte* expanded_data = new GLubyte[2 * height * width];

#ifdef TEXTRENDER_DEBUG
      std::string filename = "c:\\Texture\\filedata";
      filename.append(gentstl::int_to_str(fontsize));
      filename.append("_");
      filename.append(gentstl::int_to_str(i));
      filename.append(".tmp");
      FILE* fp = fopen(filename.c_str(), "wb");
      fputc((unsigned char)height, fp);
      fputc((unsigned char)width, fp);
#endif // TEXTRENDER_DEBUG

      for(int j=0; j < height; j++) {
        for(int k=0; k < width; k++){
          GLubyte mybyte = g->bitmap.buffer[k + g->bitmap.pitch*j]; // Rows are 'pitch' bytes apart, which is not always the width
          expanded_data[2*(k+j*width)] = 255; // Luminance byte (always white)
          expanded_data[2*(k+j*width)+1] = mybyte; // Alpha byte = antialiased coverage
  //         expanded_data[2*(k+j*width)+1] = mybyte >= 200 ? 255 : 0; // Not antialiased

#ifdef TEXTRENDER_DEBUG
          fputc(expanded_data[2*(k+j*width)+1], fp);
#endif // TEXTRENDER_DEBUG

        }
      }
#ifdef TEXTRENDER_DEBUG
      fclose(fp);
#endif // TEXTRENDER_DEBUG


      glTexSubImage2D(GL_TEXTURE_2D, 0, x, yOffset, g->bitmap.width, g->bitmap.rows, GL_LUMINANCE_ALPHA, GL_UNSIGNED_BYTE, expanded_data);
      delete[] expanded_data; // OpenGL has copied the pixels, so free the temporary buffer


      x += widthTotal;
    }
  }
}

Now the rest of the function above is actually really straightforward, but it still takes up a lot of code :-) All it does is go through each character from FreeType and build up a characterMap containing data for each letter: where the letter is located in the texture, the size of the character, and the offsets and advance to use around the character when rendering. E.g. the letter "L" is larger than the letter ".", so we need different offsets and sizes on the screen...
I also put in a debug option (just define TEXTRENDER_DEBUG) that saves the raw pixel data to some files so that I can look at the textures... And I created a small C# program that reads the debug data and converts it into a bitmap image that you can open...
The C# debug reader program can be downloaded here: --> DOWNLOAD HERE <-- Note that it only converts one letter at a time. Feel free to mess with the code...
Before handing the data over I have a small loop that goes through the FreeType data and changes it up a bit: I can produce antialiased or non-antialiased textures (the non-antialiased line is commented out), and I set the luminance to 255 (so that I get white in the texture).
And last but not least, I give the data to OpenGL letter by letter, slowly filling the texture...
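The header with the character map is not shown here, so the following is only a sketch of what one entry might look like and how its pixel rectangle turns into texture coordinates. The field names follow the code above; the types, UvRect and glyphUv are assumptions made for this example.

```cpp
#include <cassert>

// Hypothetical layout: field names as in the code above, types assumed.
struct CharacterInfo {
    long advanceX, advanceY;              // pen advance in 1/64 pixel units
    int  bitmapWidth, bitmapHeight;       // glyph bitmap size in pixels
    int  bitmapLeft, bitmapTop;           // offset from the pen position to the bitmap
    int  textureOffsetX, textureOffsetY;  // top-left corner inside the atlas
};

struct UvRect { float u0, v0, u1, v1; };

// Convert a glyph's pixel rectangle in the atlas to normalized texture coordinates.
UvRect glyphUv(const CharacterInfo& c, int atlasWidth, int atlasHeight)
{
    assert(atlasWidth > 0 && atlasHeight > 0);
    const float w = static_cast<float>(atlasWidth);   // divide as float, never as int
    const float h = static_cast<float>(atlasHeight);
    return { c.textureOffsetX / w,
             c.textureOffsetY / h,
             (c.textureOffsetX + c.bitmapWidth) / w,
             (c.textureOffsetY + c.bitmapHeight) / h };
}
```

Dividing by a float matters: with two integers the result would be truncated to 0 for every glyph.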

So now I have a texture that looks something like this (taken directly from my debug output, only converted with the C# tool):

Now that was parts 1 & 2 of my plan. Now for parts 3 and 4.
Part 3: Rendering a quad is really simple, so I'm not going to cover it much... Google it... or just find it in my code below...
Part 4, on the other hand, is a bit interesting. Usually when you optimize code you just run it with a profiler attached (possibly using Visual Studio), but this approach may give you a false feeling of having written fast code, because OpenGL may actually queue commands and process them later. E.g. on my machine it looks like changing textures is almost free but rendering quads is insanely expensive??? That may be because the texture change command is just stored and doesn't happen until I actually render quads. But in reality it is the texture change that takes time, because texture data sometimes has to be swapped in and out of memory... I'm not going into any further explanation of debugging or OpenGL implementations; I'm just stating that it may not be trivial to find out which OpenGL code is slow and how to improve it...
Now back to my original plan of using a technique called "batch rendering" (it may have another, more widely used name, but I know it as batch rendering). In its simplest form the technique is to do the following:
1. First collect and store all the information needed to render "everything".
2. Sort the data by some criteria so that you can render it faster.
3. Render in the sorted order.

Some might know the theory, because things like "back to front rendering" and other rendering techniques also need a specific rendering order. But I'm going to use it to minimize texture switching.
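The three steps can be sketched without any OpenGL at all. The example below is a generic illustration of collect, sort and render, not my actual implementation (my text renderer simply keeps its own list and draws it last); DrawCommand, countTextureSwitches and sortByTexture are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical draw record: the texture it needs plus an id for the payload.
struct DrawCommand { unsigned texture; int id; };

// Count how many texture binds a command sequence would trigger when each
// command binds its texture only if it differs from the one already bound.
int countTextureSwitches(const std::vector<DrawCommand>& cmds)
{
    int switches = 0;
    bool first = true;
    unsigned bound = 0;
    for (const DrawCommand& c : cmds) {
        if (first || c.texture != bound) {
            ++switches;
            bound = c.texture;
            first = false;
        }
    }
    return switches;
}

inline bool byTexture(const DrawCommand& a, const DrawCommand& b)
{
    return a.texture < b.texture;
}

// Step 2: group the commands by texture. stable_sort keeps the submission
// order inside each group.
void sortByTexture(std::vector<DrawCommand>& cmds)
{
    std::stable_sort(cmds.begin(), cmds.end(), byTexture);
    assert(std::is_sorted(cmds.begin(), cmds.end(), byTexture));
}
```

Note that sorting changes the draw order, so this only works as-is for things that may be drawn in any order; blended geometry such as text usually has to stay after whatever it is drawn on top of.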

void TextRender::RenderText(const char* text, int xpos, int ypos, float fontsize)
{
  if(strlen(text) == 0) return; // Don't draw

  double x = xpos;
  double y = ypos;
  FixToPixel(x);
  FixToPixel(y);


  CharacterDataMap* characterDataMap = GetMap(fontsize);
  if(characterDataMap == NULL)
    return; // May not be initialized

  CharacterInfo* characterMap = characterDataMap->characterMap;

  int fontscale = PixelToOGL(fontsize);

  float texturePointSize = characterDataMap->texturePointSize;


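  int textLength = (int)strlen(text);
  for (int i = 0; i < textLength; i++)
  {
    unsigned char num = text[i];
    if(num < startChar || num >= endChar) // endChar is exclusive in CreateTexture
    {
      std::cerr << "Character not found" << std::endl;
      continue; // Skip it instead of reading outside the character map
    }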
    // Cast before dividing so the texture coordinates never come from an integer division
    float tx = (float)characterMap[num].textureOffsetX / atlas_width;
    float w = (float)characterMap[num].bitmapWidth / atlas_width;
    float ty = (float)characterMap[num].textureOffsetY / atlas_height;
    float bh = (float)characterMap[num].bitmapHeight / atlas_height;
    
    float characterheight = PixelToOGL(characterMap[num].bitmapHeight) / texturePointSize * fontsize;
    int characterwidth = PixelToOGL(characterMap[num].bitmapWidth) / texturePointSize * fontsize;
    int offsetUp = PixelToOGL(characterMap[num].bitmapTop) / texturePointSize * fontsize;

    float yoffset =y-offsetUp/*+fontscale*/;

    RenderQuad& renderQuad = lRenderQuads[currentindex];
    renderQuad.xStart = (float)x;
    renderQuad.xEnd = (float)x+characterwidth;
    renderQuad.yStart = (float)yoffset;
    renderQuad.yEnd = (float)yoffset+characterheight;

    renderQuad.uStart = tx;
    renderQuad.uEnd = tx+w;
    renderQuad.vStart = ty;
    renderQuad.vEnd = ty+bh;

    renderQuad.colorR = colorR;
    renderQuad.colorG = colorG;
    renderQuad.colorB = colorB;

    int characterAdvance = ((int)characterMap[num].advanceX) >> 6; // 1/64 pixel units to whole pixels
    characterAdvance = PixelToOGL(characterAdvance) / texturePointSize * fontsize;

    if(characterAdvance == 0) // Character advance is 0? Why should this ever happen...
      std::cerr << "Character advance is 0. This may be an error.";
    x += characterAdvance;
    FixToPixel(x);

    currentindex += 1;
  }
}

I'm not going into details on what's happening here, but in the header file I have a std::map holding RenderQuad objects, called lRenderQuads (it is indexed with the integer currentindex, as the code above shows). This is where I store each character that needs to be rendered. This makes it possible in my code to write stuff like this: [Pseudo code]
billboard.Render();
// Here the texture will be switched
textrenderObject.RenderText("Text message", 20,300, 12);
smallCube.Render();
// Here the texture will be switched again
textrenderObject.RenderText("Another message", 243,203, 8);
teaPot.Render();
And what's happening is that each RenderText call just adds the characters to a list of characters to be rendered later on. So the list will contain "Text messageAnother message", plus info for each character containing its screen location and texture coordinates.
So the OpenGL calls will in reality look something like this instead: [Pseudo code]
billboard.Render();
smallCube.Render();
teaPot.Render();
// Here the texture will be switched only once
textrenderObject.RenderText("Text message", 20,300, 12);
textrenderObject.RenderText("Another message", 243,203, 8);
And that limits the texture switches. Now, the example code only has 2 switches, which would be fast enough, but imagine a scenario where you render 1000 texts to the screen and in between each text you render other stuff. That would create lots of texture switches, costing a lot of performance! So the only really good thing to do is batch up the texts and render everything that uses the same texture in one go :-)
Just for the fun of it: why do you think I use a std::map instead of a list or a vector? ... Don't read on before you have thought about it ... Wait for it ... It's because I then only allocate memory (RAM) once and reuse it each render cycle, because creating objects in memory is expensive. Render() resets currentindex to 0 without clearing the map, so the entries created in one frame are simply overwritten in the next. And if one frame contains 1000 texts with an average of 20 characters, chances are that the next frame will contain almost the same, because most applications render at 30-100 frames per second, and if you are supposed to read the text it will most likely be on the screen for several seconds. So reusing memory instead of reallocating every frame is the only sensible thing to do in a performance-critical situation.
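The same reuse idea also works with contiguous storage. The sketch below swaps in a std::vector that is never shrunk in place of my std::map, so treat it as an alternative illustration rather than my implementation; QuadBatch and its members are made-up names, and RenderQuad is cut down to a few fields.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct RenderQuad {  // reduced to the fields needed for this sketch
    float xStart, xEnd, yStart, yEnd;
};

class QuadBatch {
public:
    // Hand out the next slot, growing the storage only when it is exhausted.
    // The reference is only valid until the next call to next().
    RenderQuad& next()
    {
        if (used_ == quads_.size()) quads_.emplace_back();
        return quads_[used_++];
    }

    // Start a new frame: forget the contents but keep the allocated slots.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }
    std::size_t allocated() const { return quads_.size(); }

    const RenderQuad& operator[](std::size_t i) const
    {
        assert(i < used_);
        return quads_[i];
    }

private:
    std::vector<RenderQuad> quads_;
    std::size_t used_ = 0;
};
```

A frame would call next() once per character and reset() after drawing, which mirrors how currentindex is counted up in RenderText and set back to 0 in Render.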

So now we have created the texture containing the letters, created a list of data saying how each character needs to be rendered, and created a map containing each letter that needs to be rendered along with its coordinates etc. So all that is missing is rendering it all with OpenGL.

void TextRender::Render()
{
  glMatrixMode(GL_MODELVIEW); // Select the matrix first so glLoadIdentity resets the right one
  glLoadIdentity();
  glBindTexture(GL_TEXTURE_2D, tex);

  glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);

  // The last filter set wins, so NEAREST is what is used. Enable the GL_LINEAR
  // pair instead for smoother (but softer) scaled text.
  //glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
  //glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
  glDisable(GL_LIGHTING);
  glEnable(GL_TEXTURE_2D);
  glDisable(GL_DEPTH_TEST);
  glEnable(GL_BLEND);
  glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);	

  glBegin(GL_QUADS);
  for(int i = 0; i < currentindex; i++) {
    RenderQuad& renderQuad = lRenderQuads[i];
    glColor4f(renderQuad.colorR,renderQuad.colorG,renderQuad.colorB,1.0f);
    glTexCoord2f(renderQuad.uStart, renderQuad.vStart); glVertex3f(renderQuad.xStart, renderQuad.yStart, 0);
    glTexCoord2f(renderQuad.uStart, renderQuad.vEnd); glVertex3f(renderQuad.xStart, renderQuad.yEnd, 0);
    glTexCoord2f(renderQuad.uEnd, renderQuad.vEnd); glVertex3f(renderQuad.xEnd, renderQuad.yEnd, 0);
    glTexCoord2f(renderQuad.uEnd, renderQuad.vStart); glVertex3f(renderQuad.xEnd, renderQuad.yStart, 0);
  }
  glEnd();

  currentindex = 0;

  glDisable(GL_TEXTURE_2D);
}

If you don't understand the above, don't feel stupid! Just google the parts you don't understand. All I'm doing is rendering a list of textured quads onto the screen...

So that's it, right??? No, wait a minute... What about the missing functions, you ask???
int TextRender::PixelToOGL(int pixel)
void TextRender::SetColor(int rgbColorCode)
void TextRender::FixToPixel(double &x)
CharacterDataMap* TextRender::GetMap(float fontsize)

PixelToOGL => This function is in the code because I specify text in "screen coordinates" but I scale everything in OpenGL. E.g. screen coordinate (100, 100) may be OpenGL coordinate (1023021, 342752), so I need to do a bit of transforming. But your implementation could be as simple as "return pixel;" if your OpenGL app is working in screen coordinates...
Why would one render in anything other than screen coordinates? Well, the simple answer is: to prevent pixels from flickering / blinking... Imagine an object located right on the edge of a pixel row and vibrating slightly; the object may then be rendered in that pixel row in one frame but not in the next...
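My own PixelToOGL depends on how my application sets up its view, so here is only a sketch of the general idea, assuming the view can be described by how many world units the viewport spans and how many pixels wide it is. ViewScale and pixelToWorld are hypothetical names, and the sketch returns a double where my function returns an int.

```cpp
#include <cassert>

// Hypothetical view description: the width of the visible world in OpenGL
// units and the width of the viewport in pixels.
struct ViewScale {
    double worldWidth;
    int    viewportWidthPx;

    double unitsPerPixel() const
    {
        assert(viewportWidthPx > 0);
        return worldWidth / viewportWidthPx;
    }
};

// Convert a length in pixels to the same length in world units.
inline double pixelToWorld(int pixels, const ViewScale& view)
{
    return pixels * view.unitsPerPixel();
}
```

When the world is set up so that one unit equals one pixel, the conversion collapses to the "return pixel;" case mentioned above.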

SetColor(int) => I personally like to have RGBA colors packed in a single 32-bit integer (8 bits red, 8 bits green, 8 bits blue, 8 bits alpha); that's why the input is a single int. All the function does is set the variables "colorR, colorG, colorB" so that the user can select the color to render the text with... But you could have a SetColor(byte, byte, byte) or perhaps SetColor(float, float, float)...
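Unpacking such a color is just a few shifts and masks. This is a sketch only: the 0xRRGGBBAA byte order is an assumption (the order I actually use is not shown), ColorF, packRgba and unpackRgba are made-up names, and an unsigned type is used to stay clear of sign issues with the top byte.

```cpp
#include <cassert>
#include <cstdint>

struct ColorF { float r, g, b, a; };

// Pack four 0..255 components into one integer.
// The 0xRRGGBBAA byte order is an assumption made for this sketch.
inline std::uint32_t packRgba(int r, int g, int b, int a)
{
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255);
    assert(b >= 0 && b <= 255 && a >= 0 && a <= 255);
    return (static_cast<std::uint32_t>(r) << 24) |
           (static_cast<std::uint32_t>(g) << 16) |
           (static_cast<std::uint32_t>(b) << 8)  |
            static_cast<std::uint32_t>(a);
}

// Unpack into floats in the range [0, 1], the form glColor4f expects.
inline ColorF unpackRgba(std::uint32_t rgba)
{
    return { ((rgba >> 24) & 0xFFu) / 255.0f,
             ((rgba >> 16) & 0xFFu) / 255.0f,
             ((rgba >>  8) & 0xFFu) / 255.0f,
             ( rgba        & 0xFFu) / 255.0f };
}
```

If your colors are stored as 0xAARRGGBB or in another order, only the shift amounts change.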

GetMap(float fontsize) => I actually have a data structure where each font size has its own character map (mapping into the same texture)... See the header definition... When I then render a specific font size, I find the best character map for the job.
So CreateTexture may have been called with font sizes "8, 10, 12, 16" but not with 14. So when someone calls RenderText with font size 14, I just use the map for font size 12 instead and scale it up to fit size 14. This gives text that is more blurred for font sizes that were not created. But it's a trade-off: using more GPU memory for the texture vs. more blurry text... So what I do is call CreateTexture with the most commonly used text sizes in my application, and less used sizes will then render a bit blurry. Personally, in my app 90% of the texts can be rendered with 2-3 font sizes...
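One possible selection rule is sketched below: pick the baked size closest to the requested one and let ties go to the smaller size, which reproduces the 14 to 12 example. The exact rule in my GetMap is not shown here, so this is an assumption, and nearestBakedSize and scaleFor are names invented for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Return the index of the baked size closest to 'requested'.
// Ties go to the smaller size (an assumed rule that matches 14 -> 12).
std::size_t nearestBakedSize(const std::vector<int>& bakedSizes, float requested)
{
    assert(!bakedSizes.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < bakedSizes.size(); ++i) {
        const float d     = std::fabs(bakedSizes[i] - requested);
        const float dBest = std::fabs(bakedSizes[best] - requested);
        if (d < dBest || (d == dBest && bakedSizes[i] < bakedSizes[best])) {
            best = i;
        }
    }
    return best;
}

// Scale factor applied to the baked glyphs when drawing at 'requested',
// the same ratio as "/ texturePointSize * fontsize" in RenderText.
inline float scaleFor(int bakedSize, float requested)
{
    return requested / bakedSize;
}
```

Scaling down from the next larger size instead is an equally valid choice; it trades a slightly different kind of blur for the same memory.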

FixToPixel => Now here is something interesting in many ways. (One is that I pass a reference as the parameter; if you don't know what that does, you don't know C++.) No, really, the clever part is this: as I mentioned somewhere above, I don't render using screen coordinates; I use some arbitrary values that fit the application's world a lot better! But that may lead to OpenGL rendering everything blurry (see pic below). So to prevent this I have a function that aligns my OpenGL coordinates with pixel coordinates :-) This leads to everything rendering perfectly :-)

Now, you may be rendering in screen coordinates, and that's fine :-) Then you will not have this problem, but I do have to move things around to get pixel-perfect rendering... And if you have to as well, then this is the simple way to go :-)
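A minimal sketch of such a snapping function, assuming you know how large one screen pixel is in your world units. The second parameter is hypothetical (my FixToPixel only takes the coordinate and gets the scale from the class), and rounding to the nearest pixel is one reasonable choice among several.

```cpp
#include <cassert>
#include <cmath>

// Snap a world-space coordinate to the nearest pixel boundary.
// 'unitsPerPixel' (hypothetical parameter) is the size of one screen pixel
// in world units.
inline void fixToPixel(double& coord, double unitsPerPixel)
{
    assert(unitsPerPixel > 0.0);
    coord = std::round(coord / unitsPerPixel) * unitsPerPixel;
}
```

Because the coordinate is passed by reference, the caller's variable is updated in place, just like the x and y in RenderText.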