
PDF → PowerPoint — Tables on slides

Figure: a table on a slide. Detection uses the same KD-tree, but the size is fixed: the slide is 13.333 × 7.5″ and the table has to fit, otherwise the font becomes tiny. Readability from a projector calls for a font of at least 14 pt and few rows and columns. Styling (colored header row, banded rows) is bound to the theme through the header-style XML: <a:tblPr firstRow="1" bandRow="1"> with a <a:tableStyleId> child.

PowerPoint tables are first-class shapes: rows, columns, cells, each cell its own text box. They can be styled by the deck’s theme (header row, banded rows), edited like any other shape, and restyled in one operation.

Table detection itself is geometric: find the horizontal and vertical line segments on the page, build a KD-tree of their intersections, and read each cell's content from the text bounded by them. What differs on a slide is the context. A slide imposes hard size limits a document does not, and the table is shown rather than read.
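A minimal sketch of that geometric step, assuming the horizontal and vertical rules have already been extracted as axis-aligned segments. It swaps the KD-tree for a brute-force intersection test plus coordinate clustering, which is enough to show the idea; the `Segment` shape, the function names, and the 2-point tolerance are assumptions, not any converter's actual code.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) in PDF points; segments are assumed axis-aligned.
Segment = Tuple[float, float, float, float]

def cluster(values: List[float], tol: float) -> List[float]:
    """Collapse nearly equal coordinates into one representative each."""
    out: List[float] = []
    for v in sorted(values):
        if not out or v - out[-1] > tol:
            out.append(v)
    return out

def grid_from_segments(horizontal: List[Segment], vertical: List[Segment],
                       tol: float = 2.0) -> Tuple[List[float], List[float]]:
    """Return (xs, ys): column and row boundaries where rules cross."""
    xs: List[float] = []
    ys: List[float] = []
    for hx0, hy, hx1, _ in horizontal:
        for vx, vy0, _, vy1 in vertical:
            crosses_x = min(hx0, hx1) - tol <= vx <= max(hx0, hx1) + tol
            crosses_y = min(vy0, vy1) - tol <= hy <= max(vy0, vy1) + tol
            if crosses_x and crosses_y:
                xs.append(vx)
                ys.append(hy)
    return cluster(xs, tol), cluster(ys, tol)

# Three horizontal and three vertical rules enclose a 2 x 2 grid.
h = [(0, y, 200, y) for y in (0, 50, 100)]
v = [(x, 0, x, 100) for x in (0, 100, 200)]
xs, ys = grid_from_segments(h, v)
rows, cols = len(ys) - 1, len(xs) - 1
```

Each cell is then the rectangle between two neighboring boundaries on each axis, and its content is whatever text falls inside that rectangle.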

What changes when a table moves to a slide

Size

A Word table can occupy an entire page or run across several. A PowerPoint table has to fit in 13.333 × 7.5 inches (16:9) or 10 × 7.5 (4:3), and stay readable at projection distance. Large tables fail both constraints.
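The size limit is easy to put in numbers. PresentationML measures in EMU (914,400 per inch), so a 16:9 slide is 12,192,000 × 6,858,000 EMU. The sketch below estimates how many rows fit at a given font size; the 1.2 line-height factor, the 0.05-inch cell padding, and the 0.5-inch slide margin are assumptions chosen for illustration.

```python
EMU_PER_INCH = 914_400
PT_PER_INCH = 72

SLIDE_W_EMU = 12_192_000  # 13.333 in (16:9)
SLIDE_H_EMU = 6_858_000   # 7.5 in

def max_rows(font_pt: float, slide_h_emu: int = SLIDE_H_EMU,
             margin_in: float = 0.5, cell_pad_in: float = 0.05,
             line_factor: float = 1.2) -> int:
    """Rough count of single-line rows that fit vertically on a slide."""
    usable_in = slide_h_emu / EMU_PER_INCH - 2 * margin_in
    row_in = font_pt * line_factor / PT_PER_INCH + 2 * cell_pad_in
    return int(usable_in // row_in)

rows_at_14pt = max_rows(14)
rows_at_6pt = max_rows(6)
```

Under these assumptions a slide holds roughly 19 rows at 14 pt, and shrinking to 6 pt buys only about a dozen more at the cost of readability.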

After conversion, an oversize PDF table fills the slide with a tiny font. Shrinking the font further eventually produces text that is unreadable at any viewing distance. Splitting the table across slides is a separate problem (below).

How the user reads it

A document table sits in prose; the reader scrolls. A slide table is shown, from a projector, to an audience, for a few seconds. The constraints flip: large fonts, few rows, few columns. The converter has no license to redesign; it can only pass through what was in the PDF.

Styling

Word tables tend to be plain. PowerPoint tables are usually styled to match the deck: colored header, banded rows, highlighted totals. Most converters carry over fill colors and borders but stop there. The result is not promoted to a theme-aware table, so styling stays static when the user changes themes.

The conversion path

Per page:

  1. Run the standard line-detection / intersection / grid algorithm.
  2. For each table found, create a table-type shape with the matching row and column counts, fill the cells with text, and apply formatting.
  3. If the table is too large for one slide, decide how to split.

Strategies for oversized tables

Font scaling

Reduce the font in every cell until the table fits. Trivial to implement. Often produces text that is unreadable from a projector. A reasonable starting point only when the source was already close to fitting.
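Font scaling with a readability floor might look like the sketch below. The 14 pt floor, the half-point step, and the `fits` callback are assumptions; a real converter would measure the laid-out table rather than use the toy fit test shown here.

```python
from typing import Callable, Optional

def scale_font(start_pt: float, fits: Callable[[float], bool],
               floor_pt: float = 14.0, step_pt: float = 0.5) -> Optional[float]:
    """Largest size at or below start_pt that fits, stepping down by step_pt.
    Returns None when nothing at or above floor_pt fits."""
    size = start_pt
    while size >= floor_pt:
        if fits(size):
            return size
        size -= step_pt
    return None

# Hypothetical fit test: 22 single-line rows in 6.5 usable inches.
def fits_22_rows(pt: float) -> bool:
    row_in = pt * 1.2 / 72 + 0.1
    return 22 * row_in <= 6.5

readable = scale_font(18.0, fits_22_rows)                 # floor stays at 14 pt
shrunk = scale_font(18.0, fits_22_rows, floor_pt=6.0)     # floor lowered to 6 pt
```

A `None` result is the signal to fall back to one of the other strategies instead of shrinking further.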

Row split

Break the table into chunks by rows, for example rows 1–10 on one slide and rows 11–20 on the next.

Each chunk duplicates the header row. The strategy works for long lists. It requires recognizing which row is the header (a row whose styling differs from the rest).
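A sketch of the row split, assuming the header row has already been identified and the table is a plain list of rows. The function name and the `header_rows` parameter are hypothetical.

```python
from typing import List

Row = List[str]

def split_rows(table: List[Row], max_body_rows: int,
               header_rows: int = 1) -> List[List[Row]]:
    """Split into chunks of at most max_body_rows body rows,
    repeating the header at the top of every chunk."""
    if max_body_rows < 1:
        raise ValueError("max_body_rows must be at least 1")
    header, body = table[:header_rows], table[header_rows:]
    chunks = [header + body[i:i + max_body_rows]
              for i in range(0, len(body), max_body_rows)]
    return chunks or [header]  # a header-only table stays one chunk

table = [["Name", "Qty"]] + [[f"item {n}", str(n)] for n in range(1, 21)]
slides = split_rows(table, max_body_rows=10)
```

Each element of `slides` becomes one table shape on its own slide.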

Column split

Same idea, on the other axis: columns 1–5 on one slide, columns 6–10 on the next.

Each chunk duplicates the first column when it carries row labels.
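The column split can be sketched the same way, assuming the leading column holds the row labels. `label_cols` and the function name are hypothetical.

```python
from typing import List

Row = List[str]

def split_columns(table: List[Row], max_data_cols: int,
                  label_cols: int = 1) -> List[List[Row]]:
    """Split by columns, repeating the leading label column(s) in each chunk."""
    if max_data_cols < 1:
        raise ValueError("max_data_cols must be at least 1")
    if not table:
        return []
    width = max(len(row) for row in table)
    chunks = []
    for start in range(label_cols, width, max_data_cols):
        end = start + max_data_cols
        chunks.append([row[:label_cols] + row[start:end] for row in table])
    return chunks

table = [
    ["Region"] + [f"Q{n}" for n in range(1, 11)],
    ["North"] + [str(n) for n in range(1, 11)],
]
chunks = split_columns(table, max_data_cols=5)
```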

Rasterize

Render the table as PNG and insert it as a picture shape. Visual fidelity is preserved; editability is gone. For very large tables, this is often the only option that produces a readable slide.

Most converters skip splitting and either scale the font or let the table overflow the slide.

Merged cells

PowerPoint supports horizontal and vertical merging. The converter detects merges by finding cells where the expected interior border line is missing; everything spanned across that gap becomes one merged cell.

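Horizontal merge: the starting cell declares gridSpan, and the cell it covers stays in the row, marked hMerge="1":

<a:tc gridSpan="2">
  <a:txBody>...</a:txBody>
</a:tc>
<a:tc hMerge="1">
  <a:txBody/>
</a:tc>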

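Vertical merge spans multiple rows. The starting cell declares rowSpan with the number of rows it covers; the cells it covers in the following rows declare vMerge="1":

<!-- first row: the cell that starts the merge -->
<a:tc rowSpan="2">
  <a:txBody>...</a:txBody>
  <a:tcPr/>
</a:tc>
<!-- next row, same column: continuation cell -->
<a:tc vMerge="1">
  <a:txBody/>
  <a:tcPr/>
</a:tc>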

When merges are missed, what should be one cell becomes a row of empty cells with the content scattered across them. The error compounds visually because every adjacent merge fails the same way.
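A sketch of merge detection for a single row, assuming line detection has already reported which interior vertical borders exist. `has_border[j]` is True when a rule separates column j from column j+1; the function name and input shape are hypothetical. The result has one entry per grid column: the span for a cell that starts a merge (or stands alone), and 0 for a cell covered by the merge to its left.

```python
from typing import List

def grid_spans(has_border: List[bool]) -> List[int]:
    """Turn per-row interior border flags into horizontal span counts."""
    n_cols = len(has_border) + 1
    spans = [0] * n_cols
    origin = 0          # column where the current cell started
    spans[0] = 1
    for j, border in enumerate(has_border):
        col = j + 1
        if border:
            origin = col        # a rule is present: a new cell starts here
            spans[col] = 1
        else:
            spans[origin] += 1  # rule missing: extend the current cell
    return spans

# Four columns; the rule between the first two is missing.
spans = grid_spans([False, True, True])
```

A span greater than 1 maps to gridSpan on the starting cell, and each 0 maps to a covered cell in the same row. The vertical case is the same scan run down a column over the horizontal rules.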

Header style and theme binding

The right way to mark a header in a PPTX table is:

<a:tblPr firstRow="1" bandRow="1">
  <a:tableStyleId>{5940675A-B579-460E-94D1-54222C63F5DA}</a:tableStyleId>
</a:tblPr>

firstRow="1" enables header styling. bandRow="1" enables banded rows. Header and band colors live in tableStyles.xml and are referenced by tableStyleId. Theme changes that include a new table style flow through automatically.

The right behavior: detect the header row by its differing style, set firstRow and bandRow, and bind to a tableStyleId. Most converters skip this and copy the header’s fills and fonts as static formatting. The table looks identical at first; switching themes changes nothing.
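One way to sketch that behavior with the standard library: treat the first row as a header when its style differs from the most common body style, treat the body as banded when it alternates between exactly two styles, and emit the matching tblPr. The style keys (fill, bold) and both heuristics are assumptions for illustration; the style ID is the one shown above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

A = "http://schemas.openxmlformats.org/drawingml/2006/main"
ET.register_namespace("a", A)

def first_row_is_header(row_styles) -> bool:
    """row_styles holds one hashable style key per row, e.g. (fill, bold)."""
    if len(row_styles) < 2:
        return False
    common, _ = Counter(row_styles[1:]).most_common(1)[0]
    return row_styles[0] != common

def body_is_banded(row_styles) -> bool:
    """True when body rows alternate between exactly two styles."""
    body = row_styles[1:]
    alternates = all(a != b for a, b in zip(body, body[1:]))
    return len(body) >= 2 and alternates and len(set(body)) == 2

def tbl_pr(style_id: str, first_row: bool, band_row: bool) -> ET.Element:
    """Build <a:tblPr> with theme binding instead of static formatting."""
    el = ET.Element(f"{{{A}}}tblPr")
    if first_row:
        el.set("firstRow", "1")
    if band_row:
        el.set("bandRow", "1")
    ET.SubElement(el, f"{{{A}}}tableStyleId").text = style_id
    return el

styles = [("blue", True), ("white", False), ("grey", False),
          ("white", False), ("grey", False)]
pr = tbl_pr("{5940675A-B579-460E-94D1-54222C63F5DA}",
            first_row_is_header(styles), body_is_banded(styles))
xml = ET.tostring(pr, encoding="unicode")
```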

When the table isn’t recognized at all

Borderless tables, partially ruled tables, and tables with rendering artifacts defeat line detection. The contents don’t disappear; they come through as independent text boxes, positioned exactly where they sat in the PDF.

The slide looks like a table at a glance. It is not one. Cells aren’t editable as cells, changes don’t reflow rows, and copying the slide’s contents into Excel produces unstructured text. The defense, when you control the source, is to give every table explicit dividing lines.

XML structure

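PPT tables use the a: namespace where Word uses w:, but the structure is parallel: a:tbl holds a:tblPr, an a:tblGrid of a:gridCol entries, and a:tr rows of a:tc cells, just as w:tbl holds w:tblPr, w:tblGrid, w:tr, and w:tc.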

The tableStyleId reference makes a table theme-aware. Without it, the table is just a grid of formatted text.
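To make the structure concrete, here is a minimal sketch that builds the a:tbl fragment with the standard library. It covers only the table element itself (on a real slide it sits inside a p:graphicFrame), and the helper names and the uniform column width and row height are assumptions.

```python
import xml.etree.ElementTree as ET

A = "http://schemas.openxmlformats.org/drawingml/2006/main"
ET.register_namespace("a", A)

def q(tag: str) -> str:
    """Qualify a tag name with the DrawingML namespace."""
    return f"{{{A}}}{tag}"

def build_tbl(rows, col_w_emu: int, row_h_emu: int, style_id: str) -> ET.Element:
    tbl = ET.Element(q("tbl"))
    pr = ET.SubElement(tbl, q("tblPr"), {"firstRow": "1", "bandRow": "1"})
    ET.SubElement(pr, q("tableStyleId")).text = style_id
    grid = ET.SubElement(tbl, q("tblGrid"))
    for _ in rows[0]:
        ET.SubElement(grid, q("gridCol"), {"w": str(col_w_emu)})
    for row in rows:
        tr = ET.SubElement(tbl, q("tr"), {"h": str(row_h_emu)})
        for text in row:
            tc = ET.SubElement(tr, q("tc"))
            body = ET.SubElement(tc, q("txBody"))   # each cell is a text body
            ET.SubElement(body, q("bodyPr"))
            ET.SubElement(body, q("lstStyle"))
            run = ET.SubElement(ET.SubElement(body, q("p")), q("r"))
            ET.SubElement(run, q("t")).text = text
            ET.SubElement(tc, q("tcPr"))            # cell properties follow the text
    return tbl

tbl = build_tbl([["Name", "Qty"], ["Widget", "3"]],
                col_w_emu=1_828_800, row_h_emu=370_840,
                style_id="{5940675A-B579-460E-94D1-54222C63F5DA}")
xml = ET.tostring(tbl, encoding="unicode")
```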

Where tables fail

Four points of fragility:

  1. Borderless tables go undetected and become loose text.
  2. Large tables become unreadable on a slide.
  3. Partially ruled tables and rendering artifacts also defeat detection and drop the table to a collection of text boxes.
  4. Theme styling is not inherited, even when detection succeeds.

If preserving the table accurately matters, keep the original .pptx or the source data (Excel, CSV) and work from that, not from a PDF.

Figure: a large table doesn't fit on a slide; four options. (1) Font scaling: 10 pt → 6 pt fits, but is unreadable. (2) Split by rows: slide 1 holds rows 1–10, slide 2 holds rows 11–20 plus a duplicated header. (3) Split by columns: slide 1 holds columns 1–5, slide 2 holds columns 6–10 plus a duplicated label column. (4) Insert as image: rasterizing the table preserves the look but loses editability. Most converters use (1); a good one uses (2) with header detection.