Using Data Types

Choose the right data type for every value

At a Glance

  • Difficulty: Intermediate
  • Time required: ~15 minutes
  • Prerequisites: Understanding Data Extraction
  • What you'll learn: The five data types and when to use each - especially Date and Query (with list)

What is a data type?

Every extraction rule reads a value from your PDF. The data type determines how that value is understood and processed. A value recognized as a Date, for example, can be reformatted freely; a value recognized as a Number can be checked against a valid range; and a Query returns a result value you define, depending on the content.

You select the data type in the rule editor under "General" via the "Data type" field. It is available regardless of the data source - whether the value comes from the document text, a barcode or the file information.

Note: If the value comes from the source "Use custom text", only the Text data type is available. Learn more in the tutorial Using Data Sources.


The five data types at a glance

Data type | Suitable for | Typical example
Text | General values without special processing | Invoice number, customer name, reference
Date | Dates - freely reformattable (year/month/day individually) | Invoice date, due date
Number | Numeric values - checkable for range and format | Amount, quantity, page count
Query | Returns a defined result value depending on the content | "paid" / "open"
Query (with list) | Matches the value against a list and returns the assigned entry | Customer code to full customer name

Text

Text is the default data type and the right choice for most values. The recognized text is passed through unchanged. Use it for anything that does not need special handling as a date or number - such as invoice numbers, names or reference codes.

Tip: Even purely numeric codes such as an invoice or customer number are usually best kept as Text. You only need the Number data type if you actually want to calculate, round or check ranges.


Date

The Date data type automatically recognizes dates in many notations - for example 12/15/2024, 2024-12-15 or December 15, 2024. The big advantage: a value recognized as a date can then be reformatted freely, because the program knows the year, month and day individually.

Example: reassembling a date

The PDF contains:

Invoice date: December 15, 2024

Using the date placeholders, you produce for example:

  • <RuleId:1(InvoiceDate){Year4}>-<RuleId:1(InvoiceDate){Month}>-<RuleId:1(InvoiceDate){Day}> → 2024-12-15
  • <RuleId:1(InvoiceDate){Year4}>\<RuleId:1(InvoiceDate){MonthName}> → 2024\December

This lets you name files in a sortable way or file them automatically into year and month folders - regardless of how the date was written in the original document. You can find the available date building blocks (year, month, month name, day and more) in the tutorial Placeholder System Explained.

Tip: For dates, always choose the Date data type - even if you want to keep the date unchanged. Only then are the reformatting options available later.
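To make the idea of "knowing year, month and day individually" concrete, here is a minimal Python sketch. It is not how the program works internally; the function names and the short list of notations are assumptions chosen for illustration, and zero-padded month and day values are assumed as well.

```python
from datetime import date, datetime

# Only the three notations mentioned in this tutorial (assumption);
# the program itself recognizes many more.
KNOWN_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%B %d, %Y")


def parse_date(text: str) -> date:
    """Try each known notation in turn and return the first one that fits."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def sortable_name(d: date) -> str:
    """Year-month-day with zero padding, so names sort chronologically."""
    return f"{d.year:04d}-{d.month:02d}-{d.day:02d}"


def year_month_folder(d: date) -> str:
    """Year and month name joined by a backslash, as in the folder example."""
    return f"{d.year:04d}\\{d.strftime('%B')}"
```

All three notations lead to the same date value, which is why the output format no longer depends on how the date was written in the document. Note that `%B` uses the English month name only under Python's default locale.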


Number

The Number data type reads numeric values and understands different notations (for example thousands separators and decimal marks such as 1,234.56). Use it when you need the value as an actual number - for example to check a valid range or to enforce a consistent format.

Example: You extract an invoice amount and want to move only invoices of 1000 or more to a special folder. With the Number data type, the value can be checked as a number.
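The difference between a value treated as text and a value treated as a number can be sketched in Python. This is an illustration only: the function names are hypothetical, and the sketch assumes the notation from above (comma as thousands separator, period as decimal mark).

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Parse a value such as '1,234.56' (assumed notation: ',' groups, '.' decimals)."""
    return Decimal(text.strip().replace(",", ""))


def needs_special_folder(text: str, threshold: Decimal = Decimal("1000")) -> bool:
    """True if the amount is at or above the threshold (1000 in the example)."""
    return parse_amount(text) >= threshold


def text_comparison(text: str, threshold: str = "1000") -> bool:
    """Character-by-character comparison, shown only to illustrate the pitfall."""
    return text >= threshold
```

Compared as text, "999.99" counts as greater than "1000" because the character "9" sorts after "1". Compared as a number, the result is what you expect.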


Query

With the Query data type, a rule returns not the found text itself, but a result value you define - depending on what the document contains. To do this, you set up one or more conditions and assign a return value to each.

Example: determining the payment status

  • The document contains the word Paid → result paid
  • Otherwise → result open

You can then use the result (paid or open) as a placeholder in the file name or target folder.

The Query is ideal whenever you want to sort documents into fixed categories based on their content without outputting the search term itself.
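The logic of a Query can be sketched as a list of conditions with a fallback. The following Python snippet is a simplified model, not the program's implementation: the function name is hypothetical, and case-sensitive matching with "first match wins" is an assumption.

```python
from typing import Sequence, Tuple


def run_query(
    document_text: str,
    conditions: Sequence[Tuple[str, str]],
    default: str,
) -> str:
    """Return the result of the first condition whose search term occurs in the text.

    Assumptions: plain case-sensitive substring search, conditions checked in order.
    """
    for search_term, result in conditions:
        if search_term in document_text:
            return result
    return default


# The payment status example: "Paid" found -> "paid", otherwise "open".
PAYMENT_CONDITIONS = [("Paid", "paid")]
```

Calling `run_query(text, PAYMENT_CONDITIONS, "open")` returns only the defined result value, never the search term that was found in the document.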


Query (with list)

The Query (with list) is the most powerful variant. Instead of individual conditions, you set up an entire list of search terms, each with an assigned result value. The program checks the content against the list and returns the value assigned to the matching entry. This keeps the rule clear and manageable even when there are many cases.

Example: turning codes into full names

Found in the document | Result value
MM | Mustermann Ltd
BS | Example & Sons Inc
MH | Sample Trading Corp

If the program finds the code MM in the document, the rule returns Mustermann Ltd. Typical uses are determining document types, departments, suppliers or customers based on a maintained mapping list.
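A mapping list like the one above behaves much like a lookup table. The Python sketch below models that idea; it is an assumption-laden illustration, not the program's matching logic. In particular, the whole-word matching and the function name `lookup` are choices made for this example, and the two-letter codes are taken as read from the table.

```python
import re
from typing import Mapping, Optional

# Codes and result values from the example table.
CUSTOMERS = {
    "MM": "Mustermann Ltd",
    "BS": "Example & Sons Inc",
    "MH": "Sample Trading Corp",
}


def lookup(document_text: str, mapping: Mapping[str, str]) -> Optional[str]:
    """Return the value assigned to the first code found as a whole word.

    Assumption: a code only counts when it stands alone, so 'MM' inside
    a longer word such as 'COMMAND' is not a match.
    """
    for code, result in mapping.items():
        if re.search(rf"\b{re.escape(code)}\b", document_text):
            return result
    return None
```

Adding a new customer means adding one entry to the list; the rule itself stays unchanged.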

Fixed list or central (dynamic) list

You can maintain the list in two ways:

  • Fixed in the rule: the assignments belong to this one rule only. Ideal for short, stable lists.
  • Central as a dynamic list: you maintain the list once in the program options and can reuse it across multiple rules and profiles. Changes to the list take effect everywhere immediately. Ideal for longer or frequently changing assignments.

Tip: Use a central (dynamic) list when the same assignment is needed in several profiles or has to be extended regularly. That way you maintain it in just one place.


Practical example: combining data types

Suppose you process incoming invoices and want to name and file them sensibly. To do so, you combine several rules with different data types:

Rule | Data type | Result
Invoice number | Text | RE-2024-0042
Invoice date | Date | reformatted to 2024-12-15
Supplier | Query (with list) | code → Mustermann Ltd
Payment status | Query | paid or open

From this, you can assemble the following file name, for example:

<RuleId:2(InvoiceDate){Year4}>-<RuleId:2(InvoiceDate){Month}>-<RuleId:2(InvoiceDate){Day}>_<RuleId:3(Supplier)>_<RuleId:1(InvoiceNumber)>_<RuleId:4(PaymentStatus)>.pdf

Result: 2024-12-15_Mustermann Ltd_RE-2024-0042_paid.pdf
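If you want to check a file name pattern before using it, the substitution can be simulated outside the program. The following Python sketch is a rough model under stated assumptions: the regular expression covers only the placeholder forms shown in this tutorial, the rule results are hard-coded sample values, and the zero-padded formatting of the date parts is assumed. It does not reproduce the program's real placeholder engine.

```python
import re
from datetime import date

# Sample rule results, keyed by rule number as used in the pattern (assumed values).
RESULTS = {
    1: "RE-2024-0042",       # Invoice number (Text)
    2: date(2024, 12, 15),   # Invoice date (Date)
    3: "Mustermann Ltd",     # Supplier (Query with list)
    4: "paid",               # Payment status (Query)
}

# Date building blocks used in this tutorial; their exact formatting is assumed.
DATE_PARTS = {
    "Year4": lambda d: f"{d.year:04d}",
    "Month": lambda d: f"{d.month:02d}",
    "Day": lambda d: f"{d.day:02d}",
    "MonthName": lambda d: d.strftime("%B"),
}

# Matches <RuleId:N(Name)> and <RuleId:N(Name){Part}>.
PLACEHOLDER = re.compile(r"<RuleId:(\d+)\([^)]*\)(?:\{(\w+)\})?>")


def render(template: str, results: dict) -> str:
    """Replace every placeholder with the matching rule result."""
    def substitute(match: re.Match) -> str:
        value = results[int(match.group(1))]
        part = match.group(2)
        if part is not None:
            return DATE_PARTS[part](value)  # date building block requested
        return str(value)
    return PLACEHOLDER.sub(substitute, template)


TEMPLATE = (
    "<RuleId:2(InvoiceDate){Year4}>-<RuleId:2(InvoiceDate){Month}>-"
    "<RuleId:2(InvoiceDate){Day}>_<RuleId:3(Supplier)>_"
    "<RuleId:1(InvoiceNumber)>_<RuleId:4(PaymentStatus)>.pdf"
)
```

The sketch shows why the Date rule matters here: only a value stored as a date can be split into year, month and day, while the other three results are inserted as they are.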


Frequently asked questions

Question | Answer
When Text, when Number? | Use Text as long as you only output the value. Use Number only when you want to check ranges or work with calculations.
My date is not recognized. | Make sure the Date data type is selected and the data area captures the complete date (green highlight in the preview).
Query or Query (with list)? | For a few fixed cases, the Query is enough. For many assignments or reusable lists, the Query (with list) is clearer.
