Teradata DB-e 20: OTF Performance

Teradata Data Lakehouse

As you may know, Teradata released the Database Engine 20 a few months ago. One of the strongest assets in this version is the OTF support. If you are willing to use OTF, either Apache Iceberg or Delta Lake, with Vantage as a compute engine, this article will explain to you how to better tune the OTF workload in the Database Engine 20. This way, you will get the best possible performance through Vantage.

Of course, each OTF project has its own means to improve the performance, such as indexing, data skipping and statistics. Please refer to the documentation of the OTF flavour of your choice to learn the different techniques they have to offer to reduce the execution times.

On a separate note, as of August 2025, Teradata uses Apache Iceberg and Delta Lake Java APIs to connect to the storage with tables in the respective OTF project.

Workload Management

The OTF workloads which lean on the Java APIs provided by the OTF projects typically consume high amounts of CPU in the engine.

Moreover, Vantage needs to run a Java Virtual Machine to support the operations on the OTF tables through the Java APIs. As it is characteristic of the Java Virtual Machine (JVM) infrastructure, it consumes a lot of memory.

Consequently, you can use TASM to have the guardrails to protect the other, non-OTF high-priority workloads.

To put it differently, Workload Management can prevent:

  • Unexpected CPU peaks — Consequence of converting data from Parquet to Java SQL type, and then to the Teradata client format, which can be quite CPU-intensive.
  • Memory issues — When there is a high Java OTF query concurrency.

To configure Workload Management for OTF queries, TASM allows the creation of rule sets to control the concurrency and the priority of queries that read from or write to OTF tables. In order to use these rule sets, administrators must:

  1. Execute the store procedure TD_OTFDB.TDWMCreateOTFRulesSP, available in Enterprise 3. X and Intelliflex/VoV 20.0 for Java-based OTF. This store procedure creates the TASM rules POG_OTF_ALL (query limit) and POG_OTF_ALL_MM (query classification).
    • Users don’t need to run this procedure in a Lake platform because the TASM rules are already automatically defined.
  2. Adjust the rules POG_OTF_ALL and POG_OTF_ALL_MM based on the needs of the workload that runs in the Vantage platform.
  3. The OTF functions TD_OTFDB.TD_ICEBERG_READ, TD_OTFDB.TD_ICEBERG_WRITE, TD_OTFDB.TD_DELTA_READ and TD_OTFDB.TD_DELTA_WRITE are the classification criteria for the rules POG_OTF_ALL and POG_OTF_ALL_MM, for both CPU and concurrency.
  4. If an administrator activates a new TASM rule set at a later time, (s)he will have to re-execute the stored procedure TD_OTFDB.TDWMCreateOTFRulesSP specifying that new rule set.

Memory Allocation for the Java Virtual Machine

As I mentioned in the previous section, the Java Virtual Machine requires dedicated memory. So we must allocate additional memory to the TPA nodes (Trusted Parallel Application; i.e., the TPA nodes are the ones where the Vantage engine runs).

The memory allocation to the TPA nodes depends on the number of concurrent OTF queries the Java Virtual Machine needs to support. The default is 15 GB/node, but it can be adjusted.

Below, you have a suggestion for the number of concurrent queries supported on different instances with the default Java Virtual Machine memory configuration. Please note that the memory consumption may vary in your instance depending on several factors, so test for a more accurate memory value for your ecosystem.

Memory allocation for the JVM in Teradata

Thus, the cufconfig utility allows adjusting the Java Virtual Machine allocated memory from a TPA node, and so it permits for a higher concurrency in OTF queries. Under this paragraph, you have the steps to perform by whoever have access to the TPA nodes.

  1. Create a jvm.txt file under the /tmp directory with the below content. We must adjust the -Xmx value. In this example, the memory limit is increased to 30 GB.
    • HybridServer2JVMOptions: -Xms2g -Xmx30g -Dtdjvmtype=otf -Djava.security.properties=/etc/opt/teradata/tdotf/java_override.security
  2. Run this command: # cufconfig -f /tmp/jvm.txt

VARCHAR size

All VARCHAR columns are mapped to a String data type in Iceberg and Delta Lake. When Vantage reads these values, they are mapped to a default size of 4 KB. So, when users include these columns in a GROUP BY or JOIN condition, the string comparisons for these 4 KB columns severely impact performance.

To mitigate the performance hit that the VARCHAR columns cause, users must provide the actual sizes of the VARCHAR data set. They can set the VARCHAR sizes through the DATALAKE definition or QueryBand.

DATALAKE definition

If you want to control the VARCHAR size for all users and queries, you must add the sql_varchar_fixed_size_bytes parameter to the DATALAKE definition to set a VARCHAR default size.

CREATE DATALAKE Unity Catalog_DATALAKE
EXTERNAL SECURITY CATALOG ICEBERG_CATALOG_Unity Catalog_SA,
EXTERNAL SECURITY STORAGE ICEBERG_INVOKER_Unity Catalog_SA
USING CATALOG_TYPE('Unity Catalog')
CATALOG_LOCATION('https://abc.azuredatabricks.net/api/2.1/Unity Catalog-catalog/iceberg')
Unity Catalog_CATALOG_NAME ('abc_Unity Catalog')
STORAGE_ACCOUNT_NAME ('icebergstorageacct_abc')
TENANT_ID ('XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXX')
default_cluster_id ('0000-000000-00000000')
sql_varchar_fixed_size_bytes ('264')
TABLE FORMAT iceberg;

QueryBand

When you know the data set and the maximum length of all the character data columns, you can use QueryBand to set the individual VARCHAR size columns. This option allows for further control on the space allocated for VARCHAR.

Bear in mind that the QueryBand overrides the VARCHAR length set in DATALAKE through sql_varchar_fixed_size_bytes.

On a separate note, the scope of QueryBand is at the session-level. So, you’ll need to execute the SET QUERY_BAND statement on each session.

SET QUERY_BAND = 'column_sizes1=ITEM.i_item_id:32, ITEM.i_category:100, ITEM.i_class:100, ITEM.i_brand:100, ITEM.i_product_name:100, ITEM.i_color:40, ITEM.i_units:20, ITEM.i_size:40, ITEM.i_item_desc:400;' FOR SESSION;

Noteworthy, the VARCHAR column sizes are expressed in bytes in the QueryBand, and not as the number of characters. For example, if the column C_NAME is defined as VARCHAR(25), it should show the bytes in the QueryBand expression:

SET QUERY_BAND='column_sizes1=CUSTOMER_TABLE.C_NAME:50;' FOR SESSION;

When many VARCHAR columns need size adjustment, they may not all fit into a single name/value pair due to QueryBand property size limitations. In such cases, you should create multiple column_sizes name/value pairs. For example: column_sizes1=<value1>; column_sizes2=<value2>; column_sizes3=<value3>, and so on, within the same QueryBand statement.

SET QUERY_BAND = 'column_sizes1=NATION_iceberg.N_NAME:28, PARTTBL_iceberg.P_MFGR:28, REGION_iceberg.R_NAME:22;

column_sizes2=SUPPLIER_iceberg.S_NAME:36, SUPPLIER_iceberg.S_ADDRESS:80, SUPPLIER_iceberg.S_NATIONKEY:4, SUPPLIER_iceberg.S_PHONE:15, SUPPLIER_iceberg.S_COMMENT:200;' FOR SESSION;

WITH Clause

Whenever possible, you should rewrite queries using the WITH clause to improve performance. It is important to realise that the Java implementation of OTF in Vantage is a table operator, and the WITH clause allows avoiding invocations of multiple table operator.

QueryGrid and OTF

QueryGrid Manager is a server responsible for logging, heartbeat monitoring, and configuration, while all computation runs on the Vantage Database Engine. I.e., QueryGrid provides with engine-to-engine connectivity.

One of QueryGrid capabilities is that it can support the execution of queries on OTF to an Enterprise Cluster or VantageCore instance. So, QueryGrid will distribute the workload among all participating instances, thereby alleviating the load that high-consuming OTF queries could represent if all of them ran on the same instance.

Furthermore, QueryGrid can also connect VantageCloud instances with on-prem.

To Know More

If you need to know the OTF features, OTF projects and catalogues that Teradata supports in Vantage, navigate to the Getting Started guides, in the Teradata online documentation, and click on Getting started with the Open Table Formats. This tile will always take you to the latest version of the Teradata® Open Table Format for Apache Iceberg and Delta Lake User Guide.

In addition to the supported features and projects, the Teradata OTF User Guide explains how to use and tune Vantage when reading and writing from OTF workloads.


I added that the VARCHAR length must be expressed in bytes in the QueryBand expression on 22 September 2025.


Teradata. (n.d.). Teradata® Open Table Format for Apache Iceberg and Delta Lake User Guide on the Teradata online documentation website. Retrieved in August 2025, from https://docs.teradata.com/r/Enterprise_IntelliFlex_Lake_VMware/Teradata-Open-Table-Format-for-Apache-Iceberg-and-Delta-Lake-User-Guide


Comments

Leave a Reply

Your email address will not be published. Required fields are marked *