Teradata DB-e 20: OTF Performance

As you may know, Teradata released the Database Engine 20 a few months ago. One of the strongest assets in this version is the OTF support. If you are willing to use OTF, either Apache Iceberg or Delta Lake, with Vantage as a compute engine, this article will explain to you how to better tune the OTF workload in the Database Engine 20. This way, you will get the best possible performance through Vantage.

Of course, each OTF project has its own means to improve the performance, such as indexing, data skipping and statistics. Please refer to the documentation for the OTF flavour of your choice to learn the different techniques it offers to reduce execution time.

On a separate note, until March 2026, Teradata used only the Apache Iceberg and Delta Lake Java APIs to connect to the OTF tables. However, starting in March 2026, Teradata began silently replacing the Java-based connections with a proprietary connection built into Vantage, which significantly boosts performance.

Finally, this article includes a last section that explains that Teradata automatically discovers new OTF tables in the Data Fabric, unlike other engines, and how users can list all OTF tables.

Workload Management

The OTF workloads which lean on the Java APIs provided by the OTF projects typically consume high amounts of CPU in the engine.

Moreover, Vantage needs to run a Java Virtual Machine to support the operations on the OTF tables through the Java APIs. As it is characteristic of the Java Virtual Machine (JVM) infrastructure, it consumes a lot of memory.

Consequently, you can use TASM to have the guardrails to protect the other, non-OTF high-priority workloads.

To put it differently, Workload Management can prevent:

Unexpected CPU peaks — Consequence of converting data from Parquet to Java SQL type, and then to the Teradata client format, which can be quite CPU-intensive.
Memory issues — When there is a high Java OTF query concurrency.

To configure Workload Management for OTF queries, TASM allows the creation of rule sets to control the concurrency and the priority of queries that read from or write to OTF tables. In order to use these rule sets, administrators must:

Execute the store procedure TD_OTFDB.TDWMCreateOTFRulesSP, available in Enterprise 3. X and Intelliflex/VoV 20.0 for Java-based OTF. This store procedure creates the TASM rules POG_OTF_ALL (query limit) and POG_OTF_ALL_MM (query classification).
- Users don’t need to run this procedure in a Lake platform because the TASM rules are already automatically defined.
Adjust the rules POG_OTF_ALL and POG_OTF_ALL_MM based on the needs of the workload that runs in the Vantage platform.
The OTF functions TD_OTFDB.TD_ICEBERG_READ, TD_OTFDB.TD_ICEBERG_WRITE, TD_OTFDB.TD_DELTA_READ and TD_OTFDB.TD_DELTA_WRITE are the classification criteria for the rules POG_OTF_ALL and POG_OTF_ALL_MM, for both CPU and concurrency.
If an administrator activates a new TASM rule set at a later time, (s)he will have to re-execute the stored procedure TD_OTFDB.TDWMCreateOTFRulesSP specifying that new rule set.

Memory Allocation for the Java Virtual Machine

As I mentioned in the previous section, the Java Virtual Machine requires dedicated memory. So we must allocate additional memory to the TPA nodes. TPA stands for Trusted Parallel Application and they are the nodes where the Vantage engine runs.

The memory allocation to the TPA nodes depends on the number of concurrent OTF queries the Java Virtual Machine needs to support. The default is 15 GB/node, but it can be adjusted.

Below, you have a suggestion for the number of concurrent queries supported on different instances with the default Java Virtual Machine memory configuration. Please note that memory consumption may vary in your instance depending on several factors. So you should test for a more accurate memory value for your ecosystem.

Thus, the cufconfig utility allows adjusting the Java Virtual Machine allocated memory from a TPA node. Consequently it permits a higher concurrency in OTF queries. Under this paragraph, you have the steps to perform by whoever have access to the TPA nodes.

Create a jvm.txt file under the /tmp directory with the below content. We must adjust the -Xmx value. In this example, the memory limit is increased to 30 GB.
- HybridServer2JVMOptions: -Xms2g -Xmx30g -Dtdjvmtype=otf -Djava.security.properties=/etc/opt/teradata/tdotf/java_override.security
Run this command: # cufconfig -f /tmp/jvm.txt

VARCHAR size

All VARCHAR columns are mapped to a String data type in Iceberg and Delta Lake. When Vantage reads these values, they are mapped to a default size of 4 KB. So, when users include these columns in a GROUP BY or JOIN condition, the string comparisons for these 4 KB columns severely impact performance.

To mitigate the performance hit caused by the VARCHAR columns , users must provide the actual sizes of the VARCHAR data set. They can set the VARCHAR sizes through the DATALAKE definition or QueryBand.

DATALAKE definition

If you want to control the VARCHAR size for all users and queries, you must add the sql_varchar_fixed_size_bytes parameter to the DATALAKE definition to set a VARCHAR default size.

			
CREATE DATALAKE Unity Catalog_DATALAKE
EXTERNAL SECURITY CATALOG ICEBERG_CATALOG_Unity Catalog_SA,
EXTERNAL SECURITY STORAGE ICEBERG_INVOKER_Unity Catalog_SA
USING CATALOG_TYPE('Unity Catalog')
CATALOG_LOCATION('https://abc.azuredatabricks.net/api/2.1/Unity Catalog-catalog/iceberg')
Unity Catalog_CATALOG_NAME ('abc_Unity Catalog')
STORAGE_ACCOUNT_NAME ('icebergstorageacct_abc')
TENANT_ID ('XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXX')
default_cluster_id ('0000-000000-00000000')
sql_varchar_fixed_size_bytes ('264')
TABLE FORMAT iceberg;

		

QueryBand

When you know the data set and the maximum length of all the character data columns, you can use QueryBand to set the individual VARCHAR size columns. This option allows for further control on the space allocated for VARCHAR.

Bear in mind that the QueryBand overrides the VARCHAR length set in DATALAKE through sql_varchar_fixed_size_bytes.

On a separate note, the scope of QueryBand is at the session-level. So, you’ll need to execute the SET QUERY_BAND statement on each session.

			
SET QUERY_BAND = 'column_sizes1=ITEM.i_item_id:32, ITEM.i_category:100, ITEM.i_class:100, ITEM.i_brand:100, ITEM.i_product_name:100, ITEM.i_color:40, ITEM.i_units:20, ITEM.i_size:40, ITEM.i_item_desc:400;' FOR SESSION;

Noteworthy, the VARCHAR column sizes are expressed in bytes in the QueryBand, and not as the number of characters. For example, if the column C_NAME is defined as VARCHAR(25), it should show the bytes in the QueryBand expression:

SET QUERY_BAND='column_sizes1=CUSTOMER_TABLE.C_NAME:50;' FOR SESSION;

When many VARCHAR columns need size adjustment, they may not all fit into a single name/value pair due to QueryBand property size limitations. In such cases, you should create multiple column_sizes name/value pairs. For example: column_sizes1=<value1>; column_sizes2=<value2>; column_sizes3=<value3>, and so on, within the same QueryBand statement.

			
SET QUERY_BAND = 'column_sizes1=NATION_iceberg.N_NAME:28, PARTTBL_iceberg.P_MFGR:28, REGION_iceberg.R_NAME:22;
column_sizes2=SUPPLIER_iceberg.S_NAME:36, SUPPLIER_iceberg.S_ADDRESS:80, SUPPLIER_iceberg.S_NATIONKEY:4, SUPPLIER_iceberg.S_PHONE:15, SUPPLIER_iceberg.S_COMMENT:200;' FOR SESSION;

WITH Clause

Whenever possible, you should rewrite queries using the WITH clause to improve performance. It is important to realise that the Java implementation of OTF in Vantage is a table operator, and the WITH clause allows avoiding invocations of multiple table operator.

QueryGrid and OTF

QueryGrid Manager is a server responsible for logging, heartbeat monitoring, and configuration, while all computation runs on the Vantage Database Engine. I.e., QueryGrid provides with engine-to-engine connectivity.

One of QueryGrid capabilities is that it can support the execution of queries on OTF to an Enterprise Cluster or VantageCore instance. So, QueryGrid will distribute the workload among all participating instances, thereby alleviating the load that high-consuming OTF queries could represent if all of them ran on the same instance.

Furthermore, QueryGrid can also connect VantageCloud instances with on-prem.

Automatically detect new OTF tables

If you add a new OTF table in the Data Fabric and don’t use the Teradata engine, Vantage automatically detects it without additional human intervention. Other engines do require the administrator to manually register the new table in their metadata. See the article Apache Iceberg, Delta Lake and Various Engines.

This feature is not a performance enhancement. However, it helps handle complex Data Fabrics in a way similar to the Evolution features already included in the different OTF flavours. If for every new table your organisation creates, you have to hand-make external tables in the engines, it poses a risk to operations or analysis of critical information. This situation is particularly concerning if your organisation has a central Data Fabric that houses all your data, and if operations are complex and involve many teams. Fortunately, it is not an issue in the case of Vantage

Note that Teradata doesn’t need to create any foreign tables for the optimiser to detect OTF tables; it queries the catalogue to retrieve any information it needs.

Furthermore, Vantage users can query tables in the connected Data Fabric via the DBC view DatalakeInfoV. See Data Discovery using Database Views and HELP Statements in the Teradata online documentation.

To Know More

If you need to know the OTF features, OTF projects and catalogues that Teradata supports in Vantage, navigate to the Getting Started guides, in the Teradata online documentation, and click on Getting started with the Open Table Formats. This tile will always take you to the latest version of the Teradata® Open Table Format for Apache Iceberg and Delta Lake User Guide.

In addition to the supported features and projects, the Teradata OTF User Guide explains how to use and tune Vantage when reading and writing from OTF workloads.

I added that the VARCHAR length must be expressed in bytes in the QueryBand expression on 22 September 2025.

The “Automatically detect new OTF tables” section was included on 15 April 2026. Furthermore, the remark about the proprietary OTF connection built into Vantage was added to correct that Vantage no longer uses the Java APIs.

Reference

Teradata. (n.d.). Teradata® Open Table Format for Apache Iceberg and Delta Lake User Guide on the Teradata online documentation website. Retrieved in August 2025, from https://docs.teradata.com/r/Enterprise_IntelliFlex_Lake_VMware/Teradata-Open-Table-Format-for-Apache-Iceberg-and-Delta-Lake-User-Guide