Teradata today announced it is adding support for multiple open table formats that make it possible to use any storage class of an object storage system to host data.

In addition, the company is making available a public preview of an artificial intelligence (AI) engine that enables models to process data more efficiently in parallel.

Chris Twogood, senior vice president for global marketing for Teradata, said the company will support both the Apache Iceberg and Linux Foundation Delta Lake data formats to enable greater interoperability between its platforms and third-party data repositories, including Amazon Web Services (AWS) Glue, Unity and Apache Hive catalogs.

As part of this open strategy, Teradata is also adding connectors to platforms and tools such as Airbyte Cloud, Apache Airflow™, and dbt™ (data build tool) to make it simpler to build data pipelines.

Unlike rival providers of data lake platforms, Teradata is not trying to lock customers into one type of open format at the expense of another, said Twogood. IT teams as a result will be able to deploy any type of engine to process data on the Teradata VantageCloud Lake platform, he said.

IT teams will then be able to select any engine to run on this platform based on the price/performance provided, noted Twogood.

Historically, Teradata has made use of a proprietary table format. However, as more open formats have gained traction, the need to support what have become de facto standards has increased. IT organizations are embracing these formats to make sure that, as the volume of data they process and analyze continues to steadily increase, they are not locked into any proprietary platform.

It’s not clear how much of that data adheres to these open formats, but going forward a much larger percentage is going to be stored using formats that enable data to migrate between various repositories more easily.

Data management strategies will naturally vary from one organization to another. Most are moving toward a single data lake, but some are opting to federate the management of data across multiple platforms, either because they prefer to or as part of an effort to eventually standardize on a single data lake. The one thing that is certain is the amount of data that needs to be stored continues to exponentially increase.

The value of that data is often unclear, but in the age of artificial intelligence (AI) more organizations than ever are looking to store data that might one day be used to train various types of models. Teradata is betting many of those models will be hosted directly on its data lake to maximize performance.

In theory, AI tools should make it simpler to manage all that data. In effect, IT organizations will need to invest in AI to enable them to take advantage of AI. The challenge and the opportunity now is determining how best to automate the management of all that data in a way that reduces total costs even as the amount of data being created continues to surge.
