C3 AI Data Studio

C3 AI Data Studio provides data engineers, data scientists, and business analysts with a low-code / no-code environment to rapidly explore disparate data sources and model them into a unified, common, and extensible data image. Built using the model-driven architecture of the C3 AI Platform, C3 AI Data Studio provides over 200 pre-built data integrations to access and unify structured, unstructured, image, application, file system, database, sensor, and other data. Engineers and scientists can also author their own integrations.

C3 AI Data Studio enables users to configure and test data ingestion pipelines, inferring metadata from the source data and offering AI-assisted auto-mapping. Users can build pipelines with the C3 AI Platform’s expression library or write custom transforms in JavaScript, Python, or R. C3 AI Data Studio also allows users to explore data models, set up validation checks to ensure data quality, and define access control policies.

Low-code / no-code platform for data integration

  • Leverage C3 AI Data Studio’s low-code / no-code environment to set up continuous data ingestion pipelines to read data from external data stores and persist data into relational, key value, or file storage systems
  • Create reusable modular multi-step transforms between source and target data stores
  • Manually inject large datasets into ingestion pipelines for testing or persisting purposes
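
The idea of reusable, modular, multi-step transforms can be illustrated with a minimal Python sketch. It does not use any C3 AI API: the helper names (rename, cast, pipeline) and the sample field names are hypothetical and chosen only for illustration. Each step is a small function from record to record, and a pipeline is just an ordered composition of steps.

```python
from functools import reduce
from typing import Any, Callable, Dict

Record = Dict[str, Any]
Step = Callable[[Record], Record]


def rename(mapping: Dict[str, str]) -> Step:
    """Build a step that renames source fields to target field names."""
    def step(record: Record) -> Record:
        return {mapping.get(key, key): value for key, value in record.items()}
    return step


def cast(field: str, to_type: Callable[[Any], Any]) -> Step:
    """Build a step that converts one field with the given callable."""
    def step(record: Record) -> Record:
        out = dict(record)  # copy so the input record is left unchanged
        out[field] = to_type(out[field])
        return out
    return step


def pipeline(*steps: Step) -> Step:
    """Compose steps into a single transform applied left to right."""
    def run(record: Record) -> Record:
        return reduce(lambda acc, step: step(acc), steps, record)
    return run


# Hypothetical source record and target field names.
to_target = pipeline(
    rename({"temp_f": "temperature_f"}),
    cast("temperature_f", float),
)
result = to_target({"temp_f": "71.5", "sensor": "A1"})
print(result)  # {'temperature_f': 71.5, 'sensor': 'A1'}
```

Because every step has the same signature, the same rename or cast step can be reused in other pipelines without modification.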

Unified data image

  • Explore, navigate, and visualize your data on a unified data image using C3 AI Data Studio
  • Make your enterprise and external data available for consumption through a common presentation layer
  • Present your data uniformly, abstracted from complexities such as data type (relational vs. time series), data source (internal vs. external), or data store type (key value vs. object storage)
  • Add data models either by manually defining the attributes or by automatically inferring metadata from existing sample files or external data sources
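
To make "inferring metadata from existing sample files" concrete, here is a rough Python sketch that derives a field type and nullability for each column of a small CSV sample. It is only an illustration of the general idea under simple assumptions (three type names, empty string means null); it is not how C3 AI Data Studio performs inference, and infer_type, infer_schema, and the sample columns are hypothetical.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest of int, double, string that fits all non-empty values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    for name, parser in (("int", int), ("double", float)):
        try:
            for value in non_empty:
                parser(value)
            return name
        except ValueError:
            continue  # at least one value did not parse; try the next type
    return "string"


def infer_schema(sample_csv):
    """Return {field: {"type": ..., "nullable": ...}} for a CSV sample with a header."""
    rows = list(csv.DictReader(io.StringIO(sample_csv)))
    fields = list(rows[0].keys()) if rows else []
    return {
        field: {
            "type": infer_type([row[field] for row in rows]),
            "nullable": any(row[field] == "" for row in rows),
        }
        for field in fields
    }


# Hypothetical sample file content.
SAMPLE = "id,name,reading\n1,pump-a,3.5\n2,pump-b,\n"
schema = infer_schema(SAMPLE)
for field, meta in schema.items():
    print(field, meta)
```

For this sample, id is inferred as a non-nullable int, name as a non-nullable string, and reading as a nullable double. A user would then review and adjust the inferred attributes before saving the model.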

Data virtualization

  • Use C3 AI Data Studio’s data virtualization mechanism to leverage existing enterprise data lakes
  • Reduce data center and cloud hosting costs by minimizing data replication across enterprise data stores
  • View a unified data model abstracted from the underlying source system implementation. C3 AI Data Studio infers metadata from source systems and presents users with an extensible data model

Enterprise catalog

C3 AI Data Studio enables developers to access all relevant metadata on data objects, features, and machine learning models through an enterprise catalog.

  • Capture technical and business metadata about data and machine learning models via a flexible and powerful cataloging system
  • Discover and trace lineage from source systems to machine learning models for an end-to-end view of your data
  • Easily track data sources, ownership, and usage for further analysis and extended collaboration
  • Accelerate data discovery by applying custom tags to the metadata

Pre-built connectors

C3 AI Data Studio provides over 200 pre-built data integrations to access cloud and on-premises data sources without having to develop any custom integrations.

  • Databases and big-data stores including Snowflake, Impala, HBase, Postgres, CosmosDB, MongoDB, Oracle, AWS Redshift, SQL Server
  • Cloud applications including Salesforce, HubSpot
  • Queue-based systems including Apache Kafka, Azure Event Hubs, Azure Topics, AWS Kinesis, AWS SQS
  • File systems including AWS S3, Azure Data Lake Store gen2, Azure Blob, HDFS, local file system

Continuous data ingestion validation

  • Set up simple validation rules to ensure data quality on each C3 AI Model – check data type, nullability, and allowed values
  • Leverage the expression engine to validate objects by their own data or by other correlated data inside the unified data image
  • Detect and report any unexpected transformation errors in the data ingestion pipeline for further investigation
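
The three simple rule kinds named above (data type, nullability, allowed values) can be sketched in plain Python as follows. This is an illustrative stand-in, not the C3 AI validation mechanism: FieldRule, validate, and the example fields are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class FieldRule:
    """Hypothetical per-field rule: expected type, nullability, allowed values."""
    expected_type: type
    nullable: bool = False
    allowed: Optional[frozenset] = None


def validate(record: Dict[str, Any], rules: Dict[str, FieldRule]) -> List[str]:
    """Return a list of human-readable violations; an empty list means valid."""
    errors = []
    for field, rule in rules.items():
        value = record.get(field)
        if value is None:
            # A missing field is treated the same as an explicit null.
            if not rule.nullable:
                errors.append(f"{field}: must not be null")
            continue
        if not isinstance(value, rule.expected_type):
            errors.append(f"{field}: expected {rule.expected_type.__name__}")
        elif rule.allowed is not None and value not in rule.allowed:
            errors.append(f"{field}: {value!r} is not an allowed value")
    return errors


# Hypothetical rules for an equipment record.
RULES = {
    "id": FieldRule(str),
    "status": FieldRule(str, allowed=frozenset({"active", "retired"})),
    "capacity": FieldRule(float, nullable=True),
}

errors = validate({"id": "p-1", "status": "unknown", "capacity": None}, RULES)
print(errors)  # ["status: 'unknown' is not an allowed value"]
```

Collecting all violations instead of stopping at the first one makes it easier to report every problem with a record in a single pass.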

Data exploration

  • Explore data across C3 AI Models through a common interface
  • Rapidly access and filter data regardless of the underlying system – source data stores, file storage systems, or object stores

Extensive transformation engine

  • Choose from more than 150 pre-built expressions to map source data to target data models
  • Define your own reusable custom expressions and share them across development teams
  • Implement code-based transform methods using JavaScript, Python, and R to handle complex data transformations
  • Leverage AI-assisted auto-mapping to reduce the time to create mappings between source data and the unified data image
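
As a rough intuition for what auto-mapping produces, the sketch below suggests a target field for each source field using plain string similarity from Python's standard difflib module. This heuristic is a deliberately simple stand-in and makes no claim about how C3 AI's AI-assisted auto-mapping works; suggest_mapping, normalize, and the field names are hypothetical.

```python
from difflib import get_close_matches


def normalize(name):
    """Lower-case a field name and drop separators such as '_' and '-'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def suggest_mapping(source_fields, target_fields, cutoff=0.6):
    """Map each source field to its most similar target field, or None."""
    normalized_targets = {normalize(target): target for target in target_fields}
    mapping = {}
    for source in source_fields:
        matches = get_close_matches(
            normalize(source), list(normalized_targets), n=1, cutoff=cutoff
        )
        mapping[source] = normalized_targets[matches[0]] if matches else None
    return mapping


# Hypothetical source columns and target model attributes.
suggestions = suggest_mapping(
    ["Serial_No", "install_dt", "xyz"],
    ["serialNumber", "installDate", "status"],
)
for source, target in suggestions.items():
    print(source, "->", target)
```

Here Serial_No is matched to serialNumber and install_dt to installDate, while xyz has no sufficiently similar target and is left unmapped (None) for a person to resolve. Suggestions like these are a starting point that a user reviews, not a final mapping.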