Data persistence involves saving data in a non-volatile storage system so that the data’s value can be retrieved reliably later. Data can take many forms, including structured, unstructured, and semi-structured formats, so a variety of storage technologies exist to preserve each type of data in its proper structure, along with any metadata that describes the origin, format, or history of that data. Examples include relational database management systems, key-value stores and other NoSQL databases, the Hadoop Distributed File System (HDFS), and cloud data warehouses. Each technology has advantages and disadvantages in terms of cost, performance, reliability, latency, and access methods.
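As a minimal sketch of the idea, assuming a small relational store is sufficient, the Python example below uses the standard-library sqlite3 module to write a record to a file on disk and read it back over a separate connection. The table name, column names, and sample values are hypothetical and chosen only for illustration; the source column stands in for metadata about the data's origin.

```python
import os
import sqlite3
import tempfile


def save_reading(db_path, sensor_id, value, source):
    """Write one record, with metadata about its origin, to a database file."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS readings ("
                "sensor_id TEXT, value REAL, source TEXT)"
            )
            conn.execute(
                "INSERT INTO readings (sensor_id, value, source) "
                "VALUES (?, ?, ?)",
                (sensor_id, value, source),
            )
    finally:
        conn.close()


def load_readings(db_path):
    """Open a fresh connection and return every stored record."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT sensor_id, value, source FROM readings"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical sample data, stored in a temporary directory.
    path = os.path.join(tempfile.mkdtemp(), "readings.db")
    save_reading(path, "pump-1", 42.5, "plant-a-historian")
    print(load_readings(path))
```

Because the record lives in a file and not in process memory, the second connection sees it even though the first has been closed. The same would hold after the program that wrote it has exited.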
Data persistence is essential to data science and machine learning because analysis depends on comprehensive data sets that capture historical behavior as well as current operational input. Although data can be stored locally within a data science platform, it more commonly resides in internal or external data stores, is consolidated into a data lake, or is accessed through federated virtual data sources.
Through its model-driven architecture, the C3 AI® Suite provides an object model that makes it easy to integrate both internal and external data sources. C3 AI’s platform can store data internally in Postgres or Cassandra data stores, load data files from HDFS or cloud-based storage, or federate data from external stores such as enterprise databases or public data repositories.