Data Pools

Data Pools are where you upload and store the data for your text generation needs. Each data pool has its own upload storage, preprocessor, and objects storage.

Why Multiple Data Pools?

A pipeline can have multiple data pools, and each data pool is isolated from the others. This enables you to define different processes for different types of data and keep them separate.

You usually only want multiple data pools if you have multiple types of data that differ in structure and meaning. Typically, a primary data pool contains all the objects you want to generate text from, and optional secondary data pools hold data that you pull in from the fanout script to enrich your primary data. Read more about enrichment here.

Having separate data pools saves you from having to combine your data before upload and allows you to only update relevant data when you need to.

Some examples of sets of data pools:

Products + Brands + Categories
Hotels + Cities + Countries
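
To make the primary/secondary split concrete, here is a minimal sketch of enrichment for the Products + Brands + Categories example. It is not the platform's fanout script API: the dictionaries stand in for data pools, and the field names `brand_id` and `category_id` as well as the `enrich` helper are assumptions made purely for illustration.

```python
# Hypothetical in-memory stand-ins for three isolated data pools.
# Keys are object IDs; none of these names come from the platform itself.
products = {
    "p1": {"name": "Trail Shoe", "brand_id": "b1", "category_id": "c1"},
}
brands = {"b1": {"name": "Acme", "founded": 1990}}
categories = {"c1": {"name": "Running"}}


def enrich(product, brands, categories):
    """Return a copy of the product with its brand and category looked up."""
    enriched = dict(product)
    # .get() returns None when the secondary pool has no matching object.
    enriched["brand"] = brands.get(product.get("brand_id"))
    enriched["category"] = categories.get(product.get("category_id"))
    return enriched


enriched_product = enrich(products["p1"], brands, categories)
```

Because the brand and category data live in their own pools, updating a brand only touches the `brands` data; the product objects stay unchanged and pick up the new values the next time they are enriched.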

Autogenerating Text

If you enable autogeneration on a data pool, each inserted and updated object will automatically start a new fanout.
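
The behavior can be pictured with a small toy model. This is only a sketch of the rule described above, not the platform's implementation: the `DataPool` class, its `upsert` method, and the `started_fanouts` list are hypothetical names introduced for illustration.

```python
class DataPool:
    """Toy model of a data pool; not the platform's real API."""

    def __init__(self, autogenerate=False):
        self.autogenerate = autogenerate
        self.objects = {}
        # IDs of objects for which a fanout was started, in order.
        self.started_fanouts = []

    def upsert(self, object_id, data):
        """Insert or update an object; start a fanout if autogeneration is on."""
        self.objects[object_id] = data
        if self.autogenerate:
            self.started_fanouts.append(object_id)


pool = DataPool(autogenerate=True)
pool.upsert("p1", {"name": "Trail Shoe"})    # insert starts a fanout
pool.upsert("p1", {"name": "Trail Shoe 2"})  # update starts another fanout

manual_pool = DataPool()  # autogeneration disabled
manual_pool.upsert("p1", {"name": "Trail Shoe"})  # no fanout is started
```

In this model, both the insert and the update on the autogenerating pool each record one fanout, while the pool without autogeneration records none.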

Studio vs Cockpit

Studio does not need collections. You can store all data you need for your different text generation needs in a single data pool. See fanout for how to generate different texts, in different languages, from the same pool.
