Pipelines
Pipelines are responsible for the whole text generation process, from uploading data, through generating text, to delivering the results.
Each pipeline is completely isolated from the others: each one has its own data, rules, and results, and you can never reference data or blueprints from another pipeline.
A pipeline consists of:
Data Pools
Data pools help you organize different types of data that should not be mixed. If you only have one type of data, for example products, you only need one data pool. Learn more
Each data pool has its own upload storage, preprocessor and objects storage.
Uploads
The first stop of your (raw) data. Learn more
Preprocessor
Clean and transform your raw uploads with a preprocessor. Learn more
Objects
The data you uploaded and preprocessed, in the form of objects. Learn more
Fanout
Control how each data object is rendered. Learn more
Blueprints
Define rules for text generation. Learn more
Results
See the status of generation and its results. Learn more
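Conceptually, the components listed above nest as follows. This is only an illustrative sketch in Python, the field names are assumptions for explanation and not the actual Studio data model:

```python
# Illustrative sketch of how a pipeline is composed.
# All names here are assumptions, not the real Studio schema.
pipeline = {
    "name": "products",
    "data_pools": [
        {
            "name": "default",
            "uploads": [],               # raw uploaded data (first stop)
            "preprocessor": "clean.py",  # script run on each upload
            "objects": [],               # preprocessed, generation-ready data
        }
    ],
    "fanout": "fanout.py",               # controls how each object is rendered
    "blueprints": [],                    # rules for text generation
    "results": [],                       # generation statuses and texts
}
```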
TODO API, Settings
Studio vs Cockpit
While pipelines may seem similar to projects in Cockpit, they are much more powerful and don't require you to think about collections or worry about exactly what data to upload. In Studio, you should end up with far fewer pipelines than you had projects in Cockpit.
Data Upload Flow
The data flow in a pipeline is as follows:
- Uploads: You upload your data to the pipeline. This data is stored in the uploads storage.
- Preprocessor: The preprocessor is run on each upload. It transforms the data into a format that is ready for text generation.
- Objects: The preprocessed data is stored in the objects storage. Each object represents a single unit of data that can be used for text generation.
- If you have autogenerate enabled, each added or updated object automatically starts a text generation flow.
Text Generation Flow
If you manually initiate text generation by pressing "Generate All", or if you have autogenerate enabled, the following steps are taken:
- Fanout: The fanout script is run for each object that has requested a render. The fanout script returns a list of render job definitions.
- Blueprints: For each render job, the specified blueprint is run with the specified language.
- Storage: The generated text is stored in the results storage.
- Delivery: The generated text is delivered to the user via webhook, if one is specified in the job config.
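The fanout step above can be sketched as a function that maps one object to its render job definitions. This is a hypothetical Python example; the real fanout script's language and the exact job-definition fields are assumptions here:

```python
# Hypothetical fanout script sketch: for one object, return a list of
# render job definitions (blueprint + language). The blueprint name
# and field names are illustrative assumptions, not the Studio API.

def fanout(obj: dict) -> list[dict]:
    """Return one render job per target language for this object."""
    jobs = []
    for language in ("en", "de"):
        jobs.append({
            "blueprint": "product-description",
            "language": language,
            "object_id": obj["id"],
        })
    return jobs
```

Each returned job definition then drives one blueprint run in the specified language, and the resulting text lands in the results storage.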
Through all these steps, you can monitor the progress and results in the results section of the pipeline.