Generate and stream synthetic dataset files in JSON Lines format (currently using google/gemma-2b-it).
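
For readers unfamiliar with the format, JSON Lines stores one JSON object per line, which is what makes row-by-row streaming possible. The sketch below is only an illustration and not this project's actual code: the function names `stream_jsonl` and `read_jsonl` and the sample rows are hypothetical, and the model call is left out entirely.

```python
import io
import json
from typing import Iterable, Iterator


def stream_jsonl(records: Iterable[dict]) -> Iterator[str]:
    """Yield one JSON Lines row (a JSON object plus a newline) per record."""
    for record in records:
        # json.dumps escapes embedded newlines, so each record stays on one line.
        yield json.dumps(record) + "\n"


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Parse JSON Lines text back into dictionaries, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


# Hypothetical rows standing in for generated output (assumption, not real data).
sample_rows = [
    {"id": 1, "text": "first synthetic row"},
    {"id": 2, "text": "second row\nwith a newline"},
]

# Write the rows chunk by chunk, as a streaming consumer would receive them.
buffer = io.StringIO()
for chunk in stream_jsonl(sample_rows):
    buffer.write(chunk)

# Read the stream back line by line.
buffer.seek(0)
parsed = list(read_jsonl(buffer))
```

Because every row is a complete JSON document, a consumer can start processing rows before the whole file has been generated.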
Disclaimer: LLM data generation is an area of active research with known problems such as biased generation and incorrect information.