Breaking Barriers: Tabular Synthetic Data for All – Accessible, Ubiquitous, and API-Driven

Synthetic data is revolutionizing AI and testing. This talk explores how an API-first, open-source SDK enables developers to create high-fidelity, privacy-safe tabular data—better than anonymization, scalable across environments, and ready for real-world use.

Data access should not be a privilege. Whether you are a student, researcher, or developer, the ability to work with high-quality, structured datasets should be accessible to all.

This talk explores how Python developers can generate high-fidelity synthetic tabular data that mirrors real-world patterns for AI, analytics, and software testing—without privacy risks.

Attendees will learn:

Why real-world data is often inaccessible and how synthetic data solves this
Why high-fidelity synthetic tabular data is better than anonymization
The path from a UI-first product to an API-first, open-source SDK (including lessons learned)
Live demo: Creating privacy-safe synthetic datasets using the MOSTLY AI SDK
How to evaluate synthetic data quality
Practical applications and the road ahead
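
As a taste of the quality-evaluation topic above, here is a minimal sketch of one common idea: compare how often each category appears in a real column and in its synthetic counterpart. It uses only the Python standard library and is not the MOSTLY AI SDK's API or necessarily the method shown in the talk. The function names (`category_shares`, `total_variation_distance`, `univariate_accuracy`) and the toy country-code column are assumptions made up for illustration.

```python
from collections import Counter


def category_shares(values):
    """Return the relative frequency of each distinct value."""
    if not values:
        raise ValueError("cannot compute shares of an empty column")
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def total_variation_distance(real, synthetic):
    """Half the sum of absolute differences between the two share tables."""
    real_shares = category_shares(real)
    synthetic_shares = category_shares(synthetic)
    categories = set(real_shares) | set(synthetic_shares)
    return 0.5 * sum(
        abs(real_shares.get(c, 0.0) - synthetic_shares.get(c, 0.0))
        for c in categories
    )


def univariate_accuracy(real, synthetic):
    """1.0 means identical category shares; 0.0 means no shared categories."""
    return 1.0 - total_variation_distance(real, synthetic)


# Hypothetical toy column, invented for this example only.
real_column = ["AT", "AT", "DE", "DE", "DE", "CH", "CH", "AT"]
synthetic_column = ["AT", "DE", "DE", "DE", "CH", "CH", "AT", "DE"]

accuracy = univariate_accuracy(real_column, synthetic_column)
print(f"univariate accuracy: {accuracy:.3f}")
```

A score like this only checks a single column in isolation. A fuller evaluation would also look at relationships between columns and at privacy, for example whether synthetic rows sit suspiciously close to real ones.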

Speaker

Michael Druk

Software Engineer (surprise?)
An expat living in Austria for over 8 years
I enjoy heavy metal music
Traveling is nice, too
Languages, as well (not just programming, I promise)
Sailing is a more recent hobby of mine