Synthetic data is changing how teams build AI and test software. This talk explores how an API-first, open-source SDK enables developers to create high-fidelity, privacy-safe tabular data: a better alternative to anonymization, scalable across environments, and ready for real-world use.
Data access should not be a privilege. Whether you are a student, researcher, or developer, the ability to work with high-quality, structured datasets should be accessible to all.
It shows how Python developers can generate high-fidelity synthetic tabular data that mirrors real-world patterns for AI, analytics, and software testing, without the privacy risks of working with real records.
Attendees will learn:
- Why real-world data is often inaccessible and how synthetic data solves this
- Why high-fidelity synthetic tabular data is better than anonymization
- The path from UI-first to an API-first, open-source SDK (incl. lessons learned)
- Live demo: Creating privacy-safe synthetic datasets using the MOSTLY AI SDK
- How to evaluate synthetic data quality
- Practical applications and the road ahead
Speaker
Michael Druk
- Software Engineer (surprise?)
- An expat living in Austria for over 8 years
- I enjoy heavy metal music
- Traveling is nice, too
- Languages, as well (not just programming, I promise)
- Sailing is a more recent hobby of mine