This presentation delves into creating a real-time analytics platform by leveraging cost-effective Change Data Capture (CDC) tools like Debezium for seamless data ingestion from sources such as Oracle into Kafka. We’ll explore how to build a resilient data lake and data mesh architecture using Apache Flink, ensuring data loss prevention, point-in-time recovery, and robust schema evolution to support agile data integration. Participants will learn best practices for establishing a scalable, real-time data pipeline that balances performance, reliability, and flexibility, enabling efficient analytics and decision-making.
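To make the ingestion step concrete, here is a minimal sketch (not material from the talk itself) of how a Debezium Oracle source connector might be registered with Kafka Connect over its REST API. Hostnames, credentials, topic names, and table names are placeholders, and the property names assume a Debezium 2.x connector.

```python
# Illustrative sketch only: register a Debezium Oracle source connector
# with Kafka Connect via its REST API. All values below are placeholders.
import json
import requests

connector_config = {
    "name": "orders-oracle-cdc",
    "config": {
        "connector.class": "io.debezium.connector.oracle.OracleConnector",
        "database.hostname": "oracle.internal",        # placeholder host
        "database.port": "1521",
        "database.user": "cdc_user",
        "database.password": "********",
        "database.dbname": "ORCLCDB",
        "topic.prefix": "erp",                          # Kafka topic namespace
        "table.include.list": "INVENTORY.ORDERS",       # tables to capture
        "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
        "schema.history.internal.kafka.topic": "schema-changes.erp",
    },
}

# Kafka Connect exposes connector management on port 8083 by default.
resp = requests.post(
    "http://connect:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector_config),
)
resp.raise_for_status()
```

Once the connector is running, each committed change in the captured Oracle tables lands on a Kafka topic, where a Flink job can consume it for the downstream data lake or data mesh layers.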
I want to show you the principles behind the most popular DBMS, walking through non-trivial use cases like self-contained dynamic reports, WebAssembly support, and pure serverless DB hosting. While SQLite works well with aggregated datasets, DuckDB, its younger cousin, focuses on full OLAP support, letting you process gigabytes of data in no time on low-end boxes (or even laptops). We will browse various useful features and interfaces in DuckDB, emphasizing scenarios that anyone can implement in their daily work.
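As a flavour of what that looks like in practice, here is a minimal, hypothetical DuckDB query run from Python; the file name and column names are invented for illustration.

```python
# Minimal sketch of the kind of laptop-scale OLAP query DuckDB enables.
# The Parquet file and column names are illustrative only.
import duckdb

con = duckdb.connect()  # in-memory database, no server process required

# DuckDB scans Parquet (or CSV, JSON) files directly, pushing down filters
# and projections so multi-gigabyte files stay fast on modest hardware.
result = con.execute(
    """
    SELECT country, count(*) AS orders, sum(amount) AS revenue
    FROM read_parquet('orders-*.parquet')
    GROUP BY country
    ORDER BY revenue DESC
    LIMIT 10
    """
).fetchdf()

print(result)
```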
In this workshop, we’ll introduce the key components of a multitier architecture designed to scale and streamline LLM productization at Team Internet—a global leader in online presence and advertising, serving millions of customers worldwide. For us, scalability and speed are critical to delivering high-performance services, including LLM applications.
Through hands-on coding exercises and real-world use cases from the domain name industry, we'll demonstrate how standardization enhances flexibility and accelerates development.
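As a purely hypothetical sketch of what such standardization can look like in code (not Team Internet's actual implementation), the example below hides LLM providers behind one small interface so application code stays provider-agnostic; every name here is invented.

```python
# Hypothetical illustration of a standardized LLM provider contract.
# Not Team Internet's codebase; all names are made up.
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Completion:
    text: str
    model: str
    input_tokens: int
    output_tokens: int


class LLMClient(Protocol):
    """Common contract every provider adapter implements."""

    def complete(self, prompt: str, *, max_tokens: int = 256) -> Completion: ...


class EchoClient:
    """Toy adapter for local tests; real adapters would call a provider API."""

    def complete(self, prompt: str, *, max_tokens: int = 256) -> Completion:
        text = prompt[:max_tokens]
        return Completion(
            text=text, model="echo",
            input_tokens=len(prompt), output_tokens=len(text),
        )


def classify_domain(client: LLMClient, domain: str) -> str:
    """Application code sees only the LLMClient contract, never a vendor SDK."""
    return client.complete(f"Classify the domain name: {domain}").text


print(classify_domain(EchoClient(), "example.com"))
```

Swapping providers then becomes a matter of wiring in a different adapter, which is the kind of flexibility the workshop exercises explore.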
By the end of the session, you’ll know the key building blocks that will help you efficiently build and scale LLM applications in production.
Basic familiarity with LLMs, API integrations, and software engineering principles is recommended.