A Collection of Top AI, ML, and Data Engineering News, Articles, Presentations November 2023 [InfoQ]
AI, ML, and Data Engineering Round-Up Sponsored by
[InfluxData]
[Latest Content](#latest-content), [Top Viewed Content](#top-viewed-content), [Top News](#news), [Top Articles and Presentations](#top-articles-and-presentations)
In this special newsletter we bring you up to date on all the new content and news related to AI, ML, and Data Engineering on InfoQ. We are also maintaining a portal page for this content on InfoQ at: [(.
Latest Content on InfoQ
[Goldsky's Streaming-First Architecture for Blockchain Data with Flink, Redpanda and Kubernetes]( (news, Oct 30, 2023)
[Performance and Scale - Domain-Oriented Objects vs Tabular Data Structures]( (presentations, Oct 30, 2023)
[Amazon MSK Replicator: Active-Passive and Active-Active Clusters for Apache Kafka Service]( (news, Oct 29, 2023)
[PyTorch 2.1 Release Supports Automatic Dynamic Shape Support and Distributed Training Enhancements]( (news, Oct 25, 2023)
[Nvidia Introduces Eureka, an AI Agent Powered by GPT-4 That Can Train Robots]( (news, Oct 24, 2023)
[An Introduction to Stream Processing](
For big data analytics, stream processing has emerged as a crucial paradigm, reshaping how businesses interact with data. But what is it and why are more businesses using it? This article provides an overview of how stream-processing systems are structured and explores the most popular tools. [Read now](.
Sponsored content [An Introduction to Stream Processing - Sponsored by InfluxData](
Top Viewed Content on InfoQ
[Change Data Capture for Microservices]( (presentations, Oct 04, 2023)
[Hugging Face's Guide to Optimizing LLMs in Production]( (news, Sep 25, 2023)
[Google Open-Sources AI Fine-Tuning Method Distilling Step-by-Step]( (news, Oct 24, 2023)
[PlanetScale's Challenge to Oracle: Forking MySQL and Introducing Vector Search]( (news, Oct 15, 2023)
[OpenAI Announces ChatGPT Voice and Image Features]( (news, Oct 03, 2023)
Top News
[Stability AI Releases Generative Audio Model Stable Audio](
Harmonai, the audio research lab of Stability AI, has released Stable Audio, a diffusion model for text-controlled audio generation. Stable Audio is trained on 19,500 hours of audio data and can generate 44.1kHz quality audio in realtime using a single NVIDIA A100 GPU.
[Multi-Modal LLM NExT-GPT Handles Text, Images, Videos, and Audio](
The NExT Research Center at the National University of Singapore (NUS) recently open-sourced NExT-GPT, an "any-to-any" multi-modal large language model (LLM) that can handle text, images, videos, and audio as input or output. NExT-GPT is based on existing pre-trained models and only required updating 1% of its total parameters during training.
[Abu Dhabi Releases Largest Openly-Available Language Model Falcon 180B](
The Abu Dhabi government's Technology Innovation Institute (TII) released Falcon 180B, currently the largest openly-available large language model (LLM). Falcon 180B contains 180 billion parameters and outperforms GPT-3.5 on the MMLU benchmark.
[Compactor: A Hidden Engine of Database Performance](
To meet the demand for high volumes of data, database designs have shifted to prioritize minimal work during ingestion and querying, with other tasks being performed in the background as post-ingestion and pre-query. This article will describe those tasks and how to run them in a completely different server to avoid sharing resources (CPU and memory) with servers that handle data loading and reading. [Read more](.
Sponsored content [Compactor: A Hidden Engine of Database Performance - Sponsored by InfluxData](
[Google DeepMind Announces LLM-Based Robot Controller RT-2](
Google DeepMind recently announced Robotics Transformer 2 (RT-2), a vision-language-action (VLA) AI model for controlling robots. RT-2 uses a fine-tuned LLM to output motion control commands. It can perform tasks not explicitly included in its training data and improves on baseline models by up to 3x on emergent skill evaluations.
[AWS Announces the Preview of Amazon CodeWhisperer Customization Capability](
Amazon Web Services has announced the preview of Amazon CodeWhisperer Customization Capability. This new functionality empowers users to fine-tune CodeWhisperer, enabling it to provide more precise suggestions by incorporating an organization's proprietary APIs, internal libraries, classes, methods, and industry best practices.
Top Articles and Presentations
[Simplifying Persistence Integration with Jakarta EE Data](
Jakarta Data simplifies data integration in Java apps, supports polyglot persistence, and unifies Jakarta EE technologies. Jakarta Data is open-source and versatile.
[article](
[Fabricator: End-to-End Declarative Feature Engineering Platform](
Kunal Shah discusses how their ML platform designed Fabricator by integrating various open source and enterprise solutions to deliver a declarative end-to-end feature engineering framework.
[Kunal Shah](
[Back to Basics: Scalable, Portable ML in Pure SQL](
Evan Miller walks through the architecture of Eppo's portable, performant, privacy-preserving, multi-warehouse regression engine, and discusses the challenges with implementation.
[Evan Miller](
[Introduction to Apache Arrow](
Big data requires cheaper memory, leading to advancements in query performance, analytics, and data storage. In this article, you will learn what Arrow is, its advantages and how some companies and projects use Arrow. [Read now](.
Sponsored content [Introduction to Apache Arrow - Sponsored by InfluxData](
[Building High-Fidelity Data Streams](
Sid Anand discusses how they built a lossless streaming data system that guarantees sub-second (p95) event delivery at scale with better than three nines availability.
[Sid Anand](
[Ray: the Next Generation Compute Runtime for ML Applications](
Zhe Zang introduces the basic API and architectural concepts of Ray and dives deeper into some of its innovative ML use cases.
[Zhe Zang](
You have received this message because you are subscribed to the "Special Reports Newsletter".
C4Media Inc. (InfoQ.com),
2275 Lake Shore Boulevard West,
Suite #325,
Toronto, Ontario, Canada,
M8V 3Y3