A Collection of Top Data Engineering, AI, and ML News, Articles, Presentations May 2024 [InfoQ]
Data Engineering, AI, and ML Round-Up Sponsored by [RavenDB]
[Latest Content](#latest-content), [Top Viewed Content](#top-viewed-content), [Top News](#news), [Top Articles](#top-articles), [Top Presentations](#top-presentations-and-interviews)
In this special newsletter we bring you up to date on all the new content and news related to Data Engineering, AI, and ML on InfoQ. We are also maintaining a portal page for this content on InfoQ at: [(.
Latest Content on InfoQ
[NVIDIA Announces Next-Generation AI Superchip Blackwell]( (news, Apr 09, 2024)
[Microsoft Announces Garnet: a New Open-Source Cache-Store and Redis Alternative]( (news, Apr 06, 2024)
[Transactional Serverless Computing: PostgreSQL Creator Announces DBOS Cloud]( (news, Mar 30, 2024)
[Databricks Announces DBRX, an Open Source General Purpose LLM]( (news, Mar 30, 2024)
[CNCF Incubates Strimzi to Simplify Kafka on Kubernetes]( (news, Mar 24, 2024)
[The Definitive Guide to Designing a Responsive Data Architecture](
This guide distills some of the essentials you need to design a responsive data architecture. Learn how to identify the strengths & weaknesses of different database technologies, how to design & optimize queries for speed and efficiency, the crucial role indexing plays, and more. [Download now](. Sponsored content [The Definitive Guide to Designing a Responsive Data Architecture - Sponsored by RavenDB](
Top Viewed Content on InfoQ
[The Hidden Cost of Using Managed Databases]( (articles, Mar 06, 2024)
[Mistral AI's Open-Source Mixtral 8x7B Outperforms GPT-3.5]( (news, Jan 23, 2024)
[Google Announces 200M Parameter AI Forecasting Model TimesFM]( (news, Feb 27, 2024)
[OpenAI Releases New Embedding Models and Improved GPT-4 Turbo]( (news, Feb 06, 2024)
[Google Introduces Firestore Multiple Databases]( (news, Feb 24, 2024)
Top News
[Data Solutions Framework: an Open Source Project for Building Data Solutions on AWS](
AWS recently released the Data Solutions Framework (DSF), an opinionated open-source framework designed to accelerate the creation of data solutions on AWS. Built using the AWS CDK, the framework exposes abstractions and patterns as building blocks for constructing data solutions and is available in TypeScript (npm) and Python (PyPI).
[Google BigQuery Introduces Vector Search](
Google recently announced that BigQuery now supports vector search. The new functionality enables vector similarity search required by data and AI use cases such as semantic search, similarity detection, and retrieval-augmented generation (RAG) with a large language model (LLM).
[Microsoft Copilot: Copilot Pro, Copilot for Microsoft 365, Copilot GPT and More](
Microsoft has released Copilot Pro and Copilot for Microsoft 365, and is providing free access to those tools for smaller organizations and educational faculty. The company has also released a Copilot mobile application, and Copilot is available in the Microsoft 365 mobile application.
[Relational Restrictions: How to Select the Right Database Technology](
The Internet's evolution has challenged relational databases in distributed systems. The surge in data volume and velocity prompted the creation of new database engines. This white paper aims to untangle the options and offer guidance on selecting the right database technology for your requirements. [Download now](. Sponsored content [Relational Restrictions: How to Select the Right Database Technology - Sponsored by RavenDB](
[Pinecone Introduces its Serverless Vector Database](
Pinecone recently announced the public preview of its new serverless vector database, designed to reduce infrastructure management costs while improving the accuracy of generative AI applications.
[OpenAI Launches AI Text-to-Video Generator Sora](
Sora is OpenAI's new generative AI model for creating videos from textual prompts. Currently in preview, the model can create photorealistic videos up to 60 seconds long, drawing on its understanding of how things exist in the real world and combining multiple shots without character or style disruption.
Top Articles
[Generative AI: Shaping a New Future for Fraud Prevention](
The article examines how generative AI impacts fraud detection by reducing false positives and adapting to evolving fraud patterns, offering a potent solution when combined with machine learning.
[article](
[Relational Data at the Edge: How Cloudflare Operates Distributed PostgreSQL Clusters](
Discover how Cloudflare leverages distributed PostgreSQL clusters at the edge, tackling challenges like replication lag. The cross-region architecture ensures resilience and quick failovers.
[article](
[Generative AI and Organizational Resilience](
Organizations should empower staff to determine where generative AI makes sense, while building literacy on capabilities and limits. A human-centric, iterative approach will produce the best outcomes.
[article](
[Adding a Natural Language Interface to Your Application](
In this article, author Ashley Davis discusses how to add a natural language interface to a chatbot application and how to extend the chatbot by adding voice commands.
[article](
[Unpacking How Ad Ranking Works at Pinterest](
Aayush Mudgal of Pinterest presented a session at QCon San Francisco 2023 on how ad ranking works at Pinterest, showing how the company uses deep learning for targeting advertisements.
[article](
[Beyond Performance Metrics: A Technical Comparison of RavenDB and MongoDB](
One question that we often hear from prospective clients is 'How does RavenDB stack up against MongoDB?' This white paper aims to shed light on the distinct functionalities of the two NoSQL data platforms, highlighting the key features, capabilities and differentiators between the two. [Download now](. Sponsored content [Beyond Performance Metrics: A Technical Comparison of RavenDB and MongoDB - Sponsored by RavenDB](
Top Presentations
[Redesigning OLTP for a New Order of Magnitude](
Joran Greef discusses TigerBeetle, a new database, and why OLTP has a growing impedance mismatch, why the OLTP workload is becoming more contentious, and the problems posed by row locks, storage faults, and write stalls.
[Joran Greef](
[Graph Learning at the Scale of Modern Data Warehouses](
Subramanya Dulloor outlines an approach to addressing the challenges of warehouses and shows how to build an efficient and scalable end-to-end system for graph learning in data warehouses.
[Subramanya Dulloor](
[Simplifying Real-Time ML Pipelines with Quix Streams](
Tomáš Neubauer discusses Quix Streams, an open-source Python library that helps data scientists and ML engineers to build real-time ML pipelines.
[Tomáš Neubauer](
[In-Process Analytical Data Management with DuckDB](
Hannes Mühleisen discusses DuckDB, an analytical data management system that is built for an in-process use case. DuckDB speaks SQL, is integrated as a library, and uses query processing techniques.
[Hannes Mühleisen](
[Amazon DynamoDB Distributed Transactions at Scale](
Akshat Vig explains how transactions were added to Amazon DynamoDB using a timestamp-based ordering protocol to achieve low latency for both transactional and non-transactional operations.
[Akshat Vig](
You have received this message because you are subscribed to the "Special Reports Newsletter".
C4Media Inc. (InfoQ.com),
2275 Lake Shore Boulevard West,
Suite #325,
Toronto, Ontario, Canada,
M8V 3Y3