A Collection of Top Software Data Engineering News, Articles, Presentations July 2023 [InfoQ]
Data Engineering Round-Up Sponsored by [Cockroach Labs]

[Latest Content](#latest-content), [Top Viewed Content](#top-viewed-content), [Top News](#news), [Top Articles](#top-articles), [Top Presentations](#top-presentations-and-interviews)

In this special newsletter we bring you up to date on all the new content and news related to Data Engineering on InfoQ. We are also maintaining a portal page for this content on InfoQ at: [(.

Latest Content on InfoQ

[OpenAI Announces Function Calling, Allowing Developers to Describe Functions]( (news, Jun 15, 2023)
[Meta's Open-Source Massively Multilingual Speech AI Handles over 1,100 Languages]( (news, Jun 13, 2023)
[Microsoft Build 2023: Bing AI and Copilot Plugins for ChatGPT OpenAI, Microsoft Fabric and More]( (news, Jun 02, 2023)
[New Azure Cosmos DB Features to Boost Performance and Optimize Cost]( (news, Jun 26, 2023)
[Google Announces State-of-the-Art PaLM 2 Language Model Powering Bard]( (news, Jun 06, 2023)

[Architecting Distributed Transactional Applications (By O'Reilly)](
Learn how to build efficient, elastically scaling, multi-region applications using this blueprint that walks through options, approaches, and best practices for both the application and persistence layers. [Download now](. Sponsored content [Architecting Distributed Transactional Applications](

Top Viewed Content on InfoQ

[Bloomberg Unveils a GPT Finance-Focused AI Model]( (news, Apr 10, 2023)
[Meta Open-Sources Computer Vision Foundation Model DINOv2]( (news, May 23, 2023)
[JunoDB: PayPal Open Sources Key-Value Store Powering 350 Billion Daily Requests]( (news, Jun 10, 2023)
[HuggingGPT: Leveraging LLMs to Solve Complex AI Tasks with Hugging Face Models]( (news, Apr 13, 2023)
[OpenAI Announces GPT-4, Their Next Generation GPT Model]( (news, Apr 04, 2023)

Top News

[Microsoft Open-Sources Multimodal Chatbot Visual ChatGPT](
Microsoft Research recently open-sourced Visual ChatGPT, a chatbot system that can generate and manipulate images in response to human textual prompts. The system combines OpenAI's ChatGPT with 22 different visual foundation models (VFM) to support multi-modal interactions.

[Open Source MongoDB Alternative FerretDB Now Generally Available](
FerretDB, an open-source MongoDB alternative database, recently announced its general availability. Released under the Apache 2.0 license, the project allows developers to use existing PostgreSQL infrastructure to run MongoDB workloads.

[Hugging Face Releases StarCoder, the Next-Generation LLM for Seamless Code Generation](
Hugging Face and ServiceNow have partnered to develop StarCoder, a new open-source language model for code. The model, created as part of the BigCode initiative, is an improved version of the StarCoderBase model trained on 35 billion Python tokens. StarCoder is a free AI code-generation alternative to GitHub's Copilot, DeepMind's AlphaCode, and Amazon's CodeWhisperer.

[Architecting for Scale (By O'Reilly)](
Learn techniques for building systems that can handle huge quantities of traffic, data, and demand without affecting the quality your customers expect. Architects, managers, and directors in engineering and operations organizations will learn how to build applications at scale that run more smoothly. [Download now](. Sponsored content [Architecting for Scale](

[Meta Switches to MySQL Raft to Improve Reliability and Operational Simplicity](
Meta is rolling out MySQL Raft in its data centers to replace its current MySQL semisynchronous databases. The new consensus engine simplifies operations and allows MySQL servers to take responsibility for promotions and membership.

[Running Large Language Models Natively on Mobile and Laptops](
MLC LLM is a new open-source project that aims to enable deploying large language models on a variety of hardware platforms and applications. It also includes a framework to optimize model performance for each specific use case.

Top Articles

[AIOps: Site Reliability Engineering at Scale](
AIOps can simplify and streamline processes, which can reduce the mental burden on employees while improving communication and collaboration between departments.
[article](

[In-Process Analytical Data Management with DuckDB](
DuckDB is an open-source OLAP database for analytical data management with in-process embedding, vectorized processing, and multi-core parallelization.
[article](

[Comparative Analysis of Major Distributed File System Architectures: GFS vs. Tectonic vs. JuiceFS](
Distributed file systems have emerged as dynamic and scalable storage solutions. We explore the architectures of three distributed file systems: Google File System (GFS), Tectonic, and JuiceFS.
[article](

[Reducing Risk by Building for Cloud Portability](
This webinar explores the importance of developing a cloud portability strategy in response to emerging legislation such as PRA and DORA. Key topics include selecting portable clouds, necessary database capabilities, cost considerations, and more. It's anticipated that similar risk-reducing legislation will extend to North America in the future. [Live Webinar (July 11th @ 10 AM EST) - Register Now](. Sponsored content [Cloud Concentration Risk](

Top Presentations

[Operationalizing Responsible AI in Practice](
Mehrnoosh Sameki discusses approaches to responsible AI and demonstrates how open source and cloud integrated ML help data scientists and developers to understand and improve ML models better.
[Mehrnoosh Sameki](

[Orchestrating Hybrid Workflows with Apache Airflow](
Ricardo Sueiras discusses how to leverage Apache Airflow to orchestrate a workflow using data sources inside and outside the cloud.
[Ricardo Sueiras](

[Amazon DynamoDB: Evolution of a Hyperscale Cloud Database Service](
Akshat Vig presents Amazon's experience operating DynamoDB at scale and how the architecture continues to evolve to meet the ever-increasing demands of customer workloads.
[Akshat Vig](

[Unraveling Techno-Solutionism: How I Fell out of Love with "Ethical" Machine Learning](
Katharine Jarmul confronts techno-solutionism, exploring ethical machine learning, which eventually led her to specialize in data privacy.
[Katharine Jarmul](

C4Media Inc. (InfoQ.com),
2275 Lake Shore Boulevard West,
Suite #325,
Toronto, Ontario, Canada,
M8V 3Y3