January 2019
InfoQ
Data Engineering Special Report
Sponsored by Aerospike
[Latest Content](#latest-content), [Top Viewed Content](#top-viewed-content), [News](#news), [Top Articles](#top-articles), [Top Presentations](#top-presentations-and-interviews)
In this special newsletter we bring you up to date on all the new content and news related to Data Engineering on InfoQ. We are also maintaining a portal page for this content on InfoQ.
Latest Content on InfoQ
What Machine Learning Can Learn from DevOps (article, Dec 15, 2018)
Microsoft Announces AI-Assisted IntelliCode for TypeScript and JavaScript in VS Code (news, Dec 10, 2018)
TensorSpace.js Delivers Neural Network 3D Visualization Framework (news, Dec 06, 2018)
Amazon Introduces Intelligent-Tiering for S3 Storage to Automatically Optimize Costs (news, Dec 05, 2018)
Azure Machine Learning Services Now Generally Available (news, Dec 05, 2018)
A NoSQL Database Architecture for Real-Time Applications
Learn about a new kind of NoSQL database architecture that's simple, cost-effective and that delivers speed at scale for real-time applications. This new architecture delivers predictable performance while using up to ten times fewer servers than most other databases.
Sponsored content
Top Viewed Content on InfoQ
Apache Kafka: Ten Best Practices to Optimize Your Deployment (article, Oct 19, 2018)
Back to the Future with Relational NoSQL (article, Dec 04, 2018)
The Evolution of Uber's 100+ Petabyte Big Data Platform (news, Nov 10, 2018)
Scaling Apache Kafka at Pinterest (news, Dec 09, 2018)
Amazon Announces Managed Streaming for Kafka in Public Preview (news, Dec 06, 2018)
Top News
Face-api.js: JavaScript Face Recognition Leveraging TensorFlow.js
Face-api.js is a JavaScript API for face detection and face recognition in the browser, implemented on top of the TensorFlow.js core API. It implements a series of convolutional neural networks (CNNs) optimized for the web and for mobile devices.
Google Open-Sources BERT: A Natural Language Processing Training Technique
In a recent blog post, Google announced that it has open-sourced BERT, its state-of-the-art training technique for Natural Language Processing (NLP). Google did this in part because of a lack of public data sets available to developers. In addition, Cloud TPUs have been optimized to reduce the time required to train NLP models.
Netflix Keystone Real-Time Stream Processing Platform
Netflix recently published a post on its tech blog discussing the design considerations and insights behind Keystone, its real-time stream processing platform. Keystone has been operational since December 2015 and has grown significantly as Netflix's subscriber base has grown from 65 million to over 130 million in the past three years. This article covers the latest state of the Keystone platform.
When, Where & Why to Use NoSQL?
Download this white paper and learn the biggest challenges of managing big data, database requirements for dealing with big data, and how NoSQL databases address these challenges.
Sponsored content
Redis 5.0 Released with New Streams Data Type
Redis recently announced version 5 of its popular database, 15 months after the release of Redis 4. Probably the most important feature of this version is support for a new data type, Streams. Sorted set functionality has also improved, and Redis modules have been expanded with the introduction of the Cluster and Timers APIs. LOLWUT and other improvements are reviewed in the article.
Concept and Object Modeling Notation for Data Modeling NoSQL Databases
Ted Hills hosted a workshop at the recent Data Architecture Summit 2018 Conference about data modeling for relational and NoSQL databases. He said that the NoSQL movement helped the database community realize two things. First, not every application needs ACID properties. Second, the tabular data organization is still a good choice for much data, although not for all datasets.
Top Articles
Spark Application Performance Monitoring Using Uber JVM Profiler, InfluxDB and Grafana
In this article, author Amit Baghel discusses how to monitor the performance of Apache Spark-based applications using Uber JVM Profiler, InfluxDB, and the Grafana data visualization tool.
Natural Language Processing with Java - Second Edition: Book Review and Interview
The book Natural Language Processing with Java - Second Edition covers NLP topics and various tools developers can use in their applications. InfoQ spoke with co-author Richard Reese about the book.
Sentiment Analysis: What's with the Tone?
In this article, the authors discuss NLP-based sentiment analysis with machine learning (ML) and lexicon-based approaches, using KNIME data analysis tools.
Analytics Zoo: Unified Analytics + AI Platform for Distributed Tensorflow, and BigDL on Apache Spark
We describe how Analytics Zoo can help real-world users to build end-to-end deep learning pipelines for big data, including unified pipelines for distributed TensorFlow and Keras on Apache Spark.
The Architecture of a Real-Time Operational DBMS
Years ago, Aerospike's engineering team set out to build a distributed database system that handles real-time workloads smoothly and provides a high level of fault tolerance. Learn how they built a high-performance, distributed database to handle the needs of today's interactive online services.
Sponsored content
Top Presentations
The New Kid on the Block: Spring Data JDBC
Jens Schauder describes the current state of Spring Data JDBC, its features and some of the underlying design decisions, especially its DDD-based API.
Big Data and Deep Learning: A Tale of Two Systems
Zhenxiao Luo explains how Uber tackles data caching in large-scale DL, detailing Uber's ML architecture and discussing how Uber uses Big Data, concluding by sharing AI use cases.
Reactive Relational Database Connectivity
Ben Hale discusses Reactive Relational Database Connectivity (R2DBC), explaining how the API works, the benefits of using it, and how it contrasts with ADBC, which has been proposed as a successor to JDBC.
Implementing AutoML Techniques at Salesforce Scale
Matthew Tovbin shows how to build ML models using AutoML (Salesforce), including techniques for automatic data processing, feature generation, model selection, hyperparameter tuning and evaluation.
C4Media Inc. (InfoQ.com),
2275 Lake Shore Boulevard West,
Suite #325,
Toronto, Ontario, Canada,
M8V 3Y3