ChatGPT Data Science Prompts & Generative Agents
[ChatGPT: Automated Prompt Scoring, Dolly 2.0 & Meta AI’s Segment Anything Model (SAM)]( ChatGPT Data Science Prompts & Generative Agents Apr 14
👋 Hey,

"Success in creating artificial general intelligence could be the biggest event in the history of our civilization. Or the worst. We just don't know." -[Max Tegmark, Physicist, Cosmologist and Machine Learning Researcher.](

The race towards creating artificial general intelligence could be the most significant event ever, or the worst, a risky bet with unpredictable consequences that make the world truly anxious. This week on [DataPro#39]( we delve into the exciting possibilities of fine-tuning Large Language Models to achieve significant breakthroughs in the evolutionary timeline. We also feature [valuable insights from AI experts on data privacy]( as well as connecting LLMs with [HuggingGPT]( [Dolly 2.0 from Databricks]( and building [Streamlit apps](.

This Week’s Key Highlights:

- [ChatGPT-Data-Science-Prompts](
- [Generative Agents: Stanford's Groundbreaking AI Study](
- [ChatGPT: Automated Prompt Scoring](
- [Meta AI introduces Segment Anything Model (SAM)](

If you’re interested in sharing ideas to foster the growth of the data community, then this survey is for you. Consider sharing your thoughts and get a FREE bestselling Packt book, The Applied Artificial Intelligence Workshop as PDF. Jump on in!

[TELL US WHAT YOU THINK](

Cheers,
Merlyn Shelley
Editor in Chief, Packt

Recent Forks on GitHub & Hugging Face Models

- [trl-lib]( StackLLaMA is a 7 billion parameter language model based on [Meta’s LLaMA model]( that has been trained on [pairs of questions and answers]( from [Stack Exchange]( using Reinforcement Learning from Human Feedback (RLHF) with the [TRL library](.
- [databricks/dolly-v1-6b:]( Dolly 2.0 is a 12B parameter language model based on the [EleutherAI]( [pythia]( model family and fine-tuned [exclusively on a new]( high-quality, human-generated instruction dataset.
- [microsoft]( LMOps is a research initiative on fundamental research and technology for building AI products with foundation models, especially on the general technology for enabling AI capabilities with LLMs and Generative AI models.
- [opennars]( Open-NARS is the open-source version of [NARS]( a general-purpose AI system, designed in the framework of a reasoning system.
- [microsoft]( JARVIS is a system to connect LLMs with the ML community. Check out the paper: [HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace.](
- [travistangvh]( In this repository, you will find 60 useful prompts that can be used with ChatGPT for data science purposes.

[Pledge your support](

In the Industry

AWS

- [Build Streamlit apps in Amazon SageMaker Studio:]( [Streamlit]( is an open-source Python library that simplifies web app development for machine learning and data science. It's ideal for showcasing findings, deploying trained models, and getting feedback. With the [Amazon SageMaker Studio]( IDE, you can build and host [Streamlit]( apps securely without front-end development. You can find the code here on [GitHub](.
- [Deploy large language models on AWS Inferentia2 using large model inference containers:]( As ML models such as transformers improve, they grow in size and generalize better. [Amazon SageMaker]( [deep learning containers]( (DLCs) are used to deploy large models on GPU instances.
This post explains how to host models on [AWS Inferentia2]( using the [AWS Neuron]( SDK and [DJLServing]( for inference.

Google Cloud

- [Active Directory Diagnosis Tool for Cloud SQL:]( Are you using SQL Server on Google Cloud and facing challenges with Windows authentication, particularly when integrating with on-premises Active Directory domains via Managed Active Directory? Check out this PowerShell script on [GitHub]( that runs on your on-prem AD domain controller or domain-joined Windows VM to perform the necessary checks.
- [Announcing the public preview of BigQuery change data capture (CDC):]( Google announces the public preview of [BigQuery change data capture (CDC)]( alongside the existing [Datastream for BigQuery]( solution, allowing seamless data replication from relational databases to BigQuery. It enables ELT event post-processing, [GDPR compliance]( [data wrangling]( and replication of transactional systems.

Just for Laughs!

Why did the LLM refuse to play cards with the humans? It said, "I'm tired of shuffling through all those rules, I prefer sorting algorithms!"

Streamline your Data Strategy with Experts

[Creators of Intelligence]( – By [Dr. Alex Antic](

Thanks for all your feedback last week! As promised, we're excited to provide you with a sneak peek of insights from Australia's former Human Rights Commissioner, [Edward Santow](.

Alex Antic: We spoke briefly about privacy before. When it comes to privacy and AI, can the two coexist? What are some of the main challenges that are posed regarding privacy when we talk about AI adoption? We've seen some of them already in society. Are there others that we need to be more aware of?

Edward Santow: There are three things for me that immediately leap out. There's privacy at a very individual level: your personal information is being used against you, which we talked about before.
There's privacy at a more societal level: the cumulative effect of a whole bunch of AI, such as facial recognition AI, which I'm particularly interested in at the moment, as it can lead us toward changing the nature of society into one that essentially is governed by a mass-surveillance state, which is worrisome. The third thing is that the way in which we are using information, particularly personal information, is changing. [Find out more here!](

This exclusively curated content is extracted from the upcoming book “[Creators of Intelligence](” by [Dr. Alex Antic.](

[EXPLORE NEW IDEAS & READ ON!](

Find Out What’s New?

- [Generative Agents: Stanford's Groundbreaking AI Study Simulates Authentic Human Behavior:]( Stanford AI researchers introduce [Generative Agents]( computer programs that use generative models to simulate human behavior in a virtual sandbox world. The study observes agents planning parties, running for mayor, and retaining memories with embellishments and different perspectives.
- [An Easy Way to Speed Up your dbt Runs on BigQuery:]( dbt has a "[threads](" setting in the profiles.yml file that determines the maximum number of models it can build concurrently, aiding parallelization across the directed acyclic graph [(DAG)](. This can optimize performance for large projects with many models and speed up results for stakeholders by [running up to 100 concurrent queries]( on BigQuery.
- [Auto-Sklearn: How To Boost Performance and Efficiency Through Automated Machine Learning:]( AutoML, exemplified by [Auto-Sklearn]( automates machine learning tasks such as data preprocessing, model selection, [hyperparameter optimization]( and ensemble building. This blog covers Auto-Sklearn 2.0, which introduces improvements like early stopping, improved model selection strategies, and automated policy selection through meta-learning, making it more efficient and user-friendly.
- [ChatGPT: Automated Prompt Scoring:]( Using Python, you can objectively choose and improve your ChatGPT prompts by generating scores and feedback for each iteration. As domain-specific LLMs like [BloombergGPT]( and [Copilot]( become more common, prompt tweaking becomes essential for optimal performance. With automated prompt testing, you can iterate quickly and make informed decisions, since prompt behavior may change with upgraded models like GPT-4.
- [Pandas and Python Tricks for Data Science and Data Analysis:]( Pandas offers various methods to replace missing values, including imputation and advanced techniques. Other tricks covered include combining the benefits of Pandas and SQL statements, and using the str.extract() function to pull structured information out of unstructured textual data. [Watch the video]( for more details.
- [4 Ways to Do Question Answering in LangChain:]( LangChain is an open-source tool that allows you to chat with your own documents, including text files, PDFs, and websites. It supports various [Document Loaders]( like Notion, YouTube, and Figma, making it easy to do question answering. LangChain simplifies interaction with language models and facilitates building applications. You can learn more from the [video here](. Also, make sure to check out the [code here](.
- [Meta AI Introduces Revolutionary Image Segmentation Model Trained on 1 Billion Masks:]( OpenAI's ChatGPT made revolutionary progress in NLP, and now Meta AI introduces the Segment Anything Model (SAM) for image segmentation, with a dataset of 1 billion masks on 11 million images. SAM is promptable and uses prompt embeddings to generate masks. [Google Colab]( can be used to [experiment with the algorithm]( for image segmentation.

Sponsored Content

Looking to grow your Data team with Expert Talent?

[TeamEpic]( presents a novel approach to working with top-notch Data Science and Prompt Engineering Talent.
They rigorously vet top Indian talent, train them with the help of experienced mentors on real-world projects, and work with organizations looking to expand their tech capabilities (at less than 1/3rd the cost of the average US hire). Plus, they offer a 30-day free trial to work with their talent, giving you a window to assess before committing. Interested? Get in touch with TeamEpic [here](.

As a GDPR-compliant company, we want you to know why you’re getting this email. The _datapro team, as a part of Packt Publishing, believes that you have a legitimate interest in our newsletter and its products. Our research shows that you opted-in for email communication with Packt Publishing in the past and we think your previous interest warrants our appropriate communication. If you do not feel that you should have received this or are no longer interested in _datapro, you can opt out of our emails by clicking the link below.
© 2023 Copyright © 2022 Packt Publishing, All rights reserved.
Our mailing address is: Packt Publishing
Livery Place, 35 Livery Street, Birmingham, West Midlands B3 2PB
United Kingdom