T-SQL Tuesday #162 Invitation – Data Science in the time of ChatGPT

This is a great opportunity, and I am honoured to be hosting this month’s T-SQL Tuesday blogging invitation. At Steve’s invitation, we agreed that the topic would be data science.

I will be collecting all of your answers from blog posts and Twitter (make sure to add #tsql2sday).

Data Science in the time of ChatGPT

Instead of writing and asking data science questions, let’s discuss the aspects of data science in the presence of ChatGPT 4.0.

By now, everyone knows that ChatGPT is a large language model (LLM) based on the GPT (Generative Pre-trained Transformer) architecture. It uses deep learning, that is, neural networks with billions of weights built from transformer blocks, to generate the sequence of tokens that makes up a piece of text. Transformers introduce the concept of “paying attention”, which generally helps build better sequences of text. The model operates primarily on the probabilities of words and their sequences, which is why it is good at producing human-like responses to natural language queries and makes for a conversation-like experience.
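To make the “probabilities of words” part concrete, here is a minimal sketch in Python of how a model’s raw scores can be turned into probabilities and how one next token can be sampled from them. The vocabulary, the scores and the function names are made up for illustration; a real model computes such scores over tens of thousands of tokens, and this sketch does not cover attention at all.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Pick one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical scores a model might assign after the text "SELECT * FROM".
vocab = ["customers", "orders", "banana"]
logits = [2.0, 1.5, -3.0]

probs = softmax(logits)
next_token = sample_next_token(vocab, logits, rng=random.Random(42))
```

A lower temperature makes the most likely token dominate, while a higher one flattens the distribution, which is one reason the same question can produce different answers.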

Many caveats are hidden in the processing of text: the adjustment of weights, the activation functions (different and tweaked versions of ReLU), the additional corpora, and the billions of texts used for model training.

I have prepared two groups of questions. I will not debate whether the end of data science is near, nor whether AGI (artificial general intelligence) will completely replace the role of data scientists. What I want to hear from you is simply how you embraced (if at all) the use of ChatGPT and what your first impressions were. And mostly: how did it help you (if at all), what did you use it for, and have you encountered any traps?

Usage and working alongside ChatGPT

Imagine using SQL, R, Python, Julia, or Scala for your daily data science work. You can ask ChatGPT practically anything and it will return a relatively coherent and good answer. If you need an explanation, it will excel. Where and what have you used it for? Here is a short list that might get you started:

  1. Explain a data science algorithm
  2. Help tune or create SQL code to query big data
  3. Prepare R, Python, or Scala code for exploring the data
  4. Help you prepare the training of the model in the desired language
  5. Prepare the code for hyperparameter tuning and cross-validation
  6. Ask for a data visualization for a given dataset
  7. Help create a dashboard
  8. Create code for model deployment, model re-training or model consumption
  9. Ask for custom functions and algorithm/function adjustments
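As an illustration of item 5, this is roughly the kind of snippet one might ask for and then verify by hand. It is only a sketch in plain Python under my own assumptions: the data is mock data, the model is a tiny k-nearest-neighbour regressor, and all function names are hypothetical. In practice you would more likely use a library such as scikit-learn or caret.

```python
import random

def k_fold_indices(n_rows, n_folds, seed=0):
    """Shuffle row indices and split them into n_folds nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def cross_val_mse(data, k, n_folds=5):
    """Mean squared error of k-NN, averaged over the held-out folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(data), n_folds):
        held = set(held_out)
        # Train only on rows that are not in the held-out fold.
        train = [row for i, row in enumerate(data) if i not in held]
        errors = [
            (knn_predict(train, data[i][0], k) - data[i][1]) ** 2
            for i in held_out
        ]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)

# Mock data: y = 2x plus a little noise.
rng = random.Random(1)
data = [(x, 2 * x + rng.gauss(0, 0.5)) for x in range(40)]

# Grid search: score each candidate k and keep the one with the lowest CV error.
grid = [1, 3, 5, 9]
scores = {k: cross_val_mse(data, k) for k in grid}
best_k = min(scores, key=scores.get)
```

The point of the exercise is less the code itself and more the check afterwards: does every row land in exactly one fold, and is the model never scored on rows it was trained on?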

Now that you have found in the list where and how it helped you, I would like to understand how it helped you. Feel free to make a general comparison and add some explanations. And lastly, of course, add whether this has in any way affected your work as a data scientist (whether by embracing it in a positive way or through a negative experience).

Responsible usage

We have seen many controversies emerge around ChatGPT. Some European Union countries have banned it, and others may soon follow. And the question is not only its use (as the end of humanity and empathy) but also the misuse of personal data, privacy issues, and the leaking of relevant corporate information.

Have you considered the responsible usage of ChatGPT? Here again is a short list to help you:

  1. The use of personal data retrieved from the model
  2. Inserting sensitive (personal or company) data
  3. Asking it to explain a section of R, Python, or Scala code that is the property of your enterprise

Instead, have you tried using it more responsibly:

  1. Using pseudocode to get an explanation of an algorithm
  2. Using mock data rather than real data
  3. Giving pseudocode in order to receive documentation
  4. Leaving out sensitive data (SQL schema, model information, sensitive data)
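For items 2 and 4, a small helper can replace sensitive columns with random stand-ins before anything is pasted into a prompt. The following is only a sketch: the function names, the column names and the sample rows are hypothetical, and it is not a formal anonymization technique, since the untouched columns can still identify someone.

```python
import random
import string

def mock_value(value, rng):
    """Return a random value of the same basic type as the input."""
    if isinstance(value, bool):
        return rng.choice([True, False])
    if isinstance(value, int):
        return rng.randint(0, 10 ** len(str(abs(value))) - 1)
    if isinstance(value, float):
        return round(rng.uniform(0, 1000), 2)
    # Anything else is treated as text: same length, random lowercase letters.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(len(str(value))))

def mock_rows(rows, sensitive_columns, seed=0):
    """Copy rows, replacing the sensitive columns with random stand-ins.

    The same original value always maps to the same stand-in, so repeated
    values stay repeated in the mocked copy. The input rows are not modified.
    """
    rng = random.Random(seed)
    mapping = {}
    result = []
    for row in rows:
        masked = dict(row)
        for col in sensitive_columns:
            key = (col, row[col])
            if key not in mapping:
                mapping[key] = mock_value(row[col], rng)
            masked[col] = mapping[key]
        result.append(masked)
    return result

# Made-up sample rows; only the "customer" column is treated as sensitive.
rows = [
    {"customer": "Ana Novak", "city": "Ljubljana", "amount": 120.5},
    {"customer": "Ana Novak", "city": "Ljubljana", "amount": 80.0},
    {"customer": "John Doe", "city": "Maribor", "amount": 42.0},
]
safe_rows = mock_rows(rows, ["customer"])
```

Two different originals could in rare cases receive the same stand-in, so it is worth checking the mocked copy before relying on it for grouping or joins.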

So which cases have you come across? Did they have any consequences for you? What other responsible uses of ChatGPT have you practised?

My takeaways

ChatGPT offers interesting answers (based on my experience and search), and it is the next step up from a Google search of Stack Overflow. In other words, it gives you a more focused answer. When exploring and searching forums, you might find several different solutions for a single problem, whereas here you have to ask for another solution. In turn, it can give you an answer faster than browsing the web. Both approaches have their advantages and disadvantages, but neither will assure you that the answer is correct!

I embrace this technology as an additional learning source. But I personally do not use it as my daily driver, despite trying it out a couple of times (with mixed results: some answers worked, others were nonworking, useless or meaningless). It can be super helpful for entry-level and junior positions, but the more experienced you are, the more abstract the data science work you do and the more complicated the topics you cover, the less frequently you will presumably use it.

10 comments on “T-SQL Tuesday #162 Invitation – Data Science in the time of ChatGPT”
  1. […] article was first published on R – TomazTsql, and kindly contributed to R-bloggers]. (You can report issue about the content on this page […]

  2. […] I have skipped a lot of T-SQL Tuesdays, because either I did not have the time to write anything, or I felt I had nothing useful to write. That changes with edition #162, hosted by Tomaz Kastrun (b|t). He invites us to talk about data science in the time of ChatGPT. […]

  3. […] month is a timely topic, with Tomaz Kastrun hosting. I was lucky to meet Tomaz before the pandemic, and we had a great time at the SQL Saturday as well […]

  4. Here is my contribution, where I try to assess whether ChatGPT is ready to take over the role as execution plan expert.
    https://sqlserverfast.com/blog/hugo/2023/05/t-sql-tuesday-162-execution-plans-according-to-chatgpt/

  5. […] month’s T-SQL Tuesday is hosted by Tomaz Kastrun – his call is to write about how we’ve used ChatGPT, and what are ethical issues, if any, that we have […]

  6. diligentdba says:

    Sorry got tag wrong and forgot to post here yesterday. Here goes. https://curiousaboutdata.com/2023/05/09/t-sql-tuesday-162-data-science-and-chatgpt/ Thanks for hosting!
