Using Databricks Assistant to Get Started with Databricks


Previously we looked at Getting Started using Databricks Community Edition and discovered that the Assistant features aren’t supported in Community Edition.

Prerequisites

If you have followed the same steps, you should now be able to see the Assistant and Generate with AI options in your notebook.

image

Example 1 – Import a CSV and query it

In this example we will attempt to recreate what we did in the getting started post but only using the assistant for guidance.

Importing and querying data is a common task. Let’s start by asking “I want to import and query as CSV file. What do i need to do”

Basicquery1

Copying this into my notebook and running it gives:

image

There are a few issues here:

  • The code supplied is invalid because of the extra tab indentation. The Assistant identifies this easily and helps me fix the code.
  • We can’t actually use a local file with this code. At first you might try a bunch of different file path syntaxes, check that it’s a valid file on your computer, and so on. At no point did the Assistant tell me that it can’t actually read a file from my local computer.
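To make the first issue concrete, here is a minimal sketch of what extra leading indentation does to a pasted Python snippet, and one way to strip it. The pasted code is a hypothetical stand-in, not the Assistant's actual output, and `compiles` is a helper name made up for this illustration.

```python
import textwrap

# Hypothetical snippet as it might arrive from a chat pane: every line
# carries an extra leading tab, so the first statement is unexpectedly indented.
pasted = "\timport csv\n\trows = []\n\tprint(len(rows))\n"


def compiles(source: str) -> bool:
    """Return True if the source parses as Python, False on a syntax error."""
    try:
        compile(source, "<pasted>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


# textwrap.dedent removes the leading whitespace common to all lines.
cleaned = textwrap.dedent(pasted)

print(compiles(pasted))   # False
print(compiles(cleaned))  # True
```

In practice, selecting the pasted lines in the notebook cell and pressing Shift+Tab does the same job by hand.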

So let’s assume we worked out that the file may need to be on the web. This time we’ll be a bit more specific and give it the URL in the question.

image

Trying this out gives the following:

image

There are a few issues here:

  • The first step seems to work.
  • The second step fails. I tried Diagnose a few times, but the first problem was user error: I hadn’t read the instructions closely enough. The cell needs to be changed to use SQL.

 image

image
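For comparison, here is a rough sketch of the general idea behind the second attempt: code in a notebook runs on the cluster, not on your laptop, so it can fetch a file from a URL but cannot see a path on your local disk. This uses only the Python standard library and is not the code the Assistant produced. The function names and the sample rows are made up for illustration.

```python
import csv
import io
from urllib.request import urlopen


def parse_csv_text(text: str) -> list[dict]:
    """Parse CSV text with a header row into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def fetch_csv_rows(url: str, encoding: str = "utf-8") -> list[dict]:
    """Download a CSV from a URL and parse it, on whatever machine runs this code."""
    with urlopen(url) as response:
        text = response.read().decode(encoding)
    return parse_csv_text(text)


# Hypothetical sample data, used here so the sketch runs without a network call.
sample = "Last name,Given names,Ship\nSmith,John,Cornwall\n"
print(parse_csv_text(sample))
```

Calling `fetch_csv_rows` with the web address of your CSV would return the same kind of row dictionaries. Loading those rows into a table is a separate step that this sketch leaves out.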

At this point this isn’t looking promising. What prompts would you have used as a first attempt?

Example 2 – Finding and querying the immigrant table

This example will assume you already have some data in Databricks.

If you want a quick way to get your favourite CSV, check out our post here.

The goal of this example is to get the Assistant’s help querying data in the following table:

image

First we’ll take its suggestion of finding tables:

image

It doesn’t give us our immigrant table though. If you have an idea of the schema you’re looking for, you can home in on your exact table and get some example SQL:

image

You can paste this in and run it successfully:

image

I did try a few different ways to find the table using natural language. These tended to give me SQL to search the catalog rather than suggestions for the table I might want to query. Suggesting likely tables would be useful when you have lots of data and don’t know exactly where everything is.
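The catalog-search SQL it suggested was along these lines. This is a sketch from memory rather than the Assistant's exact output, and it assumes a workspace with Unity Catalog, where `system.information_schema.tables` is available. The helper name `table_search_sql` is made up for this example.

```python
import re


def table_search_sql(name_fragment: str) -> str:
    """Build a query listing tables whose name contains the given fragment."""
    # Only allow simple fragments so the value is safe to embed in the SQL text.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name_fragment):
        raise ValueError("use letters, digits and underscores only")
    return (
        "SELECT table_catalog, table_schema, table_name "
        "FROM system.information_schema.tables "
        f"WHERE lower(table_name) LIKE '%{name_fragment.lower()}%'"
    )


query = table_search_sql("immigrant")
print(query)
# In a Databricks notebook, spark.sql(query) would run the generated statement.
```

The result lists candidate catalog, schema and table names, which you can then give back to the Assistant as the exact table to query.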

Example 3 – Querying the immigrant table

For this set of assistant queries we’ll assume you know the exact table and the data you want to get out of it.

For these, I found the best approach was to ground the Assistant in the specific table by naming it in full, e.g. “How do i find how many immigrants were on the Cornwall in ‘db_workspace.opendata.immigrants’”

image

I did notice, however, that it seems to take a guess rather than use the real table when suggesting a query. My example was to find what were the biggest family groups on each ship every year.

image

The suggested query makes up a bunch of fields that don’t exist and won’t give you the result you want.
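One quick way to catch invented fields before running a suggested query is to compare the column names it uses against the table's real columns. The sketch below is only an illustration: the helper `unknown_columns` and the suggested names are made up, and the `Ship` and `Year` columns are assumptions about the table.

```python
def unknown_columns(suggested: list[str], actual: list[str]) -> list[str]:
    """Return suggested column names that are not in the table (case-insensitive)."""
    known = {name.lower() for name in actual}
    return [name for name in suggested if name.lower() not in known]


# Hypothetical values. In a notebook, `actual` could instead come from
# spark.table("db_workspace.opendata.immigrants").columns
actual = ["Last name", "Given names", "Ship", "Year"]
suggested = ["ship_name", "Year", "family_id", "Last name"]

print(unknown_columns(suggested, actual))  # ['ship_name', 'family_id']
```

Anything the check returns is a field to correct, or to ask the Assistant about again with the real column list included in the prompt.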

Example 4 – Fixing syntax errors

When people first start with SQL, it’s not obvious how column name selection works. In this case we have field names with spaces, e.g. Last name, Given names.

Here is a common mistake people make:

image

Asking it to diagnose in this instance, it assumes you want to display Given Names as a string and gives an incorrect suggestion:

image

As with other assistants, I find the best way to “reset” its errant thinking is to clear the history and ask for the fix again. This time it gives us the correct answer. A note for all of those who have been in SQL Server land for a long time and would have expected the answer to be [Given names] instead: the answer below is in fact the correct one.

image
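For anyone reading without the screenshot: Databricks SQL delimits identifiers that contain spaces with backticks, not square brackets. Assuming that is the form shown above, a small sketch of building such a query looks like this. The helper name `quote_identifier` is made up for the example.

```python
def quote_identifier(name: str) -> str:
    """Wrap a column name in backticks, doubling any backtick inside it."""
    return "`" + name.replace("`", "``") + "`"


columns = ["Last name", "Given names"]
table = "db_workspace.opendata.immigrants"

query = f"SELECT {', '.join(quote_identifier(c) for c in columns)} FROM {table}"
print(query)
# SELECT `Last name`, `Given names` FROM db_workspace.opendata.immigrants
```

Single quotes would turn the name into a string literal, which is the mistake the first diagnosis leaned towards.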

Conclusion

For a complete novice, I feel the Assistant alone would not be enough. You have to know the right questions to ask and know enough about the platform to tell when it’s completely off track.

For someone with some knowledge who is happy to also use traditional problem solving (searches, tutorials, forums, etc.), I feel the Assistant can give you some additional help. If that help keeps you in the same context, without having to jump between websites and tabs, it’s going to make you more productive.

I’m curious to hear what your experiences with the Assistant have been so far.