Previously we looked at getting started with Databricks Community Edition and discovered that the Assistant features aren’t supported there.
Prerequisites
- Set up a paid account or a free trial. See how we set up the free trial using Microsoft Azure.
- CSV data you want to query.
If you have followed the same steps, you should now be able to see the Assistant and “Generate with AI” options in your notebook.
Example 1: Import a CSV and query it
In this example we will attempt to recreate what we did in the getting started post, using only the Assistant for guidance.
Importing and querying data is a common task. Let’s start by asking: “I want to import and query as CSV file. What do i need to do”
Copying the suggested code into my notebook and running it:
There are a few issues here:
- The supplied code is badly formatted because of extra tab indentation. The Assistant identifies this easily and helps me fix the code.
- We can’t actually use a local file with this code. At first you might try a bunch of different file path syntaxes, check that it’s a valid file on your computer, and so on. At no point did the Assistant tell me that this code can’t read a file from my local computer.
So let’s assume we worked out that the file probably needs to be on the web. This time we’ll be a bit more specific and give it the URL in the question.
Trying this out gives the following:
There are a few issues here:
- The first step seems to work.
- The second step fails. I tried Diagnose a few times, but the first problem was user error: I hadn’t read the instructions closely enough. The step needs to be changed to use SQL.
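For reference, here is a minimal sketch of the "file on the web" idea, written with only the Python standard library so it runs anywhere. It is not the code the Assistant produced, and the function name `fetch_csv_rows` is my own. The point it illustrates is that the notebook fetches the file from a URL it can reach rather than from a path on your laptop.

```python
import csv
import io
import urllib.request


def fetch_csv_rows(url):
    """Download a CSV from `url` and return (header, rows).

    `header` is a list of column names; `rows` is a list of tuples of strings.
    """
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    reader = csv.reader(io.StringIO(text))
    header = next(reader)                   # first line holds the column names
    rows = [tuple(row) for row in reader]   # remaining lines are the data
    return header, rows


# Hypothetical usage (the URL is a placeholder, not a real dataset):
# header, rows = fetch_csv_rows("https://example.com/my-data.csv")
```

In a Databricks notebook, one option from here would be `spark.createDataFrame(rows, schema=header)` followed by `createOrReplaceTempView` on the result, so that a SQL cell can query the data. Treat that as an outline under my assumptions rather than the Assistant's recommended path.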
At this point this isn’t looking promising. What prompts would you have used as a first attempt?
Example 2: Finding and querying the immigrant table
This example will assume you already have some data in Databricks.
If you want a quick way to get your favourite CSV check out our post here.
The goal of this example will be to get assistance to query data in the following table
First we’ll take their suggestion of finding tables
It doesn’t give us our immigrant table though. If you have an idea of the schema you’re looking for, you can home in on your exact table and get some example SQL:
You can paste this in and run it successfully.
I did try a few different ways to find the table using natural language. These tended to give me SQL to search the catalog rather than suggestions for the correct table I might want to query. Table suggestions would be useful when you have lots of data and don’t know exactly where everything is.
Example 3: Querying the immigrant table
For this set of assistant queries we’ll assume you know the exact table and the data you want to get out of it.
For these, I found the best approach was to ground it in the specific table by naming it explicitly, e.g. “How do i find how many immigrants were on the Cornwall in ‘db_workspace.opendata.immigrants’”
I did notice, however, that it seems to take a guess rather than use the real table when suggesting a query. My example was to find what the biggest family groups were on each ship every year.
The suggested query makes up a bunch of fields that don’t exist and won’t give you the result you want.
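One way to catch invented fields before running a suggested query is to compare the column names it uses against the table's real columns. The sketch below is a plain Python helper of my own (`find_unknown_columns` is not a Databricks feature). The "actual" list contains only the two column names mentioned in this post, and the "referenced" names are hypothetical stand-ins for the kind of fields the Assistant made up. In a notebook, the real list could come from `spark.table("db_workspace.opendata.immigrants").columns` or a `DESCRIBE TABLE` statement.

```python
def find_unknown_columns(referenced, actual):
    """Return the names in `referenced` that are not real columns.

    The comparison ignores case, on the assumption that column names
    are resolved case-insensitively (Spark SQL's default behaviour).
    """
    known = {name.lower() for name in actual}
    return [name for name in referenced if name.lower() not in known]


# Partial, illustrative list: only the columns named in this post.
actual_columns = ["Last name", "Given names"]

# Hypothetical examples of columns a generated query might reference.
referenced_columns = ["Last name", "family_id", "arrival_year"]

unknown = find_unknown_columns(referenced_columns, actual_columns)
print(unknown)  # ['family_id', 'arrival_year']
```

If the list comes back non-empty, that is a strong hint the query was guessed rather than grounded in the real table.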
Example 4: Fixing syntax errors
When people first start with SQL, it’s not obvious how column name selection works. In this case we have field names with spaces, e.g. Last name, Given names.
Here is a common mistake people make:
Asked to diagnose the error in this instance, it assumes you want to display Given names as a string and gives an incorrect suggestion:
As with other assistants, I find the best way to “reset” its errant thinking is to clear the history and ask for the fix again. This time it gives us the correct answer. A note for all of those who have been in SQL Server land for a long time and would have expected the answer to be [Given names] instead: the below is in fact the correct answer.
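To make the difference concrete, here is a minimal sketch of the string-literal mistake next to the backtick form. In Databricks SQL, a column name containing spaces is quoted with backticks. The snippet uses Python's built-in `sqlite3` module purely as a stand-in, which is my assumption for illustration: SQLite also accepts backtick-quoted identifiers, so the snippet runs anywhere without a cluster. The table contents are made up.

```python
import sqlite3

# In-memory stand-in table with the two column names from this post.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE immigrants ("Last name" TEXT, "Given names" TEXT)')
conn.execute("INSERT INTO immigrants VALUES ('Smith', 'Mary Ann')")  # made-up row

# Common mistake: single quotes make a string literal, not a column reference,
# so every row returns the text 'Given names' itself.
literal = conn.execute("SELECT 'Given names' FROM immigrants").fetchall()

# Backticks quote the identifier, so the actual column values come back.
column = conn.execute("SELECT `Given names` FROM immigrants").fetchall()

print(literal)  # [('Given names',)]
print(column)   # [('Mary Ann',)]
```

The first query runs without an error, which is part of what makes this mistake hard to spot: you get a result, just not the one you wanted.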
Conclusion
For a complete novice, I feel the Assistant alone would not be enough. You have to know the right questions to ask and know enough about the platform to recognise when it’s completely off track.
For someone with some knowledge who is happy to also use traditional problem solving (searches, tutorials, forums, etc.), I feel the Assistant can give you some additional help. If the help keeps you in the same context, without having to jump between websites, tabs, etc., it’s going to make you more productive.
I’m curious to hear what your experiences with the Assistant have been so far.