Last week I spent time on the Microsoft Azure Databricks booth at the Sydney AI Tour. Not knowing what I’d have available to show attendees, I wanted to test something simple:
Can the Free Edition of Databricks actually do useful, real-world analytics fast enough to demo live and make people interested in what I’m doing?
Short answer: yes.
The Setup: Real Data, Not Toy Examples
Instead of synthetic datasets that never feel quite real, or customer data I can’t show, I pulled from three very different sources:
- FastF1 – detailed Formula 1 telemetry and race data
- Kloppy – soccer data covering tracking, movement, sequences, etc.
- yfinance – market data for equities
The finance one was a late addition as I ended up speaking to a bunch of banking/insurance type people who were more comfortable chatting about stocks than sport.
What I built at the booth
Using only Genie and basic notebooks, with no dashboards or pre-built queries and little knowledge of the datasets themselves, I just asked questions.
Here are some of my favourite questions I asked Genie.
FastF1
Visualise the speed, throttle and brake for the top 3 drivers at that Australian Grand Prix
Give me a speed heat map of these drivers at the race
Kloppy – Soccer
Visualise progressive passes
And it told me the top progressive passers:
- Ivan Rakitić: 18 progressive passes
- Lionel Andrés Messi Cuccittini: 15 progressive passes
- Sergio Busquets i Burgos: 13 progressive passes
- Marc-André ter Stegen: 12 progressive passes
- Sergi Roberto Carnicer: 6 progressive passes
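For readers unfamiliar with the metric, here is a minimal sketch of how a progressive pass count like the one above could be computed. It is not the code Genie generated, and the definition is an assumption: a pass counts as progressive if it ends at least 25% closer to the opponent’s goal than it started. The pitch dimensions, the `min_gain` threshold, the function names and the sample passes are all hypothetical.

```python
from collections import Counter
from math import hypot

# Assumed pitch model: metres on a 105 x 68 pitch, attacking left to right,
# so the centre of the opponent's goal sits at (105, 34).
GOAL = (105.0, 34.0)


def distance_to_goal(x, y):
    """Straight-line distance from a point to the centre of the goal."""
    return hypot(GOAL[0] - x, GOAL[1] - y)


def is_progressive(start, end, min_gain=0.25):
    """Assumed rule: the pass ends at least `min_gain` closer to goal."""
    before = distance_to_goal(*start)
    after = distance_to_goal(*end)
    return before > 0 and (before - after) / before >= min_gain


def count_progressive(passes):
    """Count progressive passes per player.

    `passes` is an iterable of (player, start_xy, end_xy) tuples.
    """
    return Counter(
        player for player, start, end in passes if is_progressive(start, end)
    )


# Made-up passes for illustration only (not taken from the match data).
sample = [
    ("Player A", (40.0, 34.0), (70.0, 30.0)),  # big gain towards goal
    ("Player A", (60.0, 20.0), (58.0, 40.0)),  # sideways, slightly backwards
    ("Player B", (30.0, 34.0), (55.0, 34.0)),  # straight up the pitch
]

for player, total in count_progressive(sample).most_common():
    print(f"{player}: {total} progressive passes")
```

Different data providers define progressive passes differently, so the threshold here is the first thing I’d check before comparing numbers across sources.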
Summarise shots on goal:
yfinance – ASX 200 and S&P 500
Visualise ASX top stocks 1 month vs 3 months:
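Behind a 1 month vs 3 month comparison like this is some simple return arithmetic. The sketch below shows one way to do it; it is not what Genie produced. It assumes daily closing prices (the kind of series yfinance can supply) held in plain lists, and roughly 21 trading days per month. The tickers and prices are made up.

```python
# Assumption: roughly 21 trading days per month.
ONE_MONTH = 21
THREE_MONTHS = 63


def period_return(closes, days):
    """Simple return over the last `days` trading days.

    `closes` is a list of daily closing prices, oldest first.
    """
    if len(closes) <= days:
        raise ValueError("not enough price history")
    start, end = closes[-days - 1], closes[-1]
    return end / start - 1.0


def compare_returns(prices):
    """Return (ticker, 1 month return, 3 month return) rows, best 3 month first.

    `prices` maps ticker -> list of daily closes, oldest first.
    """
    rows = [
        (ticker, period_return(c, ONE_MONTH), period_return(c, THREE_MONTHS))
        for ticker, c in prices.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical tickers and made-up prices, for illustration only.
prices = {
    "AAA": [100.0 + 0.5 * i for i in range(64)],  # steady climb
    "BBB": [100.0] * 43 + [110.0] * 21,           # one jump a month ago
}

table = compare_returns(prices)
for ticker, one_month, three_months in table:
    print(f"{ticker}: 1M {one_month:+.1%}, 3M {three_months:+.1%}")
```

Sorting on the 3 month column is an arbitrary choice; sorting on the gap between the two columns would instead highlight stocks whose momentum has recently changed.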
ASX Top Dividend companies:
An attendee was after a sunburst graph so we gave this one a go: Stock performance 3 Month returns by sector
Trading volumes as a heat map calendar
I then took a lot of these concepts and asked it to make an executive dashboard (including the radial clock in the blog post header). We iterated on this one a few times, as there were some colour contrast issues that took it a few goes to fix.
Overall, without much instruction and without me having to write a line of Python myself, it would translate my questions into Python and a visual almost instantly, and give me some insights.
For me, the shift is that you’re no longer building dashboards; you’re building data you can talk to.
What surprised me
A few things stood out:
- Speed – asking a question, generating the Python script and executing it was fast enough for a live demo on a booth where attention spans are short
- Low friction – the account was easy to set up, I was using freely available APIs, and I didn’t run out of “credit” during the shift
- Suggestions – when I asked for a more interesting visualisation, the suggestions were pretty good. Most people were only asking for a pie chart, so the heat maps and similar visuals were really exciting.