Data analysis

Large language models (LLMs) can be powerful tools for data annotation and classification. The next video discusses this in more detail before we move on to more advanced data analysis.

Data annotation and classification

Advanced data analysis

When it comes to more advanced data analysis using LLMs, I have bad news and good news for you. The bad news is that you will need some programming skills, as most advanced features are only accessible through APIs, not graphical interfaces. You will also need programming to automate tasks and build more complex workflows.

But the good news is that there has never been a better time to learn programming, thanks to large language models themselves. I believe there have traditionally been two main barriers to learning programming, and LLMs help overcome both of them.

The first barrier is debugging: the time-consuming and extremely frustrating process of finding and fixing errors in code. An interesting paper, "An Exploratory Study of Debugging Episodes" (Alaboudi and LaToza, 2021), analysed recordings of actual programming sessions. As you can see in the plot below, some participants spent most of their time just on debugging. Note that these were professional programmers; for beginners this could be worse and might completely demotivate them from learning.

[Image: plot of the time participants spent on debugging (Alaboudi and LaToza, 2021)]

Today, you can simply paste your code and the error message into an LLM, and it will typically identify the problem and suggest a fix.
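As a minimal sketch of this workflow, assuming Python, the snippet below captures the error message from a deliberately broken piece of code and combines it with the code into a single message you could paste into a chatbot. The function name `build_debug_prompt`, the example bug, and the prompt wording are all illustrative assumptions, not a prescribed format.

```python
import traceback


def build_debug_prompt(source_code: str, error_text: str) -> str:
    """Combine code and its error message into one message for a chatbot.

    Hypothetical helper: the wording of the prompt is only an example.
    """
    return (
        "The following Python code raises an error.\n\n"
        "Code:\n"
        f"{source_code.strip()}\n\n"
        "Error message:\n"
        f"{error_text.strip()}\n\n"
        "Please explain the cause and suggest a fix."
    )


# A deliberately broken snippet, used only for illustration:
# the list has three items, so index 5 does not exist.
buggy_code = "values = [1, 2, 3]\nprint(values[5])"

error_text = ""
try:
    exec(buggy_code)
except Exception:
    # format_exc() returns the full traceback as a string,
    # i.e. the same text you would see in the terminal.
    error_text = traceback.format_exc()

prompt = build_debug_prompt(buggy_code, error_text)
print(prompt)
```

Including both the code and the complete error text matters here: the error message alone rarely tells the model which line of your code caused the problem.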

The second barrier is how long it takes before you can apply new programming skills to actual research tasks. You might not have the luxury of spending months on basic exercises before getting to practical applications, and this too can be very demotivating. However, LLMs can generate high-quality code themselves, so a basic understanding may be enough to run the generated code and follow its logic. You’ll still need to learn programming to effectively verify and troubleshoot the code, but you can start applying it to real research tasks much more quickly.

In the following videos, I’ll demonstrate how you can perform advanced data analysis using LLMs. The approach requires programming, but all of the code will be generated by LLMs. In the first video, I simply copy and paste code from a typical chatbot interface; in the second, I use a more efficient tool that further simplifies the process and reduces the time required. These two recordings are experimental because I recorded myself doing this live, but I think it might be interesting to see the real-time, unedited process of working with LLMs.

Advanced data analysis with LLMs in 30 minutes

Advanced data analysis with LLMs in less than 20 minutes