AI Blueprint for Modern Data Analysis
After a recent technology panel at HKU, I've had aspiring data analysts and students reach out about navigating today's tough job market. Today, I'll walk you through the simple method I use to consolidate data and perform analytics with APIs and Python.
The Data Analyst's AI Blueprint
A data analyst's core workflow follows a clear path that becomes significantly faster and more powerful with AI. The process begins with defining objectives, followed by the crucial steps of data collection → cleaning → preparation. From there, the analyst moves on to analysis to find patterns, and finally interprets and visualizes those findings to build a compelling narrative.
Checklist: What You'll Need
Before you begin, ensure you have the necessary tools: Python installed with key libraries such as pandas, requests, and transformers; secure API keys for any data sources you plan to use; and a data visualization tool like Tableau Desktop or Power BI Desktop. For students and non-coders, large language models (LLMs) like Gemini or Perplexity are invaluable for generating the setup commands you need to get your environment ready quickly.
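For example, a typical setup command (assuming you use pip as your package manager) looks like this:

```
pip install pandas requests transformers
```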
Step 1: Data Collection & Consolidation
My initial method of manually listing data sources in a CSV file proved inefficient due to frequent access errors, so APIs (Application Programming Interfaces) became my preferred solution. APIs function like agents, retrieving data from a source instantly and in a clean, organized format. For this guide, I use APIs from CoinGecko, Financial Modeling Prep (FMP), and Alpha Vantage to gather stablecoin data, since all three offer free tiers. With your source list in a CSV file and a Python script prepared in VS Code, a single terminal command extracts the data from these APIs and saves the raw results to a new CSV file, ready for the next step.
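To make this concrete, here is a minimal sketch of such a script. It assumes CoinGecko's free /simple/price endpoint and a hard-coded coin list standing in for the CSV source file; the FMP and Alpha Vantage calls follow the same requests pattern with their own URLs and API keys:

```python
# fetch_stablecoins.py -- minimal Step 1 sketch (CoinGecko only)
import pandas as pd
import requests

# Hypothetical source list; in practice, load these IDs from your CSV file.
COINS = ["tether", "usd-coin", "dai"]

url = "https://api.coingecko.com/api/v3/simple/price"
params = {
    "ids": ",".join(COINS),
    "vs_currencies": "usd",
    "include_market_cap": "true",
    "include_24hr_vol": "true",
}
resp = requests.get(url, params=params, timeout=30)
resp.raise_for_status()  # fail loudly on access errors instead of saving bad rows

# Flatten the JSON response into one row per coin and save the raw results.
rows = [{"coin": coin, **fields} for coin, fields in resp.json().items()]
pd.DataFrame(rows).to_csv("stablecoins_raw.csv", index=False)
```

Running `python fetch_stablecoins.py` in the terminal writes the raw data to stablecoins_raw.csv.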
Step 2: Data Cleaning
Now for the core of the analysis! I use Python in VS Code to preprocess, clean, and enhance my data. A quick tip again: you don't have to be a master coder. LLMs are great for generating Python scripts, though you should still understand the basics so you can spot and fix errors. First, we'll use the pandas library, a powerful tool for data cleaning: it loads our raw data, formats dates, handles missing values, and structures everything into a clean table, which is essential for visualization tools like Tableau. Running this script in the terminal generates a cleaned CSV file in your project folder.
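Here is a minimal cleaning sketch, assuming the raw file from Step 1; the column names (date, usd_market_cap) are assumptions you would swap for whatever your raw file actually contains:

```python
# clean_stablecoins.py -- minimal Step 2 sketch with pandas
import pandas as pd

df = pd.read_csv("stablecoins_raw.csv")

# Standardize column names: lowercase, underscores instead of spaces.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Parse dates into a single consistent format, if a date column exists.
if "date" in df.columns:
    df["date"] = pd.to_datetime(df["date"], errors="coerce")

# Handle missing values: drop rows without a market cap, forward-fill the rest.
df = df.dropna(subset=["usd_market_cap"]).ffill()

# Remove exact duplicates and save a clean table for Tableau / Power BI.
df = df.drop_duplicates()
df.to_csv("stablecoins_clean.csv", index=False)
```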
Step 3: Enhancing Data with AI
Next, I can dive deeper into the data using AI. By installing Hugging Face's transformers library, I can perform Natural Language Processing (NLP) on the clean data: a new Python script can extract powerful insights such as market sentiment scores, sentiment labels, and specific answers to questions posed against the text data. After running this script, the CSV file is updated with these AI-generated insights, leaving you a much richer dataset to explore.
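Here is a minimal sketch of such a script, using the transformers pipeline API with its default sentiment model; the headline column is a hypothetical stand-in for whatever text field your dataset actually contains:

```python
# enrich_sentiment.py -- minimal Step 3 sketch with Hugging Face transformers
import pandas as pd
from transformers import pipeline

df = pd.read_csv("stablecoins_clean.csv")

# Load a general-purpose sentiment model (downloaded on first run).
sentiment = pipeline("sentiment-analysis")

# Score each text row, keeping both the label (POSITIVE/NEGATIVE) and the confidence.
results = sentiment(df["headline"].fillna("").tolist())
df["sentiment_label"] = [r["label"] for r in results]
df["sentiment_score"] = [r["score"] for r in results]

df.to_csv("stablecoins_enriched.csv", index=False)
```

The same pipeline API also supports a "question-answering" task if you want to pull specific answers out of longer documents.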
Step 4: Visualization with Tableau
With the AI-enhanced data, the final step is to create compelling dashboards and charts. By connecting the complete CSV file to a tool like Tableau or Power BI, I can drag and drop the table's fields into charts, graphs, or dashboards. For this specific example (refer to my vlog), I produced charts such as stablecoin market cap over time, a sentiment distribution bar chart, and a map of regulatory documents by jurisdiction. The AI-powered insights, now integrated into my data, make it simple to understand stablecoin growth, sentiment shifts, and the origins of new regulations, transforming raw data into an easily digestible narrative for any audience.
Key Takeaways
So, what have we learned about becoming a data analyst in the age of AI? Data is your raw material, and Python is your superpower, with libraries like pandas for cleaning and transformers for AI becoming essential tools. In today's fast-evolving AI landscape, knowing how to streamline your workflow with AI is not just an option, but an emerging necessity to succeed in banking and a vast range of other industries. The best way to master this is to practice, experiment with APIs, and play around with free tools like Tableau Public. Leveraging AI in your analysis journey can be a powerful edge, and if you want to see all these steps in action, check out my latest vlog for a full walkthrough.