1. Code Writing

Objective: Write a Python script for data analysis.

Prompt Example:

Use Pandas to load a CSV file.
Perform basic data cleaning (e.g., handling missing values, renaming columns).
Generate a summary of statistical data for numerical columns.
Create a data visualisation using Matplotlib.

Audience: University students and staff with a basic understanding of Python.

Expected Outcome: A clear and functional Python script that can be used as a template for data analysis tasks.

Example User Query:

Generate a Python script that performs the following tasks:

Load a CSV file using Pandas.
Clean the data by handling missing values and renaming columns.
Generate a statistical summary for numerical columns.
Create a visualisation of the data using Matplotlib.

The audience is university students and staff with a basic understanding of Python. The outcome should be a clear and functional Python script that can be used as a template for data analysis tasks.

ELM’s Response:

Below is a Python script designed to load a CSV file using Pandas, clean the data by handling missing values and renaming columns, generate a statistical summary for numerical columns, and create a data visualisation using Matplotlib. This script can serve as a template for basic data analysis tasks.
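
A minimal sketch of such a script follows. It is an illustration consistent with the instructions and notes on this page rather than ELM's verbatim output. The function names and the column names (old_name_1, old_name_2, Column1, Column2) are hypothetical placeholders to replace with those in your own file.

```python
import sys

import matplotlib.pyplot as plt
import pandas as pd


def load_data(csv_path):
    """Load a CSV file into a DataFrame."""
    return pd.read_csv(csv_path)


def clean_data(df):
    """Rename columns and fill missing numerical values with the column mean."""
    # Hypothetical mapping: replace the keys with the column names in your file.
    df = df.rename(columns={"old_name_1": "Column1", "old_name_2": "Column2"})

    # Fill gaps in numerical columns only; text columns are left untouched.
    numeric_cols = df.select_dtypes(include="number").columns
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())
    return df


def summarise(df):
    """Return descriptive statistics; describe() reports on numerical columns by default."""
    return df.describe()


def plot_histogram(df, column="Column1"):
    """Draw a histogram of one column and return the Matplotlib figure."""
    fig, ax = plt.subplots()
    ax.hist(df[column].dropna(), bins=20)
    ax.set_title(f"Histogram of {column}")
    ax.set_xlabel(column)
    ax.set_ylabel("Frequency")
    return fig


def main(csv_path):
    df = clean_data(load_data(csv_path))
    print(summarise(df))
    plot_histogram(df, "Column1")
    plt.show()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("Usage: python data_analysis_template.py <path-to-csv>")
```

Splitting the work into small functions keeps each step (loading, cleaning, summarising, plotting) easy to swap out when adapting the template to a different dataset.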

Instructions:

Prepare your environment: Ensure you have Pandas and Matplotlib installed. If you don't have them yet, you can install both with pip, for example by running pip install pandas matplotlib.

Save the script: Copy and save the above script into a Python file, for example, data_analysis_template.py.
Run the script: Execute the script from the command line or your preferred Integrated Development Environment (IDE), passing the path to the CSV file you want to analyse.

Notes:

Column Renaming: Modify the df.rename function call to match the actual columns in your CSV file.
Handling Missing Values: The example fills missing values with the mean of the respective columns. Depending on your dataset, you might choose different strategies like median, mode, or dropping missing values.
Visualisation: The histogram example is created for Column1. You can customise this to develop other types of plots and visualise different columns based on the requirements of your analysis.
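
The alternative strategies for missing values mentioned in the notes can be sketched as follows. This is an illustrative example on a small made-up DataFrame; the column names Column1 and Category are placeholders, not part of any real dataset.

```python
import pandas as pd

# Small illustrative frame; the values and column names are placeholders.
df = pd.DataFrame({
    "Column1": [1.0, None, 3.0, 100.0],
    "Category": ["a", "b", None, "b"],
})

# Median: less sensitive than the mean when a column contains outliers.
median_filled = df["Column1"].fillna(df["Column1"].median())

# Mode: suits categorical columns. mode() returns a Series, so take its first value.
mode_filled = df["Category"].fillna(df["Category"].mode()[0])

# Drop: remove every row that contains at least one missing value.
dropped = df.dropna()

print(median_filled.tolist())
print(mode_filled.tolist())
print(len(dropped))
```

Which strategy is appropriate depends on the data: filling keeps every row but introduces estimated values, while dropping keeps only observed values at the cost of a smaller dataset.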

This template provides a foundation for data analysis and can be expanded with additional data cleaning, analysis, and visualisation steps as required for more complex datasets.
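
As one example of an additional visualisation step, the sketch below adds a scatter plot and a box plot. The function names plot_scatter and plot_box and the column names Column1 and Column2 are hypothetical and should be adapted to your own data.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_scatter(df, x="Column1", y="Column2"):
    """Plot one numerical column against another and return the figure."""
    fig, ax = plt.subplots()
    ax.scatter(df[x], df[y])
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    ax.set_title(f"{y} against {x}")
    return fig


def plot_box(df, columns):
    """Draw one box per column to compare their spread and return the figure."""
    fig, ax = plt.subplots()
    df[columns].plot.box(ax=ax)
    ax.set_title("Distribution of selected columns")
    return fig
```

Both functions return the figure rather than displaying it, so you can call plt.show() to view it interactively or fig.savefig("plot.png") to write it to a file.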

This article was published on 2024-10-08