This project involves building a data pipeline that automates the process of fetching currency exchange rate (Forex) data from an API, processing, storing it in a SQL database, and accessing it for analysis. The pipeline is built as a python module called `data.py`, composed of two primary Python classes:
`API` and `SQLRepo`, each responsible for specific tasks within the pipeline.
Below is a detailed diagram representing the flow of data through this pipeline, from API call to database storage and retrieval.
1. `API` class
The `API` class interacts with the Alpha Vantage API to retrieve currency exchange data between two specified currencies, such as EUR to USD. After receiving the data, it validates, cleans, and structures it into a pandas DataFrame, making it ready for further processing or storage.
Figure 1: `API` class and part of `get_api_data' method
2. `SQLRepo` class
The `SQLRepo` class is designed to handle the storage and retrieval of the processed data. It connects to a SQLite database, where it creates a table based on the currency pair, and loads the data from the DataFrame into this table. Additionally, it allows querying the database to retrieve the stored data back into a DataFrame for analysis or reporting.
Figure 2: `SQLRepo` class and its methods
Conclusion
As a custom Python ETL pipeline, the `data.py` module provides a class `API` that EXTRACT and TRANSFORM foreign exchange data from the Alpha Vantage API and returns it as a clean Pandas DataFrame. class SQLRepo handles LOAD - from the dataframe to a SQL database; and pulling data from the SQL database (used here as Data Warehouse) for further processing/analysis.
Check out the editable code from my github repo.
Python Data Manipulation
API Integration
Automation
SQLite
SQL