Project information
- Category: Android App
- Client: Aston University Project
- Project date: 24 April, 2023
- Project URL: github.com/nisathnasar/Spending-Tracker
Track your spending
Spending Tracker is an original app that lets individuals track their finances using data from HSBC bank statements. The project combines two main areas of work:
- Developing an algorithm to parse and process the data in a PDF HSBC bank statement, and
- Implementing that algorithm in an Android app that lets users upload statements and track their expenses.
The app relies on the following libraries and services:
- PdfBox-Android by Tom Roush for stripping text from PDFs
- MPAndroidChart by Philipp Jahoda (PhilJay) for visually pleasing analytics
- Google Firebase for authentication and the Realtime Database
The algorithm
Once text is stripped from a bank statement, the data looks like this when printed in a console:
Hence, the goal is to organise the data into a database of key-value pairs. Because the stripped text is untidy and unlabelled, we do not know at this stage what any individual value means. I therefore follow the CRISP-DM data mining framework to process the data in a structured way.
The purpose of this is to sanitise the data, deal with missing values using appropriate imputation strategies, and handle data types.
At the same time, patterns in the stripped text are used to work out what each value means so that it can be organised into the database. The app uses several regular expressions (regex) to detect these patterns, such as:
- Lines that begin with a date value: "^[0-9]{2}\s[a-zA-Z]{3}\s[0-9]{2}\s[()a-zA-Z0-9_-].*$"
- Lines that end with 1 money value: "[,()a-zA-Z0-9_-].*\s[0-9]{1}[0-9]*\.[0-9]{2}$"
- Lines that end with 2 money values: "[,()a-zA-Z0-9_-].*\s[0-9]{1}[0-9]*\.[0-9]{2}\s[0-9]{1}[0-9]*\.[0-9]{2}$"
- Lines that contain 'BALANCE BROUGHT FORWARD'
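As an illustration of how these patterns might be applied, here is a minimal Java sketch that classifies a single line of stripped text. The three regular expressions are copied from the list above; the class name `LineClassifier`, its method names and the sample lines are hypothetical and are not taken from the project's source code.

```java
import java.util.regex.Pattern;

// Hypothetical helper; only the regex strings come from the list above.
final class LineClassifier {
    static final Pattern STARTS_WITH_DATE = Pattern.compile(
            "^[0-9]{2}\\s[a-zA-Z]{3}\\s[0-9]{2}\\s[()a-zA-Z0-9_-].*$");
    static final Pattern ENDS_WITH_ONE_AMOUNT = Pattern.compile(
            "[,()a-zA-Z0-9_-].*\\s[0-9]{1}[0-9]*\\.[0-9]{2}$");
    static final Pattern ENDS_WITH_TWO_AMOUNTS = Pattern.compile(
            "[,()a-zA-Z0-9_-].*\\s[0-9]{1}[0-9]*\\.[0-9]{2}\\s[0-9]{1}[0-9]*\\.[0-9]{2}$");

    /** True if the whole line matches the "begins with a date" pattern. */
    static boolean startsWithDate(String line) {
        return STARTS_WITH_DATE.matcher(line).matches();
    }

    /** Returns how many money values (0, 1 or 2) the line ends with. */
    static int trailingAmounts(String line) {
        // A line ending in two amounts also matches the one-amount
        // pattern, so the more specific pattern is checked first.
        if (ENDS_WITH_TWO_AMOUNTS.matcher(line).matches()) {
            return 2;
        }
        if (ENDS_WITH_ONE_AMOUNT.matcher(line).matches()) {
            return 1;
        }
        return 0;
    }

    /** True if the line carries the brought-forward balance marker. */
    static boolean isBroughtForward(String line) {
        return line.contains("BALANCE BROUGHT FORWARD");
    }

    public static void main(String[] args) {
        // Made-up sample lines, for demonstration only.
        String[] samples = {
            "24 Apr 23 VIS TESCO STORES 12.50 1234.56",
            "VIS COFFEE SHOP 3.20",
            "BALANCE BROUGHT FORWARD 1500.00"
        };
        for (String line : samples) {
            System.out.println(line
                    + " | date=" + startsWithDate(line)
                    + " | amounts=" + trailingAmounts(line)
                    + " | broughtForward=" + isBroughtForward(line));
        }
    }
}
```

Note that the amount patterns expect digits only before the decimal point, so a value written with a thousands separator, such as 1,234.56, would not match until its commas have been removed.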
Here is the pseudocode expression of the algorithm:
split pages into a list of pages;
for each page {
    get all text in the page;
    for each line in the page {
        if the line matches the start-regex key {
            replace all commas with space
            get carried forward balance amount
        }
    }
    if the page does not have any transaction data {
        break loop;
    }
    remove all the lines after end-regex key
    remove all the lines before start-regex key
    for each line {
        format date to DD-MMM-YY
        combine broken lines that overflow to next line
        if date is missing {
            copy preceding date
        }
        add comma after each variable
        if balance is missing {
            calculate balance and assign to line
        }
    }
    upload all the transactions in the page to the database
}
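The two imputation steps in the pseudocode (copying the preceding date and calculating a missing balance) could be sketched in Java roughly as follows. This is an illustrative sketch under stated assumptions, not the app's actual code: the `Row` type, the `fill` method and the use of a single signed amount (negative for money out, positive for money in) are all hypothetical, and the sketch assumes a missing balance equals the previous balance plus the transaction amount.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "copy preceding date" and
// "calculate balance" steps from the pseudocode.
final class TransactionFiller {

    /** Hypothetical row type; null marks a value missing from the statement text. */
    static final class Row {
        String date;             // e.g. "24-Apr-23", or null if the line had no date
        final String details;
        final BigDecimal amount; // assumption: negative = money out, positive = money in
        BigDecimal balance;      // null if the line had no balance

        Row(String date, String details, BigDecimal amount, BigDecimal balance) {
            this.date = date;
            this.details = details;
            this.amount = amount;
            this.balance = balance;
        }
    }

    /** Fills missing dates and balances in place, starting from the brought-forward balance. */
    static void fill(List<Row> rows, BigDecimal broughtForward) {
        String lastDate = null;
        BigDecimal running = broughtForward;
        for (Row row : rows) {
            if (row.date == null) {
                row.date = lastDate;      // copy the preceding date
            } else {
                lastDate = row.date;
            }
            if (row.balance == null) {
                row.balance = running.add(row.amount); // derive the missing balance
            }
            running = row.balance;        // a printed balance, when present, is kept as-is
        }
    }

    public static void main(String[] args) {
        // Made-up rows, for demonstration only.
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("24-Apr-23", "TESCO STORES", new BigDecimal("-12.50"), null));
        rows.add(new Row(null, "SALARY", new BigDecimal("100.00"), new BigDecimal("1087.50")));
        fill(rows, new BigDecimal("1000.00"));
        for (Row row : rows) {
            System.out.println(row.date + "," + row.details + "," + row.amount + "," + row.balance);
        }
    }
}
```

`BigDecimal` is used rather than `double` so that pence values are not affected by floating-point rounding.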
Android App
The algorithm is implemented in Java using Android Studio. The app uses the standard Model View Controller architecture, in which activity classes act as controllers. The app follows standard naming conventions where possible, such as camelCase variables and conventional class and file names. It also uses appropriate components for efficiency, such as Fragments for reusable portions of the UI and RecyclerViews for efficient lists. Finally, the app follows the Material 3 design specification. The following diagrams aim to visualise the architecture.