Spending Tracker

Project information

Track your spending

Spending Tracker is an original app that enables individuals to track their finances using data from HSBC bank statements. The project combines two main areas of work:

  • Developing an algorithm to interpret and process the data in a PDF HSBC bank statement, and
  • Implementing that algorithm in an Android app that lets users upload statements and track their expenses.
Hence, the app is considered relatively complex for a Final Year Project. The app also relies on several libraries for some of its functions.

The algorithm

Once text is stripped from a bank statement, the data looks like this when printed in a console:

The goal is therefore to organise the data into a database of key-value pairs. Because the stripped text is untidy, the meaning of each value is not known in advance. I therefore follow CRISP-DM (Cross-Industry Standard Process for Data Mining), a data mining process model, to structure the data processing.

The purpose of this is to sanitise the data, deal with missing values using appropriate imputation strategies, and handle data types.

At the same time, patterns in the stripped text are used to work out what each piece of data means and to organise it into a database. The app uses several regular expressions (regex) to identify patterns such as:

  • Lines that begin with a date value: "^[0-9]{2}\s[a-zA-Z]{3}\s[0-9]{2}\s[()a-zA-Z0-9_-].*$"
  • Lines that end with 1 money value: "[,()a-zA-Z0-9_-].*\s[0-9]{1}[0-9]*\.[0-9]{2}$"
  • Lines that end with 2 money values: "[,()a-zA-Z0-9_-].*\s[0-9]{1}[0-9]*\.[0-9]{2}\s[0-9]{1}[0-9]*\.[0-9]{2}$"
  • Lines that contain 'BALANCE BROUGHT FORWARD'
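
To make the patterns above concrete, here is a minimal Java sketch that applies them to single lines of stripped text. The class name `StatementLineClassifier`, the `LineKind` enum and the sample lines are hypothetical and exist only for illustration; only the three regular expressions and the 'BALANCE BROUGHT FORWARD' marker come from the list above. It is one possible way of combining the patterns, not necessarily how the app does it.

```java
import java.util.regex.Pattern;

// Hypothetical helper: the class, enum and method names are illustrative only.
final class StatementLineClassifier {

    enum LineKind { BALANCE_BROUGHT_FORWARD, TWO_AMOUNTS, ONE_AMOUNT, OTHER }

    // Patterns copied from the list above, with backslashes escaped for Java string literals.
    static final Pattern STARTS_WITH_DATE = Pattern.compile(
            "^[0-9]{2}\\s[a-zA-Z]{3}\\s[0-9]{2}\\s[()a-zA-Z0-9_-].*$");
    static final Pattern ENDS_WITH_ONE_AMOUNT = Pattern.compile(
            "[,()a-zA-Z0-9_-].*\\s[0-9]{1}[0-9]*\\.[0-9]{2}$");
    static final Pattern ENDS_WITH_TWO_AMOUNTS = Pattern.compile(
            "[,()a-zA-Z0-9_-].*\\s[0-9]{1}[0-9]*\\.[0-9]{2}\\s[0-9]{1}[0-9]*\\.[0-9]{2}$");

    // True when the whole line has the shape "DD MMM YY <rest of line>".
    static boolean startsWithDate(String line) {
        return STARTS_WITH_DATE.matcher(line).matches();
    }

    // Classifies a line by the marker text or by the amounts it ends with.
    static LineKind classify(String line) {
        if (line.contains("BALANCE BROUGHT FORWARD")) {
            return LineKind.BALANCE_BROUGHT_FORWARD;
        }
        // Check two amounts first: a line ending in two amounts
        // also satisfies the one-amount pattern.
        if (ENDS_WITH_TWO_AMOUNTS.matcher(line).find()) {
            return LineKind.TWO_AMOUNTS;
        }
        if (ENDS_WITH_ONE_AMOUNT.matcher(line).find()) {
            return LineKind.ONE_AMOUNT;
        }
        return LineKind.OTHER;
    }

    public static void main(String[] args) {
        // Made-up sample lines, not taken from a real statement.
        String[] samples = {
            "12 Jan 23 BALANCE BROUGHT FORWARD 1500.00",
            "12 Jan 23 VIS TESCO STORES 12.50 1487.50",
            "DD NETFLIX 9.99",
            "LONDON GB"
        };
        for (String line : samples) {
            System.out.println(classify(line) + " | date=" + startsWithDate(line) + " | " + line);
        }
    }
}
```

The order of the checks matters in this sketch: the two-amount pattern is tested before the one-amount pattern because the latter also matches any line that ends in two amounts.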

Here is the pseudocode expression of the algorithm:

            
split pages into a list of pages;
for each page {
   get all text in the page;
   for each line in the page {
      if the line matches the start-regex key {
         replace all commas with space;
         get carried forward balance amount;
      }
   }
   if the page does not have any transaction data {
      break loop;
   }
   remove all the lines after end-regex key;
   remove all the lines before start-regex key;

   for each line {
      format date to DD-MMM-YY;
      combine broken lines that overflow to next line;

      if date is missing {
         copy preceding date;
      }

      add comma after each variable;

      if balance is missing {
         calculate balance and assign to line;
      }
   }

   upload all the transactions in the page to the database;
}
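
The per-line cleaning steps in the pseudocode (date formatting, copying the preceding date, and calculating a missing balance) could be sketched in Java as follows. This is only an illustration under stated assumptions: the `Row` type, the `TransactionCleaner` class and the use of a signed amount (negative for money out, positive for money in) are hypothetical, since the pseudocode does not say how the app represents a transaction or derives the balance.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the gap-filling steps; names and data model are assumptions.
final class TransactionCleaner {

    // Hypothetical row type: one parsed statement line.
    static final class Row {
        String date;                    // "DD-MMM-YY", or null if the line had no date
        final String description;
        final BigDecimal signedAmount;  // assumption: negative = money out, positive = money in
        BigDecimal balance;             // null if no balance was printed on the line

        Row(String date, String description, BigDecimal signedAmount, BigDecimal balance) {
            this.date = date;
            this.description = description;
            this.signedAmount = signedAmount;
            this.balance = balance;
        }
    }

    // Turns "12 Jan 23" into "12-Jan-23" (the DD-MMM-YY form used in the pseudocode).
    static String formatDate(String raw) {
        return raw.trim().replaceAll("\\s+", "-");
    }

    // Copies the preceding date into rows without one and fills in missing balances
    // by keeping a running total that starts from the brought-forward balance.
    static void fillGaps(List<Row> rows, BigDecimal broughtForward) {
        String lastDate = null;
        BigDecimal running = broughtForward;
        for (Row row : rows) {
            if (row.date == null) {
                row.date = lastDate;          // copy preceding date
            } else {
                lastDate = row.date;
            }
            running = running.add(row.signedAmount);
            if (row.balance == null) {
                row.balance = running;        // calculate balance and assign to line
            } else {
                running = row.balance;        // a printed balance takes precedence
            }
        }
    }

    public static void main(String[] args) {
        // Made-up sample data.
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(formatDate("12 Jan 23"), "TESCO STORES", new BigDecimal("-12.50"), null));
        rows.add(new Row(null, "NETFLIX", new BigDecimal("-9.99"), new BigDecimal("1477.51")));
        rows.add(new Row(formatDate("13 Jan 23"), "SALARY", new BigDecimal("100.00"), null));
        fillGaps(rows, new BigDecimal("1500.00"));
        for (Row r : rows) {
            System.out.println(r.date + "," + r.description + "," + r.signedAmount + "," + r.balance);
        }
    }
}
```

`BigDecimal` is used here rather than `double` so that pence values are not affected by floating-point rounding.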
            
          

Android App

The algorithm is implemented in Java using Android Studio. The app uses the standard Model-View-Controller (MVC) architecture, in which activity classes act as controllers. The app follows standard naming conventions where possible for variables (camelCase), class names and file names. It also uses appropriate components for efficiency, such as Fragments for reusable portions of the UI and RecyclerViews for efficient listing. Finally, the app follows the Material 3 design specifications. The following diagrams aim to visualise the architecture.
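
As a rough illustration of the Model-View-Controller split described above, the following plain-Java sketch separates the three roles. All names (`TransactionModel`, `TransactionView`, `TransactionController`) are hypothetical, and Android classes are deliberately left out so that the example stays self-contained; in the app itself the controller role is played by activity classes and the view role by layouts, Fragments and RecyclerViews.

```java
import java.util.ArrayList;
import java.util.List;

// Model: holds the transactions and knows nothing about the UI.
final class TransactionModel {
    private final List<String> transactions = new ArrayList<>();

    void add(String transaction) {
        transactions.add(transaction);
    }

    List<String> all() {
        return new ArrayList<>(transactions); // defensive copy
    }
}

// View: anything that can display a list of transactions.
interface TransactionView {
    void show(List<String> transactions);
}

// Controller: reacts to user actions, updates the model and refreshes the view.
final class TransactionController {
    private final TransactionModel model;
    private final TransactionView view;

    TransactionController(TransactionModel model, TransactionView view) {
        this.model = model;
        this.view = view;
    }

    void onTransactionUploaded(String transaction) {
        model.add(transaction);
        view.show(model.all());
    }
}

final class MvcDemo {
    public static void main(String[] args) {
        // A console "view" stands in for the real UI in this sketch.
        TransactionView consoleView = items -> System.out.println("Showing: " + items);
        TransactionController controller =
                new TransactionController(new TransactionModel(), consoleView);
        controller.onTransactionUploaded("12-Jan-23,TESCO STORES,12.50");
        controller.onTransactionUploaded("13-Jan-23,SALARY,100.00");
    }
}
```

Keeping the view behind an interface means the same controller logic can be exercised without a device or emulator, for example from a unit test.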

Use Case

Demonstration of App (for examiners)

Architecture

Authentication

Main Activity

File Upload

Dashboard

PDF Processing

Transaction Listing

Analytics