HSBC Bank Statements: PDF to Excel file converter

Project information

A simple tool that converts PDF bank statements to CSV.

The program extracts transaction data from a PDF statement and processes it. Here is the pseudocode of the algorithm:

            
split pages into a list of pages;
for each page{
   get all text in the page;
   for each line in the page{
      if the line matches the start-regex key{
         replace all commas with space
         get carried forward balance amount
      }
   }
   if the page does not have any transaction data {
      break loop;
   }
   remove all the lines after end-regex key
   remove all the lines before start-regex key

   for each line{
      format date to DD-MMM-YY
      combine broken lines that overflow to next line

      if date is missing{
         copy preceding date
      }

      add comma after each variable

      if balance is missing{
         calculate balance and assign to line
      }
   }

   upload all the transactions in the page to the database
}
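
The date and balance steps in the inner loop can be sketched in Python as follows. This is only an illustration, not the project's actual code: it assumes each line has already been split into a dict with the hypothetical keys `date`, `description`, `paid_out`, `paid_in` and `balance`, that raw dates look like `05 Jan 21`, and that the default English month names are in effect for `%b`.

```python
from datetime import datetime
from decimal import Decimal


def format_date(raw):
    """Convert a date such as '05 Jan 21' (assumed input format) to DD-MMM-YY."""
    return datetime.strptime(raw, "%d %b %y").strftime("%d-%b-%y")


def to_amount(text):
    """Parse an amount, treating an empty field as zero and dropping thousands commas."""
    return Decimal(text.replace(",", "")) if text else Decimal("0")


def fill_gaps(rows, carried_forward):
    """Copy the preceding date and compute the balance wherever either is missing."""
    balance = to_amount(carried_forward)
    last_date = None
    filled = []
    for row in rows:
        # If the date is missing, copy the preceding date.
        date = row["date"] or last_date
        last_date = date
        # Running balance: previous balance minus money out plus money in.
        balance = balance - to_amount(row["paid_out"]) + to_amount(row["paid_in"])
        # If the statement printed a balance on this line, prefer it.
        if row["balance"]:
            balance = to_amount(row["balance"])
        filled.append({**row, "date": date, "balance": str(balance)})
    return filled
```

Amounts are kept as `Decimal` rather than `float` so that repeated additions and subtractions do not accumulate rounding errors.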
            
          

See more details about the algorithm on Spending Tracker Application

This program uses File Writer to write all transactions to a CSV (comma-separated values) file. This is useful if the user simply wants to track expenses by aggregating all transactions in Excel, which provides features such as searching and filtering.
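
As a rough illustration of the CSV output step, the sketch below writes transaction rows with Python's standard `csv` module. It is not the project's own writer: the column names in `FIELDS` and the function name `write_transactions` are assumptions made for this example.

```python
import csv

# Assumed column names; the real statement layout may differ.
FIELDS = ["date", "description", "paid_out", "paid_in", "balance"]


def write_transactions(rows, path):
    """Write a list of transaction dicts to a CSV file that Excel can open."""
    # newline="" lets the csv module control the line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

The `csv` module quotes any field that contains a comma, so a description such as `SHOP, LONDON` stays in a single column when the file is opened in Excel.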