Hello Everyone,
Recently I came across a problem in my org to have a system which processes (reads) an excel file super fast and keeps the memory consumption minimal at the same time.
My first and obvious choice was Apache POI library. Which helped me building the system, and it was serving the purpose TILL I was bit by a block. System was not able to handle huge excel files and was breaking down if the file size goes above 20MB.
After spending much time on internet I came across http://poi.apache.org/components/spreadsheet/how-to.html#xssf_sax_api which is based on the idea of processing the underlying xml of excel file and using sax parser for super fast processing.
Obviously given implementation was very limited and was not enough to serve my purpose. Hence I had to enhance it to a point it became very understandable for anyone and useful at the same time.
This implementation is super fast and memory efficient to read huge excel file.
Find full implementation go to my git hub page here.
Just to give you how memory efficient this is I did a test with 23MB excel file-
Picture 1 represents the memory use of a code that uses XSSFWorkbook
Picture 2 represents the memory use of a code that uses SAXParser
Clearly you can see in picture1 it is consuming memory more that 1 GB where as in Picture2 it is consuming around 250MB.
An example code would look something like this.
Hope this sort note helps everyone looking for such solution for reading huge excel files.
The class ExcelFileReader has lots of useful methods make use of them try them out and let me know it helps.
Thanks!
Recently I came across a problem in my org to have a system which processes (reads) an excel file super fast and keeps the memory consumption minimal at the same time.
My first and obvious choice was Apache POI library. Which helped me building the system, and it was serving the purpose TILL I was bit by a block. System was not able to handle huge excel files and was breaking down if the file size goes above 20MB.
After spending much time on internet I came across http://poi.apache.org/components/spreadsheet/how-to.html#xssf_sax_api which is based on the idea of processing the underlying xml of excel file and using sax parser for super fast processing.
Obviously given implementation was very limited and was not enough to serve my purpose. Hence I had to enhance it to a point it became very understandable for anyone and useful at the same time.
This implementation is super fast and memory efficient to read huge excel file.
Find full implementation go to my git hub page here.
Just to give you how memory efficient this is I did a test with 23MB excel file-
Picture 1 represents the memory use of a code that uses XSSFWorkbook
Picture 2 represents the memory use of a code that uses SAXParser
An example code would look something like this.
ExcelFileReader fileReader = new ExcelFileReader();
String path = "";
String sheetName = "";
File file = new File(path);
fileReader.initialize(file);
List
String path = "";
String sheetName = "";
File file = new File(path);
fileReader.initialize(file);
List
Hope this sort note helps everyone looking for such solution for reading huge excel files.
The class ExcelFileReader has lots of useful methods make use of them try them out and let me know it helps.
Thanks!