Fast Displaying Large CSV File, Filter And Save Filtered Data Using SQLite & Java

SQLite CSV file often used for download or upload data from/to database. If table’s records count is large or very large then downloaded data to CSV file will have large in size.
Opening that large CSV file using spreadsheet application requires a lot of memory and CPU resources. Depending on user computer specifications, some consume much time to open.

It is much faster using Notepad++ to open large CSV file but if we want to filter data and save it to another CSV then using Notepad++ will not be efficient especially for some complex filtering criteria.

==> The workaround of this situation is to import that large CSV to SQLite. After importing to SQLite, we can filter it using SQL Query and save it back to CSV file.

According to documentation, SQLite is an embedded SQL database engine. SQLite reads and writes directly to ordinary disk files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file.
Think of SQLite not as a replacement for Oracle but as a replacement for fopen(). Its performance is usually quite good even in low-memory environments.

This blog post gives example on basic connection to SQLite, Sql Query, importing CSV, save query to CSV. Also example on using SQLite in Java so you can build your own application.

Basic SQLite Usage

  1. Download SQLite binary executable file from https://www.sqlite.org/download.html
  2. Extract Zip file to a folder. It is recommended to extract on easy founded folder
  3. Open command prompt and go to SQLite folder. In order to connect & create a database ‘db1’, type:
    sqlite3 db1
    This syntax will connect you to database db1 if it is already exist or create db1 if it does not exist. The db1 database is just a file with same name and reside in SQLite directory.
    Remember you must on SQLite directory to run sqlite3 command. If you dont want to be always in SQLite directory then you need to register the SQLite directory on ‘PATH’ of Environment Variable.
    If you already have SQLite Path on Environment Variable then you should specify the database path to connect i.e sqlite3 C:/SQLite/db1. Otherwise a new db1 database will be created in same directory of wherever you do sqlite3 command.
  4. After database ‘db1’ created, please create a table ‘testtable’ for beginner demonstration i.e

    create table testtable (one varchar(50), two int);
    


    Above code will create ‘testtable’ with only two columns, one and two.

  5. Insert records to ‘testtable’ example

    insert into testtable (one, two) values ('hello', 1);
    insert into testtable (one, two) values ('world!', 2);
    

  6. Showing table records & apply filter example

    select * from testtable;
    select * from testtable where two=2;
    


    Show records without column header

  7. Updating & deleting example

    update testtable set two=two+1;
    delete from testtable where two=2;
    

  8. Using dot-command example

    .header on
    select * from testtable;
    


    Show records along with column header

  9. Importing CSV to SQLite example

    .mode csv
    .import C:/work/largeCSV.csv csvtbl1;
    


    Import largeCSV CSV file to csvtbl1 SQLite table.

  10. Saving sql query result to CSV

    .mode csv
    .header on
    .once C:/work/anotherCSV.csv
    select * from testtable;
    

More documentation on SQLite 3 please refer to https://www.sqlite.org/docs.html

Running SQLite command Inside Text File Example

SQLite can run commands inside Text File. We write SQLite commands in text file and then pipe the file to database.
Create a txtImport.txt file with contains:

.mode csv
.import C:/work/largeCSV.csv csvtbl1;


Pipe this file to db1 from windows command prompt inside SQLite directory

sqlite3 db1 < C:/work/txtImport.txt


Above commands will import largeCSV.csv file to table csvtbl1

You can try it by your self using a real large CSV file to import. It is much faster then opening using spreadsheet application.
After imported then we can filter data using SQL query. Off course by using query is very flexible.

In order to save Query result to Csv, please Create a txtExport.txt file with contains:

.mode csv
.header on
.once C:/work/filteredCSV.csv
select * from csvtbl1 where col1 like '%xxx%' and col2=10;


Pipe this file to db1 from windows command prompt inside SQLite directory

sqlite3 db1 < C:/work/txtExport.txt


Above commands will save Query result to filteredCSV.csv file

Implementing SQLite Using Java

These Java examples can be used in order to make application that can display content of large CSV file. As previously explained that SQLite is better than fopen() and fputs(). Using SQLite to process large CSV file is faster and more efficient.
However your application need to have a threshold of how many rows in CSV that catagorized large. For example if the CSV rows only below 1000 then it is not large so you can use normal File operation.

Here is Java example for working with SQLite

static void RetrieveData() {    
	try {
	Class.forName("org.sqlite.JDBC");
	Connection connection = null;
	try {
		connection = DriverManager.getConnection("jdbc:sqlite:D:/utility/sqlite-shell-win32/db1");
		Statement statement = null;
		statement = connection.createStatement();
		statement.setQueryTimeout(30); 
		ResultSet rs = statement.executeQuery("select * from csvtbl1 limit 0,2");
		while(rs.next())
		{
		  // read the result set
		  System.out.println("one = " + rs.getString("one"));
		  System.out.println("two = " + rs.getInt("two"));
		}
		rs.close();
		connection.close();
	} catch (SQLException ex) {
		Logger.getLogger(Sqlite2.class.getName()).log(Level.SEVERE, null, ex);
	} 
	} catch (ClassNotFoundException ex) {
            Logger.getLogger(Sqlite1.class.getName()).log(Level.SEVERE, null, ex);
    }
	
}


There is a connection string dbc:sqlite:D:/utility/sqlite-shell-win32/db1 and you should change its path to your own SQLite directory, otherwise new database ‘db1’ will be created in java compilation directory.
Inserting, Update and Delete operations are trivial. You can use something like statement.executeUpdate(UpdateQuery);.

Importing and Exporting From/To CSV
Please refer to Import Csv file point. This Java example using those SQLite dot-commands. Firstly, creating txtImport.txt programmatically and run sqlite3 automatically using Process.

static void createTxtFile(String txtPath, ArrayList contents) {
	File f = new File(txtPath);
	
	try {
		FileWriter fw = new FileWriter(f, true);
		PrintWriter pw = new PrintWriter(fw);
		for (Object line : contents) {
			pw.println(line);
		}
		pw.close();
		fw.close();
	} catch (IOException ex) {
		Logger.getLogger(Sqlite2.class.getName()).log(Level.SEVERE, null, ex);
	}
}

static void ImportCsv() {
	ArrayList al = new ArrayList();
	al.add(".mode csv");
	al.add(".import C:/largeCSV.csv csvtbl1");
	createTxtFile("D:/txtImport.txt",al);       
	String command = "cmd.exe /c sqlite3.exe D:/utility/sqlite-shell-win32/db1 < D:/txtImport.txt";
	//System.out.println(command);
	try {
		 Process p = Runtime.getRuntime().exec(command);
	} catch (IOException ex) {
		Logger.getLogger(Sqlite2.class.getName()).log(Level.SEVERE, null, ex);
	}
}	


Please look at part cmd.exe /c sqlite3.exe D:/utility/sqlite-shell-win32/db1 < D:/txtImport.txt.
This part pipes SQLite commands inside the text file to database ‘db1’. Also I already register sqlite3 path in Environment Variable so I can type in sqlite3 in anywhere from command prompt but the database path itself must be specified.

Below is code for save filtered query to CSV using Java. The filter is just to retrieve 1000 rows.

static void ExportToCsv() {
	ArrayList al = new ArrayList();
	al.add(".header on");
	al.add(".mode csv");
	al.add(".once C:/filteredcsv.csv");        
	al.add("select * from csvtbl1 limit 0,1000;");
	createTxtFile("D:/txtExport.txt",al);
	String command = "cmd.exe /c sqlite3.exe D:/utility/sqlite-shell-win32/db1 < D:/txtExport.txt";
	//System.out.println(command);
	try {
		 Process p = Runtime.getRuntime().exec(command);
	} catch (IOException ex) {
		Logger.getLogger(Sqlite2.class.getName()).log(Level.SEVERE, null, ex);
	}
}


Remember to put semicolon mark after every SQL Query syntaxt.

This large CSV operation with help of SQLite is fast & efficient.

Cheers,
Agung Gugiaji

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s