How to Read CSV File in Java?

Overview

A Comma-Separated Values (CSV) file is a plain-text file that stores data in tabular form. OpenCSV is a library that lets us read CSV files in Java easily with the help of predefined classes and methods. To use it, first include the OpenCSV dependency in the project so that the necessary classes can be imported, then read the CSV file and handle any exceptions that may be thrown.

What is OpenCSV?

In a CSV file, the column values in each row are separated by commas (the comma is the delimiter), and rows are separated by line breaks in a plain text file.

OpenCSV is a user-friendly, third-party library for reading and writing CSV files in Java 8 and above. It is useful for accessing and performing operations on complex CSV files. The Java standard library does not include a dedicated CSV parser, so OpenCSV is a convenient alternative to parsing lines by hand. It is mainly suitable for standard tabular data. It also provides a CSVParser class, built through CSVParserBuilder, that allows configuration of settings such as the separator character.

The OpenCSV classes may throw IOException or other exceptions which must be handled. We will be performing all the operations inside try-catch blocks.

Reading Large Files

When reading large files the memory usage must be considered and data must be processed row by row instead of loading the entire file into the memory. Here are some of the ways we can process large files in Java:

  • Java 8 Streams - From Java 8 onwards, we can use the Stream API to process CSV data efficiently by combining BufferedReader with stream operations.
  • Define Chunk Size - Rather than reading one line at a time or the whole file at once, we can define a chunk size and load and process that many rows together.
  • We can also use external databases, streaming libraries, etc. to reduce the memory overhead for large files.
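
The first two approaches can be sketched with the standard library alone. This is a minimal illustration, not OpenCSV code: the class and method names are hypothetical, the first line is assumed to be a header, and rows are split with a naive `split(",")` that does not understand quoted fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper class, for illustration only.
final class LargeCsvSketch {

    // Streams the input lazily and counts the data rows.
    // Only the current line is held in memory, not the whole file.
    static long countDataRows(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.lines()
                    .skip(1)                                 // assumed header row
                    .filter(line -> !line.trim().isEmpty())  // ignore empty lines
                    .count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads rows in batches of chunkSize and passes each batch to the handler.
    static void processInChunks(Reader source, int chunkSize,
                                Consumer<List<String[]>> handler) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        try (BufferedReader reader = new BufferedReader(source)) {
            List<String[]> chunk = new ArrayList<>(chunkSize);
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.add(line.split(",", -1)); // naive split: no quoted-field support
                if (chunk.size() == chunkSize) {
                    handler.accept(chunk);
                    chunk = new ArrayList<>(chunkSize);
                }
            }
            if (!chunk.isEmpty()) {
                handler.accept(chunk); // flush the final, possibly smaller, chunk
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a real file, a call such as `LargeCsvSketch.processInChunks(new FileReader(path), 1000, chunk -> save(chunk))` would work, where `path` and `save` are placeholders for your own file and processing step. The chunk size trades memory for fewer handler calls.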

Data Validation

Values read from a CSV file are always Strings. If a column contains integers, dates, or other types of data, these must be parsed manually, which introduces scope for errors. Data validation is therefore a critical step when reading CSV files. Some of the ways we can validate data include:

  • checking that each value matches the format expected for its column type,
  • ensuring all the mandatory fields have data,
  • checking that values do not exceed the allowed range,
  • detecting duplicate data, etc.
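
The checks listed above could look roughly like the following sketch. It assumes rows with three columns (Name, Age, Email), an age range of 0 to 150, and a deliberately loose email pattern. The class name and all of these rules are illustrative assumptions, not part of OpenCSV.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical validator for rows shaped like {Name, Age, Email}.
final class RowValidatorSketch {
    // Emails seen so far, used for duplicate detection.
    private final Set<String> seenEmails = new HashSet<>();

    // Returns the problems found in one row; an empty list means the row passed.
    List<String> validate(String[] row) {
        List<String> errors = new ArrayList<>();
        if (row.length != 3) {
            errors.add("expected 3 columns but found " + row.length);
            return errors;
        }
        String name = row[0].trim();
        String age = row[1].trim();
        String email = row[2].trim();

        // Mandatory fields must not be empty.
        if (name.isEmpty()) errors.add("name is missing");
        if (email.isEmpty()) errors.add("email is missing");

        // Format and range check: age must be an integer in an assumed range.
        try {
            int value = Integer.parseInt(age);
            if (value < 0 || value > 150) errors.add("age out of range: " + value);
        } catch (NumberFormatException e) {
            errors.add("age is not an integer: " + age);
        }

        // Loose format check (assumption: something@something, no spaces).
        if (!email.isEmpty() && !email.matches("[^@\\s]+@[^@\\s]+")) {
            errors.add("email looks malformed: " + email);
        }

        // Duplicate detection keyed on the lower-cased email column.
        if (!email.isEmpty() && !seenEmails.add(email.toLowerCase())) {
            errors.add("duplicate email: " + email);
        }
        return errors;
    }
}
```

Collecting all problems per row, rather than stopping at the first one, makes it easier to report every issue in a file in a single pass.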

How to Use It

Let us see how to read CSV files in Java.

STEP 1: Including an OpenCSV parser library

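  • For a Maven project, add the OpenCSV dependency (groupId com.opencsv, artifactId opencsv) inside the <dependencies> element of the pom.xml file.

  • For a Gradle project, add the OpenCSV dependency (coordinates com.opencsv:opencsv, followed by the version you need) to the dependencies block of the build file.

  • Alternatively, the OpenCSV jar file can be downloaded and added to the classpath directly, which works with almost any project setup.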

STEP 2: Classes of OpenCSV to view and modify CSV files:

  1. CSVReader - This class provides methods to read a CSV file either row by row, with each row returned as an array of Strings, or all at once as a list of String arrays.
  2. CSVWriter - This class is used to write to a CSV file, taking each row as an array of Strings.
  3. CsvToBean - It helps us convert CSV file data to Java Beans.
  4. BeanToCsv - As the name suggests, it converts Java Beans to a CSV file. In recent OpenCSV versions this role is filled by StatefulBeanToCsv.

Reading a CSV File

You can add the OpenCSV library to a Java project using any of the methods mentioned above. In this tutorial, we will use a Maven project to read a CSV file.

Let our CSV file UserData.csv be as follows:

It has three columns Name, Age, and Email, and four rows.

Create a demo Maven project and add the dependency. To read a CSV file in Java line by line, create or copy a CSV file into the src/main/resources folder.

Reading Data Line by Line

Let us see how to read a CSV file in Java using the OpenCSV library in our Maven project. We will create a class JavaDemo and a main method inside it.
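
A minimal sketch of such a class is shown here. It assumes OpenCSV 5.x, where readNext() declares CsvValidationException in addition to IOException, and it assumes the file sits at src/main/resources/UserData.csv relative to the working directory. Adjust the path to your own setup.

```java
import com.opencsv.CSVReader;
import com.opencsv.exceptions.CsvValidationException;

import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;

public class JavaDemo {
    public static void main(String[] args) {
        // Assumed location of the CSV file; change as needed.
        String path = "src/main/resources/UserData.csv";

        // try-with-resources closes the reader even if an exception is thrown.
        try (CSVReader reader = new CSVReader(new FileReader(path))) {
            String[] row;
            // readNext() returns the next row as a String[] or null at end of file.
            while ((row = reader.readNext()) != null) {
                System.out.println(Arrays.toString(row));
            }
        } catch (IOException | CsvValidationException e) {
            System.err.println("Could not read " + path + ": " + e.getMessage());
        }
    }
}
```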

Output:

Explanation:

We create a CSVReader object, which wraps a FileReader. FileReader is a Java class for reading character data from a file. Because these classes can throw IOException, the code is written inside a try-catch block. We call readNext() on the CSVReader to fetch the next line of the file. The method returns the row as an array of Strings, or null once all the data has been read, so the loop continues until it returns null.

Note: We are using the try-with-resources statement to automatically close all the resources used once the operations are performed and the resources are no longer needed. This ensures proper exception handling as well as resource management.

Reading All Data at Once

Instead of reading data line by line, it is possible to read all data at once using the method readAll().
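
A minimal sketch of this approach follows, again assuming OpenCSV 5.x (where readAll() declares IOException and CsvException) and the same assumed file path. The class name is hypothetical.

```java
import com.opencsv.CSVReader;
import com.opencsv.exceptions.CsvException;

import java.io.FileReader;
import java.io.IOException;
import java.util.List;

public class JavaDemoReadAll {
    public static void main(String[] args) {
        // Assumed location of the CSV file; change as needed.
        String path = "src/main/resources/UserData.csv";

        try (CSVReader reader = new CSVReader(new FileReader(path))) {
            // Loads every row of the file into memory in one call.
            List<String[]> rows = reader.readAll();

            for (String[] row : rows) {      // one String[] per row
                for (String cell : row) {    // one element per column
                    System.out.print(cell + "\t");
                }
                System.out.println();
            }
        } catch (IOException | CsvException e) {
            System.err.println("Could not read " + path + ": " + e.getMessage());
        }
    }
}
```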

Output:

Explanation:

Here, instead of looping with readNext(), we read and store all the data at once in a single list of String arrays. The list is then printed using nested for-each loops. Because readAll() loads the whole file into memory, it is best suited to small files.

How to Read CSV File in Java with Different Separator

Let us see how to read a CSV file in Java that uses a different separator, such as a semicolon (;), hash (#), or dollar sign ($).

Let us first create a new CSV file with # as a delimiter.

To read the above file we will have to tell the CSVReader class that we are using # as a separator and to do this we will use the CSVParser class's object and specify the separator character to be # using CSVParserBuilder().withSeparator('#').build().
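
The configuration described above can be sketched as follows, assuming OpenCSV 5.x. The file name UserDataHash.csv and the class name are hypothetical placeholders.

```java
import com.opencsv.CSVParser;
import com.opencsv.CSVParserBuilder;
import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;
import com.opencsv.exceptions.CsvException;

import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class JavaDemoSeparator {
    public static void main(String[] args) {
        // Hypothetical file that uses '#' between columns.
        String path = "src/main/resources/UserDataHash.csv";

        // Parser configured to split columns on '#' instead of ','.
        CSVParser parser = new CSVParserBuilder().withSeparator('#').build();

        // The reader builder attaches the custom parser to the reader.
        try (CSVReader reader = new CSVReaderBuilder(new FileReader(path))
                .withCSVParser(parser)
                .build()) {
            List<String[]> rows = reader.readAll();
            for (String[] row : rows) {
                System.out.println(Arrays.toString(row));
            }
        } catch (IOException | CsvException e) {
            System.err.println("Could not read " + path + ": " + e.getMessage());
        }
    }
}
```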

Output:

Explanation:

Let us decode the code in parts and understand it:

  • Imports:

    We are importing CSVReader, CSVParser, CSVReaderBuilder, and CSVParserBuilder.

    • CSVParser and CSVParserBuilder - These two classes allow us to configure the operations of the CSVReader class with the help of built-in classes. In our case it lets us change the delimiter to #.
    • CSVReader and CSVReaderBuilder - The CSVReaderBuilder class is used to set up the CSVReader in a way that it uses the custom parser.
  • We are using try-with-resources to handle resources and exceptions effectively.

  • First, we customize the parser to set the separator to #.

  • Next, we read the file and store its rows in a list of String arrays.

Examples

So far we have only read CSV files using OpenCSV, but the library supports various other operations. We will see an example of writing a CSV file in Java using the OpenCSV parser library.

Suppose the file initially looks as follows:

To add more rows to it, we will use the FileWriter and CSVWriter classes and the writeNext() method to write data to the file.
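
A minimal sketch of appending rows is given here. The file path, class name, and row values are placeholders for illustration. It also assumes the existing file ends with a line break, so the new rows start on their own line.

```java
import com.opencsv.CSVWriter;

import java.io.FileWriter;
import java.io.IOException;

public class JavaDemoWrite {
    public static void main(String[] args) {
        // Assumed location of the CSV file; change as needed.
        String path = "src/main/resources/UserData.csv";

        // The second FileWriter argument (true) opens the file in append mode;
        // without it the existing contents would be overwritten.
        try (CSVWriter writer = new CSVWriter(new FileWriter(path, true))) {
            // Each String[] is one row; each element is one column value.
            // These values are placeholders, not data from the tutorial.
            writer.writeNext(new String[] {"name1", "25", "name1@example.com"});
            writer.writeNext(new String[] {"name2", "31", "name2@example.com"});
        } catch (IOException e) {
            System.err.println("Could not write " + path + ": " + e.getMessage());
        }
    }
}
```

By default CSVWriter wraps every value in double quotes, so the appended rows may look slightly different from hand-written ones while still being valid CSV.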

Explanation: FileWriter is a Java class for writing character data to files. OpenCSV provides a class called CSVWriter, whose constructor takes a Writer such as a FileWriter as an argument. We use the writeNext() method to write rows to the file; to append to an existing file rather than overwrite it, the FileWriter must be opened in append mode. Each row is a String array in which each element holds the value of one column.

FAQs

Q. What is OpenCSV, and why should we use it for reading CSV files in Java?

A. OpenCSV is a Java library that provides a simple and efficient way to read and write CSV (Comma-Separated Values) files. It offers a user-friendly API for parsing CSV data, making it easier to work with tabular data in Java applications.

Q. How can I handle header rows when reading a CSV file with OpenCSV?

A. OpenCSV can handle header rows in two ways. To ignore the header, build the reader with CSVReaderBuilder and call withSkipLines(1) so that the first row is skipped. To access values by header name, use the CSVReaderHeaderAware class, whose readMap() method returns each row as a map keyed by the column names from the first row.

Q. How can I handle errors or exceptions when reading CSV files with OpenCSV?

A. OpenCSV provides error-handling mechanisms through exceptions like IOException and CsvValidationException. You can catch and handle these exceptions in your code to manage errors gracefully, such as handling file not found or invalid CSV format errors.

Conclusion

  • OpenCSV is a parser library that lets us read CSV files in Java and perform operations on them. CSV is one of the most commonly used file formats for tabular data.
  • OpenCSV provides classes to process CSV files. It is suitable for standard tabular data only.
  • Data must be processed row by row instead of loading entire files in the memory when processing large datasets to manage memory usage.
  • There are various ways to include the OpenCSV parser library: we can either add it as a dependency or download the jar file from the official website.
  • The delimiter does not have to be a comma (,); it can also be a #, $, or ;. OpenCSV lets us configure the delimiter using the CSVParser and CSVParserBuilder classes.
  • While reading data from a CSV file we can read the data one row at a time or all the data together. This data is stored in a list of String arrays. Inside the array, each element represents a column.
  • We can also write data into a CSV file using the OpenCSV parser library.