
Introduction
This program reads multiple text files from the file system and compares their contents to find duplicate records. It is a sample program written in Java. It uses a deliberately simple algorithm so that anyone can understand it and run it on a local system. The steps used to find the duplicate records are:
- Read each text file from the file system and load its records into a List.
- While reading each file, check every record against the records already seen and collect the duplicates.
- Call a method that validates the duplicate records and prepares the result object to display to the user.
- Print the result.
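The duplicate check in the second step can be sketched on in-memory data. This is only a minimal illustration of the idea, not the program itself: the class name `DuplicateSketch`, the method `collectDuplicates`, and the sample numbers are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DuplicateSketch {

    // Adds each number to 'seen'; a number that was already there is recorded as a duplicate.
    static List<Integer> collectDuplicates(Set<Integer> seen, List<Integer> numbers) {
        List<Integer> duplicates = new ArrayList<>();
        for (Integer number : numbers) {
            // Set.add returns false when the element is already present.
            if (!seen.add(number)) {
                duplicates.add(number);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // One shared set across all "files", so repeats between files are detected.
        Set<Integer> seen = new HashSet<>();
        // Made-up contents standing in for two files.
        List<Integer> fileA = Arrays.asList(10, 20, 30);
        List<Integer> fileB = Arrays.asList(20, 40, 10);

        System.out.println(collectDuplicates(seen, fileA)); // []
        System.out.println(collectDuplicates(seen, fileB)); // [20, 10]
    }
}
```

Because the set is shared between calls, a number is flagged the second time it is seen, whether the first occurrence was in the same file or in an earlier one.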
Program Code
package com.kw.sample;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
/**
* This class helps to retrieve the duplicate numbers repeated across multiple
* files, together with the total number of repetitions.
*
* It uses 11 files (A.txt to K.txt) to validate the duplicate numbers and their repetitions.
*
* @author dsahu1
*
*/
public class ReadDataFromMultipleTextFiles {
/**
* This method executes the program. Read errors are caught and reported
* inside readDataFromFile, so no checked exception is thrown here.
*
* @param args command-line arguments (not used)
*/
public static void main(String[] args) {
// Files path
String path = "C:/Files/";
// Names of the files to compare (A.txt to K.txt), without the extension
String[] fileNames = { "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K" };
// List to contain Duplicate records
List<Integer> dublicateData = new ArrayList<Integer>();
// Set to contain unique records
Set<Integer> set = new HashSet<Integer>();
// Records read from each file, in the same order as fileNames
List<List<Integer>> lists = new ArrayList<List<Integer>>();
for (String fileName : fileNames) {
    // Actual File Name
    String fileWithPath = path + fileName + ".txt";
    // Call method to read number from text file and generate List
    List<Integer> list = readDataFromFile(set, dublicateData, fileWithPath);
    lists.add(list);
    System.out.println("list" + fileName + ".size() : " + list.size());
    System.out.println("set.size() : " + set.size());
    System.out.println("dublicateData.size() : " + dublicateData.size());
}
// Construct result map object to contain the result
Map<Integer, Map<String, Integer>> result = new HashMap<Integer, Map<String, Integer>>();
for (int i = 0; i < fileNames.length; i++) {
    // Call method to validate the duplicate records and prepare the
    // result object
    validateDoublicateWithCountAndFile(result, dublicateData, lists.get(i), fileNames[i]);
}
System.out.println("result.size() : " + result.size());
// Call method to print the result object map
printResults(result);
}
/**
* This method helps to print the output result in readable format.
*
* @param result
*/
public static void printResults(Map<Integer, Map<String, Integer>> result) {
Set<Entry<Integer, Map<String, Integer>>> set = result.entrySet();
Iterator<Entry<Integer, Map<String, Integer>>> itr = set.iterator();
while (itr.hasNext()) {
Entry<Integer, Map<String, Integer>> entry = itr.next();
Map<String, Integer> map = entry.getValue();
System.out.println(entry.getKey() + " ::: " + map
+ " - Total Repetitions - " + map.size());
}
}
/**
* This method helps to validate the duplicate records in given file and
* Construct the result object.
*
* @param result
* @param dublicateData
* @param list
* @param fileName
*/
public static void validateDoublicateWithCountAndFile(
Map<Integer, Map<String, Integer>> result,
List<Integer> dublicateData, List<Integer> list, String fileName) {
for (Integer number : list) {
if (dublicateData.contains(number)) {
if (result.containsKey(number)) {
Map<String, Integer> map = result.get(number);
map.put(fileName, 1);
result.put(number, map);
} else {
Map<String, Integer> map = new HashMap<String, Integer>();
map.put(fileName, 1);
result.put(number, map);
}
}
}
}
/**
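* This method reads the records from the text file and constructs the
* List object. A record that is already present in the set is added to the
* duplicate list; otherwise it is added to the set.
*
* @param set records seen so far across all files
* @param dublicateData list that collects the duplicate records
* @param fileWithPath full path of the file to read
* @return list of all records read from the file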
*/
public static List<Integer> readDataFromFile(Set<Integer> set,
List<Integer> dublicateData, String fileWithPath) {
File file = new File(fileWithPath);
String line = null;
List<Integer> list = new ArrayList<Integer>();
System.out.println("Batch ID are Picked From --- " + fileWithPath);
// FileReader reads text files in the default encoding.
// try-with-resources (Java 7+) closes the reader even if reading fails.
try (BufferedReader bufferedReader = new BufferedReader(new FileReader(file))) {
    // Read the file line by line
    while ((line = bufferedReader.readLine()) != null) {
        // Integer.valueOf replaces the deprecated Integer constructor
        Integer data = Integer.valueOf(line.trim());
        list.add(data);
        if (set.contains(data)) {
            dublicateData.add(data);
        } else {
            set.add(data);
        }
    }
} catch (IOException | NumberFormatException e) {
    // A missing file, a read error, or a non-numeric line ends up here
    System.err.println("**** Exception Occurred ***** "
            + e.getLocalizedMessage());
}
return list;
}
}
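The program above stores the value 1 for every file in which a duplicate number appears, so the inner map records presence rather than a count. If the number of occurrences per file is wanted, one possible variant is sketched below. It is an assumption-laden illustration, not part of the original program: it requires Java 8 or later (for `Map.merge` and `computeIfAbsent`), and the class name `OccurrenceCountSketch`, the method `countPerFile`, and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class OccurrenceCountSketch {

    // For each number that occurs more than once overall, returns how often it occurs in each file.
    static Map<Integer, Map<String, Integer>> countPerFile(Map<String, List<Integer>> files) {
        // TreeMap keeps the numbers sorted, so the printed order is predictable.
        Map<Integer, Map<String, Integer>> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> file : files.entrySet()) {
            for (Integer number : file.getValue()) {
                // Increment the counter for this number in this file.
                counts.computeIfAbsent(number, k -> new TreeMap<>())
                      .merge(file.getKey(), 1, Integer::sum);
            }
        }
        // Drop numbers whose total number of occurrences is less than two.
        counts.values().removeIf(perFile ->
                perFile.values().stream().mapToInt(Integer::intValue).sum() < 2);
        return counts;
    }

    public static void main(String[] args) {
        // Made-up contents standing in for three files.
        Map<String, List<Integer>> files = new LinkedHashMap<>();
        files.put("A", Arrays.asList(63, 215, 63));
        files.put("B", Arrays.asList(47, 864));
        files.put("C", Arrays.asList(47, 215, 999));

        System.out.println(countPerFile(files));
        // {47={B=1, C=1}, 63={A=2}, 215={A=1, C=1}}
    }
}
```

In this sketch a number repeated inside a single file (63 in the sample data) shows up with a count of 2 for that file, which the presence-only map cannot express.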
Output

We tested this program using three text files: A.txt, B.txt and C.txt.
Run 1: No duplicate records in the files.
Batch ID are Picked From --- C:/Files/A.txt
listA.size() : 32
set.size() : 32
dublicateData.size() : 0
Batch ID are Picked From --- C:/Files/B.txt
listB.size() : 120
set.size() : 152
dublicateData.size() : 0
Batch ID are Picked From --- C:/Files/C.txt
listC.size() : 353
set.size() : 505
dublicateData.size() : 0
result.size() : 0
--------------------------------------------------------------------
Run 2: Some duplicate records in the files.
Batch ID are Picked From --- C:/Files/A.txt
listA.size() : 36
set.size() : 36
dublicateData.size() : 0
Batch ID are Picked From --- C:/Files/B.txt
listB.size() : 125
set.size() : 161
dublicateData.size() : 0
Batch ID are Picked From --- C:/Files/C.txt
listC.size() : 353
set.size() : 505
dublicateData.size() : 9
result.size() : 9
864 ::: {B=1, C=1} - Total Repetitions - 2
774 ::: {B=1, C=1} - Total Repetitions - 2
615 ::: {B=1, C=1} - Total Repetitions - 2
215 ::: {A=1, C=1} - Total Repetitions - 2
63 ::: {A=1, C=1} - Total Repetitions - 2
47 ::: {B=1, C=1} - Total Repetitions - 2
28 ::: {B=1, C=1} - Total Repetitions - 2
607 ::: {A=1, C=1} - Total Repetitions - 2
362 ::: {A=1, C=1} - Total Repetitions - 2
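The counters printed in both runs can be cross-checked. Every record read goes either into the set or into the duplicate list, so the list sizes should add up to the final set size plus the final duplicate count. The small sketch below performs that check on the numbers shown above; the class name `RunCounterCheck` and the method `isConsistent` are hypothetical helpers, not part of the program.

```java
class RunCounterCheck {

    // True when the records read equal the unique records plus the duplicate records.
    static boolean isConsistent(int[] listSizes, int setSize, int duplicateSize) {
        int total = 0;
        for (int size : listSizes) {
            total += size;
        }
        return total == setSize + duplicateSize;
    }

    public static void main(String[] args) {
        // Run 1: 32 + 120 + 353 = 505 records, 505 unique, 0 duplicates
        System.out.println(isConsistent(new int[] { 32, 120, 353 }, 505, 0)); // true
        // Run 2: 36 + 125 + 353 = 514 records, 505 unique, 9 duplicates
        System.out.println(isConsistent(new int[] { 36, 125, 353 }, 505, 9)); // true
    }
}
```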