# Thread: convert .txt file in .csv format

1. Member
Join Date
Mar 2010
Posts
16
Rep Power
0

## convert .txt file in .csv format

Dear all,

I hope you are doing well.
I want to convert a .txt file to .csv format.
The file contents can have different names and values.
For example:

start:
id:XXXX
name:abc

start:
age:29
height:5'9

start:
accountno:xxx
emailid:xxxxx

end:

I want to export these contents to a .csv file. I know that I need to use a delimiter, but I am not getting the output I want.
This is what I think I should do:

1. Look for start:.
2. After the first start:, look for name:value pairs, using : as the delimiter.
3. Store each name:value pair in a local variable such as a String.
4. Repeat these steps until end: is reached.
5. Write the name:value pairs to a .csv file.

I want to see the pairs as messages on the console. I also want the stored name:value pairs exported to a .csv file, and I want to be able to display the contents of that .csv file.

This is what I have written so far.
Java Code:
import java.util.regex.*;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

{
public static void main(String[] args)
{
File file = new File("C:\\source1.txt");
StringBuilder sb = new StringBuilder();
String line=null;

try
{
String text = null;

// repeat until all lines is read
{
if(!line.trim().equals(""))
{
if(line.startsWith("begin:"))
{
String[] data = sb.toString().trim().split("\n");

for(String chunk : data)
{
String[] parts = chunk.split(":");
System.out.println(parts[0] + " : " + parts[1]);
}

sb.setLength(0);
}
else
{
sb.append(line).append("\n");
}
}
/*String splitarray[] = line.split(":");
String firstentry = splitarray[0];
String secondentry = splitarray[2];
System.out.println(firstentry + " " + secondentry);*/

}
} catch (FileNotFoundException e)
{
e.printStackTrace();
} catch (IOException e)
{
e.printStackTrace();
} finally
{
System.out.println("Out");
}
}

// show file contents here
//System.out.println(str.toString());    }
}
It would be a great help if anyone has an idea for further development. Thanks in advance.

Best regards.

2. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
5
I'm not sure I understand what you want here, so let's back up for a second.

Typically, a .csv file has a heading line (not always) and then many lines of records, all in the same format -- something like this:

Java Code:
"id","name","address"
1,"Barack H. Obama","1600 Pennsylvania Ave."
2,"Gordon Brown","10 Downing St."
3,"Herman Munster","1313 Mockingbird Ln."
Your example specifies different kinds of records. Do you know how many different kinds there will be? Do you know their record format ahead of time, or will you need to interpret everything from the .txt file? Do you want a separate .csv file for each record type (that would make the most sense to me) or should all the different record types be mixed together in one .csv file (doesn't make much sense)?

Don't try to do everything in main(). Your code is already about six indents deep, and you've barely gotten started. Break things up into methods. Think about your data structures before you start coding. If you know exactly what the different record types will be, you might consider writing a class for each type.

Tell me whether what I've said so far makes sense, and then we can work from there.

-Gary-

3. Member
Join Date
Mar 2010
Posts
16
Rep Power
0
@gcalvin

Thank you very much for your reply. Yes, you are right, I should not write everything in main(). Now, about the .txt and .csv files:

1. The text file contains several key:value pairs after each begin:.
2. It contains several begin: markers, each followed by key:value pairs.
For example:
begin:
name:xxx
age:22
sex:xx

begin:
id:xx
accountNo:xxx

begin:
id:xx
age:xx
name:xx

end:
3. The keys are labels like name, id, and address, which eventually go in the header of the .csv file.
4. I want to take these keys (labels) out, place them in a string, and then export them to the .csv file.
5. I don't want the labels duplicated, for example if a key called name appears several times after different begin: markers.
6. Therefore, I believe I should use a Set to avoid duplication.
7. First I have to check for begin: (that is, begin plus the delimiter ":"; at first I tried to use "begin:" itself as a delimiter) and then start reading the file.
8. Check for the key:value pairs delimited by ":".
9. Store the keys in the header, and then store their respective values in the .csv file.
10. Repeat this until end:.

This is what I have been trying to do. Let's see how far I get with it. Below is my modified code, which is not working.

Java Code:
import java.util.*;
import java.util.regex.*;
import java.io.*;
import com.sun.java.util.*;
import org.apache.commons.lang.StringUtils;

{
public static void main(String[] args)
{
File file = new File("C:\\source1.txt");
StringBuilder sb = new StringBuilder();
String line=null;

try
{

// repeat until all lines are read
Set<String> s=new HashSet<String>();
int size=s.size();

{
if(!line.trim().equals(""))
{
if(line.startsWith("BEGIN:"))
{
String[] data = sb.toString().trim().split("\n");

for(String chunk : data)
{
String[] parts = (String[])chunk.split(":");
System.out.println(parts[0] + " : " + parts[1]);
Iterator<String> it = s.iterator();
while (it.hasNext())
{
//Object element=it.next();
size=s.size();
}

}

sb.setLength(0);

}
else
{
sb.append(line).append("\n");
}

}

}
System.out.println(s);
}
catch (FileNotFoundException e)
{
e.printStackTrace();
} catch (IOException e1)
{
e1.printStackTrace();
} finally
{
System.out.println("out");
}
// show file contents here
System.out.println(sb.toString());
}
}
Thank you very much.
P.S. About the data structure: I think I will have to start on paper with a pen, as I am a beginner and this is a learning curve for me.

4. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
5
OK, I think I understand a little better now. You don't have different record types -- there's only one record type -- but each record in the .txt file may be missing some fields. And I think you suggested that some records might even have duplicate fields, but I'll ask about that later.

So if you have this .txt file:
Java Code:
begin:
name:Barack H. Obama
age:48
sex:M

begin:
id:100023
accountNo:12398765478236

begin:
id:100028
age:23
name:Ed "Kookie" Burns

end:
...you want to end up with this .csv file:
Java Code:
"name","age","sex","id","accountNo"
"Barack H. Obama",48,"M",,""
"",,"",100023,"12398765478236"
"Ed ""Kookie"" Burns",23,"",100028,""
Note that I'm assuming that age and id are numeric values, and all other fields are strings. You can treat age and id as strings also, which will make things simpler, but I like a challenge. Also note the handling of strings that have double quotes in the text.
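To make the quote handling concrete, here is a minimal sketch of the usual CSV convention: wrap the value in double quotes and double any quote character inside it. The class name CsvQuoting and the method name quote are made up for this illustration and do not come from any library.

```java
// Sketch only: CsvQuoting and quote() are hypothetical names for illustration.
class CsvQuoting {

    /** Wraps a value in double quotes, doubling any embedded double quotes. */
    static String quote(String value) {
        if (value == null) {
            return "\"\""; // treat a missing value as an empty quoted string
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quote("Ed \"Kookie\" Burns")); // prints "Ed ""Kookie"" Burns"
        System.out.println(quote("M"));                   // prints "M"
    }
}
```

Numeric fields such as age and id could skip this step and be written bare, which is the distinction the example file above makes.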

So I hope I'm understanding correctly so far? Now what happens if you get a record like this?:
Java Code:
begin:
id:100375
name:Alfred E. Neuman
sex:M
id:100485
What do you want to do about the duplicate field? The easiest thing is to discard all but the last appearance of the field name. The safest thing is to raise an error and reject the entire record (maybe even the entire .txt file). Maybe the specification says to write one record with one id and another record with the other id, all other fields staying the same.
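The first two options can be sketched side by side. This is only an illustration under my own assumptions: the class name DuplicateFieldPolicy, the method parse and the flag rejectDuplicates are all hypothetical, and the record is assumed to arrive as one string with a key:value pair per line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: all names here are hypothetical, chosen for the illustration.
class DuplicateFieldPolicy {

    /**
     * Parses "key:value" lines into a map.
     * With rejectDuplicates == false the last appearance of a key wins.
     * With rejectDuplicates == true a repeated key raises an error.
     */
    static Map<String, String> parse(String recordText, boolean rejectDuplicates) {
        Map<String, String> record = new LinkedHashMap<String, String>();
        for (String line : recordText.split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a key:value line, skip it
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (rejectDuplicates && record.containsKey(key)) {
                throw new IllegalArgumentException("Duplicate field: " + key);
            }
            record.put(key, value); // put() overwrites, so the last value wins
        }
        return record;
    }

    public static void main(String[] args) {
        String text = "id:100375\nname:Alfred E. Neuman\nsex:M\nid:100485";
        System.out.println(parse(text, false).get("id")); // lenient: keeps the last id
        try {
            parse(text, true);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // strict: reports the duplicate
        }
    }
}
```

The third option, writing one output record per id, needs a decision about which fields get copied, so it is left out of the sketch.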

This post is getting long, so let me know if we're on the right track, and we can talk about data structures next.

-Gary-

5. Member
Join Date
Mar 2010
Posts
16
Rep Power
0
@gcalvin

Thanks a lot, Gary. Yes, we are on the right track. This is exactly what I am looking for. I have changed some of the logical steps. I am still far from where I want to be, but I believe I am on the right track now. Here is my new code. Sorry that I have not added comments to it.

Java Code:
import java.util.*;
import java.util.regex.*;
import java.io.*;

import com.sun.java.util.*;
import org.apache.commons.lang.StringUtils;

{
public static void main(String[] args)
{
File file = new File("C:\\source1.txt");
String line=null;
String end="END:";
String begin="BEGIN:";
StringBuffer str = new StringBuffer();

try
{
String text=null;
Set<String> s=new HashSet<String>();
try {
{

if(line.equalsIgnoreCase(begin))
{
for(String parts:line.split(":"))
{
String columnFile=StringUtils.substringBefore(line, ":");
String dataValue=StringUtils.substringAfter(line, ":");
Object o= (String)columnFile.toString();
if(!s.equals(o))
{
}
else
{
System.out.println("\n Duplicate labels are not allowed");
}
}
}
System.out.println(s.toString());
}
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

}catch(IOException e)
{
e.printStackTrace();
}
}
}

6. Member
Join Date
Mar 2010
Posts
16
Rep Power
0
@gcalvin

The good thing is that with this modified code, I am taking the keys (labels) out of the file, and duplicates are skipped.

Java Code:
import java.util.*;
import java.util.regex.*;
import java.io.*;

import com.sun.java.util.*;
import org.apache.commons.lang.StringUtils;

{
public static void main(String[] args)
{
File file = new File("C:\\source1.txt");
String line=null;
String end="END:";
String begin="BEGIN:";
StringBuffer str = new StringBuffer();

try
{
String text=null;
Set<String> s=new HashSet<String>();
try {
{

if(!line.equalsIgnoreCase(begin))
{
for(String parts:line.split(":"))
{
String columnFile=StringUtils.substringBefore(line, ":");
String dataValue=StringUtils.substringAfter(line, ":");

if(!s.contains(columnFile))
{
}
}
}

}
System.out.println(s.toString());
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

}catch(IOException e)
{
e.printStackTrace();
}
}
}

7. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
5
Take a look at this, and see if it gives you some ideas:
Java Code:
import java.io.*;
import java.util.*;

public class Txt2Csv {
    private String filename;
    private boolean moreRecords;
    private BufferedReader br; // shared by openInputFile() and readNextRecord()
    private Set<String> fieldNames;
    private ArrayList< Map<String, String> > records;
    private static final String BEGIN = "begin:";
    private static final String END = "end:";

    /**
     * @param filename the name of the .txt file we want to parse
     */
    public Txt2Csv(String filename) {
        this.filename = filename;
        fieldNames = new HashSet<String>();
        records = new ArrayList< Map<String, String> >();
        moreRecords = false; // openInputFile() should set it to true if all goes well
    }

    private void generateCsv() {
        // TODO Auto-generated method stub

    }

    private void parseRecord(String record) {
        // TODO Auto-generated method stub

    }

    // reads lines up to the next begin: or end: marker and returns them as one String
    private String readNextRecord() {
        String record = "";
        String line = "";
        while (true) {
            try {
                line = br.readLine();
            } catch (IOException e) {
                e.printStackTrace();
                line = null; // treat a read error like end of file
            }
            if (line == null) {
                // end of file reached without an end: marker
                moreRecords = false;
                break;
            }
            if (line.toLowerCase().startsWith(BEGIN)) {
                break;
            } else if (line.toLowerCase().startsWith(END)) {
                moreRecords = false;
                break;
            } else {
                record += line + "\n";
            }
        }
        return record;
    }

    // opens the file and reads ahead to the first begin: marker
    private void openInputFile() {
        try {
            br = new BufferedReader(new FileReader(filename));
        } catch (FileNotFoundException e) {
            e.printStackTrace();
            return; // moreRecords stays false, so parseFile() reads nothing
        }
        String line = "";
        while (!moreRecords) {
            try {
                line = br.readLine();
            } catch (IOException e) {
                e.printStackTrace();
                line = null; // treat a read error like end of file
            }
            if (line == null) {
                break; // no begin: marker found
            }
            if (line.toLowerCase().startsWith(BEGIN)) {
                moreRecords = true;
            } else if (line.toLowerCase().startsWith(END)) {
                break;
            }
        }
    }

    public void parseFile() {
        openInputFile();
        while (moreRecords) {
            String record = readNextRecord();
            parseRecord(record);
        }
        generateCsv();
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        // TODO: add code to get file name from args or
        //       prompt user for it
        Txt2Csv myApp = new Txt2Csv("input.txt");
        myApp.parseFile();
    }
}
I haven't really written any code that you didn't already write, but by breaking it down into methods, I think I've made it a little easier to read and understand. I've also given you a big data structure hint for handling your parsed records. You have two methods left to implement, and you might want to break up generateCsv() into multiple methods as well.

Your parseRecord() method should create a new HashMap<String, String> for the new record you're generating, then add that to the records ArrayList. Meanwhile, it should add the field names to the fieldNames Set. You need to parse all of your records into memory, and then generate your .csv file at the end, because you need to know what all the field names are when you write the first heading line of your .csv file.
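Here is a rough sketch of why everything has to be in memory first: the full list of column names is only known after the last record has been parsed. The class name InMemoryRecords is hypothetical. I have also swapped in a LinkedHashSet where the class above uses a HashSet, purely so the columns come out in the order they were first seen; that is my assumption, not a requirement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch only: InMemoryRecords is a hypothetical name for this illustration.
class InMemoryRecords {
    // LinkedHashSet (an assumption) keeps each name once, in first-seen order
    final Set<String> fieldNames = new LinkedHashSet<String>();
    final List<Map<String, String>> records = new ArrayList<Map<String, String>>();

    /** Turns one record's "key:value" lines into a Map and remembers its field names. */
    void parseRecord(String recordText) {
        Map<String, String> record = new HashMap<String, String>();
        for (String line : recordText.split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // skip blank or malformed lines
            }
            String name = line.substring(0, colon).trim();
            record.put(name, line.substring(colon + 1).trim());
            fieldNames.add(name); // a Set ignores names it already holds
        }
        records.add(record);
    }

    public static void main(String[] args) {
        InMemoryRecords store = new InMemoryRecords();
        store.parseRecord("name:abc\nid:123\ngender:m");
        store.parseRecord("dob:09/11/83");
        store.parseRecord("accountno:1234\nname:xyz\nid:1230\ngender:f");
        // Only now is the complete heading line known.
        System.out.println(store.fieldNames);
        System.out.println(store.records.size() + " records");
    }
}
```

Note that the second record introduces dob and the third introduces accountno, so a heading line written after the first record would have been incomplete.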

Keep your try{} blocks as tight as possible, and you should write better exception handlers than what we have right now.

-Gary-
Last edited by gcalvin; 03-12-2010 at 05:43 PM.

8. Senior Member
Join Date
Feb 2009
Posts
307
Rep Power
6
Why are you trying to do this all in one class? I would suggest breaking this down even further. Make yourself three classes, a Reader class, a Record class, and a Writer class.

You should first READ all the RECORDS from the file. Then using these RECORDS, WRITE to the file.

This design allows you to easily add methods that read and write other formats in which the records might be laid out differently (example: read the records from the CSV file and write them to a .txt file).

9. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
5
Originally Posted by StormyWaters
Why are you trying to do this all in one class? I would suggest breaking this down even further. Make yourself three classes, a Reader class, a Record class, and a Writer class.

You should first READ all the RECORDS from the file. Then using these RECORDS, WRITE to the file.

This design allows you to easily add methods that read and write other formats in which the records might be laid out differently (example: read the records from the CSV file and write them to a .txt file).
Hi Stormy,

I don't necessarily disagree with you, especially about having a separate Record class. If we break it up as you suggest, which class would hold the collection of records? Both the Reader and the Writer need it. Which class would take care of parsing the text into Records (I think the Record class should do it, but an argument could be made that the Reader class should do it)? Which class would generate the CSV lines?

I'm not sure that the Reader and Writer classes offer us much advantage, at least at this stage, but I could be convinced otherwise. A Record class is probably a good idea, and is definitely a good idea if you know all of the fields ahead of time.

-Gary-

10. Senior Member
Join Date
Feb 2009
Posts
307
Rep Power
6
Originally Posted by gcalvin
If we break it up as you suggest, which class would hold the collection of records? Both the Reader and the Writer need it.
The way I would picture it is that the Reader class would return a List of Records, and whichever class is using it should manage that list. If you are testing, the Test class would take care of it by doing something like this:

Java Code:
public class Test {

public static void main(String[] args) {
try {
Writer writer = new Writer();
writer.writeToCSVFile("C:\\out.csv");
} catch (Throwable e){
e.printStackTrace();
}
}

}

Originally Posted by gcalvin
Which class would take care of parsing the text into Records (I think the Record class should do it, but an argument could be made that the Reader class should do it)?
The record class is just an object that can have values set. It should not be responsible for the process of parsing the text. This should be handled by the Reader class with a method that would parse the .txt file and return a List of Records.

Originally Posted by gcalvin
Which class would generate the CSV lines?
That would be the purpose of the Writer class, to output the information in the desired format.

Originally Posted by gcalvin
A Record class is probably a good idea, and is definitely a good idea if you know all of the fields ahead of time.
You don't even need to know the fields ahead of time, just keep a List of all the Field Names you've added and a Hashtable to hash the Field Name to the data. Then create a getFieldNames() method that would return the names of the Fields stored in the Record and couple this with a getFieldValue(String fieldName) method that would return the data for the Field.
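Something along these lines is what I have in mind. Treat it as a sketch: I have called the class FieldRecord rather than Record to avoid confusion with other classes of that name, setField is a name I made up, and I have used a HashMap where a Hashtable would do the same job.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: FieldRecord and setField() are hypothetical names.
class FieldRecord {
    private final List<String> fieldNames = new ArrayList<String>();
    private final Map<String, String> values = new HashMap<String, String>();

    /** Stores a value; the name is added to the list only the first time it is seen. */
    void setField(String fieldName, String value) {
        if (!values.containsKey(fieldName)) {
            fieldNames.add(fieldName);
        }
        values.put(fieldName, value);
    }

    /** Returns the names of the fields stored in this record, in insertion order. */
    List<String> getFieldNames() {
        return Collections.unmodifiableList(fieldNames);
    }

    /** Returns the data for the field, or null if this record does not have it. */
    String getFieldValue(String fieldName) {
        return values.get(fieldName);
    }

    public static void main(String[] args) {
        FieldRecord record = new FieldRecord();
        record.setField("id", "100028");
        record.setField("age", "23");
        for (String name : record.getFieldNames()) {
            System.out.println(name + " = " + record.getFieldValue(name));
        }
    }
}
```

The Writer can then collect the field names from every record to build the header, and ask each record for its values one field at a time.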

11. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
5
OK, I see what you're saying. I was thinking that the Record (or actually, more specific subclasses of Record) would know about its own fields and their types, and would be better suited to create and parse the different possible textual representations. I think we'd need to know more about the full project to determine which is the better approach. For what we have so far, I think I'd stick with the single-class implementation. It's still pretty simple -- the record is a simple HashMap -- and there aren't enough methods to warrant breaking it up yet, in my judgment. But I can certainly understand why you'd take a different approach.

-Gary-

12. Member
Join Date
Mar 2010
Posts
16
Rep Power
0
@gcalvin
@StormyWaters

Thanks a lot for your guidance and suggestions, guys.

Best regards.

13. Member
Join Date
Mar 2010
Posts
16
Rep Power
0
@gcalvin

Hello Gary,

Sorry to ask you again about the same post. This is getting very confusing for me. I went through your code. It's good, but I am not getting a few of the concepts. I am keeping my code simple for now.

Java Code:
import java.util.*;
import java.util.regex.*;
import java.io.*;
import com.sun.java.util.*;
import org.apache.commons.lang.StringUtils;

{
public static void main(String[] args)
{
File file = new File("C:\\source3.txt");
String line=null;
boolean moreRecords;
ArrayList<String> records;
String end="END:";
String begin="BEGIN:";

try
{
String columnFile="",columnFileTemp="",dataValue="";
int size=0;
Set<String> s=new HashSet<String>(); //to get the key values in header only once, declare set
records = new ArrayList<String>();
boolean flag=false;
try
{
{

if(!line.equalsIgnoreCase(begin))	//ignore the begin: delimiter and start reading contents
{
//to extract contents separate variables after and before the colon
columnFileTemp = StringUtils.substringBefore(line, ":");
dataValue=dataValue+(dataValue.equals("")?"":",")+StringUtils.substringAfter(line, ":");
System.out.println(dataValue);
if(!s.contains(columnFileTemp))
{
columnFile=columnFile+(columnFile.equals("")?"":",")+StringUtils.substringBefore(line, ":");
s.add(columnFileTemp);//to avoid duplication check it with previously entered keys.
s.size();
}
else
{
System.out.println("\n duplicate values");
}

//System.out.println(records);//print the value on console
}
else
{
if(flag)
{
if(records.contains(dataValue))
{

}
dataValue="";
}
flag=true;
}
}
System.out.println(s.size());
//dataValue="\n"+dataValue;
System.out.println(records);
//create new .csv file and write the contents (keys:value) in the file
String element="";
File file1 = new File("c:\\write.csv");
FileWriter writer = new FileWriter(file1,true);
writer.write(columnFile);
Iterator<String> itr= records.iterator();
while(itr.hasNext())
{
writer.write(System.getProperty("line.separator"));
element=itr.next();
writer.write(element);

}

writer.close();
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}catch(IOException e)
{
e.printStackTrace();
}
}
}
now my .txt file.

BEGIN:
name:abc
id:123
gender:m
BEGIN:
dob:09/11/83
BEGIN:
accountno:1234
name:xyz
id:1230
gender:f
END:

My problem is that I am getting the keys in the header, but the data (values) do not line up with their corresponding keys (field names). I have tried everything I could: I am using a Set to avoid duplicate field names and an ArrayList for the values (data). But I am not getting the desired result, like:

column1,column2,column3,column4
,@,@,@
@,,@,,
,@,@,

14. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
5
This is not simple. This is a mess with everything crammed back into main(). Break it up into methods. I'm not even going to try to read it like this. Also you're trying to put your records into an ArrayList<String> after I explained that they need to go into an ArrayList<Map<String, String>>.

Break it up into methods again, and then tell me what concepts you're not getting.

-Gary-

15. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
5
Sorry, I didn't mean to get harsh. But seriously, do the parseFile(), openInputFile() and readFile() methods, and then try to implement parseRecord() and show me what you can come up with, and I'll try to help. We can tackle generateCsv() after that.

-Gary-

16. Member
Join Date
Mar 2010
Posts
16
Rep Power
0
@gcalvin

Actually, I should have split the code up earlier. I am sorry about that. I will work on it and send you fresh code. The fact is that once I start to get results, I don't go back and edit. I should, though. Thanks for your attention.

Best regards.

17. Member
Join Date
Mar 2010
Posts
16
Rep Power
0
@gcalvin

I have split up my code now. I am sending you what I have actually done. I have changed lots of things. I am now stuck on writing the writeToCsv() method, in which I am trying to get the key values from the HashMap.

Java Code:
import java.io.*;
import java.util.*;

import org.apache.commons.lang.StringUtils;

public class textToCsv
{
//Set<String> s = new HashSet<String>(); //to get the key values in header only once, declare set
String begin = "begin:";
String end = "end:";
String columnFileTemp,dataValue="";
String record = "";
String line = "";
ArrayList< Map<String, String> > records=new ArrayList<Map<String,String>>();
/**
* @param filename the name of the .txt file we want to parse
* @throws IOException
*/
public textToCsv()
{
super();
}

private Map<String,String> getColumn(String fileName) throws IOException

{
Map <String,String> columnName=null;
File file = new File(fileName);
try
{
columnName=new HashMap<String,String>();
{
if(!line.equalsIgnoreCase(begin))	//ignore the begin: delimiter and start reading contents
{
//to extract contents separate variables after and before the colon
columnFileTemp = StringUtils.substringBefore(line, ":");
if(!StringUtils.equalsIgnoreCase(columnFileTemp, "END"))
columnName.put(columnFileTemp,"");
}
}
System.out.println(columnName);
}catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
finally
{
System.out.println("ending.....");
}
return columnName;
}
private ArrayList<Map<String,String>> getData(String fileName,Map<String, String> columns) throws IOException
{
Map <String,String> dataRecord=null;
File file = new File(fileName);

try
{
int i=1;
while (line != null && !line.contains(end))//start reading the contents and between end of contents
{
if(StringUtils.containsIgnoreCase(line, "begin:"))	//ignore the begin: delimiter and start reading contents
{   System.out.println(i);
i++;
dataRecord=new HashMap<String,String>();
Collection c = columns.keySet();
Iterator itr= c.iterator();
while(itr.hasNext())
dataRecord.put((String)itr.next(),"");

}else{
while(StringUtils.indexOf(line, "BEGIN:")==-1 && line != null){
columnFileTemp = StringUtils.substringBefore(line, ":");
dataValue= StringUtils.substringAfter(line, ":");
if(columnFileTemp!="END")
dataRecord.put(columnFileTemp, dataValue);
}
continue;
}
}
System.out.println(records);
}catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
finally
{
System.out.println("ending datavalue.....");
}
return records;
}
/**
* @param args
* @throws IOException
* @throws IOException
*/
private void writeToCsv() throws IOException
{
try
{
File file1=new File("c:\\write.csv");
FileWriter writer = new FileWriter(file1,true);
Iterator<Map<String, String>> itr= records.iterator();
while(itr.hasNext())
{
writer.write(System.getProperty("line.separator"));
}
writer.close();
}catch(IOException e){}
}

public static void main(String[] args) throws IOException
{
// TODO: add code to get file name from args or
//       prompt user for it
textToCsv myApp = new textToCsv();
String fileName="c:\\source3.txt";
Map<String,String> columnMap=myApp.getColumn(fileName);
ArrayList<Map<String,String>> dataList=myApp.getData(fileName,columnMap);
myApp.writeToCsv();
}
}

In source3.txt I am parsing the following records:
BEGIN:
name:abc
id:123
gender:m
BEGIN:
dob:07/01/89
BEGIN:
accountno:1234
name:xyz
id:1230
gender:f
END:

Please let me know if you have any suggestions or ideas for getting the key values from the HashMap.

Best regards.

18. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
5
OK, let's start at the top:
Java Code:
import java.io.*;
import java.util.*;

import org.apache.commons.lang.StringUtils;
You probably don't need the Apache StringUtils, as you're not doing anything difficult with your strings.
Java Code:
public class textToCsv
Start your class name with an upper-case letter. Remember that your source code file name has to match.
Java Code:
public class TextToCsv
Java Code:
{
//Set<String> s = new HashSet<String>(); //to get the key values in header only once, declare set
You have this commented out, I believe because you declared a similar collection in your main() method, and confused yourself. You do want this to be an instance variable, but give it a useful name, like "fieldNames" and not a useless name like "s".
Java Code:
    private Set<String> fieldNames;
Java Code:
	String begin = "begin:";
String end = "end:";
These are constants, so they should be declared like this:
Java Code:
    private static final String BEGIN = "begin:";
    private static final String END = "end:";
private means only methods in this class have access
static means there is only one copy that is shared by all instances of this class
final means the value cannot be changed
And we give them names in ALL CAPS to remind ourselves that they are constants. Style is important -- learn good habits now.

Java Code:
	String columnFileTemp,dataValue="";
String record = "";
String line = "";
I'm not sure what these are yet, or whether they ought to be instance variables, but in any case, it's generally better practice to declare your ivars here and initialize them in your constructor.
Java Code:
	ArrayList< Map<String, String> > records=new ArrayList<Map<String,String>>();
This one should definitely be initialized in the constructor, but it looks like instead you created yourself a local variable in the constructor and ignored this.
Java Code:
    private ArrayList<Map<String, String>> records;
I think you should have ivars for your file names too, and it will be convenient to keep a boolean moreRecords as well.
Java Code:
    private String inFileName;
    private String outFileName;
    private boolean moreRecords;
Now we get to your constructor:
Java Code:

/**
* @param filename the name of the .txt file we want to parse
* @throws IOException
*/
public textToCsv()
{
super();
}
You've taken all the code out of the constructor, so the javadoc comments are no longer accurate, and the super() call is superfluous. As it is now, you should remove the constructor altogether, but actually you should be initializing your instance variables here.
Java Code:
    public TextToCsv(String inputFileName, String outputFileName) {
        fieldNames = new HashSet<String>();
        records = new ArrayList<Map<String, String>>();
        inFileName = inputFileName;
        outFileName = outputFileName;
        moreRecords = false; // we're not ready to read moreRecords until we've opened the input file
    }
After this you really start to go off the rails, so let me skip down to your main() method:
Java Code:
	public static void main(String[] args) throws IOException
No, don't let main() throw any exceptions -- catch them and deal with them.
Java Code:
    public static void main(String[] args)
Java Code:
	{
// TODO: add code to get file name from args or
//       prompt user for it
textToCsv myApp = new textToCsv();
String fileName="c:\\source3.txt";
We've changed the class name and constructor, so:
Java Code:
        TextToCsv myApp = new TextToCsv("c:\\source3.txt", "c:\\dest.txt");
Java Code:
		Map<String,String> columnMap=myApp.getColumn(fileName);
ArrayList<Map<String,String>> dataList=myApp.getData(fileName,columnMap);
myApp.writeToCsv();
System.out.println("Your file is written");
Move the rest of this code out of main() and into a separate method.
Java Code:
        myApp.parseFile();
    }
}
The reason why you want to do this is that in practice, you will rarely use a class's main() method. This TextToCsv class will be used from within some other class that's part of a larger application. That other class will want to do just what our main() method does -- get an instance of TextToCsv and then call that object's parseFile() method. Even if you think your class will never be used by another class, this is just a good habit to get into.

So let's write parseFile()
Java Code:
    public void parseFile() {
        BufferedReader br = openInputFile(); // should set moreRecords to true
        while (moreRecords) {
            String nextRecord = readNextRecord(br); // one record's lines joined with "\n"
            parseRecord(nextRecord); // writes key/value pair to records ArrayList and also updates fieldNames Set
        }
        closeInputFile(); // TODO write this
        writeCsv();
    }
I showed you code you can use for openInputFile() and readNextRecord() in a previous example. It's very important that each method does just one easy-to-understand thing. Your openInputFile() method should just open the file and return a BufferedReader, dealing with any exceptions in the process. My example also read into the file as far as the first "begin:" string and set moreRecords to true. Am I breaking my own rule about "do just one thing"? Maybe. Feel free to write a separate positionToFirstRecord() method.

Your readNextRecord method should just read lines from the file and concatenate them into a String. The first time it's called, the String nextRecord should be set to "name:abc\nid:123\ngender:m". Then parseRecord() needs to turn that string into a HashMap<String, String> that we can store in the records ArrayList.
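If it helps, here is one way readNextRecord could look, as a sketch under a few assumptions of mine: the reader is passed in and is already positioned just after a begin: line, markers are matched case-insensitively, blank lines are skipped, and the method returns null when there is nothing left (which means an empty record is treated the same as the end of the input). The class name RecordReaderSketch is hypothetical, and a StringReader stands in for the real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Sketch only: RecordReaderSketch is a hypothetical name for this illustration.
class RecordReaderSketch {
    private static final String BEGIN = "begin:";
    private static final String END = "end:";

    /** Reads lines up to the next begin:/end: marker; returns null when no record remains. */
    static String readNextRecord(BufferedReader br) throws IOException {
        StringBuilder record = new StringBuilder();
        String line;
        while ((line = br.readLine()) != null) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines between fields
            }
            String lower = trimmed.toLowerCase();
            if (lower.startsWith(BEGIN) || lower.startsWith(END)) {
                break; // the marker closes the current record
            }
            if (record.length() > 0) {
                record.append('\n');
            }
            record.append(trimmed);
        }
        return record.length() == 0 ? null : record.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "BEGIN:\nname:abc\nid:123\ngender:m\nBEGIN:\ndob:09/11/83\nEND:\n";
        BufferedReader br = new BufferedReader(new StringReader(text));
        br.readLine(); // position past the first BEGIN: line
        String record;
        while ((record = readNextRecord(br)) != null) {
            System.out.println(record.replace("\n", " | "));
        }
    }
}
```

Returning null instead of setting a moreRecords flag is a design choice for the sketch only; either approach works as long as the loop in parseFile() knows when to stop.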
Java Code:
    private void parseRecord(String recordString) {
        Map<String, String> record = new HashMap<String, String>();
        String[] lines = recordString.split("\n");
        for (String line : lines ) {
            String fieldName = line.substring(0, line.indexOf(':'));
            String fieldValue = line.substring(line.indexOf(':') + 1);
            record.put(fieldName, fieldValue);
            // update fieldNames Set
            fieldNames.add(fieldName);
        }
        records.add(record); // keep the finished record for writeCsv()
    }
So when we're done with parsing all the records, we have our fieldNames Set that has exactly one copy of each fieldName we found, and our records ArrayList, that has a Map of fieldName/fieldValue pairs for each record.

Now we're ready to tackle writeCsv(). It's probably not a bad idea to model it after our parseFile() method.
Java Code:
    public void writeCsv() {
        BufferedWriter bw = openOutputFile();
        writeCsvHeader(bw);
        for (Map<String, String> record : records) {
            String recordString = generateCsvRecord(record);
            writeCsvRecord(bw, recordString);
        }
        closeOutputFile(); // TODO write this
    }

    // writes the heading line: every field name, quoted and comma-separated
    private void writeCsvHeader(BufferedWriter bw) {
        String result = "";
        boolean firstField = true;
        for (String fieldName : fieldNames) {
            if (!firstField) result += ",";
            firstField = false;
            result += "\"";
            result += fieldName;
            result += "\"";
        }
        try {
            bw.write(result + System.getProperty("line.separator"));
        } catch (IOException e) {
            // TODO: handle exception gracefully
        }
    }

    private String generateCsvRecord(Map<String, String> record) {
        String result = "";
        boolean firstField = true;
        for (String fieldName : fieldNames) {
            // TODO: finish this -- I've done too much of it for you
            //   already. For each fieldName you need to check if
            //   you have it in your current Map, and if so, get
            //   the value, quote it, and concatenate it to result.
            //   If the fieldName is not in your current Map, you
            //   need to concatenate a quoted empty String to the
            //   result, as a placeholder. Add commas before all
            //   but the first field.
        }
        return result;
    }

    private BufferedWriter openOutputFile() {
        // TODO: write this method
        return null; // placeholder so the class compiles until this is written
    }

    private void writeCsvRecord(BufferedWriter bw, String recordString) {
        // TODO: write this one too
    }
That's a whole lot to digest. Look at it carefully. Resist the temptation to cram it all back into main(). Ask me specific questions.
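If you get stuck on generateCsvRecord(), here is one possible shape for it that you can compare against your own attempt. It is a sketch, not the only way to do it: I have made the method static and passed fieldNames in as a parameter so the example runs on its own, the helper quote and the class name CsvRowSketch are made up, and every field is treated as a string.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Sketch only: CsvRowSketch and quote() are hypothetical names.
class CsvRowSketch {

    /** Wraps a value in double quotes, doubling any embedded double quotes. */
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    /** Builds one CSV line with a slot for every known field name. */
    static String generateCsvRecord(Set<String> fieldNames, Map<String, String> record) {
        StringBuilder result = new StringBuilder();
        boolean firstField = true;
        for (String fieldName : fieldNames) {
            if (!firstField) {
                result.append(','); // commas go before all but the first field
            }
            firstField = false;
            String value = record.get(fieldName);
            if (value == null) {
                value = ""; // field missing from this record: empty placeholder
            }
            result.append(quote(value));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        Set<String> fieldNames = new LinkedHashSet<String>();
        fieldNames.add("name");
        fieldNames.add("id");
        fieldNames.add("dob");

        Map<String, String> record = new HashMap<String, String>();
        record.put("name", "Ed \"Kookie\" Burns");
        record.put("id", "100028");

        System.out.println(generateCsvRecord(fieldNames, record));
    }
}
```

Because the loop runs over fieldNames and not over the record's own keys, every line has the same number of columns in the same order as the heading line, which is what keeps values lined up with their headers.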

-Gary-
Last edited by gcalvin; 03-17-2010 at 10:08 PM.

19. Member
Join Date
Mar 2010
Posts
16
Rep Power
0
@gcalvin

Indeed, it's an exact technical analysis. I will improve my code-writing skills.

Best regards.

20. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
5
Just be patient with yourself. Break your work into small pieces rather than trying to do it all at once. Do you understand how the ArrayList<Map<String, String>> works now?

-Gary-
