ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Hi all. This is probably a simple fix but the answer is eluding me at the moment.
I have a JTable built on an AbstractTableModel (I needed some features of this API, I know it's harder but that's not the issue). The table model takes a String[] for the column names and an ArrayList<String> for the data set.
Now, the data source is a pipe-delimited .csv file. When I dump the data into the file it is formatted like this:
"fo"o" |"123" |"bar" |"abc" |"321"
"foo" |"def" |"bar" |"a"bc" |"321"
"bar" |"12"3" |"bar" |"123" |"321"
"def" |"123" |"bar" |"abc" |"321"
I read it like this:
Code:
//get size of data source
LineNumberReader lnr = new LineNumberReader(new FileReader(f));
lnr.skip(Long.MAX_VALUE);
int ln = lnr.getLineNumber() - 1; // subtract 1 to account for the skipped row
lnr = null;
// Import CSV File into table model
ArrayList<String> al = new ArrayList<String>(ln);
while (c < ln) {
al.add(s.nextLine());//.replaceAll("\"", "")); //clearly this doesn't meet the objective
//al.add(s.nextLine().replace("\"", ""));
c++;
}
The problem is, I need to strip the surrounding "", but keep the ones that are part of the actual data set.
Any inexpensive methods to target only the surrounding double quotes?
I'd really appreciate any constructive input.
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Split the string on the pipe. Trim each. Then replace the surrounding " (check the first and last characters, and replace if they are "). You could alternatively use a regular expression to group out the data internal to the quotes.
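A minimal sketch of the split/trim/strip idea above, in case it helps. The class and method names (QuoteStripper, stripSurroundingQuotes, parseLine) are made up for illustration, and it assumes every field is wrapped in exactly one pair of double quotes:

```java
import java.util.ArrayList;
import java.util.List;

class QuoteStripper {

    // Removes one leading and one trailing double quote, but only if both are present.
    // Quotes inside the field are left alone.
    static String stripSurroundingQuotes(String field) {
        if (field.length() >= 2 && field.startsWith("\"") && field.endsWith("\"")) {
            return field.substring(1, field.length() - 1);
        }
        return field;
    }

    // Splits a pipe-delimited line, trims each piece, then strips its outer quotes.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        for (String raw : line.split("\\|")) {
            fields.add(stripSurroundingQuotes(raw.trim()));
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints [fo"o, 123, bar, abc, 321]
        System.out.println(parseLine("\"fo\"o\" |\"123\" |\"bar\" |\"abc\" |\"321\""));
    }
}
```

Only the first and last characters are checked, so no regex is needed beyond the split on the pipe itself.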
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Similar to doWhile, split based on " |", then drop the first character of the first element and the last character of the last element.
Of course, someone will now come in with a regex that will do it in one go...:)
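Roughly what that could look like, as a sketch only. QuotePipeSplitter and splitLine are hypothetical names, and it assumes the separator between fields is always exactly quote, one space, pipe, quote, as in the sample rows:

```java
import java.util.Arrays;

class QuotePipeSplitter {

    // Splits on the literal four-character separator: quote, space, pipe, quote.
    // That leaves one stray quote at the very start and one at the very end.
    static String[] splitLine(String line) {
        String[] parts = line.split("\" \\|\"");
        if (parts.length == 0) {
            return parts;
        }
        int last = parts.length - 1;
        // Drop the opening quote of the first field.
        if (parts[0].startsWith("\"")) {
            parts[0] = parts[0].substring(1);
        }
        // Drop the closing quote of the last field.
        if (parts[last].endsWith("\"")) {
            parts[last] = parts[last].substring(0, parts[last].length() - 1);
        }
        return parts;
    }

    public static void main(String[] args) {
        // prints [bar, 12"3, bar, 123, 321]
        System.out.println(Arrays.toString(splitLine("\"bar\" |\"12\"3\" |\"bar\" |\"123\" |\"321\"")));
    }
}
```

If the spacing around the pipe varies in the real file, the separator pattern would need to be loosened.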
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Quote:
Originally Posted by
Tolls
Of course, someone will now come in with a regex that will do it in one go...:)
:P Yes, I would use (and advise using) the Scanner class with findWithinHorizon: read, filter, add each string to the list, and close the stream in about 3-5 lines of code!
But it's not clear to me why you need the number of lines. Is it only for the following while loop? I don't think you need the LineNumberReader (the first four lines).
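To illustrate the point about the line count, here is a sketch that lets the ArrayList grow on its own and loops on hasNextLine() instead of a counter. LineLoader and readLines are hypothetical names, and the skipHeader flag is an assumption based on the "skipped row" comment in the original code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class LineLoader {

    // Reads every remaining line from the scanner.
    // No line count is needed up front because the list resizes itself.
    static List<String> readLines(Scanner in, boolean skipHeader) {
        List<String> lines = new ArrayList<>();
        if (skipHeader && in.hasNextLine()) {
            in.nextLine(); // discard the header row
        }
        while (in.hasNextLine()) {
            lines.add(in.nextLine());
        }
        return lines;
    }

    public static void main(String[] args) {
        // A String source stands in for the real file here.
        String sample = "header\n\"foo\" |\"def\"\n\"bar\" |\"123\"\n";
        try (Scanner in = new Scanner(sample)) {
            System.out.println(readLines(in, true).size()); // prints 2
        }
    }
}
```

For the real file, the Scanner would be built from a File, as in the code posted earlier in the thread.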
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
I really appreciate the input, guys. So the issue is setting up the scanning loop/component to where I can get access to each data point and strip the doublequotes. I'm struggling to even get that far. I've looked high and low, but can't quite get around the following memory error:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.util.Arrays.copyOf(Unknown Source)
at java.util.ArrayList.ensureCapacity(Unknown Source)
at java.util.ArrayList.add(Unknown Source)
at pcInvpkg.SearchTest.getTable(SearchTest.java:289)
at pcInvpkg.SearchTest.<init>(SearchTest.java:108)
at pcInvpkg.gui2.build(gui2.java:40)
at pcInvpkg.gui2.main(gui2.java:27)
Using this code:
Code:
Scanner scan = new Scanner(new File("C:\\PCQ\\A_R06.csv"));
ArrayList<String> al = new ArrayList<String>();
scan.useDelimiter("\\|");
while(scan.hasNext()){
String[] list = scan.next().split("\\|");
for (String s : list) {
//do string manipulation to String s here to remove ""
al.add(s);
}
}
scan.close();
scan = null;
Tips/examples for a more efficient implementation?
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Quote:
Originally Posted by
Tolls
Of course, someone will now come in with a regex that will do it in one go...:)
People try to do too much with regular expressions, blindly ignoring the efficiency of it all; on top of that, they won't believe you when you explain to them that something can't be done with regular expressions, even when you prove the fact by using the pumping lemma; the only thing they do is laugh in a silly way. I dislike those regular expression monsters.
kind regards,
Jos
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Quote:
Originally Posted by
JosAH
People try to do too much with regular expressions, blindly ignoring the efficiency of it all; on top of that, they won't believe you when you explain to them that something can't be done with regular expressions, even when you prove the fact by using the pumping lemma; the only thing they do is laugh in a silly way. I dislike those regular expression monsters.
I love 'em ;) Code:
public class SplitPipeRemoveQuotes {
public static void main(String[] args) {
String[] inputs = {
"\"fo\"o\" |\"123\" |\"bar\" |\"abc\" |\"321\"",
"\"foo\" |\"def\" |\"bar\" |\"a\"bc\" |\"321\"",
"\"bar\" |\"12\"3\" |\"bar\" |\"123\" |\"321\"",
"\"def\" |\"123\" |\"bar\" |\"abc\" |\"321\""
};
String regex = "(^\")|(\"?\\s?\\|\"?)|(\"$)";
for (String input : inputs) {
System.out.println(input);
String[] outputs = input.split(regex);
for (String output : outputs) {
System.out.println("[" + output + "]");
}
System.out.println("--------------------------");
}
}
}
No, I don't know how to avoid getting one empty element at position [0]
db
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Quote:
Originally Posted by
DarrylBurke
I love 'em ;)
Turn on the rotating knives machine!
kind regards,
Jos ;-)
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Quote:
Originally Posted by
DarrylBurke
No, I don't know how to avoid getting one empty element at position [0]
db
AHHH soooo close!! I'm reading that once an array is instantiated, its structure can't be altered.. darn!
EDIT Got it!!
Snippet as implemented:
Code:
while (c < ln) {
String regex = "(^\")|(\"?\\s?\\|\"?)|(\"$)";
String[] line = s.nextLine().split(regex);
String line2 = arrayToString2(line, "|");
//System.out.println(line2);
al.add(line2);
c++;
}
s.close();
s = null;
Using this little ditty:
Code:
public static String arrayToString2(String[] a, String separator) {
    // split() leaves an empty string at a[0] because the regex matches the
    // opening quote, so the real fields start at a[1].
    StringBuilder result = new StringBuilder();
    if (a.length > 1) {
        result.append(a[1]);
        for (int i = 2; i < a.length; i++) {
            result.append(separator);
            result.append(a[i]);
        }
    }
    return result.toString();
}
You guys are amazing, I really don't know what I'd do without this forum.
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Quote:
Originally Posted by
JosAH
Turn on the rotating knives machine!
That gives a new meaning to split(...) ....
db
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Quote:
Originally Posted by
Redefine12
AHHH soooo close!! I'm reading that once an array is instantiated, its structure can't be altered.. darn!
You can replaceAll("^\"", "") before splitting. I considered that cheating, because it uses two regexes. Code:
:
:
String regex = "(\"?\\s?\\|\"?)|(\"$)";
for (String input : inputs) {
System.out.println(input);
String[] outputs = input.replaceAll("^\"", "").split(regex);
:
:
Do you understand the structure of the regexes? If you don't, ask.
db
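For anyone who wants to run the two-regex version, here is a self-contained sketch of it. The class name, constant, and method name are made up for illustration; the two patterns are the ones from the post above:

```java
import java.util.Arrays;

class SplitWithoutLeadingEmpty {

    // Matches either the separator between fields (optional quote, optional
    // whitespace character, pipe, optional quote) or the closing quote at end of line.
    private static final String SEPARATOR = "(\"?\\s?\\|\"?)|(\"$)";

    // Removes the opening quote first, so split() has nothing to match at
    // position 0 and no empty element appears at index [0].
    static String[] fields(String line) {
        return line.replaceAll("^\"", "").split(SEPARATOR);
    }

    public static void main(String[] args) {
        // prints [foo, def, bar, a"bc, 321]
        System.out.println(Arrays.toString(fields("\"foo\" |\"def\" |\"bar\" |\"a\"bc\" |\"321\"")));
    }
}
```

One thing to watch: because the quotes in the separator pattern are optional, a bare pipe inside a field would also be treated as a separator.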
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
I edited my above post with my hacky solution I gleaned from an example--it works, and using a separate method is good OOP from my feeble grasp of it. I'm pretty impressed with the power of regex after seeing your implementation. I'm gonna devote a few hours to learning the concepts this afternoon. If you've got one, I'd love a link to a good explanation on regex (for dummies).
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes
Oh, interesting solutions. My idea was
Code:
try (Scanner sc = new Scanner(new File("C:\\PCQ\\A_R06.csv"))) {
while (sc.findWithinHorizon("\"(.+?)\"[\\|\\s]", 0) != null) {
al.add(sc.match().group(1));
}
}
:(sweat):
I love them too :x:
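A runnable sketch of the same findWithinHorizon idea, using a String as the source so it can be tried without the file. HorizonExtractor and extract are hypothetical names. It assumes every field, including the last one on the last line, is followed by a pipe or whitespace, so the final line needs a trailing newline:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class HorizonExtractor {

    // Repeatedly finds the next quoted field and keeps the text between the outer quotes.
    // A horizon of 0 means the search is not limited to a fixed number of characters.
    static List<String> extract(Scanner sc) {
        List<String> out = new ArrayList<>();
        while (sc.findWithinHorizon("\"(.+?)\"[\\|\\s]", 0) != null) {
            out.add(sc.match().group(1)); // group 1 is the field without its outer quotes
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "\"fo\"o\" |\"123\" |\"321\"\n";
        try (Scanner sc = new Scanner(sample)) {
            System.out.println(extract(sc)); // prints [fo"o, 123, 321]
        }
    }
}
```

Note that this produces one list entry per field rather than one per line, and that with a horizon of 0 the Scanner may buffer a lot of input while searching, which matters for a large file.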
Re: ArrayList<String> data cleanup for JTable- Strip surrounding doublequotes