Thread: String Intern & Java memory
- 10-08-2010, 05:09 PM #1
Hi all,
I desperately need your help. I am using Java to parse a very big file line by line, split each line into strings, and then put the strings in a map. If I intern the strings before putting them in the map, the program uses very little memory, but if I do not intern them, it uses almost twice as much. I don't understand this, since all these strings are temporary references and the map should keep only one copy of a string based on its content. My code is:
package test;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.TreeMap;
import java.util.TreeSet;

public class Trouble
{
    TreeMap<String, Integer> hs = new TreeMap<String, Integer>();
    TreeSet<String> pruned = new TreeSet<String>();
    BufferedReader BR = null;
    int count = 0;

    public Trouble(String paramString)
    {
        String[] arrayOfString = null;
        String str = null;
        try
        {
            this.BR = new BufferedReader(new InputStreamReader(new FileInputStream(paramString)));
        }
        catch (Exception localException1)
        {
            localException1.printStackTrace();
        }
        while (true)
        {
            try
            {
                str = this.BR.readLine();
                count++;
            }
            catch (Exception localException2)
            {
            }
            if (str == null)
                break;
            if (count % 5000 == 0)
                System.out.println(count);
            arrayOfString = str.split("\\s+");
            for (int i = 1; i < arrayOfString.length; i++)
            {
                if (this.hs.containsKey(arrayOfString[i]))
                {
                    int d = hs.get(arrayOfString[i]);
                    if (d == 1)
                    {
                        //String internedS = arrayOfString[i];
                        String internedS = arrayOfString[i].intern();
                        this.hs.put(internedS, hs.get(internedS) + 1);
                    }
                }
                else
                {
                    //String internedS = arrayOfString[i];
                    String internedS = arrayOfString[i].intern();
                    this.hs.put(internedS, Integer.valueOf(1));
                }
            }
        }
        arrayOfString = (String[])this.hs.keySet().toArray(new String[0]);
        for (int i = 0; i < arrayOfString.length; i++)
            if (((Integer)this.hs.get(arrayOfString[i])).intValue() >= 2)
                this.pruned.add(arrayOfString[i]);
        this.hs.clear();
        try
        {
            this.BR.close();
        }
        catch (Exception localException3)
        {
        }
    }

    public static void main(String[] args)
    {
        Trouble e = new Trouble(args[0]);
    }
}
If you uncomment the two commented lines and comment out the line immediately after each one, the program uses a lot more memory. I am running it like this:
java -Xmx5g -XX:MaxPermSize=2g -verbose:gc -cp bin:$CLASSPATH:$( echo Jars/*.jar . | sed 's/ /:/g') test.Trouble train-set
Please help. I don't want to use intern() (because it's slow), but I want to reduce memory usage, and I don't understand why this memory difference occurs.
GK
- 10-08-2010, 07:23 PM #2
Every project, package, class, method, variable, syntax, algorithm, etc.
are registered in my memory bank. Thanks to this thread.
- 10-08-2010, 08:08 PM #3
Hi chyrl,
No, the output will not be different.
The duplicated lines are inside an if-else, so only one of the two blocks is executed each time.
- 10-09-2010, 02:44 AM #4
Have you tried using a different collection type?
- 10-09-2010, 03:30 AM #5
Could you repost using code tags so that the code is readable?
From what I can gather, your code seems to be reading some lines and adding each (whitespace-separated) word to a TreeMap instance, associating it with an integer value. I get lost at about this point!
Note that split() returns an array of substrings of the string it was passed (the line, in your case). My understanding is that for as long as a reference to one of those substrings exists, the whole line will be retained. If the line is long and the bits you want to keep are small, you can remove this overhead with:
Java Code:
String toStore = new String(arrayOfString[i]);
hs.put(toStore, 1);
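To make that suggestion a bit more concrete, here is a minimal sketch, not the original poster's program. The class CompactKeys and its countWords method are hypothetical names. It copies a token with new String(...) only the first time the token is seen, so each distinct word is copied at most once. Unlike the posted code, it counts every token on the line and keeps full counts. Whether the copy actually saves memory depends on how the JVM implements substrings; it matters on JVMs where a substring shares the parent line's character array, as described above.

```java
import java.util.TreeMap;

class CompactKeys {
    // Hypothetical helper: counts whitespace-separated words across lines,
    // storing a fresh copy of each word the first time it is inserted.
    static TreeMap<String, Integer> countWords(String[] lines) {
        TreeMap<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            for (String part : line.split("\\s+")) {
                if (part.isEmpty()) {
                    continue; // split() can yield an empty first token
                }
                Integer old = counts.get(part);
                if (old == null) {
                    // First sighting: store a copy rather than the substring itself.
                    counts.put(new String(part), Integer.valueOf(1));
                } else {
                    // Key already present: TreeMap keeps the stored key object
                    // and only replaces the value, so no copy is needed here.
                    counts.put(part, Integer.valueOf(old.intValue() + 1));
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] lines = { "a b a", "b c" };
        System.out.println(countWords(lines)); // prints {a=2, b=2, c=1}
    }
}
```

The single get() followed by put() also avoids the separate containsKey() and get() lookups in the posted loop.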
I'm also not clear about why you are calling intern(). A TreeMap uses the natural ordering of its keys (compareTo()), not ==, when you call get().
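A small sketch of that last point, with made-up class and method names: two distinct String objects with the same content are treated as the same key by a TreeMap, and a second put() with an equal key updates the value but leaves the first key object in the map. So interning is not needed for lookups to work; it only changes which String object ends up stored the first time a key is inserted.

```java
import java.util.TreeMap;

class TreeMapLookupDemo {
    // Two different String objects with equal content are one key to a TreeMap.
    static boolean foundByContent() {
        TreeMap<String, Integer> map = new TreeMap<String, Integer>();
        String stored = new String("token");
        String probe = new String("token"); // distinct object, same characters
        map.put(stored, Integer.valueOf(1));
        return stored != probe && map.containsKey(probe);
    }

    // put() with an equal key replaces the value but keeps the original key object.
    static boolean putKeepsOriginalKey() {
        TreeMap<String, Integer> map = new TreeMap<String, Integer>();
        String first = new String("token");
        String second = new String("token");
        map.put(first, Integer.valueOf(1));
        map.put(second, Integer.valueOf(2));
        return map.size() == 1
                && map.firstKey() == first
                && map.get("token").intValue() == 2;
    }

    public static void main(String[] args) {
        System.out.println(foundByContent());      // prints true
        System.out.println(putKeepsOriginalKey()); // prints true
    }
}
```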
- 10-09-2010, 02:16 PM #6