Thread: RegExp and UTF-8 Characters
- 08-14-2010, 02:02 AM #1
Hi everyone,
I am relatively new to Java and RegExp.
At the moment I am building a socket server where I use UTF-8 control characters (\u0000 - \u0007) for special messages.
What I am looking for is a RegExp pattern to convert all of these to a $ symbol. My main problem is actually specifying the characters in the RegExp.
This works:
Java Code:
myNewString = myString.replaceAll("\u0000", "$")
But this doesn't work and it is what I need:
Java Code:
myNewString = myString.replaceAll("[\u0000-\u0007]", "$")
Thanks for your help and reading,
Dan
- 08-14-2010, 02:32 AM #2
Hello :)
You can use code like this:
Java Code:
char d;
void gogo(int c) {
    int a = (int) '\u0000';
    int b = (int) '\u0007';
    // c is a control character only if it lies inside [a, b]
    if (c >= a && c <= b) { d = '$'; }
}
- 08-14-2010, 03:01 AM #3
I get the following error with these lines of code:
String myString = "First this \u0000 for testing";
String myNewString = myString.replaceAll("\u0000", "$"); // line 31
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 1
at java.lang.String.charAt(String.java:687)
at java.util.regex.Matcher.appendReplacement(Matcher.java:711)
at java.util.regex.Matcher.replaceAll(Matcher.java:813)
at java.lang.String.replaceAll(String.java:2190)
at TestStatement.main(TestStatement.java:31)
- 08-14-2010, 03:10 AM #4
Or this one...
Java Code:
char gogo(int c) {
    int a = (int) '\u0000';
    int b = (int) '\u0007';
    char d = (char) c;
    // both bounds must hold, so this has to be && (with || every value would match)
    if (c >= a && c <= b) { d = '$'; }
    System.out.println(d);
    return d;
}
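For anyone who wants to go the character-by-character route instead of a regex, the check above can be applied to a whole string with a StringBuilder. This is only a rough sketch: the names ControlCharMasker and maskControlChars are made up for illustration and are not from anyone's code in this thread.

```java
class ControlCharMasker {
    // Hypothetical helper: replaces every char in the range 0x0000-0x0007 with '$'.
    static String maskControlChars(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Same range test as in gogo(): both bounds must hold.
            boolean isControl = c >= '\u0000' && c <= '\u0007';
            out.append(isControl ? '$' : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "First this $ for testing"
        System.out.println(maskControlChars("First this \u0000 for testing"));
    }
}
```

No regex is involved here, so the dollar sign has no special meaning and needs no escaping.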
- 08-14-2010, 09:49 AM #5
0.
Quote: This works:
Java Code:
myNewString = myString.replaceAll("\u0000", "$")
I don't think so. That code should produce the error Norm reported. Probably the code you posted isn't the code you ran. Don't do that -- it greatly reduces the chances of getting targeted help on a forum.
1. The dollar sign is a metacharacter in the replacement String and needs to be quoted with a double backslash.
2. There's no problem with the character class or the unicode characters.
Java Code:
myNewString = myString.replaceAll("[\u0000-\u0007]", "\\$")
db
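To make point 1 concrete, here is a small sketch with two ways of getting a literal dollar sign into the replacement: escaping it by hand with a double backslash, or letting Matcher.quoteReplacement (available since Java 5) do the quoting. The class and method names are invented for this example.

```java
import java.util.regex.Matcher;

class DollarReplacement {
    // Character class covering 0x0000-0x0007; javac turns the escapes into the real chars.
    static final String CONTROL_CHARS = "[\u0000-\u0007]";

    // Escapes the dollar sign by hand, as in the post above.
    static String withBackslash(String s) {
        return s.replaceAll(CONTROL_CHARS, "\\$");
    }

    // Lets the library quote the replacement string instead.
    static String withQuoteReplacement(String s) {
        return s.replaceAll(CONTROL_CHARS, Matcher.quoteReplacement("$"));
    }

    public static void main(String[] args) {
        String msg = "First this \u0000 for testing";
        System.out.println(withBackslash(msg));        // First this $ for testing
        System.out.println(withQuoteReplacement(msg)); // First this $ for testing
    }
}
```

quoteReplacement is handy when the replacement text comes from somewhere else and you can't know in advance whether it contains $ or backslashes.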
- 08-14-2010, 10:48 AM #6
Don't the unicode escape sequences need to be escaped or double backslashes as well? i.e.,
Java Code:
public class UnicodeRegEx {
    public static void main(String[] args) {
        String test = "Hello World";
        System.out.println(test);
        String regex = "[\\u0061-\\u0079]";
        test = test.replaceAll(regex, "\\$");
        System.out.println(test);
    }
}
- 08-14-2010, 12:02 PM #8
Nope, those \uxxxx escape sequences are handled by javac, the Java compiler; the regexp compiler doesn't care about special characters in funny intervals, it just cares about its meta characters, so the regexp is:
Java Code:
String regex = "[\u0061-\u0079]";
kind regards,
Jos
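As a side note to the exchange above, both spellings end up matching the same characters, just by different routes. With a single backslash javac substitutes the character before the regex engine ever sees the string; with a double backslash the string reaches java.util.regex with the escape intact, and Pattern resolves the four-hex-digit form itself. A quick sketch to compare them (the class and method names are made up for this example):

```java
class UnicodeEscapeDemo {
    // javac replaces the escapes, so the regex engine receives the literal range a-y.
    static final String COMPILER_ESCAPED = "[\u0061-\u0079]";
    // The doubled backslash survives javac; the regex engine resolves the escape itself.
    static final String REGEX_ESCAPED = "[\\u0061-\\u0079]";

    static String mask(String input, String regex) {
        return input.replaceAll(regex, "\\$");
    }

    public static void main(String[] args) {
        System.out.println(mask("Hello World", COMPILER_ESCAPED)); // H$$$$ W$$$$
        System.out.println(mask("Hello World", REGEX_ESCAPED));    // H$$$$ W$$$$
    }
}
```

The double-backslash form is the safer habit for characters that would upset the compiler if substituted early, such as a line feed or a double quote inside a string literal.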
- 08-14-2010, 04:13 PM #10
Thanks for all the replies everyone!
Oh no, I was so unlucky to choose the $ dollar symbol lol.
OK, so all I needed was to use \\$ instead of $ and it works perfectly now, thank you very much guys.
Working code:
Java Code:
msg.replaceAll("[\u0000-\u0007]", "\\$");
Norm + Darryl.Burke, I understand now that the code I ran SHOULDN'T have worked, but for some reason it honestly DID work on my compiler.
JosAH, I did read the documentation A LOT and I can't find anything about $ symbols here: String (Java 2 Platform SE v1.4.2)
Thanks again everyone!
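One possible explanation for the "it DID work" part (a guess, nothing in this thread confirms it): replaceAll only looks at the replacement string when the pattern actually matches something. If the test string contained no \u0000 at all, the bare "$" would never be parsed and no exception would be thrown. A minimal sketch, with a made-up class and method name:

```java
class NoMatchDemo {
    // Returns true if replaceAll with a bare "$" replacement throws for this input.
    static boolean bareDollarFails(String input) {
        try {
            input.replaceAll("\u0000", "$");
            return false;
        } catch (RuntimeException e) {
            // The exact exception type differs between Java versions.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(bareDollarFails("no control characters here"));    // false
        System.out.println(bareDollarFails("First this \u0000 for testing")); // true
    }
}
```

So a test run against input without any control characters would look fine and only blow up later on real messages.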
- 08-14-2010, 04:16 PM #11
- 08-14-2010, 04:26 PM #12