1. Member
Join Date
Feb 2012
Posts
23
Rep Power
0

## Regular Expression Problem

I have an ArrayList of strings and I am trying to use a regular expression to split each string into 5 parts.

Sample strings:
KRAFT BREYERS Lowfat Strawberry Yogurt (1% Milkfat) 96 18.2 3.8 0.8
Milk whole, 3.25% milkfat 61, 4.78 3.15, 3.27
Yogurt plain, 13 grams protein per 8 ounce 56 7.68 5.73 0.18
Ground turkey 93% lean 7% fat pan-broiled crumbles 213 0 27.1 11.6

I am trying to split each string on the last 4 numbers it contains. So if I were to split the first string in the list above, the result would look like:
KRAFT BREYERS Lowfat Strawberry Yogurt (1% Milkfat)
96
18.2
3.8
0.8

I tried something like (?<![a-z])\s\d\s(?<![a-z]), but that doesn't seem to work. Any ideas?

2. Senior Member
Join Date
Feb 2012
Posts
117
Rep Power
0

## Re: Regular Expression Problem

Regex might not be the right tool for the job here. Instead, consider what you know about the string:
"I am trying to split the string based on the last 4 numbers in every string."
Perhaps you could work with this knowledge instead.
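
As a rough sketch of that idea, the last four tokens can be peeled off the end with `lastIndexOf`, leaving whatever remains as the description. The class and method names below are made up for illustration, and the sketch assumes the tokens are separated by spaces and that a description always precedes the numbers. Stray commas such as `61,` are left on the tokens.

```java
import java.util.Arrays;

class TailSplitter {
    // Hypothetical helper: peels the last `count` space-separated tokens
    // off the end of `line`; everything before them ends up in slot 0.
    static String[] splitTail(String line, int count) {
        String[] parts = new String[count + 1];
        String rest = line.trim();
        for (int i = count; i >= 1; i--) {
            int cut = rest.lastIndexOf(' ');
            if (cut < 0) {
                throw new IllegalArgumentException(
                        "Expected a description followed by " + count + " values: " + line);
            }
            parts[i] = rest.substring(cut + 1);
            // trim() drops any extra spaces left before the token just removed
            rest = rest.substring(0, cut).trim();
        }
        parts[0] = rest;
        return parts;
    }

    public static void main(String[] args) {
        String line = "KRAFT BREYERS Lowfat Strawberry Yogurt (1% Milkfat) 96 18.2 3.8 0.8";
        System.out.println(Arrays.toString(splitTail(line, 4)));
    }
}
```

Because the scan works from the end, numbers inside the description (like `1%`) are never looked at.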

3. ## Re: Regular Expression Problem

Are the commas in your second input sample intentional? I'm guessing not, but I've catered for them anyway, except that this first version does not remove the comma from the split output. Post back if my guess is wrong.
Java Code:
`"\\s(?=([\\d.,]*\\s?){1,4}$)"`
Brief explanation: split on a whitespace character only when everything after it, up to the end of input, consists of one to four groups, each group being any number of digits, dots, or commas followed by zero or one whitespace character.
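
Here is a small sketch that runs this pattern over the sample strings from the first post. The class and method names are only illustrative, and the end-of-input anchor is written as a plain `$`, since `\$` is not a valid escape inside a Java string literal.

```java
class SplitDemo {
    // The split pattern from this post, with `$` as the end-of-input anchor.
    static final String TAIL = "\\s(?=([\\d.,]*\\s?){1,4}$)";

    static String[] split(String line) {
        return line.split(TAIL);
    }

    public static void main(String[] args) {
        String[] samples = {
            "KRAFT BREYERS Lowfat Strawberry Yogurt (1% Milkfat) 96 18.2 3.8 0.8",
            "Milk whole, 3.25% milkfat 61, 4.78 3.15, 3.27",
            "Yogurt plain, 13 grams protein per 8 ounce 56 7.68 5.73 0.18",
            "Ground turkey 93% lean 7% fat pan-broiled crumbles 213 0 27.1 11.6"
        };
        for (String sample : samples) {
            // Print each part on its own line, then a separator between samples
            for (String part : split(sample)) {
                System.out.println(part);
            }
            System.out.println("--");
        }
    }
}
```

With this version the commas in the second sample stay attached to their numbers (`61,` and `3.15,`).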

db

Edit: This version also removes the (zero or one) comma that precedes the whitespace.
Java Code:
`"(?:,?)\\s(?=([\\d.,]*\\s?){1,4}$)"`
Last edited by DarrylBurke; 02-28-2012 at 09:36 PM.

4. Member
Join Date
Feb 2012
Posts
23
Rep Power
0

## Re: Regular Expression Problem

Thanks DarrylBurke, this seems to work. I guess I was a lot farther away from getting this to work than I thought.

The comma is intentional. One of the database fields is a food description, and it is an alphanumeric String with punctuation. That description field then had 4 numbers appended to it that I needed for a calculation, which made it difficult to split. I couldn't figure out how to split off the last 4 numbers while ignoring the numbers in the description field (like percentages).
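
For the calculation step, something along these lines might work: split with the comma-stripping pattern, then parse the four trailing parts as doubles. The class and method names are hypothetical, and since the thread does not say what each number means, the values are kept in a plain array rather than named fields.

```java
class FoodLineParser {
    // Comma-stripping split pattern from this thread, with `$` as the anchor.
    static final String TAIL = "(?:,?)\\s(?=([\\d.,]*\\s?){1,4}$)";

    // Returns the text before the four trailing numbers.
    static String description(String line) {
        return line.split(TAIL)[0];
    }

    // Returns the four trailing numbers as doubles, in their original order.
    static double[] trailingValues(String line) {
        String[] parts = line.split(TAIL);
        if (parts.length != 5) {
            throw new IllegalArgumentException("Expected a description plus 4 numbers: " + line);
        }
        double[] values = new double[4];
        for (int i = 0; i < values.length; i++) {
            // parts[0] is the description, so the numbers start at index 1
            values[i] = Double.parseDouble(parts[i + 1]);
        }
        return values;
    }

    public static void main(String[] args) {
        String line = "Milk whole, 3.25% milkfat 61, 4.78 3.15, 3.27";
        System.out.println(description(line));
        for (double v : trailingValues(line)) {
            System.out.println(v);
        }
    }
}
```

`Double.parseDouble` throws a `NumberFormatException` on anything that is not a valid number, so malformed rows surface early instead of silently producing bad values.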

5. ## Re: Regular Expression Problem

"Thanks DarrylBurke, this seems to work. I guess I was a lot farther away from getting this to work than I thought."
You're welcome. To be honest, I didn't even read your regex -- my regex skills don't extend to interpreting someone else's regex (heck, I can't even fathom most of those I've written!). So I can't comment on how far you were from a workable regex.

If you have a large dataset, you might want to watch for performance issues. I do know that choosing between greedy and reluctant quantifiers can improve regex performance, but I don't know how to apply that here.
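
One step that doesn't depend on quantifier tuning is to compile the pattern once and reuse it, because `String.split` with a regex like this one compiles it again on each call. A minimal sketch follows, with illustrative class and method names; whether it makes a measurable difference for a given dataset would need to be timed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class PrecompiledSplit {
    // Compiled once and shared; Pattern instances are immutable and thread-safe.
    private static final Pattern TAIL =
            Pattern.compile("(?:,?)\\s(?=([\\d.,]*\\s?){1,4}$)");

    // Splits every line with the shared pattern instead of calling String.split.
    static List<String[]> splitAll(List<String> lines) {
        List<String[]> result = new ArrayList<>(lines.size());
        for (String line : lines) {
            result.add(TAIL.split(line));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Milk whole, 3.25% milkfat 61, 4.78 3.15, 3.27",
                "Yogurt plain, 13 grams protein per 8 ounce 56 7.68 5.73 0.18");
        for (String[] parts : splitAll(lines)) {
            System.out.println(String.join(" | ", parts));
        }
    }
}
```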

db