Problem with Scanner - using delimiters
Hello folks
First, this isn't a homework project; it's just a pet project of mine. The problem I have is as follows:
I have a large email list which has been provided to me by a third party. The third party doesn't validate their email field, so end users can input any old rubbish. The data has been supplied to me in a *.csv file. Here are the steps:
1. Remove duplicates. A doddle: just read the *.csv file into a HashSet.
2. Fix syntax errors. This is where the problem lies.
I have examples of emails that are as follows:
abc@somewhere.com
abc@somewhereelse,com
The first example is the happy path and I can deal with that. The second is where my problem lies. I'm already using "," as the Scanner delimiter, so when populating the HashSet the second example gives me "abc@somewhereelse". With the large number of top-level domains out there, I can't see how to get the full email into the set so that I can then correct it (substitute the comma with a full stop). Any ideas? Is there any way to escape the comma inside the domain, working back from the end of the entry, while still splitting on the comma at the end of each entry? Note that there may be more than one comma in an email, but I have a plan to deal with those.
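To make the problem concrete, here is a minimal sketch of what I have now and of the kind of thing I'm after. It assumes one entry per line, each ending in a comma, which may not match the real file layout. The class and method names are made up for illustration, and reading whole lines instead of delimiting on commas is just one possible approach, not something I know to be the right one.

```java
import java.util.LinkedHashSet;
import java.util.Scanner;
import java.util.Set;

// Hypothetical sketch; names and the one-entry-per-line layout are assumptions.
class EmailListSketch {

    // Roughly what I do now: every comma is a delimiter, so a comma typed
    // inside an address splits that address into two tokens.
    static Set<String> readSplittingOnEveryComma(String csv) {
        Set<String> emails = new LinkedHashSet<>();
        try (Scanner scanner = new Scanner(csv)) {
            scanner.useDelimiter(",");
            while (scanner.hasNext()) {
                String token = scanner.next().trim();
                if (!token.isEmpty()) {
                    emails.add(token);
                }
            }
        }
        return emails;
    }

    // Possible alternative: read a whole line per entry, drop only the
    // trailing comma, then repair commas in the domain part.
    static Set<String> readLineByLine(String csv) {
        Set<String> emails = new LinkedHashSet<>();
        try (Scanner scanner = new Scanner(csv)) {
            while (scanner.hasNextLine()) {
                String entry = scanner.nextLine().trim();
                if (entry.endsWith(",")) {
                    entry = entry.substring(0, entry.length() - 1).trim();
                }
                if (!entry.isEmpty()) {
                    emails.add(fixDomainCommas(entry));
                }
            }
        }
        return emails;
    }

    // Replaces commas after the last '@' with full stops; entries without
    // an '@' are returned unchanged so they can be dropped later.
    static String fixDomainCommas(String entry) {
        int at = entry.lastIndexOf('@');
        if (at < 0) {
            return entry;
        }
        return entry.substring(0, at) + entry.substring(at).replace(',', '.');
    }

    public static void main(String[] args) {
        String csv = "abc@somewhere.com,\nabc@somewhereelse,com,\n";
        System.out.println(readSplittingOnEveryComma(csv));
        System.out.println(readLineByLine(csv));
    }
}
```

With the two sample entries, the first method leaves "abc@somewhereelse" and "com" in the set as separate items, while the second yields "abc@somewhereelse.com". If the file actually holds several entries per line, this line-based idea won't work as written.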
Just to be clear, it's obvious from looking at the email addresses that the second example is nothing more than a typo. There are other entries in the file that are clearly nonsense, and they will be dropped.
Any advice would be appreciated.
Thanks