Query to group URLs together
I've developed a Java application that processes Squid proxy log files (which contain information such as websites visited, bandwidth consumed, and date) and stores them in a database. All of these fields are stored in the same table. Now I'm struggling to group the records by website so that I can get the total bandwidth for a particular site. Below is an example of what I mean:
Record 1- http://www.somesite.com/retgdf
Record 2- http://www.somesite.com/party
Record 3- http://www.somesite.com/tryfsg
My main interest here is the main website (in this case www.somesite.com). I want to build a query that recognizes that all 3 records have www.somesite.com in common, discards the text after the forward slash that follows the host name, and returns www.somesite.com as a single record along with the sum of the bandwidth consumed by all 3 URLs (that is, the bandwidth consumed for Record 1 + Record 2 + Record 3).
The main use of this query would be to determine the bandwidth consumption of each site, as explained above.
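
To make the desired result concrete, here is a minimal Java sketch of the same grouping done in application code rather than in SQL. The class name `HostBandwidth`, the method names, and the byte counts are all made up for illustration; they are not taken from the actual application or table. It assumes each record carries a full URL with a scheme such as `http://`, and it falls back to the raw string when no host can be parsed.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: groups URLs by host name and sums their bandwidth.
public class HostBandwidth {

    // Returns the host part of a URL, e.g. "www.somesite.com".
    // If the string cannot be parsed or has no host, the raw string is returned.
    public static String hostOf(String url) {
        try {
            String host = URI.create(url).getHost();
            return host != null ? host : url;
        } catch (IllegalArgumentException e) {
            return url;
        }
    }

    // Sums the byte counts of all records that share the same host.
    // urls[i] and bytes[i] are assumed to belong to the same log record.
    public static Map<String, Long> totalBytesByHost(String[] urls, long[] bytes) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (int i = 0; i < urls.length; i++) {
            totals.merge(hostOf(urls[i]), bytes[i], Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.somesite.com/retgdf",
            "http://www.somesite.com/party",
            "http://www.somesite.com/tryfsg"
        };
        long[] bytes = {100, 250, 50}; // made-up bandwidth values

        // Prints {www.somesite.com=400}
        System.out.println(totalBytesByHost(urls, bytes));
    }
}
```

The same idea could also be applied at import time: if the host returned by `hostOf` were stored in its own column, a plain `GROUP BY` on that column with `SUM` over the bandwidth column would give the per-site totals.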