- 11-19-2010, 09:35 AM #1
Member
- Join Date
- Nov 2010
- Location
- Johannesburg
- Posts
- 23
- Rep Power
- 0
- 11-19-2010, 10:17 AM #2
Senior Member
- Join Date
- Dec 2009
- Location
- Belgrade, Serbia
- Posts
- 364
- Rep Power
- 4
There are so many log file analysis tools around, are you sure you want to make your own app?
squid : Optimising Web Delivery
Maybe you can just pick a tool from that list
and find a way to run it and store the results of the analysis in a DB without writing a whole app, if it meets your needs of course...
- 11-19-2010, 10:43 AM #3
Member
Thanks
Thanks for your response FON, I really appreciate it.
I knew about the link you've provided me with. Unfortunately some of the applications listed there are not free or do not do exactly what I want. That's why I thought I'd develop my own application.
I am actually just looking for some 'how to' tips, assuming of course that I can use Java to develop such an application...
Thanks :)
- 11-19-2010, 01:02 PM #4
Senior Member
OK, then maybe you should paste a piece of an example log
and explain what it is that you want to read and store in the DB.
Have you considered the basic type of app:
more API-like, or just a simple jar that will be run manually from time to time?
Any GUI, or just some command prompt commands...?
Size of the app?
Using any out-of-the-box tokenizers, parsers...?
- 11-23-2010, 09:04 AM #5
Member
Thanks
Hi FON,
Thanks again for your input. Below is an example of the data contained in the Squid logs I want to analyze:
1222952697.113 1321 146.141.28.175 TCP_MISS/200 5757 GET http://skins.gmodules.com/ig/skin_xml_to_css? STUDENTS\300160 DIRECT/209.85.129.99 text/css
1222952699.681 420 146.141.28.175 TCP_MISS/200 486 GET http://skins.gmodules.com/ig/skin_fetch? STUDENTS\300160 DIRECT/209.85.129.147 image/gif
1222952703.692 4909 146.141.28.175 TCP_MISS/200 63962 GET http://skins.gmodules.com/ig/skin_fetch? STUDENTS\300160 DIRECT/209.85.129.99 image/jpeg
These are just a few lines... :) So basically they store the time (UNIX time), the sites the person goes to, etc. Basically I want to read these files and create a database that will store this data in a more intelligible format. These are just a few lines out of probably 1 hour of Internet browsing, and I'm looking at processing Squid logs worth months of Internet browsing for quite a good number of individuals...
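As a small illustration of the "more intelligible format" idea, the first field of each sample line (UNIX time with a millisecond fraction) can be turned into a readable timestamp. This is only a sketch: the class name `SquidTime` is made up for the example, and it assumes the field always has at most three decimal places, as in the lines above.

```java
import java.math.BigDecimal;
import java.time.Instant;

// Hypothetical helper, not part of Squid or any library.
final class SquidTime {

    /** Converts a field such as "1222952697.113" (seconds.millis) to an Instant. */
    static Instant parse(String field) {
        // BigDecimal keeps the millisecond part exact; a double could round it.
        // longValueExact() throws if there are more than three decimal places.
        long millis = new BigDecimal(field).movePointRight(3).longValueExact();
        return Instant.ofEpochMilli(millis);
    }

    public static void main(String[] args) {
        // Prints the first sample timestamp in ISO-8601 form (UTC).
        System.out.println(parse("1222952697.113"));
    }
}
```

An `Instant` can then be stored in a database timestamp column or formatted for a particular time zone with `java.time.format.DateTimeFormatter`.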
- 11-23-2010, 09:52 AM #6
Senior Member
I'm not familiar with Squid but I found this:
"Using access_log tag you specify where to log, and the format to log in. "
"access_log :
These files log client request activities. Has a line every HTTP or
ICP request. The format is:
access_log <module>:<place> [<logformat name> [acl acl ...]]
access_log none [acl acl ...]] "
So do you have to write your own parser/tokenizer that can understand this format completely, and then just decide what to log using some config params of your own?
And please, if you can, answer all of my questions from the previous post so I can get a picture of your application.
- 11-23-2010, 10:30 AM #7
Member
Hi FON,
Yes!! You have the picture of what I basically want to do. I want to write my own parser which will extract only the information I need. Basically I'll just need the website visited, the time it was visited at, and the bandwidth (amount of data) requested. To answer your previous questions I'll say I want to develop an application that is more API-like; also, command prompts will do for me because I unfortunately don't have experience in Graphical User Interface Design...
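For the three items named here (website, time, amount of data), one possible API-like building block is a class that sums the bytes requested per website. The sketch below is an assumption about what such a helper could look like: the class name `BandwidthBySite` and its methods are invented for illustration, and "website" is taken to mean the host part of the logged URL.

```java
import java.net.URI;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: accumulates bytes transferred per host.
final class BandwidthBySite {
    private final Map<String, Long> bytesByHost = new TreeMap<>();

    /** Adds one log entry's byte count to the total for the URL's host. */
    void add(String url, long bytes) {
        bytesByHost.merge(hostOf(url), bytes, Long::sum);
    }

    /** Returns the host of the URL, or the raw string if no host can be parsed. */
    static String hostOf(String url) {
        try {
            String host = URI.create(url).getHost();
            return host != null ? host : url;
        } catch (IllegalArgumentException e) {
            return url; // malformed URL in the log: keep it as its own key
        }
    }

    /** Totals per host, sorted by host name. */
    Map<String, Long> totals() {
        return bytesByHost;
    }
}
```

Feeding the three sample lines from the earlier post into `add` would put all of them under the single key `skins.gmodules.com`. The same totals could later be written to a database table instead of being kept in memory.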
Thanks a lot for your feedback. I really appreciate your input...
- 11-23-2010, 02:42 PM #8
Senior Member
OK then, to begin with, make sure you fully understand the format of each line in the Squid log file.
Almost every logging API has some config XML file.
Using params you can define what to log.
There can be many params and each one can have a different meaning, like (this is from some Apache doc):
%a Remote IP-address
%T The time taken to serve the request, in seconds.
...
You decide to use just some of them.
And there are rules on how each log line is created, like:
"fields are separated by spaces, and delimited by ; "
So once you have studied the line format you can start creating the parser. A parser can be created in many ways: maybe you can use regular expressions, or maybe just tokenize each line based on some simple rules. Choose any approach you are familiar with, and post your progress and any questions you have.
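To make the regular-expression option concrete, here is a minimal sketch written against the sample lines posted earlier in the thread. It assumes ten whitespace-separated fields per line with the timestamp first, the byte count fifth and the URL seventh; the class name `SquidLineRegex` and the choice of captured groups are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, matched against the thread's sample lines only.
final class SquidLineRegex {

    // Captured groups: 1 = timestamp, 2 = bytes, 3 = URL.
    // The fields in between are matched with \S+ but not captured.
    private static final Pattern LINE = Pattern.compile(
        "^(\\d+\\.\\d+)\\s+\\d+\\s+\\S+\\s+\\S+\\s+(\\d+)\\s+\\S+\\s+(\\S+)\\s.*$");

    /** Returns {timestamp, bytes, url}, or null if the line does not match. */
    static String[] extract(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }
}
```

A regex has the advantage of rejecting lines that do not look like log entries at all, at the cost of being harder to read than plain tokenizing.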
good luck!
- 11-24-2010, 08:09 AM #9
Member
Thank you so much FON. You are a star. I truly appreciate your valuable input. I'm actually doing this as part of a research project. I'm currently busy with another aspect of it and will get to the Squid proxy log analysis at a later stage, but pretty soon. I will keep you posted on my progress and let you know if I have further questions.
Cheers!
- 12-23-2010, 10:52 AM #10
Member
Hi FON,
I've been having difficulty extracting particular data from the lines in the text file that I'm reading. Because not all the data in a line is relevant to me, I've been trying to extract only the fields that are important. If you look at the example that I pasted below, you'll see that the fields are delimited by a space. I've been trying to read the lines and write the relevant elements to a new file, using a space as the delimiter in my parser, but unfortunately I'm not getting what I want.
As an example of what I mean, using the line below: I want to write a loop that will extract, for each line, the Unix date (in the line below 1222952697.113), the usage (5757) and the address (http://skins.gmodules.com....). These are the only fields I need to write to the new file, but the program that I wrote won't read them. Any hint?
Thanks in advance,
1222952697.113 1321 146.141.28.175 TCP_MISS/200 5757 GET http://skins.gmodules.com/ig/skin_xml_to_css? STUDENTS\300160 DIRECT/209.85.129.99 text/css
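The poster's code is not shown, so the cause can only be guessed at. One common pitfall with a plain space delimiter is that `split(" ")` produces empty tokens whenever a log pads a column with several spaces, which shifts every later index. The sketch below splits on runs of whitespace instead and writes the three wanted fields (positions 0, 4 and 6 in the sample line) as tab-separated output. The class name `SquidFieldExtractor` and the output layout are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.Writer;

// Hypothetical example class; field positions are taken from the sample line.
final class SquidFieldExtractor {
    private static final int TIME = 0;
    private static final int BYTES = 4;
    private static final int URL = 6;

    /** Writes "time<TAB>bytes<TAB>url" for each usable line; returns the count written. */
    static int extract(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        PrintWriter writer = new PrintWriter(out);
        int written = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            // "\\s+" treats any run of spaces or tabs as a single delimiter.
            String[] fields = line.trim().split("\\s+");
            if (fields.length <= URL) {
                continue; // blank or truncated line: skip it
            }
            writer.println(fields[TIME] + "\t" + fields[BYTES] + "\t" + fields[URL]);
            written++;
        }
        writer.flush();
        return written;
    }
}
```

To run it over real files, pass a `java.io.FileReader` for the log and a `java.io.FileWriter` for the new file; the reader and writer parameters are used here only so the method can also be tried on in-memory strings.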
- 03-04-2011, 08:02 PM #11
Member
- Join Date
- Mar 2011
- Location
- Sriharikota
- Posts
- 3
- Rep Power
- 0
Unnel, this is vishnu...
I am also working on the same project..
I am developing it in Java, using MySQL as the database..
I have written the code for inserting values into the database and I have some difficulties...
Can anyone help me out..
Last edited by vishnu22001; 03-04-2011 at 08:26 PM.
-
vishnu, ask your own new question in its own new thread rather than hijacking someone else's thread here. Locking this thread.