  1. #1
    JT4NK3D (Member, joined Nov 2007)

    [SOLVED] More RegEx help

    I'm trying to make an XHTML file parser with regular expressions. I want it to find tags and store them. For example, given:
    <html>
    <head>
    <title>example</title>
    </head>
    <body>
    <p>para</p>
    <hr />
    <p>para2</p>
    </body>
    </html>
    The application should return:
    <html>:1
    <head>:1
    <title>:1
    </title>:1
    </head>:1
    <body>:1
    <p>:2
    </p>:2
    <hr />:1
    </body>:1
    </html>:1
    After that, I'll make it group starting and ending tags together (<head>, </head>),
    make it recognize that something isn't a tag if a ! follows the <
    (<!-- --> and <!DOCTYPE html...), and make it skip everything between <script> and </script>,
    since there are no tags in there and the < and > signs in JavaScript/VBScript could be mistaken for tags. I'll also change it so that it shows only <tagname> without the attributes.
    But for now, I just want to get it started. Here is my code so far. I need help where the comments say so:
    Java Code:
    /**
    * This class parses xhtml files - 05/22/08
    * and returns each unique tag
    * type and the quantity of it
    * using java.util.regex package.
    */
    
    import java.util.regex.Pattern;
    import java.util.regex.Matcher;
    
    public class XCheck1 {	// simple driver class to be improved later.
    	public static void main(String[] args) {
    		XC_Model model = new XC_Model();
    		model.run();
    	}
    }
    
    class XC_Model {
    	private String[] tags;	// array to store the tags
    	private int[] tagCounts;	// tagCounts[i] stores how many times tags[i] has been seen
    	private String cmatch;	// cmatch = current match
    	private boolean found;	// true if the current tag is already stored in the array
    	private int top;	// index of the next free slot in tags
    
    	public XC_Model() {	// paramless constructor initializes fields
    		tags = new String[500];		// I'll deal with more than 500 tags later
    		tagCounts = new int[500];	// see above^
    		cmatch = "";
    		top = 0;
    	}
    
    	public String getData() {	// I'll change this after, for now it can parse Strings of "XHTML"
    		String data = "<p>This is a <em><strong>XHTML</strong></em> paragraph</p><hr />";	// tags and random text
    		return data;
    	}
    
    	public void doScan() {	// the method that actually does the parsing
    		Pattern pattern = Pattern.compile("<.*>");	// This should mean "<".....">"
    		Matcher matcher = pattern.matcher(getData());	// getData later will get the file
    
    		while( /* theres more matches */ ) {	// ************ this area is where i need help ********
    			cmatch = /* The current match */	// i'm not sure how to use regex to match 1 by 1 like this
    			found = false;	// found starts off as false
    			for( int i = 0; i < top; i++ ) {	// only the first top slots are filled
    				if( cmatch.equals(tags[i]) ) {	// if it finds another tag that is the same
    					tagCounts[i]++;	// add 1 to the quantity of that tag
    					found = true;	// found a match is true
    					break;	// stop looping, the tag is already stored
    				}
    			}
    
    			if( found == false ) {                       //if this is the first encounter with this tag
    				tags[top] = cmatch;	// add it to the array
    				tagCounts[top] = 1;	// 1 of this unique tag so far
    				top++;		             // shift up to the next item
    			}
    		}
    	}
    
    	public void run() {		// later this will have a param for xhtml file
    		doScan();
    		for( int j = 0; j < top; j++) {	// only print the slots that were filled
    			System.out.println(tags[j] + ": " + tagCounts[j]);      // print out the results
    		}
    	}
    }
    Please reply, I need help with this.
    Last edited by JT4NK3D; 05-23-2008 at 10:10 PM. Reason: typo
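    One way the two commented gaps in doScan() could be filled in is with Matcher.find() as the loop condition and Matcher.group() for the current match. The sketch below is only an illustration under a few assumptions: the class name TagCountSketch and the method countTags are made up for this example, a LinkedHashMap stands in for the two parallel arrays, and Map.merge needs Java 8 or later. It also swaps the pattern "<.*>" for "<[^>]+>", because ".*" is greedy and would match from the first "<" to the last ">" in the whole string as a single match.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original post.
class TagCountSketch {
    // "<", then one or more characters that are not ">", then ">".
    // Unlike "<.*>", this cannot run past the first ">" and swallow several tags.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Returns each distinct tag (attributes included) with its count, in first-seen order.
    static Map<String, Integer> countTags(String data) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher matcher = TAG.matcher(data);
        while (matcher.find()) {                    // true while there is another match
            String cmatch = matcher.group();        // the text of the current match
            counts.merge(cmatch, 1, Integer::sum);  // insert with 1, or add 1 to the old count
        }
        return counts;
    }

    public static void main(String[] args) {
        String data = "<p>This is a <em><strong>XHTML</strong></em> paragraph</p><hr />";
        for (Map.Entry<String, Integer> e : countTags(data).entrySet()) {
            System.out.println(e.getKey() + ":" + e.getValue());
        }
    }
}
```

    The same two calls drop straight into the array version: while( matcher.find() ) for the loop condition and cmatch = matcher.group(); for the current match.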

  2. #2
    JT4NK3D (Member, joined Nov 2007)

    Please answer, I need help.
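    For the later steps described in the first post (ignoring anything that starts with <!, skipping what sits between <script> and </script>, and showing only the tag name without attributes), a rough sketch could look like the following. Everything named here (TagNameSketch, stripNonTags, countTagNames) is hypothetical, and the approach is an assumption rather than a full XHTML parser: attribute values that contain a ">" character would still confuse it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original post.
class TagNameSketch {
    // group 1: "/" for an end tag, group 2: the tag name, group 3: "/" for a self-closing tag
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9:-]*)[^>]*?(/?)>");

    // Removes the parts of the input that should not be scanned for tags.
    static String stripNonTags(String data) {
        return data
                // comments: <!-- ... -->, possibly spanning several lines
                .replaceAll("(?s)<!--.*?-->", "")
                // keep <script ...> and </script> but drop everything between them;
                // (?<!/) skips self-closing <script ... /> elements, which have no body
                .replaceAll("(?is)(<script\\b[^>]*(?<!/)>).*?(</script\\s*>)", "$1$2")
                // remaining "<!" constructs such as <!DOCTYPE html ...>
                .replaceAll("<![^>]*>", "");
    }

    // Counts tags by name only, e.g. <p id="1"> and <p> both count as "<p>".
    static Map<String, Integer> countTagNames(String data) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = TAG.matcher(stripNonTags(data));
        while (m.find()) {
            String selfClosing = m.group(3).isEmpty() ? "" : " /";
            String key = "<" + m.group(1) + m.group(2) + selfClosing + ">";
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String data = "<!DOCTYPE html><html><!-- <b>not a tag</b> --><body class=\"x\">"
                + "<script type=\"text/javascript\">if (a < b && b > c) {}</script>"
                + "<p id=\"1\">para</p><hr /><p>para2</p></body></html>";
        countTagNames(data).forEach((tag, n) -> System.out.println(tag + ":" + n));
    }
}
```

    Stripping first and matching afterwards keeps the tag pattern itself simple; the alternative of one big pattern that handles comments, scripts and tags at once is harder to read and to debug.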

  3. #3
    Eranga (Moderator, joined Jul 2007, Colombo, Sri Lanka)
