Feature Request: Not printing recursively for stringValue

Open hashier opened this issue 9 years ago • 2 comments

I would like to have a method similar to stringValue that doesn't recursively print everything under the elements matched by an XPath query. Below are the full code, the HTML, the output Ono produces, and the output I'd like to have.

My XPath query: //div[@class='thread']

Ono code:

document = [ONOXMLDocument HTMLDocumentWithData:file error:&error];

[document enumerateElementsWithXPath:xPath usingBlock:^(ONOXMLElement *element, NSUInteger idx, BOOL *stop) {
    NSLog(@"%@", [element stringValue]);
}];

Which prints:

FirstName LastName, SecondNameFirst SecondNameLast


                FirstName LastName
                Wednesday, December 24, 2014 at 6:57pm UTC+01 


        This is a dummy text


                SecondNameFirst SecondNameLast
                Wednesday, December 24, 2014 at 6:56pm UTC+01


        And a 2nd one just to show off


Another, User


                Another
                Monday, April 27, 2015 at 10:54pm UTC+02


        Text: 2.1


                User
                Thursday, February 26, 2015 at 5:41pm UTC+01


        Text: 2.2


                Another
                Thursday, February 26, 2015 at 4:25pm UTC+01


        Text: 2.3

I would prefer to have an output similar to hpple which is:

FirstName LastName, SecondNameFirst SecondNameLast
Another, User

hpple code:

tutorialsParser = [TFHpple hppleWithHTMLData:file];
tutorialsNodes = [tutorialsParser searchWithXPathQuery:xPath];

for (TFHppleElement *element in tutorialsNodes) {
    NSLog(@"%@", [[[element firstChild] content] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]]);
}

I don't want to use hpple, though, because it is too slow.

Here is my input HTML file:

<!DOCTYPE html>
<html>
<head><title/></head>
<body>
    <div class="thread">FirstName LastName, SecondNameFirst SecondNameLast
        <div class="message">
            <div class="message_header">
                <span class="user">FirstName LastName</span>
                <span class="meta">Wednesday, December 24, 2014 at 6:57pm UTC+01 </span>
            </div>
        </div>
        <p>This is a dummy text</p>
        <div class="message">
            <div class="message_header">
                <span class="user">SecondNameFirst SecondNameLast</span>
                <span class="meta">Wednesday, December 24, 2014 at 6:56pm UTC+01</span>
            </div>
        </div>
        <p>And a 2nd one just to show off</p>
    </div>
    <div class="thread">Another, User
        <div class="message">
            <div class="message_header">
                <span class="user">Another</span>
                <span class="meta">Monday, April 27, 2015 at 10:54pm UTC+02</span>
            </div>
        </div>
        <p>Text: 2.1</p>
        <div class="message">
            <div class="message_header">
                <span class="user">User</span>
                <span class="meta">Thursday, February 26, 2015 at 5:41pm UTC+01</span>
            </div>
        </div>
        <p>Text: 2.2</p>
        <div class="message">
            <div class="message_header">
                <span class="user">Another</span>
                <span class="meta">Thursday, February 26, 2015 at 4:25pm UTC+01</span>
            </div>
        </div>
        <p>Text: 2.3</p>
    </div>
</body>
</html>

hashier avatar Jun 03 '15 16:06 hashier

Sorry, I don't speak Objective-C, but you could use something like this:

extension String {
    func trim() -> String {
        return self.stringByTrimmingCharactersInSet(.whitespaceAndNewlineCharacterSet())
    }

    func clean() -> String {
        return self.stringByReplacingOccurrencesOfString(
            "\\s+",
            withString: " ",
            options: .RegularExpressionSearch)
    }
}

Then use it in your code like this:

// remove leading and trailing whitespace
let trimmedValue = (element.childrenWithTag("td")[3] as! ONOXMLElement).stringValue().trim()
// collapse each run of whitespace into a single space
let cleanedValue = (element.childrenWithTag("td")[3] as! ONOXMLElement).stringValue().clean()
// or chain them together
let extraCleanValue = (element.childrenWithTag("td")[3] as! ONOXMLElement).stringValue().clean().trim()

tosbaha avatar Jun 22 '15 10:06 tosbaha

That wouldn't help, since stringValue would still recursively return everything under the element.

hashier avatar Jun 22 '15 14:06 hashier
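
One possible workaround, offered as an untested sketch rather than a built-in Ono feature: narrow the XPath query itself so that it selects only the first text node directly under each thread div, using the standard XPath text() node test. This assumes that Ono hands text nodes matched by XPath to the block as ONOXMLElement objects and that stringValue on such a node returns only that node's text; neither is confirmed in this thread. ThreadTitlesFromHTMLData is a hypothetical helper name, and the import line depends on how Ono is integrated into the project.

#import <Foundation/Foundation.h>
#import <Ono/Ono.h> // assumption: Ono is available as a framework; adjust to your setup

// Hypothetical helper (not part of Ono): collects the text that sits directly
// under each <div class="thread">, ignoring text inside nested elements.
static NSArray<NSString *> *ThreadTitlesFromHTMLData(NSData *data, NSError **error) {
    ONOXMLDocument *document = [ONOXMLDocument HTMLDocumentWithData:data error:error];
    if (!document) {
        return nil;
    }

    NSMutableArray<NSString *> *titles = [NSMutableArray array];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];

    // text()[1] matches the first text node child of each matching div,
    // so nested message and paragraph content is never visited.
    // Assumption: Ono wraps the matched text nodes as ONOXMLElement objects.
    [document enumerateElementsWithXPath:@"//div[@class='thread']/text()[1]"
                              usingBlock:^(ONOXMLElement *node, NSUInteger idx, BOOL *stop) {
        NSString *title = [[node stringValue] stringByTrimmingCharactersInSet:whitespace];
        if (title.length > 0) {
            [titles addObject:title];
        }
    }];

    return titles;
}

If those assumptions hold, calling ThreadTitlesFromHTMLData(file, &error) on the input HTML above should give one trimmed title per thread, without the nested user, date, and message text.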