html-agility-pack

How to quickly split an element into multiple elements?

Open NodeGreenHand opened this issue 4 years ago • 16 comments

How can I quickly split an element into multiple elements, using a substring of the element's text as the separator? For example:

Source element: <p><span><a href="void">0</a>123</span><span>45</span><span>678</span></p>

Separator substring: 567

Target elements: <p><span><a href="void">0</a>123</span><span>4</span></p> <p><span>5</span><span>67</span></p> <p><span>8</span></p>

NodeGreenHand avatar Mar 04 '21 02:03 NodeGreenHand

Hello @NodeGreenHand ,

Your images don't work.

As requested when creating the issue, we would like a runnable example that at least sets up your scenario. That saves us from redoing what you have already done.

Best Regards,

Jon


JonathanMagnan avatar Mar 04 '21 15:03 JonathanMagnan

Hello @JonathanMagnan

The question has been revised. Could you suggest an approach? Thank you.

Best Regards,

NodeGreenHand

NodeGreenHand avatar Mar 05 '21 01:03 NodeGreenHand

Hello @NodeGreenHand ,

Besides doing it manually on your side, there is no shortcut method that does exactly what you want.
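A manual approach could look like the sketch below. It is not an HAP feature: SplitSketch, CollectText, Slice, and SplitAround are hypothetical names introduced here for illustration. The idea is to locate the separator in the concatenated text of the element, then rebuild one copy of the element per text range, keeping each ancestor's tag and attributes. It assumes only the first occurrence of the separator matters, and it drops any element that contributes no text to a range (so empty elements such as <br> are lost).

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using HtmlAgilityPack;

// Hypothetical helper, not part of Html Agility Pack.
public static class SplitSketch
{
    // Concatenates the text of all text nodes under `node`, in document order.
    public static string CollectText(HtmlNode node)
    {
        if (node is HtmlTextNode text) return text.Text;
        var sb = new StringBuilder();
        foreach (HtmlNode child in node.ChildNodes) sb.Append(CollectText(child));
        return sb.ToString();
    }

    // Returns a copy of `node` restricted to characters [start, end) of the
    // overall text, or null when the node contributes nothing to that range.
    // `pos` counts the characters consumed so far during the traversal.
    private static HtmlNode Slice(HtmlNode node, int start, int end, ref int pos)
    {
        if (node is HtmlTextNode text)
        {
            int from = Math.Max(start, pos);
            int to = Math.Min(end, pos + text.Text.Length);
            int offset = from - pos;
            pos += text.Text.Length;
            if (from >= to) return null; // no overlap with the requested range
            return node.OwnerDocument.CreateTextNode(text.Text.Substring(offset, to - from));
        }

        HtmlNode copy = node.CloneNode(false); // shallow copy: tag and attributes only
        foreach (HtmlNode child in node.ChildNodes)
        {
            HtmlNode part = Slice(child, start, end, ref pos);
            if (part != null) copy.AppendChild(part);
        }
        return copy.HasChildNodes ? copy : null;
    }

    // Splits `root` into up to three copies: the text before the separator,
    // the separator itself, and the text after it.
    public static List<HtmlNode> SplitAround(HtmlNode root, string separator)
    {
        string all = CollectText(root);
        int index = all.IndexOf(separator, StringComparison.Ordinal);
        var pieces = new List<HtmlNode>();
        if (index < 0)
        {
            pieces.Add(root.CloneNode(true)); // separator not found: one unsplit copy
            return pieces;
        }

        int[] cuts = { 0, index, index + separator.Length, all.Length };
        for (int i = 0; i < cuts.Length - 1; i++)
        {
            int pos = 0;
            HtmlNode piece = Slice(root, cuts[i], cuts[i + 1], ref pos);
            if (piece != null) pieces.Add(piece);
        }
        return pieces;
    }
}
```

For the element in the question and the separator 567, the ranges are [0, 5), [5, 8), and [8, 9), which cover the texts 01234, 567, and 8. Each piece is a fresh copy, so the original tree is left untouched, at the cost of one traversal per range.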

JonathanMagnan avatar Mar 05 '21 14:03 JonathanMagnan

Hello @JonathanMagnan

Q1: Which basic operation methods are the fastest in HAP?

Q2:

HtmlNode node = HtmlNode.CreateNode("<p><span>123</span></p>");
HtmlNode span = node.FirstChild;
HtmlNode newSpan = HtmlNode.CreateNode("<span></span>");
newSpan.AppendChild(span);
node.RemoveAllChildren();
node.AppendChild(newSpan);

Why is the value of span.XPath wrong?

Best Regards,

NodeGreenHand

NodeGreenHand avatar Mar 05 '21 14:03 NodeGreenHand

Q1: Which basic operation methods are the fastest in HAP?

Your question is too broad; I'm not sure I understand what you want.

Q2: Why is the value of span.XPath wrong?

What's wrong?

If you read span.XPath before you remove it, the value is /p[1]/span[1].
If you read span.XPath after you remove it, the value is /span.

JonathanMagnan avatar Mar 10 '21 14:03 JonathanMagnan

Hello @JonathanMagnan

Sorry, I was wrong. The problem is not the XPath but the element's ParentNode value. It shouldn't be null; it should be the new replacement parent element.

Best Regards,

NodeGreenHand

NodeGreenHand avatar Mar 10 '21 14:03 NodeGreenHand

Hello @NodeGreenHand ,

I'm not sure what you mean. Could you explain it with some runnable code?

This code also behaves as expected:

HtmlNode node = HtmlNode.CreateNode("<p><span>123</span></p>");
HtmlNode span = node.FirstChild;
HtmlNode newSpan = HtmlNode.CreateNode("<span></span>");
var xpath = newSpan.XPath;
newSpan.AppendChild(span);
node.RemoveAllChildren();
node.AppendChild(newSpan);
var xpath2= newSpan.XPath;

JonathanMagnan avatar Mar 10 '21 14:03 JonathanMagnan

Hello @JonathanMagnan

Sorry, maybe I didn't describe it clearly. What I mean is to insert a new child element, Child1, into the P element. All the original child elements of the P element are regarded as the child elements of the new element Child1.

HtmlNode pNode = HtmlNode.CreateNode("<p><span>123</span><span>456</span></p>");
HtmlNode spanNode1 = pNode.FirstChild;
HtmlNode spanNode2 = pNode.LastChild;
Console.WriteLine("the first child parent node is pNode:{0}; xPath is:{1}", pNode == spanNode1.ParentNode, spanNode1.XPath);
Console.WriteLine("the second child parent node is pNode:{0}; xPath is:{1}", pNode == spanNode2.ParentNode, spanNode2.XPath);

HtmlNode aNode = HtmlNode.CreateNode("<a></a>");
aNode.AppendChildren(pNode.ChildNodes);

Console.WriteLine("the first child parent node is aNode:{0}; xPath is:{1}", aNode == spanNode1.ParentNode, spanNode1.XPath);
Console.WriteLine("the second child parent node is aNode:{0}; xPath is:{1}", aNode == spanNode2.ParentNode, spanNode2.XPath);

pNode.RemoveAllChildren();
pNode.AppendChild(aNode);

Console.WriteLine("the first child parent node is null:{0}; xPath is:{1}", null == spanNode1.ParentNode, spanNode1.XPath);
Console.WriteLine("the second child parent node is null:{0}; xPath is:{1}", null == spanNode2.ParentNode, spanNode2.XPath);
Console.WriteLine("");

Console.WriteLine(pNode.OuterHtml);
Console.ReadLine();


After executing:

pNode.RemoveAllChildren();
pNode.AppendChild(aNode);

the output is wrong. The parent element of spanNode1 and spanNode2 should be a, not null. The XPath of spanNode1 should be p[1]/a[1]/span[1], and the XPath of spanNode2 should be p[1]/a[1]/span[2].

Best Regards, NodeGreenHand

NodeGreenHand avatar Mar 10 '21 16:03 NodeGreenHand

Got it this time ;)

The problem is that you append existing nodes to another parent, but the old parent keeps referencing them. So when you call pNode.RemoveAllChildren(), it removes children that are also under aNode, which is why their ParentNode ends up null.

When appending an existing node, you should clone it instead.

For example, something like this:

HtmlNode aNode = HtmlNode.CreateNode("<a></a>");

// One-line alternative:
// pNode.ChildNodes.ToList().ForEach(x => aNode.AppendChild(x.CloneNode(true)));
var childNodes = pNode.ChildNodes;
// Clone the children so the copies are independent of the nodes pNode still references.
spanNode1 = childNodes[0].CloneNode(true);
spanNode2 = childNodes[1].CloneNode(true);

// Append the clones before printing, so that ParentNode is already set.
aNode.AppendChild(spanNode1);
aNode.AppendChild(spanNode2);

Console.WriteLine("the first child parent node is aNode:{0}; xPath is:{1}", aNode == spanNode1.ParentNode, spanNode1.XPath);
Console.WriteLine("the second child parent node is aNode:{0}; xPath is:{1}", aNode == spanNode2.ParentNode, spanNode2.XPath);

Since they are cloned, they are not the same node as the one referenced in the pNode.

So the output is now right:

the first child parent node is null:False; xPath is:/p[1]/a[1]/span[1]
the second child parent node is null:False; xPath is:/p[1]/a[1]/span[2]

Let me know if that correctly explains your current issue and the solution.

Best Regards,

Jon

JonathanMagnan avatar Mar 11 '21 13:03 JonathanMagnan

Hello @JonathanMagnan

Although this idea can solve this problem, it is not my ideal solution.

In my application, such operations are performed frequently. Making a deep copy every time would be too slow; I need something more efficient.

Therefore, could the logic in RemoveAllChildren() be changed so that elements already transferred to another parent keep their new parent, leaving all the data correct?

I think that's the best solution. What do you think?

Best Regards, NodeGreenHand

NodeGreenHand avatar Mar 11 '21 13:03 NodeGreenHand

We will look at it.

What you are asking for is a kind of StealChildren method (I will have to come up with a better name, but you get the idea hehe).

JonathanMagnan avatar Mar 11 '21 14:03 JonathanMagnan

Hello @JonathanMagnan

That should be more efficient, hehe.

Best Regards, NodeGreenHand

NodeGreenHand avatar Mar 11 '21 14:03 NodeGreenHand

Hello @JonathanMagnan

To put it simply, the request is to add a new function that inserts a child element into the parent element and, at the same time, moves all the parent's original child elements into that new child element.

If this new function uses the approach I described, it would be perfect.

Best Regards, NodeGreenHand

NodeGreenHand avatar Mar 11 '21 15:03 NodeGreenHand

Hello @NodeGreenHand ,

Version 1.11.32 has finally been released.

This version adds the MoveChild and MoveChildren methods.

So instead of calling AppendChildren, you should now use MoveChildren if you also want the nodes to be removed from their original parent.
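Applied to the example from earlier in this thread, the swap might look like the sketch below. WrapSketch and WrapChildren are hypothetical names used only for illustration. MoveChildren is the method named above, and the sketch assumes it takes the old parent's ChildNodes collection the same way AppendChildren does.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper, not part of Html Agility Pack.
public static class WrapSketch
{
    // Wraps all children of `parent` in a new element built from `wrapperHtml`,
    // for example "<a></a>", and returns that new element.
    public static HtmlNode WrapChildren(HtmlNode parent, string wrapperHtml)
    {
        HtmlNode wrapper = HtmlNode.CreateNode(wrapperHtml);

        // MoveChildren (v1.11.32 or later, per this thread) is meant to detach
        // the nodes from their old parent as well as append them, so neither
        // CloneNode nor RemoveAllChildren is called here.
        wrapper.MoveChildren(parent.ChildNodes);

        // Assumption: `parent` has no children left at this point, so the
        // wrapper becomes its only child.
        parent.AppendChild(wrapper);
        return wrapper;
    }
}
```

With pNode, spanNode1, and spanNode2 set up as in the earlier example, calling WrapSketch.WrapChildren(pNode, "<a></a>") and then printing spanNode1.ParentNode and spanNode1.XPath is a quick way to check whether the spans now sit under the new a element without any deep copy.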

Let me know if that works as you would have wanted.

Best Regards,

Jon

JonathanMagnan avatar Mar 24 '21 04:03 JonathanMagnan

Hello @JonathanMagnan

Thank you very much for taking my advice. I'll let you know if it's what I want after I test it.

Best Regards, NodeGreenHand

NodeGreenHand avatar Mar 28 '21 09:03 NodeGreenHand

Hello @NodeGreenHand ,

A simple reminder that we are here to assist you!

Don't hesitate to contact us once you test it!

Best regards,

Jon

JonathanMagnan avatar Apr 07 '21 12:04 JonathanMagnan