This post is related to an earlier one, "Increase performance of PHP DOM-XML. Currently takes too long time", so it may help to read that post first.
I have an array which contains 7000+ values:
$arrayIds = [
    'A001',
    ...,
    'A7500'
];
This foreach loop gets the text value inside the <mrk> tags in a given XML file:
$dom = new DOMDocument;
$dom->load('myxml.xml');
$xp = new DOMXPath($dom);
$data = [];
foreach ($arrayIds as $arrayId) {
    $expression = "//unit[@person-name=\"$arrayId\"]/seg-source/mrk";
    $col = $xp->query($expression);
    if ($col && $col->length) {
        foreach ($col as $node) {
            $data[] = $node->nodeValue;
        }
    }
}
It takes approximately 45 seconds. I can't wait any longer than 5 seconds
What is the fastest way to achieve this?
Segment of the XML file:
<unit person-name="A695" id="PTU-300" xml:space="preserve">
<source xml:lang="en">This is Michael's speaking</source>
<seg-source><mrk mid="0" mtype="seg">This is Michael's speaking</mrk></seg-source>
<target xml:lang="id"><mrk mid="0" mtype="seg">This is Michael's speaking</mrk></target>
</unit>
<unit person-name="A001" id="PTU-4" xml:space="preserve">
<source xml:lang="en">Related tutorials</source>
<seg-source><mrk mid="0" mtype="seg">Related tutorials</mrk></seg-source>
<target xml:lang="id"><mrk mid="0" mtype="seg">Related tutorials</mrk></target>
</unit>
...
<unit>
...
</unit>
Anyway, I'm doing this on an M1 Mac
My Answer:
To improve the performance of fetching text values from the XML file, you can try the following approaches:
1. Use SimpleXML instead of DOMDocument:
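SimpleXML is generally easier to use than DOMDocument, and it may be somewhat faster for simple reads. Both are built on libxml2, though, so the XPath queries themselves cost about the same; expect a cleaner loop rather than a large speed-up. You can load the XML file using SimpleXML and then access the values directly: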
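$xml = simplexml_load_file('myxml.xml');
$data = [];
foreach ($arrayIds as $arrayId) {
    // xpath() may not return an array on failure, so fall back to an empty list.
    $nodes = $xml->xpath("//unit[@person-name='$arrayId']/seg-source/mrk") ?: [];
    foreach ($nodes as $node) {
        // Cast the SimpleXMLElement to get its text content.
        $data[] = (string) $node;
    }
}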
2. Use caching:
If the XML file does not change often, cache the extracted values instead of re-parsing the file on every request. DOM and SimpleXML objects cannot be serialized, so store plain strings (for example, an array of texts keyed by person-name) in a cache such as Redis, Memcached, or a file, and rebuild it only when the XML file changes.
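A minimal sketch of the file-based variant, assuming the extracted data is a plain array that survives a JSON round trip. The function name loadCachedIndex, the cache path, and the $build callback are hypothetical names used only for illustration; $build stands for whatever extraction routine you use and is expected to take the XML path and return an array.

```php
<?php
// Sketch only: caches extracted strings, not DOM objects.
// loadCachedIndex() and its parameters are illustrative names, not a library API.
function loadCachedIndex(string $xmlPath, string $cachePath, callable $build): array
{
    // Reuse the cache only if it is at least as new as the XML file.
    if (is_file($cachePath) && filemtime($cachePath) >= filemtime($xmlPath)) {
        $cached = json_decode(file_get_contents($cachePath), true);
        if (is_array($cached)) {
            return $cached;
        }
    }

    // Cache is missing, stale, or unreadable: rebuild and store it.
    $index = $build($xmlPath);
    file_put_contents($cachePath, json_encode($index), LOCK_EX);

    return $index;
}
```

The modification-time check has one-second resolution, so a file rewritten within the same second as the cache can be missed. If that matters, comparing a hash of the file is a stricter (but slower) alternative.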
3. Optimize XPath queries:
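Make sure your XPath queries are as specific as possible to reduce the number of nodes that need to be searched. A leading "//" makes the engine search the whole document on every query, so with 7000+ IDs the file is scanned 7000+ times. If you know the exact path to the <unit> elements, spell it out instead of starting the query with "//".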
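One way to take this further is to avoid the per-ID query altogether: walk the <unit> elements once, index their <mrk> texts by person-name, and then look each ID up in that index. The sketch below assumes the elements are not in an XML namespace (as in the queries above) and that the results should keep the same order as the original loop. The function name collectMrkText is hypothetical.

```php
<?php
// Sketch only: one pass over the document instead of one XPath query per ID.
// collectMrkText() is an illustrative name, not part of DOM or SimpleXML.
function collectMrkText(DOMDocument $dom, array $ids): array
{
    $xp = new DOMXPath($dom);

    // Build an index: person-name => list of <mrk> texts under <seg-source>.
    $index = [];
    foreach ($xp->query('//unit[@person-name]') as $unit) {
        $name = $unit->getAttribute('person-name');
        // Relative query, evaluated with this <unit> as the context node.
        foreach ($xp->query('seg-source/mrk', $unit) as $mrk) {
            $index[$name][] = $mrk->nodeValue;
        }
    }

    // Look the IDs up in the order given, as the original loop does.
    $data = [];
    foreach ($ids as $id) {
        foreach ($index[$id] ?? [] as $text) {
            $data[] = $text;
        }
    }

    return $data;
}
```

After $dom->load('myxml.xml'), calling collectMrkText($dom, $arrayIds) would produce the same $data array as the loop in the question. The document is traversed once and each ID costs one array lookup, which is usually the larger win compared with tuning the individual query.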
4. Use parallel processing:
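If fetching data from the XML file is still a bottleneck, you can consider parallel processing. PHP does not run threads in a standard build, so in practice this means splitting $arrayIds into chunks and handling them in separate processes (for example with the pcntl extension), or using the parallel extension on a thread-safe build.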
These changes target the repeated full-document scans, which is likely where most of the 45 seconds goes. Measure each one against your own file to see how close it gets you to the 5-second target.