Cassandre
Import a forum thread into Cassandre
- The original page URL could be saved into Cassandre as an alternative resource.
- Neither phpBB nor Doct***o produces well-formed XHTML, so neither XPath nor XSLT can be used. Instead, there could be a set of regular expressions for each kind of forum.
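
To illustrate the well-formedness problem, here is a minimal sketch in Python. The HTML fragment is made up for the example (it is not taken from either forum), and `is_well_formed` is a hypothetical helper name. A strict XML parser, which XPath and XSLT tools rely on, rejects markup as soon as one tag is left unclosed:

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Made-up fragment: the <br> is never closed, which is common in forum HTML.
forum_like = '<td><b class="s2">alice</b><br>Hello</td>'
strict = '<td><b class="s2">alice</b><br/>Hello</td>'

print(is_well_formed(forum_like))
print(is_well_formed(strict))
```

Regular expressions do not need the page to parse as a tree, which is why they remain usable here.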
Matthieu implemented forum parsers whose patterns might be reused:
$p_post_seperator = '/class="messagetable"/u',
$p_thread_title = '/Sujet : <h3>(.*)<\/h3>/u',
$p_date = '/le (\d\d-\d\d-\d\d\d\d).*(\d\d:\d\d:\d\d)/u',
$p_author = '/<b class="s2">(.*?)<\/b>/u',
$p_id = '/<a name="t(\d+)"><\/a>/u',
$p_msg = '/<div id="para\d+">(.*)<\/div><\/td><\/tr>/u',
$p_end_thread = '/<script language="javascript" type="text\/javascript">var listenumreponse/u',
$p_nextpg = '/<div class="pagepresuiv"><a href="(.*)" class="cHeader" accesskey="x">Page Suivante<\/a>/u'
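
As a rough sketch of how these patterns could be applied to one page of a thread, here is a Python transcription (the PHP delimiters and the `/u` flag are dropped, since Python 3 string patterns are already Unicode-aware). Everything besides the patterns is an assumption made for illustration: the `example_forum` key, the `parse_page` and `first_group` names, the sample HTML, and the idea that the end-of-thread marker closes the list of posts. It is not Matthieu's implementation.

```python
import re

# One pattern set per kind of forum; "example_forum" is a hypothetical key.
PATTERNS_BY_FORUM = {
    "example_forum": {
        "post_separator": re.compile(r'class="messagetable"'),
        "thread_title": re.compile(r'Sujet : <h3>(.*)</h3>'),
        "date": re.compile(r'le (\d\d-\d\d-\d\d\d\d).*(\d\d:\d\d:\d\d)'),
        "author": re.compile(r'<b class="s2">(.*?)</b>'),
        "id": re.compile(r'<a name="t(\d+)"></a>'),
        "msg": re.compile(r'<div id="para\d+">(.*)</div></td></tr>'),
        "end_thread": re.compile(
            r'<script language="javascript" type="text/javascript">'
            r'var listenumreponse'),
        "nextpg": re.compile(
            r'<div class="pagepresuiv"><a href="(.*)" class="cHeader" '
            r'accesskey="x">Page Suivante</a>'),
    },
}


def first_group(pattern, text):
    """Return the first capture group of the first match, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


def parse_page(html, patterns):
    """Extract the title, the posts and the next-page link from one page."""
    # Assumption: nothing after the end-of-thread marker belongs to a post.
    end = patterns["end_thread"].search(html)
    body = html[:end.start()] if end else html
    # The text before the first separator is the page header, not a post.
    _, *chunks = patterns["post_separator"].split(body)
    posts = []
    for chunk in chunks:
        date = patterns["date"].search(chunk)
        posts.append({
            "id": first_group(patterns["id"], chunk),
            "author": first_group(patterns["author"], chunk),
            "date": date.group(1) if date else None,
            "time": date.group(2) if date else None,
            "message": first_group(patterns["msg"], chunk),
        })
    return {
        "title": first_group(patterns["thread_title"], html),
        "posts": posts,
        "next_page": first_group(patterns["nextpg"], html),
    }


# Made-up page, shaped only to satisfy the patterns above.
SAMPLE = (
    'Sujet : <h3>Knee pain</h3>\n'
    '<table class="messagetable"><tr><td><a name="t101"></a>'
    '<b class="s2">alice</b> Posté le 03-02-2009 à 14:05:09</td></tr>\n'
    '<tr><td><div id="para101">Hello</div></td></tr></table>\n'
    '<table class="messagetable"><tr><td><a name="t102"></a>'
    '<b class="s2">bob</b> Posté le 04-02-2009 à 08:30:00</td></tr>\n'
    '<tr><td><div id="para102">Hi alice</div></td></tr></table>\n'
    '<script language="javascript" type="text/javascript">'
    'var listenumreponse = [];</script>\n'
    '<div class="pagepresuiv"><a href="/thread_2.htm" class="cHeader" '
    'accesskey="x">Page Suivante</a></div>'
)

page = parse_page(SAMPLE, PATTERNS_BY_FORUM["example_forum"])
print(page["title"], len(page["posts"]), page["next_page"])
```

Following `next_page` until it is `None` would walk the whole thread. Supporting another kind of forum would then only mean adding another entry to `PATTERNS_BY_FORUM`, provided its pages fit the same split-then-extract approach.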