Tuesday, April 26, 2016

C# complex regular expression for finding a specific pattern in big documents

I'm trying to come up with a regular expression that captures a quoted passage followed by a set of parentheses containing a Bible verse reference. The goal is to take a document containing a chapter from a Christian book that quotes Bible verses, match those verses, and replace them with the same verses from any desired translation of the Bible.

I've been having a lot of trouble with this. I can only come up with patterns that partially work. Here is an example text that gives me trouble:

"make disciples" - to build men like themselves who were so constrained by the commission of Christ that they not only followed, but led others to follow His way. Only as disciples were made could the other activities of the commission fulfill their purpose. PRAY FOR HARVESTERS Leadership was the emphasis. Jesus had already demonstrated by His own ministry that the deluded masses were ripe for the harvest, but without spiritual shepherds to lead them, how could they ever be won? "Pray ye therefore the Lord of the harvest," Jesus reminded His disciples, "that He will send forth laborers into His harvest" (Matt. 9:37, 38; Luke 10:2). More text here.

This is the best regular expression I have so far:

(\"[^\s\d]*[^:]*[^\s\d]*)*\"\s*\(([\w. ]+[\d\s]+[:][\s\d\-]+[^)]*)

Every pattern I have come up with works only when no other quoted text appears earlier in the passage. With the example above, this one starts matching at the very first quotation mark, runs through to the last quotation mark right before the parentheses, and then captures the parentheses and the verses. For this example, I only want it to capture "that He will send forth laborers into His harvest" (Matt. 9:37, 38; Luke 10:2).
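
To make the desired behavior concrete, here is a minimal sketch of one possible approach, not a tested solution for whole documents. It assumes the quoted passage never contains an embedded double quote, that the text uses straight ASCII quotes, and that every reference contains at least one chapter:verse pair such as 9:37. The names VerseQuoteFinder and FindFirst are hypothetical, made up for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerseQuoteFinder
{
    // Group 1: quoted text that cannot itself contain a double quote,
    //          so the match cannot start at an earlier quotation.
    // Group 2: the parenthesized reference, which must contain at least
    //          one chapter:verse pair (assumption about the reference format).
    private static readonly Regex QuoteWithReference = new Regex(
        @"""([^""]*)""\s*\(([^)]*\d+:\d+[^)]*)\)");

    // Returns the first quote-plus-reference match in the text.
    // Check Match.Success before reading the groups.
    public static Match FindFirst(string text)
    {
        return QuoteWithReference.Match(text);
    }

    public static void Main()
    {
        string sample =
            "\"Pray ye therefore the Lord of the harvest,\" Jesus reminded His disciples, " +
            "\"that He will send forth laborers into His harvest\" (Matt. 9:37, 38; Luke 10:2). " +
            "More text here.";

        Match m = FindFirst(sample);
        if (m.Success)
        {
            Console.WriteLine(m.Groups[1].Value); // the quoted passage
            Console.WriteLine(m.Groups[2].Value); // the verse reference
        }
    }
}
```

The key difference from my pattern is that [^"]* stops at the next quotation mark, so an attempt that begins at an earlier quote fails as soon as the closing quote is not followed by an opening parenthesis. One caveat: the pattern does not check that quotes are properly paired, so a closing quote followed by some text and then an opening quote directly before a parenthesis could still produce a wrong match.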

Any ideas? Is this possible with regular expressions?

Also, sorry for the biblical references here; I'm just interested in solving this somewhat complex problem.

Here is a link to what I have so far.
