Wednesday, 30 October 2019

Regex: how to return a house number based on a given house number and postal code

Please look at the text example below

Company X
                                                                                                              Fakestreet 97,
This an invoice. Please pay :)                                                                            3000 AB Fakecity

Costs: € 97,-

I am working on a regex pattern (in R) that returns a matching house number (97) only when a given postal code (3000 AB) matches as well. It should not return the amount (€ 97,-).

My current pattern for this match is: \b(97){1}\b((.|\r\n|\r|\n|))*(3000 AB)

The example text above has lots of spaces between the words. I only want to return the '97' from 'Fakestreet', and only if the given postal code (3000 AB) matches as well.

What does my pattern have to look like? My current pattern is giving me problems:

1) It 'goes on' endlessly and won't stop. This is probably because of the ((.|\r\n|\r|\n|))* part.

2) It will return the € 97,- amount. I don't want that, just the house number.

3) It returns all characters between the matched house number and the postal code. I don't want that to happen.
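To make problem 3 reproducible, here is a minimal sketch in base R. It assumes the sample text is held in a single string called `txt` (a name made up for this example), with the long runs of spaces shortened, and that the pattern is run with `perl = TRUE`.

```r
# Sample text from above; the runs of spaces are shortened (an assumption).
txt <- paste(
  "Company X",
  "          Fakestreet 97,",
  "This an invoice. Please pay :)          3000 AB Fakecity",
  "",
  "Costs: \u20ac 97,-",
  sep = "\n"
)

# The current pattern, with backslashes doubled for an R string literal.
current <- "\\b(97){1}\\b((.|\\r\\n|\\r|\\n|))*(3000 AB)"

# regexpr() finds the first match; regmatches() extracts the matched text.
m <- regmatches(txt, regexpr(current, txt, perl = TRUE))

# The match starts at the house number and runs through to the postal code,
# so everything in between is returned as well.
print(m)
print(nchar(m))
```

On this short sample the call finishes quickly; the 'endless' behaviour from problem 1 is not reproduced here.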

A breakdown of the 'logic':

Find and match the postal code.

  • (3000 AB)

Find a specific matching house number (and no other number), as a single match surrounded by word boundaries.

  • \b(97){1}\b

'Bridge' the spaces and line breaks between the house number and the postal code. Right now this part returns all the characters it matches.

  • ((.|\r\n|\r|\n|))*
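One possible direction, sketched below under stated assumptions, is to move the 'bridge' and the postal code into a lookahead so that only the house number itself is returned. The sketch assumes PCRE (`perl = TRUE`), that the postal code sits on the line directly below the house number, and that the inputs contain no regex metacharacters. The helper name `find_house_number` and the variable `txt` are made up for this example.

```r
# Hypothetical helper: returns the house number only when the postal code
# appears on the line directly below it (an assumption about the layout).
find_house_number <- function(text, house_nr, postal_code) {
  # (?= ... ) is a lookahead: it must match, but is not part of the result.
  # [^\r\n]* stays on one line; \R matches a single line break of any kind.
  pattern <- sprintf(
    "\\b%s\\b(?=[^\\r\\n]*\\R[^\\r\\n]*\\b%s\\b)",
    house_nr, postal_code
  )
  regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
}

# Sample text from above; the runs of spaces are shortened (an assumption).
txt <- paste(
  "Company X",
  "          Fakestreet 97,",
  "This an invoice. Please pay :)          3000 AB Fakecity",
  "",
  "Costs: \u20ac 97,-",
  sep = "\n"
)

res_match   <- find_house_number(txt, "97", "3000 AB")  # 97 after Fakestreet
res_nomatch <- find_house_number(txt, "97", "4000 CD")  # postal code differs
print(res_match)
print(res_nomatch)
```

Because the lookahead allows exactly one line break, the 97 in the costs line is not matched: no line containing the postal code follows it. If the address block can span more lines, the bridge would have to be loosened accordingly.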

Any help is much appreciated!
