Tuesday, July 4, 2017

sed: extract text between two patterns, stopping at the first match

I am trying to parse an HTML file and grab values from it. I need to extract everything between this

<tr style="text-align: center; background:#FFF">

and this

</td></tr>

The problem is that I'm running this in a loop to grab 800 of these sections. On the first pass it finds the opening string correctly, but it then ends at the last match of the closing string in the file instead of the first one that follows the opening string.

I'm writing each match to a text file, and the first file ends up containing every single entry. That is not what I need: I need an individual file for each entry.

Instead of using that complicated string, let's say I have this HTML

<div>
  Index
  Index
  Index
</div>
<div>
  Index
  Index
  Index
</div>
<div>
  Index
  Index
  Index
</div>

I am using this code

sed '/<div>/,/<\/div>/!d' sourcefile > output

But that command gives me the entire file instead of stopping at the first match of </div>.

I would much rather use sed than awk, grep, or perl if possible.
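
For reference, here is a minimal sketch of one way to get only the first block with sed alone. It assumes the opening and closing tags sit on their own lines, as in the sample above (sed works line by line, so it will not help if the whole document is on a single line). The file names sourcefile and output are the ones used above; the sample data is recreated here only so the snippet runs on its own.

```shell
# Recreate the three-block sample from the question.
cat > sourcefile <<'EOF'
<div>
  Index
  Index
  Index
</div>
<div>
  Index
  Index
  Index
</div>
<div>
  Index
  Index
  Index
</div>
EOF

# -n turns off automatic printing.
# Inside the range <div> .. </div>: print each line (p), and when the
# closing tag is reached, quit (q) so later blocks are never processed.
sed -n '/<div>/,/<\/div>/{p;/<\/div>/q;}' sourcefile > output
```

The range address on its own restarts at every later <div>, which would explain why all three blocks come out; quitting on the first closing tag is what limits the output to one block. To keep only the lines between the tags, the two tag lines would still have to be deleted in a second step.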
